[ANNOUNCE] Development progress of Apache Flink 1.11


Piotr Nowojski-4

Hi community,

It has been more than six weeks since the previous announcement, and as we approach the expected feature freeze we would like to share a Flink 1.11 status update with you.

Initially we were aiming for the feature freeze to happen in late April (now); however, it was recently proposed to postpone it by a couple of weeks to mid-May. [0]

Many people in the community are working hard to complete the promised features, and progress is good. A couple of features have even been completed already. We have updated the feature list from the previous announcement, highlighting the features that are already done as well as those that are no longer targeted for the Flink 1.11 release and will most likely be postponed to a later date.

Your release managers,
Zhijiang & Piotr Nowojski 


Features already done and ready for Flink 1.11:

  • PyFlink

    • FLIP-96: Add Python ML API [54]

    • FLINK-14500: Fully support all kinds of Python UDF [55]

  • Runtime

    • FLIP-67: Support for cluster partitions [20]

    • FLIP-92: Add N-Ary input stream operator in Flink [24]

    • [FLINK-10742] Let Netty use Flink's buffers on downstream side [28]

    • [FLINK-15911][FLINK-15154] Support Flink work over NAT [39]

    • [FLINK-15672] Switch to Log4j2 by default [34]


Features not targeted for Flink 1.11 anymore:

  • SQL / Table

    • FLIP-91: Introduce SQL client gateway and provide JDBC driver [4]

    • FLIP-107: Reading table columns from different parts of source records [7]

  • ML / Connectors

    • FLIP-72: Pulsar source / sink / catalog [49]

    • Update ML Pipeline API interface to better support Flink ML lib algorithms

  • PyFlink

    • Support running Python UDFs in Docker workers

  • Runtime

    • [FLINK-15786] Use the separated classloader to load connectors’ jar [37]

    • Calculate required shuffle memory before allocating slots

  • State Backend

    • Support getCustomizedState in KeyedStateStore [47]


Features still in progress for Flink 1.11:

  • SQL / Table

    • FLIP-65: New type inference for Table API UDFs [2]

    • FLIP-84: Improve TableEnv’s interface [3]

    • FLIP-93: Introduce JDBC catalog and Postgres catalog [5]

    • FLIP-105: Support to interpret and emit changelog in Flink SQL [6]

    • [FLINK-14807] Add Table#collect API for fetching data [8]

    • Support query and table hints

  • ML / Connectors

    • FLIP-27: New source API [9]

    • [FLINK-15670] Wrap a source/sink pair to persist intermediate result for subgraph failure recovery [10]

  • PyFlink

    • FLIP-106, FLIP-114: Expand the usage scope of Python UDF [12][50]

    • FLIP-112: Debugging and monitoring of Python UDF [11]

    • FLIP-97, FLIP-120: Integration with most popular Python libraries (Pandas) [51][52]

    • FLIP-121: Performance improvements of Python UDF [53]

  • Web UI

    • FLIP-98: Better back pressure detection [13]

    • FLIP-99: Make max exception configurable [14]

    • FLIP-100: Add attempt information [15]

    • FLIP-102: Add more metrics to TaskManager [16]

    • FLIP-103: Better TM/JM log display [17]

    • [FLINK-14816] Add thread dump feature for TaskManager [18]

  • Runtime

    • FLIP-56: Support for dynamic slots on the TaskExecutor [19]

    • FLIP-76: Unaligned checkpoints [21]

    • FLIP-83: Flink e2e performance testing framework [22]

    • FLIP-85: Support cluster deploy mode [23]

    • FLIP-108: Add GPU to the resource management (specifically for UDTF & UDF) [25]

    • FLIP-111: Consolidate docker images [26]

    • FLIP-116: Unified memory configuration for JobManager [56]

    • [FLINK-9407] ORC format for StreamingFileSink [27]

    • [FLINK-10934] Support per-job mode for Kubernetes integration [29]

    • [FLINK-11395] Avro writer for StreamingFileSink [30]

    • [FLINK-11427] Protobuf parquet writer for StreamingFileSink [31]

    • [FLINK-11499] Extend StreamingFileSink BulkFormats to support arbitrary roll policies [32]

    • [FLINK-14106] Make SlotManager pluggable [33]

    • [FLINK-15674] Consolidate Java and Scala type extraction stack [35]

    • [FLINK-15679] Improve Flink’s ID system [36]

    • [FLINK-15788] Various Kubernetes improvements [38]

    • [FLINK-16408] Bind user code class loader to lifetime of a slot [40]

    • [FLINK-16428] Network memory management for backpressure [41]

    • [FLINK-16430] Pipelined region scheduling [42]

    • [FLINK-16605] Specify upper bound for number of allocated TaskManagers [57]

  • State Backend

    • [FLINK-5763] Make savepoint self-contained / relocatable [43]

    • [FLINK-8871] Complete checkpoint cancellation messages [44]

    • [FLINK-12692] Support disk spilling in HeapKeyedStateBackend [45]

    • [FLINK-15012] Cleanup of leftover files in HDFS/OSS/S3 [46]

    • Enable local recovery by default

    • [FLINK-15532] Enable strict capacity limit for memory usage for RocksDB [48]

  • Other

    • FLIP-42: Restructure documentation (partially in Flink 1.12) [1]



[0] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Exact-feature-freeze-date-td40624.html

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation

[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-65%3A+New+type+inference+for+Table+API+UDFs

[3] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878

[4] https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway

[5] https://cwiki.apache.org/confluence/display/FLINK/FLIP-93%3A+JDBC+catalog+and+Postgres+catalog

[6] https://cwiki.apache.org/confluence/display/FLINK/FLIP-105%3A+Support+to+Interpret+and+Emit+Changelog+in+Flink+SQL

[7] https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Reading+table+columns+from+different+parts+of+source+records

[8] https://issues.apache.org/jira/browse/FLINK-14807


[9] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface

[10] https://issues.apache.org/jira/browse/FLINK-15670

[11] https://cwiki.apache.org/confluence/display/FLINK/FLIP-112%3A+Support+User-Defined+Metrics+in++Python+UDF

[12] https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Function+DDL

[13] https://cwiki.apache.org/confluence/display/FLINK/FLIP-98%3A+Better+Back+Pressure+Detection

[14] https://cwiki.apache.org/confluence/display/FLINK/FLIP-99%3A+Make+Max+Exception+Configurable

[15] https://cwiki.apache.org/confluence/display/FLINK/FLIP-100%3A+Add+Attempt+Information

[16] https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager

[17] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=147427143

[18] https://issues.apache.org/jira/browse/FLINK-14816

[19] https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation

[20] https://cwiki.apache.org/confluence/display/FLINK/FLIP-67%3A+Cluster+partitions+lifecycle

[21] https://cwiki.apache.org/confluence/display/FLINK/FLIP-76%3A+Unaligned+Checkpoints

[22] https://cwiki.apache.org/confluence/display/FLINK/FLIP-83%3A+Flink+End-to-end+Performance+Testing+Framework

[23] https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode

[24] https://cwiki.apache.org/confluence/display/FLINK/FLIP-92%3A+Add+N-Ary+Stream+Operator+in+Flink

[25] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink

[26] https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image+unification

[27] https://issues.apache.org/jira/browse/FLINK-9407

[28] https://issues.apache.org/jira/browse/FLINK-10742

[29] https://issues.apache.org/jira/browse/FLINK-10934

[30] https://issues.apache.org/jira/browse/FLINK-11395

[31] https://issues.apache.org/jira/browse/FLINK-11427

[32] https://issues.apache.org/jira/browse/FLINK-11499

[33] https://issues.apache.org/jira/browse/FLINK-14106

[34] https://issues.apache.org/jira/browse/FLINK-15672

[35] https://issues.apache.org/jira/browse/FLINK-15674

[36] https://issues.apache.org/jira/browse/FLINK-15679

[37] https://issues.apache.org/jira/browse/FLINK-15786

[38] https://issues.apache.org/jira/browse/FLINK-15788

[39] https://issues.apache.org/jira/browse/FLINK-15911, https://issues.apache.org/jira/browse/FLINK-15154

[40] https://issues.apache.org/jira/browse/FLINK-16408

[41] https://issues.apache.org/jira/browse/FLINK-16428

[42] https://issues.apache.org/jira/browse/FLINK-16430

[43] https://issues.apache.org/jira/browse/FLINK-5763

[44] https://issues.apache.org/jira/browse/FLINK-8871

[45] https://issues.apache.org/jira/browse/FLINK-12692

[46] https://issues.apache.org/jira/browse/FLINK-15012

[47] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-customize-state-in-customized-KeyedStateBackend-td32771.html

[48] https://issues.apache.org/jira/browse/FLINK-15532

[49] https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector

[50] https://cwiki.apache.org/confluence/display/FLINK/FLIP-114%3A+Support+Python+UDF+in+SQL+Client

[51] https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink

[52] https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame

[53] https://cwiki.apache.org/confluence/display/FLINK/FLIP-121%3A+Support+Cython+Optimizing+Python+User+Defined+Function

[54] https://cwiki.apache.org/confluence/display/FLINK/FLIP-96%3A+Support+Python+ML+Pipeline+API

[55] https://issues.apache.org/jira/browse/FLINK-14500

[56] https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers

[57] https://issues.apache.org/jira/browse/FLINK-16605


Re: [ANNOUNCE] Development progress of Apache Flink 1.11

Till Rohrmann
Thanks for the update, Piotr.

Cheers,
Till

On Fri, Apr 24, 2020 at 4:42 PM Piotr Nowojski <[hidden email]> wrote: