Fwd: [VOTE] Release 1.8.0, release candidate #3

Aljoscha Krettek
Hi All,

The release process for Flink 1.8.0 is currently ongoing. Please have a look at the thread if you’re interested in checking your applications against this next release of Apache Flink and in participating in the process.

Best,
Aljoscha

Begin forwarded message:

From: Aljoscha Krettek <[hidden email]>
Subject: [VOTE] Release 1.8.0, release candidate #3
Date: 19. March 2019 at 12:52:50 CET
Reply-To: [hidden email]

Hi everyone,
Please review and vote on the release candidate 3 for Flink 1.8.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be deployed to dist.apache.org [2], which are signed with the key with fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.8.0-rc3" [5],
* website pull request listing the new release [6],
* website pull request adding announcement blog post [7].

The vote will be open for at least 72 hours. It is adopted by majority approval, with at least 3 PMC affirmative votes.
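
As a rough illustration of the adoption rule above, the tally can be sketched in Python. The function name is hypothetical, and reading "majority approval" as "more binding +1 than binding -1 votes" is an assumption made for this sketch, not a statement of the official Apache procedure. The 72-hour minimum voting period is not modelled.

```python
def release_vote_passes(binding_plus: int, binding_minus: int,
                        min_binding_plus: int = 3) -> bool:
    """Return True if the vote would be adopted under the assumed rule."""
    # At least `min_binding_plus` affirmative PMC (binding) votes are required.
    enough_approvals = binding_plus >= min_binding_plus
    # Assumed reading of "majority approval": more binding +1 than binding -1.
    has_majority = binding_plus > binding_minus
    return enough_approvals and has_majority


print(release_vote_passes(3, 0))  # True
print(release_vote_passes(2, 0))  # False
print(release_vote_passes(3, 4))  # False
```

Non-binding votes, such as those from contributors outside the PMC, are deliberately left out of the count in this sketch.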

Thanks,
Aljoscha

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
[2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc3/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1214
[5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=b505c0822edd2aed7fa22ed75eca40dca1a9de42
[6] https://github.com/apache/flink-web/pull/180
[7] https://github.com/apache/flink-web/pull/179

P.S. The difference to the previous RCs 1 and 2 is very small; you can fetch the tags and run "git log release-1.8.0-rc1..release-1.8.0-rc3" to see the difference in commits. It consists of fixes for the issues that led to the cancellation of the previous RCs, plus smaller fixes. Most verification/testing that was carried out should apply as is to this RC, so any functional verification that you did on previous RCs should carry over easily to this one.
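
For anyone who wants to script the comparison mentioned in the P.S., a minimal Python sketch follows. The helper names are hypothetical, and it assumes `git` is on the PATH and that `repo_dir` is a local clone of the Flink repository; the `--oneline` flag is only added here to keep the output short.

```python
import subprocess


def rc_log_command(old_tag: str, new_tag: str) -> list:
    """Build the `git log` call listing commits reachable from new_tag but not old_tag."""
    return ["git", "log", "--oneline", f"{old_tag}..{new_tag}"]


def commits_between(repo_dir: str, old_tag: str, new_tag: str) -> list:
    """Fetch tags, then return one summary line per commit in the range."""
    subprocess.run(["git", "fetch", "--tags"], cwd=repo_dir, check=True)
    result = subprocess.run(
        rc_log_command(old_tag, new_tag),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()


print(rc_log_command("release-1.8.0-rc1", "release-1.8.0-rc3"))
```

Calling `commits_between("/path/to/flink", "release-1.8.0-rc1", "release-1.8.0-rc3")` would then return the commit summaries, where the path is a placeholder for your own checkout.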


Re: [VOTE] Release 1.8.0, release candidate #3

jincheng sun
Hi Aljoscha&All,

When I ran the `end-to-end` tests for RC3 on macOS, I found the following two problems:

1. The check on the output of `minikube status` is not robust enough. The strings returned differ across minikube versions and platforms, which causes the following misjudgment:
When the `Command: start_kubernetes_if_not_ruunning failed` error occurs, minikube has actually started successfully. The root cause is a bug in the `test_kubernetes_embedded_job.sh` script. See FLINK-11971 for details.

2. Unlike 1.7.x, 1.8.x no longer bundles the `hadoop-shaded` JAR in the dist. The end-to-end tests therefore fail when Hadoop-related classes cannot be found, for example: `java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/FileSystem`. So we need to either improve the end-to-end test scripts or state explicitly in the README that the end-to-end tests need `flink-shaded-hadoop2-uber-XXXX.jar` on the classpath. See FLINK-11972 for details.
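
The two problems above could be guarded against with checks along the following lines. This is only a sketch in Python, not the fix from FLINK-11971 or FLINK-11972: the function names are hypothetical, the sample status strings are illustrative assumptions rather than output captured from a real minikube run, and `lib_dir` stands for whatever directory ends up on the test classpath.

```python
import glob
import os


def minikube_looks_running(status_output: str) -> bool:
    """Tolerant check: True if any "name: state" line reports Running."""
    for line in status_output.splitlines():
        # Split on the first colon only; lines without a colon are ignored.
        _name, sep, state = line.partition(":")
        if sep and state.strip().lower() == "running":
            return True
    return False


def find_shaded_hadoop_jars(lib_dir: str) -> list:
    """Return any flink-shaded-hadoop2-uber-*.jar files found in lib_dir."""
    pattern = os.path.join(lib_dir, "flink-shaded-hadoop2-uber-*.jar")
    return sorted(glob.glob(pattern))


# Assumed, illustrative status strings in two different layouts.
print(minikube_looks_running("host: Running\nkubelet: Running\n"))
print(minikube_looks_running("minikube: Running\ncluster: Running\n"))
print(minikube_looks_running("host: Stopped\n"))
```

Matching on the state value rather than on one exact status line is the design choice here: it tolerates differences in component names, at the cost of being less strict about which component is running.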

I don't think these are blockers for release-1.8.0, but it would be better to include those commits in release-1.8 if there are still performance-related bugs that need to be fixed.

What do you think?

Best,
Jincheng


In reply to Aljoscha Krettek's message of Tue, Mar 19, 2019 at 7:58 PM.


Re: [VOTE] Release 1.8.0, release candidate #3

Kurt Young
+1 (non-binding)

Checked items:
- checked checksums and GPG files
- verified that the source archives do not contain any binaries
- checked that all POM files point to the same version
- built from source successfully
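
Two of the checks listed above lend themselves to a small script. The Python sketch below is an illustration only: the function names are hypothetical, it assumes a checksum file whose first whitespace-separated token is the hex SHA-512 digest, and it does not cover GPG signature verification or the build itself.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def sha512_of(path: str) -> str:
    """Return the hex SHA-512 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, checksum_file: str) -> bool:
    """Compare a file's digest with the first token of its checksum file."""
    with open(checksum_file, encoding="utf-8") as handle:
        expected = handle.read().split()[0]
    return sha512_of(path) == expected.lower()


def pom_versions(root_dir: str) -> set:
    """Collect the version declared (or inherited) by every pom.xml under root_dir."""
    versions = set()
    for dirpath, _dirs, files in os.walk(root_dir):
        if "pom.xml" not in files:
            continue
        project = ET.parse(os.path.join(dirpath, "pom.xml")).getroot()
        # A module may declare its own version or inherit the parent's.
        version = project.findtext("m:version", namespaces=POM_NS)
        if version is None:
            version = project.findtext("m:parent/m:version", namespaces=POM_NS)
        # None ends up in the set if a POM declares neither, which flags a problem.
        versions.add(version)
    return versions
```

If `pom_versions` returns a set with exactly one element, all POM files found agree on the version.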

Best,
Kurt


In reply to jincheng sun's message of Wed, Mar 20, 2019 at 2:12 PM.


Re: [VOTE] Release 1.8.0, release candidate #3

Aljoscha Krettek
Thanks, Jincheng! It would be very good to fix those, but as you said, they are not blockers.

In reply to Kurt Young's message of 20 Mar 2019, 09:47.



Re: [VOTE] Release 1.8.0, release candidate #3

Piotr Nowojski-3
-1 from my side due to a performance regression present in the master branch since Jan 29th.

In 10% of JVM forks it was causing a huge performance drop in some of the benchmarks (up to 30-50% reduced throughput), which could mean that one out of 10 task managers is affected by it. Today we merged a fix for it [1]. The first benchmark run was promising [2], but we have to wait until tomorrow to make sure that the problem is definitely resolved. If that’s the case, I would recommend including the fix in 1.8.0, because we really do not know how big a performance regression this issue can be in real-world scenarios.
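
To get a feeling for what a 10% per-JVM rate could mean for a whole cluster, here is a back-of-the-envelope Python sketch. It assumes that each task manager JVM is affected independently with the same probability, which is an assumption for illustration and not something the benchmarks have established; the function name is hypothetical.

```python
def chance_any_affected(task_managers: int, per_jvm_rate: float = 0.10) -> float:
    """Probability that at least one of `task_managers` JVMs hits the slowdown."""
    # Independence assumption: each JVM is affected with probability per_jvm_rate.
    return 1.0 - (1.0 - per_jvm_rate) ** task_managers


for n in (1, 10, 50):
    print(n, round(chance_any_affected(n), 3))
```

Under that assumption, a job spanning many task managers is quite likely to include at least one slow JVM, which is why a 10% per-fork rate is hard to dismiss.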

Regarding the second regression, from mid-February: we have found the responsible commit, and this one is probably just a false positive. Because of the nature of some of the benchmarks, they run with a low number of records (300k). The apparent performance regression was caused by a higher initialisation time. When I temporarily increased the number of records to 2M, the regression was gone. Together with Till and Stefan Richter, we discussed the potential impact of this longer initialisation time (in the case of said benchmarks, initialisation time increased from 70ms to 120ms), and we think that it is not a critical issue and does not have to block the release. Nevertheless, there might be some follow-up work for this.
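
The effect described above, a fixed start-up cost showing up as a throughput regression only at low record counts, can be illustrated with a little arithmetic in Python. The 70ms and 120ms initialisation times and the 300k and 2M record counts come from this message; the steady-state rate of one million records per second is a purely hypothetical value, as is the function name.

```python
def apparent_drop(records: int, init_old_s: float, init_new_s: float,
                  records_per_s: float) -> float:
    """Relative throughput drop caused only by a longer initialisation phase."""
    processing_s = records / records_per_s
    old_total = init_old_s + processing_s
    new_total = init_new_s + processing_s
    # Throughput is records / total time, so the ratio reduces to old_total / new_total.
    return 1.0 - old_total / new_total


RATE = 1_000_000  # hypothetical steady-state rate, records per second
for n in (300_000, 2_000_000):
    print(n, f"{apparent_drop(n, 0.070, 0.120, RATE):.1%}")
```

With the hypothetical rate above, the same 50ms of extra initialisation looks like a drop of roughly 12% at 300k records but only about 2% at 2M records, which matches the qualitative picture of the regression fading as the run gets longer.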


Piotr Nowojski

In reply to Aljoscha Krettek's message of 20 Mar 2019, 10:09.




Re: [VOTE] Release 1.8.0, release candidate #3

jincheng sun
Thanks for the quick fix, Aljoscha! FLINK-11971 has been merged.

Cheers,
Jincheng

In reply to Piotr Nowojski's message of Thu, Mar 21, 2019 at 12:29 AM.




Re: [VOTE] Release 1.8.0, release candidate #3

Yu Li
-1. I observed a stable failure of the streaming bucketing end-to-end test case in two different environments (Linux/macOS) when running with both the shaded hadoop-2.8.3 JAR file and the hadoop-2.8.5 dist, while both environments pass with hadoop 2.6.5. For more details, please refer to this comment in FLINK-11972.

Best Regards,
Yu


On Thu, 21 Mar 2019 at 04:25, jincheng sun <[hidden email]> wrote:
Thanks for the quick fix Aljoscha! The FLINK-11971 has been merged.

Cheers,
Jincheng

Piotr Nowojski <[hidden email]> 于2019年3月21日周四 上午12:29写道:
-1 from my side due to performance regression found in the master branch since Jan 29th. 

In 10% JVM forks it was causing huge performance drop in some of the benchmarks (up to 30-50% reduced throughput), which could mean that one out of 10 task managers could be affected by it. Today we have merged a fix for it [1]. First benchmark run was promising [2], but we have to wait until tomorrow to make sure that the problem was definitely resolved. If that’s the case, I would recommend including it in 1.8.0, because we really do not know how big of performance regression this issue can be in the real world scenarios.

Regarding the second regression from mid February. We have found the responsible commit and this one is probably just a false positive. Because of the nature some of the benchmarks, they are running with low number of records (300k). The apparent performance regression was caused by higher initialisation time. When I temporarily increased the number of records to 2M, the regression was gone. Together with Till and Stefan Richter we discussed the potential impact of this longer initialisation time (in the case of said benchmarks initialisation time increased from 70ms to 120ms) and we think that it’s not a critical issue, that doesn’t have to block the release. Nevertheless there might some follow up work for this.


Piotr Nowojski

On 20 Mar 2019, at 10:09, Aljoscha Krettek <[hidden email]> wrote:

Thanks Jincheng! It would be very good to fix those but as you said, I would say they are not blockers.

On 20. Mar 2019, at 09:47, Kurt Young <[hidden email]> wrote:

+1 (non-binding)

Checked items:
- checked checksums and GPG files
- verified that the source archives do not contains any binaries
- checked that all POM files point to the same version
- build from source successfully 

Best,
Kurt


On Wed, Mar 20, 2019 at 2:12 PM jincheng sun <[hidden email]> wrote:
Hi Aljoscha & All,

When I ran the `end-to-end` tests for RC3 on Mac OS, I found the following two problems:

1. The check on the output of `minikube status` is not robust enough. The strings returned differ across versions and platforms, which causes the following misjudgment:
When the `Command: start_kubernetes_if_not_ruunning failed` error occurs, minikube has actually started successfully. The root cause is a bug in the `test_kubernetes_embedded_job.sh` script. See FLINK-11971 for details.

2. One difference between 1.8.0 and 1.7.x is that 1.8.x no longer bundles the `hadoop-shaded` JAR in the dist. As a result, the end-to-end tests fail when Hadoop-related classes cannot be found, for example: `java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/FileSystem`. So we need to improve the end-to-end test script, or state explicitly in the README that the end-to-end tests need `flink-shaded-hadoop2-uber-XXXX.jar` on the classpath. See FLINK-11972 for details.

I think these are not blockers for release-1.8.0, but I think it would be better to include those commits in release-1.8 if we still have performance-related bugs that need to be fixed.

What do you think?

Best,
Jincheng
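
As an illustration of the second problem above, a pre-flight check could look for the shaded Hadoop JAR before the end-to-end tests start. The sketch below is only an assumption about how such a check might look, not the fix that went into FLINK-11972: the function name is hypothetical, and it assumes the JAR is placed in a directory (such as the distribution's `lib` folder) that ends up on the classpath.

```python
from pathlib import Path
from typing import List


def find_shaded_hadoop_jars(lib_dir: Path) -> List[Path]:
    """List JARs in lib_dir matching the shaded Hadoop uber JAR name pattern.

    The pattern follows the `flink-shaded-hadoop2-uber-XXXX.jar` name from
    the message above; XXXX stands for a version that is not specified here.
    """
    return sorted(lib_dir.glob("flink-shaded-hadoop2-uber-*.jar"))


def hadoop_jar_present(lib_dir: Path) -> bool:
    """Return True if at least one matching JAR exists in lib_dir."""
    return len(find_shaded_hadoop_jars(lib_dir)) > 0
```

A test script could call `hadoop_jar_present(...)` first and print a clear message instead of failing later with a `NoClassDefFoundError`.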



Re: [VOTE] Release 1.8.0, release candidate #3

jincheng sun
Thanks for the quick fix, Yu. The PR for FLINK-11972 has been merged.

Cheers,
Jincheng

On Thu, 21 Mar 2019 at 07:23, Yu Li <[hidden email]> wrote:
-1. I observed a consistent failure of the streaming bucketing end-to-end test case in two different environments (Linux/macOS) when running with either the shaded hadoop-2.8.3 JAR file or the hadoop-2.8.5 dist, while both environments pass with Hadoop 2.6.5. For more details, please refer to this comment in FLINK-11972.

Best Regards,
Yu


On Thu, 21 Mar 2019 at 04:25, jincheng sun <[hidden email]> wrote:
Thanks for the quick fix, Aljoscha! FLINK-11971 has been merged.

Cheers,
Jincheng

On Thu, 21 Mar 2019 at 00:29, Piotr Nowojski <[hidden email]> wrote:
-1 from my side due to a performance regression found in the master branch since Jan 29th.

In 10% of JVM forks it was causing a huge performance drop in some of the benchmarks (up to 30-50% reduced throughput), which could mean that one out of 10 task managers could be affected by it. Today we have merged a fix for it [1]. The first benchmark run was promising [2], but we have to wait until tomorrow to make sure that the problem was definitely resolved. If that’s the case, I would recommend including it in 1.8.0, because we really do not know how big a performance regression this issue can be in real-world scenarios.

Regarding the second regression from mid-February: we have found the responsible commit, and this one is probably just a false positive. Because of the nature of some of the benchmarks, they run with a low number of records (300k). The apparent performance regression was caused by higher initialisation time. When I temporarily increased the number of records to 2M, the regression was gone. Together with Till and Stefan Richter, we discussed the potential impact of this longer initialisation time (in the case of said benchmarks, initialisation time increased from 70ms to 120ms), and we think that it’s not a critical issue and doesn’t have to block the release. Nevertheless, there might be some follow-up work for this.


Piotr Nowojski
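
To make the false-positive argument above concrete, here is a small worked calculation in Python. It assumes the measured run time is the initialisation time plus the time to process the records at a fixed steady-state rate. The rate of 1000 records per millisecond is purely hypothetical and not taken from the benchmarks; only the record counts (300k, 2M) and the initialisation times (70ms, 120ms) come from the message.

```python
def apparent_throughput(records: int, init_ms: float, records_per_ms: float) -> float:
    """Records per millisecond when initialisation time counts towards the run."""
    total_ms = init_ms + records / records_per_ms
    return records / total_ms


def relative_drop(records: int, old_init_ms: float, new_init_ms: float,
                  records_per_ms: float) -> float:
    """Fractional loss in apparent throughput caused only by slower initialisation."""
    before = apparent_throughput(records, old_init_ms, records_per_ms)
    after = apparent_throughput(records, new_init_ms, records_per_ms)
    return 1.0 - after / before


# Hypothetical steady-state rate; the real benchmark rate is not given in the thread.
ASSUMED_RATE = 1000.0  # records per millisecond

small_run = relative_drop(300_000, 70.0, 120.0, ASSUMED_RATE)
large_run = relative_drop(2_000_000, 70.0, 120.0, ASSUMED_RATE)
```

Under this assumed rate, the same 50ms of extra initialisation shows up as a drop of roughly 12% with 300k records but only about 2% with 2M records, which is consistent with the regression disappearing once the record count was raised.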


Re: [VOTE] Release 1.8.0, release candidate #3

Yu Li
Thanks @jincheng

@Aljoscha I've just opened FLINK-11990 for the HDFS BucketingSink issue with Hadoop 2.8. IMHO it might be a blocker for 1.8.0, and it needs your confirmation. Thanks.

Best Regards,
Yu



Re: [VOTE] Release 1.8.0, release candidate #3

Aljoscha Krettek
Hi Yu,

I commented on the issue. For me both Hadoop 2.8.3 and Hadoop 2.4.1 seem to work. Could you have a look at my comment?

I will also cancel this RC because of various issues.

Best,
Aljoscha


Re: [VOTE] Release 1.8.0, release candidate #3

Yu Li
Thanks for the message Aljoscha, let's discuss in JIRA (just replied there).

Best Regards,
Yu

