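Hi,
I'm trying to run a simple program which consumes from one Kinesis stream, does a simple transformation, and produces to another stream. Running on Flink 1.4.0.
Code can be seen here (if needed I can also paste it directly on this thread):
https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
Consuming the source stream works great, but trying to use the producer causes the exception:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
    at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
    ...
Has anyone seen something similar? Or is there any way to debug the daemon itself, to understand the source of the error?
As you can see, this is a trivial example, which I mostly copy-pasted from the documentation.
Thanks,
Alexey
|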
Hi.
We see the same error, and to my understanding it's a known error from Amazon. See https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
We don't have a workaround and haven't found the reason for the exception. It is one of the reasons why we are moving to Kafka in the near future.
Med venlig hilsen / Best regards Lasse Nedergaard
|
Hi,
This could be related:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
In short, the KPL library version used by default in the 1.4.x Kinesis connector is no longer supported by AWS. Users would need to use an upgraded version >= 0.12.6 and build the Kinesis connector for the producer to work.
We should probably add a warning about this in the Kinesis connector docs.
Cheers,
Gordon
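The version requirement above can be written down as a small check. This is only a minimal sketch and not part of Flink or the KPL: `KplVersionCheck`, `compareVersions` and `isSupported` are hypothetical names, the 0.12.6 threshold is the one quoted in this thread, and the comparison assumes purely numeric dotted versions (no qualifiers such as `-SNAPSHOT`).

```java
public class KplVersionCheck {
    // Minimum KPL version quoted in this thread.
    public static final String MIN_KPL_VERSION = "0.12.6";

    // Hypothetical helper: compares two numeric dotted versions segment by segment.
    // Missing segments count as 0, so "0.12.6" equals "0.12.6.0".
    public static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True when the given KPL version is at least MIN_KPL_VERSION.
    public static boolean isSupported(String kplVersion) {
        return compareVersions(kplVersion, MIN_KPL_VERSION) >= 0;
    }

    public static void main(String[] args) {
        // Example inputs only; substitute the version your connector build uses.
        System.out.println(isSupported("0.12.5")); // false
        System.out.println(isSupported("0.12.6")); // true
    }
}
```

A check like this could sit in a build or deployment script to flag a connector built against a KPL version below the threshold before the job is submitted.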
On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email]) wrote:
|
Thanks Gordon!
This indeed seems like the cause of the issue. I've run the program using 1.5.0, after building the appropriate connector, and it's working as expected.
I'm wondering how difficult it is to upgrade the 1.4 connector to a newer KPL version, as this kind of blocks running on EMR and producing to Kinesis... :-)
Alexey
On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
|
Porting and rebuilding 1.4.x isn't a big issue. I've done it on our fork, back when I reported the upcoming issue, and we're running fine.
https://github.com/SaleCycle/flink/commit/d943a172ae7e6618309b45df848d3b9432e062d4
Ignore the CircleCI file, of course; the rest are the changes that I backported from 1.5.
Dyana
On 2018/06/14 20:44:10, Alexey Tsitkin <[hidden email]> wrote:
> Thanks Gordon!
>
> This indeed seems like the cause of the issue.
> I've run the program using 1.5.0, after building the appropriate connector,
> and it's working as expected.
>
> I'm wondering how difficult it is to upgrade the 1.4 connector to a newer KPL
> version, as this kind of blocks running on EMR and producing to Kinesis...
> :-)
>
> Alexey
>
> On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
>
> > Hi,
> >
> > This could be related:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
> >
> > In short, the KPL library version used by default in the 1.4.x Kinesis
> > connector is no longer supported by AWS.
> > Users would need to use an upgraded version >= 0.12.6 and build the Kinesis
> > connector for the producer to work.
> >
> > We should probably add a warning about this in the Kinesis connector docs.
> >
> > Cheers,
> > Gordon
> >
> > On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email]) wrote:
> >
> > Hi.
> >
> > We see the same error, and to my understanding it's a known error from
> > Amazon. See
> > https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
> >
> > We don't have a workaround and haven't found the reason for the exception.
> > It is one of the reasons why we are moving to Kafka in the near future.
> >
> > Med venlig hilsen / Best regards
> > Lasse Nedergaard
> >
> > On 14 Jun 2018 at 20:24, Alexey Tsitkin <[hidden email]> wrote:
> >
> > Hi,
> > I'm trying to run a simple program which consumes from one Kinesis stream,
> > does a simple transformation, and produces to another stream.
> > Running on Flink 1.4.0.
> >
> > Code can be seen here (if needed I can also paste it directly on this
> > thread):
> >
> > https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
> >
> > Consuming the source stream works great, but trying to use the producer
> > causes the exception:
> >
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> > The child process has been shutdown and can no longer accept messages.
> >     at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
> >     at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
> >     at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
> >     ...
> >
> > Has anyone seen something similar?
> > Or is there any way to debug the daemon itself, to understand the source
> > of the error?
> >
> > As you can see, this is a trivial example, which I mostly copy-pasted from
> > the documentation.
> >
> > Thanks,
> > Alexey
|
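@Alexey
If you'd like to stick to 1.4.x for now, you can just do:
`mvn clean install -Daws.kinesis-kpl-version=0.12.6`
when building the Kinesis connector, to upgrade the KPL version used.
I think we should add this to the documentation. Here's a JIRA to track that: https://issues.apache.org/jira/browse/FLINK-9595
Cheers,
Gordon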
On 15 June 2018 at 9:48:33 AM, dyana.rose ([hidden email]) wrote:
|
I'm seeing a different exception when producing to Kinesis, which seems to have to do with back pressure handling:
java.lang.RuntimeException: An exception was thrown while processing a record: Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.
Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.
Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.
Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.
Record has reached expiration
When this exception occurs, the entire job restarts after cancelling all the workers. We have the restart configuration set to 3, so after 3 restarts the entire application dies.
We're also running 1.4.x on EMR, and rebuilding with “-Daws.kinesis-kpl-version=0.12.6” as suggested below didn't seem to help.
Is there a recommended solution to handle these kinds of exceptions without the entire application getting killed and without loss of data?
Thanks,
Turar
From: "Tzu-Li (Gordon) Tai" <[hidden email]>
@Alexey
If you'd like to stick to 1.4.x for now, you can just do:
`mvn clean install -Daws.kinesis-kpl-version=0.12.6`
when building the Kinesis connector, to upgrade the KPL version used.
I think we should add this to the documentation. Here's a JIRA to track that: https://issues.apache.org/jira/browse/FLINK-9595
Cheers,
Gordon
On 15 June 2018 at 9:48:33 AM, dyana.rose ([hidden email]) wrote:
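The `Rate exceeded ...` lines followed by `Record has reached expiration` in the report above may indicate records being throttled per shard until they time out inside the KPL. Below is a minimal sketch of producer properties that relate to this, under stated assumptions: `RateLimit` and `RecordTtl` are KPL configuration names and `aws.region` is the connector's region key, while the class name `KplProducerSettings`, the region and the values are placeholders for illustration. Whether a 1.4.x connector build forwards these KPL keys is not confirmed anywhere in this thread, so this is something to verify rather than a known fix.

```java
import java.util.Properties;

public class KplProducerSettings {
    // Hypothetical helper: builds the Properties object that would be handed
    // to the Kinesis producer. Values are passed as strings, as Properties expects.
    public static Properties producerConfig(String region, int rateLimitPercent, long recordTtlMillis) {
        if (rateLimitPercent <= 0 || recordTtlMillis <= 0) {
            throw new IllegalArgumentException("rate limit and record TTL must be positive");
        }
        Properties props = new Properties();
        // Region key used by the Flink Kinesis connector.
        props.setProperty("aws.region", region);
        // KPL setting: per-shard send rate as a percentage of the backend limit.
        props.setProperty("RateLimit", Integer.toString(rateLimitPercent));
        // KPL setting: how long (ms) a record may wait before it is failed as expired.
        props.setProperty("RecordTtl", Long.toString(recordTtlMillis));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, not recommendations from this thread.
        Properties props = producerConfig("us-east-1", 100, 60000L);
        System.out.println(props.getProperty("RateLimit"));
        System.out.println(props.getProperty("RecordTtl"));
    }
}
```

Lowering the rate limit trades throughput for fewer throttled puts, and a longer TTL gives throttled records more time to be retried; neither addresses the job restarts by itself.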
|