Problem producing to Kinesis

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Problem producing to Kinesis

Alexey Tsitkin
  Hi,
I'm trying to run a simple program which consumes from one kinesis stream, does a simple transformation, and produces to another stream.
Running on Flink 1.4.0.

Code can be seen here (if needed I can also paste it directly on this thread):
https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working

Consuming the source stream works great, but trying to use the producer causes the exception:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
    at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
    ...

Did anyone have something similar?
Or is there any way to debug the daemon itself, to understand the source of the error?

As you can see, this is a trivial example, which I mostly copy-pasted from the documentation.

Thanks,
Alexey
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

Lasse Nedergaard
Hi. 

We see the same error and to my understanding it’s a known error from Amazon. See. https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522

We don’t have a workaround and haven’t found the reason for the exception. It is one off the reason why we move to Kafka in the near future. 

Med venlig hilsen / Best regards
Lasse Nedergaard


Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]>:

  Hi,
I'm trying to run a simple program which consumes from one kinesis stream, does a simple transformation, and produces to another stream.
Running on Flink 1.4.0.

Code can be seen here (if needed I can also paste it directly on this thread):
https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working

Consuming the source stream works great, but trying to use the producer causes the exception:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
    at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
    ...

Did anyone have something similar?
Or is there any way to debug the daemon itself, to understand the source of the error?

As you can see, this is a trivial example, which I mostly copy-pasted from the documentation.

Thanks,
Alexey
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

Tzu-Li (Gordon) Tai
Hi,


Shortly put, the KPL library version used by default in the 1.4.x Kinesis connector, is no longer supported by AWS.
Users would need to use a upgraded version >= 0.12.6 and build the Kinesis connector for the producer to work.

We should probably add a warning about this in the Kinesis connector docs.

Cheers,
Gordon

On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email]) wrote:

Hi. 

We see the same error and to my understanding it’s a known error from Amazon. See. https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522

We don’t have a workaround and haven’t found the reason for the exception. It is one off the reason why we move to Kafka in the near future. 

Med venlig hilsen / Best regards
Lasse Nedergaard


Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]>:

  Hi,
I'm trying to run a simple program which consumes from one kinesis stream, does a simple transformation, and produces to another stream.
Running on Flink 1.4.0.

Code can be seen here (if needed I can also paste it directly on this thread):
https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working

Consuming the source stream works great, but trying to use the producer causes the exception:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
    at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
    ...

Did anyone have something similar?
Or is there any way to debug the daemon itself, to understand the source of the error?

As you can see, this is a trivial example, which I mostly copy-pasted from the documentation.

Thanks,
Alexey
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

Alexey Tsitkin
Thanks Gordon!

This indeed seems like the cause of the issue.
I've ran the program using 1.5.0, after building the appropriate connector, and it's working as expected.

Wondering how difficult is it to upgrade the 1.4 connector to a newer KPL version, as this kind of blocks running on EMR and producing to Kinesis... :-)

Alexey

On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi,


Shortly put, the KPL library version used by default in the 1.4.x Kinesis connector, is no longer supported by AWS.
Users would need to use a upgraded version >= 0.12.6 and build the Kinesis connector for the producer to work.

We should probably add a warning about this in the Kinesis connector docs.

Cheers,
Gordon

On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email]) wrote:

Hi. 

We see the same error and to my understanding it’s a known error from Amazon. See. https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522

We don’t have a workaround and haven’t found the reason for the exception. It is one off the reason why we move to Kafka in the near future. 

Med venlig hilsen / Best regards
Lasse Nedergaard


Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]>:

  Hi,
I'm trying to run a simple program which consumes from one kinesis stream, does a simple transformation, and produces to another stream.
Running on Flink 1.4.0.

Code can be seen here (if needed I can also paste it directly on this thread):
https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working

Consuming the source stream works great, but trying to use the producer causes the exception:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException: The child process has been shutdown and can no longer accept messages.
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)
    at org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)
    at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)
    ...

Did anyone have something similar?
Or is there any way to debug the daemon itself, to understand the source of the error?

As you can see, this is a trivial example, which I mostly copy-pasted from the documentation.

Thanks,
Alexey
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

dyana.rose@salecycle.com
Porting and rebuilding 1.4.x isn't a big issue. I've done it on our fork, back when I reported the upcoming issue and we're running fine.

https://github.com/SaleCycle/flink/commit/d943a172ae7e6618309b45df848d3b9432e062d4

Ignore the circleci file of course, and the rest are the changes that I back ported from 1.5

Dyana

On 2018/06/14 20:44:10, Alexey Tsitkin <[hidden email]> wrote:

> Thanks Gordon!
>
> This indeed seems like the cause of the issue.
> I've ran the program using 1.5.0, after building the appropriate connector,
> and it's working as expected.
>
> Wondering how difficult is it to upgrade the 1.4 connector to a newer KPL
> version, as this kind of blocks running on EMR and producing to Kinesis...
> :-)
>
> Alexey
>
> On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]>
> wrote:
>
> > Hi,
> >
> > This could be related:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
> > .
> >
> > Shortly put, the KPL library version used by default in the 1.4.x Kinesis
> > connector, is no longer supported by AWS.
> > Users would need to use a upgraded version >= 0.12.6 and build the Kinesis
> > connector for the producer to work.
> >
> > We should probably add a warning about this in the Kinesis connector docs.
> >
> > Cheers,
> > Gordon
> >
> > On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email])
> > wrote:
> >
> > Hi.
> >
> > We see the same error and to my understanding it’s a known error from
> > Amazon. See.
> > https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
> >
> > We don’t have a workaround and haven’t found the reason for the exception.
> > It is one off the reason why we move to Kafka in the near future.
> >
> > Med venlig hilsen / Best regards
> > Lasse Nedergaard
> >
> >
> > Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]
> > >:
> >
> >   Hi,
> > I'm trying to run a simple program which consumes from one kinesis stream,
> > does a simple transformation, and produces to another stream.
> > Running on Flink 1.4.0.
> >
> > Code can be seen here (if needed I can also paste it directly on this
> > thread):
> >
> > https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
> >
> > Consuming the source stream works great, but trying to use the producer
> > causes the exception:
> >
> > *org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> > The child process has been shutdown and can no longer accept messages.*
> > *    at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)*
> > *    at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)*
> > *    at
> > org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)*
> > *    ...*
> >
> >
> > Did anyone have something similar?
> > Or is there any way to debug the daemon itself, to understand the source
> > of the error?
> >
> > As you can see, this is a trivial example, which I mostly copy-pasted from
> > the documentation.
> >
> > Thanks,
> > Alexey
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

Tzu-Li (Gordon) Tai
@Alexey

If you’d like to stick to 1.4.x for now, you can just do:
`mvn clean install -Daws.kinesis-kpl-version=0.12.6` when building the Kinesis connector, to upgrade the KPL version used.

I think we should add this to the documentation. Here’s a JIRA to track that - https://issues.apache.org/jira/browse/FLINK-9595.

Cheers,
Gordon

On 15 June 2018 at 9:48:33 AM, dyana.rose ([hidden email]) wrote:

Porting and rebuilding 1.4.x isn't a big issue. I've done it on our fork, back when I reported the upcoming issue and we're running fine.

https://github.com/SaleCycle/flink/commit/d943a172ae7e6618309b45df848d3b9432e062d4

Ignore the circleci file of course, and the rest are the changes that I back ported from 1.5

Dyana

On 2018/06/14 20:44:10, Alexey Tsitkin <[hidden email]> wrote:

> Thanks Gordon!
>
> This indeed seems like the cause of the issue.
> I've ran the program using 1.5.0, after building the appropriate connector,
> and it's working as expected.
>
> Wondering how difficult is it to upgrade the 1.4 connector to a newer KPL
> version, as this kind of blocks running on EMR and producing to Kinesis...
> :-)
>
> Alexey
>
> On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]>
> wrote:
>
> > Hi,
> >
> > This could be related:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
> > .
> >
> > Shortly put, the KPL library version used by default in the 1.4.x Kinesis
> > connector, is no longer supported by AWS.
> > Users would need to use a upgraded version >= 0.12.6 and build the Kinesis
> > connector for the producer to work.
> >
> > We should probably add a warning about this in the Kinesis connector docs.
> >
> > Cheers,
> > Gordon
> >
> > On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email])
> > wrote:
> >
> > Hi.
> >
> > We see the same error and to my understanding it’s a known error from
> > Amazon. See.
> > https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
> >
> > We don’t have a workaround and haven’t found the reason for the exception.
> > It is one off the reason why we move to Kafka in the near future.
> >
> > Med venlig hilsen / Best regards
> > Lasse Nedergaard
> >
> >
> > Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]
> > >:
> >
> > Hi,
> > I'm trying to run a simple program which consumes from one kinesis stream,
> > does a simple transformation, and produces to another stream.
> > Running on Flink 1.4.0.
> >
> > Code can be seen here (if needed I can also paste it directly on this
> > thread):
> >
> > https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
> >
> > Consuming the source stream works great, but trying to use the producer
> > causes the exception:
> >
> > *org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> > The child process has been shutdown and can no longer accept messages.*
> > * at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)*
> > * at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)*
> > * at
> > org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)*
> > * ...*
> >
> >
> > Did anyone have something similar?
> > Or is there any way to debug the daemon itself, to understand the source
> > of the error?
> >
> > As you can see, this is a trivial example, which I mostly copy-pasted from
> > the documentation.
> >
> > Thanks,
> > Alexey
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Problem producing to Kinesis

Sandybayev, Turar (CAI - Atlanta)

I’m seeing a different exception when producing to Kinesis, which seems to do with back pressure handling:

 

java.lang.RuntimeException: An exception was thrown while processing a record: Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.

Rate exceeded for shard shardId-000000000026 in stream turar-test-output under account 9999.

Record has reached expiration

 

 

When this exception occurs the entire job restarts after cancelling all the workers. We have restart configuration set to 3, so after 3 restarts the entire application dies. We’re also running 1.4.x on EMR, and rebuilding with “-Daws.kinesis-kpl-version=0.12.6” as suggested below didn’t seem to help.

 

Is there a recommended solution to handle these kinds of exceptions without the entire application getting killed and without loss of data?

 

Thanks,

Turar

 

 

From: "Tzu-Li (Gordon) Tai" <[hidden email]>
Date: Friday, June 15, 2018 at 4:48 AM
To: "dyana.rose" <[hidden email]>, "[hidden email]" <[hidden email]>
Subject: Re: Problem producing to Kinesis

 

@Alexey

 

If you’d like to stick to 1.4.x for now, you can just do:

`mvn clean install -Daws.kinesis-kpl-version=0.12.6` when building the Kinesis connector, to upgrade the KPL version used.

 

I think we should add this to the documentation. Here’s a JIRA to track that - https://issues.apache.org/jira/browse/FLINK-9595.

 

Cheers,

Gordon

On 15 June 2018 at 9:48:33 AM, dyana.rose ([hidden email]) wrote:

Porting and rebuilding 1.4.x isn't a big issue. I've done it on our fork, back when I reported the upcoming issue and we're running fine.

https://github.com/SaleCycle/flink/commit/d943a172ae7e6618309b45df848d3b9432e062d4

Ignore the circleci file of course, and the rest are the changes that I back ported from 1.5

Dyana

On 2018/06/14 20:44:10, Alexey Tsitkin <[hidden email]> wrote:
> Thanks Gordon!
>
> This indeed seems like the cause of the issue.
> I've ran the program using 1.5.0, after building the appropriate connector,
> and it's working as expected.
>
> Wondering how difficult is it to upgrade the 1.4 connector to a newer KPL
> version, as this kind of blocks running on EMR and producing to Kinesis...
> :-)
>
> Alexey
>
> On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai <[hidden email]>
> wrote:
>
> > Hi,
> >
> > This could be related:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701
> > .
> >
> > Shortly put, the KPL library version used by default in the 1.4.x Kinesis
> > connector, is no longer supported by AWS.
> > Users would need to use a upgraded version >= 0.12.6 and build the Kinesis
> > connector for the producer to work.
> >
> > We should probably add a warning about this in the Kinesis connector docs.
> >
> > Cheers,
> > Gordon
> >
> > On 14 June 2018 at 9:16:04 PM, Lasse Nedergaard ([hidden email])
> > wrote:
> >
> > Hi.
> >
> > We see the same error and to my understanding it’s a known error from
> > Amazon. See.
> > https://github.com/awslabs/amazon-kinesis-producer/issues/39#issuecomment-396219522
> >
> > We don’t have a workaround and haven’t found the reason for the exception.
> > It is one off the reason why we move to Kafka in the near future.
> >
> > Med venlig hilsen / Best regards
> > Lasse Nedergaard
> >
> >
> > Den 14. jun. 2018 kl. 20.24 skrev Alexey Tsitkin <[hidden email]
> > >:
> >
> > Hi,
> > I'm trying to run a simple program which consumes from one kinesis stream,
> > does a simple transformation, and produces to another stream.
> > Running on Flink 1.4.0.
> >
> > Code can be seen here (if needed I can also paste it directly on this
> > thread):
> >
> > https://stackoverflow.com/questions/50847164/flink-producing-to-kinesis-not-working
> >
> > Consuming the source stream works great, but trying to use the producer
> > causes the exception:
> >
> > *org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.DaemonException:
> > The child process has been shutdown and can no longer accept messages.*
> > * at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.Daemon.add(Daemon.java:176)*
> > * at
> > org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:477)*
> > * at
> > org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer.invoke(FlinkKinesisProducer.java:248)*
> > * ...*
> >
> >
> > Did anyone have something similar?
> > Or is there any way to debug the daemon itself, to understand the source
> > of the error?
> >
> > As you can see, this is a trivial example, which I mostly copy-pasted from
> > the documentation.
> >
> > Thanks,
> > Alexey
> >
> >
>