Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Vijay Balakrishnan
Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,
Reply | Threaded
Open this post in threaded view
|

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Tzu-Li (Gordon) Tai
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon

On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,
Reply | Threaded
Open this post in threaded view
|

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Vijay Balakrishnan
Hi Gordon,

My use-case was slightly different.

1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. Only a few Records were written to the Kinesis stream
3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then after a period of time of not reading anymore Kinesis Stream records, it received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Also, in 1 with LATEST  as the startup position, I have not been able to read any records from the Kinesis Stream.Still trying to pinpoint what i am doing wrong. For sure, I am not using checkpoints and not sure if this causes any issues with LATEST option.
TIA,
Vijay

On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon

On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,
Reply | Threaded
Open this post in threaded view
|

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Tzu-Li (Gordon) Tai
Hi,

I’m suspecting that this is the issue: https://issues.apache.org/jira/browse/FLINK-11164.

One more thing to clarify to be sure of this:
Do you have multiple shards in the Kinesis stream, and if yes, are some of them actually empty?
Meaning that, even though you mentioned some records were written to the Kinesis stream, some shards actually weren’t written any records.

Cheers,
Gordon


On 14 December 2018 at 4:10:30 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi Gordon,

My use-case was slightly different.

1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. Only a few Records were written to the Kinesis stream
3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then after a period of time of not reading anymore Kinesis Stream records, it received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Also, in 1 with LATEST  as the startup position, I have not been able to read any records from the Kinesis Stream.Still trying to pinpoint what i am doing wrong. For sure, I am not using checkpoints and not sure if this causes any issues with LATEST option.
TIA,
Vijay

On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon

On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,
Reply | Threaded
Open this post in threaded view
|

Re: Encountered the following Expired Iterator exception in getRecords using FlinkKinesisConsumer

Vijay Balakrishnan
I have 2 shards in the Kinesis Streams- need to figure out how to check from the logs if records are being written to both shards .
Not sure if this is what you are looking for in terms of # of shards read- seems like 1 from the logs below:

DEBUG org.apache.http.wire                                         [] - http-outgoing-12 << "{"took":2,"errors":false,"items":[{"index":{"_index":"mlist-5sec-comp-idx","_type":"mlist_5sec_comp_schema","_id":"66zEqWcBmN9Gpk7gdac7","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":34,"_primary_term":1,"status":201}}]}"

14:51:23,842 [I/O dispatcher 36] DEBUG org.apache.http.wire                                         [] - http-outgoing-9 << "{"took":1,"errors":false,"items":[{"index":{"_index":"mlist-5sec-inst-idx","_type":"mlist_5sec_inst_schema","_id":"7KzEqWcBmN9Gpk7gdadA","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":43,"_primary_term":1,"status":201}}]}"


On Fri, Dec 14, 2018 at 12:28 AM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi,

I’m suspecting that this is the issue: https://issues.apache.org/jira/browse/FLINK-11164.

One more thing to clarify to be sure of this:
Do you have multiple shards in the Kinesis stream, and if yes, are some of them actually empty?
Meaning that, even though you mentioned some records were written to the Kinesis stream, some shards actually weren’t written any records.

Cheers,
Gordon


On 14 December 2018 at 4:10:30 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi Gordon,

My use-case was slightly different.

1.  Started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. Only a few Records were written to the Kinesis stream
3. The FlinkKinesisConsumer reads the records from Kinesis stream. Then after a period of time of not reading anymore Kinesis Stream records, it received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Also, in 1 with LATEST  as the startup position, I have not been able to read any records from the Kinesis Stream.Still trying to pinpoint what i am doing wrong. For sure, I am not using checkpoints and not sure if this causes any issues with LATEST option.
TIA,
Vijay

On Thu, Dec 13, 2018 at 2:59 AM Tzu-Li (Gordon) Tai <[hidden email]> wrote:
Hi!

Thanks for reporting this.

This looks like an overlooked corner case that the Kinesis connector doesn’t handle properly.

First, let me clarify the case and how it can be reproduced. Please let me know if the following is correct:
1. You started a Kinesis connector source, with TRIM_HORIZON as the startup position.
2. No records were written to the Kinesis stream at all.
3. After a period of time, you received the “Encountered an unexpected expired iterator” warning in the logs, and the job failed with the misleading AmazonKinesisException?

Cheers,
Gordon

On 13 December 2018 at 6:53:11 AM, Vijay Balakrishnan ([hidden email]) wrote:

Hi,
Using FlinkKinesisConsumer in a long running Flink Streaming app consuming from a Kinesis Stream. 
Encountered the following Expired Iterator exception in getRecords():
 org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer [] - Encountered an unexpected expired iterator 
 
The error on the console ends up being a misleading one: "Caused by: org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.model.AmazonKinesisException: 1 validation error detected: Value 'EARLIEST_SEQUENCE_NUM' at 'startingSequenceNumber' failed to satisfy constraint: Member must satisfy regular expression pattern: 0|([1-9]\d{0,128}) (Service: AmazonKinesis; Status Code: 400; Error Code: ValidationException; Request ID: ..)
"
 
How do I increase the ClientConfiguration.clientExecutiontimeout to avoid this issue or is this the right way to handle this issue ? I don't want the FlinkKinesisConsumer streaming app to fail just because there might be no records in the Kinesis Stream. I am using TRIM_HORIZON to read from the start of the Kinesis Stream.
 
 TIA,