Re: Flink DynamoDB stream connector losing records
Posted by
Ying Xu on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Flink-DynamoDB-stream-connector-losing-records-tp38020p38096.html
Sorry for the delayed reply. When you mention certain records getting skipped, is it from the same run or across different runs. Any more specific details on how/when records are lost?
FlinkDynamoDBStreamsConsumer is built on top of FlinkKinesisConsumer , with similar offset management mechanism. In theory it shouldn't lose exactly-once semantics in the case of getting throttled. We haven't run it in any AWS kinesis analytics environment though.
Thanks.
On Thu, Sep 10, 2020 at 7:51 AM Andrey Zagrebin <
[hidden email]> wrote:
Generally speaking this should not be a problem for exactly-once but I am not familiar with the DynamoDB and its Flink connector.
Did you observe any failover in Flink logs?
And I suspect I have throttled by DynamoDB stream, I contacted AWS support but got no response except for increasing WCU and RCU.
Is it possible that Flink will lose exactly-once semantics when throttled?
Hi Andrey,
Thanks for your suggestion, but I'm using Kinesis analytics application which supports only Flink 1.8....
Regards,
Jiawei
On Thu, Sep 10, 2020 at 10:13 PM Andrey Zagrebin <
[hidden email]> wrote:
Hi Jiawei,
Could you try Flink latest release 1.11?
1.8 will probably not get bugfix releases.
I will cc Ying Xu who might have a better idea about the DinamoDB source.
Best,
Andrey
Hi,
I'm using AWS kinesis analytics application with Flink 1.8. I am using the FlinkDynamoDBStreamsConsumer to consume DynamoDB stream records. But recently I found my internal state is wrong.
After I printed some logs I found some DynamoDB stream record are skipped and not consumed by Flink. May I know if someone encountered the same issue before? Or is it a known issue in Flink 1.8?
Thanks,
Jiawei