(DEPRECATED) Apache Flink User Mailing List archive.

Checkpoint Space Usage Debugging

Kent Murra

Checkpoint Space Usage Debugging

I'm looking into a situation where our checkpoint sizes keep growing over time. I haven't been able to pinpoint why this is happening, and it would be great if there were a way to figure out how much checkpoint space is attributable to each operator so I can narrow it down. Are there any tools or methods for introspecting the checkpoint data so that I can determine where the space is going?

The pipeline in question consumes from Kinesis and batches up data using windows. I suspected that I was doing something wrong with windowing, but I'm emitting FIRE_AND_PURGE and also setting a max end timestamp. The Kinesis consumer is not emitting watermarks at the moment, but as far as I know that's not necessary for proper checkpointing (only for exactly-once behavior).

Yun Tang

Re: Checkpoint Space Usage Debugging

Hi Kent

You can view the checkpoint details in the web UI to see how much checkpointed data was uploaded for each operator. Comparing the state size across checkpoints over time will show whether each operator's checkpointed data stays within a stable range.
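
The same per-operator numbers can also be pulled from the REST API that backs the web UI, which makes the comparison over time easier to script. Below is a minimal sketch in Python. It assumes the JobManager REST endpoint is reachable at a base URL such as http://localhost:8081, and that the responses of /jobs/<jobid>/checkpoints and /jobs/<jobid>/checkpoints/details/<checkpointid> carry the history, status, tasks and state_size fields. The helper names are hypothetical, and the field names should be checked against the REST API documentation for your Flink version.

```python
import json
from urllib.request import urlopen


def fetch_json(base_url, path):
    """GET a JSON document from the REST endpoint (base_url is an assumption)."""
    with urlopen(f"{base_url}{path}") as resp:
        return json.load(resp)


def fetch_checkpoint_details(base_url, job_id):
    """Fetch the detail document of every completed checkpoint in the history."""
    overview = fetch_json(base_url, f"/jobs/{job_id}/checkpoints")
    ids = sorted(c["id"] for c in overview["history"] if c["status"] == "COMPLETED")
    return [
        fetch_json(base_url, f"/jobs/{job_id}/checkpoints/details/{cid}")
        for cid in ids
    ]


def state_size_by_task(details):
    """Map each job vertex id to a list of (checkpoint id, state size in bytes)."""
    sizes = {}
    # Process checkpoints oldest first so each series is in time order.
    for d in sorted(details, key=lambda d: d["id"]):
        for vertex_id, task in d["tasks"].items():
            sizes.setdefault(vertex_id, []).append((d["id"], task["state_size"]))
    return sizes


def growth_report(details):
    """Return (vertex id, first size, last size, delta), largest growth first."""
    report = []
    for vertex_id, series in state_size_by_task(details).items():
        first, last = series[0][1], series[-1][1]
        report.append((vertex_id, first, last, last - first))
    return sorted(report, key=lambda r: r[3], reverse=True)
```

For example, growth_report(fetch_checkpoint_details("http://localhost:8081", job_id)) would list the job vertices whose state grew the most across the retained checkpoint history. The keys are job vertex IDs rather than operator names; the vertices list in the /jobs/<jobid> response should let you map one to the other.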

Best

Yun Tang

From: Kent Murra <[hidden email]>
Sent: Saturday, April 18, 2020 1:47
To: [hidden email] <[hidden email]>
Subject: Checkpoint Space Usage Debugging