If a flink pipeline with a kafka source was to be re stored from a checkpoint, would a partition be assigned to a the same sub task index ? It does not seem so, but wanted to confirm. We are trying to simulate an edge case where we have dangling files. We create these files by forcing 2 or more restores between 2 consecutive checkpoints. These files are created after the first checkpoint but are rendered redundant by another restore before the next checkpoint. We want to be sure that all data in these files are accounted for by checking for the records in resolved files but were not sure we should check the ones with the same task index or all sub indexes...
for example, if we have 2 sub task index ids
if _in-progress-part-1-1 is a dangling/redundant file would that data be in part-1-2 ( the next finalized file but with the same sub task id ) or could it be in part-2-2 plus for example ? If the partitions were to maintain affinity, we would imagine that it would be former but wanted to confirm.
Regards.