Flink io files

Flink io files

Flavio Pompermaier
Hi to all,

What information can I infer from the files contained in the flink-io-xxx dir?
What does each channel file represent, and what does it mean when there are a lot of zero-size files and a few very big files?

Best,
Flavio

Re: Flink io files

Stephan Ewen
Hey!

These files are the spilled data from a sort, a hash table, or a cache, when memory runs short.
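To see how the spilled data is distributed across files, a small script can summarize the directory. This is only a sketch: the function name `summarize_spill_dir` is made up for illustration, and it assumes the spill files sit directly inside the directory you pass in. It reads file sizes only, not the file contents.

```python
import os


def summarize_spill_dir(path):
    """Return simple size statistics for the regular files directly in `path`.

    `path` is whatever flink-io-xxx directory you want to inspect; the exact
    location depends on your setup and is not assumed here.
    """
    # Collect (size, name) pairs so that sorting puts the largest file last.
    sizes = sorted(
        (entry.stat().st_size, entry.name)
        for entry in os.scandir(path)
        if entry.is_file()
    )
    return {
        "files": len(sizes),
        "zero_size": sum(1 for size, _ in sizes if size == 0),
        "total_bytes": sum(size for size, _ in sizes),
        "largest": sizes[-1] if sizes else None,  # (size, name) or None
    }
```

A high `zero_size` count next to a `largest` entry that accounts for most of `total_bytes` is the pattern described in the question.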

If you have some very big files and some zero-size ones, I would guess you are running a hash join with heavy skew in the distribution of the keys.
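The effect of key skew on partition sizes can be illustrated with a toy simulation. This is not Flink's hash join code; it is a minimal sketch that assumes records are assigned to partitions by a hash of the key, with `key % num_partitions` standing in for a real hash function. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition.

    `key % num_partitions` is a stand-in for a real hash function, so the
    keys must be integers in this sketch.
    """
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Skewed input: one "hot" key dominates, two rarer keys follow.
skewed_keys = [0] * 9_000 + [1] * 500 + [2] * 500
# Uniform input: every key appears exactly once.
uniform_keys = list(range(10_000))

skewed = partition_sizes(skewed_keys, 8)    # [9000, 500, 500, 0, 0, 0, 0, 0]
uniform = partition_sizes(uniform_keys, 8)  # [1250, 1250, ..., 1250]

if __name__ == "__main__":
    print("skewed: ", skewed)
    print("uniform:", uniform)
```

With the skewed input, one partition holds 90% of the records and five partitions stay empty. If each partition is spilled to its own file (an assumption of this sketch), that would look like a few very big files next to many empty ones.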

Greetings,
Stephan


On Wed, Oct 21, 2015 at 12:50 PM, Flavio Pompermaier <[hidden email]> wrote:
Hi to all,

What information can I infer from the files contained in the flink-io-xxx dir?
What does each channel file represent, and what does it mean when there are a lot of zero-size files and a few very big files?

Best,
Flavio