Checkpoint ?

Jim Langston

Hi all,

I have a long-running streaming app saving checkpoints to the file system.

What is the layout of the checkpoint directory? My current checkpoint directory has >2000 directories in it, similar to this:

chk-4645

Also, the directory has grown to >3GB.

I have a small cluster, and all nodes were started at the same time; nothing has been restarted. This is occurring on one of the nodes. The others have about the same number of directories in the checkpoint directory, but they are not nearly as large.

Why are there so many chk-xxxx directories? And why can they become so large? Is there something I should be setting in the yaml file?

I was going to just remove them, but it just struck me as odd that there are so many …

Thanks

Jim
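
Editorial sketch (not part of the original post): to compare nodes in the way the question describes, one could count the chk-xxxx subdirectories and total their sizes. The Python below is a minimal sketch under stated assumptions: the function names `dir_size_bytes` and `summarize_checkpoints` are hypothetical, and the only layout it relies on is the `chk-<number>` naming quoted in the post. It inspects the directory only and removes nothing.

```python
import os
import re

# Assumption: checkpoint subdirectories are named "chk-<number>",
# as in the "chk-4645" example from the post.
CHK_PATTERN = re.compile(r"^chk-(\d+)$")


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def summarize_checkpoints(checkpoint_dir):
    """Return a list of (checkpoint_id, size_bytes) sorted by checkpoint id.

    Entries that do not match the chk-<number> pattern are ignored.
    """
    summary = []
    for entry in os.listdir(checkpoint_dir):
        match = CHK_PATTERN.match(entry)
        full = os.path.join(checkpoint_dir, entry)
        if match and os.path.isdir(full):
            summary.append((int(match.group(1)), dir_size_bytes(full)))
    return sorted(summary)
```

Calling `summarize_checkpoints` on each node's checkpoint directory and comparing the number of entries and the sum of the sizes would show whether one node holds unusually large chk-xxxx directories or simply more of them.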

Re: Checkpoint ?

Aljoscha Krettek
Hi Jim,

What are your checkpointing settings? Are you checkpointing to a distributed file system, such as HDFS or S3, or to the local file system? The latter should not be used in a production setting, and I would not expect it to work properly (unless the local file system is actually a network-mounted file system).

Best,
Aljoscha
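
Editorial sketch (not part of the original reply): the distinction drawn above, between a distributed file system and the local one, can often be read off the scheme of the configured checkpoint URI. The Python below is a rough sketch under assumptions: the function name `is_local_checkpoint_uri` and the example URIs are hypothetical, and an empty scheme or `file` is treated as local. It cannot tell whether a local path is actually a network mount, which is the exception mentioned in the reply, and it does not handle Windows drive-letter paths.

```python
from urllib.parse import urlparse

# Assumption: an empty scheme (plain path) or "file" means the local file system.
LOCAL_SCHEMES = {"", "file"}


def is_local_checkpoint_uri(uri):
    """Return True if the URI's scheme suggests a local file system path."""
    return urlparse(uri).scheme in LOCAL_SCHEMES
```

For example, `is_local_checkpoint_uri("file:///tmp/checkpoints")` is true, while a URI with an `hdfs` or `s3` scheme is reported as non-local.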

(Replying to Jim Langston's message of 15 May 2017, 17:05, shown above.)