Re: Checking for existence of output directory/files before running a batch job
Posted by Niels Basjes
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Checking-for-existance-of-output-directory-files-before-running-a-batch-job-tp8573p8598.html
Yes, that did the trick. Thanks.
I was using a relative path without any filesystem scheme.
So my path was "foo", and on the cluster this resolves to "hdfs:///user/nbasjes/foo".
Locally it resolves to "file:///home/nbasjes/foo", hence the mismatch I was seeing.
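To illustrate the mismatch, here is a minimal sketch that uses only java.net.URI from the JDK, not Flink's own path handling. The class name RelativePathDemo and the helper qualify() are hypothetical, and the two base directories are simply taken from the paths quoted above. It only shows that the same relative path ends up on a different filesystem depending on the base it is resolved against, while a path that already carries a scheme is left alone.

```java
import java.net.URI;

public class RelativePathDemo {

    // Hypothetical helper: if the path already has a scheme, keep it;
    // otherwise resolve it against the given base directory URI.
    static URI qualify(String path, URI baseDir) {
        URI p = URI.create(path);
        return p.isAbsolute() ? p : baseDir.resolve(p);
    }

    public static void main(String[] args) {
        // Assumed base directories, taken from the paths in this message.
        URI clusterBase = URI.create("hdfs:///user/nbasjes/");
        URI localBase = URI.create("file:///home/nbasjes/");

        // The same relative path lands on two different filesystems.
        URI onCluster = qualify("foo", clusterBase);
        URI onLocal = qualify("foo", localBase);
        System.out.println(onCluster.getScheme() + " " + onCluster.getPath());
        System.out.println(onLocal.getScheme() + " " + onLocal.getPath());

        // A path with an explicit scheme is the same regardless of the base.
        URI explicit = qualify("hdfs:///user/nbasjes/foo", localBase);
        System.out.println(explicit.getScheme() + " " + explicit.getPath());
    }
}
```

An existence check done on the client against the local base would therefore look at a different location than the one the job writes to on the cluster, unless the path is fully qualified.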
For now I can work with this fine.
Still, I think a 'getFileSystem()' method on the ExecutionEnvironment instance, returning the actual filesystem against which my job is going to be executed, would solve this more easily. That way I could use a relative path (e.g. "foo") and run the job anywhere (local, Yarn, Mesos, etc.) without any problems.
What do you guys think?
Is this desirable? Possible?
Niels.