Checking for existence of output directory/files before running a batch job

Posted by Niels Basjes on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Checking-for-existance-of-output-directory-files-before-running-a-batch-job-tp8573.html

Hi,

I have a batch job that I run on YARN that creates files in HDFS.
I want to avoid running this job at all if the output already exists.

So in my code (before submitting the job into yarn-session) I do this:

    String directoryName = "foo";
    Path directory = new Path(directoryName);
    FileSystem fs = directory.getFileSystem();

    if (!fs.exists(directory)) {
        // run the job
    }

What I found is that this code apparently checks the 'wrong' file system: fs.exists() always returns 'false', even when the directory exists in HDFS.
I checked the API of the execution environment, but I was unable to get the 'correct' file system from there.
What is the proper way to check this?
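
One possible explanation, stated as an assumption and not a confirmed diagnosis: a path such as "foo" carries no scheme, so the client may resolve it against its default file system (often the local one) and never look at HDFS. The sketch below uses only java.net.URI to show the difference between a scheme-less path and a fully qualified one. The class name PathSchemeCheck, its helper methods, and the address hdfs://namenode:8020 are hypothetical and chosen for illustration only.

```java
import java.net.URI;

public class PathSchemeCheck {

    // True when the path string names its file system explicitly,
    // for example "hdfs://host:port/dir". A bare "foo" has no scheme.
    static boolean hasExplicitScheme(String path) {
        return URI.create(path).getScheme() != null;
    }

    // Resolves a path against a fully qualified base URI.
    // A path that is already an absolute URI is returned unchanged.
    static String qualify(String baseUri, String path) {
        return URI.create(baseUri).resolve(path).toString();
    }

    public static void main(String[] args) {
        // Hypothetical NameNode address and base directory.
        String base = "hdfs://namenode:8020/data/";

        for (String p : new String[] {"foo", "/foo", "hdfs://namenode:8020/foo"}) {
            System.out.println(p
                    + " explicit scheme: " + hasExplicitScheme(p)
                    + ", qualified: " + qualify(base, p));
        }
    }
}
```

If this assumption holds, passing the qualified string (instead of the bare "foo") to new Path(...) would make the existence check target HDFS explicitly, independent of the client-side default.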

--
Best regards / Met vriendelijke groeten,

Niels Basjes