Monitoring folder in flink

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Monitoring folder in flink

flinkuser101
This post was updated on .
I have folder where new files arrive at schedule. Why is my flink readfile not reading new files. I have used but *PROCESS_ONCE* and *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same file but the execution does not terminate whereas for PROCESS_ONCE it terminates in IDE.

    String path = "C:\\test";
   
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
   
    TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(path));
   
    DataStream<String> inputStream = env.readFile(format, path, FileProcessingMode.PROCESS_ONCE, 100);
Reply | Threaded
Open this post in threaded view
|

Re: Monitoring folder in flink

Fabian Hueske-2
Hi,

with PROCESS_CONTINUOUSLY the application monitors the directory and processes new arriving files or files that have been modified. In this case the application never terminates because it is waiting for new files to appear.
With PROCESS_ONCE, the content of a directory is processed as it was when the application was started. Once all files are processed the application terminates.

What kind of behavior are you looking for?

Best, Fabian

2017-10-22 9:37 GMT+02:00 Sugandha Amatya <[hidden email]>:
I have folder where new files arrive at schedule. Why is my flink readfile not reading new files. I have used but *PROCESS_ONCE* and *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same file but the execution does not terminate whereas for PROCESS_ONCE it terminates in IDE.

    String path = "C:\\test";
    
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
    TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(path));
    
    DataStream<String> inputStream = env.readFile(format, path, FileProcessingMode.PROCESS_ONCE, 100);

Here at stackoverflow.

Reply | Threaded
Open this post in threaded view
|

Re: Monitoring folder in flink

flinkuser101
Hi

I found that flink polls directory based on modified date. In windows when I copy files the modified date remained same. So, PROCESS_CONTINUOUSLY resolved the issue.

On Tue, Oct 24, 2017 at 6:09 PM, Fabian Hueske <[hidden email]> wrote:
Hi,

with PROCESS_CONTINUOUSLY the application monitors the directory and processes new arriving files or files that have been modified. In this case the application never terminates because it is waiting for new files to appear.
With PROCESS_ONCE, the content of a directory is processed as it was when the application was started. Once all files are processed the application terminates.

What kind of behavior are you looking for?

Best, Fabian

2017-10-22 9:37 GMT+02:00 Sugandha Amatya <[hidden email]>:
I have folder where new files arrive at schedule. Why is my flink readfile not reading new files. I have used but *PROCESS_ONCE* and *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same file but the execution does not terminate whereas for PROCESS_ONCE it terminates in IDE.

    String path = "C:\\test";
    
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
    TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(path));
    
    DataStream<String> inputStream = env.readFile(format, path, FileProcessingMode.PROCESS_ONCE, 100);

Here at stackoverflow.


Reply | Threaded
Open this post in threaded view
|

Re: Monitoring folder in flink

☼ R Nair (रविशंकर नायर)
In reply to this post by flinkuser101
Can you please share the full code? 

Thanks, RAV

On Oct 22, 2017 3:37 AM, "Sugandha Amatya" <[hidden email]> wrote:
I have folder where new files arrive at schedule. Why is my flink readfile not reading new files. I have used but *PROCESS_ONCE* and *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same file but the execution does not terminate whereas for PROCESS_ONCE it terminates in IDE.

    String path = "C:\\test";
    
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
    TextInputFormat format = new TextInputFormat(new org.apache.flink.core.fs.Path(path));
    
    DataStream<String> inputStream = env.readFile(format, path, FileProcessingMode.PROCESS_ONCE, 100);

Here at stackoverflow.

Reply | Threaded
Open this post in threaded view
|

Re: Monitoring folder in flink

flinkuser101
The code I have pasted is all that I have.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/