 
	
					
		
	
					| Hi all, running the example at http://flink.incubator.apache.org/docs/0.7-incubating/local_execution.html, I expected writeAsText on a local file to create a text file on my local filesystem. Instead it creates something similar to a sequence file (within a folder). I think this is misleading: either the API name is wrong or this is a bug (IMHO). By the way, how can I modify the following program to write the results to a single text file on my local filesystem? 
 | 
 
	
					
		
	
					| Dear Flavio, Yes, the writeAsText() method really creates a folder that contains one file per execution thread, so your threads do not block each other and the execution can use multiple cores on your machine. You can see similar results if you try it with env.execute() from an IDE. There are filesystems, HDFS being the most prominent one, that can transparently treat such a folder structure as a single file, and there it would behave as you expect. I hope this answers your question. Best, Marton On Wed, Oct 29, 2014 at 8:31 PM, Flavio Pompermaier <[hidden email]> wrote: 
 | 
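If the job has already been written with several parallel writers, the per-thread files in the output folder can be concatenated afterwards. The following is a minimal sketch in plain Java, not a Flink API: the class `PartFileMerger` and its `merge` method are hypothetical names introduced here for illustration, and the sketch assumes that the output folder contains only the part files and that the target file lies outside that folder.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of Flink): merges part files into one text file. */
final class PartFileMerger {

    private PartFileMerger() {}

    /**
     * Concatenates every regular file directly inside outputDir into target,
     * visiting the files in file-name order, and returns the merged lines.
     * Assumes target is NOT located inside outputDir.
     */
    static List<String> merge(Path outputDir, Path target) throws IOException {
        List<Path> parts;
        // Files.list keeps a directory handle open, so close the stream explicitly.
        try (Stream<Path> entries = Files.list(outputDir)) {
            parts = entries
                    .filter(Files::isRegularFile)
                    .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                    .collect(Collectors.toList());
        }

        List<String> merged = new ArrayList<>();
        for (Path part : parts) {
            merged.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
        }

        // Write all collected lines to the single target file.
        Files.write(target, merged, StandardCharsets.UTF_8);
        return merged;
    }
}
```

The file names are sorted as strings, so a part named `10` comes before `2`. Since parallel writers give no ordering guarantee across files anyway, this only makes the result deterministic. The sketch also holds all lines in memory, which is fine for small local experiments but not for large outputs.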
 
	
					
		
	
					| Would it be that difficult to change the behaviour for file:/// and create a single file? Or is there a way to do that? On Oct 29, 2014 9:52 PM, "Márton Balassi" <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| You can set the DOP (degree of parallelism) of the data sink to 1 [1]. There is also a config parameter that controls whether a directory is created in the case of DOP=1. If I remember correctly, the default is to NOT create a folder for DOP=1. Best, Fabian 2014-10-29 22:22 GMT+01:00 Flavio Pompermaier <[hidden email]>: 
 | 
 
	
					
		
	
					| 
				In reply to this post by Flavio Pompermaier
			 Just use setParallelism() on the sink with a value of 1. This specifies how many threads are used for the operator, so it will give you a single output file. Cheers, On Wed, Oct 29, 2014 at 10:22 PM, Flavio Pompermaier <[hidden email]> wrote: 
 | 
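The suggestion to set the sink's parallelism to 1 could look roughly like the sketch below. It assumes the Flink Java DataSet API is on the classpath; the class name `SingleFileOutput`, the sample records, and the output path `file:///tmp/result.txt` are made up for illustration, and exact behaviour may differ between Flink versions.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

// Hypothetical example class; requires the Flink Java API as a dependency.
public class SingleFileOutput {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment();

        // Illustrative input; any DataSet<String> would do.
        DataSet<String> lines = env.fromElements("to be", "or not to be");

        // writeAsText() returns the data sink; limiting the sink to a single
        // writer means only one output file is produced (assumed path).
        lines.writeAsText("file:///tmp/result.txt").setParallelism(1);

        env.execute();
    }
}
```

Only the sink runs with a single writer here; the rest of the program keeps its configured parallelism. Whether the result is a plain file or a folder containing one file depends on the directory-creation setting mentioned earlier in the thread.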
 
	
					
		
	
					| 
				In reply to this post by Fabian Hueske
			 Regarding the text vs. sequence output: writeAsText() emits each record using its toString() method, which should be the String itself in your case. So if it writes binary data, something is wrong... 2014-10-29 22:34 GMT+01:00 Fabian Hueske <[hidden email]>: 
 | 
 
	
					
		
	
					| Hi Flavio, any updates on this bug? 2014-10-29 22:36 GMT+01:00 Fabian Hueske <[hidden email]>: 
 | 
 
	
					
		
	
					| Nope. To me this is actually a bug; I don't know what the Flink community or committee think. On Mon, Nov 3, 2014 at 11:52 AM, Fabian Hueske <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| OK, I assume the problem of creating multiple files (+ output directory) is fixed by setting the DOP of the OutputFormat to 1, right? But do you still get binary output with a TextOutputFormat that writes a DataSet<String>? 2014-11-03 11:58 GMT+01:00 Flavio Pompermaier <[hidden email]>: 
 | 
 
	
					
		
	
					| Hey! Parallel outputs require multiple output files. The only way to make this a single file by default is to set the default parallelism of file outputs to 1. That would cause many surprises on cluster execution, actually. It may be a fair compromise to set the default parallelism of sinks to 1 if the execution environment is the local environment. Stephan On Mon, Nov 3, 2014 at 12:06 PM, Fabian Hueske <[hidden email]> wrote: 
 | 
 
	
					
		
	
					| That is not a big problem, it should just be well documented :) On Mon, Nov 3, 2014 at 12:09 PM, Stephan Ewen <[hidden email]> wrote: 
 | 
 
	

 
	
	
		
