Re: Problems with RollingSink

Posted by Kostas Kloudas on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Problems-with-RollingSink-tp10351p10371.html

Hi Diego,

Could you not prefix each stream's part files with a different
string, so that the paths do not collide?

If I understand your use-case correctly, this might work.
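
To illustrate the idea, here is a rough sketch in plain Java. It is only a simplified model, not Flink code: it assumes part files are named <base path>/<bucket>/<prefix>-<subtask index>-<counter>, which matches the shape of the path in the exception below apart from the file suffix. The class PartPathDemo and the prefixes "streamA", "streamB" and "streamC" are made up for the example. In the real job the prefix would be set on each sink (I believe RollingSink exposes setPartPrefix for this, but please check the Javadoc for your version).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of part-file naming; NOT Flink's actual implementation.
class PartPathDemo {

    // Assumed layout: <basePath>/<bucket>/<prefix>-<subtaskIndex>-<counter>
    static String partPath(String basePath, String bucket, String prefix,
                           int subtask, int counter) {
        return basePath + "/" + bucket + "/" + prefix + "-" + subtask + "-" + counter;
    }

    // One sink per entry in 'prefixes'; returns every first part-file path
    // that more than one sink would try to create.
    static Set<String> collisions(String basePath, String bucket,
                                  List<String> prefixes, int parallelism) {
        Set<String> seen = new HashSet<>();
        Set<String> clashes = new TreeSet<>();
        for (String prefix : prefixes) {
            for (int subtask = 0; subtask < parallelism; subtask++) {
                String path = partPath(basePath, bucket, prefix, subtask, 0);
                if (!seen.add(path)) {
                    clashes.add(path); // a second sink wants the same file
                }
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        String base = "/user/biguardian/events";
        String bucket = "2016-11-28--15";

        // Three sinks sharing one prefix: they compete for the same files.
        List<String> shared = Arrays.asList("flinkpart", "flinkpart", "flinkpart");
        System.out.println("shared prefix:     " + collisions(base, bucket, shared, 1));

        // Three sinks with hypothetical per-stream prefixes: no shared paths.
        List<String> distinct = Arrays.asList("streamA", "streamB", "streamC");
        System.out.println("distinct prefixes: " + collisions(base, bucket, distinct, 1));
    }
}
```

With a shared prefix every sink targets the same part file in the same bucket directory, whereas distinct prefixes keep all three streams under one HDFS path without any overlap.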

Cheers,
Kostas

On Nov 29, 2016, at 10:04 AM, Diego Fustes Villadóniga <[hidden email]> wrote:

Hi Kostas,
 
Thanks for your reply.
 
The problem occurs at the initialization of the job. The cause was that I was using the same HDFS path as the sink for 3 different streams, which is something that I would like to keep. I can fix it by using a different path for each stream.
 
Maybe there is a way to achieve this in a different manner by joining the streams somehow before sinking… maybe through Kafka?
 
Kind Regards,
 
Diego
 
 
From: Kostas Kloudas [[hidden email]] 
Sent: Monday, November 28, 2016 19:13
To: [hidden email]
Subject: Re: Problems with RollingSink
 
Hi Diego,
 
The message shows that two tasks are trying to touch the same file concurrently.
 
Is this message thrown upon recovery after a failure, or at the initialization of the job?
Could you please check the logs for other exceptions before this?
 
Can this be related to this issue?
 
Thanks,
Kostas
 
On Nov 28, 2016, at 5:37 PM, Diego Fustes Villadóniga <[hidden email]> wrote:
 
Hi colleagues,
 
I am experiencing problems when trying to write events from a stream to HDFS. I get the following exception:
 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /user/biguardian/events/2016-11-28--15/flinkpart-0-0.text for DFSClient_NONMAPREDUCE_1634980080_43 for client 172.21.40.75 because current leaseholder is trying to recreate file.
 
My Flink version is 1.1.3 and I am running it directly from a JAR (not in YARN) with java -jar.
 
Do you know the reason for this error?
 
Kind regards,
 
Diego