RE: Problems with RollingSink

Posted by Diego Fustes Villadóniga
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Problems-with-RollingSink-tp10351p10361.html

Hi Kostas,

 

Thanks for your reply.

 

The problem occurs at the initialization of the job. The cause was that I was using the same HDFS path as the sink for 3 different streams, which is something that I would like to do. I can fix it by using a different path for each stream.
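
For reference, a minimal sketch of the per-path fix (the stream names, paths, and job structure below are hypothetical, not taken from this thread):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.fs.RollingSink;

    public class PerStreamSinks {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical stand-ins for the three real streams.
            DataStream<String> streamA = env.fromElements("a");
            DataStream<String> streamB = env.fromElements("b");
            DataStream<String> streamC = env.fromElements("c");

            // One base path per sink, so no two sink subtasks ever try to
            // create the same part file in the same HDFS directory.
            streamA.addSink(new RollingSink<String>("hdfs:///user/biguardian/events-a"));
            streamB.addSink(new RollingSink<String>("hdfs:///user/biguardian/events-b"));
            streamC.addSink(new RollingSink<String>("hdfs:///user/biguardian/events-c"));

            env.execute("per-stream RollingSinks");
        }
    }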

 

Maybe there is a way to achieve this in a different manner, by somehow joining the streams before they reach the sink… maybe through Kafka?
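
One Kafka-free way to do this inside Flink is DataStream.union, which merges streams of the same type so that a single RollingSink owns the output path. A minimal sketch, again with hypothetical stream names:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.fs.RollingSink;

    public class UnionedSink {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> streamA = env.fromElements("a");
            DataStream<String> streamB = env.fromElements("b");
            DataStream<String> streamC = env.fromElements("c");

            // union merges same-typed streams into one DataStream, so a
            // single sink instance writes the shared path and there is
            // no contention for part files.
            streamA.union(streamB, streamC)
                   .addSink(new RollingSink<String>("hdfs:///user/biguardian/events"));

            env.execute("unioned RollingSink");
        }
    }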

 

Kind Regards,

 

Diego

From: Kostas Kloudas [mailto:[hidden email]]
Sent: Monday, November 28, 2016 19:13
To: [hidden email]
Subject: Re: Problems with RollingSink

 

Hi Diego,

 

The message shows that two tasks are concurrently trying to create the same file.
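
As background (this detail comes from the RollingSink Javadoc rather than from the thread itself): each sink subtask opens part files named prefix-subtaskIndex-counter inside the current time bucket, so two sink operators that share a base path compute identical file paths. A simplified illustration:

    public class PartFileNaming {
        public static void main(String[] args) {
            String basePath = "/user/biguardian/events"; // shared by both sinks
            String bucket = "2016-11-28--15";            // time-based bucket directory
            String prefix = "part";                      // default part-file prefix
            int subtaskIndex = 0;                        // first parallel subtask
            int counter = 0;                             // rolling file counter

            // Simplified from RollingSink's documented naming scheme: subtask 0
            // of two independent sinks builds exactly the same path, and the
            // second HDFS create() fails with AlreadyBeingCreatedException.
            String partPath = basePath + "/" + bucket + "/"
                    + prefix + "-" + subtaskIndex + "-" + counter;
            System.out.println(partPath); // /user/biguardian/events/2016-11-28--15/part-0-0
        }
    }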

 

Is this message thrown upon recovery after a failure, or at the initialization of the job?

Could you please check the logs for other exceptions before this?

 

Could this be related to the following issue?

https://www.mail-archive.com/issues@.../msg73871.html

 

Thanks,

Kostas

 

On Nov 28, 2016, at 5:37 PM, Diego Fustes Villadóniga <[hidden email]> wrote:

 

Hi colleagues,

 

I am experiencing problems when trying to write events from a stream to HDFS. I get the following exception:

 

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): failed to create file /user/biguardian/events/2016-11-28--15/flinkpart-0-0.text for DFSClient_NONMAPREDUCE_1634980080_43 for client 172.21.40.75 because current leaseholder is trying to recreate file.

 

My Flink version is 1.1.3, and I am running the job directly from a JAR (not on YARN) with java -jar.

 

Do you know the reason for this error?

 

Kind regards,

 

Diego