Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi

I have an issue when running the following line using Flink v0.8.1:
    val asdf = new AvroTypeInfo[AlignmentRecord](classOf[AlignmentRecord])

AlignmentRecord belongs to the package:
org.bdgenomics.formats.avro.AlignmentRecord

When I try to run it, I receive the following exception:

Exception in thread "main" java.lang.IndexOutOfBoundsException
at org.apache.flink.api.java.typeutils.PojoTypeInfo.getPojoFieldAt(PojoTypeInfo.java:178)
at org.apache.flink.api.java.typeutils.AvroTypeInfo.generateFieldsFromAvroSchema(AvroTypeInfo.java:52)
at org.apache.flink.api.java.typeutils.AvroTypeInfo.<init>(AvroTypeInfo.java:38)
at flinkTest.FlinkBAMReader$.main(FlinkBAMReader.scala:74)

While debugging I noticed that the generated AvroTypeInfo instance has a fields array with 42 elements, while its totalFields property has the value 52 (please find a screenshot attached), which seems to be the cause of the exception. Could you please help me determine what may be wrong with the parser? Is this a bug in the AvroTypeInfo class, or is the AlignmentRecord class somehow corrupted?
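The arity/total-fields mismatch can be illustrated with a small standalone sketch (an illustration of the counting scheme only, not Flink's actual implementation; the record shape below is hypothetical): a flat field array keeps one entry per top-level field, while a "total fields" count walks leaf fields of nested records recursively, so the two numbers diverge as soon as records nest.

```java
import java.util.List;

public class FieldCountSketch {
    interface Field {}
    record Leaf() implements Field {}
    record Rec(List<Field> fields) implements Field {}

    // Flat count: one per top-level field (size of the fields array).
    static int flatCount(Rec r) { return r.fields().size(); }

    // Recursive count: every leaf field of every nested record.
    static int totalCount(Rec r) {
        int n = 0;
        for (Field f : r.fields()) {
            n += (f instanceof Rec nested) ? totalCount(nested) : 1;
        }
        return n;
    }

    public static void main(String[] args) {
        Rec nested = new Rec(List.of(new Leaf(), new Leaf()));       // 2 leaves
        Rec top = new Rec(List.of(new Leaf(), nested, new Leaf()));  // 3 top-level fields
        // Indexing the flat array with a position derived from the
        // recursive count (3 vs. 4 here) is exactly the kind of mismatch
        // that can surface as an IndexOutOfBoundsException.
        System.out.println(flatCount(top));   // 3
        System.out.println(totalCount(top));  // 4
    }
}
```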

Best Regards/Pozdrawiam,
Filip Łęczycki



[Attachment: Zrzut ekranu z 2015-03-15 14:05:27.png (256K), screenshot]

Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Maximilian Michels
Hi Filip,

I think your issue is best dealt with on the user mailing list. Unfortunately, you can't use attachments on the mailing lists. So if you want to post a screenshot you'll have to upload it somewhere else (e.g. http://imgur.com/).

I can confirm your error. Would you mind using the 0.9.0-milestone release? Just change the Flink version in the Maven pom.xml to 0.9.0-milestone-1. Your example works with this version.
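For reference, the version bump would look roughly like this in the project's pom.xml (a sketch; the artifactId shown is an assumption — apply the version to whichever Flink modules the project already depends on):

```xml
<!-- Sketch: bump the Flink dependency to the 0.9.0 milestone release. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-scala</artifactId>
  <version>0.9.0-milestone-1</version>
</dependency>
```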

Best regards,
Max




Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi guys,

Thank you very much for your help; upgrading to the 0.9.0 milestone resolved the issue, but a new one has arisen. When I try to run the following code:

val job = Job.getInstance(new Configuration(true))
ParquetInputFormat.setReadSupportClass(job, classOf[AvroReadSupport[AlignmentRecord]])
val file = env.readHadoopFile[Void, AlignmentRecord](new ParquetInputFormat[AlignmentRecord], classOf[Void], classOf[AlignmentRecord], path1, job)
file.map(R => R._2.getCigar()).first(1000).print()

I receive the following error:
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:306)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:91)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:174)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)

The file is loaded properly and I am able to print out its content, but when I try to do anything more complex (like .getCigar) the above exception occurs. Moreover, a simple mapping that worked in the previous version:
seqQualSet.map(R=>R.qual.get).map(R=>(R.sum[Int]/R.size, 1L))
    .groupBy(0).sum(1)
    .map(R=>(1,R._1,R._2))
    .groupBy(0)
    .sortGroup(1, Order.ASCENDING)
    .first(100)
    .map(R=>(R._2,R._3))

no longer works after the upgrade and causes the same error as above.

Could you please advise me on this? If you need more information to determine the issue, I will gladly provide it.

Regards,
Filip Łęczycki




Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Stephan Ewen
Hi!

From a quick look at the code, it seems that this is a follow-up exception that occurs because the task has been shut down and the buffer pools have been destroyed.

Is there another exception that is the actual root cause of the failure?

Greetings,
Stephan





Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi Stephan,

You are right, sorry for not including this in the initial mail.

I am receiving the information below:

04/26/2015 17:13:43 DataSink (Print to System.out)(1/1) switched to FINISHED 
04/26/2015 17:13:43 CHAIN DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.createInput(ExecutionEnvironment.scala:356) (org.apache.flink.api.scala.hadoop.mapreduce.HadoopInpu) -> Map (Map at flinkTest.FlinkBAMReader$.main(FlinkBAMReader.scala:78))(1/4) switched to FAILED 
java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:174)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)

04/26/2015 17:13:43 Job execution switched to status FAILING.
04/26/2015 17:13:43 Job execution switched to status FAILED.

Best regards,
Filip





Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Stephan Ewen
Hi Filip!

We pushed a fix to master at the end of last week, which addressed an issue with premature buffer pool deallocation.

Can you check whether the latest version fixes your problem?

Greetings,
Stephan

