Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi

I have an issue when running the following line using Flink v0.8.1:
    val asdf = new AvroTypeInfo[AlignmentRecord](classOf[AlignmentRecord])

AlignmentRecord belongs to the package:
org.bdgenomics.formats.avro.AlignmentRecord

When I try to run it, I receive the following exception:

Exception in thread "main" java.lang.IndexOutOfBoundsException
at org.apache.flink.api.java.typeutils.PojoTypeInfo.getPojoFieldAt(PojoTypeInfo.java:178)
at org.apache.flink.api.java.typeutils.AvroTypeInfo.generateFieldsFromAvroSchema(AvroTypeInfo.java:52)
at org.apache.flink.api.java.typeutils.AvroTypeInfo.<init>(AvroTypeInfo.java:38)
at flinkTest.FlinkBAMReader$.main(FlinkBAMReader.scala:74)

While debugging I noticed that the generated AvroTypeInfo instance has a fields array with 42 elements, while its totalFields property has the value 52 (please find a screenshot attached), which seems to be the cause of the exception. Could you please help me determine what may be wrong with the parser? Is this a bug in the AvroTypeInfo class, or is the AlignmentRecord class somehow corrupted?
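The arity/total-fields mismatch can be illustrated with a small standalone sketch (an illustration of the counting scheme only, not Flink's actual implementation; the record shape below is hypothetical): a flat field array keeps one entry per top-level field, while a "total fields" count walks leaf fields of nested records recursively, so the two numbers diverge as soon as records nest.

```java
import java.util.List;

public class FieldCountSketch {
    interface Field {}
    record Leaf() implements Field {}
    record Rec(List<Field> fields) implements Field {}

    // Flat count: one per top-level field (size of the fields array).
    static int flatCount(Rec r) { return r.fields().size(); }

    // Recursive count: every leaf field of every nested record.
    static int totalCount(Rec r) {
        int n = 0;
        for (Field f : r.fields()) {
            n += (f instanceof Rec nested) ? totalCount(nested) : 1;
        }
        return n;
    }

    public static void main(String[] args) {
        Rec nested = new Rec(List.of(new Leaf(), new Leaf()));       // 2 leaves
        Rec top = new Rec(List.of(new Leaf(), nested, new Leaf()));  // 3 top-level fields
        // Indexing the flat array with a position derived from the
        // recursive count (3 vs. 4 here) is exactly the kind of mismatch
        // that can surface as an IndexOutOfBoundsException.
        System.out.println(flatCount(top));   // 3
        System.out.println(totalCount(top));  // 4
    }
}
```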

Best Regards/Pozdrawiam,
Filip Łęczycki



[Attachment: Zrzut ekranu z 2015-03-15 14:05:27.png (256K), screenshot]

Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Maximilian Michels
Hi Filip,

I think your issue is best dealt with on the user mailing list. Unfortunately, you can't use attachments on the mailing lists. So if you want to post a screenshot you'll have to upload it somewhere else (e.g. http://imgur.com/).

I can confirm your error. Would you mind using the 0.9.0-milestone release? Just change the Flink version in the Maven pom.xml to 0.9.0-milestone-1. Your example works with this version.
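For reference, the version bump would look roughly like this in the project's pom.xml (a sketch; the artifactId shown is an assumption — apply the version to whichever Flink modules the project already depends on):

```xml
<!-- Sketch: bump the Flink dependency to the 0.9.0 milestone release. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-scala</artifactId>
  <version>0.9.0-milestone-1</version>
</dependency>
```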

Best regards,
Max




Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi guys,

Thank you very much for your help; upgrading to the 0.9.0 milestone resolved the issue, but a new one has arisen. When I try to run the following code:

val job = Job.getInstance(new Configuration(true))
ParquetInputFormat.setReadSupportClass(job, classOf[AvroReadSupport[AlignmentRecord]])
val file = env.readHadoopFile[Void, AlignmentRecord](new ParquetInputFormat[AlignmentRecord], classOf[Void], classOf[AlignmentRecord], path1, job)
file.map(R => R._2.getCigar()).first(1000).print()

I receive the following error:
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$receiveWithLogMessages$1.applyOrElse(JobManager.scala:306)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:91)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:174)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)

The file is loaded properly and I am able to print out its content, but when I try to do anything more complex (like .getCigar) the above exception occurs. Moreover, a simple mapping that worked in the previous version:
seqQualSet.map(R=>R.qual.get).map(R=>(R.sum[Int]/R.size, 1L))
    .groupBy(0).sum(1)
    .map(R=>(1,R._1,R._2))
    .groupBy(0)
    .sortGroup(1, Order.ASCENDING)
    .first(100)
    .map(R=>(R._2,R._3))

no longer works after the upgrade and causes the same error as above.

Could you please advise me on this? If you need more information to determine the issue, I will gladly provide it.

Regards,
Filip Łęczycki




Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Stephan Ewen
Hi!

From a quick look at the code, it seems that this is a follow-up exception that occurs because the task has been shut down and the buffer pools have been destroyed.

Is there another exception that is the actual root cause of the failure?

Greetings,
Stephan





Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Filip Łęczycki
Hi Stephan,

You are right, sorry for not including this in the initial mail.

I am receiving the information below:

04/26/2015 17:13:43 DataSink (Print to System.out)(1/1) switched to FINISHED 
04/26/2015 17:13:43 CHAIN DataSource (at org.apache.flink.api.scala.ExecutionEnvironment.createInput(ExecutionEnvironment.scala:356) (org.apache.flink.api.scala.hadoop.mapreduce.HadoopInpu) -> Map (Map at flinkTest.FlinkBAMReader$.main(FlinkBAMReader.scala:78))(1/4) switched to FAILED 
java.lang.NullPointerException
at org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.setNextBuffer(SpanningRecordSerializer.java:93)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:92)
at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
at org.apache.flink.runtime.operators.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:78)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:174)
at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
at java.lang.Thread.run(Thread.java:744)

04/26/2015 17:13:43 Job execution switched to status FAILING.
04/26/2015 17:13:43 Job execution switched to status FAILED.

Best regards,
Filip





Re: Flink - Avro - AvroTypeInfo issue - Index out of bounds exception

Stephan Ewen
Hi Filip!

We pushed a fix to master at the end of last week, which addressed an issue with premature buffer pool deallocation.

Can you check whether the latest version fixes your problem?

Greetings,
Stephan

