Projection exception with "Thrifted" Parquet data

Flavio Pompermaier
Hi to all,


If I do the following, everything works fine:

DataSet<Tuple2<Void, Person>> input = readThrift(env, "newpath");
input.print();


But if I do:

DataSet<Tuple2<Void, Person>> input = readThrift(env, "newpath");
input.project(1).print();

I get this exception:


Exception in thread "main" org.apache.flink.optimizer.CompilerException: Error translating node 'Map "Projection [1]" : MAP [[ GlobalProperties [partitioning=RANDOM_PARTITIONED] ]] [[ LocalProperties [ordering=null, grouped=null, unique=null] ]]': Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.IOException: org.apache.thrift.protocol.TProtocolException: Required field 'name' was not present! Struct: Person(name:null, id:0, phone:null)
    at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:360)
    at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:103)
    at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:198)
    at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
    at org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127)
    at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:170)
    at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:176)
    at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:54)
    at flink.parquet.ParquetThriftExample.main(ParquetThriftExample.java:70)
Caused by: org.apache.flink.runtime.operators.util.CorruptConfigurationException: Could not write the user code wrapper class org.apache.flink.api.common.operators.util.UserCodeObjectWrapper : java.io.IOException: org.apache.thrift.protocol.TProtocolException: Required field 'name' was not present! Struct: Person(name:null, id:0, phone:null)
    at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:275)
    at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.createSingleInputVertex(JobGraphGenerator.java:803)
    at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:305)
    ... 8 more
Caused by: java.io.IOException: org.apache.thrift.protocol.TProtocolException: Required field 'name' was not present! Struct: Person(name:null, id:0, phone:null)
    at flink.parquet.thrift.Person.writeObject(Person.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1495)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1547)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1508)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
    at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:314)
    at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:268)
    at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:273)
    ... 10 more
Caused by: org.apache.thrift.protocol.TProtocolException: Required field 'name' was not present! Struct: Person(name:null, id:0, phone:null)
    at flink.parquet.thrift.Person.validate(Person.java:632)
    at flink.parquet.thrift.Person.write(Person.java:557)
    at flink.parquet.thrift.Person.writeObject(Person.java:639)
    ... 34 more

Thanks in advance,
Flavio


Re: Projection exception with "Thrifted" Parquet data

Stephan Ewen
Hi!

It seems that the "project()" operator wants to write an empty instance of type "Person" as part of the serialized function object, which is not possible for Thrift because the empty instance is missing its required fields.

We can remove that instance, since it is not really needed. It was intended to be a reusable object instance, but it has become obsolete.
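The failure mode described here can be reproduced without Flink or Thrift. The following is only a minimal sketch under stated assumptions: the `Person` class is a hand-written stand-in for the Thrift-generated struct (it validates its required field inside `writeObject`, as the stack trace suggests the generated class does), and `ProjectorWithInstance` / `ProjectorWithoutInstance` are hypothetical names, not Flink's actual projection classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ProjectionSerializationSketch {

    /** Hypothetical stand-in for a Thrift-generated struct with a required field. */
    static class Person implements Serializable {
        String name;   // treated as "required" in this sketch
        int id;
        String phone;

        void validate() throws IOException {
            if (name == null) {
                throw new IOException("Required field 'name' was not present!");
            }
        }

        // Java serialization hook: validate before writing, so an empty
        // instance cannot be serialized.
        private void writeObject(ObjectOutputStream out) throws IOException {
            validate();
            out.defaultWriteObject();
        }
    }

    /** Hypothetical function object that carries an empty, reusable Person as a regular field. */
    static class ProjectorWithInstance implements Serializable {
        final Person reuse = new Person();
    }

    /** Same idea, but the instance is kept out of the serialized form. */
    static class ProjectorWithoutInstance implements Serializable {
        transient Person reuse;
    }

    /** Returns true if Java serialization of the object succeeds. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new ProjectorWithInstance()));    // prints "false"
        System.out.println(canSerialize(new ProjectorWithoutInstance())); // prints "true"
    }
}
```

In this sketch the field is marked `transient` to keep it out of the serialized form; dropping the field altogether, as proposed above, has the same effect on serialization.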

Stephan


(In reply to Flavio Pompermaier's message of Thu, May 21, 2015 at 12:56 PM.)



Re: Projection exception with "Thrifted" Parquet data

Flavio Pompermaier
Great! Thanks for the great support, Stephan.

(In reply to Stephan Ewen's message of Thu, May 21, 2015 at 2:22 PM.)




Re: Projection exception with "Thrifted" Parquet data

Stephan Ewen
Hi Flavio!

The latest master should have a fix for the problem.


Greetings,
Stephan


(In reply to Flavio Pompermaier's message of Thu, May 21, 2015 at 2:24 PM.)