Hi,
I was trying to read a simple binary file using SerializedInputFormat, as suggested in a different thread, but encountered the following error. I tried to do what the exception suggests, but even though createInput() returns a DataSet object, I couldn't find how to specify which file to read.

Any help is appreciated. The file I am trying to read is a simple binary file containing Java short values. Is there an example of reading binary files available?

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly by using the 'createInput(InputFormat, TypeInformation)' method instead.

Thank you,
Saliya

Saliya Ekanayake
Ph.D. Candidate | Research Assistant
School of Informatics and Computing | Digital Science Center
Hi Saliya, in order to set the file path for the
Cheers, On Mon, Feb 8, 2016 at 7:00 AM, Saliya Ekanayake <[hidden email]> wrote:
|
In reply to this post by Saliya Ekanayake
Hi Saliya,
Thanks for your question. Flink's type analyzer couldn't extract the type information. You may implement the ResultTypeQueryable interface in your custom source. That way you can manually specify the correct type. If that doesn't help you, could you please share more of the stack trace?

Thanks,
Max

On Mon, Feb 8, 2016 at 7:00 AM, Saliya Ekanayake <[hidden email]> wrote:
Thank you Till and Max. I'll try the set file path method and let you know. On Feb 8, 2016 5:45 AM, "Maximilian Michels" <[hidden email]> wrote:
Hi Saliya, |
In reply to this post by Till Rohrmann
Till,

I am still having trouble getting this to work. Here's my code (https://github.com/esaliya/flinkit):

    String binaryFile = "src/main/resources/sample.bin";

I still get the same error, as shown below:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly by using the 'createInput(InputFormat, TypeInformation)' method instead.
    at org.apache.flink.api.java.ExecutionEnvironment.createInput(ExecutionEnvironment.java:511)
    at org.saliya.flinkit.WordCount.main(WordCount.java:24)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)

On Mon, Feb 8, 2016 at 5:42 AM, Till Rohrmann <[hidden email]> wrote:
Hi,

please try replacing

    DataSet<ShortValue> ds = env.createInput(sif);

with

    DataSet<ShortValue> ds = env.createInput(sif, ValueTypeInfo.SHORT_VALUE_TYPE_INFO);

2016-02-08 19:33 GMT+01:00 Saliya Ekanayake <[hidden email]>:
|
Thank you, Fabian. That resolved the type-information error, but at runtime I now get an end-of-file exception. I've put up sample code with data on GitHub: https://github.com/esaliya/flinkit. The data file is a binary file containing 64 Short values.

02/08/2016 16:01:19 CHAIN DataSource (at main(WordCount.java:25) (org.apache.flink.api.common.io.SerializedInputFormat)) -> FlatMap (count())(4/8) switched to FAILED
java.io.EOFException
    at java.io.DataInputStream.readShort(DataInputStream.java:315)
    at org.apache.flink.core.memory.InputViewDataInputStreamWrapper.readShort(InputViewDataInputStreamWrapper.java:92)
    at org.apache.flink.types.ShortValue.read(ShortValue.java:88)
    at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:37)
    at org.apache.flink.api.common.io.SerializedInputFormat.deserialize(SerializedInputFormat.java:31)
    at org.apache.flink.api.common.io.BinaryInputFormat.nextRecord(BinaryInputFormat.java:274)
    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:169)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)

On Mon, Feb 8, 2016 at 3:50 PM, Fabian Hueske <[hidden email]> wrote:
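For reference, a file like the one described in the post above (64 raw short values, with no header or block metadata) can be written and read back with plain java.io. This is only a minimal sketch, not code from the thread or the linked repository: the class and method names are hypothetical, and it assumes the shorts are stored big-endian, which is what DataOutputStream writes and DataInputStream.readShort expects.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; it does not use any Flink API.
class RawShortFileDemo {

    // Writes `count` consecutive shorts (0, 1, 2, ...) with no header or framing.
    static void writeShorts(Path file, int count) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            for (int i = 0; i < count; i++) {
                out.writeShort(i);
            }
        }
    }

    // Reads shorts until the stream is exhausted and returns how many were read.
    static int countShorts(Path file) throws IOException {
        int n = 0;
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            while (true) {
                in.readShort(); // throws EOFException once no complete short is left
                n++;
            }
        } catch (EOFException endOfFile) {
            // Expected: this is how the end of a raw, unframed file shows up.
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".bin");
        writeShorts(file, 64);
        System.out.println(Files.size(file) + " bytes, " + countShorts(file) + " shorts");
        Files.delete(file);
    }
}
```

A raw file of this kind carries nothing but the values themselves, so any reader that tries to consume more bytes than the file holds ends in an EOFException from readShort, the same exception type that appears in the trace above.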
The SerializedInputFormat extends the BinaryInputFormat, which expects a special block-wise encoding and certain metadata fields. It is not suited to reading arbitrary binary files such as a file with 64 short values. I suggest implementing a custom input format based on FileInputFormat.

2016-02-08 22:05 GMT+01:00 Saliya Ekanayake <[hidden email]>:
|
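A custom format for fixed-width records has to cope with being handed a byte range of the file rather than the whole file (the failing task in the trace above was subtask 4 of 8). The following is a rough sketch of that core read logic in plain Java only; it deliberately does not show the FileInputFormat methods a real subclass would override. The class name, method name, and the rule "a record belongs to the range that contains its first byte" are assumptions made for illustration, as is the big-endian byte order.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; it does not use any Flink API.
class ShortSplitReader {

    static final int RECORD_BYTES = 2; // one Java short

    // Returns every short whose first byte lies in [splitStart, splitStart + splitLength).
    static List<Short> readSplit(Path file, long splitStart, long splitLength) throws IOException {
        List<Short> records = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long fileLength = raf.length();
            long splitEnd = Math.min(splitStart + splitLength, fileLength);
            // Round the start up to the next record boundary so no record is read twice.
            long pos = ((splitStart + RECORD_BYTES - 1) / RECORD_BYTES) * RECORD_BYTES;
            raf.seek(pos);
            // Stop at the end of the range, and never read a partial trailing record.
            while (pos < splitEnd && pos + RECORD_BYTES <= fileLength) {
                records.add(raf.readShort()); // big-endian, like DataInputStream
                pos += RECORD_BYTES;
            }
        }
        return records;
    }
}
```

With this rule, adjacent ranges that together cover the file yield each value exactly once, even when a range boundary falls in the middle of a record.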
Thank you, Fabian. I'll try to do it. On Mon, Feb 8, 2016 at 4:37 PM, Fabian Hueske <[hidden email]> wrote: