First of all, split locality can make a huge difference. A dedicated Flink IF would also enable tighter integration, both API-wise and at execution time, for example by pushing filters or projections directly into the data source and thereby reducing the data that has to be read from the file system.

2014-11-11 12:30 GMT+01:00 Flavio Pompermaier <[hidden email]>:

Maybe this is a dumb question, but could you explain to me what the benefits of a dedicated Flink IF are compared with the one available by default through the Hadoop IF wrapper? Is it just because of data locality of task slots?

On Tue, Nov 11, 2014 at 12:16 PM, Fabian Hueske <[hidden email]> wrote:

Hi Flavio,

I am not aware of a Flink InputFormat for Parquet. However, it should hopefully be covered by the Hadoop IF wrapper.
A dedicated Flink IF would be great though, IMO.

Best, Fabian

2014-11-11 12:10 GMT+01:00 Flavio Pompermaier <[hidden email]>:

Hi to all,

I'd like to know whether Flink is able to exploit the Parquet format to read data efficiently from HDFS. Is there any example available?

Best,
Flavio
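
To illustrate the point made in the thread about pushing filters and projections into the data source, here is a minimal sketch in plain Python. It does not use the Flink, Hadoop, or Parquet APIs: the "file" is an in-memory dict of columns, and `COLUMNS`, `scan_all`, and `scan_pushed_down` are hypothetical names chosen for this example. It only counts how many values each strategy touches, as a rough stand-in for data read from the file system.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stand-in for a columnar file: column name -> list of values.
COLUMNS: Dict[str, List[Any]] = {
    "id": [1, 2, 3, 4],
    "country": ["DE", "IT", "DE", "FR"],
    "payload": ["a" * 10, "b" * 10, "c" * 10, "d" * 10],
}

Row = Dict[str, Any]


def scan_all(columns: Dict[str, List[Any]]) -> Tuple[List[Row], int]:
    """Read every column of every row; filtering and projection happen later."""
    n_rows = len(next(iter(columns.values())))
    rows = [{name: vals[i] for name, vals in columns.items()} for i in range(n_rows)]
    values_read = n_rows * len(columns)
    return rows, values_read


def scan_pushed_down(
    columns: Dict[str, List[Any]],
    projection: List[str],
    filter_column: str,
    predicate: Callable[[Any], bool],
) -> Tuple[List[Row], int]:
    """Apply the filter inside the source, then read only the projected columns."""
    filter_values = columns[filter_column]
    values_read = len(filter_values)  # the filter column is read in full
    matching = [i for i, v in enumerate(filter_values) if predicate(v)]

    rows: List[Row] = []
    for i in matching:
        row: Row = {}
        for name in projection:
            row[name] = columns[name][i]
            if name != filter_column:  # filter column values were already counted
                values_read += 1
        rows.append(row)
    return rows, values_read


if __name__ == "__main__":
    # Without pushdown: read everything, then filter and project afterwards.
    all_rows, read_full = scan_all(COLUMNS)
    late = [{"id": r["id"]} for r in all_rows if r["country"] == "DE"]

    # With pushdown: the source receives the filter and the projection.
    early, read_pd = scan_pushed_down(COLUMNS, ["id"], "country", lambda c: c == "DE")

    print(late == early, read_full, read_pd)
```

Both strategies return the same rows, but the pushed-down scan never touches the `payload` column and reads `id` only for matching rows. A real columnar reader works on row groups and column chunks rather than single values, so the actual savings depend on the file layout.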