Re: How to load multiple same-format files with single batch job?

Posted by Fabian Hueske-2 on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/How-to-load-multiple-same-format-files-with-single-batch-job-tp25806p25893.html

Hi,

The files will be read in a streaming fashion.
Typically files are broken down into processing splits that are distributed to tasks for reading.
How a task reads a file split depends on the input format implementation, but usually the format reads the split as a stream and emits records as it goes, rather than loading the whole split before emitting anything.
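
To make the idea of splits concrete, here is a minimal sketch in plain Python. It is not Flink code and not how Flink's input formats are implemented; `make_splits` and `read_split` are hypothetical helper names, and the sketch assumes a newline-delimited UTF-8 text file. Each split is a byte range, and the reader yields one record at a time instead of loading the whole range first.

```python
import os


def make_splits(path, split_size):
    """Cut a file into (start, length) byte ranges, a stand-in for input splits."""
    size = os.path.getsize(path)
    return [(start, min(split_size, size - start))
            for start in range(0, size, split_size)]


def read_split(path, start, length):
    """Lazily yield the lines that begin inside [start, start + length)."""
    end = start + length
    with open(path, "rb") as f:
        if start > 0:
            # A line that straddles the split start belongs to the previous
            # split: step back one byte and discard through the next newline.
            f.seek(start - 1)
            f.readline()
        # Only lines that start before `end` are owned by this split; the
        # last one may run past `end` and is still read in full.
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            yield line.rstrip(b"\n").decode("utf-8")
```

Because `read_split` is a generator, a consumer can pull records from a split one by one, and different splits can be handed to different workers without any record being read twice or skipped.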

Best,
Fabian

On Mon, Feb 4, 2019 at 12:06, françois lacombe <[hidden email]> wrote:
Hi Fabian,

Thank you for this input.
This is interesting.

With such an input format, will each file be loaded entirely into memory before being processed, or will everything be streamed?

All the best
François

On Tue, Jan 29, 2019 at 22:20, Fabian Hueske <[hidden email]> wrote:
Hi,

You can point a file-based input format to a directory and the input format should read all files in that directory.
That also works for TableSources that internally use file-based input formats.
Is that what you are looking for?
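
As a rough illustration of the "point the source at a directory" pattern, here is a minimal plain-Python sketch. It is not Flink code. It assumes one JSON object per line in each file, uses `sqlite3` as a stand-in for a JDBC sink, and the `events` table with its `id` and `name` fields is hypothetical.

```python
import json
import os
import sqlite3


def list_input_files(directory, suffix=".json"):
    """Return every matching file directly inside `directory`, sorted by name."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(suffix)
        and os.path.isfile(os.path.join(directory, name))
    )


def read_records(path):
    """Yield one parsed record per non-empty line (assumed: JSON lines)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def load_directory(directory, conn, batch_size=500):
    """Insert the records of all files into the hypothetical `events` table."""
    conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER, name TEXT)")
    batch, total = [], 0
    for path in list_input_files(directory):
        for rec in read_records(path):
            batch.append((rec["id"], rec["name"]))
            if len(batch) >= batch_size:
                # Flush a full batch so memory use stays bounded.
                conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
                total += len(batch)
                batch = []
    if batch:
        conn.executemany("INSERT INTO events VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total
```

Calling `load_directory("/path/to/files", sqlite3.connect(":memory:"))` would load every `.json` file in that directory through the same code path. The sketch only looks at files directly inside the directory; as far as I know, Flink's file input formats also skip nested directories unless recursive enumeration is enabled through the `recursive.file.enumeration` parameter.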

Best, Fabian

On Mon, Jan 28, 2019 at 17:22, françois lacombe <[hidden email]> wrote:
Hi all,

I'm wondering whether it's possible to load multiple files from a JSON source into a JDBC sink, and what the best way to do so would be.
I'm running Flink 1.7.0.

Let's say I have about 1500 files with the same structure (same format, schema, everything) and I want to load them with a *batch* job.
Can Flink load each of these files through a single source and send the data to my JDBC sink?
I would like to give the batch source the URL of the directory containing all these files so that it loads them sequentially.
My sources and sinks are currently available for BatchTableSource, and I guess making them available for streaming would be quite expensive for me at the moment.

Has anyone ever done this?
Am I wrong to expect to do this with a batch job?

All the best

François Lacombe


      

Think of the planet: print this message only if necessary.


      
