Hi,
It is not available for the Batch API, you would have to use the DataStream API.
Best,
Aljoscha
On 15. Aug 2017, at 01:16, Kien Truong <[hidden email]> wrote:
Hi,
Admittedly, I have not suggested this because I thought it was not available for batch API.
Regards,
Kien
On Aug 15, 2017, at 00:06, Nico Kruber <[hidden email]> wrote:
Hi Eranga and Kien,
Flink supports asynchronous IO since version 1.2, see [1] for details.
You basically pack your URL download into the asynchronous part and collect
the resulting string for further processing in your pipeline.
Nico
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html
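To make the idea of "packing the download into the asynchronous part" concrete, here is a minimal plain-Java sketch of the underlying pattern. It uses only the JDK and is not Flink's own API: the class name, fetchBlocking (a stand-in that sleeps instead of doing a real HTTP request), and the pool size are all assumptions made for illustration. In a real job, the same future-based logic would sit inside the asynchronous operator described in [1].

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class AsyncDownloadSketch {

    // Hypothetical stand-in for a blocking download: it only sleeps to
    // simulate network latency and returns a fake page body.
    static String fetchBlocking(String url) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "content of " + url;
    }

    // Starts every download on the pool first, then waits for all of them,
    // so the waiting times overlap instead of adding up.
    static List<String> fetchAll(List<String> urls, ExecutorService pool) {
        List<CompletableFuture<String>> futures = urls.stream()
                .map(u -> CompletableFuture.supplyAsync(() -> fetchBlocking(u), pool))
                .collect(Collectors.toList());
        // Results come back in the same order as the input URLs.
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(8); // assumed size
        try {
            List<String> urls = Arrays.asList(
                    "http://example.com/a", "http://example.com/b");
            fetchAll(urls, pool).forEach(System.out::println);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a synchronous loop the simulated latencies would add up one after another; here they overlap, up to the size of the thread pool.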
On Monday, 14 August 2017 17:50:47 CEST Kien Truong wrote:
Hi,
While this task is quite trivial to do with Flink's DataSet API, using
readTextFile to read the input and a flatMap function to perform the
downloading, it might not be a good idea.
The download process is I/O bound and will block the synchronous
flatMap function, so the throughput will not be very good.
Until Flink supports asynchronous functions, I suggest you look elsewhere.
An example of a master-workers architecture using Akka can be found here:
https://github.com/typesafehub/activator-akka-distributed-workers
Regards,
Kien
On 8/14/2017 10:09 AM, Eranga Heshan wrote:
Hi all,
I am fairly new to Flink. I have a project where I have a list of
URLs (on one node) which need to be crawled in a distributed fashion.
Then, for each URL, I need the serialized crawl result to be written
to a single text file.
I want to know if there are similar projects which I can look into or
an idea on how to implement this.
Thanks & Regards,
Eranga Heshan
/Undergraduate/
Computer Science & Engineering
University of Moratuwa
Mobile: +94 71 138 2686
Email: [hidden email]
< https://www.facebook.com/erangaheshan >
< https://twitter.com/erangaheshan >
< https://www.linkedin.com/in/erangaheshan >