S3 as streaming source

Soumya Simanta
Is there a standard Flink S3 source yet? 

Thanks
-Soumya

Re: S3 as streaming source

Tzu-Li Tai
Hi Soumya,

No, there is currently no officially supported S3 streaming source in Flink. As far as I know, there isn't a publicly available one either. The community is open to submissions for new connectors, so if you happen to be working on one for S3, you can file a JIRA to let us know.

Also, are you looking for an S3 streaming source that fetches S3 event notifications (ref: http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html), or one that streams files / objects from S3 into a data stream program? I assume the first one, since otherwise writing Flink batch jobs would suit you better (the batch DataSet API already supports this).

Re: S3 as streaming source

Chiwan Park-2
Hi all,

I think we can use the `readFile` and `readFileStream` methods in `StreamExecutionEnvironment` to create a streaming source from S3, because data is stored as files in S3. But I haven't tested it.

Regards,
Chiwan Park

> On Jun 3, 2016, at 2:37 PM, Tzu-Li (Gordon) Tai wrote:
>
> Hi Soumya,
>
> No, currently there is no Flink standard supported S3 streaming source. As
> far as I know, there isn't one out in the public yet either. The community
> is open to submissions for new connectors, so if you happen to be working on
> one for S3, you can file up a JIRA to let us know.
>
> Also, are you looking for a S3 streaming source that fetches S3 event
> notifications (ref:
> http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html), or
> streaming files / objects from S3 for a data stream program? I assume the
> first one, since otherwise writing Flink batch jobs will suit you more (the
> batch DataSet API already supports this).

Re: S3 as streaming source

Tzu-Li Tai
Hi,

I gave it a quick test, and Chiwan is right. The methods `readFile`, `readFileStream`, and `readTextFile` on `StreamExecutionEnvironment` work with the S3 scheme to stream from S3 objects.
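
For illustration, a minimal sketch of what such a job might look like with `readTextFile`. The bucket name, the key prefix, the class name, and the `s3Uri` helper are all hypothetical placeholders, not names from this thread. The sketch assumes the Flink streaming dependency is on the classpath and that the S3 file system and credentials are already configured for the cluster. As far as I know, `readTextFile` reads the objects that exist when the job starts and the source then finishes, so this is not continuous monitoring of the bucket.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3TextStream {

    // Hypothetical helper: builds an s3:// URI from a bucket name and a key prefix.
    static String s3Uri(String bucket, String prefix) {
        return "s3://" + bucket + "/" + prefix;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // "my-bucket" and "input/" are placeholders (assumptions for this sketch).
        DataStream<String> lines = env.readTextFile(s3Uri("my-bucket", "input/"));

        // Print each line to stdout; replace with real transformations and sinks.
        lines.print();

        // Nothing runs until execute() is called.
        env.execute("Read text from S3");
    }
}
```

Whether the `s3://` scheme resolves depends on the Hadoop file system configuration in use, so the scheme may need adjusting for a given setup.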

Re: S3 as streaming source

Ufuk Celebi
You can check this docs page out for S3/AWS support:
https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/aws.html

On Fri, Jun 3, 2016 at 8:55 AM, Tzu-Li (Gordon) Tai wrote:

> Hi,
>
> I've gave it a quick test and Chiwan is right. The methods `readFile`,
> `readFileStream`, `readTextFile` on StreamExecutionEnvironment works with
> the S3 scheme to stream from S3 objects.