Looking for relevant sources related to connecting Apache Flink and Edgent.

Felipe Gutierrez
Hi,

I am trying to design a small prototype with Flink and Apache Edgent (http://edgent.apache.org/) and would like some guidance on how to approach it. I am running Flink on my laptop and Edgent on my Raspberry Pi, with a simple filter for a proximity sensor (https://github.com/felipegutierrez/explore-rpi/blob/master/src/main/java/org/sense/edgent/app/UltrasonicEdgentApp.java).

My idea is to push the filter operator down from Flink to the Raspberry Pi, which is running Apache Edgent. With this in mind, where would you advise me to start?

I have some ideas to study:
1 - Get the list of operators that Flink is about to execute on the JobManager. Source: https://ci.apache.org/projects/flink/flink-docs-stable/internals/job_scheduling.html
2 - Implement a connector to Apache Edgent in order to exchange messages between Flink and Edgent.

Can you think of any other source that would be relevant to my prototype?

Thanks,
Felipe
--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Kostas Kloudas
Hi Felipe,

This seems related to your previous question about a custom scheduler that knows which task to run on which machine.
As Chesnay said, this is a rather involved and laborious task if you want to do it as a general framework.

But if you know which operation to push down, why not decouple the two: implement the filtering as a separate job
running on your Raspberry Pi, and a new job that consumes the output of the first and does the analytics?

Cheers,
Kostas

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Kostas Kloudas
Hi again,

I forgot to say that, unfortunately, I am not familiar with Apache Edgent. But if you can write your filter in Edgent's programming model,
then you can push your data from Edgent to a third-party storage system (e.g. Kafka, HDFS) and use Flink's connectors instead of
having to implement a custom source.
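
A minimal plain-Java sketch of this decoupled layout, for illustration only: an edge-side filter stage writes into a buffer, and a separate analytics stage reads from it. The in-memory queue is only a stand-in for a system such as Kafka, and every class and method name here is hypothetical; neither Edgent nor Flink APIs are used.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model of two decoupled jobs that share nothing but a buffer.
// The Queue stands in for third-party storage such as Kafka (assumption).
class DecoupledPipelineSketch {

    // "Edge job": forwards only readings closer than maxDistanceCm.
    static void edgeFilterJob(List<Double> readingsCm, double maxDistanceCm, Queue<Double> out) {
        for (double reading : readingsCm) {
            if (reading < maxDistanceCm) {
                out.add(reading);
            }
        }
    }

    // "Analytics job": consumes the already-filtered readings and averages them.
    static double analyticsJob(Queue<Double> in) {
        double sum = 0.0;
        int count = 0;
        Double reading;
        while ((reading = in.poll()) != null) {
            sum += reading;
            count++;
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // Wires the two stages together through the shared buffer.
    static double run(List<Double> readingsCm, double maxDistanceCm) {
        Queue<Double> buffer = new ArrayDeque<>();
        edgeFilterJob(readingsCm, maxDistanceCm, buffer);
        return analyticsJob(buffer);
    }

    public static void main(String[] args) {
        // Only 10.0 and 30.0 pass the 100 cm filter, so this prints 20.0
        System.out.println(run(Arrays.asList(10.0, 250.0, 30.0, 400.0), 100.0));
    }
}
```

The analytics stage never sees the raw readings, so the filter can change on the edge side without touching the consumer, at the cost of fixing the filter ahead of time rather than deriving it from the query.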

Cheers,
Kostas

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Felipe Gutierrez
Thanks, Kostas, for the quick reply.

Yes, it is related to my previous question.

When you said "But if you know what operation to push down": this is exactly what I am trying to find in the Flink code. I want to discover the operation on the fly.
That is, I am looking for the component in Flink that can tell me there is a filter in the query specified by the user. I want to get this metadata and send a message to my RPi through a Flink connector (I guess this is the way to do it), so that the data stream arrives at Flink already filtered.

I intend to start with a simple, naive example. Do you know from which Flink component I can get, on the fly, the operations that are running inside a query?

Thanks

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Felipe Gutierrez
I guess this message from 2016 is closely related to what I am looking for (http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Execution-Plan-td4290.html). I am posting it here for future reference.

I am going to implement a toy example to visualize this. Do you think this description still matches the latest Flink source code?

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Fabian Hueske-2
Hi Felipe,

You can define TableSources (for SQL, Table API) that support filter push-down.
The optimizer will figure out this opportunity and hand filters to a custom TableSource.

I should add that AFAIK this feature is not used very often (expect some rough edges) and that the API is likely to change in the future.
But it might be enough for a simple POC.
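
To make the hand-off concrete, here is a plain-Java model of filter push-down, a sketch under stated assumptions rather than Flink's TableSource API: the source is offered a list of predicates, keeps the ones it can evaluate itself, and returns the rest for the engine to apply. The `Predicate` and `EdgeSource` classes, the `distance` field, and the supported operators are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of filter push-down. This is NOT Flink's API; all names are hypothetical.
class PushDownSketch {

    // A predicate of the form: <field> <op> <constant>.
    static final class Predicate {
        final String field;
        final String op;
        final double value;

        Predicate(String field, String op, double value) {
            this.field = field;
            this.op = op;
            this.value = value;
        }

        boolean test(double fieldValue) {
            switch (op) {
                case "<": return fieldValue < value;
                case ">": return fieldValue > value;
                default: throw new IllegalArgumentException("unsupported operator: " + op);
            }
        }
    }

    // Hypothetical edge source that can only evaluate "<" and ">" on "distance".
    static final class EdgeSource {
        final List<Predicate> pushed = new ArrayList<>();

        // Keeps the predicates this source can evaluate and returns the ones
        // that the engine still has to apply after reading.
        List<Predicate> applyPredicates(List<Predicate> predicates) {
            List<Predicate> remaining = new ArrayList<>();
            for (Predicate p : predicates) {
                boolean supported = p.field.equals("distance")
                        && (p.op.equals("<") || p.op.equals(">"));
                if (supported) {
                    pushed.add(p);
                } else {
                    remaining.add(p);
                }
            }
            return remaining;
        }

        // Emits only the readings that satisfy every pushed-down predicate.
        List<Double> read(List<Double> rawDistances) {
            List<Double> out = new ArrayList<>();
            for (double d : rawDistances) {
                boolean keep = true;
                for (Predicate p : pushed) {
                    keep = keep && p.test(d);
                }
                if (keep) {
                    out.add(d);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        EdgeSource source = new EdgeSource();
        List<Predicate> remaining = source.applyPredicates(Arrays.asList(
                new Predicate("distance", "<", 100.0),
                new Predicate("temperature", ">", 5.0)));
        System.out.println("left for the engine: " + remaining.size());
        System.out.println("emitted by the source: " + source.read(Arrays.asList(10.0, 250.0, 30.0)));
    }
}
```

Returning the unsupported predicates instead of rejecting the whole list keeps the result correct even when the source can handle only part of the filter.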

Best, Fabian

Re: Looking for relevant sources related to connecting Apache Flink and Edgent.

Felipe Gutierrez
Cool, thanks!

I am going to build a little POC as you suggested.

Thanks,
Felipe