Flink DataSet Iterate updating additional variable

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Flink DataSet Iterate updating additional variable

Antonio Martínez Carratalá
Hello

I'm trying to implement the ForceAtlas2 (graph layout) algorithm in Flink using datasets, it is an iterative algorithm and I have most of it ready, but there is something I don't know how to do. Apart from the dataset with the coordinates (x,y) of each node I need an additional variable to regulate the speed, very simplified it would be something like this:

DataSet<Node> nodes = env.fromCollection(nodesList);
Double speed = 1.0;
nodes.iterate(100) {
   nodes = nodes with forces to apply calculated;
   speed = speed regulated based on previous value and forces calculated;
   nodes = nodes coordinates updated with forces * speed;
} closesWith(nodes)

In my case, the nodes are the dataset I'm iterating over and it is working perfect if I forget about speed, but I don't know how to keep speed variable updated in every iteration to be able to use it in the next one

Any suggestions? Thanks



Reply | Threaded
Open this post in threaded view
|

Re: Flink DataSet Iterate updating additional variable

r_khachatryan
Hi Antonio,


Regards,
Roman


On Mon, Jul 13, 2020 at 3:49 PM Antonio Martínez Carratalá <[hidden email]> wrote:
Hello

I'm trying to implement the ForceAtlas2 (graph layout) algorithm in Flink using datasets, it is an iterative algorithm and I have most of it ready, but there is something I don't know how to do. Apart from the dataset with the coordinates (x,y) of each node I need an additional variable to regulate the speed, very simplified it would be something like this:

DataSet<Node> nodes = env.fromCollection(nodesList);
Double speed = 1.0;
nodes.iterate(100) {
   nodes = nodes with forces to apply calculated;
   speed = speed regulated based on previous value and forces calculated;
   nodes = nodes coordinates updated with forces * speed;
} closesWith(nodes)

In my case, the nodes are the dataset I'm iterating over and it is working perfect if I forget about speed, but I don't know how to keep speed variable updated in every iteration to be able to use it in the next one

Any suggestions? Thanks



Reply | Threaded
Open this post in threaded view
|

Re: Flink DataSet Iterate updating additional variable

Antonio Martínez Carratalá
Hi Roman, 

Thank you for your quick reply, but as far as I know broadcast variables cannot be written, my problem is that I need to update the value of the speed variable to use it in the next iteration.

Iterate only has one input dataset and propagates it to the next iteration using closeWith(), but I need another variable (maybe dataset) to be propagated too, is this possible in some way?

Thanks

On Mon, Jul 13, 2020 at 8:47 PM Khachatryan Roman <[hidden email]> wrote:
Hi Antonio,


Regards,
Roman


On Mon, Jul 13, 2020 at 3:49 PM Antonio Martínez Carratalá <[hidden email]> wrote:
Hello

I'm trying to implement the ForceAtlas2 (graph layout) algorithm in Flink using datasets, it is an iterative algorithm and I have most of it ready, but there is something I don't know how to do. Apart from the dataset with the coordinates (x,y) of each node I need an additional variable to regulate the speed, very simplified it would be something like this:

DataSet<Node> nodes = env.fromCollection(nodesList);
Double speed = 1.0;
nodes.iterate(100) {
   nodes = nodes with forces to apply calculated;
   speed = speed regulated based on previous value and forces calculated;
   nodes = nodes coordinates updated with forces * speed;
} closesWith(nodes)

In my case, the nodes are the dataset I'm iterating over and it is working perfect if I forget about speed, but I don't know how to keep speed variable updated in every iteration to be able to use it in the next one

Any suggestions? Thanks




Reply | Threaded
Open this post in threaded view
|

Re: Flink DataSet Iterate updating additional variable

r_khachatryan
Hi Antonio,

Yes, you are right. Revisiting your question, I'm wondering whether it's possible to partition speeds and nodes in the same way (stably across iterations)? (I'm assuming a distributed setup)
If not, each iteration would have to wait for *all* subtasks of the previous iteration to finish, right? 
Which will likely neglect the benefits of the iterative approach. 

Regards,
Roman


On Tue, Jul 14, 2020 at 9:36 AM Antonio Martínez Carratalá <[hidden email]> wrote:
Hi Roman, 

Thank you for your quick reply, but as far as I know broadcast variables cannot be written, my problem is that I need to update the value of the speed variable to use it in the next iteration.

Iterate only has one input dataset and propagates it to the next iteration using closeWith(), but I need another variable (maybe dataset) to be propagated too, is this possible in some way?

Thanks

On Mon, Jul 13, 2020 at 8:47 PM Khachatryan Roman <[hidden email]> wrote:
Hi Antonio,


Regards,
Roman


On Mon, Jul 13, 2020 at 3:49 PM Antonio Martínez Carratalá <[hidden email]> wrote:
Hello

I'm trying to implement the ForceAtlas2 (graph layout) algorithm in Flink using datasets, it is an iterative algorithm and I have most of it ready, but there is something I don't know how to do. Apart from the dataset with the coordinates (x,y) of each node I need an additional variable to regulate the speed, very simplified it would be something like this:

DataSet<Node> nodes = env.fromCollection(nodesList);
Double speed = 1.0;
nodes.iterate(100) {
   nodes = nodes with forces to apply calculated;
   speed = speed regulated based on previous value and forces calculated;
   nodes = nodes coordinates updated with forces * speed;
} closesWith(nodes)

In my case, the nodes are the dataset I'm iterating over and it is working perfect if I forget about speed, but I don't know how to keep speed variable updated in every iteration to be able to use it in the next one

Any suggestions? Thanks