Support for gRPC in Flink StateFun 2.x

Support for gRPC in Flink StateFun 2.x

Dalmo Cirne

Hi,

At the latest Flink Forward, in April 2020, it was mentioned that gRPC support, in addition to HTTP, was in the works and would be implemented in the future.

Looking at the flink-statefun repository on GitHub, one can see that some gRPC work has already been done, but it is not yet at parity with its HTTP counterpart.

Is there a roadmap or an estimate of when gRPC will be implemented in StateFun?

Thank you,

Dalmo

Re: Support for gRPC in Flink StateFun 2.x

Igal Shilman
Hi,

Your observation is correct: currently the only way to invoke a remote function is through an HTTP POST request to a service that exposes a StateFun endpoint.

The endpoint must implement the client side of the “RequestReply” protocol as defined by StateFun (basically, an invocation contains the state and message, and a response contains a description of the side effects).
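
As a rough illustration of the request/reply shape described above, the following Python sketch models an invocation that carries state and a message, and a response that describes side effects. All names here (`Invocation`, `Response`, `handle`, the `seen_count` state key) are hypothetical and chosen for illustration only; the actual protocol is defined by StateFun's own schema, which this sketch does not reproduce.

```python
from dataclasses import dataclass, field


@dataclass
class Invocation:
    """Hypothetical request: the function's current state plus one message."""
    state: dict
    message: str


@dataclass
class Response:
    """Hypothetical reply: side effects for the runtime to apply."""
    state_updates: dict = field(default_factory=dict)
    outgoing_messages: list = field(default_factory=list)


def handle(invocation: Invocation) -> Response:
    """Stateless handler: state arrives in the request and leaves in the reply."""
    seen = invocation.state.get("seen_count", 0) + 1
    return Response(
        state_updates={"seen_count": seen},
        outgoing_messages=[f"ack #{seen}: {invocation.message}"],
    )


reply = handle(Invocation(state={"seen_count": 2}, message="hello"))
print(reply.state_updates, reply.outgoing_messages)
```

The point of the sketch is that the remote function holds no state between calls, which is why swapping the transport alone would not change what the function has to implement.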

While gRPC can easily be added as a replacement for the transport layer, the client side (the remote function) would still have to implement the RequestReply protocol.

To truly utilize gRPC we would want to introduce a new type of protocol that can exploit the low-latency bi-directional streams to and from the function.

While for the latter it is a bit difficult to commit to a specific date, the former can be easily implemented in the next StateFun release.

Would you be able to share with us a little bit more about your original motivation to ask this question :-)
This would help us as we gather more and more use cases.

For example: target language, environment, how gRPC services are being discovered.

Thanks,
Igal

On Thursday, September 17, 2020, Dalmo Cirne wrote:

> Is there a roadmap or an estimate of when gRPC will be implemented in StateFun?

Re: Support for gRPC in Flink StateFun 2.x

Dalmo Cirne

Thank you for the quick reply, Igal.

Our use case is the following: a stream of data from Kafka is fed into Flink, where data transformations take place. After that, we send the transformed data to an inference engine to score the relevance of each record. (This is a rough simplification.)

Doing that using HTTP endpoints is possible, and it is the solution we have in place today. However, for each request to that endpoint we incur the cost of establishing the connection, etc., which increases the latency of the system.

We do process data in batches to mitigate the latency, but that is not the same as having a bi-directional stream, as would be possible with gRPC. Furthermore, we already use gRPC in other parts of our system.

We also want to be able to scale those endpoints up and down, as demand for the service fluctuates depending on the hour and day. Combining StateFun and Kubernetes would allow for that elasticity while keeping the state of the execution, since an inference is not always just one endpoint, but a collection of them where the output of one becomes the input of the next, culminating in the predicted score(s).

We are evaluating StateFun because Flink is already part of our infrastructure. That said, gRPC is also part of our requirements, hence the motivation for the question.

I’d love to hear more about plans to implement support for gRPC and perhaps become an early adopter.

I hope this helps with understanding the use case. Happy to talk further and answer more questions.

Best,

Dalmo

From: Igal Shilman
Date: Saturday, September 19, 2020 at 01:41
Subject: Re: Support for gRPC in Flink StateFun 2.x

> Would you be able to share with us a little bit more about your original motivation to ask this question :-)

Re: Support for gRPC in Flink StateFun 2.x

Igal Shilman
Hi Dalmo,

Thanks a lot for sharing this use case!

If I understand the requirement correctly, you are mostly concerned with performance. In that case I've created an issue [1] to add a gRPC transport for StateFun, and I believe we would be able to implement it in the upcoming weeks.

Just a side note about the way StateFun invokes remote functions via HTTP, at the moment:

- StateFun keeps a connection pool, to avoid re-establishing the connection for each request.
- StateFun batches requests per address (key) to amortize the cost of a round trip, and state shipment.
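
To make the batching idea in the second point concrete, here is a minimal Python sketch that groups pending messages by target address so that each address needs one round trip rather than one per message. The function name `batch_by_address` and the sample addresses are assumptions made for illustration; this is not StateFun's implementation, and it does not model the connection pool or state shipment.

```python
from collections import defaultdict


def batch_by_address(pending):
    """Group (address, message) pairs so each address gets one batched request."""
    batches = defaultdict(list)
    for address, message in pending:
        batches[address].append(message)
    return dict(batches)


pending = [("user/alice", "m1"), ("user/bob", "m2"), ("user/alice", "m3")]
batches = batch_by_address(pending)
for address, messages in batches.items():
    # One round trip per address instead of one per message.
    print(address, messages)
```

With three pending messages for two addresses, the sketch yields two batches, so the per-request overhead is paid twice instead of three times.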

There is an RC2 for the upcoming StateFun version, with some improvements around HTTP functions and operational visibility (logs and metrics). So perhaps you can take that for a spin if you are evaluating StateFun at the moment. The release itself is expected to happen at the end of this week.

Thanks,
Igal.


On Tue, Sep 22, 2020 at 4:38 AM Dalmo Cirne wrote:

> I’d love to hear more about plans to implement support for gRPC and perhaps become an early adopter.

Re: Support for gRPC in Flink StateFun 2.x

Dalmo Cirne

Thank you so much for creating the ticket, Igal. We are looking forward to being able to use it!

And thank you for the additional context about how StateFun keeps a connection pool and tries to optimize for performance and throughput.

That said, gRPC is an architectural choice we have made. We would rather maintain consistency across the project than make exceptions here and there.

We will definitely take StateFun for a spin once we can use it with gRPC.

Cheers,

Dalmo

From: Igal Shilman
Date: Wednesday, September 23, 2020 at 07:53
Subject: Re: Support for gRPC in Flink StateFun 2.x

> In that case I've created an issue [1] to add a gRPC transport for StateFun, and I believe we would be able to implement it in the upcoming weeks.