How does Flink handle shorted lived keyed streams

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

How does Flink handle shorted lived keyed streams

narasimha
Hi,

Belos is the use case.

Have a stream of transaction events, success/failure of a transaction can be determined by those events. 
Partitioning stream by transaction id and applying CEP to determine the success/failure of a transaction.
Each transaction keyed stream is valid only until the final status is found. Which can end up having large inactive keyed streams in the system. 

Know that using keygroup flink distributes the keyedstream to tasks based on it, but still there will be a large set of inactive keys.

Does this have any side effects? If so what has to be done to overcome humongous keyed streams?

--
A.Narasimha Swamy
Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle shorted lived keyed streams

Xintong Song

On Wed, Dec 23, 2020 at 11:57 PM narasimha <[hidden email]> wrote:
Hi,

Belos is the use case.

Have a stream of transaction events, success/failure of a transaction can be determined by those events. 
Partitioning stream by transaction id and applying CEP to determine the success/failure of a transaction.
Each transaction keyed stream is valid only until the final status is found. Which can end up having large inactive keyed streams in the system. 

Know that using keygroup flink distributes the keyedstream to tasks based on it, but still there will be a large set of inactive keys.

Does this have any side effects? If so what has to be done to overcome humongous keyed streams?

--
A.Narasimha Swamy
Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle shorted lived keyed streams

narasimha
Thanks Xintong.

I'll check it out and get back to you.

On Thu, Dec 24, 2020 at 1:30 PM Xintong Song <[hidden email]> wrote:

On Wed, Dec 23, 2020 at 11:57 PM narasimha <[hidden email]> wrote:
Hi,

Belos is the use case.

Have a stream of transaction events, success/failure of a transaction can be determined by those events. 
Partitioning stream by transaction id and applying CEP to determine the success/failure of a transaction.
Each transaction keyed stream is valid only until the final status is found. Which can end up having large inactive keyed streams in the system. 

Know that using keygroup flink distributes the keyedstream to tasks based on it, but still there will be a large set of inactive keys.

Does this have any side effects? If so what has to be done to overcome humongous keyed streams?

--
A.Narasimha Swamy


--
A.Narasimha Swamy
Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle shorted lived keyed streams

narasimha
It is not solving the problem.

I could see the memory keep increasing, resulting in a lot of high GCs.

There could be a memory leak, just want to know how to know if older keps are skill alive, even after the pattern has been satisfied or within range of the pattern has expired. 

Can someone suggest how to proceed further. 
Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle shorted lived keyed streams

Matthias
Hi narashima,
not sure whether this fits your use case, but have you considered creating a savepoint and analyzing it using the State Processor API [1]?

Best,
Matthias


On Wed, Feb 10, 2021 at 6:08 PM narasimha <[hidden email]> wrote:
It is not solving the problem.

I could see the memory keep increasing, resulting in a lot of high GCs.

There could be a memory leak, just want to know how to know if older keps are skill alive, even after the pattern has been satisfied or within range of the pattern has expired. 

Can someone suggest how to proceed further. 

Reply | Threaded
Open this post in threaded view
|

Re: How does Flink handle shorted lived keyed streams

narasimha
Thanks Matthias, I think it will help to find out what all live keys are present. 

Let me check and revert back on the thread. 

On Wed, Feb 10, 2021 at 10:46 PM Matthias Pohl <[hidden email]> wrote:
Hi narashima,
not sure whether this fits your use case, but have you considered creating a savepoint and analyzing it using the State Processor API [1]?

Best,
Matthias


On Wed, Feb 10, 2021 at 6:08 PM narasimha <[hidden email]> wrote:
It is not solving the problem.

I could see the memory keep increasing, resulting in a lot of high GCs.

There could be a memory leak, just want to know how to know if older keps are skill alive, even after the pattern has been satisfied or within range of the pattern has expired. 

Can someone suggest how to proceed further. 



--
A.Narasimha Swamy