Clear irrelevant state values

Sowmya Vallabhajosyula
Hi,

Scenario: a health care application with a stream of patient events flowing in. We would like to keep a value or list state holding all events and update the patient's state based on a set of business rules. For example, if 4 vitals exceed their range within 24 hours, the state is SIRS. If the patient is in the SIRS state and a source of infection is now reported, the state is updated to sepsis. A flow diagram is attached for your reference.

Following the RideSource example, suppose I use a RichFlatMapFunction to maintain this state (please let me know if this doesn't make sense):

(a) out.collect will emit the current patient state to a sink, so after every event we know the state of the patient at that instant. Is my understanding right?
(b) Let's say that after x days the events we recorded are no longer valid. How do we clear the state?
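
A minimal sketch of the rules described above, in plain Java with no Flink dependency. All names (`PatientRules`, `Kind`, `Status`, `onEvent`) are hypothetical. The `HashMap` only stands in for keyed state, and the return value of `onEvent` stands in for what `out.collect` would emit. It assumes events arrive in timestamp order and reads "4 vitals exceed range" as four out-of-range readings within 24 hours, which may not match the attached flow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: plain Java standing in for a stateful flat-map function.
class PatientRules {
    enum Status { NORMAL, SIRS, SEPSIS }
    enum Kind { VITAL_OUT_OF_RANGE, INFECTION_SOURCE }

    static final long WINDOW_MS = 24L * 60 * 60 * 1000; // 24 hours
    static final int VITALS_THRESHOLD = 4;              // assumed reading of "4 vitals"

    // Per-patient state: roughly what a value state plus a list state would hold.
    static final class PatientState {
        Status status = Status.NORMAL;
        final Deque<Long> outOfRangeTimes = new ArrayDeque<>();
    }

    private final Map<String, PatientState> stateByPatient = new HashMap<>();

    // Applies one event and returns the patient's status afterwards.
    Status onEvent(String patientId, Kind kind, long timestampMs) {
        PatientState s = stateByPatient.computeIfAbsent(patientId, k -> new PatientState());
        if (kind == Kind.VITAL_OUT_OF_RANGE) {
            s.outOfRangeTimes.addLast(timestampMs);
        }
        // Drop readings that are more than 24 hours older than this event.
        while (!s.outOfRangeTimes.isEmpty()
                && timestampMs - s.outOfRangeTimes.peekFirst() > WINDOW_MS) {
            s.outOfRangeTimes.removeFirst();
        }
        if (s.status == Status.NORMAL && s.outOfRangeTimes.size() >= VITALS_THRESHOLD) {
            s.status = Status.SIRS;
        } else if (s.status == Status.SIRS && kind == Kind.INFECTION_SOURCE) {
            s.status = Status.SEPSIS;
        }
        return s.status;
    }

    public static void main(String[] args) {
        PatientRules rules = new PatientRules();
        long hour = 60L * 60 * 1000;
        for (int i = 0; i < 4; i++) {
            System.out.println(rules.onEvent("p1", Kind.VITAL_OUT_OF_RANGE, i * hour));
        }
        System.out.println(rules.onEvent("p1", Kind.INFECTION_SOURCE, 4 * hour));
    }
}
```

The sketch never moves a patient back to NORMAL, because the post does not say when that should happen.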

--
Thanks and Regards,
Sowmya Vallabhajosyula

Attachment: bre-f2a.png (336K)
Re: Clear irrelevant state values

Gyula Fóra-2
Hi,

(a) I think your understanding is correct. One consideration: if you are always sending the state to the sink, it might make sense to build it there directly using a RichSinkFunction.

(b) There is no built-in support for this at the moment. What you can do yourself is automatically generate removal markers for the patients. We could probably add this feature later; it might be easier to implement for some state backends than for others. For instance, in RocksDB we could use a time-to-live database to remove state after a given period.
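
To illustrate the time-to-live idea in (b), here is a rough sketch in plain Java. It is not Flink or RocksDB code; `ExpiringStateStore`, `ttlMs`, and `purgeExpired` are hypothetical names, and the caller supplies the current time. Each entry remembers when it was last written and is dropped once it is at least `ttlMs` old.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: a per-key store that forgets entries not updated within ttlMs.
class ExpiringStateStore<V> {
    private static final class Entry<T> {
        final T value;
        final long lastUpdateMs;

        Entry(T value, long lastUpdateMs) {
            this.value = value;
            this.lastUpdateMs = lastUpdateMs;
        }
    }

    private final long ttlMs;
    private final Map<String, Entry<V>> entries = new HashMap<>();

    ExpiringStateStore(long ttlMs) {
        this.ttlMs = ttlMs;
    }

    // Stores or overwrites the value and restarts its time-to-live.
    void put(String key, V value, long nowMs) {
        entries.put(key, new Entry<>(value, nowMs));
    }

    // Returns the value, or null if absent or expired; an expired entry is removed on read.
    V get(String key, long nowMs) {
        Entry<V> e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (nowMs - e.lastUpdateMs >= ttlMs) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    // Removes every expired entry and returns how many were dropped.
    int purgeExpired(long nowMs) {
        int removed = 0;
        Iterator<Entry<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().lastUpdateMs >= ttlMs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        ExpiringStateStore<String> store = new ExpiringStateStore<>(3 * day);
        store.put("patient-1", "SIRS", 0);
        System.out.println(store.get("patient-1", 1 * day));
        System.out.println(store.get("patient-1", 3 * day));
    }
}
```

Lazy removal on read keeps `get` cheap, while `purgeExpired` covers keys that are never read again; a real state backend would have to do the equivalent work itself.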

Cheers,
Gyula

Sowmya Vallabhajosyula <[hidden email]> wrote (Mon, Apr 25, 2016, 11:22):
Re: Clear irrelevant state values

Sowmya Vallabhajosyula
Hi Gyula,

Thank you so much.

1. Can you point me to any documentation on removal markers?
2. My understanding is that this kind of custom state maintenance does not impact scalability. Is that right?

Thanks,
Sowmya

On Mon, Apr 25, 2016 at 3:06 PM, Gyula Fóra <[hidden email]> wrote:
Re: Clear irrelevant state values

Gyula Fóra
Hi,

The removal markers are just something I made up :) What I meant is that you can generate events, in a custom source for instance, that will trigger the removal of the state. This might be easy or hard to do depending on your use case.
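
As a rough illustration of that idea, here is a plain-Java sketch with no Flink dependency. `MarkerAwareState`, `Event`, and `process` are hypothetical names. A stream element is either a normal patient event or a removal marker, and the handler drops that patient's state when it sees a marker, which is the role `clear()` on a state handle would play inside a rich function.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: state that is cleared when a removal marker arrives for a key.
class MarkerAwareState {
    // A stream element: either a normal patient event or a removal marker.
    static final class Event {
        final String patientId;
        final String payload;        // null for removal markers
        final boolean removalMarker;

        private Event(String patientId, String payload, boolean removalMarker) {
            this.patientId = patientId;
            this.payload = payload;
            this.removalMarker = removalMarker;
        }

        static Event data(String patientId, String payload) {
            return new Event(patientId, payload, false);
        }

        static Event remove(String patientId) {
            return new Event(patientId, null, true);
        }
    }

    private final Map<String, List<String>> eventsByPatient = new HashMap<>();

    // Processes one element and returns how many events are held for that patient afterwards.
    int process(Event e) {
        if (e.removalMarker) {
            eventsByPatient.remove(e.patientId); // drop everything recorded for this patient
            return 0;
        }
        List<String> events = eventsByPatient.computeIfAbsent(e.patientId, k -> new ArrayList<>());
        events.add(e.payload);
        return events.size();
    }

    boolean hasState(String patientId) {
        return eventsByPatient.containsKey(patientId);
    }

    public static void main(String[] args) {
        MarkerAwareState state = new MarkerAwareState();
        state.process(Event.data("p1", "heart rate out of range"));
        state.process(Event.remove("p1"));
        System.out.println(state.hasState("p1"));
    }
}
```

The marker has to carry the same key as the events it removes, so that in a keyed stream it reaches the same parallel instance that holds the state.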

What do you mean by custom state maintenance? As long as you are using the state interfaces correctly in your functions you should be fine in terms of scalability.

Gyula

Sowmya Vallabhajosyula <[hidden email]> wrote (Mon, Apr 25, 2016, 13:29):
Re: Clear irrelevant state values

Sowmya Vallabhajosyula
Thanks Gyula.

Yes, I am using state only in a RichFlatMapFunction. I will evaluate generating events to remove the state.

Regards,
Sowmya

On Mon, Apr 25, 2016 at 5:44 PM, Gyula Fóra <[hidden email]> wrote: