(DEPRECATED) Apache Flink User Mailing List archive.

What does "Continuous incremental cleanup" mean in Flink 1.8 release notes

Tony Wei

What does "Continuous incremental cleanup" mean in Flink 1.8 release notes

Hi everyone,

I read the Flink 1.8 release notes about state [1], and it said

Continuous incremental cleanup of old Keyed State with TTL
We introduced TTL (time-to-live) for Keyed state in Flink 1.6 (FLINK-9510). This feature allowed keyed state entries to be cleaned up and made inaccessible when accessing them. In addition, state would now also be cleaned up when writing a savepoint/checkpoint.
Flink 1.8 introduces continuous cleanup of old entries for both the RocksDB state backend (FLINK-10471) and the heap state backend (FLINK-10473). This means that old entries (according to the TTL setting) are continuously being cleaned up.

I'm not familiar with the TTL implementation in Flink 1.6 or with the new features introduced in
Flink 1.8, and after reading the release notes I don't understand the difference between the two
releases. Did 1.8 change the outcome of the TTL feature, provide new TTL features, or just
change how the TTL mechanism is executed?

Could you give me more references to learn about it? A simple example to illustrate it would be
much appreciated. Thank you.

Best,

Tony Wei

[1] https://ci.apache.org/projects/flink/flink-docs-master/release-notes/flink-1.8.html#state

Konstantin Knauf-2

Re: What does "Continuous incremental cleanup" mean in Flink 1.8 release notes

Hi Tony,

Before Flink 1.8, expired state was only cleaned up when you tried to access it after expiration, i.e. when user code tried to read the expired state, the state value was removed and "null" was returned. There was also already an option to clean up expired state during full snapshots (https://github.com/apache/flink/pull/6460). With Flink 1.8, expired state is cleaned up continuously in the background, regardless of checkpointing or any attempt to access it after expiration.

For reference, the linked JIRA tickets should be a good starting point.
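
To make the difference concrete, here is a minimal, self-contained Java sketch. It is a toy model, not Flink's implementation: the class `TtlCleanupSketch`, its methods, and the explicit `now` timestamps are all hypothetical names chosen for illustration. `get` models cleanup on access as described above, while `cleanupStep` models a background pass that checks a bounded number of entries at a time, which is roughly the idea behind the continuous cleanup added in 1.8.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Toy model of keyed state with TTL. Hypothetical names; NOT Flink's implementation. */
class TtlCleanupSketch {
    /** A stored value together with the time it was last written. */
    private static final class Entry {
        final String value;
        final long writtenAt;

        Entry(String value, long writtenAt) {
            this.value = value;
            this.writtenAt = writtenAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    /** Keys still to be visited in the current background pass. */
    private final ArrayDeque<String> cursor = new ArrayDeque<>();
    private final long ttlMillis;

    public TtlCleanupSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public void put(String key, String value, long now) {
        entries.put(key, new Entry(value, now));
    }

    private boolean expired(Entry e, long now) {
        return now - e.writtenAt >= ttlMillis;
    }

    /** Cleanup on access: an expired entry is removed only when it is read. */
    public String get(String key, long now) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (expired(e, now)) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    /** Background cleanup: visit at most maxChecks keys and drop the expired ones. */
    public int cleanupStep(int maxChecks, long now) {
        int removed = 0;
        for (int i = 0; i < maxChecks; i++) {
            if (cursor.isEmpty()) {
                cursor.addAll(entries.keySet()); // start a new pass over all keys
                if (cursor.isEmpty()) {
                    break; // nothing stored at all
                }
            }
            String key = cursor.poll();
            Entry e = entries.get(key);
            if (e != null && expired(e, now)) {
                entries.remove(key);
                removed++;
            }
        }
        return removed;
    }

    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        TtlCleanupSketch state = new TtlCleanupSketch(1000);
        state.put("a", "1", 0);
        state.put("b", "2", 0);
        state.put("c", "3", 0);

        // At t=2000 all entries are expired, but none has been removed yet.
        System.out.println(state.size());          // 3
        System.out.println(state.get("a", 2000));  // null; "a" is removed on access
        System.out.println(state.size());          // 2

        // The background pass removes "b" and "c" without anyone reading them.
        state.cleanupStep(1, 2000);
        state.cleanupStep(1, 2000);
        System.out.println(state.size());          // 0
    }
}
```

Checking only a few entries per step keeps the cost of each pass bounded, at the price of expired entries lingering for a while before the cursor reaches them.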

Hope this helps.

Konstantin

In reply to Tony Wei's message of Fri, Mar 8, 2019 at 10:45 AM.

Konstantin Knauf | Solutions Architect

+49 160 91394525

Join Flink Forward - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

Data Artisans GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen

Tony Wei

Re: What does "Continuous incremental cleanup" mean in Flink 1.8 release notes

Hi Konstantin,

That is really helpful. Thanks.

Another follow-up question: The documentation says that "Cleanup in full snapshot" is not
applicable to incremental checkpointing in the RocksDB state backend. However, when a user
manually triggers a savepoint and restarts the job from it, the expired state should be cleaned
up as well, based on Flink 1.6's implementation. Am I right?

Best,

Tony Wei

In reply to Konstantin Knauf's message of Sat, Mar 9, 2019 at 7:00 AM.

Konstantin Knauf-2

Re: What does "Continuous incremental cleanup" mean in Flink 1.8 release notes

Hi Tony,

Yes, when taking a savepoint, the same strategy is used as during a non-incremental checkpoint.
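
A savepoint behaves like a full snapshot here. The following Java sketch is again a toy model with hypothetical names (`SnapshotTtlSketch`, `fullSnapshot`), not Flink code: it assumes state is just a map from key to last-write timestamp, and shows expired entries being filtered out while the snapshot is written, so a job restored from it starts without them. In this model the local map itself is left untouched, which matches how the Flink documentation describes the full-snapshot strategy.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of TTL filtering while writing a full snapshot. NOT Flink code. */
class SnapshotTtlSketch {
    /** Copies only the entries that are still within the TTL at time now. */
    public static Map<String, Long> fullSnapshot(
            Map<String, Long> lastWrite, long ttlMillis, long now) {
        Map<String, Long> snapshot = new HashMap<>();
        for (Map.Entry<String, Long> e : lastWrite.entrySet()) {
            if (now - e.getValue() < ttlMillis) {
                snapshot.put(e.getKey(), e.getValue());
            }
        }
        return snapshot;
    }

    public static void main(String[] args) {
        Map<String, Long> local = new HashMap<>();
        local.put("old", 0L);      // written at t=0, expired at t=1500
        local.put("fresh", 900L);  // written at t=900, still live at t=1500

        Map<String, Long> savepoint = fullSnapshot(local, 1000, 1500);

        System.out.println(local.size());        // 2: local state is not modified
        System.out.println(savepoint.keySet());  // [fresh]
    }
}
```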

Best,

Konstantin

In reply to Tony Wei's message of Mon, Mar 11, 2019 at 2:29 AM.
