Disable WAL in RocksDB recovery

Juha Mynttinen
Hello there, 

I'd like to reopen a previously discussed topic: disabling the WAL during RocksDB recovery.

The WAL is clearly not needed during recovery: it is never read, so there is no need to write it.

AFAIK, the last change related to the WAL during recovery was an attempt to remove it, which was later reverted (https://issues.apache.org/jira/browse/FLINK-8922). If I interpret the comments in the ticket correctly, the outcome was that a) the WAL was kept in recovery, and b) it is still unknown why removing the WAL causes a segfault.

The ticket shows that keeping the WAL carries a significant performance penalty, so getting rid of it would be a very nice performance improvement. I think it would be worth creating a new JIRA ticket, at least as a reminder that the WAL should be removed.

I'm planning to add an experimental flag that disables the WAL during recovery in the environment where I run Flink, and to try it out. If the flag is configurable, the WAL can always be re-enabled if disabling it causes issues.
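
As a rough illustration of the flag idea, here is a minimal Java sketch. The option key and the class names are hypothetical and are not existing Flink options. WriteOptionsStandIn only mimics the setDisableWAL(boolean) switch that org.rocksdb.WriteOptions provides, so that the sketch runs without RocksDB on the classpath. The default keeps the WAL enabled, which matches the current behaviour during recovery.

```java
import java.util.Properties;

/** Hypothetical sketch: a configurable switch for skipping the WAL during restore. */
final class RestoreWalConfig {
    // Hypothetical option key, made up for this example.
    static final String KEY = "state.backend.rocksdb.restore.disable-wal";

    /** Stand-in for org.rocksdb.WriteOptions, reduced to the WAL switch. */
    static final class WriteOptionsStandIn {
        private boolean disableWAL;

        WriteOptionsStandIn setDisableWAL(boolean flag) {
            this.disableWAL = flag;
            return this;
        }

        boolean disableWAL() {
            return disableWAL;
        }
    }

    /** Builds the write options used for restore writes; the WAL stays on unless the flag is set. */
    static WriteOptionsStandIn restoreWriteOptions(Properties conf) {
        boolean disable = Boolean.parseBoolean(conf.getProperty(KEY, "false"));
        return new WriteOptionsStandIn().setDisableWAL(disable);
    }
}
```

With an empty Properties object, restoreWriteOptions returns options that still write the WAL. Setting the key to "true" turns the WAL off, and removing the key again restores the default if disabling it causes issues.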

Thoughts?

Regards,
Juha

Re: Disable WAL in RocksDB recovery

Yu Li
Thanks for bringing this up Juha, and good catch.

We actually disable the WAL for routine writes by default when using RocksDB and have never encountered segmentation faults. However, according to the history in FLINK-8922, a segmentation fault occurs during restore if the WAL is disabled, so I guess the root cause lies in RocksDB's batch write path (org.rocksdb.WriteBatch). IMHO this is a RocksDB bug: it should work with the WAL disabled, for both single and batch writes.

+1 for opening a new JIRA to find the root cause, fix it, and disable the WAL during restore by default (checking the fixes around WriteBatch in later RocksDB versions might help locate the issue more quickly). Thanks for volunteering to take on the effort. I will follow up and help review any findings or PR submissions.

Best Regards,
Yu


On Wed, 16 Sep 2020 at 13:58, Juha Mynttinen <[hidden email]> wrote:
[...]

Re: Disable WAL in RocksDB recovery

Juha Mynttinen
Good,

I opened a JIRA ticket for the issue: https://issues.apache.org/jira/browse/FLINK-19303. The discussion can continue there.

Regards,
Juha

From: Yu Li <[hidden email]>
Sent: Friday, September 18, 2020 3:58 PM
To: Juha Mynttinen <[hidden email]>
Cc: [hidden email] <[hidden email]>
Subject: Re: Disable WAL in RocksDB recovery
 
[...]

Re: Disable WAL in RocksDB recovery

Yu Li
Great, thanks for the follow-up.

Best Regards,
Yu


On Mon, 21 Sep 2020 at 15:04, Juha Mynttinen <[hidden email]> wrote:
[...]