Bandwidth throttling of checkpoints uploading to s3

Bandwidth throttling of checkpoints uploading to s3

Pavel Potseluev
Hello!
 
We use Flink with periodic checkpointing to an S3 file system, and when Flink uploads a checkpoint to S3 it puts a high load on the network. We have found that the AWS CLI S3 configuration has an option called max_bandwidth which limits the transfer rate in bytes per second. Is there a way to get the same functionality with Flink?
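For reference, the AWS CLI option referred to above is set in ~/.aws/config; a minimal fragment might look like this (the 50MB/s value is only an example):

```ini
# ~/.aws/config -- throttles transfers made by the "aws s3" commands
[default]
s3 =
  max_bandwidth = 50MB/s
```

Note that this setting only affects transfers performed by the AWS CLI itself; it does not apply to Flink's S3 filesystem connectors.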
 
-- 
Best regards,
Pavel Potseluev
Software developer, Yandex.Classifieds LLC
 
Re: Bandwidth throttling of checkpoints uploading to s3

Yu Li
Hi Pavel,

Currently there's no such throttling functionality in Flink, and I think it's a valid requirement. But before opening a JIRA for this, please allow me to ask for more details to better understand your scenario:
1. What kind of state backend are you using? Since you observe high network load, I guess the state is large and you are using the RocksDB backend?
2. If you're using the RocksDB backend, have you configured it to use incremental checkpoints? To be more specific, have you set the "state.backend.incremental" property to true? (By default it's false.)
3. If you're using the RocksDB backend with full checkpoints, what would the incremental size of a checkpoint be (within one checkpoint interval)?
4. What's the maximum bandwidth you'd like to throttle S3 uploads to?

I'm asking because if you're using RocksDB with full checkpoints and the incremental checkpoint size is small enough not to exceed your intended S3 cap, incremental checkpoints alone might resolve the current problem.
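As a sketch, switching to RocksDB with incremental checkpoints is a configuration change along these lines (the bucket path is a placeholder):

```yaml
# flink-conf.yaml -- RocksDB state backend with incremental checkpoints
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: s3://my-bucket/flink-checkpoints
```

Equivalently, when configuring the backend programmatically, the RocksDBStateBackend constructor takes an enableIncrementalCheckpointing flag.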

Thanks.

Best Regards,
Yu


Re: Bandwidth throttling of checkpoints uploading to s3

Pavel Potseluev
  1. We use FsStateBackend, and the state snapshot size is about 700 MB.
  2. We are thinking about migrating to RocksDBStateBackend and turning on incremental checkpoints.
  3. I think the incremental size would be small in our current use case, so incremental checkpoints can solve the problem.
  4. I think it is about 50 Mbit/s.
Thanks.
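For a sense of scale, a full 700 MB snapshot at a 50 Mbit/s cap would take about 112 seconds per checkpoint; a quick back-of-the-envelope estimate (a standalone sketch, not Flink code):

```java
public class CheckpointTransferEstimate {

    /** Seconds needed to upload a snapshot of the given size at the given bandwidth cap. */
    static double transferSeconds(double sizeMegabytes, double capMegabitsPerSec) {
        // 8 bits per byte: convert megabytes to megabits, then divide by the cap.
        return sizeMegabytes * 8.0 / capMegabitsPerSec;
    }

    public static void main(String[] args) {
        // 700 MB full snapshot, 50 Mbit/s cap (the figures from this thread).
        System.out.println(transferSeconds(700, 50)); // 112.0
    }
}
```

So a full snapshot would occupy most of a short checkpoint interval at that cap, which is why a small incremental delta is the more attractive option here.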
 
 
-- 
Best regards,
Pavel Potseluev
Software developer, Yandex.Classifieds LLC
 
Re: Bandwidth throttling of checkpoints uploading to s3

Yu Li
Thanks for the information Pavel, good to know.

And I've created FLINK-13251 to introduce the checkpoint bandwidth throttling feature, FYI.
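One plausible shape for such a feature is a rate-limiting wrapper around the checkpoint output stream. The sketch below is purely illustrative and is not the design in FLINK-13251 (the class name and approach are made up for this example): it delays writes so that the average throughput stays under a configured bytes-per-second cap.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative sketch only: an OutputStream wrapper that caps average write throughput. */
public class ThrottledOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long bytesWritten = 0;

    public ThrottledOutputStream(OutputStream delegate, long bytesPerSecond) {
        this.delegate = delegate;
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public void write(int b) throws IOException {
        throttle(1);
        delegate.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        throttle(len);
        delegate.write(buf, off, len);
    }

    /** Sleep until the running average rate drops back under the cap (simple average-rate limiter). */
    private void throttle(int bytes) throws IOException {
        bytesWritten += bytes;
        long elapsedNanos = System.nanoTime() - startNanos;
        long expectedNanos = bytesWritten * 1_000_000_000L / bytesPerSecond;
        long sleepNanos = expectedNanos - elapsedNanos;
        if (sleepNanos > 0) {
            try {
                Thread.sleep(sleepNanos / 1_000_000, (int) (sleepNanos % 1_000_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ThrottledOutputStream out = new ThrottledOutputStream(sink, 1_000_000); // ~1 MB/s cap
        byte[] chunk = new byte[100_000];
        for (int i = 0; i < 5; i++) {
            out.write(chunk, 0, chunk.length); // 500 KB total, so roughly 0.5 s at the cap
        }
        out.close();
        System.out.println(sink.size()); // 500000
    }
}
```

A production version would need to handle long-running streams (a sliding window rather than a whole-stream average) and make the cap configurable per job, but the core idea is this small.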

Best Regards,
Yu


Re: Bandwidth throttling of checkpoints uploading to s3

Pavel Potseluev
Great, thank you.
 
-- 
Best regards,
Pavel Potseluev
Software developer, Yandex.Classifieds LLC