Using Google Cloud Storage for checkpointing

Using Google Cloud Storage for checkpointing

Rohil Surana
Hi

I am trying to set up checkpointing to Google Cloud Storage with Flink on Kubernetes, but I am running into issues with the Google Cloud Storage connector classes not loading, even though the logs show the connector jar on the classpath.

Logs showing the classpath - https://pastebin.com/R1P7Eepz
Logs showing the ClassNotFoundException for GoogleHadoopFileSystem - https://pastebin.com/LGMPzVbp
Hadoop core-site.xml - https://pastebin.com/CfEmTk2t
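
For reference, the GCS-related part of my core-site.xml is essentially the standard connector configuration, roughly along these lines (the key file path is a placeholder for wherever the secret is mounted):

  <configuration>
    <property>
      <name>fs.gs.impl</name>
      <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
    </property>
    <property>
      <name>fs.AbstractFileSystem.gs.impl</name>
      <value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
    </property>
    <property>
      <name>google.cloud.auth.service.account.enable</name>
      <value>true</value>
    </property>
    <property>
      <name>google.cloud.auth.service.account.json.keyfile</name>
      <value>/etc/secrets/gcs/key.json</value>
    </property>
  </configuration>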

What I have done in addition (sketches of the relevant pieces are below the list) -
1.) Built a new Flink image with the Google Cloud Storage connector jar in the /etc/flink/lib folder.
2.) Added the GCS service account credentials as a Kubernetes secret.
3.) Mounted the secret and the Hadoop and Flink ConfigMaps on the taskmanager and jobmanager deployments.
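
Roughly, the image and the secret look like this (jar file name, key file name, and secret name are placeholders rather than my exact values):

  # Dockerfile for the custom Flink image
  FROM flink:1.4.2
  COPY gcs-connector-hadoop2-latest.jar /etc/flink/lib/

  # GCS service account credentials as a Kubernetes secret
  kubectl create secret generic gcs-service-account --from-file=key.json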

Flink version - 1.4.2
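
The checkpoint directory itself points at GCS via flink-conf.yaml, roughly like this (bucket name is a placeholder):

  state.backend: filesystem
  state.checkpoints.dir: gs://my-bucket/flink/checkpoints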

Any help is appreciated.
Thank you.

Rohil

Re: Using Google Cloud Storage for checkpointing

Till Rohrmann
Hi Rohil,

This sounds a bit strange. If the jar containing GoogleHadoopFileSystem is on the classpath and the implementation is specified in core-site.xml, then Hadoop's FileSystem should be able to load the GCS filesystem. I just tried it out locally (without K8s, though) and it seemed to work.

Could you maybe share a bit more information about your setup? Which Hadoop version are you running? Could you also share the complete `JobManager` log with us? Does the same problem arise if you use Flink 1.5?
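
As a quick sanity check, you could also try to resolve the gs:// scheme directly against your Hadoop configuration from a small standalone program (the bucket name is a placeholder); if this throws the same ClassNotFoundException, the problem is in the Hadoop/connector setup rather than in Flink:

  import java.net.URI;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;

  public class GcsCheck {
      public static void main(String[] args) throws Exception {
          // Loads core-site.xml from the classpath (e.g. HADOOP_CONF_DIR)
          Configuration conf = new Configuration();
          // Resolving the gs:// scheme forces Hadoop to load the class named in fs.gs.impl
          FileSystem fs = FileSystem.get(URI.create("gs://my-bucket/"), conf);
          System.out.println("Loaded filesystem: " + fs.getClass().getName());
      }
  }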

Cheers,
Till

Re: Using Google Cloud Storage for checkpointing

Rohil Surana
Thanks Till,

When I tried the setup locally it seemed to work, but as soon as I submitted jobs it started throwing ClassNotFoundExceptions for some other classes. So I built a new shaded fat jar from the gcs-connector source code, which works both locally and on K8s.
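
For anyone who runs into the same problem: the general idea is to shade (relocate) the dependencies the connector pulls in, such as Guava, so they cannot clash with the classes that Flink and the job jars bring along. A maven-shade-plugin configuration along these lines illustrates the technique (plugin version and relocation pattern are examples, not my exact build):

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.1.1</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <relocations>
            <!-- Move Guava into a private namespace so it cannot conflict -->
            <relocation>
              <pattern>com.google.common</pattern>
              <shadedPattern>shaded.gcsconnector.com.google.common</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>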

Thanks for all the help.

Rohil

Re: Using Google Cloud Storage for checkpointing

Till Rohrmann
Glad to hear it! 
