Running a Beam Pipeline on GCP Dataproc Flink Cluster


Xander Song
I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I have followed the instructions at this repo to create a Flink cluster on Dataproc using an initialization action. However, the resulting cluster uses version 1.5.6 of Flink, and my project requires a more recent version (version 1.7, 1.8, or 1.9) for compatibility with Beam.

Inside the flink.sh script in the linked repo, there is a line for installing Flink from a snapshot URL instead of apt. Is this the intended mechanism for installing a different version of Flink with the initialization script? If so, how is it meant to be used?
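To make the question concrete, here is a minimal sketch of how the snapshot URL might be supplied when the cluster is created. It only assembles and prints a gcloud command; it does not run anything. The `--initialization-actions` and `--metadata` flags are standard gcloud options, but the metadata key `flink-snapshot-url` is an assumption about what flink.sh reads and should be checked against the script itself. The cluster name, region, bucket path, and tarball URL are placeholders, not values from the repo.

```python
import shlex


def build_create_command(cluster, region, init_action_uri, flink_tarball_url):
    """Assemble (but do not run) a gcloud command that creates the cluster."""
    return [
        "gcloud", "dataproc", "clusters", "create", cluster,
        "--region", region,
        # Points Dataproc at the initialization script to run on each node.
        "--initialization-actions", init_action_uri,
        # ASSUMPTION: flink.sh reads a metadata key with this name.
        # Verify the actual key in the script before relying on it.
        "--metadata", f"flink-snapshot-url={flink_tarball_url}",
    ]


cmd = build_create_command(
    cluster="beam-flink",                              # hypothetical name
    region="us-central1",                              # hypothetical region
    init_action_uri="gs://my-bucket/flink/flink.sh",   # hypothetical copy of the script
    # Example Flink binary release; substitute the version Beam requires.
    flink_tarball_url=(
        "https://archive.apache.org/dist/flink/flink-1.9.1/"
        "flink-1.9.1-bin-scala_2.11.tgz"
    ),
)

# Print a shell-safe version of the command for review or copy-paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

If the script does honor such a key, this would let the Flink version be chosen per cluster without editing flink.sh; whether that is the intended usage is exactly what I am asking.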

Thank you in advance.

Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

Ismaël Mejía

On Fri, Feb 7, 2020 at 12:54 AM Xander Song <[hidden email]> wrote:
> I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I have followed the instructions at this repo to create a Flink cluster on Dataproc using an initialization action. However, the resulting cluster uses version 1.5.6 of Flink, and my project requires a more recent version (version 1.7, 1.8, or 1.9) for compatibility with Beam.
>
> Inside of the flink.sh script in the linked repo, there is a line for installing Flink from a snapshot URL instead of apt. Is this the correct mechanism for installing a different version of Flink using the initialization script? If so, how is it meant to be used?
>
> Thank you in advance.