How to tune Hadoop version in flink shaded jar to Hadoop version actually used?


徐涛
Hi Experts,
        When running Flink on YARN, ClusterEntrypoint prints out the system environment info.
        One of the entries is "Hadoop version: 2.4.1", which I think comes from the flink-shaded-hadoop2 jar. However, the Hadoop version actually installed on the system is 2.7.2.
        Is it OK for the two versions to differ? Is it best practice to align the Hadoop version that Flink is built against with the Hadoop version actually in use?

        Thanks a lot.

Best
Henry

vino yang
Hi Henry,

> Is it OK if the version is different?

I don't think it is OK, because you would be using a lower-version client to access a higher-version server.

> Is it a best practice to adjust flink Hadoop version to the Hadoop version actually used?

I personally recommend keeping the two versions consistent, to rule out problems caused by the mismatch.
In fact, Flink provides pre-built Hadoop 2.7.x bundles for you to download. [1]
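
As a rough illustration of checking for this kind of mismatch, the sketch below pulls the version out of a "Hadoop version: ..." log line and compares its major.minor part with the cluster's version. The sample log line, the hard-coded cluster version, and the helper names flink_log_hadoop_version and same_major_minor are assumptions made for the example, not part of Flink or Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: extract the version from a log line that contains
# "Hadoop version: X.Y.Z" (the surrounding log format is an assumption).
flink_log_hadoop_version() {
    printf '%s\n' "$1" | sed -n 's/.*Hadoop version: \([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: succeed only when both versions share the same major.minor prefix.
same_major_minor() {
    [ "$(printf '%s\n' "$1" | cut -d. -f1,2)" = "$(printf '%s\n' "$2" | cut -d. -f1,2)" ]
}

# Illustrative input; a real line would come from the Flink log on YARN.
bundled=$(flink_log_hadoop_version 'INFO ClusterEntrypoint - Hadoop version: 2.4.1')
cluster=2.7.2   # assumed here; in practice taken from the cluster itself

if same_major_minor "$bundled" "$cluster"; then
    echo "match: Flink bundles $bundled, cluster runs $cluster"
else
    echo "mismatch: Flink bundles $bundled, cluster runs $cluster"
fi
```

Comparing only major.minor is a simplification chosen for this sketch; whether a patch-level difference matters is a separate question.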


Thanks, vino.

In reply to 徐涛 <[hidden email]> (Fri, Oct 26, 2018, 9:13 AM)

徐涛
Hi Vino,
Because I build the project with Maven, I may not be able to use the jars downloaded directly from the web.
When building with Maven, how can I set the Hadoop version to match the one actually in use?
Thanks a lot!

Best 
Henry

In reply to vino yang <[hidden email]> (Oct 26, 2018, 10:02 AM)


vino yang
Hi Henry,

You just need to change the value of the "hadoop.version" property in the parent pom file.
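
A minimal sketch of that change, done with sed rather than by hand. It assumes the property sits on a single line as a <hadoop.version> element, as in the throwaway file created here; the helper names pom_hadoop_version and set_pom_hadoop_version are made up for illustration, and on a real checkout you would point them at the parent pom.xml.

```shell
#!/bin/sh
# Hypothetical helper: print the first <hadoop.version> value found in a pom file.
pom_hadoop_version() {
    sed -n 's:.*<hadoop\.version>\([^<]*\)</hadoop\.version>.*:\1:p' "$1" | head -n 1
}

# Hypothetical helper: rewrite the <hadoop.version> value; $1 = pom file, $2 = new version.
set_pom_hadoop_version() {
    sed "s:<hadoop\.version>[^<]*</hadoop\.version>:<hadoop.version>$2</hadoop.version>:" "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# Demo on a throwaway file standing in for the parent pom.
demo_pom=$(mktemp)
cat > "$demo_pom" <<'EOF'
<project>
  <properties>
    <hadoop.version>2.4.1</hadoop.version>
  </properties>
</project>
EOF

echo "before: $(pom_hadoop_version "$demo_pom")"
set_pom_hadoop_version "$demo_pom" 2.7.2
echo "after:  $(pom_hadoop_version "$demo_pom")"
rm -f "$demo_pom"
```

Editing the file with a text editor works just as well; the script form is only convenient when the same change has to be repeated.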

Thanks, vino.

In reply to 徐涛 <[hidden email]> (Mon, Oct 29, 2018, 11:23 PM)

Hequn Cheng
Hi Henry,

You can specify the Hadoop version to build against on the command line:
mvn clean install -DskipTests -Dhadoop.version=2.6.1
More details here [1].
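
To avoid typing the version by hand, the command above can be composed from the version the cluster reports. The sketch below is only an illustration: it relies on the first line of `hadoop version` output having the form "Hadoop 2.7.2", uses canned input instead of calling hadoop, and only prints the Maven command rather than running it. The helper names parse_hadoop_cli_version and flink_build_cmd are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `hadoop version` output on stdin and print the version number.
# Only the first line ("Hadoop X.Y.Z") is considered.
parse_hadoop_cli_version() {
    awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Hypothetical helper: print (not run) the Maven command for the given Hadoop version.
flink_build_cmd() {
    printf 'mvn clean install -DskipTests -Dhadoop.version=%s\n' "$1"
}

# Demo with canned input; on a cluster node this would instead be:
#   version=$(hadoop version | parse_hadoop_cli_version)
version=$(printf 'Hadoop 2.7.2\nSubversion ...\n' | parse_hadoop_cli_version)
flink_build_cmd "$version"
```

Printing the command first makes it easy to check the detected version before starting a long build.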

Best, Hequn


In reply to vino yang <[hidden email]> (Tue, Oct 30, 2018, 10:02 AM)

徐涛
Hi Hequn & Vino,
In the end I rebuilt Flink after changing "hadoop.version" in the pom file.
Because Flink uses the Maven Shade Plugin to shade the Hadoop dependency, this also means I need to rebuild the shaded Hadoop jar each time I upgrade Flink.

Best
Henry

In reply to Hequn Cheng <[hidden email]> (Oct 30, 2018, 12:35 PM)
