hadoop error with flink mesos on startup


Jared Stehler
After upgrading to Flink 1.4.0 using the Hadoop-free build option, I’m seeing the following error on startup in the app master:

2017-12-12 18:23:15.473 [main] ERROR o.a.f.m.r.clusterframework.MesosApplicationMasterRunner - Mesos JobManager initialization failed
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.UserGroupInformation
at org.apache.flink.runtime.clusterframework.overlays.HadoopUserOverlay$Builder.fromEnvironment(HadoopUserOverlay.java:74)
at org.apache.flink.mesos.entrypoint.MesosEntrypointUtils.applyOverlays(MesosEntrypointUtils.java:145)
Looking at the code, it appears that HadoopUserOverlay always tries to initialize the UserGroupInformation class, and that initialization is failing. I get the same error with or without the flink-shaded-hadoop2 library included.

This is my lib dir:

flink-appmaster-1.0-SNAPSHOT.jar   flink-s3-fs-presto-1.4.0.jar       jul-to-slf4j-1.7.25.jar            sentry-1.5.3.jar
flink-dist_2.11-1.4.0.jar          flink-shaded-hadoop2-1.4.0.jar     log4j-over-slf4j-1.7.25.jar        sentry-logback-1.5.3.jar
flink-metrics-prometheus-1.4.0.jar jackson-core-2.8.10.jar            logback-classic-1.1.11.jar
flink-python_2.11-1.4.0.jar        jcl-over-slf4j-1.7.25.jar          logback-core-1.1.11.jar


--
Jared Stehler
Chief Architect - Intellify Learning
o: 617.701.6330 x703




Re: hadoop error with flink mesos on startup

Chesnay Schepler
Could you look into the flink-shaded-hadoop jar to check whether the missing class is actually contained?

Where did the flink-shaded-hadoop jar come from? I'm asking because when building flink-dist from source the jar is called flink-shaded-hadoop2-uber-1.4.0.jar, which does indeed contain the class. (The uber jar is created by building flink-shaded-hadoop2-uber.)
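
One way to run the check suggested above is to look for the class entry inside the jar programmatically. The following is a minimal sketch using the standard `java.util.jar.JarFile` API; the class name `JarClassCheck` and its helper methods are hypothetical names chosen for illustration and are not part of Flink.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Flink: checks whether a jar contains a given class file.
class JarClassCheck {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath contains a .class entry for className. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]) ? "present" : "absent");
    }
}
```

It could be run with the jar path and `org.apache.hadoop.security.UserGroupInformation` as arguments. Note that a "present" result only shows that the class file exists; it does not show that the class can be initialized.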

On 12.12.2017 19:28, Jared Stehler wrote:
After upgrading to flink 1.4.0 using the hadoop-free build option, I’m seeing the following error on startup in the app master: [...]


Re: hadoop error with flink mesos on startup

Jared Stehler
The class is there; this issue is a static initializer error, probably from other missing classes. I’ll try using the uber jar to see if that helps any, and will report back.
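
To illustrate why "Could not initialize class" points at a static initializer rather than at a missing class file, here is a small self-contained sketch. `StaticInitDemo` and `Fragile` are hypothetical names, and the thrown `IllegalStateException` only simulates a failing initializer; in the real case the initializer would more likely fail with its own `NoClassDefFoundError` naming the class that is actually missing.

```java
// Hypothetical demo, unrelated to the Flink or Hadoop code base.
class StaticInitDemo {

    /** Stand-in for a class whose static initializer depends on something unavailable. */
    static class Fragile {
        static final Object DEPENDENCY = load();

        private static Object load() {
            // Simulates a dependency that cannot be resolved during class initialization.
            throw new IllegalStateException("simulated missing dependency");
        }

        static void touch() { }
    }

    /** Triggers initialization of Fragile and returns the simple name of the error thrown. */
    static String attempt() {
        try {
            Fragile.touch();
            return "none";
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("first use:  " + attempt());
        System.out.println("second use: " + attempt());
    }
}
```

The first use fails with `ExceptionInInitializerError`, which carries the root cause. Every later use fails with `NoClassDefFoundError: Could not initialize class ...`, which carries no cause, so the first failure in the log is usually the one worth finding.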

I’ve included the shaded jar as a Maven dependency:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-shaded-hadoop2</artifactId>
      <version>${flink.version}</version>
    </dependency>



On Dec 12, 2017, at 2:10 PM, Chesnay Schepler wrote:
Could you look into the flink-shaded-hadoop jar to check whether the missing class is actually contained? [...]


Re: hadoop error with flink mesos on startup

Jared Stehler
In reply to this post by Chesnay Schepler
I had been excluding all transitive dependencies from the lib dir; it seems to work now that I have added the following dependencies:

    <dependency>
      <groupId>commons-configuration</groupId>
      <artifactId>commons-configuration</artifactId>
      <version>1.7</version>
    </dependency>

    <dependency>
      <groupId>commons-lang</groupId>
      <artifactId>commons-lang</artifactId>
      <version>2.6</version>
    </dependency>
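
A quick way to spot missing transitive dependencies like these is to probe the classpath for a few representative classes before starting the cluster. The sketch below is only an illustration: `ClasspathProbe` is a hypothetical helper, and the commons class names are examples picked from the two libraries above (assuming commons-configuration 1.x and commons-lang 2.x package names), not classes named in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: reports which classes cannot be found on the classpath.
class ClasspathProbe {

    /** Returns true if the class can be located, without running its static initializer. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the subset of classNames that could not be loaded. */
    static List<String> missing(List<String> classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isLoadable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example class names only; adjust to the dependencies being checked.
        List<String> candidates = Arrays.asList(
                "org.apache.hadoop.security.UserGroupInformation",
                "org.apache.commons.configuration.Configuration",
                "org.apache.commons.lang.StringUtils");
        System.out.println("missing: " + missing(candidates));
    }
}
```

Running it with the same classpath as the app master (for example, every jar in the lib dir) lists the classes that would fail to resolve at startup.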



Re: hadoop error with flink mesos on startup

Eron Wright
Thanks for investigating this, Jared. I would summarize it as follows: Flink-on-Mesos cannot be used in Hadoop-free mode in Flink 1.4.0. I filed an improvement ticket to support this scenario: FLINK-8247



On Tue, Dec 12, 2017 at 11:46 AM, Jared Stehler wrote:
I had been excluding all transitive dependencies from the lib dir; it seems to be working when I added the following deps: [...]