Using native libraries in Flink EMR jobs

5 messages

Using native libraries in Flink EMR jobs

Timur Fayruzov
Hello,

I'm not sure whether this is a Hadoop question or a Flink-specific one, but since I ran into it in the context of Flink I'm asking here. I'd be glad if anyone can suggest a more appropriate place.

I have a native library that I need to use in a Flink batch job that I run on EMR, and I'm trying to point the JVM at the location of the native library. Normally, I'd do this with the java.library.path parameter, so I try to run as follows:
`
HADOOP_CONF_DIR=/etc/hadoop/conf JVM_ARGS=-Djava.library.path=<native_lib_dir> flink-1.0.0/bin/flink run -m yarn-cluster -yn 1 -yjm 768 -ytm 768 <my.jar>
`
It does not work; it fails with `java.lang.UnsatisfiedLinkError` when trying to load the native library. It probably has to do with YARN not passing this parameter to the task nodes, but my understanding of this mechanism is quite limited so far.
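The failure mode can be reproduced locally with a minimal sketch. `nativedemo` is a made-up library name here: `System.loadLibrary` maps it to `libnativedemo.so` on Linux and searches only the directories listed in the `java.library.path` system property, which is why the error appears when that property never reaches the task JVMs.

```java
// Minimal sketch of the loading pattern that fails on the task managers.
// "nativedemo" is a hypothetical library name used for illustration.
public class NativeLoadDemo {
    static String describeLoad(String name) {
        try {
            System.loadLibrary(name);
            return "loaded " + name;
        } catch (UnsatisfiedLinkError e) {
            // Thrown when no directory on java.library.path contains the
            // library -- the same error the YARN containers report.
            return "UnsatisfiedLinkError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("search path: " + System.getProperty("java.library.path"));
        System.out.println(describeLoad("nativedemo"));
    }
}
```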

I dug up this Jira ticket: https://issues.apache.org/jira/browse/MAPREDUCE-3693, but setting LD_LIBRARY_PATH in mapreduce.admin.user.env did not solve the problem either.

Any help or hint where to look is highly appreciated.

Thanks,
Timur
Re: Using native libraries in Flink EMR jobs

Timur Fayruzov
There is a hack for this issue: copying my native library to $HADOOP_HOME/lib/native makes it discoverable and the program runs. However, this is not an appropriate solution and it seems fragile.

I tried to find where the 'lib/native' path appears in the configuration and found two places:
hadoop-env.sh: export JAVA_LIBRARY_PATH="$JAVA_LIBRARY_PATH:/usr/lib/hadoop-lzo/lib/native"
mapred-site.xml: the mapreduce.admin.user.env key

I tried adding the path to the directory with my native lib in both places, but still no luck.
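The hack above can be sketched as follows. The paths are illustrative only: a scratch directory stands in for the real Hadoop install, since the point is just that $HADOOP_HOME/lib/native is already on the default native-library search path.

```shell
# Fragile workaround: drop the library into Hadoop's default native-library
# directory, which the containers already search. A scratch directory stands
# in for the real $HADOOP_HOME here; on EMR it would be the actual install.
HADOOP_HOME=$(mktemp -d)
mkdir -p "$HADOOP_HOME/lib/native"
touch libmylib.so                          # stand-in for the real .so file
cp libmylib.so "$HADOOP_HOME/lib/native/"
ls "$HADOOP_HOME/lib/native"
```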

Thanks,
Timur


Re: Using native libraries in Flink EMR jobs

Till Rohrmann

Hi Timur,

what you can try is to pass the JVM parameter -Djava.library.path=<path> to the system via env.java.opts. You simply have to add env.java.opts: "-Djava.library.path=<path>" to flink-conf.yaml, or pass -Denv.java.opts="-Djava.library.path=<path>", if I'm not mistaken.
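As a configuration fragment, with the library directory as a placeholder, this would look roughly like:

```yaml
# flink-conf.yaml -- JVM options forwarded to the processes Flink starts
env.java.opts: "-Djava.library.path=/path/to/native/libs"
```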

Cheers
Till


Re: Using native libraries in Flink EMR jobs

Till Rohrmann

To pass the dynamic property directly when running on YARN, you have to use -yDenv.java.opts="..."
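Combined with the command line from the first message, the full invocation would look roughly like this (the cluster sizes and the library directory are the placeholders from that message):

```shell
HADOOP_CONF_DIR=/etc/hadoop/conf flink-1.0.0/bin/flink run \
  -m yarn-cluster -yn 1 -yjm 768 -ytm 768 \
  -yDenv.java.opts="-Djava.library.path=<native_lib_dir>" \
  <my.jar>
```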



Re: Using native libraries in Flink EMR jobs

Timur Fayruzov
Thank you, Till! Setting the flag in flink-conf.yaml worked; I'm very glad that it was resolved. Note, however, that passing it as an argument to the flink script did not work. I tried passing it as `-yDenv.java.opts="-Djava.library.path=<path>"`. I did not investigate further at this time.

Thanks,
Timur
