PyFflink UDF Permission Denied

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

PyFflink UDF Permission Denied

Andrew Kramer
Hi,

I am using Flink in Zeppelin and trying to execute a UDF defined in Python.

The problem is I keep getting the following permission denied error in the log:
Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh": error=13, Permission denied at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:417) ... 14 more 

The code runs fine if I do not include the UDF. I have modified the java properties to use /test instead of /tmp

Any thoughts?

Thanks,
Andrew 
Reply | Threaded
Open this post in threaded view
|

Re: PyFflink UDF Permission Denied

Xingbo Huang
Hi Andrew,

According to the error, you can try to check the file permission of
"/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh"

Normally, the permission of this script would be
-rwxr-xr-x

Best,
Xingbo

Andrew Kramer <[hidden email]> 于2020年12月27日周日 下午10:29写道:
Hi,

I am using Flink in Zeppelin and trying to execute a UDF defined in Python.

The problem is I keep getting the following permission denied error in the log:
Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh": error=13, Permission denied at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:417) ... 14 more 

The code runs fine if I do not include the UDF. I have modified the java properties to use /test instead of /tmp

Any thoughts?

Thanks,
Andrew 
Reply | Threaded
Open this post in threaded view
|

Re: PyFflink UDF Permission Denied

Andrew Kramer
Hi Xingbo,

That file does not exist on the file system. 

Thanks,
Andrew

On Monday, 28 December 2020, Xingbo Huang <[hidden email]> wrote:
Hi Andrew,

According to the error, you can try to check the file permission of
"/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh"

Normally, the permission of this script would be
-rwxr-xr-x

Best,
Xingbo

Andrew Kramer <[hidden email]> 于2020年12月27日周日 下午10:29写道:
Hi,

I am using Flink in Zeppelin and trying to execute a UDF defined in Python.

The problem is I keep getting the following permission denied error in the log:
Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh": error=13, Permission denied at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:417) ... 14 more 

The code runs fine if I do not include the UDF. I have modified the java properties to use /test instead of /tmp

Any thoughts?

Thanks,
Andrew 
Reply | Threaded
Open this post in threaded view
|

Re: PyFflink UDF Permission Denied

Xingbo Huang
Hi Andrew,

Sorry for the late reply. It seems that I misunderstood your description. The script `pyflink-udf-runner.sh` you need to check exists in the `bin` directory of the `pyflink` package installed by the python interpreter you are using. You can execute the following command to find the path:

python -c "import pyflink;import os;print(os.path.dirname(os.path.abspath(pyflink.__file__))+'/bin')"

Regarding the question "The code runs fine if I do not include the UDF." is because when using python udf, you need to use `pyflink-udf-runner.sh` to create a python process that executes python udf code. If you don’t use python udf, the contents of the execution are all running on the JVM.

Best,
Xingbo

Andrew Kramer <[hidden email]> 于2020年12月28日周一 下午8:29写道:
Hi Xingbo,

That file does not exist on the file system. 

Thanks,
Andrew

On Monday, 28 December 2020, Xingbo Huang <[hidden email]> wrote:
Hi Andrew,

According to the error, you can try to check the file permission of
"/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh"

Normally, the permission of this script would be
-rwxr-xr-x

Best,
Xingbo

Andrew Kramer <[hidden email]> 于2020年12月27日周日 下午10:29写道:
Hi,

I am using Flink in Zeppelin and trying to execute a UDF defined in Python.

The problem is I keep getting the following permission denied error in the log:
Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.io.IOException: Cannot run program "/test/python-dist-78654584-bda6-4c76-8ef7-87b6fd256e4f/python-files/site-packages/site-packages/pyflink/bin/pyflink-udf-runner.sh": error=13, Permission denied at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432) at org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299) at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createStageBundleFactory(BeamPythonFunctionRunner.java:417) ... 14 more 

The code runs fine if I do not include the UDF. I have modified the java properties to use /test instead of /tmp

Any thoughts?

Thanks,
Andrew