This post was updated on .
'm coding with pyflink 1.10.0 and building cluster with flink 1.10.0
I define the source and the sink as following: <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t2674/%E6%97%A0%E6%A0%87%E9%A2%98.png> When I run this code only on master, it's OK. When I run this code on cluster, with 1 master and 1 salve, and I submit the task on master like this: sudo flink-1.10.0/bin/flink run -py main.py And error occurs like: Caused by: java.io.FileNotFoundException: The provided file path /opt/raw_data/input_json_65adbe54-cfdc-11ea-9d47-020012970011 does not exist. This file is stored on master's local file system. It seems that the slaves read their own file system instead of the master's. Or maybe there are other points I ignored (maybe some configurations in flink when I start the cluster). The question is how can I avoid the error? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Hi rookieCOder, You need to make sure that your files can be read by each slaves, so an alternative solution is to put your files on hdfs Best, Xingbo rookieCOder <[hidden email]> 于2020年7月27日周一 下午5:49写道: 'm coding with pyflink 1.10.0 and building cluster with flink 1.10.0 |
Hi, Xingbo
Thanks for your reply. So the point is that simply link the source or the sink to the master's local file system will cause the error that the slaves cannot read the source/sink files? Thus the simplest solution is to make sure that slaves have access to the master's local filesystem (by nfs or hdfs)? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Yes. You are right. Best, Xingbo rookieCOder <[hidden email]> 于2020年7月27日周一 下午6:30写道: Hi, Xingbo |
Hi,
And I've got another question. If I use user-defined function in pyflink, which only depends library A. And what the flink does is using the udf in tables. Does that mean I only need to install library A on the slaves? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ |
Hi, You can use the `set_python_requirements` method to specify your requirement.txt which you can refer to the documentation[1] for details [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/python/dependency_management.html#python-dependency Best, Xingbo rookieCOder <[hidden email]> 于2020年7月27日周一 下午8:29写道: Hi, |
Free forum by Nabble | Edit this page |