Contradictory docs: python.files config can include not only python files

Yik San Chan
Hi community,

The description of the python.files config option in the docs reads:

Attach custom python files for job.

This makes readers think only Python files are allowed here. However, in https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#submitting-pyflink-jobs:

./bin/flink run \
      --python examples/python/table/batch/word_count.py \
      --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt
This example clearly includes .txt files, which are not Python files.

I believe the two docs contradict each other. Can anyone confirm?

Best,
Yik San

Re: Contradictory docs: python.files config can include not only python files

Dian Fu
Hi Yik San,

1) What `--pyFiles` is used for:
All the files specified via `--pyFiles` are added to the PYTHONPATH of the Python worker, so they are available to the Python user-defined functions during execution.

2) Validation of the files passed to `--pyFiles`:
Currently, the files passed to this argument are not validated. I also think such a check is neither necessary nor feasible. Do you have any suggestions?

Regards,
Dian
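
As an illustration of point 1) in the reply above, here is a minimal sketch in plain Python (no Flink involved) of what "being on the PYTHONPATH" gives a function at run time. A temporary directory stands in for whatever location the Python worker adds to its path. That a non-Python file such as user.txt ends up reachable through a path entry is an assumption for illustration, not something confirmed in this thread. The names `find_on_path` and `my_helper` are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def find_on_path(filename):
    """Return the full path of `filename` under the first sys.path entry that has it, else None."""
    for entry in sys.path:
        candidate = os.path.join(entry, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Stand-in for a directory the worker would add to PYTHONPATH (assumed layout).
shipped_dir = tempfile.mkdtemp()
with open(os.path.join(shipped_dir, "my_helper.py"), "w") as f:
    f.write("def shout(s):\n    return s.upper()\n")
with open(os.path.join(shipped_dir, "user.txt"), "w") as f:
    f.write("alice\n")

# Putting the directory first on sys.path mimics an extended PYTHONPATH.
sys.path.insert(0, shipped_dir)
importlib.invalidate_caches()

# A Python file on the path can be imported by module name.
my_helper = importlib.import_module("my_helper")

# A non-Python file cannot be imported, but it can be located and opened.
with open(find_on_path("user.txt")) as f:
    user = f.read().strip()

greeting = my_helper.shout(user)
```

Under these assumptions the path mechanism itself does not care about file extensions: only importing is limited to Python modules, while any other file can still be found and read.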

On Apr 26, 2021, at 8:45 PM, Yik San Chan <[hidden email]> wrote:

Hi community,

The description of the python.files config option in the docs reads:

Attach custom python files for job.

This makes readers think only Python files are allowed here. However, in https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#submitting-pyflink-jobs:

./bin/flink run \
      --python examples/python/table/batch/word_count.py \
      --pyFiles file:///user.txt,hdfs:///$namenode_address/username.txt
This example clearly includes .txt files, which are not Python files.

I believe the two docs contradict each other. Can anyone confirm?

Best,
Yik San

Re: Contradictory docs: python.files config can include not only python files

Yik San Chan
Hi Dian,

It is still not clear to me - does it only allow Python files (.py), or not?

Best,
Yik San

On Mon, Apr 26, 2021 at 9:15 PM Dian Fu <[hidden email]> wrote:
Hi Yik San,

1) What `--pyFiles` is used for:
All the files specified via `--pyFiles` are added to the PYTHONPATH of the Python worker, so they are available to the Python user-defined functions during execution.

2) Validation of the files passed to `--pyFiles`:
Currently, the files passed to this argument are not validated. I also think such a check is neither necessary nor feasible. Do you have any suggestions?

Regards,
Dian

Re: Contradictory docs: python.files config can include not only python files

Dian Fu
Hi Yik San,

All files that can be put on the PYTHONPATH are allowed here, e.g. .zip, .whl, etc.

Regards,
Dian
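
To make the .zip case in the reply above concrete, here is a minimal sketch in plain Python (no Flink involved) showing that an archive placed on the path is importable through the interpreter's built-in zip import support. The archive name `deps.zip` and the module `zipped_helper` are hypothetical. A .whl file is also a zip archive, so a pure-Python wheel can often be imported the same way, but that is an assumption to verify for any given package.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a small archive containing one pure-Python module.
archive = os.path.join(tempfile.mkdtemp(), "deps.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_helper.py", "def double(x):\n    return 2 * x\n")

# Adding the archive itself (not its directory) to sys.path mimics
# an archive listed on PYTHONPATH.
sys.path.insert(0, archive)
importlib.invalidate_caches()

# The module is loaded straight from inside the archive.
zipped_helper = importlib.import_module("zipped_helper")
result = zipped_helper.double(21)
```

The design point is that the path entry is the archive file, so nothing needs to be unpacked before the import.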

On Apr 27, 2021, at 8:16 AM, Yik San Chan <[hidden email]> wrote:

Hi Dian,

It is still not clear to me - does it only allow Python files (.py), or not?

Best,
Yik San

Re: Contradictory docs: python.files config can include not only python files

Yik San Chan
Hi Dian,

If that's the case, shall we reword "Attach custom python files for job." to "Attach custom files that can be put on the PYTHONPATH, e.g. .zip, .whl, etc."?

Best,
Yik San

On Tue, Apr 27, 2021 at 10:08 AM Dian Fu <[hidden email]> wrote:
Hi Yik San,

All files that can be put on the PYTHONPATH are allowed here, e.g. .zip, .whl, etc.

Regards,
Dian

Re: Contradictory docs: python.files config can include not only python files

Dian Fu
Thanks for the suggestion. It makes sense to me.

On Apr 27, 2021, at 10:28 AM, Yik San Chan <[hidden email]> wrote:

Hi Dian,

If that's the case, shall we reword "Attach custom python files for job." to "Attach custom files that can be put on the PYTHONPATH, e.g. .zip, .whl, etc."?

Best,
Yik San

Re: Contradictory docs: python.files config can include not only python files

Yik San Chan
Hi Dian,

I created a PR to fix the docs. https://github.com/apache/flink/pull/15779

On Tue, Apr 27, 2021 at 2:08 PM Dian Fu <[hidden email]> wrote:
Thanks for the suggestion. It makes sense to me.
