flink use hdfs DistributedCache


flink use hdfs DistributedCache

何春平
hi everyone!
Can Flink submit a job that reads a custom file distributed via the HDFS DistributedCache? Spark can do that with the following command:
    bin/spark-submit --master yarn --deploy-mode cluster --files /opt/its007-datacollection-conf.properties#its007-datacollection-conf.properties ...
The Spark driver can then read the `its007-datacollection-conf.properties` file in its working directory.

thanks!
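For reference, the working-directory read described above can be sketched locally like this (the property names and values are invented, and the `cat` step merely simulates the file that YARN would localize into the driver's working directory):

```shell
# Simulate the file that YARN localizes into the driver's working directory
# (the filename comes from the question; the contents here are invented).
cat > its007-datacollection-conf.properties <<'EOF'
collector.host=localhost
collector.port=9090
EOF

# Read a property by bare filename, as the driver would after --files shipping:
grep '^collector.port=' its007-datacollection-conf.properties | cut -d= -f2
```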

Re: flink use hdfs DistributedCache

Rong Rong
I am not sure if this suits your use case, but the Flink YARN CLI does support transferring local resources to all YARN nodes.
Simply using [1]:
bin/flink run -m yarn-cluster -yt <local_resource>
or
bin/flink run -m yarn-cluster --yarnship <local_resource>
should do the trick.

It may not be using the HDFS DistributedCache API, though.

Thanks,
Rong
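To illustrate the effect of `-yt`/`--yarnship` without a YARN cluster, here is a minimal local simulation (the directory and file names are invented; the `cp` step stands in for YARN localizing the shipped files into each container's working directory):

```shell
# Hypothetical shipped directory and container working directory.
mkdir -p shipdir container_workdir
echo 'app.name=its007' > shipdir/app.properties   # invented config file

# YARN would localize everything passed via -yt into the container's
# working directory; cp simulates that step here.
cp shipdir/* container_workdir/

# Inside the container, the job can open the file by its bare name:
(cd container_workdir && grep '^app.name=' app.properties | cut -d= -f2)
```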


On Sun, Sep 2, 2018 at 2:07 AM 何春平 <[hidden email]> wrote:

Re: flink use hdfs DistributedCache

何春平
Rong, thanks for your reply!
This is exactly what I need!


------------------ Original Message ------------------
From: "Rong Rong" <[hidden email]>
Date: Monday, September 3, 2018, 0:02 AM
To: "何春平" <[hidden email]>
Cc: "user" <[hidden email]>
Subject: Re: flink use hdfs DistributedCache
