HDFS Clustering

Giacomo Licari
Hi guys,
I'm Giacomo from Italy, and I'm a newbie with Flink.

I set up a cluster with Hadoop 1.2 and Flink.

I would like to ask how to run the WordCount example taking the input file from HDFS (for example myuser/testWordCount/hamlet.txt) and writing the output to HDFS as well (for example myuser/testWordCount/output.txt).

I successfully ran the example on my local filesystem, and now I would like to test it with HDFS.

Thanks a lot guys,
Giacomo

Re: HDFS Clustering

Márton Balassi-2
Hey,

Just add the right prefix pointing to your HDFS file path:

bin/flink run -v flink-java-examples-*-WordCount.jar hdfs://hostname:port/PATH/TO/INPUT hdfs://hostname:port/PATH/TO/OUTPUT
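As a rough illustration of the command above, here is a minimal sketch that fills in the hostname:port placeholders and prints the resulting command line instead of running it. The host, port, and the absolute HDFS paths are assumptions made up for the example (the paths are loosely based on the ones in the question), and wordcount_cmd is a hypothetical helper, not part of Flink.

```shell
#!/bin/sh
# Hypothetical namenode address: replace with your own host and port.
NAMENODE_HOST="${NAMENODE_HOST:-namenode.example.com}"
NAMENODE_PORT="${NAMENODE_PORT:-7777}"

# Assumed absolute HDFS paths, loosely based on the paths in the question.
INPUT_PATH="/myuser/testWordCount/hamlet.txt"
OUTPUT_PATH="/myuser/testWordCount/output.txt"

# wordcount_cmd HOST PORT INPUT_PATH OUTPUT_PATH
# Prints (does not run) the submit command with fully qualified HDFS URIs.
wordcount_cmd() {
    printf 'bin/flink run -v flink-java-examples-*-WordCount.jar %s %s\n' \
        "hdfs://$1:$2$3" "hdfs://$1:$2$4"
}

wordcount_cmd "$NAMENODE_HOST" "$NAMENODE_PORT" "$INPUT_PATH" "$OUTPUT_PATH"
```

Once the printed line looks right for your cluster, it can be pasted into a shell in the Flink directory.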

Best,

Marton

On Tue, Feb 24, 2015 at 11:13 AM, Giacomo Licari <[hidden email]> wrote:
> Hi guys,
> I'm Giacomo from Italy, and I'm a newbie with Flink.
>
> I set up a cluster with Hadoop 1.2 and Flink.
>
> I would like to ask how to run the WordCount example taking the input
> file from HDFS (for example myuser/testWordCount/hamlet.txt) and writing
> the output to HDFS as well (for example myuser/testWordCount/output.txt).
>
> I successfully ran the example on my local filesystem, and now I would
> like to test it with HDFS.
>
> Thanks a lot guys,
> Giacomo


Re: HDFS Clustering

Maximilian Michels
In reply to this post by Giacomo Licari
Hi Giacomo,

Congratulations on setting up a Flink cluster with HDFS :) To run the
WordCount example provided with Flink, you should first upload your
input file to HDFS. If you have not done so, please run

> hdfs dfs -put -p file:///home/user/yourinputfile hdfs:///wc_input

Then, you can use the Flink command-line tool to submit the WordCount job.

> ./bin/flink run -v examples/flink-java-examples-*-WordCount.jar hdfs:///wc_input hdfs:///wc_output


This should work if you configured HDFS correctly. If you haven't set
the default hdfs name (fs.default.name), you might have to use the
full HDFS URL. For example, if your namenode's address is
namenode.example.com at port 7777, then use
hdfs://namenode.example.com:7777/wc_input.
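To make the difference between the short and the full URL form concrete, here is a small sketch. It only prints the two commands from this message rather than executing them, so it can be tried without a cluster. hdfs_uri and print_steps are hypothetical helper names introduced for illustration; the namenode address is the example one used above.

```shell
#!/bin/sh
# hdfs_uri PATH [HOST PORT]
# With only a path, emit the short form, which relies on the configured
# default file system (fs.default.name). With host and port, spell out
# the namenode address in the URI.
hdfs_uri() {
    uri_path="$1"; uri_host="${2:-}"; uri_port="${3:-}"
    if [ -n "$uri_host" ] && [ -n "$uri_port" ]; then
        printf 'hdfs://%s:%s%s\n' "$uri_host" "$uri_port" "$uri_path"
    else
        printf 'hdfs://%s\n' "$uri_path"
    fi
}

# print_steps [HOST PORT]
# Prints (does not run) the upload step followed by the submit step.
print_steps() {
    input_uri="$(hdfs_uri /wc_input "$@")"
    output_uri="$(hdfs_uri /wc_output "$@")"
    echo "hdfs dfs -put -p file:///home/user/yourinputfile $input_uri"
    echo "./bin/flink run -v examples/flink-java-examples-*-WordCount.jar $input_uri $output_uri"
}

print_steps                              # short form, default file system set
print_steps namenode.example.com 7777    # full form, explicit namenode
```

The first call reproduces the two commands as written above; the second shows what they would look like with the namenode spelled out.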


Kind regards,
Max


Re: HDFS Clustering

Giacomo Licari
Thanks a lot Marton and Max, it worked perfectly.

Regards from Italy :)

On Tue, Feb 24, 2015 at 11:31 AM, Max Michels <[hidden email]> wrote:
> Hi Giacomo,
>
> Congratulations on setting up a Flink cluster with HDFS :) To run the
> WordCount example provided with Flink, you should first upload your
> input file to HDFS. If you have not done so, please run
>
> > hdfs dfs -put -p file:///home/user/yourinputfile hdfs:///wc_input
>
> Then, you can use the Flink command-line tool to submit the WordCount job.
>
> > ./bin/flink run -v examples/flink-java-examples-*-WordCount.jar hdfs:///wc_input hdfs:///wc_output
>
> This should work if you configured HDFS correctly. If you haven't set
> the default hdfs name (fs.default.name), you might have to use the
> full HDFS URL. For example, if your namenode's address is
> namenode.example.com at port 7777, then use
> hdfs://namenode.example.com:7777/wc_input.
>
> Kind regards,
> Max
