(DEPRECATED) Apache Flink User Mailing List archive.

HDFS Clustering

Giacomo Licari

HDFS Clustering

Hi guys,
I'm Giacomo from Italy, and I'm new to Flink.

I set up a cluster with Hadoop 1.2 and Flink.

I would like to ask how to run the WordCount example, taking the input file from HDFS (for example myuser/testWordCount/hamlet.txt) and writing the output to HDFS as well (for example myuser/testWordCount/output.txt).

I successfully ran the example on my local filesystem, and now I would like to test it with HDFS.

Thanks a lot guys,
Giacomo

Márton Balassi-2

Re: HDFS Clustering

Hey,

Just add the right prefix pointing to your HDFS file path:

bin/flink run -v flink-java-examples-*-WordCount.jar hdfs://hostname:port/PATH/TO/INPUT hdfs://hostname:port/PATH/TO/OUTPUT

Best,

Marton

On Tue, Feb 24, 2015 at 11:13 AM, Giacomo Licari <[hidden email]> wrote:

Hi guys,
I'm Giacomo from Italy, and I'm new to Flink.

I set up a cluster with Hadoop 1.2 and Flink.

I would like to ask how to run the WordCount example, taking the input file from HDFS (for example myuser/testWordCount/hamlet.txt) and writing the output to HDFS as well (for example myuser/testWordCount/output.txt).

I successfully ran the example on my local filesystem, and now I would like to test it with HDFS.

Thanks a lot guys,
Giacomo

Maximilian Michels

Re: HDFS Clustering

In reply to this post by Giacomo Licari

Hi Giacomo,

Congratulations on setting up a Flink cluster with HDFS :) To run the
WordCount example provided with Flink, you should first upload your
input file to HDFS. If you have not done so, please run

> hdfs dfs -put -p file:///home/user/yourinputfile hdfs:///wc_input

Then, you can use the Flink command-line tool to submit the WordCount job.

> ./bin/flink run -v examples/flink-java-examples-*-WordCount.jar hdfs:///wc_input hdfs:///wc_output

This should work if you configured HDFS correctly. If you haven't set
the default hdfs name (fs.default.name), you might have to use the
full HDFS URL. For example, if your namenode's address is
namenode.example.com at port 7777, then use
hdfs://namenode.example.com:7777/wc_input.
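
If it helps, here is a small dry-run sketch of the full-URL case. It only builds the two HDFS URLs and prints the Flink command so you can check it before running anything. The host and port are the example values from above, and the hdfs_url helper is a hypothetical name made up for this sketch, not part of Flink or Hadoop.

```shell
#!/bin/sh
# Dry-run sketch: builds fully qualified HDFS URLs and prints the Flink
# command instead of executing it.

# Assumed example values; replace them with your namenode's address.
NAMENODE_HOST="namenode.example.com"
NAMENODE_PORT="7777"

# hdfs_url PATH -> hdfs://host:port/PATH (hypothetical helper)
hdfs_url() {
  # Drop one leading slash so exactly one slash follows the port.
  path="${1#/}"
  printf 'hdfs://%s:%s/%s\n' "$NAMENODE_HOST" "$NAMENODE_PORT" "$path"
}

INPUT="$(hdfs_url /wc_input)"
OUTPUT="$(hdfs_url /wc_output)"

# Print the command for review; the * is quoted, so it is not expanded here.
echo "./bin/flink run -v examples/flink-java-examples-*-WordCount.jar $INPUT $OUTPUT"
```

Once the printed line looks right, you can paste it into a shell in your Flink directory, where the * is expanded to the actual jar name.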

Kind regards,
Max

Giacomo Licari

Re: HDFS Clustering

Thanks a lot Marton and Max, it worked perfectly.

Regards from Italy :)

On Tue, Feb 24, 2015 at 11:31 AM, Max Michels <[hidden email]> wrote:

Hi Giacomo,

Congratulations on setting up a Flink cluster with HDFS :) To run the
WordCount example provided with Flink, you should first upload your
input file to HDFS. If you have not done so, please run

> hdfs dfs -put -p file:///home/user/yourinputfile hdfs:///wc_input

Then, you can use the Flink command-line tool to submit the WordCount job.

> ./bin/flink run -v examples/flink-java-examples-*-WordCount.jar hdfs:///wc_input hdfs:///wc_output

This should work if you configured HDFS correctly. If you haven't set
the default hdfs name (fs.default.name), you might have to use the
full HDFS URL. For example, if your namenode's address is
namenode.example.com at port 7777, then use
hdfs://namenode.example.com:7777/wc_input.

Kind regards,
Max
