flink terasort


flink terasort

Bill Sparks
Just asking: is there an implementation of TeraSort for Flink?

Regards,
   Bill.
-- 
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

Re: flink terasort

Chiwan Park
There is a TeraSort implementation that uses the deprecated API:
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java

AFAIK, there is no implementation that uses the current API.

Regards,
Chiwan Park



On Jun 4, 2015, at 12:17 AM, Bill Sparks <[hidden email]> wrote:

Just asking, is there an implementation of terasort for flink? 

Regards,
   Bill.
-- 
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.


Re: flink terasort

Bill Sparks
Will take a look, thanks.
-- 
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Chiwan Park <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Wednesday, June 3, 2015 10:24 AM
To: "[hidden email]" <[hidden email]>
Subject: Re: flink terasort

There is a terasort implementation with deprecated API.
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java

AFAIK, there is no implementation with current API.

Regards,
Chiwan Park





Re: flink terasort

Fabian Hueske
A TeraSort implementation for the current DataSet API would look a bit different from the deprecated Record API.
Flink doesn't support automatic range partitioning, but you can still do a global sort and implement a TeraSort program. Use a custom partitioner (DataSet.partitionCustom()) that range-partitions the data, which works when the distribution of values is known, and then call DataSet.sortPartition() on the result.
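
To illustrate the idea, here is a minimal standalone sketch in plain Java with no Flink dependency. It is not a TeraSort implementation: the class name RangePartitionSketch, the int keys, and the assumption that keys are uniformly distributed over [0, 65536) are all hypothetical simplifications (real TeraSort keys are 10-byte strings). The partition(key, numPartitions) method has the same shape as the method a Flink Partitioner would provide for partitionCustom(), and the per-partition sort stands in for sortPartition().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Standalone sketch: range partitioning followed by a per-partition sort.
class RangePartitionSketch {

    // Assumption: keys are uniformly distributed over [0, KEY_SPACE).
    static final int KEY_SPACE = 1 << 16;

    // Maps contiguous key ranges to increasing partition indices, so every
    // key in partition i is smaller than every key in partition i + 1.
    static int partition(int key, int numPartitions) {
        long idx = (long) key * numPartitions / KEY_SPACE;
        return (int) Math.min(idx, numPartitions - 1);
    }

    // Simulates the job: distribute by range, sort each partition locally,
    // then read the partitions in index order to get a globally sorted result.
    static List<Integer> globalSort(List<Integer> keys, int numPartitions) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int key : keys) {
            partitions.get(partition(key, numPartitions)).add(key);
        }
        List<Integer> result = new ArrayList<>();
        for (List<Integer> p : partitions) {
            Collections.sort(p); // stands in for the local sort on each partition
            result.addAll(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(60000, 12, 33000, 4096, 65535, 0, 20000);
        System.out.println(globalSort(keys, 4));
    }
}
```

The global order only holds because the partitioner preserves key order across partitions; a hash partitioner followed by a local sort would give sorted partitions but not a sorted whole. If the key distribution is skewed, the fixed equal-width ranges above would produce unbalanced partitions, so the boundaries would need to come from the known distribution instead.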

Just drop a mail if you have further questions.

Cheers, Fabian

2015-06-03 17:34 GMT+02:00 Bill Sparks <[hidden email]>:
Will take a look, thanks.