DataSet.randomSplit()

DataSet.randomSplit()

Sourigna Phetsarath
All:

Does Flink DataSet have a randomSplit(weights:Array[Double], seed: Long): Array[DataSet[T]] function?
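
To make the requested semantics concrete, here is a minimal sketch on plain Scala collections. It is not Flink API and not the code from the pull request: the object name `RandomSplitSketch` and the use of `Seq` in place of `DataSet` are assumptions made purely for illustration. Each element gets one seeded uniform draw and lands in the part whose cumulative weight range contains that draw.

```scala
import scala.util.Random

// Hypothetical helper for illustration only; operates on Seq, not on a Flink DataSet.
object RandomSplitSketch {
  def randomSplit[T](data: Seq[T], weights: Array[Double], seed: Long): Array[Seq[T]] = {
    require(weights.nonEmpty && weights.forall(_ >= 0.0) && weights.sum > 0.0,
      "weights must be non-negative and sum to a positive value")
    val total = weights.sum
    // Normalized cumulative upper bounds, e.g. Array(0.8, 0.2) -> Array(0.8, 1.0).
    val bounds = weights.scanLeft(0.0)(_ + _).tail.map(_ / total)
    val rng = new Random(seed)
    // One uniform draw per element decides its part; the same seed and the same
    // input order reproduce the same split. Force a strict collection first so
    // the draws are consumed in order.
    val assigned = data.toVector.map { x =>
      val u = rng.nextDouble()
      val idx = bounds.indexWhere(b => u < b)
      // Guard against floating-point rounding leaving the last bound below 1.0.
      val part = if (idx >= 0) idx else weights.length - 1
      (part, x)
    }
    Array.tabulate[Seq[T]](weights.length)(i => assigned.filter(_._1 == i).map(_._2))
  }

  def main(args: Array[String]): Unit = {
    val Array(train, test) = randomSplit((1 to 1000).toVector, Array(0.8, 0.2), seed = 42L)
    println(s"train=${train.size} test=${test.size}")
  }
}
```

The part sizes only approximate the weights, since every element is assigned independently. A distributed version would also need a deterministic per-partition seeding scheme, which this sketch does not attempt.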

There is this pull request: https://github.com/apache/flink/pull/921

Does anyone have an update on the progress of this?

Thank you.

--

Gna Phetsarath
System Architect // AOL Platforms // Data Services // Applied Research Chapter
770 Broadway, 5th Floor, New York, NY 10003
o: 212.402.4871 // m: 917.373.7363
vvmr: 8890237 
aim: sphetsarath20 t: @sourigna


Re: DataSet.randomSplit()

Ufuk Celebi
Hey Sourigna,

that particular method is not part of Flink yet.

Did you have a look at the sampling methods in DataSetUtils? Maybe they can be helpful for what you are trying to achieve.

– Ufuk

On Wed, Mar 23, 2016 at 5:19 PM, Sourigna Phetsarath <[hidden email]> wrote:
All:

Does Flink DataSet have a randomSplit(weights:Array[Double], seed: Long): Array[DataSet[T]] function?

There is this pull request: https://github.com/apache/flink/pull/921

Does anyone have an update on the progress of this?

Thank you.

Re: DataSet.randomSplit()

Sourigna Phetsarath
Ufuk,

Thank you.  Yes, I saw the sampling methods in DataSetUtils and they are helpful. 

Just wanted to see if that particular method is on the road map for a future release.

-Gna

On Mon, Mar 28, 2016 at 6:22 AM, Ufuk Celebi <[hidden email]> wrote:
Hey Sourigna,

that particular method is not part of Flink yet.

Did you have a look at the sampling methods in DataSetUtils? Maybe they can be helpful for what you are trying to achieve.

– Ufuk

Re: DataSet.randomSplit()

Ufuk Celebi
Hey Gna! I think that it's not on the road map at the moment. Feel free to ping in the linked PR though. Probably Till can chime in there.

– Ufuk

On Mon, Mar 28, 2016 at 5:16 PM, Sourigna Phetsarath <[hidden email]> wrote:
Ufuk,

Thank you.  Yes, I saw the sampling methods in DataSetUtils and they are helpful. 

Just wanted to see if that particular method is on the road map for a future release.

-Gna

Re: DataSet.randomSplit()

Till Rohrmann
Hi,

I think Ufuk is completely right. As far as I know, we don't support this function and nobody's currently working on it. If you like, then you could take the lead there.

Cheers,
Till

On Mon, Mar 28, 2016 at 10:50 PM, Ufuk Celebi <[hidden email]> wrote:
Hey Gna! I think that it's not on the road map at the moment. Feel free to ping in the linked PR though. Probably Till can chime in there.

– Ufuk

Re: DataSet.randomSplit()

Trevor Grant
Hey all,

Sorry I missed this thread. 


I checked it out then forgot about it.  I'm cranking on it now.

tg



Trevor Grant
Data Scientist

"Fortunate is he, who is able to know the causes of things."  -Virgil


On Tue, Mar 29, 2016 at 4:33 AM, Till Rohrmann <[hidden email]> wrote:
Hi,

I think Ufuk is completely right. As far as I know, we don't support this function and nobody's currently working on it. If you like, then you could take the lead there.

Cheers,
Till
