Hi, question about orderBy two columns more

Philip Lee
Hi, 

I know that an ORDER BY on a single column can be expressed as sortPartition(col).setParallelism(1).

What about ORDER BY on two or more columns?
If the SQL states ORDER BY col_1, col_2, then sortPartition().sortPartition() does not seem to solve this.

ORDER BY in SQL sorts by the first column and then sorts by the second column within equal values of the first column, but in Flink the function seems to sort on each column completely.

Any suggestion?

Thanks,
phil

Re: Hi, question about orderBy two columns more

Stephan Ewen
Actually, sortPartition(col1).sortPartition(col2) results in a single sort that sorts primarily by col1 and secondarily by col2, so it is the same as stating "ORDER BY col1, col2" in SQL.

The SortPartitionOperator created with the first "sortPartition(col1)" call appends further columns, rather than instantiating a new sort.
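
As a rough illustration of the "append a key instead of starting a new sort" idea, here is a minimal sketch using plain Scala collections. It does not use or reproduce Flink's API: the names ChainedSortSketch, PendingSort, sortBy and thenBy are made up for this example, and the input rows are invented sample data.

```scala
object ChainedSortSketch {
  type Row = (Int, Int)

  // Hypothetical stand-in for a sort operator that collects its keys.
  final case class PendingSort(rows: Seq[Row], keys: Vector[Ordering[Row]]) {
    // A further call appends a key to the same sort instead of creating a new one.
    def thenBy(key: Ordering[Row]): PendingSort = copy(keys = keys :+ key)

    // One sort pass: earlier keys decide first, later keys only break ties.
    def result: Seq[Row] = {
      val combined = new Ordering[Row] {
        def compare(a: Row, b: Row): Int =
          keys.iterator.map(_.compare(a, b)).find(_ != 0).getOrElse(0)
      }
      rows.sorted(combined)
    }
  }

  // Starts a sort with a single key.
  def sortBy(rows: Seq[Row], key: Ordering[Row]): PendingSort =
    PendingSort(rows, Vector(key))

  val col1Asc: Ordering[Row] = Ordering.by[Row, Int](_._1)
  val col2Asc: Ordering[Row] = Ordering.by[Row, Int](_._2)

  def main(args: Array[String]): Unit = {
    val rows = Seq((2, 5), (1, 9), (2, 3), (1, 4)) // invented sample data
    // In spirit: ORDER BY col1, col2
    // Rows come out in the order (1,4), (1,9), (2,3), (2,5).
    println(sortBy(rows, col1Asc).thenBy(col2Asc).result)
  }
}
```

Because the second key is only consulted when the first key compares equal, the rows stay grouped by the first column, which is the ORDER BY behaviour described above.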

Greetings,
Stephan


On Sun, Nov 1, 2015 at 11:29 AM, Philip Lee <[hidden email]> wrote:
If the SQL states ORDER BY col_1, col_2, sortPartition().sortPartition() does not solve this.

Re: Hi, question about orderBy two columns more

Philip Lee
Thanks for your reply, Stephan.

You said this behaves the same as SQL, but I got the following results from this code. This is not what we expect, right?

With only the first sortPartition call:

import org.apache.flink.api.scala._
import org.apache.flink.api.common.operators.Order

val env = ExecutionEnvironment.getExecutionEnvironment
val inputTuple = Seq((2,5),(2,3),(2,4),(3,2),(3,6))

val outputTuple = env.fromCollection(inputTuple)
  .sortPartition(0, Order.DESCENDING)
  //.sortPartition(1, Order.ASCENDING)
  .print()

Output:
(3,2)
(3,6)
(2,5)
(2,3)
(2,4)

With both sortPartition calls:

val inputTuple = Seq((2,5),(2,3),(2,4),(3,2),(3,6))

val outputTuple = env.fromCollection(inputTuple)
  .sortPartition(0, Order.DESCENDING)
  .sortPartition(1, Order.ASCENDING)
  .print()

Actual Output:
(3,2)
(2,3)
(2,4)
(2,5)
(3,6)

Expected Output:
(3,2)
(3,6)
(2,3)
(2,4)
(2,5)
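
For reference, both orderings can be reproduced with plain Scala collections (not the Flink API). This is only a sketch: the object name OrderByCheck is made up, and the observation that the actual output equals a sort on the second field alone is a reading of the data, not a confirmed diagnosis of the bug.

```scala
object OrderByCheck {
  val input: Seq[(Int, Int)] = Seq((2, 5), (2, 3), (2, 4), (3, 2), (3, 6))

  // ORDER BY field 0 DESC, field 1 ASC: negate the first field so that the
  // default ascending tuple ordering sorts it in descending order
  // (fine for these small values; negation would overflow for Int.MinValue).
  val expected: Seq[(Int, Int)] = input.sortBy { case (a, b) => (-a, b) }

  // Sorting on field 1 alone, ignoring field 0 entirely.
  val secondFieldOnly: Seq[(Int, Int)] = input.sortBy(_._2)

  def main(args: Array[String]): Unit = {
    expected.foreach(println)        // (3,2) (3,6) (2,3) (2,4) (2,5), one per line
    println("--")
    secondFieldOnly.foreach(println) // (3,2) (2,3) (2,4) (2,5) (3,6), one per line
  }
}
```

The first sequence matches the expected output, while the second matches the actual output, as if only the last sortPartition key had been applied.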


Thanks,
Phil


On Mon, Nov 2, 2015 at 5:54 AM, Stephan Ewen <[hidden email]> wrote:
Actually, sortPartition(col1).sortPartition(col2) results in a single sort that sorts primarily by col1 and secondarily by col2, so it is the same as stating "ORDER BY col1, col2" in SQL.

Re: Hi, question about orderBy two columns more

Fabian Hueske-2
Hi Philip,

Thanks for reporting the issue. I just verified the problem.
It works correctly in the Java API but is broken in the Scala API.

I will work on a fix and include it in the next RC for 0.10.0.

Thanks, Fabian

2015-11-02 12:58 GMT+01:00 Philip Lee <[hidden email]>:
You said this behaves the same as SQL, but the result I got from my code is not what we expect, right?

Re: Hi, question about orderBy two columns more

Philip Lee
You are welcome.

I am wondering whether there is a way to be notified when the RC that fixes the sortPartition problem is published, and how we could then apply the new version. Would it be enough to just download the newly released Flink version?

Thanks, Phil





On Mon, Nov 2, 2015 at 2:09 PM, Fabian Hueske <[hidden email]> wrote:
I will work on a fix and include it in the next RC for 0.10.0.

Re: Hi, question about orderBy two columns more

Maximilian Michels
Hi Philip,

The issue has been fixed in rc5 which you can get here:
https://people.apache.org/~mxm/flink-0.10.0-rc5/

Note that these files will be removed once 0.10.0 is out.

Kind regards,
Max

On Mon, Nov 2, 2015 at 6:38 PM, Philip Lee <[hidden email]> wrote:

> I am wondering whether there is a way to be notified when the RC that fixes the
> sortPartition problem is published, and how we could then apply the new version.
> Would it be enough to just download the newly released Flink version?