Hi, I have a table and I want to rebalance the data so that each partition is equal in size. I can convert it to a DataSet, rebalance, and then convert back to a Table. I couldn't find any rebalance operation in the Table API. Does anyone know a better way to rebalance table data?
|
Hi Darshan, you are right. There's currently no rebalancing operation in the Table API. For now, converting to a DataSet and back to a Table is the only option.
2018-04-13 14:33 GMT+02:00 Darshan Singh <[hidden email]>:
|
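For readers following the workaround discussed above: in the Flink 1.x Java batch API of that time, the round trip is, as far as I know, `toDataSet(table, Row.class)` on the `BatchTableEnvironment`, then `rebalance()` on the resulting DataSet, then `fromDataSet(...)` to get a Table again. The sketch below does not use Flink at all. It is a plain-Java simulation of what a round-robin rebalance does to skewed partitions, and the class and method names (`RebalanceSketch`, `rebalance`) are made up for illustration. It is also simplified: it uses a single round-robin counter, whereas in a real job each parallel sender distributes its own records independently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Flink: simulates round-robin rebalancing.
final class RebalanceSketch {

    /** Redistributes all elements round-robin across `parallelism` new partitions. */
    static <T> List<List<T>> rebalance(List<List<T>> partitions, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<List<T>> out = new ArrayList<>(parallelism);
        for (int i = 0; i < parallelism; i++) {
            out.add(new ArrayList<>());
        }
        int next = 0; // index of the partition that receives the next element
        for (List<T> partition : partitions) {
            for (T element : partition) {
                out.get(next).add(element);
                next = (next + 1) % parallelism;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Skewed input: 7 elements, 1 element, 0 elements.
        List<List<Integer>> skewed = Arrays.asList(
                Arrays.asList(1, 2, 3, 4, 5, 6, 7),
                Collections.singletonList(8),
                Collections.<Integer>emptyList());

        List<List<Integer>> balanced = rebalance(skewed, 3);
        for (List<Integer> p : balanced) {
            System.out.println(p.size() + " elements: " + p);
        }
        // Partition sizes after rebalancing: 3, 3, 2
    }
}
```

The point of the sketch is only that the output partitions differ in size by at most one element, regardless of how skewed the input was. It says nothing about ordering guarantees or the network cost of the shuffle in a real job.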
|
Hi Darshan, thanks for raising the problem. We have a similar use case for rebalancing in Flink SQL, where we want to rebalance the Kafka input into more partitions to increase parallelism in streaming. As Fabian suggests, rebalancing is not a relational algebra operation. The closest equivalent I can find in databases is in Vertica (REBALANCE_TABLE), but there it is used more as a one-time rebalance of the data after adding or removing nodes. On the data processing side, however, I think this deserves more attention, because we can't easily modify the input data source, e.g. the number of Kafka partitions. The closest thing I can think of to enable this feature is DDL in SQL, e.g. something like ALTER TABLE REBALANCE. This DDL statement would cause a rebalance() call when StreamTableSource.getDataStrea
On Mon, Apr 16, 2018 at 5:41 AM, Fabian Hueske <[hidden email]> wrote:
"So you have to trust that the dots will somehow connect in your future."
