Flink Table SQL and writing nested Avro files

classic Classic list List threaded Threaded
3 messages Options
Dan
Reply | Threaded
Open this post in threaded view
|

Flink Table SQL and writing nested Avro files

Dan
Hi!

I want to join two tables and write the results to Avro where the left and right rows are nested in the avro output.  Is it possible to do this with the SQL interface?  

Thanks!
- Dan
 CREATE TABLE `flat_avro` (
`left` ROW<id BIGINT, name STRING>,
`right` ROW<id BIGINT, name STRING>
) WITH (
'connector' = 'filesystem',
'path' = 's3p://blah/blah',
'format' = 'avro'
);

INSERT INTO `flat_avro`
SELECT left.*, right.*
FROM `left`
LEFT JOIN `right`
ON `left`.`id` = `right`.`id`
);
Reply | Threaded
Open this post in threaded view
|

Re: Flink Table SQL and writing nested Avro files

Dawid Wysakowicz-2

Hi Dan,

I think the best what I can suggest is this:

SELECT

    ROW(left.field0, left.field1, left.field2, ...),

    ROW(right.field0, right.field1, right.field2, ...)

FROM ...

You will need to list all the fields manually, as SQL does not allow for asterisks in regular function calls.

If you are willing to give the Table API a try you might workaround some of the manual work with the Column Function[1]

        Table join = t1.join(t2).where($("id1").isEqual($("id2")));
        join
            .select(
                row(withColumns(range(1, t1.getSchema().getFieldCount()))),
                row(withColumns(range(
                    t1.getSchema().getFieldCount() + 1,
                    t1.getSchema().getFieldCount() + t2.getSchema().getFieldCount())))
            )
            .executeInsert("flat_avro")
            .await();


Best,

Dawid

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#column-functions

On 18/09/2020 09:47, Dan Hill wrote:
Hi!

I want to join two tables and write the results to Avro where the left and right rows are nested in the avro output.  Is it possible to do this with the SQL interface?  

Thanks!
- Dan
 CREATE TABLE `flat_avro` (
   `left` ROW<id BIGINT, name STRING>,
   `right` ROW<id BIGINT, name STRING>
) WITH (
   'connector' = 'filesystem',
   'path' = 's3p://blah/blah',
   'format' = 'avro'
);

INSERT INTO `flat_avro` 
SELECT left.*, right.* 
FROM `left` 
LEFT JOIN `right`
ON `left`.`id` = `right`.`id`
);

signature.asc (849 bytes) Download Attachment
Dan
Reply | Threaded
Open this post in threaded view
|

Re: Flink Table SQL and writing nested Avro files

Dan
Nice!  I'll try that.  Thanks, Dawid!

On Mon, Sep 21, 2020 at 2:37 AM Dawid Wysakowicz <[hidden email]> wrote:

Hi Dan,

I think the best what I can suggest is this:

SELECT

    ROW(left.field0, left.field1, left.field2, ...),

    ROW(right.field0, right.field1, right.field2, ...)

FROM ...

You will need to list all the fields manually, as SQL does not allow for asterisks in regular function calls.

If you are willing to give the Table API a try you might workaround some of the manual work with the Column Function[1]

        Table join = t1.join(t2).where($("id1").isEqual($("id2")));
        join
            .select(
                row(withColumns(range(1, t1.getSchema().getFieldCount()))),
                row(withColumns(range(
                    t1.getSchema().getFieldCount() + 1,
                    t1.getSchema().getFieldCount() + t2.getSchema().getFieldCount())))
            )
            .executeInsert("flat_avro")
            .await();


Best,

Dawid

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/systemFunctions.html#column-functions

On 18/09/2020 09:47, Dan Hill wrote:
Hi!

I want to join two tables and write the results to Avro where the left and right rows are nested in the avro output.  Is it possible to do this with the SQL interface?  

Thanks!
- Dan
 CREATE TABLE `flat_avro` (
   `left` ROW<id BIGINT, name STRING>,
   `right` ROW<id BIGINT, name STRING>
) WITH (
   'connector' = 'filesystem',
   'path' = 's3p://blah/blah',
   'format' = 'avro'
);

INSERT INTO `flat_avro` 
SELECT left.*, right.* 
FROM `left` 
LEFT JOIN `right`
ON `left`.`id` = `right`.`id`
);