PyFlink Table API: Interpret datetime field from Kafka as event time


PyFlink Table API: Interpret datetime field from Kafka as event time

Sumeet Malhotra
Hi,

Might be a simple, stupid question, but I'm not able to find how to convert/interpret a UTC datetime string like 2021-03-23T07:37:00.613910Z as event time using a DDL/Table API definition. I'm ingesting data from Kafka and can read this field as a string, but I would like to mark it as event time by defining a watermark on it.

I'm able to achieve this using the DataStream API, by defining my own TimestampAssigner that converts the datetime string to milliseconds since epoch. How can I do this using a SQL DDL or Table API?
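For reference, the conversion such a TimestampAssigner performs can be sketched in plain Python using only the standard library (the function name here is illustrative, not part of any Flink API):

```python
from datetime import datetime

def to_epoch_millis(ts: str) -> int:
    """Parse a UTC ISO-8601 string like '2021-03-23T07:37:00.613910Z'
    into milliseconds since the Unix epoch."""
    # datetime.fromisoformat (before Python 3.11) rejects a trailing 'Z',
    # so rewrite it as an explicit UTC offset first.
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return int(dt.timestamp() * 1000)

print(to_epoch_millis("2021-03-23T07:37:00.613910Z"))
```

This is the millisecond value a watermark would be generated from; the question is how to get the same effect declaratively.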

I tried to directly interpret the string as TIMESTAMP(3) but it fails with the following exception:

java.time.format.DateTimeParseException: Text '2021-03-23T07:37:00.613910Z' could not be parsed...

Any pointers?

Thanks!
Sumeet


Re: PyFlink Table API: Interpret datetime field from Kafka as event time

Piotr Nowojski-4
Hi,

I hope someone else has a better answer, but one thing that would most likely work is to convert this field and define event time during the DataStream-to-Table conversion [1]. You can always pre-process this field in the DataStream API.

Piotrek




Re: PyFlink Table API: Interpret datetime field from Kafka as event time

Sumeet Malhotra
Thanks. Yes, that's a possibility. I'd still prefer something that can be done within the Table API. If it's not possible, then there's no option but to use the DataStream API to read from Kafka, do the time conversion, and create a table from it.

..Sumeet
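One possibility for staying in DDL is a computed column over the raw string plus a watermark on it. This is only a sketch: the REPLACE-based normalization and the default TO_TIMESTAMP pattern are assumptions, and the fractional-second part may need an explicit format argument depending on the Flink version:

```sql
CREATE TABLE input_topic (
    -- read the raw ISO-8601 field as-is
    ts_str STRING,
    -- computed column: strip the 'T' and 'Z' so TO_TIMESTAMP's default
    -- 'yyyy-MM-dd HH:mm:ss'-style pattern can parse it (assumption; may
    -- need a format argument for the microsecond part)
    ts AS TO_TIMESTAMP(REPLACE(REPLACE(ts_str, 'T', ' '), 'Z', '')),
    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'format' = 'json'
    -- remaining connector options omitted
);
```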



Re: PyFlink Table API: Interpret datetime field from Kafka as event time

Dawid Wysakowicz-2

Hey,

I am not sure which format you use, but if you work with JSON, maybe this option [1] could help you.

Best,

Dawid

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/json.html#json-timestamp-format-standard
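Concretely, that option is set in the connector's WITH clause so the JSON format parses the field directly as a TIMESTAMP with event-time semantics. A sketch only; the table name, topic, and watermark interval are placeholders:

```sql
CREATE TABLE input_topic (
    ts TIMESTAMP(3),
    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'format' = 'json',
    -- parse timestamps as ISO-8601 ("2021-03-23T07:37:00.613Z")
    -- instead of the default SQL style ("2021-03-23 07:37:00.613")
    'json.timestamp-format.standard' = 'ISO-8601'
);
```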




Re: PyFlink Table API: Interpret datetime field from Kafka as event time

Sumeet Malhotra
Thanks Dawid. This looks like what I needed :-)
