TimeZone shift problem in Flink SQL

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

TimeZone shift problem in Flink SQL

徐涛
Hi Experts,
I have the following two UDFs,
        unix_timestamp:   transform from string to Timestamp, with the arguments (value:String, format:String), return Timestamp
       from_unixtime:    transform from Timestamp to String, with the arguments (ts:Long, format:String), return String


select 
         number,
         ts,
         from_unixtime(unix_timestamp(LAST_UPDATE_TIME, 'EEE MMM dd HH:mm:Ss z yyyy'),'yyyy-MM-dd')  as dt
      from 
         test;

     when the LAST_UPDATE_TIME value is "Tue Jan 22 21:03:12 CST 2019”, the unix_timestamp return a Timestamp with value 1548162182001.
      but when from_unixtime is invoked, the timestamp with value 1548190982001 is passed in, there are 8 hours shift between them.
      May I know why there are 8 hours shift between them, and how can I get the timestamp that are passed out originally from the first UDF without changing the code?
      Thanks very much.

Best
Henry
Reply | Threaded
Open this post in threaded view
|

Re: TimeZone shift problem in Flink SQL

phoenixjiangnan
Hi,

Did you consider timezone in conversion in your UDF?


On Tue, Jan 22, 2019 at 5:29 AM 徐涛 <[hidden email]> wrote:
Hi Experts,
I have the following two UDFs,
        unix_timestamp:   transform from string to Timestamp, with the arguments (value:String, format:String), return Timestamp
       from_unixtime:    transform from Timestamp to String, with the arguments (ts:Long, format:String), return String


select 
         number,
         ts,
         from_unixtime(unix_timestamp(LAST_UPDATE_TIME, 'EEE MMM dd HH:mm:Ss z yyyy'),'yyyy-MM-dd')  as dt
      from 
         test;

     when the LAST_UPDATE_TIME value is "Tue Jan 22 21:03:12 CST 2019”, the unix_timestamp return a Timestamp with value 1548162182001.
      but when from_unixtime is invoked, the timestamp with value 1548190982001 is passed in, there are 8 hours shift between them.
      May I know why there are 8 hours shift between them, and how can I get the timestamp that are passed out originally from the first UDF without changing the code?
      Thanks very much.

Best
Henry
Reply | Threaded
Open this post in threaded view
|

Re: TimeZone shift problem in Flink SQL

Rong Rong
Hi Henry,

Unix epoch time values are always under GMT timezone, for example:
- 1548162182001 <=> GMT: Tuesday, January 22, 2019 1:03:02.001 PM, or CST: Tuesday, January 22, 2019 9:03:02.001 PM.
- 1548190982001 <=> GMT: Tuesday, January 22, 2019 9:03:02.001 PM, or CST: Wednesday, January 23, 2019 4:03:02.001 AM.

several things are needed here 
1. your "unix_timestamp" UDF should return actual Unix epoch time [1].
2. as Bowen mentioned, you will have to pass in the desired timezone as argument to your "from_unixtime" UDF.

--
Rong


On Thu, Jan 24, 2019 at 4:43 PM Bowen Li <[hidden email]> wrote:
Hi,

Did you consider timezone in conversion in your UDF?


On Tue, Jan 22, 2019 at 5:29 AM 徐涛 <[hidden email]> wrote:
Hi Experts,
I have the following two UDFs,
        unix_timestamp:   transform from string to Timestamp, with the arguments (value:String, format:String), return Timestamp
       from_unixtime:    transform from Timestamp to String, with the arguments (ts:Long, format:String), return String


select 
         number,
         ts,
         from_unixtime(unix_timestamp(LAST_UPDATE_TIME, 'EEE MMM dd HH:mm:Ss z yyyy'),'yyyy-MM-dd')  as dt
      from 
         test;

     when the LAST_UPDATE_TIME value is "Tue Jan 22 21:03:12 CST 2019”, the unix_timestamp return a Timestamp with value 1548162182001.
      but when from_unixtime is invoked, the timestamp with value 1548190982001 is passed in, there are 8 hours shift between them.
      May I know why there are 8 hours shift between them, and how can I get the timestamp that are passed out originally from the first UDF without changing the code?
      Thanks very much.

Best
Henry
Reply | Threaded
Open this post in threaded view
|

Re: TimeZone shift problem in Flink SQL

徐涛
Hi Rongrong,
The event is really happens in Tuesday, January 22, 2019 9:03:02.001 PM, so I think the first function returns 1548162182001 is correct. It is the Unix epoch time when the event happens.
But why the timestamp passed into the from_unixtime is changed to 1548190982001? If it is not changed, I can still format 1548162182001 then pass a time zone to get the actual date. 
Timestamp is a time-zone independent value, it should not be changed, I think.

Best
Henry

在 2019年1月26日,下午1:21,Rong Rong <[hidden email]> 写道:

Hi Henry,

Unix epoch time values are always under GMT timezone, for example:
- 1548162182001 <=> GMT: Tuesday, January 22, 2019 1:03:02.001 PM, or CST: Tuesday, January 22, 2019 9:03:02.001 PM.
- 1548190982001 <=> GMT: Tuesday, January 22, 2019 9:03:02.001 PM, or CST: Wednesday, January 23, 2019 4:03:02.001 AM.

several things are needed here 
1. your "unix_timestamp" UDF should return actual Unix epoch time [1].
2. as Bowen mentioned, you will have to pass in the desired timezone as argument to your "from_unixtime" UDF.

--
Rong


On Thu, Jan 24, 2019 at 4:43 PM Bowen Li <[hidden email]> wrote:
Hi,

Did you consider timezone in conversion in your UDF?


On Tue, Jan 22, 2019 at 5:29 AM 徐涛 <[hidden email]> wrote:
Hi Experts,
I have the following two UDFs,
        unix_timestamp:   transform from string to Timestamp, with the arguments (value:String, format:String), return Timestamp
       from_unixtime:    transform from Timestamp to String, with the arguments (ts:Long, format:String), return String


select 
         number,
         ts,
         from_unixtime(unix_timestamp(LAST_UPDATE_TIME, 'EEE MMM dd HH:mm:Ss z yyyy'),'yyyy-MM-dd')  as dt
      from 
         test;

     when the LAST_UPDATE_TIME value is "Tue Jan 22 21:03:12 CST 2019”, the unix_timestamp return a Timestamp with value 1548162182001.
      but when from_unixtime is invoked, the timestamp with value 1548190982001 is passed in, there are 8 hours shift between them.
      May I know why there are 8 hours shift between them, and how can I get the timestamp that are passed out originally from the first UDF without changing the code?
      Thanks very much.

Best
Henry