NPE from NullAwareMapIterator in flink-table-runtime-blink

Dongwon Kim-2
Hi,

Attached is a stack trace from a member of the Korean Flink User Group. I have no idea what is causing it.
92164295_3052056384855585_3776552648744894464_o.jpg

Can anyone give me a hint about this exception?
Thanks,

Dongwon

Re: NPE from NullAwareMapIterator in flink-table-runtime-blink

Jark Wu-3
Hi Dongwon,

Thanks for reporting this. Let me ask some more questions first.
1) Are you using the Heap statebackend? 
2) How often does this exception occur, and how long after the job starts does it first appear?

Best,
Jark

Re: NPE from NullAwareMapIterator in flink-table-runtime-blink

Dongwon Kim-2
Thanks Jark,

I asked the original questioner, and he answered:
1) Yes, he seems to turn on checkpointing but doesn't configure anything further, which means he is using the Heap statebackend.
2) It normally happens 5~10 mins after the job is started.

If this is a bug specific to the Heap statebackend, would it help him to switch to another statebackend, such as the FS statebackend or the RocksDB statebackend?

Best,
Dongwon

Re: NPE from NullAwareMapIterator in flink-table-runtime-blink

Dongwon Kim-2
Hey Jark,

After he disabled checkpointing, the exception seems to have disappeared.
Hope this information helps.

Best,

Dongwon

Re: NPE from NullAwareMapIterator in flink-table-runtime-blink

Jark Wu-3
Hi Dongwon,

Thanks for providing the information. I think this is a bug: the underlying HeapMapState#iterator may return a null iterator, whereas the RocksDB one does not.
We should guard against this in NullAwareMapIterator#hasNext. I created https://issues.apache.org/jira/browse/FLINK-17015 to fix it.

Best,
Jark
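
For readers following the thread, here is a minimal sketch of the kind of null guard Jark describes. It does not use Flink classes and is not the actual patch for FLINK-17015. The names NullGuardSketch, NullGuardIterator and countEntries are hypothetical, and the code assumes that a null iterator simply means "no entries".

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class NullGuardSketch {

    /** Hypothetical wrapper: treats a null delegate as an empty iterator. */
    static final class NullGuardIterator<K, V> implements Iterator<Map.Entry<K, V>> {
        private final Iterator<Map.Entry<K, V>> delegate;

        NullGuardIterator(Iterator<Map.Entry<K, V>> delegate) {
            // Assumption: the delegate may be null, as with a state
            // implementation that returns null when nothing is stored.
            this.delegate = delegate;
        }

        @Override
        public boolean hasNext() {
            // The null check comes first, so a null delegate never causes an NPE.
            return delegate != null && delegate.hasNext();
        }

        @Override
        public Map.Entry<K, V> next() {
            if (delegate == null) {
                throw new NoSuchElementException();
            }
            return delegate.next();
        }
    }

    /** Counts entries through the guarded iterator; a null input counts as zero. */
    static <K, V> int countEntries(Iterator<Map.Entry<K, V>> raw) {
        Iterator<Map.Entry<K, V>> it = new NullGuardIterator<>(raw);
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new LinkedHashMap<>();
        state.put("a", 1);
        state.put("b", 2);
        System.out.println(countEntries(state.entrySet().iterator()));          // prints 2
        System.out.println(NullGuardSketch.<String, Integer>countEntries(null)); // prints 0
    }
}
```

Without the `delegate != null` check in hasNext, the second call would throw a NullPointerException, which matches the symptom reported here.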
