Re: Error "key group must belong to the backend" on restore
Posted by
Stefan Richter on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Error-key-group-must-belong-to-the-backend-on-restore-tp13933p13936.html
Hi,
I have seen the first exception in cases where the key had no proper and stable hash code method, e.g. when the key was an array. What the first exception basically means is that the backend received a key, which it does not expect because determined by the hash the key belongs to a key group for which the backend is not responsible. My guess would be: the hash of the object has changed between the time the checkpoint was taken and now.
Best,
Stefan
Hi all!
I am wondering if anyone has any practical idea why I might get this error when migrating a job from 1.2.1 to 1.3.0? Idea on debugging might help as well.
I have several almost exactly similar jobs (minor config differences) and all of them succeed except for this single job. I have seen similar error when trying to change max parallelism but that's not the case here. I am not changing any parallelism setting.
I know this is a long shot but you might have encountered similar.
Thanks,
Gyula
java.lang.IllegalStateException: Could not initialize keyed state backend.
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:321)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:217)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:676)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:663)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: The key group must belong to the backend
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullRestoreOperation.restoreKVStateData(RocksDBKeyedStateBackend.java:1185)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBKeyedStateBackend.java:1100)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullRestoreOperation.doRestore(RocksDBKeyedStateBackend.java:1081)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.restore(RocksDBKeyedStateBackend.java:968)
at org.apache.flink.streaming.runtime.tasks.StreamTask.createKeyedStateBackend(StreamTask.java:772)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initKeyedState(AbstractStreamOperator.java:311)
... 6 more
or
java.lang.IllegalArgumentException: Key Group 56 does not belong to the local range.
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:493)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:104)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:251)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:676)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:663)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:252)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:745)