How to debug flink serialization error?

tuk
Hi,

I have a ProcessFunction (shown below) that throws the error below whenever I try to use it in an operator. My understanding is that when Flink initializes the operator DAG, it serializes the functions and sends them over to the TaskManagers.
So I have marked the operator state transient, since it will be populated within the open() call that gets invoked on each TaskManager. But I am still getting the serialization exception below. Can you suggest some ways to debug this type of serialization error in Flink 1.12?

org.apache.flink.api.common.InvalidProgramException: public com.vnera.programs.metrics.MetricStoreProgramHelper com.vnera.analytics.engine.MetricStoreMapper.getMetricStoreProgramHelper(com.vnera.resourcemanager.ResourceManager,com.vnera.storage.metrics.TsdbMetricStore$Writer,java.lang.String) is not serializable. The object probably contains or references non serializable fields.

at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:164)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:132)
...
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:69)
at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:2000)
at org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:203)
at org.apache.flink.streaming.api.datastream.DataStream.process(DataStream.java:681)
at org.apache.flink.streaming.api.datastream.DataStream.process(DataStream.java:661)
at com.vnera.analytics.engine.MetricStoreOperator.linkFrom(MetricStoreOperator.java:27)
at com.vnera.analytics.engine.AnaPipelineStage.link(AnaPipelineStage.java:12)
at com.vnera.analytics.engine.AnalyticsEngine.createPipeline(AnalyticsEngine.java:106)
at com.vnera.analytics.engine.source.DerivedMetricCreatorTest.testPipeline(DerivedMetricCreatorTest.java:98)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
...
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
...
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
...
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
...
at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
Caused by: java.io.NotSerializableException: java.lang.reflect.Method
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:624)
at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:143)
... 45 more

My ProcessFunction looks like this:
public class MetricStoreMapper extends ProcessFunction<SelfDescribingMessageDO, GenericMetricV2> {
    private static final String MSG_STALENESS_METRIC_NAME = VneraMetrics.createMetricName(MetricStoreMapper.class,
            "metric_sdm_staleness");
    private transient Histogram stalenessHisto = VneraMetrics.histogram(MSG_STALENESS_METRIC_NAME, VneraMetrics.DD_REPORTER);
    private transient MetricStoreProgramHelper metricStoreHelper;

    @Override
    public void open(Configuration parameters) throws Exception {
        ExecutionConfig.GlobalJobParameters jobParams = getRuntimeContext().getExecutionConfig().getGlobalJobParameters();
        Configuration conf = ParameterTool.fromMap(jobParams.toMap()).getConfiguration();
        String taskInstanceId = getRuntimeContext().getTaskNameWithSubtasks();
        MetricStoreFactory.StoreType metStoreType = conf.getEnum(MetricStoreFactory.StoreType.class, StoreOptions.METRIC_STORE_TYPE);
        TaskManagerState state = TaskManagerState.getTaskManagerState(ConfigStoreFactory.StoreType.MEMORY, MetricStoreFactory.StoreType.MEMORY);
        ResourceManager rm = state.getResourceManager();
        metricStoreHelper = getMetricStoreProgramHelper(rm, state.getTsdbMetricStore().writer(taskInstanceId),
                taskInstanceId);
    }

    @Override
    public void processElement(SelfDescribingMessageDO value, Context ctx, Collector<GenericMetricV2> out) throws Exception {
        MetricStoreProgramHelper.MetricStoreOutput output = metricStoreHelper.execute(value);
        for (SelfDescribingMessageDO sdm : output.outputSdms) {
            ctx.output(SideOutputs.metricStoreEvents, sdm);
        }
    }

    @VisibleForTesting
    public MetricStoreProgramHelper getMetricStoreProgramHelper(final ResourceManager rm,
                                                                final TsdbMetricStore.Writer tsdbWriter,
                                                                final String taskInstanceId) {
        return new MetricStoreProgramHelper(rm.getConfigStore(),
                rm.getDataModel(),
                tsdbWriter,
                taskInstanceId,
                rm.getPolicyManger(),
                stalenessHisto);
    }

    private static ModelKey modelKeyFromConfigKey(ConfigKey ck) {
        return ModelKey.create(ck.customerId, ck.objectType, ck.objectId);
    }
}


My operator looks like this:
public class MetricStoreOperator implements AnaPipelineStage<GenericMetricV2> {
    private transient SingleOutputStreamOperator<GenericMetricV2> metricStream;
    private final transient MetricStoreMapper metricStoreMapper;

    public MetricStoreOperator(final Configuration jobParams, final MetricStoreMapper metricStoreMapper) {
        this.metricStoreMapper = metricStoreMapper;
    }

    @Override
    public AnaPipelineStage<GenericMetricV2> linkFrom(AnaPipelineStage<?>... operators) {
        AnaPipelineStage<SelfDescribingMessageDO> source = (AnaPipelineStage<SelfDescribingMessageDO>) Stream.of(operators).findFirst().get();
        DataStream<SelfDescribingMessageDO> sdmStream = source.getOutputStream();
        metricStream = sdmStream.process(metricStoreMapper);
        return this;
    }

    @Override
    public DataStream<GenericMetricV2> getOutputStream() {
        return metricStream;
    }

    @Override
    public DataStream getSideOutput(OutputTag outTag) {
        return metricStream.getSideOutput(outTag);
    }
}
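
For readers following the transient-field reasoning in this post, here is a minimal sketch using plain Java serialization (the stack trace shows Flink going through ObjectOutputStream via InstantiationUtil.serializeObject). DemoFunction and its fields are hypothetical stand-ins, not the classes from the post. It illustrates one general Java behavior that may be worth checking against stalenessHisto: a transient field with an inline initializer comes back as null after deserialization, because initializers are not re-run, whereas a field assigned in an open()-style method is populated on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {

    // Hypothetical stand-in for a Flink function: Serializable, with transient state.
    static class DemoFunction implements Serializable {
        private static final long serialVersionUID = 1L;

        // Assigned when the object is constructed on the client side.
        transient StringBuilder initializedInline = new StringBuilder("client-side");

        // Left null until open() is called after deserialization.
        transient StringBuilder initializedInOpen;

        // Plays the role of open(): runs on the receiving side.
        void open() {
            initializedInOpen = new StringBuilder("worker-side");
        }
    }

    // Serializes and then deserializes an object with plain Java serialization.
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        DemoFunction copy = roundTrip(new DemoFunction());
        System.out.println(copy.initializedInline); // null: the initializer is not re-run
        copy.open();
        System.out.println(copy.initializedInOpen); // worker-side
    }
}
```

This does not explain the NotSerializableException itself; it only shows why state that must exist on the TaskManager is usually created inside open().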

Re: How to debug flink serialization error?

rmetzger0
Thanks for reaching out to the Flink ML.

It reports getMetricStoreProgramHelper as a non-serializable field, even though it looks a lot like a method. The only recommendation I have is to read the full error message and stack trace carefully.

Your approach of tagging fields as "transient" is absolutely correct.
There's also this message: NotSerializableException: java.lang.reflect.Method, but I cannot find a field of type Method.

Can you provide a minimal reproducible example of this issue?
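
One way to act on the "read the full error message" advice is to locate the offending field outside Flink. The sketch below is an illustration only, not part of Flink: SerializationProbe, Holder, and Outer are hypothetical names. It tries plain Java serialization on an object and, on failure, walks its non-static, non-transient fields to report the deepest one that cannot be serialized. On HotSpot/OpenJDK, running the failing test with the JVM option -Dsun.io.serialization.extendedDebugInfo=true should give similar information, since it adds the field path to the NotSerializableException message.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class SerializationProbe {

    // Returns true if plain Java serialization of obj succeeds.
    static boolean isSerializable(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Collects "path : type" entries for the deepest values that fail to serialize.
    static List<String> findCulprits(Object root) {
        List<String> culprits = new ArrayList<>();
        walk(root, root.getClass().getSimpleName(), culprits, 0);
        return culprits;
    }

    private static void walk(Object obj, String path, List<String> culprits, int depth) {
        if (obj == null || isSerializable(obj)) {
            return; // nothing to blame on this branch
        }
        int before = culprits.size();
        Class<?> cls = obj.getClass();
        // Descend only into Serializable non-JDK classes; the depth cap guards against cycles.
        boolean descend = depth < 10 && obj instanceof Serializable && !cls.getName().startsWith("java.");
        for (Class<?> c = cls; descend && c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int mods = f.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                    continue; // Java serialization skips these fields
                }
                try {
                    f.setAccessible(true);
                    walk(f.get(obj), path + "." + f.getName(), culprits, depth + 1);
                } catch (ReflectiveOperationException | RuntimeException e) {
                    culprits.add(path + "." + f.getName() + " : <inaccessible>");
                }
            }
        }
        if (culprits.size() == before) {
            // No field was blamed, so this value itself is the problem.
            culprits.add(path + " : " + cls.getName());
        }
    }

    // Hypothetical classes reproducing "NotSerializableException: java.lang.reflect.Method".
    static class Holder implements Serializable {
        final String name = "ok";
        final Method handle; // java.lang.reflect.Method is not Serializable
        Holder(Method handle) { this.handle = handle; }
    }

    static class Outer implements Serializable {
        final Holder holder;
        Outer(Holder holder) { this.holder = holder; }
    }

    public static void main(String[] args) throws Exception {
        Outer outer = new Outer(new Holder(Object.class.getMethod("toString")));
        System.out.println(findCulprits(outer));
        // prints [Outer.holder.handle : java.lang.reflect.Method]
    }
}
```

Calling findCulprits on the function instance right before it is passed to DataStream.process would show whether the Method reference sits in the function itself or in something wrapped around it by the test setup.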

On Fri, Feb 12, 2021 at 7:06 AM Debraj Manna <[hidden email]> wrote:

Re: How to debug flink serialization error?

tuk
Thanks, Robert, for the pointers.

It is some issue with Mockito, which I was using to mock the getMetricStoreProgramHelper method in my unit test. For now I have modified my unit test to not use Mockito.

I will try to provide a reproducible example.
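
The thread does not show how the test was rewritten. As one possible approach, a hand-written test double can override the factory method instead of mocking it. In the sketch below, Mapper, StubMapper, and createHelper are hypothetical stand-ins for MetricStoreMapper and getMetricStoreProgramHelper. Because StubMapper is a named static nested class with no extra non-transient state, it serializes exactly like its parent and does not capture the enclosing test instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class HandWrittenStubDemo {

    // Hypothetical stand-in for the function under test, with an overridable factory method.
    static class Mapper implements Serializable {
        private static final long serialVersionUID = 1L;

        transient String helper;

        // Plays the role of open(): builds the helper after deserialization.
        void open() {
            helper = createHelper();
        }

        String createHelper() {
            return "real helper";
        }
    }

    // Hand-written test double: overrides the factory method and adds no fields.
    static class StubMapper extends Mapper {
        private static final long serialVersionUID = 1L;

        @Override
        String createHelper() {
            return "stub helper";
        }
    }

    // Serializes and then deserializes an object with plain Java serialization.
    static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Mapper copy = roundTrip(new StubMapper());
        copy.open();
        System.out.println(copy.helper); // stub helper
    }
}
```

If Mockito has to stay, creating the mock with withSettings().serializable() may be worth trying; whether that is sufficient for a stubbed ProcessFunction is not confirmed in this thread.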

On Fri, Feb 12, 2021 at 8:56 PM Robert Metzger <[hidden email]> wrote: