Weird serialization bug?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Weird serialization bug?

Sebastian Neef
Hello Apache Flink users,

I implemented a FilterFunction some months ago that worked quite well
back then. However, I wanted to check it out right now and it somehow
broke in the sense that Flink can't serialize it anymore. I might be
mistaken, but afaik I didn't touch the code at all.

I think that I've tracked down the problem to the following minimal
working PoC:

- A simple interface:

>     interface testFunc extends Serializable {
>         boolean filter();
>     }

- A TestFilterFunction which is applied on a DataSet:

>  public void doSomeFiltering() {
>         class fooo implements testFunc {
>             public boolean filter() {
>                 return false;
>             }
>         }
>
>         class TestFilterFunction implements FilterFunction<IPage> {
>
>             testFunc filter;
>
>             class fooo2 implements testFunc {
>                 public boolean filter() {
>                     return false;
>                 }
>             }
>
>             TestFilterFunction() {
>                 // WORKS with fooo2()
>                 // DOES NOT WORK with fooo()
>                 this.filter = new fooo2();
>             }
>             @Override
>             public boolean filter(IPage iPage) throws Exception {
>                 return filter.filter();
>             }
>         }
> filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }

Flink will work fine when the "fooo2" class is used. However, when using
the "fooo()" class, I get the following error:

> ------------------------------------------------------------
>  The program finished with the following exception:
>
> The implementation of the FilterFunction is not serializable. The object probably contains or references non serializable fields.
> org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
> org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
> org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
> testpackage.testclass.applyFilters(testclass.java:105)

I'm a little bit confused, why Flink manages to serialize the "fooo2"
class, but not the "fooo" class. Is this is a bug or do I miss something
here?

Cheers,
Sebastian

Reply | Threaded
Open this post in threaded view
|

Re: Weird serialization bug?

Ted Yu
Have you tried making fooo static ?

Cheers

On Sat, Apr 29, 2017 at 4:26 AM, Sebastian Neef <[hidden email]> wrote:
Hello Apache Flink users,

I implemented a FilterFunction some months ago that worked quite well
back then. However, I wanted to check it out right now and it somehow
broke in the sense that Flink can't serialize it anymore. I might be
mistaken, but afaik I didn't touch the code at all.

I think that I've tracked down the problem to the following minimal
working PoC:

- A simple interface:

>     interface testFunc extends Serializable {
>         boolean filter();
>     }

- A TestFilterFunction which is applied on a DataSet:

>  public void doSomeFiltering() {
>         class fooo implements testFunc {
>             public boolean filter() {
>                 return false;
>             }
>         }
>
>         class TestFilterFunction implements FilterFunction<IPage> {
>
>             testFunc filter;
>
>             class fooo2 implements testFunc {
>                 public boolean filter() {
>                     return false;
>                 }
>             }
>
>             TestFilterFunction() {
>                 // WORKS with fooo2()
>                 // DOES NOT WORK with fooo()
>                 this.filter = new fooo2();
>             }
>             @Override
>             public boolean filter(IPage iPage) throws Exception {
>                 return filter.filter();
>             }
>         }
>       filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }

Flink will work fine when the "fooo2" class is used. However, when using
the "fooo()" class, I get the following error:

> ------------------------------------------------------------
>  The program finished with the following exception:
>
> The implementation of the FilterFunction is not serializable. The object probably contains or references non serializable fields.
>       org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
>       org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
>       org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
>       testpackage.testclass.applyFilters(testclass.java:105)

I'm a little bit confused, why Flink manages to serialize the "fooo2"
class, but not the "fooo" class. Is this is a bug or do I miss something
here?

Cheers,
Sebastian


Reply | Threaded
Open this post in threaded view
|

Re: Weird serialization bug?

Aljoscha Krettek
To elaborate on what Ted said: fooo is defined inside a method and probably has references to outer (non serialisable) classes.

On 30. Apr 2017, at 01:15, Ted Yu <[hidden email]> wrote:

Have you tried making fooo static ?

Cheers

On Sat, Apr 29, 2017 at 4:26 AM, Sebastian Neef <[hidden email]> wrote:
Hello Apache Flink users,

I implemented a FilterFunction some months ago that worked quite well
back then. However, I wanted to check it out right now and it somehow
broke in the sense that Flink can't serialize it anymore. I might be
mistaken, but afaik I didn't touch the code at all.

I think that I've tracked down the problem to the following minimal
working PoC:

- A simple interface:

>     interface testFunc extends Serializable {
>         boolean filter();
>     }

- A TestFilterFunction which is applied on a DataSet:

>  public void doSomeFiltering() {
>         class fooo implements testFunc {
>             public boolean filter() {
>                 return false;
>             }
>         }
>
>         class TestFilterFunction implements FilterFunction<IPage> {
>
>             testFunc filter;
>
>             class fooo2 implements testFunc {
>                 public boolean filter() {
>                     return false;
>                 }
>             }
>
>             TestFilterFunction() {
>                 // WORKS with fooo2()
>                 // DOES NOT WORK with fooo()
>                 this.filter = new fooo2();
>             }
>             @Override
>             public boolean filter(IPage iPage) throws Exception {
>                 return filter.filter();
>             }
>         }
>       filteredDataSet = DataSet.filter(new TestFilterFunction(null))> }

Flink will work fine when the "fooo2" class is used. However, when using
the "fooo()" class, I get the following error:

> ------------------------------------------------------------
>  The program finished with the following exception:
>
> The implementation of the FilterFunction is not serializable. The object probably contains or references non serializable fields.
>       org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:100)
>       org.apache.flink.api.java.DataSet.clean(DataSet.java:186)
>       org.apache.flink.api.java.DataSet.filter(DataSet.java:287)
>       testpackage.testclass.applyFilters(testclass.java:105)

I'm a little bit confused, why Flink manages to serialize the "fooo2"
class, but not the "fooo" class. Is this is a bug or do I miss something
here?

Cheers,
Sebastian



Reply | Threaded
Open this post in threaded view
|

Re: Weird serialization bug?

Sebastian Neef
Hi,

thanks for the help!

Making the class fooo static did the trick.

I was just a bit confused, because I'm using a similar contruction
somewhere else in the code and it works flawlessy.

Best regards,
Sebastian