Testing DataStreams


Juan Rodríguez Hortalá
Hi,

I'm new to Flink, and I'm trying to write my first unit test for a simple DataStream job. In https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/util/package-summary.html I see several promising classes but, for example, I cannot import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase from the artifacts provided by the following Maven dependencies:

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.10</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.10</artifactId>
            <version>${flink.version}</version>
        </dependency>

I also see that the page https://cwiki.apache.org/confluence/display/FLINK/Testing+Utilities+and+Mini+Clusters is empty. Is there any documentation or tutorial about writing simple unit tests running in local mode? I'm looking for something similar to http://blog.cloudera.com/blog/2015/09/making-apache-spark-testing-easy-with-spark-testing-base/, where you can specify the expected output as a collection to define an assertion, but for Flink.

By the way, I have also implemented a source function similar to StreamExecutionEnvironment.fromElements that additionally allows adding time gaps between the generated elements, which I think could be useful for testing. In case someone is interested: https://github.com/juanrh/flink-state-eviction/blob/master/src/main/java/com/github/juanrh/streaming/source/ElementsWithGapsSource.java

Thanks,

Juan

Re: Testing DataStreams

Maximilian Michels
Hi Juan,

StreamingMultipleProgramsTestBase is in the testing scope. Thus, it is
not bundled in the normal jars. You would have to add the
flink-test-utils_2.10 module.
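
As a rough sketch, the extra dependency could look like the following in
the pom.xml, reusing the ${flink.version} property from the question. The
test scope is an assumption here, not something confirmed in this thread;
adjust it if the test base classes are needed outside of test code.

```xml
<!-- Sketch only: the "test" scope is an assumption, chosen because the
     classes are test utilities; the artifact name comes from the reply above -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-test-utils_2.10</artifactId>
    <version>${flink.version}</version>
    <scope>test</scope>
</dependency>
```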

It is true that there is no guide. There is, however,
https://github.com/ottogroup/flink-spector for testing streaming
pipelines.

For unit tests and integration tests, please have a look at the Flink
source code, which contains many such tests.

-Max



Re: Testing DataStreams

Juan Rodríguez Hortalá
Hi Max,

Thanks for your help. Flink-spector looks like just what I need.

Greetings,

Juan
