Base Scala Code for Flink 0.7

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Base Scala Code for Flink 0.7

Maximilian Alber
Hi Flinkers,

I tried to migrate to 0.7 for new features aka Broadcast Variables in Scala. Unfortunately the code structure seemed to have changed:

[INFO] Compiling 2 source files to /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes at 1411719085848
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7: error: object TextFile is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.TextFile
[ERROR] ^
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8: error: object ScalaPlan is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.ScalaPlan
[ERROR] ^
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13: error: object functions is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.functions.MapFunction


Unfortunately the quickstart script still creates a project for 0.6 and using Maven results in an error too:

[INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Interactive mode
[INFO] Archetype repository missing. Using the one from [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in catalog remote
Downloading: http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 18.857s
[INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
[INFO] Final Memory: 12M/104M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-archetype-plugin:2.2:generate (default-cli) on project standalone-pom: The desired archetype does not exist (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]


Is there some explanation how to move code from 0.6 to 0.7? Or an example which creates a Plan?

Thanks!
Cheers,
Max


Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Fabian Hueske
Hi Max,

yes, we changed the Scala API a bit and synced its features with the Java API.
There are several example programs written in the new Scala API under ./flink-examples/flink-scala-examples that show you how to use the API.
I think also the Scala API documentation was updated. Not sure if the changes are already reflected on the website, but you can build the docs locally from the sources with ./docs/build-docs.sh (or similar).

Best, Fabian

2014-09-26 10:20 GMT+02:00 Maximilian Alber <[hidden email]>:
Hi Flinkers,

I tried to migrate to 0.7 for new features aka Broadcast Variables in Scala. Unfortunately the code structure seemed to have changed:

[INFO] Compiling 2 source files to /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes at 1411719085848
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7: error: object TextFile is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.TextFile
[ERROR] ^
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8: error: object ScalaPlan is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.ScalaPlan
[ERROR] ^
[ERROR] /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13: error: object functions is not a member of package org.apache.flink.api.scala
[ERROR] import org.apache.flink.api.scala.functions.MapFunction


Unfortunately the quickstart script still creates a project for 0.6 and using Maven results in an error too:

[INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @ standalone-pom ---
[INFO] Generating project in Interactive mode
[INFO] Archetype repository missing. Using the one from [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in catalog remote
Downloading: http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 18.857s
[INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
[INFO] Final Memory: 12M/104M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-archetype-plugin:2.2:generate (default-cli) on project standalone-pom: The desired archetype does not exist (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]


Is there some explanation how to move code from 0.6 to 0.7? Or an example which creates a Plan?

Thanks!
Cheers,
Max



Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Aljoscha Krettek
In reply to this post by Maximilian Alber
Hi,
since the 0.7 is just the latest snapshot release the quickstart
script is not yet updated to create a 0.7 project.

If you want, you can create a 0.7 quickstart using:
curl https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
| bash

There are complete Scala API examples here:
http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
Just click the Scala Tab.

Let us know if you need more information.

Cheres,
Aljoscha

On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
<[hidden email]> wrote:

> Hi Flinkers,
>
> I tried to migrate to 0.7 for new features aka Broadcast Variables in Scala.
> Unfortunately the code structure seemed to have changed:
>
> [INFO] Compiling 2 source files to
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
> at 1411719085848
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
> error: object TextFile is not a member of package org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.TextFile
> [ERROR] ^
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
> error: object ScalaPlan is not a member of package
> org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.ScalaPlan
> [ERROR] ^
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
> error: object functions is not a member of package
> org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.functions.MapFunction
>
>
> Thus I tried to examine the new quickstart code on
> https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
> Unfortunately the quickstart script still creates a project for 0.6 and
> using Maven results in an error too:
>
> [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
> standalone-pom ---
> [INFO] Generating project in Interactive mode
> [INFO] Archetype repository missing. Using the one from
> [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in catalog
> remote
> Downloading:
> http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Total time: 18.857s
> [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
> [INFO] Final Memory: 12M/104M
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-archetype-plugin:2.2:generate (default-cli)
> on project standalone-pom: The desired archetype does not exist
> (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
>
>
> Is there some explanation how to move code from 0.6 to 0.7? Or an example
> which creates a Plan?
>
> Thanks!
> Cheers,
> Max
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Maximilian Alber
Hi!

Thanks for the quick help!
Seems I have to change the code a bit for the migration. Or basically substitute new ScalaPlan with env.createProgramPlan :-)

Cheers,
Max

On Fri, Sep 26, 2014 at 11:06 AM, Aljoscha Krettek <[hidden email]> wrote:
Hi,
since the 0.7 is just the latest snapshot release the quickstart
script is not yet updated to create a 0.7 project.

If you want, you can create a 0.7 quickstart using:
curl https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
| bash

There are complete Scala API examples here:
http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
Just click the Scala Tab.

Let us know if you need more information.

Cheres,
Aljoscha

On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
<[hidden email]> wrote:
> Hi Flinkers,
>
> I tried to migrate to 0.7 for new features aka Broadcast Variables in Scala.
> Unfortunately the code structure seemed to have changed:
>
> [INFO] Compiling 2 source files to
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
> at 1411719085848
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
> error: object TextFile is not a member of package org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.TextFile
> [ERROR] ^
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
> error: object ScalaPlan is not a member of package
> org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.ScalaPlan
> [ERROR] ^
> [ERROR]
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
> error: object functions is not a member of package
> org.apache.flink.api.scala
> [ERROR] import org.apache.flink.api.scala.functions.MapFunction
>
>
> Thus I tried to examine the new quickstart code on
> https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
> Unfortunately the quickstart script still creates a project for 0.6 and
> using Maven results in an error too:
>
> [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
> standalone-pom ---
> [INFO] Generating project in Interactive mode
> [INFO] Archetype repository missing. Using the one from
> [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in catalog
> remote
> Downloading:
> http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
> [INFO]
> ------------------------------------------------------------------------
> [INFO] BUILD FAILURE
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Total time: 18.857s
> [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
> [INFO] Final Memory: 12M/104M
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-archetype-plugin:2.2:generate (default-cli)
> on project standalone-pom: The desired archetype does not exist
> (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
>
>
> Is there some explanation how to move code from 0.6 to 0.7? Or an example
> which creates a Plan?
>
> Thanks!
> Cheers,
> Max
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Aljoscha Krettek
Hi,
yes, there are two ways to go. In the past we had PactProgram with the
getPlan method. This is not necessary anymore in the new version.

With the new version the code is simply put in a main method. Instead
of creating a Scala Plan you can simply call env.execute() to execute
the program.

The basic program skeleton is this:

object MyProgram {
  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val text = env.readTextFile(...)
    val mapped = text.map {...}

    // other operations

    mapped.writeAsText() // or writeAsCsv(), or something else

   env.execute("My Program")
  }

Cheers,
Aljoscha

On Fri, Sep 26, 2014 at 12:06 PM, Maximilian Alber
<[hidden email]> wrote:

> Hi!
>
> Thanks for the quick help!
> Seems I have to change the code a bit for the migration. Or basically
> substitute new ScalaPlan with env.createProgramPlan :-)
>
> Cheers,
> Max
>
> On Fri, Sep 26, 2014 at 11:06 AM, Aljoscha Krettek <[hidden email]>
> wrote:
>>
>> Hi,
>> since the 0.7 is just the latest snapshot release the quickstart
>> script is not yet updated to create a 0.7 project.
>>
>> If you want, you can create a 0.7 quickstart using:
>> curl
>> https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
>> | bash
>>
>> There are complete Scala API examples here:
>> http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
>> Just click the Scala Tab.
>>
>> Let us know if you need more information.
>>
>> Cheres,
>> Aljoscha
>>
>> On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
>> <[hidden email]> wrote:
>> > Hi Flinkers,
>> >
>> > I tried to migrate to 0.7 for new features aka Broadcast Variables in
>> > Scala.
>> > Unfortunately the code structure seemed to have changed:
>> >
>> > [INFO] Compiling 2 source files to
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
>> > at 1411719085848
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
>> > error: object TextFile is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.TextFile
>> > [ERROR] ^
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
>> > error: object ScalaPlan is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.ScalaPlan
>> > [ERROR] ^
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
>> > error: object functions is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.functions.MapFunction
>> >
>> >
>> > Thus I tried to examine the new quickstart code on
>> >
>> > https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
>> > Unfortunately the quickstart script still creates a project for 0.6 and
>> > using Maven results in an error too:
>> >
>> > [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
>> > standalone-pom ---
>> > [INFO] Generating project in Interactive mode
>> > [INFO] Archetype repository missing. Using the one from
>> > [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in
>> > catalog
>> > remote
>> > Downloading:
>> >
>> > http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [INFO] BUILD FAILURE
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [INFO] Total time: 18.857s
>> > [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
>> > [INFO] Final Memory: 12M/104M
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [ERROR] Failed to execute goal
>> > org.apache.maven.plugins:maven-archetype-plugin:2.2:generate
>> > (default-cli)
>> > on project standalone-pom: The desired archetype does not exist
>> > (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
>> >
>> >
>> > Is there some explanation how to move code from 0.6 to 0.7? Or an
>> > example
>> > which creates a Plan?
>> >
>> > Thanks!
>> > Cheers,
>> > Max
>> >
>> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Maximilian Alber
Hi, 

Thanks. 

Something different regarding file input.

My former code was:

val Y = DataSource(config.yFile, DelimitedInputFormat[Vector]{ x: String => Vector.parseFromString(1, x)})

with:

object Vector {
    def parseFromString(s: String) = {
      Vector.parseFromString(1, s)
    }

    def parseFromString(dimension: Int, s: String) = {
      val tokens = s split "\\s" filter { _.trim().size > 0 }
      val id = tokens(0).toInt
      val values = tokens drop 1 map { _ toFloat }

      assert(values.size == dimension)

      new Vector(id, values) 
    }
}

Thus my input is heterogeneous (integer and floats). In the new version the DataSource class is gone as DelimitedInputFormat is. Is the code just not added yet or is there another way to do it now?
F.e. I could in imagine to read it first as CSV with Float base type and afterwards creating the Vector objects with a map function.


Cheers,
Max
 



On Fri, Sep 26, 2014 at 12:17 PM, Aljoscha Krettek <[hidden email]> wrote:
Hi,
yes, there are two ways to go. In the past we had PactProgram with the
getPlan method. This is not necessary anymore in the new version.

With the new version the code is simply put in a main method. Instead
of creating a Scala Plan you can simply call env.execute() to execute
the program.

The basic program skeleton is this:

object MyProgram {
  def main(args: Array[String]) {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val text = env.readTextFile(...)
    val mapped = text.map {...}

    // other operations

    mapped.writeAsText() // or writeAsCsv(), or something else

   env.execute("My Program")
  }

Cheers,
Aljoscha

On Fri, Sep 26, 2014 at 12:06 PM, Maximilian Alber
<[hidden email]> wrote:
> Hi!
>
> Thanks for the quick help!
> Seems I have to change the code a bit for the migration. Or basically
> substitute new ScalaPlan with env.createProgramPlan :-)
>
> Cheers,
> Max
>
> On Fri, Sep 26, 2014 at 11:06 AM, Aljoscha Krettek <[hidden email]>
> wrote:
>>
>> Hi,
>> since the 0.7 is just the latest snapshot release the quickstart
>> script is not yet updated to create a 0.7 project.
>>
>> If you want, you can create a 0.7 quickstart using:
>> curl
>> https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
>> | bash
>>
>> There are complete Scala API examples here:
>> http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
>> Just click the Scala Tab.
>>
>> Let us know if you need more information.
>>
>> Cheres,
>> Aljoscha
>>
>> On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
>> <[hidden email]> wrote:
>> > Hi Flinkers,
>> >
>> > I tried to migrate to 0.7 for new features aka Broadcast Variables in
>> > Scala.
>> > Unfortunately the code structure seemed to have changed:
>> >
>> > [INFO] Compiling 2 source files to
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
>> > at 1411719085848
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
>> > error: object TextFile is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.TextFile
>> > [ERROR] ^
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
>> > error: object ScalaPlan is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.ScalaPlan
>> > [ERROR] ^
>> > [ERROR]
>> >
>> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
>> > error: object functions is not a member of package
>> > org.apache.flink.api.scala
>> > [ERROR] import org.apache.flink.api.scala.functions.MapFunction
>> >
>> >
>> > Thus I tried to examine the new quickstart code on
>> >
>> > https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
>> > Unfortunately the quickstart script still creates a project for 0.6 and
>> > using Maven results in an error too:
>> >
>> > [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
>> > standalone-pom ---
>> > [INFO] Generating project in Interactive mode
>> > [INFO] Archetype repository missing. Using the one from
>> > [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in
>> > catalog
>> > remote
>> > Downloading:
>> >
>> > http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [INFO] BUILD FAILURE
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [INFO] Total time: 18.857s
>> > [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
>> > [INFO] Final Memory: 12M/104M
>> > [INFO]
>> > ------------------------------------------------------------------------
>> > [ERROR] Failed to execute goal
>> > org.apache.maven.plugins:maven-archetype-plugin:2.2:generate
>> > (default-cli)
>> > on project standalone-pom: The desired archetype does not exist
>> > (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
>> >
>> >
>> > Is there some explanation how to move code from 0.6 to 0.7? Or an
>> > example
>> > which creates a Plan?
>> >
>> > Thanks!
>> > Cheers,
>> > Max
>> >
>> >
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Base Scala Code for Flink 0.7

Aljoscha Krettek
Hi,
yes, some things were thrown out while making the APIs equal. What you
could do with the delimited input format you can now do with a
TextInput, i.e. env.readTextFile() and a mapper. There won't even be
an actual map operator since the map is fused (chained) to the data
source.

Cheers,
Aljoscha

On Fri, Sep 26, 2014 at 12:59 PM, Maximilian Alber
<[hidden email]> wrote:

> Hi,
>
> Thanks.
>
> Something different regarding file input.
>
> My former code was:
>
> val Y = DataSource(config.yFile, DelimitedInputFormat[Vector]{ x: String =>
> Vector.parseFromString(1, x)})
>
> with:
>
> object Vector {
>     def parseFromString(s: String) = {
>       Vector.parseFromString(1, s)
>     }
>
>     def parseFromString(dimension: Int, s: String) = {
>       val tokens = s split "\\s" filter { _.trim().size > 0 }
>       val id = tokens(0).toInt
>       val values = tokens drop 1 map { _ toFloat }
>
>       assert(values.size == dimension)
>
>       new Vector(id, values)
>     }
> }
>
> Thus my input is heterogeneous (integer and floats). In the new version the
> DataSource class is gone as DelimitedInputFormat is. Is the code just not
> added yet or is there another way to do it now?
> F.e. I could in imagine to read it first as CSV with Float base type and
> afterwards creating the Vector objects with a map function.
>
>
> Cheers,
> Max
>
>
>
>
> On Fri, Sep 26, 2014 at 12:17 PM, Aljoscha Krettek <[hidden email]>
> wrote:
>>
>> Hi,
>> yes, there are two ways to go. In the past we had PactProgram with the
>> getPlan method. This is not necessary anymore in the new version.
>>
>> With the new version the code is simply put in a main method. Instead
>> of creating a Scala Plan you can simply call env.execute() to execute
>> the program.
>>
>> The basic program skeleton is this:
>>
>> object MyProgram {
>>   def main(args: Array[String]) {
>>     val env = ExecutionEnvironment.getExecutionEnvironment
>>
>>     val text = env.readTextFile(...)
>>     val mapped = text.map {...}
>>
>>     // other operations
>>
>>     mapped.writeAsText() // or writeAsCsv(), or something else
>>
>>    env.execute("My Program")
>>   }
>>
>> Cheers,
>> Aljoscha
>>
>> On Fri, Sep 26, 2014 at 12:06 PM, Maximilian Alber
>> <[hidden email]> wrote:
>> > Hi!
>> >
>> > Thanks for the quick help!
>> > Seems I have to change the code a bit for the migration. Or basically
>> > substitute new ScalaPlan with env.createProgramPlan :-)
>> >
>> > Cheers,
>> > Max
>> >
>> > On Fri, Sep 26, 2014 at 11:06 AM, Aljoscha Krettek <[hidden email]>
>> > wrote:
>> >>
>> >> Hi,
>> >> since the 0.7 is just the latest snapshot release the quickstart
>> >> script is not yet updated to create a 0.7 project.
>> >>
>> >> If you want, you can create a 0.7 quickstart using:
>> >> curl
>> >>
>> >> https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
>> >> | bash
>> >>
>> >> There are complete Scala API examples here:
>> >> http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
>> >> Just click the Scala Tab.
>> >>
>> >> Let us know if you need more information.
>> >>
>> >> Cheres,
>> >> Aljoscha
>> >>
>> >> On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
>> >> <[hidden email]> wrote:
>> >> > Hi Flinkers,
>> >> >
>> >> > I tried to migrate to 0.7 for new features aka Broadcast Variables in
>> >> > Scala.
>> >> > Unfortunately the code structure seemed to have changed:
>> >> >
>> >> > [INFO] Compiling 2 source files to
>> >> >
>> >> >
>> >> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
>> >> > at 1411719085848
>> >> > [ERROR]
>> >> >
>> >> >
>> >> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
>> >> > error: object TextFile is not a member of package
>> >> > org.apache.flink.api.scala
>> >> > [ERROR] import org.apache.flink.api.scala.TextFile
>> >> > [ERROR] ^
>> >> > [ERROR]
>> >> >
>> >> >
>> >> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
>> >> > error: object ScalaPlan is not a member of package
>> >> > org.apache.flink.api.scala
>> >> > [ERROR] import org.apache.flink.api.scala.ScalaPlan
>> >> > [ERROR] ^
>> >> > [ERROR]
>> >> >
>> >> >
>> >> > /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
>> >> > error: object functions is not a member of package
>> >> > org.apache.flink.api.scala
>> >> > [ERROR] import org.apache.flink.api.scala.functions.MapFunction
>> >> >
>> >> >
>> >> > Thus I tried to examine the new quickstart code on
>> >> >
>> >> >
>> >> > https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
>> >> > Unfortunately the quickstart script still creates a project for 0.6
>> >> > and
>> >> > using Maven results in an error too:
>> >> >
>> >> > [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
>> >> > standalone-pom ---
>> >> > [INFO] Generating project in Interactive mode
>> >> > [INFO] Archetype repository missing. Using the one from
>> >> > [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in
>> >> > catalog
>> >> > remote
>> >> > Downloading:
>> >> >
>> >> >
>> >> > http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
>> >> > [INFO]
>> >> >
>> >> > ------------------------------------------------------------------------
>> >> > [INFO] BUILD FAILURE
>> >> > [INFO]
>> >> >
>> >> > ------------------------------------------------------------------------
>> >> > [INFO] Total time: 18.857s
>> >> > [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
>> >> > [INFO] Final Memory: 12M/104M
>> >> > [INFO]
>> >> >
>> >> > ------------------------------------------------------------------------
>> >> > [ERROR] Failed to execute goal
>> >> > org.apache.maven.plugins:maven-archetype-plugin:2.2:generate
>> >> > (default-cli)
>> >> > on project standalone-pom: The desired archetype does not exist
>> >> > (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
>> >> >
>> >> >
>> >> > Is there some explanation how to move code from 0.6 to 0.7? Or an
>> >> > example
>> >> > which creates a Plan?
>> >> >
>> >> > Thanks!
>> >> > Cheers,
>> >> > Max
>> >> >
>> >> >
>> >
>> >
>
>