Flink-ml multiple linear regression fit

Florian Heyl
Hey guys, I need your help again.
I am currently having problems with the multiple linear regression from flink-ml when running on HDFS.
Locally it works fine with 0.9-SNAPSHOT. The cluster runs 0.10-SNAPSHOT. The code is the following:
// set linear regression with parameters:
val mlr = MultipleLinearRegression()
  .setStepsize(0.001)
  .setIterations(1000000000)
  .setConvergenceThreshold(0.001)

// do linear regression and time the method
val model = mlr.fit(transformTrain)

// The fitted model can now be used to make predictions
val predictions = mlr.predict(tranformTest)
The dataset transformTrain has the following form (filled with doubles):
LabeledVector(numList(0), DenseVector(numList(1),numList(2)))
The line that calls the fit method (mlr.fit) causes the following error:

An error occurred while invoking the program:


The program caused an error:

java.lang.NoClassDefFoundError: breeze/storage/Zero
	at org.apache.flink.ml.pipeline.Estimator$class.fit(Estimator.scala:53)
	at org.apache.flink.ml.regression.MultipleLinearRegression.fit(MultipleLinearRegression.scala:88)
	at Regression2$.buildModelRegression(Regression2.scala:37)
	at Regression2$$anonfun$mainRegression$1.apply$mcVI$sp(Regression2.scala:116)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
	at Regression2$.mainRegression(Regression2.scala:103)
	at MainClass$.main(MainClass.scala:47)
	at MainClass.main(MainClass.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:483)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
	at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:192)
	at org.apache.flink.client.CliFrontend.info(CliFrontend.java:399)
	at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:959)
	at org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:174)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:734)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:847)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
	at org.eclipse.jetty.server.Server.handle(Server.java:352)
	at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
	at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
	at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: breeze.storage.Zero
	at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 39 more
Thanks for any help.

Best wishes,
Flo
Re: Flink-ml multiple linear regression fit

Stephan Ewen
Hi!

Looks like you submitted the program JAR, but it did not contain all required libraries, like the breeze JAR.

Did you build a proper fat jar, or how did you package the program?

Greetings,
Stephan
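
A quick way to confirm this kind of diagnosis is to ask the class loader directly whether the class named in the stack trace can be found. The following is only a minimal sketch, not part of Flink: the object name ClasspathCheck and the helper isLoadable are made up for illustration, and the check would need to run with the same classpath as the submitted program to be meaningful.

```scala
object ClasspathCheck {
  // Returns true if the named class can be located by this object's class loader.
  // initialize = false avoids running static initializers during the probe.
  def isLoadable(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    // breeze.storage.Zero is the class from the stack trace;
    // java.lang.String is only a control that should always be found.
    val names = Seq("java.lang.String", "breeze.storage.Zero")
    names.foreach { n =>
      val status = if (isLoadable(n)) "found" else "MISSING"
      println(s"$n: $status")
    }
  }
}
```

If breeze.storage.Zero is reported as missing when the check runs from the packaged JAR, the breeze library was most likely not bundled.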

On Fri, Sep 18, 2015 at 8:22 PM, Florian Heyl <[hidden email]> wrote:
> [...]

Re: Flink-ml multiple linear regression fit

Florian Heyl
Hi Stephan,

Yes, I forgot the breeze library, thanks. Unfortunately, there is another problem when I run the pipeline on HDFS.
I tried to track down the cause, and I am mainly stuck at the collect method for the datasets.
// List( (1.0, 1.0), (2.0, 2.0), ... (1.0,1.0) )
val list_JoinPredictionAndOriginal = JoinPredictionAndOriginal.collect
This line causes the errors (see below). Maybe I am still missing some libraries. The jar is now packaged with the breeze, netlib, flink-ml, flink-core, kryo, and minlog libraries.
Thank you for your help and your time.

Best wishes,
Flo

Error: java.lang.NoClassDefFoundError: org/netlib/blas/Ddot
at com.github.fommil.netlib.F2jBLAS.ddot(F2jBLAS.java:71)
at org.apache.flink.ml.math.BLAS$.dot(BLAS.scala:123)
at org.apache.flink.ml.math.BLAS$.dot(BLAS.scala:106)
at org.apache.flink.ml.optimization.LinearPrediction$.predict(PredictionFunction.scala:34)
at org.apache.flink.ml.optimization.GenericLossFunction.lossGradient(LossFunction.scala:83)
at org.apache.flink.ml.optimization.LossFunction$class.loss(LossFunction.scala:43)
at org.apache.flink.ml.optimization.GenericLossFunction.loss(LossFunction.scala:71)
at org.apache.flink.ml.optimization.GradientDescent$$anonfun$org$apache$flink$ml$optimization$GradientDescent$$calculateLoss$1.apply(GradientDescent.scala:237)
at org.apache.flink.ml.optimization.GradientDescent$$anonfun$org$apache$flink$ml$optimization$GradientDescent$$calculateLoss$1.apply(GradientDescent.scala:237)
at org.apache.flink.ml.package$BroadcastSingleElementMapper.map(package.scala:86)
at org.apache.flink.runtime.operators.MapDriver.run(MapDriver.java:97)
at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:489)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.netlib.blas.Ddot
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 15 more

Error: java.io.IOException: Materialization of the broadcast variable failed.
at org.apache.flink.runtime.broadcast.BroadcastVariableMaterialization.materializeVariable(BroadcastVariableMaterialization.java:154)
at org.apache.flink.runtime.broadcast.BroadcastVariableManager.materializeBroadcastVariable(BroadcastVariableManager.java:50)
at org.apache.flink.runtime.operators.RegularPactTask.readAndSetBroadcastInput(RegularPactTask.java:432)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:350)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:581)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.runtime.io.network.partition.ProducerFailedException
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.getNextLookAhead(LocalInputChannel.java:270)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.onNotification(LocalInputChannel.java:238)
at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.release(PipelinedSubpartition.java:158)
at org.apache.flink.runtime.io.network.partition.ResultPartition.release(ResultPartition.java:300)
at org.apache.flink.runtime.io.network.partition.ResultPartitionManager.releasePartitionsProducedBy(ResultPartitionManager.java:95)
at org.apache.flink.runtime.io.network.NetworkEnvironment.unregisterTask(NetworkEnvironment.java:357)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:679)
... 1 more



On Sep 20, 2015, at 02:01, Stephan Ewen <[hidden email]> wrote:
> [...]


Re: Flink-ml multiple linear regression fit

Stephan Ewen
You are again missing a library.

There seems to be something quite complicated about your build setup.

I would go for the ML quickstart or Maven template, which will package a correct fat jar automatically.
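
Whichever build is used, it can help to verify the packaged JAR before submitting it. The sketch below is an illustration only: the object name JarContentsCheck, its helpers, and the default path target/my-job.jar are hypothetical. It looks inside a JAR for the two classes named in the stack traces of this thread, using the standard java.util.jar.JarFile API.

```scala
import java.util.jar.JarFile

object JarContentsCheck {
  // Converts a fully qualified class name to its entry path inside a JAR.
  def entryName(className: String): String =
    className.replace('.', '/') + ".class"

  // Returns the class names from `required` that have no entry in the JAR.
  def missingClasses(jarPath: String, required: Seq[String]): Seq[String] = {
    val jar = new JarFile(jarPath)
    try required.filter(c => jar.getEntry(entryName(c)) == null)
    finally jar.close()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical default path; pass the real JAR as the first argument.
    val jarPath = args.headOption.getOrElse("target/my-job.jar")
    // The two classes reported as missing in this thread.
    val required = Seq("breeze.storage.Zero", "org.netlib.blas.Ddot")
    val missing = missingClasses(jarPath, required)
    if (missing.isEmpty) println("All checked classes are packaged.")
    else missing.foreach(c => println(s"Missing from JAR: $c"))
  }
}
```

A class reported as missing here would be expected to fail on the cluster with the same NoClassDefFoundError, unless the cluster provides that library itself.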



On Sun, Sep 20, 2015 at 2:15 PM, Florian Heyl <[hidden email]> wrote:
> [...]