Understanding the behavior


Maximilian Alber
Hi Flinksters,

After mapping a data set, its only value seems to disappear. I cannot explain this behavior. Maybe someone can help me?

In this code I have 4 versions. The first two do basically nothing, but they confirm that there is actually a value inside the dataset.
Version 3 maps the vector to a new vector, but the result set is empty.
Version 4 behaves the same way.

What I would like to achieve is version 3, i.e. changing the id value of the vector. But somehow the vector disappears and the result is always an empty set.

val startWidth = env.fromCollection[Vector](Seq(Vector.ones(config.dimensions) * config.startWidth)) map {x => new Vector(0, x.values)}
val startUpdate = env.fromCollection[Vector](Seq(Vector.ones(config.dimensions) * 0.01F)) map {x => new Vector(1, x.values)}
val startLastGradient = env.fromCollection[Vector](Seq(Vector.zeros(config.dimensions))) map {x => new Vector(2, x.values)}

var stepSet = startWidth union startUpdate union startLastGradient
stepSet = stepSet.iterate(1){
    stepSet =>
    // version 1
    val width = stepSet filter {_.id == 0};// works
    // version 2
    val width = stepSet filter {_.id == 0} map {x => x};// works
    // version 3
    val width = stepSet filter {_.id == 0} map {x => new Vector(-1, x.values)};// does not work
    // version 4
    val width = stepSet filter {_.id == 0} map {x: Vector => new Vector(23, Array(1.0F, 2.0F))};// does not work
  width
}


I have attached the jar, the source code and the input files.
The program writes the width dataset to out_file.
You can switch between the code "versions" starting at line 353.

You can call the program as follows (you need to adjust the jar, in_file and random_file, and set out_file as you like):
flink run the_jar_file '-c', 'bumpboost.Job', 'in_file=/tmp/tmpdW3O98', 'out_file=/tmp/tmp2RISRF', 'random_file=/tmp/tmpEN9XU7', 'dimensions=1', 'N=100', 'iterations=30', 'multi_bump_boost=1', 'gradient_descent_iterations=30', 'cache=False', 'start_width=1.0', 'min_width=-4', 'max_width=6', 'min_width_update=1e-08', 'max_width_update=10'

Thank you!
Cheers,
Max

Attachments: in_file (5K), random_file (2K), bump_boost.tar.gz (767K), bump_boost-0.1.jar (647K)

Re: Understanding the behavior

Aljoscha Krettek
Hi,
the reason is that you filter out the items. In your first iteration,
a new element is created whose first field is -1 (version 3) or 23
(version 4). In the second iteration, you filter out all elements that
do not have 0 as the first field, so you arrive at an empty set.
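
The effect can be reproduced without a cluster. The following is only a minimal sketch: plain Scala collections stand in for Flink DataSets, `Vec` is a hypothetical stand-in for the Vector class from the original post, and the `iterate` helper merely imitates the way a bulk iteration feeds the result of one pass into the next. It is not Flink's implementation.

```scala
// Assumption: plain Scala collections stand in for Flink DataSets so the
// sketch runs locally. `Vec` is a hypothetical stand-in for the poster's Vector.
final case class Vec(id: Int, values: Seq[Float])

object IterationSketch {
  // Imitates a bulk iteration: the output of one pass is the input of the next.
  def iterate[T](initial: Seq[T], maxIterations: Int)(step: Seq[T] => Seq[T]): Seq[T] =
    (1 to maxIterations).foldLeft(initial)((current, _) => step(current))

  // Three tagged vectors, mirroring startWidth, startUpdate and startLastGradient.
  val stepSet: Seq[Vec] = Seq(
    Vec(0, Seq(1.0f)),
    Vec(1, Seq(0.01f)),
    Vec(2, Seq(0.0f))
  )

  // Like version 2: the id stays 0, so the element passes the filter on every pass.
  val keepId: Seq[Vec] => Seq[Vec] =
    _.filter(_.id == 0).map(x => x)

  // Like version 3: the id becomes -1, so the filter of the next pass drops it.
  val changeId: Seq[Vec] => Seq[Vec] =
    _.filter(_.id == 0).map(x => Vec(-1, x.values))

  def main(args: Array[String]): Unit = {
    println(iterate(stepSet, 1)(changeId))  // one pass: a single element with id -1
    println(iterate(stepSet, 2)(changeId))  // two passes: empty
    println(iterate(stepSet, 30)(keepId))   // id unchanged: the element survives
  }
}
```

With a single pass the relabelled element is returned. With any higher iteration count, the filter on id 0 runs again on the relabelled output and nothing is left.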

Cheers,
Aljoscha

On Thu, Dec 11, 2014 at 12:15 PM, Maximilian Alber
<[hidden email]> wrote:

> [...]

Re: Understanding the behavior

Maximilian Alber
Hi!

Damn, I thought I had set the iteration count to 1, but I did it in the wrong place. My bad.
Thanks!!

Cheers,
max

On Thu, Dec 11, 2014 at 12:41 PM, Aljoscha Krettek <[hidden email]> wrote:
> [...]