Re: Recursive Traversal of the Input Path Directory, Not working

Posted by Adarsh Jain on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Recursive-Traversal-of-the-Input-Path-Directory-Not-working-tp13926p14012.html

Thanks, Stefan. My colleague Shashank has filed a bug for this in JIRA:

https://issues.apache.org/jira/browse/FLINK-6993

Regards,
Adarsh

On Fri, Jun 23, 2017 at 8:19 PM, Stefan Richter <[hidden email]> wrote:
Hi,

I suggest that you simply open an issue for this in our JIRA, describing the improvement idea. That should be the fastest way to get this changed.

Best,
Stefan

On 23.06.2017 at 15:08, Adarsh Jain <[hidden email]> wrote:

Hi Stefan,

I think I found the problem. Try it with a file whose name starts with an underscore, such as "_part-1-0.csv".

When saving, Flink prepends a "_" to the file name; however, when reading at the directory level, it does not pick up those files.
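The behaviour described above can be illustrated with a small standalone sketch. It does not use Flink at all; it only imitates, as an assumption based on my reading of the default file input format, a rule that skips names starting with "_" or "." when a directory is listed, while a path that names a file directly is read as-is. The object and method names (UnderscoreFilterSketch, acceptFile, filesInDir, resolve, demo) are hypothetical and chosen for illustration only.

```scala
import java.io.File
import java.nio.file.Files

object UnderscoreFilterSketch {
  // Assumed default rule: names starting with "_" or "." are treated as hidden.
  def acceptFile(name: String): Boolean =
    !name.startsWith("_") && !name.startsWith(".")

  // Lists accepted file names below `dir`, descending into accepted subdirectories.
  def filesInDir(dir: File): List[String] = {
    val entries = Option(dir.listFiles()).map(_.toList).getOrElse(Nil).sortBy(_.getName)
    entries.flatMap { entry =>
      if (!acceptFile(entry.getName)) Nil
      else if (entry.isDirectory) filesInDir(entry)
      else List(entry.getName)
    }
  }

  // A path that names a file directly is taken as-is; only directory listings are filtered.
  def resolve(path: File): List[String] =
    if (path.isDirectory) filesInDir(path) else List(path.getName)

  // Builds a temporary directory tree and resolves it once as a directory, once as a file.
  def demo(): (List[String], List[String]) = {
    val root = Files.createTempDirectory("flink-input").toFile
    val nested = new File(root, "nested")
    nested.mkdir()
    new File(root, "_part-1-0.csv").createNewFile()
    new File(root, "part-1-1.csv").createNewFile()
    new File(nested, "part-2-0.csv").createNewFile()
    (resolve(root), resolve(new File(root, "_part-1-0.csv")))
  }

  def main(args: Array[String]): Unit = {
    val (fromDirectory, fromFile) = demo()
    println(s"Read via directory: $fromDirectory")
    println(s"Read via exact file path: $fromFile")
  }
}
```

In this sketch the underscore-prefixed file is returned when its exact path is given but is missing from the directory listing, which matches the symptom: no exception, just no data for those files.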

Is there a setting that stops Flink from prepending the underscore when it saves a file?

Regards,
Adarsh

On Fri, Jun 23, 2017 at 3:24 PM, Stefan Richter <[hidden email]> wrote:
No, that doesn’t make a difference and also works.

On 23.06.2017 at 11:40, Adarsh Jain <[hidden email]> wrote:

I am using "val env = ExecutionEnvironment.getExecutionEnvironment" with "import org.apache.flink.api.scala.ExecutionEnvironment". Could this be the problem?

My program is written in Scala.

Regards,
Adarsh 

On Fri, Jun 23, 2017 at 3:01 PM, Stefan Richter <[hidden email]> wrote:
I just copied and pasted your code, added the missing "val env = LocalEnvironment.createLocalEnvironment()", and replaced the path string with a local directory containing some test files that I created. No other changes.

On 23.06.2017 at 11:25, Adarsh Jain <[hidden email]> wrote:

Hi Stefan,

Thanks for taking the time to check this. It still doesn't work for me.

Could you paste the code you used? Maybe I am making a silly mistake that I cannot spot.

Thanks again.

Regards,
Adarsh


On Fri, Jun 23, 2017 at 2:32 PM, Stefan Richter <[hidden email]> wrote:
Hi,

I tried this out on the current master and the 1.3 release, and both work for me. Everything works exactly as expected for file names, a directory, and even nested directories.

Best,
Stefan

On 22.06.2017 at 21:13, Adarsh Jain <[hidden email]> wrote:

Hi Stefan,

Yes, you understood correctly. When I give the full path up to the file name, it works fine; however, when I give the path only up to the directory, it does not read the data and does not print any exceptions either. I am also not sure why it behaves like this.

It should be easy to reproduce, if you are able to try. That would be really helpful.

Regards,
Adarsh

On Thu, Jun 22, 2017 at 9:00 PM, Stefan Richter <[hidden email]> wrote:
Hi,

I am not sure I am getting the problem right: the code works if you use a file name, but it does not work for directories? What exactly is not working? Do you get any exceptions?

Best,
Stefan

On 22.06.2017 at 17:01, Adarsh Jain <[hidden email]> wrote:

Hi,

I am trying to use "Recursive Traversal of the Input Path Directory" in Flink 1.3 with Scala. A snippet of my code is below. If I give the exact file name, it works fine. Ref: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/index.html

import org.apache.flink.api.java.utils.ParameterTool
import org.apache.flink.api.java.{DataSet, ExecutionEnvironment}
import org.apache.flink.configuration.Configuration

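// Enable recursive enumeration of nested input files.
val config = new Configuration
config.setBoolean("recursive.file.enumeration", true)

// env and featuresSource are defined elsewhere in the program (not shown here).
val testInput = env.readTextFile(featuresSource).withParameters(config)
testInput.print()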

Please advise on how to fix this.

Regards,
Adarsh