Hi all,
this is probably related to the problem that I reported in December. In case it helps, you can find a self-contained example below. I haven't looked deeply into the problem, but it seems that the correct file splits are determined but somehow not processed. When I read from HDFS, nested files are skipped as well, which is a real problem for me at the moment.

Cheers,
Lukas

import org.apache.flink.api.java.io.TextInputFormat;
|
Hi Lukas,
have you tried setting the parameter "recursive.file.enumeration" to true?
If this also does not work, I think this could be a bug. You can
open an issue for it and attach your sample code.

Timo

On 09/01/17 at 13:47, Lukas Kircher wrote:
> Hi all,
|
Hi Lukas,
Are you sure that tempFile.deleteOnExit() does not remove the files before the test completes? I am just asking to be sure.

Also, from the code, I suppose that you run it locally. I suspect that the problem is in the way the input format scans nested files, but could you check whether the nestedFileEnumeration parameter is still true in the code that is executed by the tasks? I am asking in order to pin down whether the problem is in the way we ship the code to the tasks or in reading the nested files.

Thanks,
Kostas
|
Thanks for your suggestions:

@Timo
1) Regarding the recursive.file.enumeration parameter: I think what counts here is the enumerateNestedFiles parameter in FileInputFormat.java. Calling the setter for enumerateNestedFiles is expected to override recursive.file.enumeration. Not literally - I think recursive.file.enumeration is simply ignored in that case. This is tested in TextInputFormatTest.testNestedFileRead().

@Kostas
2) tempFile.deleteOnExit(): If I remove the line, I get the same result. Only the content of the file in the top-level tmp directory is printed. I derived the SSCCE from my real use case, where I originally encountered the problem. I don't touch the input files there in any way.
3) The given example is run locally. In TextInputFormat.readRecord(String, byte[], int, int) the nestedFileEnumeration parameter is true during execution. Is this what you meant?

Cheers,
Lukas
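
To rule out the test files themselves (the deleteOnExit() question), one option is a check that does not involve Flink at all. The following is only a minimal sketch using plain java.nio: it builds a top-level file plus one nested file in a temp directory and lists every regular file recursively, which is the set of files a nested read would be expected to cover. The class name NestedFileCheck, the method names, and the file names top.txt and nested/inner.txt are hypothetical and are not taken from the original example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, independent of Flink: shows which files exist on disk.
final class NestedFileCheck {

    // Creates <tmp>/top.txt and <tmp>/nested/inner.txt (names are assumptions).
    static Path createNestedLayout() throws IOException {
        Path root = Files.createTempDirectory("nested-check");
        Path nested = Files.createDirectory(root.resolve("nested"));
        Files.write(root.resolve("top.txt"),
                "top-level line\n".getBytes(StandardCharsets.UTF_8));
        Files.write(nested.resolve("inner.txt"),
                "nested line\n".getBytes(StandardCharsets.UTF_8));
        return root;
    }

    // Returns all regular files below root as sorted, '/'-separated relative paths.
    static List<String> listRegularFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(p -> root.relativize(p).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = createNestedLayout();
        for (String file : listRegularFiles(root)) {
            System.out.println(file);
        }
    }
}
```

If both files are listed here but only the top-level content shows up in the job output, the files are present on disk and the problem lies on the reading side rather than in the test setup.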
|
Yes, thanks for the effort. I will look into it.
Kostas
|