Blobstore exceptions.

Lasse Nedergaard
Hi.

We sometimes see jobs fail with a blob store exception like the one below. Does anyone have an idea why we get them and how to avoid them?
In this case the job had run without any problems for a week before we got the error. Only this job is affected right now; all the others are running as expected, but next time it could be one of the other jobs that gets the exception.

We are running Flink 1.4.2 on an AWS EMR cluster, but we have seen the same problem on 1.3.2 too.

java.io.IOException: Failed to fetch BLOB ff5d324719fb4caf3a0dba3fbcfa795e/p-812d84ea013302dbd24da1d32e732cc01582dabc-3198b6f63d293d2756f4cf5b8eebe7a2 from ip-10-1-1-192.eu-west-1.compute.internal/10.1.1.192:46781 and store it under /tmp/blobStore-3e90d7b0-2f40-4e28-b2b0-01d9ba96ac55/incoming/temp-00000173
	at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:191)
	at org.apache.flink.runtime.blob.AbstractBlobCache.getFileInternal(AbstractBlobCache.java:177)
	at org.apache.flink.runtime.blob.PermanentBlobCache.getFile(PermanentBlobCache.java:205)
	at org.apache.flink.runtime.execution.librarycache.BlobLibraryCacheManager.registerTask(BlobLibraryCacheManager.java:119)
	at org.apache.flink.runtime.taskmanager.Task.createUserCodeClassloader(Task.java:878)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:589)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: GET operation failed: Server side error: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
	at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:253)
	at org.apache.flink.runtime.blob.BlobClient.downloadFromBlobServer(BlobClient.java:166)
	... 6 more
Caused by: java.io.IOException: Server side error: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
	at org.apache.flink.runtime.blob.BlobClient.receiveAndCheckGetResponse(BlobClient.java:306)
	at org.apache.flink.runtime.blob.BlobClient.getInternal(BlobClient.java:247)
	... 7 more
Caused by: java.nio.file.NoSuchFileException: /tmp/blobStore-a83b8ca6-c01a-496a-8997-31687f37b95d/incoming/temp-00049050
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
	at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
	at java.nio.file.Files.move(Files.java:1395)
	at org.apache.flink.runtime.blob.BlobUtils.moveTempFileToStore(BlobUtils.java:452)
	at org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:521)
	at org.apache.flink.runtime.blob.BlobServerConnection.get(BlobServerConnection.java:231)
	at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:117)

Re: Blobstore exceptions.

Renjie Liu
You need to check whether your disk is full.
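
A minimal sketch of such a check, in case it helps. It assumes the blob store lives under /tmp, as the paths in the stack trace suggest; if `blob.storage.directory` is set in your Flink configuration, pass that directory instead. The class name `BlobDiskCheck`, the helper `usableFraction`, and the 5% warning threshold are only illustrative and are not part of Flink.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Flink: reports how much space is left
// on the file system that holds the blob store directory.
class BlobDiskCheck {

    /** Returns the usable fraction (0.0 to 1.0) of the file store that holds dir. */
    static double usableFraction(Path dir) {
        try {
            FileStore store = Files.getFileStore(dir);
            long total = store.getTotalSpace();
            // Some virtual file systems report a total size of 0; treat that as "no space".
            return total == 0 ? 0.0 : (double) store.getUsableSpace() / total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the blob store is under /tmp unless another path is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp");
        double free = usableFraction(dir);
        System.out.printf("%s: %.1f%% usable%n", dir, free * 100);
        if (free < 0.05) { // illustrative threshold
            System.out.println("WARNING: less than 5% of the disk is usable");
        }
    }
}
```

Run it on both the JobManager host and the affected TaskManager host, since the innermost exception in the trace comes from the server side.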

On Thu, Jun 14, 2018 at 2:15 PM Lasse Nedergaard <[hidden email]> wrote:
> [quoted original message and stack trace, identical to the post above]

--
Liu, Renjie
Software Engineer, MVAD