Bowen Li (in CC) closed the issue, but there is no fix (or at least none is linked in the JIRA). Maybe it was resolved in another issue or can be resolved differently.

@Bowen, can you comment on how to fix this problem? Will it work in Flink 1.4.0?

Thank you,
Fabian

2017-12-13 5:28 GMT+01:00 Hao Sun <[hidden email]>:

https://issues.apache.org/jira/browse/FLINK-7590
I have a similar situation with Flink 1.3.2 on K8S
=========
2017-12-13 00:57:12,403 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: KafkaSource(maxwell.tickets) -> MaxwellFilter->Maxwell(maxwell.tickets) -> FixedDelayWatermark(maxwell.tickets) -> MaxwellFPSEvent->InfluxDBData(maxwell.tickets) -> Sink: influxdbSink(maxwell.tickets) (1/3) (6ad009755a6009975d197e75afa05e14) switched from RUNNING to FAILED.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 803 for operator Source: KafkaSource(maxwell.tickets) -> MaxwellFilter->Maxwell(maxwell.tickets) -> FixedDelayWatermark(maxwell.tickets) -> MaxwellFPSEvent->InfluxDBData(maxwell.tickets) -> Sink: influxdbSink(maxwell.tickets) (1/3).}
    at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: Could not materialize checkpoint 803 for operator Source: KafkaSource(maxwell.tickets) -> MaxwellFilter->Maxwell(maxwell.tickets) -> FixedDelayWatermark(maxwell.tickets) -> MaxwellFPSEvent->InfluxDBData(maxwell.tickets) -> Sink: influxdbSink(maxwell.tickets) (1/3).
    ... 6 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:906)
    ... 5 more
    Suppressed: java.lang.Exception: Could not properly cancel managed operator state future.
        at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:98)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:1023)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:961)
        ... 5 more
    Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
        at org.apache.flink.runtime.state.StateUtil.discardStateFuture(StateUtil.java:85)
        at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:96)
        ... 7 more
    Caused by: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
        at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.closeAndGetHandle(FsCheckpointStreamFactory.java:336)
        at org.apache.flink.runtime.checkpoint.AbstractAsyncSnapshotIOCallable.closeStreamAndGetStateHandle(AbstractAsyncSnapshotIOCallable.java:100)
        at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:270)
        at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:233)
        at org.apache.flink.runtime.io.async.AbstractAsyncIOCallable.call(AbstractAsyncIOCallable.java:72)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:906)
        ... 5 more
    Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 751174B20E6C6A0A, AWS Error Code: RequestTimeout, AWS Error Message: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed., S3 Extended Request ID: dADBPVGflB29xtFb7ydxD2SU3LzHw2cBkumOK5EX4TYgt+LVErSOShxPkZmGrCvmT39FHDbIryc=
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1393)
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:108)
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:100)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.upload(UploadMonitor.java:192)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:150)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
        ... 4 more
    [CIRCULAR REFERENCE:java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle]
2017-12-13 00:57:12,404 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job KafkaDemo maxwell.tickets (env:production) (d5a8b2ab61625cf0aa1e66360b7ad0af) switched from state RUNNING to FAILING.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 803 for operator Source: KafkaSource(maxwell.tickets) -> MaxwellFilter->Maxwell(maxwell.tickets) -> FixedDelayWatermark(maxwell.tickets) -> MaxwellFPSEvent->InfluxDBData(maxwell.tickets) -> Sink: influxdbSink(maxwell.tickets) (1/3).}
    at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: Could not materialize checkpoint 803 for operator Source: KafkaSource(maxwell.tickets) -> MaxwellFilter->Maxwell(maxwell.tickets) -> FixedDelayWatermark(maxwell.tickets) -> MaxwellFPSEvent->InfluxDBData(maxwell.tickets) -> Sink: influxdbSink(maxwell.tickets) (1/3).
    ... 6 more
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:906)
    ... 5 more
    Suppressed: java.lang.Exception: Could not properly cancel managed operator state future.
        at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:98)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:1023)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:961)
        ... 5 more
    Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
        at org.apache.flink.runtime.state.StateUtil.discardStateFuture(StateUtil.java:85)
        at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:96)
        ... 7 more
    Caused by: java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle
        at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory$FsCheckpointStateOutputStream.closeAndGetHandle(FsCheckpointStreamFactory.java:336)
        at org.apache.flink.runtime.checkpoint.AbstractAsyncSnapshotIOCallable.closeStreamAndGetStateHandle(AbstractAsyncSnapshotIOCallable.java:100)
        at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:270)
        at org.apache.flink.runtime.state.DefaultOperatorStateBackend$1.performOperation(DefaultOperatorStateBackend.java:233)
        at org.apache.flink.runtime.io.async.AbstractAsyncIOCallable.call(AbstractAsyncIOCallable.java:72)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:906)
        ... 5 more
    Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 751174B20E6C6A0A, AWS Error Code: RequestTimeout, AWS Error Message: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed., S3 Extended Request ID: dADBPVGflB29xtFb7ydxD2SU3LzHw2cBkumOK5EX4TYgt+LVErSOShxPkZmGrCvmT39FHDbIryc=
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1393)
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:108)
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:100)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.upload(UploadMonitor.java:192)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:150)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
        ... 4 more
    [CIRCULAR REFERENCE:java.io.IOException: Could not flush and close the file system output stream to s3a://zendesk-euc1-fraud-prevention-production/checkpoints/d5a8b2ab61625cf0aa1e66360b7ad0af/chk-803/4f485204-3ec5-402a-a57d-fab13e068cbc in order to obtain the stream state handle]