Re: Error reporting for Flink jobs
Posted by
Timo Walther on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Error-reporting-for-Flink-jobs-tp36256p36264.html
Hi Satyam,
I'm not aware of an API to solve all your problems at once. A common
pattern for failures in user-code is to catch errors in user-code and
define a side output for an operator to pipe the errors to dedicated
sinks. However, such a functionality does not exist in SQL yet. For the
sink part, it might be useful to look into the StreamingFileSink [1]
which provides better failure handling guarantees. Flink 1.11 will be
shipped with a SQL streaming file sink.
Regards,
Timo
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.htmlOn 28.06.20 12:27, Satyam Shekhar wrote:
> Hello,
>
> I am using Flink as the query engine for running SQL queries on both
> batch and streaming data. I use the Blink planner in batch and streaming
> mode respectively for the two cases.
>
> In my current setup, I execute the batch queries synchronously via
> StreamTableEnvironment::execute method. The job uses OutputFormat to
> consume results in StreamTableSink and send it to the user. In case
> there is an error/exception in the pipeline (possibly to user code), it
> is not reported to OutputFormat or the Sink. If an error occurs after
> the invocation of the write method on OutputFormat, the implementation
> may falsely assume that the result successful and complete since close
> is called in both success and failure cases. I can work around this, by
> checking for exceptions thrown by the execute method but that adds extra
> latency due to job tear down cost.
>
> A similar problem also exists for streaming jobs. In my setup, streaming
> jobs are executed asynchronously via
> StreamExecuteEnvironment::executeAsync. Since the sink interface has no
> methods to receive errors in the pipeline, the user code has to
> periodically track and manage persistent failures.
>
> Have I missed something in the API? Or Is there some other way to get
> access to error status in user code?
>
> Regards,
> Satyam