Hi to all,
I'm using Flink 1.2.0 and I have a job that (at some point) calls dataset.first(1M). Sometimes the "records sent" count displayed by the UI is less than 1M (like 999709). Is it possible that the UI (or the internal Flink counters) misses some records?
Best,
Flavio

Flavio Pompermaier
Development Department
OKKAM S.r.l.
Tel. +(39) 0461 1823908
Hey Flavio,
it's unlikely that the counters skip a record. For the web UI, these metrics are transported in two different ways:
For running tasks they are fetched through the metric system; this provides no guarantee that the final count is ever displayed.
For finished tasks the final count is stored in the ExecutionGraph and should show an accurate final count.
So the question is which state your task is in.
Regards,
Chesnay

On 05.04.2017 09:55, Flavio Pompermaier wrote:
My job is a batch one. Here's an image of two different executions of the job. The third line is where first(1M) is called. In the first execution (left side) the count is what I expect; in the second it is slightly less :( Any idea?

On Wed, Apr 5, 2017 at 12:51 PM, Chesnay Schepler <[hidden email]> wrote:
Could you verify with a custom UDF that 1M records are actually being produced?
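A minimal sketch of such a check, assuming the Java DataSet API in Flink 1.2: a pass-through RichMapFunction placed directly after first(n) registers a LongCounter accumulator, and the total is read from the JobExecutionResult once the job has finished. The class name, the accumulator name and the generateSequence input are placeholders for illustration and are not taken from the actual job.

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.DiscardingOutputFormat;
import org.apache.flink.configuration.Configuration;

public class FirstCountCheck {

    // Hypothetical accumulator name, chosen only for this sketch.
    static final String ACC_NAME = "records-after-first";

    /** Pass-through UDF: forwards each record unchanged and counts it. */
    public static class CountingMapper extends RichMapFunction<Long, Long> {
        private final LongCounter counter = new LongCounter();

        @Override
        public void open(Configuration parameters) {
            // Register the counter so Flink merges it across parallel subtasks.
            getRuntimeContext().addAccumulator(ACC_NAME, counter);
        }

        @Override
        public Long map(Long value) {
            counter.add(1L);
            return value;
        }
    }

    /** Runs a small batch job and returns how many records left first(n). */
    public static long countAfterFirst(long inputSize, int n) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Placeholder input; the real job would use its own DataSet here.
        DataSet<Long> input = env.generateSequence(1, inputSize);

        input.first(n)
             .map(new CountingMapper())
             .output(new DiscardingOutputFormat<Long>());

        JobExecutionResult result = env.execute("first(n) count check");
        Long count = result.getAccumulatorResult(ACC_NAME);
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countAfterFirst(2_000_000L, 1_000_000));
    }
}
```

If the accumulator reports the full count while the UI shows less, that would point at the displayed metrics; if the accumulator itself is short, the records are really missing upstream.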
Since 3 separate tasks report a consistent number of incoming/outgoing records, I would rule out an issue in the metric system. These metrics are all counted separately from each other; having the same inconsistency everywhere is nigh impossible.
Is this reproducible, and is it possible for you to provide me with the job used?

On 05.04.2017 13:52, Flavio Pompermaier wrote: