count the amount of data successfully processed by flink

count the amount of data successfully processed by flink

zzzyw
Dear all:
  I use Flink for real-time data synchronization (MySQL, Oracle --> Kafka -->
MySQL, Oracle). I want to count how many records are synchronized
every day (and possibly over the last n days).

  What I am doing now: Flink metrics are sent to Pushgateway, and I then sum the
metric flink_taskmanager_job_task_numRecordsOut to count how many records
are synchronized every day.
However, I found that this metric resets after the Flink job restarts. How should
I deal with this problem? Or is there a better way to count?
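
To make the reset problem concrete, here is a minimal sketch in plain Java (no Flink or Prometheus dependency) of how a daily total could be derived from periodic samples of a counter that drops back to zero on restart. The class name, method name, and sample values are made up for illustration. The idea of treating any drop as a restart is the same one Prometheus applies in its `increase()` function; records emitted between the last sample and the restart are not visible to it.

```java
public class ResetAwareTotal {

    /**
     * Sums the growth of a counter across consecutive samples.
     * A sample lower than its predecessor is treated as a restart from zero,
     * so the whole current value is counted as new growth.
     */
    public static long totalIncrease(long[] samples) {
        long total = 0;
        for (int i = 1; i < samples.length; i++) {
            long prev = samples[i - 1];
            long cur = samples[i];
            total += (cur >= prev) ? (cur - prev) : cur;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical samples of numRecordsOut; the job restarts after 250.
        long[] samples = {0, 100, 250, 40, 90};
        System.out.println(totalIncrease(samples)); // prints 340
    }
}
```

A plain sum or "last minus first" over the same samples would report 90, which is the undercount described above.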

<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t3167/Snipaste_2021-05-24_10-51-22.png>

Best regards



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Re: count the amount of data successfully processed by flink

Guowei Ma
Hi
You are right that the metrics are reset after the job restarts. This is because metrics are stored only in memory.
You could instead store the count in Flink's state [1], which can be restored after the job restarts.
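
As a rough illustration of that idea, the sketch below models the snapshot/restore cycle in plain Java so it can run without Flink on the classpath. The class and method names are hypothetical. In a real job the same two hooks would presumably be `CheckpointedFunction.snapshotState` and `initializeState`, with the count kept in operator `ListState` and exposed through a gauge registered on `getRuntimeContext().getMetricGroup()`; that mapping is my assumption, since the page behind [1] is not shown here.

```java
public class CheckpointedCounterSketch {

    /** A record counter whose value survives a restart via snapshot/restore. */
    public static final class RecordCounter {
        private long count;

        public void record() { count++; }

        public long value() { return count; }

        /** Called at checkpoint time: returns the value to persist. */
        public long snapshot() { return count; }

        /** Called on (re)start: null means there is no earlier checkpoint. */
        public static RecordCounter restore(Long snapshot) {
            RecordCounter counter = new RecordCounter();
            counter.count = (snapshot == null) ? 0L : snapshot;
            return counter;
        }
    }

    public static void main(String[] args) {
        RecordCounter counter = RecordCounter.restore(null);
        for (int i = 0; i < 5; i++) counter.record();
        long checkpoint = counter.snapshot();          // checkpoint taken at 5

        counter.record();
        counter.record();                              // 7 in memory, then the job fails

        RecordCounter recovered = RecordCounter.restore(checkpoint);
        System.out.println("restored=" + recovered.value()); // restored=5

        // The source rewinds to the checkpoint, so the 2 lost records are replayed,
        // followed by 3 new ones.
        for (int i = 0; i < 5; i++) recovered.record();
        System.out.println("final=" + recovered.value());    // final=10
    }
}
```

Because the count is restored together with the source position, replayed records are counted once in the restored value, unlike an in-memory metric that starts again from zero.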


On Mon, May 24, 2021 at 10:59 AM zzzyw <[hidden email]> wrote:

Re:Re: count the amount of data successfully processed by flink

zzzyw


Hi Guowei:
  Thanks for your help; it solves my problem.




Best regards

At 2021-05-24 15:15:57, "Guowei Ma" <[hidden email]> wrote:
