what is the hash function that Flink creates the UID?

what is the hash function that Flink creates the UID?

Felipe Gutierrez
Hi there!

I am tracking the latency of my operators using "setLatencyTrackingInterval(10000)", and I can see the latency metrics in the browser at http://127.0.0.1:8081/jobs/<JOB_ID>/metrics. For each logical operator I set a .uid("operator_name"), and I know that Flink uses the UidHash to create a string for each operator. For example, my operator "A" has the hash code "2e588ce1c86a9d46e2e85186773ce4fd".

What is the hash function used to define this hash code?

I want to use the same hash function to be able to automatically monitor the 99th percentile latency. AFAIK, Flink does not provide a way to create an operator ID that has the operator name included [1][2]. Is there a specific reason for that?


--
-- Felipe Gutierrez
-- skype: felipe.o.gutierrez

Re: what is the hash function that Flink creates the UID?

Tzu-Li (Gordon) Tai
Hi,

Flink currently applies a 128-bit murmur hash to the user-provided uids to
generate the final node hashes in the stream graph. Specifically, Guava's
murmur3_128 [1] is used as the hash function.

If what you are looking for is for Flink to use exactly the provided hash,
you can use `setUidHash` for that - Flink will use that provided uid hash as
is for the generated node hashes.
However, that was exposed as a manual workaround to preserve backwards
compatibility in legacy cases that would otherwise break, so it is not
advised to use it in your case.

BR,
Gordon

[1]
https://guava.dev/releases/19.0/api/docs/com/google/common/hash/Hashing.html#murmur3_128(int)
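As a rough illustration of the hashing described above, here is a minimal sketch that feeds a uid string through Guava's murmur3_128. The class name UidHashSketch and the helper uidToHash are hypothetical, and the seed (0) and the UTF-8 encoding are assumptions not stated in this thread, so please check the result against the hash your Flink version actually reports before relying on it.

```java
import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;

final class UidHashSketch {

    // Hypothetical helper: hashes a user-provided uid with Guava's 128-bit murmur3.
    // Assumptions (not confirmed in this thread): seed 0 and UTF-8 encoding.
    static String uidToHash(String uid) {
        return Hashing.murmur3_128(0)
                .newHasher()
                .putString(uid, StandardCharsets.UTF_8)
                .hash()
                .toString(); // lowercase hex, 32 characters for 128 bits
    }

    public static void main(String[] args) {
        // Pass the uid as the first argument, or fall back to a placeholder.
        String uid = args.length > 0 ? args[0] : "operator_name";
        System.out.println(uid + " -> " + uidToHash(uid));
    }
}
```

If the printed value for one of your uids matches the operator ID shown in the metrics endpoint, the same helper could be used to map operator names to metric IDs in a monitoring script. If it does not match, the seed or encoding assumption is probably wrong for your version.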


