Hi,
I'm running a long-running Flink job in cluster mode and I'm interested in using the queryable state functionality.
I have the following problem: when I query the Flink task managers (i.e. the queryable state proxy), the query can hit a task manager that doesn't hold the requested state, because the job is not running on that task manager.
For example, I might have a cluster with 5 task managers, but the job is deployed on only 3 of them. If my query hits either of the two idle task managers, I naturally get an error message saying that the job does not exist.
My current solution is to size the cluster appropriately so that there are no idle task managers. I was wondering whether there is a better solution, or whether this could be handled better in the future?
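
For illustration, a client-side fallback could look roughly like the sketch below: try each known proxy host in turn and return the first answer. This is only a sketch under assumptions. The class name `ProxyFallback`, the method `queryFirstAvailable` and the host names are hypothetical, and the `query` function is a stand-in for a real call (for example one that wraps Flink's `QueryableStateClient` and blocks on the returned future); it is assumed to throw a `RuntimeException` when the contacted task manager does not know the job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ProxyFallback {

    /**
     * Tries each proxy host in order and returns the first successful result.
     * The query function is a hypothetical placeholder for the real client call;
     * it is assumed to throw a RuntimeException if the host cannot answer.
     */
    public static <T> T queryFirstAvailable(List<String> proxyHosts,
                                            Function<String, T> query) {
        List<RuntimeException> failures = new ArrayList<>();
        for (String host : proxyHosts) {
            try {
                return query.apply(host);
            } catch (RuntimeException e) {
                // Remember the failure and move on to the next host.
                failures.add(e);
            }
        }
        // Every host failed: report all collected causes together.
        IllegalStateException all = new IllegalStateException(
                "None of the " + proxyHosts.size() + " proxies could answer the query");
        for (RuntimeException e : failures) {
            all.addSuppressed(e);
        }
        throw all;
    }

    public static void main(String[] args) {
        // Hypothetical host names; tm-1 and tm-2 play the idle task managers.
        List<String> hosts = Arrays.asList("tm-1", "tm-2", "tm-3");
        String result = queryFirstAvailable(hosts, host -> {
            if (!host.equals("tm-3")) {
                throw new IllegalStateException("job not found on " + host);
            }
            return "state from " + host;
        });
        System.out.println(result); // prints "state from tm-3"
    }
}
```

This keeps the cluster sizing independent of the job's parallelism, at the cost of extra round trips whenever a query lands on an idle task manager first.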
Thanks in advance.
Kind regards,
Martin