Hi,
I came across an issue while submitting a job via the Flink CLI client with Flink 1.7.1 in high-availability mode.
Setup:
Flink version: 1.7.1
Cluster: K8s
Mode: High availability with 2 JobManagers
CLI command:
./bin/flink run -d -c MyExample /myexample.jar
The CLI runs inside a K8s job and submits the Flink job to the Flink cluster. The K8s job spec allows up to 3 attempts to submit the job.
Result:
2019-09-11 22:32:12.908 [Flink-RestClusterClient-IO-thread-4] level=DEBUG org.apache.flink.runtime.rest.RestClient - Sending request of class class org.apache.flink.runtime.rest.messages.job.JobSubmitRequestBody to job-jm-1.job-jm-svc.job-namespace.svc.cluster.local:8081/v1/jobs
2019-09-11 22:32:14.186 [flink-rest-client-netty-thread-1] level=ERROR org.apache.flink.runtime.rest.RestClient - Response was not valid JSON.
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
at [Source: org.apache.flink.shaded.netty4.io.netty.buffer.ByteBufInputStream@2b88f8bb; line: 1, column: 0]
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:256)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3851)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792)
at org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2272)
at org.apache.flink.runtime.rest.RestClient$ClientHandler.readRawResponse(RestClient.java:504)
at org.apache.flink.runtime.rest.RestClient$ClientHandler.channelRead0(RestClient.java:452)
………
2019-09-11 22:32:14.186 [flink-rest-client-netty-thread-1] level=ERROR org.apache.flink.runtime.rest.RestClient - Unexpected plain-text response:
……..
The job submission fails after exhausting the number of retries.
Observations:
I looked into the debug logs and the Flink code and came to the following conclusions:
Open questions:
Could someone check and confirm my observations above and help answer the questions?
I highly appreciate your time and help.
~ Abhinav Bajaj
CLI Logs -
2019-09-11 22:30:31.077 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - Leader node has changed.
2019-09-11 22:30:31.171 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=http://job-jm-1.job-jm-svc.job-namespace.svc.cluster.local:8081, session ID=c1422a1b-a6b8-43b0-85d7-87b95af16932.
……
2019-09-11 22:30:31.270 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - Leader node has changed
2019-09-11 22:30:31.270 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/dispatcher, session ID=4e4d03d5-2abe-449c-af2e-df2e0cd80e26
job-jm-0 Logs -
2019-09-11 22:29:59.781 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/resourcemanager, session ID=e1f026b1-e368-4524-9fab-2e031423f74f.
2019-09-11 22:29:59.876 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/dispatcher, session ID=4e4d03d5-2abe-449c-af2e-df2e0cd80e26.
job-jm-1 Logs -
2019-09-11 22:29:59.889 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/resourcemanager, session ID=e1f026b1-e368-4524-9fab-2e031423f74f.
2019-09-11 22:29:59.976 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/dispatcher, session ID=4e4d03d5-2abe-449c-af2e-df2e0cd80e26.
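To line up the leader information across the three logs above, a small script along these lines could be used. It is only a sketch: the function name `extract_leaders` and the regular expression are assumptions derived from the log format shown in this post, not anything provided by Flink, and the sample input is simply two of the CLI log lines quoted above.

```python
import re

# Assumed pattern for the "New leader information" DEBUG lines as they
# appear in this post. The address is matched lazily up to ", session ID="
# because the archived addresses contain a space ("[hidden email]").
LEADER_RE = re.compile(
    r"New leader information: Leader=(?P<address>.+?), "
    r"session ID=(?P<session>[0-9a-f-]{36})"
)


def extract_leaders(log_text):
    """Return (address, session_id) pairs in the order they appear in the log."""
    return [
        (m.group("address"), m.group("session"))
        for m in LEADER_RE.finditer(log_text)
    ]


# Two lines copied from the CLI logs in this post.
sample = """\
2019-09-11 22:30:31.171 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=http://job-jm-1.job-jm-svc.job-namespace.svc.cluster.local:8081, session ID=c1422a1b-a6b8-43b0-85d7-87b95af16932.
2019-09-11 22:30:31.270 [main-EventThread] level=DEBUG o.a.f.r.leaderretrieval.ZooKeeperLeaderRetrievalService - New leader information: Leader=akka.tcp://[hidden email]:6126/user/dispatcher, session ID=4e4d03d5-2abe-449c-af2e-df2e0cd80e26
"""

if __name__ == "__main__":
    for address, session in extract_leaders(sample):
        print(session, address)
```

Running the same function over the job-jm-0 and job-jm-1 logs makes it easy to see which session IDs the client and each JobManager observed for the same component.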