Re: Jira issue Flink-11127
Posted by
Boris Lublinsky on
URL: http://deprecated-apache-flink-user-mailing-list-archive.369.s1.nabble.com/Jira-issue-Flink-11127-tp26180p26282.html
Hi Boris,
the exact command depends on the docker-entrypoint.sh script and the image you are using. For the example contained in the Flink repository it is "task-manager", I think. The important thing is to pass "taskmanager.host" to the Taskmanager process. You can verify by checking the Taskmanager logs. These should contain lines like below:
2019-02-21 08:03:00,004 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - Program Arguments:
2019-02-21 08:03:00,008 INFO org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - -Dtaskmanager.host=10.12.10.173
In the Jobmanager logs you should see that the Taskmanager is registered under the IP above in a line similar to:
2019-02-21 08:03:26,874 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Registering TaskManager with ResourceID a0513ba2c472d2d1efc07626da9c1bda (akka.tcp://flink@10.12.10.173:46531/user/taskmanager_0) at ResourceManager
A service per Taskmanager is not required. The purpose of the config parameter is that the Jobmanager addresses the taskmanagers by IP instead of hostname.
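If you want to automate the two log checks described above, a minimal Python sketch could look like the following. The regular expressions are modeled only on the log lines quoted in this message, so treat them as assumptions; other Flink versions or logging layouts may format these lines differently. The function names are hypothetical helpers, not part of Flink.

```python
import re

# Assumption: patterns are derived from the log excerpts quoted in this thread.
HOST_ARG = re.compile(r"-Dtaskmanager\.host=(\S+)")
REGISTERED = re.compile(
    r"Registering TaskManager with ResourceID \S+ "
    r"\(akka\.tcp://\s*flink@([^:/\s]+):\d+/"
)


def configured_host(taskmanager_log):
    """Return the value passed as -Dtaskmanager.host, or None if it is absent."""
    match = HOST_ARG.search(taskmanager_log)
    return match.group(1) if match else None


def registered_hosts(jobmanager_log):
    """Return the host part of every TaskManager registration found in the log."""
    return REGISTERED.findall(jobmanager_log)
```

If the value returned by configured_host() for a Taskmanager also shows up in registered_hosts() for the Jobmanager log, the Taskmanager registered under the Pod IP as intended.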
Hope this helps!
Cheers,
Konstantin
On Wed, Feb 20, 2019 at 4:37 PM Boris Lublinsky <[hidden email]> wrote:
Also, the suggested workaround does not quite work.
I think the problem is that it's trying to connect to flink-task-manager-1
Using busybox to experiment with nslookup, I can see
/ # nslookup flink-taskmanager-1.flink-taskmanager
Server: 10.0.11.151
Name: flink-taskmanager-1.flink-taskmanager
Address 1: 10.131.2.136 flink-taskmanager-1.flink-taskmanager.flink.svc.cluster.local
/ # nslookup flink-taskmanager-1
Server: 10.0.11.151
nslookup: can't resolve 'flink-taskmanager-1'
/ # nslookup flink-taskmanager-0.flink-taskmanager
Server: 10.0.11.151
Name: flink-taskmanager-0.flink-taskmanager
Address 1: 10.131.0.111 flink-taskmanager-0.flink-taskmanager.flink.svc.cluster.local
/ # nslookup flink-taskmanager-0
Server: 10.0.11.151
nslookup: can't resolve 'flink-taskmanager-0'
/ #
So the name should be suffixed with the service name. How do I force it? I suspect I am missing a config parameter.
Hi Boris,
the solution is actually simpler than it sounds from the ticket. The only thing you need to do is to set the "taskmanager.host" to the Pod's IP address in the Flink configuration. The easiest way to do this is to pass this config dynamically via a command-line parameter.
The Deployment spec could look something like this:
containers:
  - name: taskmanager
    [...]
    args:
      - "taskmanager.sh"
      - "start-foreground"
      - "-Dtaskmanager.host=$(K8S_POD_IP)"
    [...]
    env:
      - name: K8S_POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
Hope this helps and let me know if this works.
Best,
Konstantin
On Sun, Feb 17, 2019 at 9:51 PM Boris Lublinsky <[hidden email]> wrote:
Apparently there is a workaround for it.
Is it possible to provide the complete Helm chart for it?
Bits and pieces are in the ticket, but it would be nice to see the full chart.
--
Konstantin Knauf | Solutions Architect
+49 160 91394525
Follow us @VervericaData
--
Stream Processing | Event Driven | Real Time
--
Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
--
Data Artisans GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen