I am trying to run Flink on Kubernetes and push checkpoints to
Google Cloud Storage. Below is the Dockerfile:
`FROM flink:1.6.2-hadoop28-scala_2.11-alpine
RUN wget -O lib/gcs-connector-latest-hadoop2.jar \
      https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar && \
    wget http://ftp.fau.de/apache/flink/flink-1.6.2/flink-1.6.2-bin-hadoop28-scala_2.11.tgz && \
    tar xf flink-1.6.2-bin-hadoop28-scala_2.11.tgz && \
    mv flink-1.6.2/lib/flink-shaded-hadoop2* lib/ && \
    rm -r flink-1.6.2*`
But the checkpoints take around 2-3 seconds on average and up to 25
seconds at most, even though the state size is only around 100 KB.
The jobs are also being restarted with the error
`AsynchronousException{java.lang.Exception: Could not materialize checkpoint
1640 for operator groupBy`, and connections to the task managers are
sometimes lost.
Currently, I have set the heap size to 4096 MB.
--
Sent from:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/