I am trying to use Flink (1.3.0) with MapR (5.2.1). Accordingly, I built Flink for MapR as follows with Maven 3.1:

mvn clean install -DskipTests -Dscala.version=2.10.6 -Pvendor-repos,mapr -Dhadoop.version=2.7.0-mapr-1703 -Dzookeeper.version=3.4.5-mapr-1604

I then added /opt/mapr/lib/* to the Flink classpath, added a Datadog metrics entry to the config and, to test the config, started the Flink service via ./bin/jobmanager.sh start local. In the jobmanager logs, I see the following error:

ERROR org.apache.flink.runtime.metrics.MetricRegistry - Could not instantiate metrics reporter dghttp. Metrics might not be exposed/reported.
java.lang.IllegalStateException: Failed contacting Datadog to validate API key
    at org.apache.flink.metrics.datadog.DatadogHttpClient.validateApiKey(DatadogHttpClient.java:73)
    at org.apache.flink.metrics.datadog.DatadogHttpClient.<init>(DatadogHttpClient.java:61)
    at org.apache.flink.metrics.datadog.DatadogHttpReporter.open(DatadogHttpReporter.java:104)
    at org.apache.flink.runtime.metrics.MetricRegistry.<init>(MetricRegistry.java:129)
    at org.apache.flink.runtime.taskexecutor.TaskManagerServices.fromConfiguration(TaskManagerServices.java:188)
    at org.apache.flink.runtime.taskmanager.TaskManager$.startTaskManagerComponentsAndActor(TaskManager.scala:1921)
    at org.apache.flink.runtime.jobmanager.JobManager$.startJobManagerActors(JobManager.scala:2322)
    at org.apache.flink.runtime.jobmanager.JobManager$.liftedTree3$1(JobManager.scala:2053)
    at org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:2052)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$2.apply$mcV$sp(JobManager.scala:2139)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$2.apply(JobManager.scala:2117)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$2.apply(JobManager.scala:2117)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.flink.runtime.jobmanager.JobManager$.retryOnBindException(JobManager.scala:2172)
    at org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:2117)
    at org.apache.flink.runtime.jobmanager.JobManager$$anon$10.call(JobManager.scala:1992)
    at org.apache.flink.runtime.jobmanager.JobManager$$anon$10.call(JobManager.scala:1990)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1990)
    at org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:961)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
    at org.apache.flink.shaded.okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:268)
    at org.apache.flink.shaded.okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:238)
    at org.apache.flink.shaded.okhttp3.internal.connection.RealConnection.connect(RealConnection.java:149)
    at org.apache.flink.shaded.okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:192)
    at org.apache.flink.shaded.okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
    at org.apache.flink.shaded.okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)

This error disappears when I remove the MapR libs from the Flink classpath. I encounter a similar error (SSL handshake exception, PKIX path building failed) when I use the aws-sdk (1.11.123) jar in my code and submit that code to Flink.

I think the shaded libs are causing this error. Am I right in assuming that? Is there any workaround for this? |
This looks more like a certification problem as described here:
https://github.com/square/okhttp/issues/2746

I don't think that shading could have anything to do with this.

On 26.06.2017 00:09, ani.desh1512 wrote:
> I think the shaded libs are causing this error. Am I right in assuming that?
> Is there any workaround for this? |
Okay, just curious, because the original poster mentioned that the behavior changes when the MapR dependencies are removed.
Maybe these dependencies change the trust-store or certificate-store providers?

On Mon, Jun 26, 2017 at 2:35 PM, Chesnay Schepler <[hidden email]> wrote:
> This looks more like a certification problem as described here: https://github.com/square/okht |
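One way to probe this hypothesis is to run a small check twice, once with and once without /opt/mapr/lib/* on the classpath, and compare the output. The sketch below is only an illustration: the class name TrustStoreCheck is hypothetical, and it assumes that any override happens through the standard javax.net.ssl.trustStore system property or the JVM default trust manager. If a library installs its own trust store only after some of its classes are loaded, this standalone check may not show it.

```java
import java.security.KeyStore;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical diagnostic class, not part of Flink or MapR.
class TrustStoreCheck {

    /** Returns the trust store path set via system property, or a marker if it is unset. */
    public static String configuredTrustStore() {
        return System.getProperty("javax.net.ssl.trustStore", "<unset: JDK default>");
    }

    /** Counts the CA certificates accepted by the JVM's default trust managers. */
    public static int trustedIssuerCount() throws Exception {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        // Passing null asks JSSE to use its default trust store lookup,
        // which honours javax.net.ssl.trustStore when that property is set.
        tmf.init((KeyStore) null);
        int count = 0;
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                count += ((X509TrustManager) tm).getAcceptedIssuers().length;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("javax.net.ssl.trustStore = " + configuredTrustStore());
        System.out.println("trusted CA certificates  = " + trustedIssuerCount());
    }
}
```

A much smaller certificate count in the run that includes the MapR jars would be consistent with a replaced trust store, though it would not prove which component replaced it.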
As Stephan pointed out, this seems more like the MapR libs meddling with some jar. As I mentioned in the original question, I run into the same problem when I use the AWS SDK jar in my program. The error is as follows:

shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1069)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1035)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4169)
    at shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4116)
    at shaded.com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1365)
    at com.kabbage.common.S3Utility.readContentFromFilePath(S3Utility.java:32)
    at com.kabbage.s3Importer.StreamReader$2.map(StreamReader.java:77)
    at com.kabbage.s3Importer.StreamReader$2.map(StreamReader.java:68)
    at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
    at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:206)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:262)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
    at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302)
    at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296)
    at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1514)
    at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216)
    at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1026)
    at sun.security.ssl.Handshaker.process_record(Handshaker.java:961)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062)
    at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403)
    at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387)
    at shaded.org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:394)
    at shaded.org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:353)
    at shaded.com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:132)
    at shaded.org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
    at shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)

This error also disappears when I remove the MapR libs from the Flink classpath. But removing the MapR libs from the classpath means I CANNOT use maprfs for storing Flink checkpoints and for recovery.

I have also asked this question on the MapR community:
https://community.mapr.com/message/60591-flink-with-mapr-shading-issues |
The error that you mentioned seems to indicate that the certificates of some certification authorities could not be found. You may want to add them to the trust store of the application.

> On 26. Jun 2017, at 16:55, ani.desh1512 <[hidden email]> wrote:
>
> This error also disappears when I remove MapR libs from Flink classpath. But
> removing MapR libs from classpath means I CANNOT use maprfs for storing
> flink checkpoints and recovery. |
Pasting my reply from the MapR community thread:

So, I have found a temporary workaround for this. Here's what I did:

For the client, I found the password of /opt/mapr/conf/ssl_truststore via ssl-client.xml.

For both Datadog and Amazon S3, I found the certificate chains that they were using (I could do this easily in Firefox). I then downloaded these certs and added them to the MapR truststore, i.e. /opt/mapr/conf/ssl_truststore, e.g.:

keytool -import -alias s3 -file s3.cer -keystore /opt/mapr/conf/ssl_truststore -storepass <map_client_trustore_pass>

After doing this, the SSL handshake error that I was getting was resolved.

I say this is a temporary fix because if, in the future, Amazon S3 changes its root certificate issuer or its SSL certificate, we'll need to repeat these steps. Also, it is cumbersome to add these certs for EACH AND EVERY website that we are trying to access.

So, I think the issue is that once we add the MapR libs to the Flink classpath, MapR's keystore/truststore gets used, and that is what is causing the problem. Is this the expected behaviour? Would this be a bug on the Flink side, on the MapR side, or (worse) even in the base Apache HTTP client used by AWS and Datadog? |
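To confirm that an import like the keytool step above actually landed in the store, the aliases can be listed programmatically. This is a minimal sketch, not something from the thread: the class name TrustStoreAliases is hypothetical, the store path and password are passed in by the caller, and it assumes the store is in a format that the JVM's default key store type can read.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of Flink or MapR.
class TrustStoreAliases {

    /** Loads the key store at the given path and returns the aliases it contains. */
    public static List<String> aliases(Path store, char[] password) throws Exception {
        KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(store)) {
            ks.load(in, password);
        }
        return Collections.list(ks.aliases());
    }

    public static void main(String[] args) throws Exception {
        // Usage: java TrustStoreAliases.java <truststore-path> <password>
        // Both values come from the caller; nothing is hard-coded here.
        for (String alias : aliases(Paths.get(args[0]), args[1].toCharArray())) {
            System.out.println(alias);
        }
    }
}
```

After the import shown above, the alias given with -alias (s3 in that example) would be expected to appear in the printed list.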
Again, as I mentioned in the MapR thread:

So, after some more digging, I found out that you can make Flink use the default Java truststore by passing the following as JVM_ARGS for Flink:

-Djavax.net.ssl.trustStore=$JAVA_HOME/jre/lib/security/cacerts

I tested this approach with AWS and Datadog, along with MapR Streams and Tables, and it seems to work so far.

I am not sure if this is the right approach, but if it is, then we should include it in the Flink MapR documentation. |
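Before hard-coding that path, it can help to check where the running JVM keeps its default cacerts file. The sketch below is an illustration with a hypothetical class name (DefaultCacerts). It assumes the standard layout in which cacerts sits under lib/security below java.home; on a Java 8 JDK, java.home normally points at the jre subdirectory, which would match the $JAVA_HOME/jre/lib/security/cacerts path used above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of Flink or MapR.
class DefaultCacerts {

    /** Path of the cacerts file below the running JVM's java.home. */
    public static Path path() {
        return Paths.get(System.getProperty("java.home"), "lib", "security", "cacerts");
    }

    /** The trust store setting formatted as a JVM argument. */
    public static String jvmArg() {
        return "-Djavax.net.ssl.trustStore=" + path();
    }

    public static void main(String[] args) {
        System.out.println(jvmArg());
        // Some distributions relocate or symlink cacerts, so verify it is present.
        System.out.println("exists: " + Files.exists(path()));
    }
}
```

Running it with the same JVM that starts Flink prints the argument to pass and whether the file exists at that location.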
I would say that this is a MapR issue.
It's a good idea to add it to the docs in case someone else stumbles upon this. Would be great if you could open a JIRA for that.

On 27.06.2017 19:35, ani.desh1512 wrote:
> I am not sure if this is the right approach, but if it indeed is then we
> should include it in the Flink Mapr documentation. |
Cool.
For future reference, I created a JIRA ticket: https://issues.apache.org/jira/browse/FLINK-7033 Thanks for all the help, guys. |