Hello,
When upgrading from flink-1.3.2 to flink-1.4.2, I hit this error at runtime in a Flink job:
java.util.ServiceConfigurationError: An SPI class of type org.apache.lucene.codecs.PostingsFormat
with classname org.apache.lucene.search.suggest.document.Completion50PostingsFormat does not exist, please fix the file 'META-INF/services/org.apache.lucene.codecs.PostingsFormat' in your classpath.
I added the lucene-suggest dependency and then encountered this:
java.lang.ClassCastException: class org.elasticsearch.search.suggest.completion2x.Completion090PostingsFormat
The Flink job runs Lucene queries on a data stream which ends up in an Elasticsearch index.
It seems to me that this exception is a side effect of shading the flink-connector-elasticsearch-5
dependencies. So far, the only solution I have found is to rebuild the flink-connector-elasticsearch-5
jar, excluding META-INF/services/org.apache.lucene.codecs.*
I would highly appreciate any opinion on this workaround. Could it have side effects?
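To make the workaround concrete, here is a minimal sketch of the "rebuild the jar without those service files" step, using only the JDK. The class and method names (ServiceFileStripper, copyWithoutLuceneCodecServices) are hypothetical and not part of Flink or Lucene; the sketch also assumes the jar is unsigned and that dropping the entries is really what you want.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: copies a jar while dropping the Lucene codec service files.
class ServiceFileStripper {
    // Prefix taken from the workaround described above.
    static final String EXCLUDED_PREFIX = "META-INF/services/org.apache.lucene.codecs.";

    static boolean isExcluded(String entryName) {
        return entryName.startsWith(EXCLUDED_PREFIX);
    }

    // Returns the number of entries that were left out of the target jar.
    static int copyWithoutLuceneCodecServices(Path source, Path target) throws IOException {
        int skipped = 0;
        byte[] buffer = new byte[8192];
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(source));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(target))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (isExcluded(entry.getName())) {
                    skipped++;
                    continue;
                }
                // A fresh ZipEntry lets the output stream recompute sizes and CRC.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                out.closeEntry();
            }
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ServiceFileStripper <input.jar> <output.jar>");
            return;
        }
        int skipped = copyWithoutLuceneCodecServices(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Skipped " + skipped + " service file(s)");
    }
}
```

In a Maven build the same effect is usually obtained with an exclude filter in the shade plugin configuration rather than by post-processing the jar; the sketch only illustrates what gets removed.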
Thanks. And by the way, congrats to all Flink contributors, this is a pretty good piece of technology!
Regards,
Manuel Haddadi
|
Hi Manuel,
thanks for reporting this issue. It sounds to me like a bug we should fix. I've pulled Gordon into the conversation since he will most likely know more about the ElasticSearch connector shading.
Cheers,
Till
On Thu, Mar 22, 2018 at 5:09 PM, Haddadi Manuel <[hidden email]> wrote:
|
Hi Manuel, Thanks a lot for reporting this! Yes, this issue is most likely related to the recent changes to shading the Elasticsearch connector dependencies, though it is a bit curious why I didn’t bump into it before while testing it.
Could you explain a bit more where the Lucene queries are executed? Were there other dependencies required for this?
Cheers, Gordon
On 23 March 2018 at 12:43:31 AM, Till Rohrmann ([hidden email]) wrote:
|
Hi Gordon, hi Till,
Thanks for your feedback. I am happy to contribute by clarifying how the bug occurred, if it helps.
First, to describe what my Flink job does in a bit more detail: part of its execution plan is a ProcessFunction which basically stores the events as Lucene documents in an in-memory Lucene index. When the number of documents reaches a threshold, the process function fires Lucene queries to filter the documents (and hence the events) according to user models.
This process function therefore depends on the Lucene modules lucene-core, lucene-queryparser and lucene-analyzers-common in version 6.3.0 (as a precaution we chose the same Lucene version as elasticsearch:5.1.2).
Later the event stream is sent to an Elasticsearch index via the flink-connector-elasticsearch5 module.
I upgraded the Flink dependencies from version 1.3.2 to 1.4.2. When the job was deployed on a YARN cluster, it raised the error: java.util.ServiceConfigurationError: An SPI class of type org.apache.lucene.codecs.PostingsFormat with classname org.apache.lucene.search.suggest.document.Completion50PostingsFormat does not exist, please fix the file 'META-INF/services/org.apache.lucene.codecs.PostingsFormat' in your classpath.
So I checked META-INF/services/org.apache.lucene.codecs.PostingsFormat in my job's fat jar. It listed several implementations of PostingsFormat to be loaded:
org.apache.lucene.search.suggest.document.Completion50PostingsFormat
org.elasticsearch.search.suggest.completion2x.Completion090PostingsFormat
org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat
org.apache.lucene.codecs.idversion.IDVersionPostingsFormat
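For anyone who wants to repeat this check, here is a minimal sketch that lists every copy of the service file visible on the classpath and reports the entries that cannot be loaded. It assumes the usual layout of such files (one class name per line, '#' starting a comment); the class name ServiceFileCheck and its methods are hypothetical, and the sketch does not reproduce Lucene's own SPI loader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, JDK only.
class ServiceFileCheck {
    // Extracts provider class names, ignoring blank lines and '#' comments.
    static List<String> parseProviders(String content) {
        List<String> names = new ArrayList<>();
        for (String line : content.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    // Returns the provider names that the given class loader cannot resolve.
    static List<String> findUnresolvable(List<String> providers, ClassLoader loader) {
        List<String> missing = new ArrayList<>();
        for (String name : providers) {
            try {
                Class.forName(name, false, loader); // load without initializing
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        String resource = "META-INF/services/org.apache.lucene.codecs.PostingsFormat";
        ClassLoader loader = ServiceFileCheck.class.getClassLoader();
        Enumeration<URL> urls = loader.getResources(resource);
        while (urls.hasMoreElements()) {
            URL url = urls.nextElement();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String content = reader.lines().collect(Collectors.joining("\n"));
                List<String> missing = findUnresolvable(parseProviders(content), loader);
                System.out.println(url + " -> unresolvable: " + missing);
            }
        }
    }
}
```

Run with the job's fat jar on the classpath, any name printed as unresolvable is a candidate for the "SPI class ... does not exist" error.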
I don't know exactly how the maven-shade-plugin operates, but it seems to merge identically named configuration files from different modules into a single file.
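The effect described here can be pictured with a small sketch: if service files with the same name are simply concatenated, the merged file advertises every provider from every jar, whether or not the classes end up on the classpath. Plain concatenation is an assumption about how the merge behaves, and ServiceFileMerge is a hypothetical name; the sample entries are the ones quoted in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of merging same-named service files by concatenation.
class ServiceFileMerge {
    // Appends each file's content in order, making sure each ends with a newline.
    static String merge(List<String> contents) {
        StringBuilder merged = new StringBuilder();
        for (String content : contents) {
            merged.append(content);
            if (!content.endsWith("\n")) {
                merged.append('\n');
            }
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        // Content assumed to come from the application's own Lucene dependency.
        String fromLucene =
                "org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat\n";
        // Content assumed to come from the shaded connector jar.
        String fromConnector =
                "org.apache.lucene.search.suggest.document.Completion50PostingsFormat\n"
              + "org.elasticsearch.search.suggest.completion2x.Completion090PostingsFormat\n";
        System.out.print(merge(Arrays.asList(fromLucene, fromConnector)));
    }
}
```

The merged output lists the two completion formats even though the application itself only asked for lucene-core, which matches the symptom above.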
For example, in elasticsearch-5.1.2.jar, the file org.apache.lucene.codecs.PostingsFormat is:
org.apache.lucene.search.suggest.document.Completion50PostingsFormat
In flink-connector-elasticsearch5_2.11-1.4.2.jar, the file org.apache.lucene.codecs.PostingsFormat is:
org.apache.lucene.search.suggest.document.Completion50PostingsFormat
org.elasticsearch.search.suggest.completion2x.Completion090PostingsFormat
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
org.apache.lucene.codecs.idversion.IDVersionPostingsFormat
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Since my job's fat jar inherits the configuration files in META-INF/services from its dependencies, I guess this is why the Lucene API tries at runtime to load some classes that are not on the classpath. This intuition was confirmed when I excluded the META-INF/services/org.apache.lucene.codecs.* files from flink-connector-elasticsearch5. The file org.apache.lucene.codecs.PostingsFormat in my jar then no longer led to a runtime exception:
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
org.apache.lucene.codecs.idversion.IDVersionPostingsFormat
I hope my explanation is clear enough. Don't hesitate to ask for more information if needed. I would also be glad if you pointed out any misunderstanding on my part, or even any misuse of the Flink framework (maybe the fact that we use a Lucene index as a micro-batch inside a Flink transformation).
Cheers,
Manuel
From: Tzu-Li (Gordon) Tai <[hidden email]>
Sent: Friday, 23 March 2018 10:40:52
To: Till Rohrmann; Haddadi Manuel
Cc: [hidden email]
Subject: Re: Lucene SPI class loading fails with shaded flink-connector-elasticsearch