Flink 1.12.2 sql api use parquet format error

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Flink 1.12.2 sql api use parquet format error

太平洋
ref: https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/parquet.html

env and error:

Flink version: 1.12.2
deployment: standalone kubernetes session
dependency:
        <dependency>
          <groupId>org.apache.flink</groupId>
          <artifactId>flink-parquet_2.11</artifactId>
          <version>1.12.2</version>
        </dependency>

错误信息:
Caused by: org.apache.flink.table.api.ValidationException: Could not find any format factory for identifier 'parquet' in the classpath. at org.apache.flink.table.filesystem.FileSystemTableSink.<init>(FileSystemTableSink.java:124) at org.apache.flink.table.filesystem.FileSystemTableFactory.createDynamicTableSink(FileSystemTableFactory.java:85)

Reply | Threaded
Open this post in threaded view
|

Re: Flink 1.12.2 sql api use parquet format error

Timo Walther
Hi,

can you check the content of the JAR file that you are submitting? There
should be a `META-INF/services` directory with a
`org.apache.flink.table.factories.Factory` file that should list the
Parque format.

See also here:

https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/overview/#transform-table-connectorformat-resources

Regards,
Timo


On 06.04.21 10:25, 太平洋 wrote:

> ref: https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/parquet.html
>
> env and error:
>
> Flink version: 1.12.2
> deployment: standalone kubernetes session
> dependency:
>          <dependency>
>            <groupId>org.apache.flink</groupId>
> <artifactId>flink-parquet_2.11</artifactId>
> <version>1.12.2</version>
>          </dependency>
>
> 错误信息:
> Caused by: org.apache.flink.table.api.ValidationException: Could not
> find any format factory for identifier 'parquet' in the classpath. at
> org.apache.flink.table.filesystem.FileSystemTableSink.<init>(FileSystemTableSink.java:124)
> at
> org.apache.flink.table.filesystem.FileSystemTableFactory.createDynamicTableSink(FileSystemTableFactory.java:85)
>

Reply | Threaded
Open this post in threaded view
|

回复: Flink 1.12.2 sql api use parquet format error

太平洋
After change pom.xml, new error:

org.apache.flink.runtime.rest.handler.RestHandlerException: Could not execute application. at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$1(JarRunHandler.java:108) at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) at java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.CompletionException: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606) ... 7 more Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration at org.apache.flink.formats.parquet.ParquetFileFormatFactory.getParquetConfiguration(ParquetFileFormatFactory.java:115) at org.apache.flink.formats.parquet.ParquetFileFormatFactory.access$000(ParquetFileFormatFactory.java:51) at org.apache.flink.formats.parquet.ParquetFileFormatFactory$2.createRuntimeEncoder(ParquetFileFormatFactory.java:103) at org.apache.flink.formats.parquet.ParquetFileFormatFactory$2.createRuntimeEncoder(ParquetFileFormatFactory.java:97) at org.apache.flink.table.filesystem.FileSystemTableSink.createWriter(FileSystemTableSink.java:373) at org.apache.flink.table.filesystem.FileSystemTableSink.createStreamingSink(FileSystemTableSink.java:183) at org.apache.flink.table.filesystem.FileSystemTableSink.consume(FileSystemTableSink.java:145) at org.apache.flink.table.filesystem.FileSystemTableSink.lambda$getSinkRuntimeProvider$0(FileSystemTableSink.java:134) at org.apache.flink.table.planner.plan.nodes.common.CommonPhysicalSink.createSinkTransformation(CommonPhysicalSink.scala:95) at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:103) at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlanInternal(StreamExecSink.scala:43) at org.apache.flink.table.planner.plan.nodes.exec.ExecNode$class.translateToPlan(ExecNode.scala:59) at org.apache.flink.table.planner.plan.nodes.physical.stream.StreamExecSink.translateToPlan(StreamExecSink.scala:43) at org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:66) at org.apache.flink.table.planner.delegation.StreamPlanner$$anonfun$translateToPlan$1.apply(StreamPlanner.scala:65) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.Iterator$class.foreach(Iterator.scala:891) at scala.collection.AbstractIterator.foreach(Iterator.scala:1334) at scala.collection.IterableLike$class.foreach(IterableLike.scala:72) at scala.collection.AbstractIterable.foreach(Iterable.scala:54) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.AbstractTraversable.map(Traversable.scala:104) at org.apache.flink.table.planner.delegation.StreamPlanner.translateToPlan(StreamPlanner.scala:65) at org.apache.flink.table.planner.delegation.PlannerBase.translate(PlannerBase.scala:162) at org.apache.flink.table.api.internal.TableEnvironmentImpl.translate(TableEnvironmentImpl.java:1329) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:676) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:767) at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:666) at com.jd.app.StreamingJob.main(StreamingJob.java:146) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:349) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:219) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) at org.apache.flink.client.deployment.application.DetachedApplicationRunner.tryExecuteJobs(DetachedApplicationRunner.java:84) at org.apache.flink.client.deployment.application.DetachedApplicationRunner.run(DetachedApplicationRunner.java:70) at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleRequest$0(JarRunHandler.java:102) at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ... 7 more Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:64) at org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:65) at org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:48) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 48 more

When execute 'mvn clean package', I found some WARNING info:
[WARNING] Discovered module-info.class. Shading will break its strong encapsulation.
[WARNING] parquet-hadoop-1.11.1.jar, parquet-column-1.11.1.jar define 50 overlapping classes:
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntArrays
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntIterators$EmptyIterator
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.AbstractIntCollection
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntIterators$UnmodifiableIterator
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.AbstractIntList
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.Arrays
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntArrays$1
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntArrays$ArrayHashStrategy
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntComparator
[WARNING]   - shaded.parquet.it.unimi.dsi.fastutil.ints.IntArrayList$1
[WARNING]   - 40 more...

------------------ 原始邮件 ------------------
发件人: "Timo Walther" <[hidden email]>;
发送时间: 2021年4月8日(星期四) 下午3:15
收件人: "user"<[hidden email]>;
主题: Re: Flink 1.12.2 sql api use parquet format error

Hi,

can you check the content of the JAR file that you are submitting? There
should be a `META-INF/services` directory with a
`org.apache.flink.table.factories.Factory` file that should list the
Parque format.

See also here:

https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/overview/#transform-table-connectorformat-resources

Regards,
Timo


On 06.04.21 10:25, 太平洋 wrote:

> ref: https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/parquet.html
>
> env and error:
>
> Flink version: 1.12.2
> deployment: standalone kubernetes session
> dependency:
>          <dependency>
>            <groupId>org.apache.flink</groupId>
> <artifactId>flink-parquet_2.11</artifactId>
> <version>1.12.2</version>
>          </dependency>
>
> 错误信息:
> Caused by: org.apache.flink.table.api.ValidationException: Could not
> find any format factory for identifier 'parquet' in the classpath. at
> org.apache.flink.table.filesystem.FileSystemTableSink.<init>(FileSystemTableSink.java:124)
> at
> org.apache.flink.table.filesystem.FileSystemTableFactory.createDynamicTableSink(FileSystemTableFactory.java:85)
>