When we submit a job that uses Hive UDFs, the job depends on the UDFs' jars and configuration files.

We already store those jars and configuration files in the Hive metastore, so we expected that Flink could obtain their HDFS paths through the hive-connector and then fetch the files from HDFS at runtime. In this code it appears that the UDF resource paths are already available in FunctionInfo, but they are not used:

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80

Right now we maintain a copy of the same data as the Hive metastore on the Flink client, and keeping those files in sync by hand is a big burden. So we are trying to find a way to avoid manually submitting the UDF resources when we submit a job. Is that possible?

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Husky,
I guess https://issues.apache.org/jira/browse/FLINK-14055 is what is needed to make this feature possible. @Rui: Do you know more about this issue and its current limitations?

Regards,
Timo

On 18.09.20 09:11, Husky Zeng wrote:
> When we submit a job that uses Hive UDFs, the job depends on the UDFs'
> jars and configuration files.
>
> We already store those jars and configuration files in the Hive
> metastore, so we expected that Flink could obtain their HDFS paths
> through the hive-connector and then fetch the files from HDFS at
> runtime.
>
> In this code it appears that the UDF resource paths are already
> available in FunctionInfo, but they are not used:
>
> https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/module/hive/HiveModule.java#L80
>
> We currently submit the UDF jars and configuration files to YARN
> together with the job from the client, and we are trying to find a way
> to avoid submitting those resources when we submit a job. Is that
> possible?
Hi Timo,
Thanks for your attention. As I said in the comment below, this feature would certainly solve our problem, but its workload seems much larger than what my scenario needs. Our project urgently needs to reuse the Hive UDFs kept in the Hive metastore, so we lean toward a faster solution, and I would like to hear the community's advice.

https://issues.apache.org/jira/browse/FLINK-19335?focusedCommentId=17199927&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17199927

Best Regards,
Husky Zeng
Hi Timo,

I believe the blocker for this feature is that we don't support dynamically adding user jars/resources at the moment. We're able to read the path to the function jar from the Hive metastore, but we cannot load the jar after the user session is started.

Cheers,
Rui Li

On Tue, Sep 22, 2020 at 3:43 PM Timo Walther <[hidden email]> wrote:
> Hi Husky,
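For readers following the thread: the dynamic loading Rui describes boils down to creating a fresh classloader over the jar (whose path was read from the metastore) after the session is already up, and resolving the UDF class through it. Below is a minimal, Flink-free sketch of just that mechanism. The `MyUpper` class, its `eval` method, and the compile-a-temp-class setup are stand-ins invented for illustration (they replace fetching a real jar from HDFS); nothing here is Flink's actual API.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DynamicUdfSketch {

    // Compiles a toy UDF class, then loads it through a classloader created
    // afterwards -- the step a session would need in order to pick up a jar
    // fetched from HDFS after startup.
    public static String loadAndEval(String input) throws Exception {
        // 1. Stand-in for the UDF jar content: a tiny class with eval().
        Path dir = Files.createTempDirectory("udf");
        Path src = dir.resolve("MyUpper.java");
        Files.write(src, ("public class MyUpper {\n"
                + "  public String eval(String s) { return s.toUpperCase(); }\n"
                + "}\n").getBytes(StandardCharsets.UTF_8));

        // 2. Compile it (requires running on a JDK, not a bare JRE).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        int rc = compiler.run(null, null, null, src.toString());
        if (rc != 0) {
            throw new IllegalStateException("compilation failed");
        }

        // 3. Load the class via a classloader created *after* "session
        //    start", then invoke the UDF reflectively.
        try (URLClassLoader loader =
                     new URLClassLoader(new URL[]{dir.toUri().toURL()})) {
            Class<?> udfClass = loader.loadClass("MyUpper");
            Object udf = udfClass.getDeclaredConstructor().newInstance();
            return (String) udfClass.getMethod("eval", String.class)
                    .invoke(udf, input);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(loadAndEval("hello")); // prints HELLO
    }
}
```

The hard part in Flink is not the classloader itself but wiring such late-loaded classes into the session's existing user classloader consistently across client, JobManager, and TaskManagers, which is why this is tracked as its own feature.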