My data comes from an HBase table; it looks like a List[(rowkey, Map[String, String])].
class MySplittableIterator extends SplittableIterator[String] {

  // Members declared in java.util.Iterator
  def hasNext(): Boolean = {
  }
  def next(): String = {
  }

  // Members declared in org.apache.flink.util.SplittableIterator
  def getMaximumNumberOfSplits(): Int = {
  }
  def split(num: Int): Array[Iterator[String]] = {
  }
}

I do not know how to implement these methods; can you give me an example?

----- Original Message -----
From: Timo Walther <[hidden email]>
To: [hidden email]
Subject: Re: Re: fromParallelCollection
Date: 2016-09-06 17:03

Hi,
you have to implement a class that extends "org.apache.flink.util.SplittableIterator". The runtime will ask this class for multiple "java.util.Iterator"s over your split data. How you split your data and what an iterator looks like depend on your data and implementation. If you need more help, you should show us some examples of your data.

Timo

On 06/09/16 at 09:46, [hidden email] wrote:
fromCollection does not parallelize, and my data is huge, so I want to use env.fromParallelCollection(data), but I do not know how to initialize the data.
--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr
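To make the contract described above concrete, here is a minimal Scala sketch of a SplittableIterator over the List[(rowkey, Map[String, String])] data from this thread. To keep the snippet self-contained it uses a stand-in abstract class with the same method signatures as Flink's org.apache.flink.util.SplittableIterator; in a real job you would extend the Flink class instead. The round-robin split strategy is just one reasonable choice, not something prescribed by Flink.

```scala
import java.util.{Iterator => JIterator}
import scala.collection.JavaConverters._

// Stand-in with the same abstract methods as Flink's
// org.apache.flink.util.SplittableIterator, so this sketch compiles on its
// own; in a real job, extend the Flink class and delete this.
abstract class SplittableIterator[T] extends JIterator[T] with Serializable {
  def split(numPartitions: Int): Array[JIterator[T]]
  def getMaximumNumberOfSplits: Int
}

// A splittable iterator over List[(rowkey, Map[column, value])], matching
// the HBase-shaped data from the thread.
class MySplittableIterator(data: List[(String, Map[String, String])])
    extends SplittableIterator[(String, Map[String, String])] {

  private val underlying = data.iterator

  // java.util.Iterator: iterate over the full, unsplit collection
  override def hasNext: Boolean = underlying.hasNext
  override def next(): (String, Map[String, String]) = underlying.next()

  // There cannot be more non-empty splits than elements
  override def getMaximumNumberOfSplits: Int = data.size

  // Round-robin partitioning: split i receives elements i, i + num, ...
  // Always returns exactly `num` iterators (some possibly empty).
  override def split(num: Int): Array[JIterator[(String, Map[String, String])]] =
    Array.tabulate(num) { i =>
      data.iterator.zipWithIndex
        .collect { case (row, idx) if idx % num == i => row }
        .asJava
    }
}
```

With the real Flink class on the classpath, env.fromParallelCollection(new MySplittableIterator(data)) would let the runtime call split() with the source's parallelism and hand each parallel instance one of the iterators (the Scala API additionally needs an implicit TypeInformation for the element type in scope).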
If your data comes from HBase, it might also be worth implementing an HBase source. An HBase sink is currently in the making: https://github.com/apache/flink/pull/2332
Maybe it would be better to save your data in HDFS (e.g. as a CSV file) and use the built-in "readFile()". This handles the parallelism automatically.

On 06/09/16 at 14:56, [hidden email] wrote:
My data comes from an HBase table; it looks like a List[(rowkey, Map[String, String])].
--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr
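The CSV route suggested above requires flattening the nested Map into a line-based format first. A minimal sketch of that step follows; the "rowkey,column,value" layout is my own assumption, not something from the thread, and real data with commas in values would need quoting/escaping that this omits. Flink's CSV reader could then parse such lines back into tuples in parallel.

```scala
// Flatten List[(rowkey, Map[String, String])] into "rowkey,column,value"
// lines, one line per map entry. No quoting/escaping is done here, so
// values must not contain commas or newlines.
def toCsvLines(rows: List[(String, Map[String, String])]): List[String] =
  for {
    (rowkey, columns) <- rows
    (column, value)   <- columns.toList
  } yield s"$rowkey,$column,$value"
```

The resulting lines would be written to a file in HDFS, after which the built-in parallel file source can read them without any custom iterator code.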