Re: Multiple kafka consumers


Re: Multiple kafka consumers

gerryzhou
Hi Amol,

I think if you set the parallelism of the source operator equal to the number of partitions of the Kafka topic, you will get one Kafka consumer per partition in your job. But if the number of partitions of the topic is dynamic, the 1:1 relationship might break. Maybe @Gordon (CC'd) can give you more useful information.
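As a rough illustration of why this gives a 1:1 relationship, here is a simplified round-robin model of how a parallel source's subtasks could split up partitions. This is a sketch only, not Flink's actual internal assignment code; the class and method names are made up for the example.

```java
import java.util.*;

// Toy model of partition-to-subtask assignment (illustration only,
// not Flink's real internals).
public class PartitionAssignment {

    // Round-robin: partition i is read by subtask (i % parallelism).
    static Map<Integer, List<Integer>> assign(int numPartitions, int parallelism) {
        Map<Integer, List<Integer>> bySubtask = new HashMap<>();
        for (int p = 0; p < numPartitions; p++) {
            bySubtask.computeIfAbsent(p % parallelism, k -> new ArrayList<>()).add(p);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        // parallelism == partition count: every subtask owns exactly one partition
        Map<Integer, List<Integer>> oneToOne = assign(4, 4);
        oneToOne.forEach((subtask, parts) ->
                System.out.println("subtask " + subtask + " -> partitions " + parts));

        // If partitions are added later (6 partitions, parallelism still 4),
        // some subtasks read more than one partition and the 1:1 mapping breaks.
        System.out.println(assign(6, 4));
    }
}
```

With equal counts each subtask gets exactly one partition, so per-partition ordering maps directly onto one consumer each; once partition count exceeds parallelism, a subtask multiplexes several partitions.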

Best, Sihua



On 06/25/2018 17:19, [hidden email] wrote:
I have asked the same question on Stack Overflow as well.

Please answer it as soon as possible:

https://stackoverflow.com/questions/51020018/partition-specific-flink-kafka-consumer

-----------------------------------------------
*Amol Suryawanshi*
Java Developer
[hidden email]


*iProgrammer Solutions Pvt. Ltd.*



*Office 103, 104, 1st Floor Pride Portal,Shivaji Housing Society,
Bahiratwadi,Near Hotel JW Marriott, Off Senapati Bapat Road, Pune - 411016,
MH, INDIA.**Phone: +91 9689077510 | Skype: amols_iprogrammer*
www.iprogrammer.com <[hidden email]>
------------------------------------------------

On Mon, Jun 25, 2018 at 2:09 PM, Amol S - iProgrammer <[hidden email]> wrote:

Hello,

I wrote a streaming program using Kafka and Flink to stream the MongoDB
oplog. I need to maintain the order of the stream within each Kafka
partition. Since global ordering of records across all partitions is not
possible, I need N consumers for N different partitions. Is it possible to
consume data from N different partitions with N Flink Kafka consumers?

Please suggest.



Re: Multiple kafka consumers

zhangminglei
Hi, Amol

As @Sihua said. In my case too, if the Kafka topic has 80 partitions, I set the parallelism of the job's source operator to 80 as well.

Cheers
Minglei
On June 25, 2018 at 5:39 PM, sihua zhou <[hidden email]> wrote:



Re: Multiple kafka consumers

zhangminglei
Hi, Amol

Yes, I think so. But env.setParallelism(80) sets a global parallelism for all operators; whether that is what you want depends on your job. Setting the parallelism on just the source operator is enough.

As below, this gives you 80 Kafka consumers (that is, 80 running source tasks, each task being one consumer) for a topic with 80 partitions:
DataStream<JavaBeam> dataStream = env.addSource(kafkaConsumer08).setParallelism(80);
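The scoping rule can be modeled in a few lines of plain Java: an explicit operator-level parallelism overrides the environment-wide default. This is a toy model for illustration only, mirroring the relationship between env.setParallelism and an operator's setParallelism; the class and method names here are invented, not Flink API.

```java
// Toy model of how an explicit operator parallelism overrides the
// job-wide default (illustration only, not Flink's real classes).
public class ParallelismModel {

    // An operator-level setting, if present, wins over the environment default.
    static int effectiveParallelism(int envDefault, Integer operatorSetting) {
        return operatorSetting != null ? operatorSetting : envDefault;
    }

    public static void main(String[] args) {
        int envDefault = 4;              // like env.setParallelism(4)
        Integer sourceSetting = 80;      // like addSource(...).setParallelism(80)
        Integer mapSetting = null;       // no explicit setting on a map operator

        System.out.println("source subtasks: "
                + effectiveParallelism(envDefault, sourceSetting));
        System.out.println("map subtasks: "
                + effectiveParallelism(envDefault, mapSetting));
    }
}
```

So setting parallelism only on the source pins the consumer count to the partition count while leaving downstream operators at the job default.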

Cheers
Minglei

On June 25, 2018 at 6:02 PM, Amol S - iProgrammer <[hidden email]> wrote:

Thanks zhangminglei,

Does setting env.setParallelism(80) mean I have created 80 Kafka
consumers? And if so, can I change env.setParallelism to any number, i.e.
keep it equal to the number of partitions, or do I need to restart my job
each time I set a new parallelism? I want to write partition-specific data
transformation logic.

In short, I want to create N Flink Kafka consumers for N partitions.


On Mon, Jun 25, 2018 at 3:25 PM, zhangminglei <[hidden email]> wrote:
