
google cloud dataflow - Is there any way to reduce the number of messages read per second from PubSubIO?

Question:

I have a streaming Cloud Dataflow pipeline that reads from PubSubIO, with its PipelineOptions set to WorkerMachineType = n1-standard-1. This machine type has 3.75 GB of memory.

My problem is that when the subscription has a large backlog of messages, the pipeline reads them very quickly, and once it starts processing that many elements it runs out of memory.

Is there any way to reduce the number of messages read per second? Or is the memory consumption related to the time duration assigned to the window, so that I should reduce that duration?
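
For reference, the setup described above might look roughly like the sketch below. It is written against the current Apache Beam Java SDK for the Dataflow runner (not necessarily the SDK version in use here), and the subscription path, the 60-second fixed window, and the assumption that project/region are passed as command-line arguments are all illustrative placeholders rather than values taken from the question.

```java
import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

public class PubsubStreamingSketch {
  public static void main(String[] args) {
    // Project, region, etc. are assumed to come in via args (e.g. --project=..., --region=...).
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
    options.setRunner(DataflowRunner.class);
    options.setStreaming(true);
    // The configuration the question describes: small workers with 3.75 GB of RAM each.
    options.setWorkerMachineType("n1-standard-1");

    Pipeline p = Pipeline.create(options);
    p.apply("ReadFromPubSub",
            PubsubIO.readStrings()
                .fromSubscription("projects/my-project/subscriptions/my-sub")) // placeholder
        // "Window duration" refers to a transform like this one; downstream aggregations
        // buffer elements per window rather than over the whole unbounded stream.
        .apply("FixedWindow",
            Window.<String>into(FixedWindows.of(Duration.standardSeconds(60))));
    p.run();
  }
}
```
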

Thanks in advance.

Answer:

It sounds like you may be trying to process too much data with too few workers. We are looking at addressing this and related scenarios, but in the meantime you may want to try dialing down the amount of data you're ingesting, or increasing the number of workers available to the jobs.

You'll also get better performance with n1-standard-4 machines, which is why we make those the default for the streaming runner.
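
Concretely, the suggestion amounts to option changes rather than code changes, along the lines of the sketch below (again using the current Apache Beam Java SDK). The n1-standard-4 machine type comes from the answer; the specific worker counts are example values I've assumed, not numbers recommended above.

```java
import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class ScaledOptionsSketch {
  public static void main(String[] args) {
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
    options.setRunner(DataflowRunner.class);
    options.setStreaming(true);
    // Larger workers (4 vCPUs, 15 GB of RAM each), as recommended in the answer.
    options.setWorkerMachineType("n1-standard-4");
    // More workers to share the load; these counts are illustrative only.
    options.setNumWorkers(3);
    options.setMaxNumWorkers(10);
    // ... build and run the pipeline with these options as before.
  }
}
```
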
