Connection leak with flink elastic Sink

Vijay Bhaskar
Hi,
We are using the Flink Elasticsearch sink, which streams at a rate of 1000 events/sec, as described in https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/elasticsearch.html.
We are observing a leak of connections to Elasticsearch. After a few minutes, the open connections exceed the process limit on open file descriptors and the job is terminated, yet the HTTP connections to the Elasticsearch server remain open forever. Am I missing a configuration setting that closes a connection after its request has been served?
The sink documentation linked above does not describe any such setting.

Regards
Bhaskar
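
One way to confirm that descriptors are climbing toward the limit, rather than sitting at a steady level, is to sample the descriptor count of the process and compare it with its limit. The following is a minimal sketch, assuming a Linux host where /proc is readable; the function names are illustrative and are not part of Flink or Elasticsearch.

```python
import os


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only, needs read permission)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def parse_soft_nofile_limit(limits_text):
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns an int, or None if the row is missing or not numeric.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            soft = line.split()[3]
            return int(soft) if soft.isdigit() else None
    return None


def fd_headroom(pid):
    """Return (open_fds, soft_limit) for a process, both read from /proc."""
    with open(f"/proc/{pid}/limits") as f:
        limit = parse_soft_nofile_limit(f.read())
    return count_open_fds(pid), limit
```

Calling fd_headroom with the TaskManager's process ID every few seconds and logging the pair would show how quickly the count approaches the limit.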

Re: Connection leak with flink elastic Sink

Andrey Zagrebin
Hi Bhaskar,

I think Gordon might be able to help you; I am pulling him into the discussion.

Best,
Andrey

In reply to Vijay Bhaskar's post of 12 Dec 2018, 13:31.


Re: Connection leak with flink elastic Sink

Chesnay Schepler
In reply to this post by Vijay Bhaskar
Specifically which connector are you using, and which Flink version?


Re: Connection leak with flink elastic Sink

Tzu-Li (Gordon) Tai
Hi,

Besides the information that Chesnay requested, could you also provide a stack trace of the exception that caused the job to terminate in the first place?

The Elasticsearch sink does indeed close the internally used Elasticsearch client, which should in turn properly release all resources [1].
I would like to double-check whether that part of the code was simply never reached in your case.

Cheers,
Gordon
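
The question raised above, whether the sink's close path was ever reached, can be illustrated with a toy model. This is only a sketch under stated assumptions: FakeClient and ToySink are hypothetical stand-ins, not Flink or Elasticsearch classes, and the model says nothing about what actually happened in this job.

```python
class FakeClient:
    """Stand-in for an HTTP client; tracks how many instances are still open."""

    open_count = 0

    def __init__(self):
        FakeClient.open_count += 1
        self.closed = False

    def close(self):
        if not self.closed:
            self.closed = True
            FakeClient.open_count -= 1


class ToySink:
    """Toy sink with an open/invoke/close lifecycle (hypothetical)."""

    def open(self):
        self.client = FakeClient()

    def invoke(self, record):
        if record is None:
            raise ValueError("bad record")

    def close(self):
        self.client.close()


def run_task(records, call_close):
    """Run one task attempt; `call_close` models whether teardown is reached."""
    sink = ToySink()
    sink.open()
    try:
        for record in records:
            sink.invoke(record)
    except ValueError:
        pass  # the attempt fails; a real job would then be restarted
    finally:
        if call_close:
            sink.close()
```

In this model, each failed attempt that skips close() leaves one client open, so repeated restarts accumulate open clients, whereas attempts that always reach close() leave none behind.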


In reply to Chesnay Schepler's post of 13 December 2018, 5:59:34 PM.



Re: Connection leak with flink elastic Sink

Vijay Bhaskar
Hi Gordon,
We are using a Flink 1.6.1 cluster with the Elasticsearch connector flink-connector-elasticsearch6_2.11.
The stack trace is attached.

Below are the open file descriptor count of the TaskManager process (at the maximum limit) and the number of open connections to the Elasticsearch cluster.

# lsof -p 62041 | wc -l
65583

Connections to the Elasticsearch cluster:

# netstat -aln | grep 9200 | wc -l
2333

Regards,
Bhaskar
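
A raw count from grep 9200 includes every line that contains that string (for example a port such as 19200), and it does not show which TCP state the sockets are in. The sketch below groups connections by state for a given remote port. It assumes the Linux netstat column layout (proto, recv-q, send-q, local address, foreign address, state); the function names are illustrative only.

```python
import subprocess
from collections import Counter


def count_states(netstat_text, port=9200):
    """Count TCP connections whose remote port equals `port`, grouped by state."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # Compare only the port part after the last colon (works for IPv6 too)
        if foreign.rsplit(":", 1)[-1] == str(port):
            states[state] += 1
    return states


def read_netstat():
    """Run `netstat -an` and return its output (requires netstat on PATH)."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

count_states(read_netstat()) returns a Counter keyed by state. Mostly ESTABLISHED entries would suggest that the client is holding connections open, whereas many CLOSE_WAIT entries would suggest sockets that the remote side has closed but the local process has not.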




In reply to Tzu-Li (Gordon) Tai's post of Thu, Dec 13, 2018, 4:12 PM.

Attachment: stack_trace.txt (85K)

Re: Connection leak with flink elastic Sink

Tzu-Li (Gordon) Tai
Hi,

(Removed dev@ from the mail thread)

I took a look at the logs you provided, and it seems like the sink operators should have been properly torn down, thereby closing the RestHighLevelClient used internally.

At this point I’m not really sure what else could have caused this, other than a bug in the Elasticsearch client itself that prevents it from cleaning up properly.
Have you tried turning on debug-level logging to see if anything looks suspicious?

Cheers,
Gordon


In reply to Vijay Bhaskar's post of 13 December 2018, 7:35:33 PM.



Re: Connection leak with flink elastic Sink

Vijay Bhaskar
Sure, let me try with more debug logging and get back to you.

Regards
Bhaskar

In reply to Tzu-Li (Gordon) Tai's post of Fri, Dec 14, 2018, 4:41 PM.