Nuxeo DM 5.5 and PostgreSQL Locks

Hi all,

We are using Nuxeo DM 5.5 with PostgreSQL 8.4, and we are seeing behavior that is unusual (at least for us). For long periods Nuxeo works well: we can insert, upload, and delete documents, and so on. Then, after several operations, the system stops responding, although the server (Tomcat) and the database are both up. After some investigation we noticed that, while Nuxeo is not responding, there are transactions running in the database that I think are holding locks. After 10-15 minutes the transactions are closed, Nuxeo responds again and, incredibly, the last operation has completed correctly. Any suggestions for solving this issue? I attach evidence of the locks during a session where Nuxeo does not respond. Apologies if the image is too big. Thanks in advance. Giancarlo

[Image: evidence of locks while Nuxeo was not responding]

ANSWER

Please use the jstack command to get a stack trace of the Nuxeo server while it's doing something unknown. Having the stack trace would help us understand what it's doing.
05/10/2012

Here is part of the stack trace from when the problem happens. Thanks in advance.
05/11/2012

Seriously, this was unreadable. Such a huge stack trace is not suitable for this site. I removed it. Please post it to an external site designed for pastes (pastebin.com for instance) or give us just an extract.

Also, if what you're posting is not an answer to the initial question, then don't post it as an answer but as a comment.

05/11/2012

Sorry, I posted it at http://pastebin.com/berTiVL9. Thank you for your interest.
05/11/2012

The error "Resetting connection" means that the connection to the database was lost and had to be reset. You also see things like "An I/O error occured while sending to the backend" or "Connection timed out".

So there is a problem either with the network or on the database side.

05/11/2012

Thanks very much. We will investigate our infrastructure (firewalls, etc.).
05/11/2012

We investigated our infrastructure, and yes, we have a problem with firewalls that drop connections after a certain period of time. So we tried to specify the option <property name="tcpKeepAlive">true</property> in the datasource, but we found many places that declare <xa-datasource>org.postgresql.xa.PGXADataSource</xa-datasource>. We added the option in default-sql-directories-bundle.xml, but it does not work.
05/15/2012



To enable keep-alive on PostgreSQL connections you need to change:

  1. the datasources (server.xml or nuxeo.xml or the relevant templates) to add tcpKeepAlive=true to the <Resource> for your datasource(s),
  2. the repository configuration (default-repository-config.xml or its template) to add <property name="TcpKeepAlive">true</property>.
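
As a rough sketch of the two changes above (not a complete configuration): the first fragment assumes the option is passed to the <Resource> as a JDBC URL parameter, which is only one possible reading of step 1, and the resource name, host, port and database name are hypothetical placeholders. The second fragment shows the repository property next to the xa-datasource declaration; all other elements and attributes are omitted.

```xml
<!-- 1. Datasource (server.xml / nuxeo.xml or its template).
     Assumption: tcpKeepAlive is given as a PostgreSQL JDBC URL parameter.
     Name, host, port and database are placeholders; other attributes omitted. -->
<Resource name="jdbc/nuxeo" type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://db.example.com:5432/nuxeo?tcpKeepAlive=true" />

<!-- 2. Repository configuration (default-repository-config.xml or its template).
     Note the capital "T" in the property name. Other properties omitted. -->
<xa-datasource>org.postgresql.xa.PGXADataSource</xa-datasource>
<property name="TcpKeepAlive">true</property>
```

Note that the property name differs in case between the two places: tcpKeepAlive for the <Resource>, TcpKeepAlive for the repository configuration.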



Thank you very much. In fact we had already made this configuration, but the datasource related to the repository does not seem to take the property into account. The difference is that we typed "tcpKeepAlive" instead of "TcpKeepAlive". We will make this change as soon as possible and try again. Thanks again.
05/16/2012

Good morning, we made the suggested changes in the three datasources, adding "TcpKeepAlive", but it works on only one of them. Some data from our engineering team is at http://pastebin.com/MyRtwv56. Any suggestions? We are struggling with this problem. Thanks in advance.
05/17/2012

It works for me: the method setTcpKeepAlive of the XA datasource is called (checked with a breakpoint) when I add <property name="TcpKeepAlive">true</property> to the repository config.

Also note that you should not modify existing templates, because they could be overwritten on upgrades (including hotfixes). Instead, create a new custom template containing a copy of the file you want to modify, and use it. Please read up on how to use templates in Nuxeo.

05/21/2012

Hi Florent, thank you very much for your support. We have probably got some settings wrong. I set up a custom template that replaces the default postgres one. We tested it by changing the password property to a wrong value, and Nuxeo raised a wrong-password error, so we are sure that Nuxeo is using our custom template, but the TcpKeepAlive property has no effect. Only one datasource works with the tcpKeepAlive property, as mentioned in the previous comment. The XA datasources do not recognize the TcpKeepAlive property. We suspected the JDBC driver, but it works fine, because we have one datasource that establishes keep-alive connections. Giancarlo
05/23/2012

Short of breakpointing in RepositoryBackend.initialize to see why the call to BeanUtils.setProperty doesn't do what you want, I don't see how to debug this.
05/23/2012

Solved. We changed some settings on PostgreSQL; in particular, we enabled the keep-alive parameters (tcp_keepalives_idle and so on). Now all connections established between Nuxeo and PostgreSQL work fine. Thank you very much for your support.

Giancarlo

05/25/2012
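
For readers with the same problem, the server-side settings mentioned in the last comment might look something like the fragment below. The thread does not give the values that were actually used, so the numbers are illustrative assumptions only; the general idea is to keep tcp_keepalives_idle shorter than the firewall's idle timeout.

```
# postgresql.conf (server side); all values are illustrative assumptions
tcp_keepalives_idle = 60        # seconds of inactivity before the first keepalive probe
tcp_keepalives_interval = 10    # seconds to wait before resending an unanswered probe
tcp_keepalives_count = 5        # unanswered probes before the connection is considered dead
```

A value of 0 for any of these parameters leaves the operating system default in place.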