We recently moved one of our clients' WebSphere Application Server and WebSphere Portal based infrastructure to a new physical environment. During testing we found stale connection exceptions in SystemOut.log and in the application logs.
On digging a little deeper, our team tracked it down to the fact that the Cisco firewall was dropping our Oracle database connections after they had been inactive for a certain amount of time.
When our team discussed the issue with the network team, they were essentially told that all was good with the network infrastructure.
The application team tweaked some of the connection pool settings, but that only alleviated the issue slightly, and we then started encountering performance issues. Eventually, we figured out a solution that works!
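The solution that we ended up with was to work around this issue on the Oracle end. We lowered the sqlnet.expire_time parameter in sqlnet.ora on the Oracle 10g server from "30" (30 minutes), the value it had been set to, to "1" (1 minute). Note that Oracle's documented default for this parameter is 0, which disables the probe altogether.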
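For reference, the change amounts to a single line in sqlnet.ora on the database server. The fragment below is a minimal sketch; the file usually lives under $ORACLE_HOME/network/admin (or wherever TNS_ADMIN points), but that location is an assumption about a typical install, so check your own environment.

```
# sqlnet.ora on the Oracle database server
# Probe interval in minutes; 0 disables the probe
SQLNET.EXPIRE_TIME = 1
```

We would expect only sessions established after the change to pick up the new interval, so it is probably worth recycling the connection pool once the file has been updated.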
Why did this work?
The parameter SQLNET.EXPIRE_TIME is used to specify the time interval, in minutes, to send a probe to verify that client/server connections are active. If the probe finds a terminated connection, or a connection that is no longer in use, it returns an error, causing the server process to exit. This parameter is primarily intended for the database server so that it can free up server-side resources that are not in use.
A side effect of the probe is that there is TCP/IP activity between the client and the server each time the probe is sent, so the firewall treats the link as an active one. By lowering the probe interval to a minute, we managed to fool the firewall into leaving the database connections in the connection pool alone rather than terminating them, even when the client does not execute a query on a connection for an extended period of time.
I am sure you can tune this parameter to find the best value for your firewall and situation; dropping sqlnet.expire_time all the way to one minute is probably more aggressive than necessary. All the probe needs to do is fire more often than the firewall's idle timeout. Even though the probe generated by this setting is small, it still generates network traffic, which may matter in some situations.
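As a rough illustration of that tuning, here is a small Python sketch that picks a candidate probe interval from a known firewall idle timeout. The function name pick_expire_time and the "half the idle timeout" rule of thumb are our own assumptions for illustration, not something taken from Oracle or Cisco documentation; the only hard constraint it encodes is that the probe must fire before the firewall gives up on the connection.

```python
def pick_expire_time(firewall_idle_timeout_min: int, safety_factor: float = 0.5) -> int:
    """Suggest a SQLNET.EXPIRE_TIME value in whole minutes.

    Hypothetical helper: the safety_factor of 0.5 is an assumed rule of
    thumb, not a vendor recommendation.
    """
    if firewall_idle_timeout_min <= 1:
        # The parameter has minute granularity, so a probe cannot beat
        # an idle timeout of one minute or less.
        raise ValueError("firewall idle timeout must be greater than 1 minute")
    if not 0 < safety_factor < 1:
        raise ValueError("safety_factor must be between 0 and 1 (exclusive)")

    # Round down to whole minutes, but never below 1 (0 disables the probe).
    candidate = int(firewall_idle_timeout_min * safety_factor)
    return max(1, candidate)


if __name__ == "__main__":
    # Example: a firewall that drops connections idle for 60 minutes.
    print(pick_expire_time(60))  # prints 30
```

With a 60-minute idle timeout this suggests probing every 30 minutes, which produces far less traffic than the one-minute setting while still keeping the connections alive. You will need to find out the actual idle timeout from whoever manages your firewall.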