AyaNova server configuration adjustments

Short version: A customer recently reported daily AyaNova disconnects with the cryptic error: "The underlying connection was closed: An unexpected error occurred on a receive." Some configuration changes were made to the server to mitigate this problem, and we hope that they help performance. Some of the changes won't take effect until tomorrow, and you may have experienced an increased number of disconnects today.

Long version:

Prior to a full investigation, my gut told me to check on various timers and timeouts, since the server may be terminating the connection prematurely. In IIS7 on Windows servers there are a lot of timers and timeouts to be familiar with. A few were updated:

  • Connection Time-Out under IIS7 webobject under Connection Limits:  Changed from 120 to 3600 seconds.  This governs when connections are closed based on how long they have been inactive.  This changes it from 2 minutes to one hour.
  • Application pool worker process shutdown time limit:  Changed from 90 to 300 seconds.  This governs how long a worker process is given to finish its outstanding work after it has been told to shut down, before it is forcibly terminated.  That work should generally finish quickly, but over time it can take longer (data sizes grow, more complex queries are built and it's waiting on the SQL server, etc.).  Changing this from 1.5 minutes to 5 minutes is extremely generous.  It has to be kept somewhat low, however, to prevent out-of-control worker processes from monopolizing server resources for too long - there's a point when it's clear that a process won't ever finish because something is wrong and it needs to be terminated, and that's what this setting does.
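
To make the effect of the first change concrete, here is a minimal Python sketch of an inactivity timeout. It is only a model, not IIS code: the function name and the ten-minute idle period are made up for illustration, while 120 and 3600 seconds are the old and new Connection Time-Out values listed above.

```python
def closed_for_inactivity(idle_seconds: float, timeout_seconds: float) -> bool:
    """Return True if a connection that has been idle for idle_seconds
    would be closed by a server whose inactivity timeout is timeout_seconds."""
    return idle_seconds >= timeout_seconds


OLD_TIMEOUT = 120    # seconds (2 minutes), the previous setting
NEW_TIMEOUT = 3600   # seconds (1 hour), the new setting

# Hypothetical example: a user leaves AyaNova untouched for ten minutes.
idle = 10 * 60

print(closed_for_inactivity(idle, OLD_TIMEOUT))  # True
print(closed_for_inactivity(idle, NEW_TIMEOUT))  # False
```

Under the old setting the server would have dropped that connection long before the user came back; under the new one it stays open.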

Applying these changes unexpectedly reset the application pools (the processes which power your AyaNova installation).  You may have lost your connection during this time; if so, I apologize (these changes occurred between 9:30 AM and 12 PM today).  Now that I know the app pools may be reset, I am holding off on any further adjustments until the scheduled maintenance period of 7 PM - 9 PM PST, which admittedly is what I should have done in the first place - the general policy is "don't touch the servers outside of the maintenance periods unless it's an emergency."

Of course, whenever there is an error, a proper investigation is warranted.   There is some good information here about this error:

http://geekswithblogs.net/denis/archive/2005/08/16/50365.aspx

This discussion suggests a few possible causes, namely .NET software bugs and the application's choice of HTTP version.  It also offers one workaround, disabling KEEPALIVE, which potentially increases the total number of connections sevenfold.

http://support.microsoft.com/default.aspx?scid=kb;EN-US;915599

This page suggests:

"This error occurs when the server or another network device unexpectedly closes an existing Transmission Control Protocol (TCP) connection. This problem may occur when a time-out value on the server or on the network device is set too low. To resolve this problem, see resolutions A, D, E, F, and O."

Resolution A is to update the .NET Framework.  The server's version is slightly out of date, as is often the case: updates are released frequently, and reboots to install them are avoided until enough updates have stacked up to justify one.  A reboot will be performed tonight to push these updates through.

Resolution D is to disable keepalives.  I'm familiar with this setting and I'm not ready to do this without doing a performance impact analysis first.  I will study this throughout the week when time permits inside the maintenance window and try to come to a decision soon on this.  Microsoft's suggestions to fix problems are sometimes drastic without even bothering to mention the sometimes severe performance penalties.

Resolution E is to "set the ServicePointManager.MaxServicePointIdleTime property to less than the time-out value of the server keep-alive connection".  The property they talk about is set in the code of AyaNova, so I need to find out what its value is.  The server keep-alive connection value it refers to is, I am 99.9% sure, the same as the Connection Time-Out listed above, which has already been changed from 120 seconds to one hour, so I think there's a very good chance this property is now below that.
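
The comparison Resolution E asks for is easy to get wrong because of units: MaxServicePointIdleTime is expressed in milliseconds, while the IIS Connection Time-Out is in seconds. Here is a minimal Python sketch of the check (not .NET code). The function name and the client values in the loop are assumptions for illustration, since AyaNova's actual setting is not known yet; 100,000 ms is, as far as I know, the documented .NET default.

```python
def client_idle_time_is_safe(client_idle_ms: int, server_timeout_s: int) -> bool:
    """Return True if the client gives up on an idle connection before the
    server would close it, which is the condition Resolution E asks for."""
    return client_idle_ms < server_timeout_s * 1000


SERVER_TIMEOUT_S = 3600  # the new Connection Time-Out, in seconds

# Hypothetical client idle times, in milliseconds.
for idle_ms in (100_000, 3_600_000, 7_200_000):
    print(idle_ms, client_idle_time_is_safe(idle_ms, SERVER_TIMEOUT_S))
```

A client value equal to or above the server timeout fails the check, because the server may then close a connection the client still believes is usable.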

Resolution F is to increase the connection timeout, which has already been done.

Resolution O is to "make sure that the client computer does not send the HTTP 100-Continue header."  I will have to examine the AyaNova protocol stream or check with the developers to see if this is the case.
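
If I end up examining a captured request by hand, a check along these lines would do. This is a minimal Python sketch under stated assumptions: the function name is made up, and the sample request (path, host and body length) is invented for illustration and is not real AyaNova traffic.

```python
def sends_expect_100_continue(raw_request: bytes) -> bool:
    """Return True if the raw HTTP request carries an
    'Expect: 100-continue' header (header names are case-insensitive)."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]   # drop any message body
    header_lines = head.split(b"\r\n")[1:]        # skip the request line
    for line in header_lines:
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # not a "Name: value" line
        if name.strip().lower() == b"expect" and value.strip().lower() == b"100-continue":
            return True
    return False


# Hypothetical captured request, for illustration only.
sample = (
    b"POST /example HTTP/1.1\r\n"
    b"Host: example.invalid\r\n"
    b"Expect: 100-continue\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
)

print(sends_expect_100_continue(sample))  # True
```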
