noticeboard.ru.ac.za

2013/08/12 - Proxy Server Performance Problems
The two new proxy servers installed during last Tuesday's maintenance period are experiencing problems. You will notice this as slow responses from the proxies, with intermittent timeouts when trying to access pages.

The problems first appeared towards the end of last week, and became worse over the weekend.

The exact cause is not yet known, but we're looking at a number of workarounds to try to improve the situation. At present, the most likely explanation is that the proxies are bottlenecking on disk performance. For this reason, we will temporarily disable disk caching.
Disabling disk caching hasn't made much difference, but it has eliminated one possible cause. We'll continue with similar experiments during the course of the day to systematically narrow down the possible causes, and so work out exactly what's happening.
It appears that the underlying problem here was the size of the disk caches and the number of objects they contained. As the caches filled up (which took several days), the number of very small objects stored on disk increased, and so the total disk I/O increased. Around Saturday this reached a point where the disks could no longer cope (as of this afternoon there were just short of a million objects cached on disk).
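
As a rough illustration of why small objects matter: for a fixed cache size, the number of objects on disk grows as the average object size shrinks. The sketch below is only a back-of-the-envelope estimate. The 10 GiB cache size and the object sizes are hypothetical values chosen for illustration, not measurements from our proxies, and `objects_that_fit` is a made-up helper name.

```python
GIB = 1024 ** 3
KIB = 1024


def objects_that_fit(cache_bytes: int, mean_object_bytes: int) -> int:
    """Estimate how many objects of a given mean size fit in a cache."""
    if mean_object_bytes <= 0:
        raise ValueError("mean_object_bytes must be positive")
    return cache_bytes // mean_object_bytes


# Hypothetical cache size, for illustration only.
cache_bytes = 10 * GIB

# The smaller the mean object, the more objects the same cache holds,
# and each object brings its own disk reads and writes.
for mean_kib in (64, 16, 4):
    count = objects_that_fit(cache_bytes, mean_kib * KIB)
    print(f"mean object {mean_kib:>2} KiB -> about {count:,} objects")
```

The same arithmetic shows that halving the cache size halves the number of objects it can hold at a given mean object size.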

To mitigate this, we've halved the size of the disk cache each instance uses and increased the minimum size of objects written to disk (meaning smaller objects should be kept in memory). This new configuration is currently going into production. Unfortunately, to make these changes we've had to discard the existing disk caches, which means it will likely take a few days before we know whether the changes have been completely successful.
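
The two changes described above can be sketched as a simple placement rule. This is a minimal illustration, not our actual proxy configuration: `CachePolicy`, `placement`, `mitigated`, and all the sizes used are hypothetical.

```python
from dataclasses import dataclass

GIB = 1024 ** 3
KIB = 1024


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical per-instance cache settings."""
    disk_cache_bytes: int
    min_disk_object_bytes: int


def placement(object_bytes: int, policy: CachePolicy) -> str:
    """Return where an object would be cached under the given policy."""
    if object_bytes < 0:
        raise ValueError("object_bytes must not be negative")
    # Objects below the minimum never reach the disk cache.
    if object_bytes < policy.min_disk_object_bytes:
        return "memory"
    return "disk"


def mitigated(policy: CachePolicy, new_min_bytes: int) -> CachePolicy:
    """Halve the disk cache and raise the minimum on-disk object size."""
    return CachePolicy(policy.disk_cache_bytes // 2, new_min_bytes)


# Illustrative values only; the real sizes are not given in this notice.
old_policy = CachePolicy(disk_cache_bytes=10 * GIB, min_disk_object_bytes=0)
new_policy = mitigated(old_policy, new_min_bytes=8 * KIB)

for size in (2 * KIB, 8 * KIB, 512 * KIB):
    print(size, placement(size, old_policy), placement(size, new_policy))
```

The trade-off is that small objects kept off the disk now compete for memory instead, so the memory cache has to be sized to match.
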
It seems we're now hitting memory limits instead, so I've just made some changes to try to resolve that.