noticeboard.ru.ac.za

2014/09/02 - Scheduled maintenance
During the maintenance window on Tuesday, September 2nd, work affecting a number of services will take place, starting at 5:30 PM and finishing before midnight. The following work is planned:
  • Web proxy servers: software on the web proxy servers will be updated, and a configuration change will be implemented. Access to the web may be affected for around an hour.
  • E-mail mailbox servers: software on the staff and student e-mail mailbox (IMAP) servers will be updated. It is expected that mailboxes will be unavailable for two periods of around an hour. During these outages, it will be possible to send e-mail from e-mail clients, but the webmail service will not accept logins.
  • Server software updates: software on the servers which provide the following services will be updated, resulting in outages of approximately five minutes per server:
    • DNS
    • DHCP
    • RT
    • Registration system
    • Incoming e-mail
    • Outgoing e-mail
    • SMS
    • Apple Software Update Server
    • Provisioning (web proxy autodiscovery, telephone provisioning)
    • Wi-Fi and VPN authentication
    • Network graphs
    • Guest network access
    • Network monitoring
  • Firewall software updates: software on the firewalls which connect the Internet to the campus, SEALS and public networks will be updated. As the firewalls operate as a redundant pair, no noticeable outages are expected during this maintenance.
This maintenance work is being undertaken in the ITSC-approved scheduled maintenance window. More information about maintenance windows and maintenance periods is available.
QUOTE(drs @ Aug 29 2014, 04:02 PM)
Web proxy servers: software on the web proxy servers will be updated, and a configuration change will be implemented. Access to the web may be affected for around an hour.

This work did not go according to plan. Whilst the software updates were successful, the configuration change -- related to the change to per-user quotas -- was not. The new configuration caused the proxy servers to crash within a minute or two of starting, in much the same way as they did when they were first installed.

During the course of the evening we have continued to experiment with one of the six proxy instances (e.cache), and have tried a number of different configuration scenarios. As a result of this testing, we now have a better understanding of the problem and how we might work around it.

e.cache is currently configured slightly differently to the other five proxies, and will remain so overnight. Theoretically this difference should not be visible to users of the proxies; it is merely an internal optimization. We're hoping that this will allow us to confirm that the configuration based on what we've learnt is stable over a period of hours and under different load conditions. We'll re-evaluate this tomorrow morning, and if the problem we've been seeing has recurred, we'll revert the configuration change.

e.cache has remained stable throughout the night. To further confirm the theory, I've extended the test to all instances running on our Struben cache server (a.cache, c.cache, e.cache).

The expanded test caused the proxies to start crashing again. However, in doing so, we've learnt something further about the problem.

At the end of the maintenance window (7:30 AM), I reverted the configuration to what it was yesterday afternoon. No further testing will be done now.