At about 3:30 this afternoon, the main circuit breaker on the protected load side of the UPS in the Struben data centre tripped. As a result, all IT services running out of the Struben data centre (including core network, file & print sharing, and telephony) unexpectedly lost electrical power. This will have caused campus-wide service outages.

The bulk of systems were recovered within the first half hour or so. However, we've spent the last four hours recovering the remainder. As of a few minutes ago, normal service should have returned, with one exception: rhino.ru.ac.za, the software library, has not recovered.

The reason the circuit breaker tripped is not clear: the circuit is running at about 55% of the breaker's rated capacity, and there are no apparent faults. At this stage we can only assume that the circuit breaker itself is faulty. This is supported by the fact that the breaker is several decades old, and is noticeably hot to the touch (which is unusual at such a low load).

The data centre is currently powered via the suspect breaker, which means that there is a chance that this problem will recur. We've sourced a replacement circuit breaker, and will install this should the existing breaker trip again.

The UPS itself is due to be decommissioned (its replacement was, ironically, delivered last week). As a result of this fault, we'll attempt to accelerate the process of switching over to the new UPS.