At approximately 00:35 this morning (Saturday) the University's core switch crashed. As a result all users of the University's network will have had limited or no connectivity. This crash will have affected all services, including but not limited to e-mail access, file server access (jackal/gecko), Internet/web access, DHCP, and the SEALS shared library services.
Service was restored to most areas of campus by 09:40 this morning.
Users may find that they need to reboot their computers in order to get them to work.
A more detailed technical explanation follows ...
The University's core switch is backed by a redundant switching fabric, but for some reason the hardware watchdog on the switch did not pick up the failure of the main fabric. As a result the switch was left in a state where it was passing some packets but not others (in particular, it was not routing packets to the University's backbone subnet, which hosts core services). The primary switching fabric was unresponsive even from the console, and as a result the switch needed to be power-cycled. At the moment the core switch is running on its secondary CPU.
Shortly after the main switch crashed, a second, identical switch that serves the departments of Computer Science and Information Systems locked up as well. This almost conclusively rules out the possibility of hardware failure. We're unsure at this stage what caused the crash but will follow up on Monday.
Some areas of campus would have experienced strange results when trying to access the network (for instance, being able to access the Internet but not any backbone services). This is due to a combination of the way the University's network is routed around campus and the fact that there are a couple of redundant DNS and DHCP servers on campus.
Certain areas on campus did not regain connectivity after the core switch was rebooted. In particular, the Physics department's media converter also failed and had to be replaced.
As a result of the main DHCP servers being unavailable, some computers will have failed to renew the leases on their IP addresses. This may result in those computers binding IP addresses within Microsoft's auto-IP block (169.254.*.*, the IPv4 link-local range) rather than their correct Rhodes IPs. The easiest way to correct this is to reboot the machine.
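For anyone who would rather check an address than reboot blindly, the following is a minimal sketch in Python of how to tell whether an IP address falls inside the 169.254.*.* auto-IP block. The function name is_auto_ip is hypothetical and the sample addresses are purely illustrative; they are not real Rhodes IPs.

```python
import ipaddress

# The auto-IP (IPv4 link-local) block, 169.254.*.*
AUTO_IP_BLOCK = ipaddress.ip_network("169.254.0.0/16")


def is_auto_ip(address: str) -> bool:
    """Return True if the given address lies inside 169.254.0.0/16.

    Hypothetical helper for illustration only; raises ValueError if
    the string is not a valid IP address.
    """
    return ipaddress.ip_address(address) in AUTO_IP_BLOCK


if __name__ == "__main__":
    # Illustrative addresses only (assumptions, not taken from the network).
    for sample in ("169.254.12.34", "192.0.2.10"):
        status = "auto-IP, reboot or renew the lease" if is_auto_ip(sample) else "not auto-IP"
        print(f"{sample}: {status}")
```

A machine reporting an address in this block did not obtain a lease from a DHCP server; rebooting it, as described above, should let it request one again.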