** Maintenance Announcement – NO service interruption anticipated **

On Tuesday, April 30th at 2200 (10:00pm) we will be replacing the SSL cert for secure.oregonstate.edu. This is not a cert renewal; it is a new cert that has a Subject Alternative Name (SAN) for oregonstate.edu. End users will not be affected.
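
For anyone who wants to confirm afterwards that the new cert carries the extra name, here is a minimal sketch. It assumes the certificate has already been fetched into the dict shape that Python's `ssl.SSLSocket.getpeercert()` returns; the helper names and the example data are hypothetical, and wildcard entries are deliberately not handled.

```python
def san_dns_names(cert):
    """Return the DNS names listed in a certificate's subjectAltName.

    `cert` is a dict in the shape returned by ssl.SSLSocket.getpeercert().
    """
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]


def covers(cert, hostname):
    """True if `hostname` appears verbatim among the SAN DNS entries.

    This is an exact, case-insensitive match only; wildcards are ignored.
    """
    wanted = hostname.lower()
    return any(name.lower() == wanted for name in san_dns_names(cert))


# Hypothetical example data, not the contents of the real certificate.
example_cert = {
    "subjectAltName": (
        ("DNS", "secure.oregonstate.edu"),
        ("DNS", "oregonstate.edu"),
    )
}

print(covers(example_cert, "oregonstate.edu"))
```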

 

Start: 04/30/2013 at 2200

End:  04/30/2013 at 2300

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

Part1.A Post Config Follow up:

We applied the sNTP change last night and none of the switches will pull time. After some quick searching on the internet we found this thread: 2910al-48G-can-not-get-time-from-W2K3-NTP-server

From this thread we learned that the firmware we are running (W.14.38) has a bug talking to the management host, which also serves as the NTP server. So while the sNTP config we have now is good, the firmware is not. During this coming Saturday's maintenance, when we install the new firmware, we will hopefully fix sNTP and get good timestamps on logs!

Spanning tree configuration went as expected and we saw the changes take place as we made them. We have not had a recurrence of spanning tree flapping, but in the past we have gone a day or two without an event, so we are still in wait-and-see mode.

Part1.B Pre-firmware questions to HP Support:

I sent the following response in on our open ticket with HP:

HP Support recommended firmware version W.15.10.0010 (and gave us a copy)
I see the current HP stable version is W.15.08.0012
And there is an early release version of W.15.12.0006
(see 2910al firmware download) Is there a particular reason support sent us a version in the middle of these two? Which would be the best version to load on the switches?

HP support then replied:

Hello

This email is regarding the case 4************, for the 2910al also the version W.15.10.0010 is stable but havent been posted on the website, and earliest availability version W.15.12.0006 you are right seems to be a new one but I dont see it under the list of 2910al software release versions that’s why suggest to use the W.15.10.0010

If HP support recommends this version, considers it stable, and will support us running it, then that is what we will do. We just want to be running a current, supported configuration. So we will apply W.15.10.0010 during this Saturday’s maintenance window as planned.

Part 1 | Part 1 follow up | Part 2 | Part 3 | Part 4

Our Problem:

We have an HP StoreVirtual / LeftHand OS multi-site SAN. Part of this SAN is its switching infrastructure, which is built out of four 2910al-24g switches. Each switch has a 10-gig add-on module providing two 10-gig ports.

Each site has two switches joined together via an LACP trunk. Between the sites we have two 10-gig fiber pairs linking them. Hanging off this switching core we have a blade center per site, as well as 6 x p4500g2 StoreVirtual nodes per site. The blade centers have a pair of 10-gig uplinks (1 per switch per side) in an active/passive configuration. Each p4500g2 node has a pair of 1-gig uplinks (1 per switch per side) in an ALB configuration. So our SAN network looks like this:

The configuration on these switches was set up by a vendor for us. At the time we were very new to the StoreVirtual world and needed the help. All was well for over a year! Then about a week ago we started to get pages and notifications that a switch was down. This was disconcerting, but when we jumped on the switches all seemed well. We have never seen any traffic problems or anything wrong at all. This week we started to get multiple pages per night. We were not too happy. My colleague Josh put a call in to HP support and they noticed we have a spanning tree problem: which switch thinks it is the root node keeps flapping around.

Looking into this problem and others has revealed what seems to be a misconfiguration on our switches. Switch1 in each location is configured. Switch2 in each location is auto-detecting its world and has no configuration set other than its local IP address.

So here we are! Part 1 of switch RE-configuration. Let's see if we can get these switches configured optimally. Our part 1 strategy is just to mitigate the spanning tree flapping. We do not know if this indicates a hardware error or if each switch having the same priority is causing the flapping. We also discovered that no NTP was set, so the logs out of the switches are less than useful.

Part1.A Configuration changes on our next Tuesday maintenance window:

All switches will get:

config
timesync sntp
sntp unicast
sntp server priority 1 ***.***.***.1
show sntp
write mem
exit
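
Before touching the switches it can help to confirm, from some other host, that the time server actually answers SNTP. The following is a minimal sketch of that check, not a full client: it builds a bare version 3 client request and reads only the whole-seconds part of the server's transmit timestamp. The function names are hypothetical, and `query` assumes the server is reachable on UDP port 123.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def build_request():
    """Return a 48-byte SNTP client request: LI=0, version=3, mode=3 (client)."""
    return b"\x1b" + 47 * b"\x00"


def transmit_time(packet):
    """Return the server transmit timestamp as Unix seconds (fraction dropped)."""
    if len(packet) < 48:
        raise ValueError("SNTP packet must be at least 48 bytes")
    # The transmit timestamp occupies bytes 40-47: 32-bit seconds, 32-bit fraction.
    seconds, _fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_OFFSET


def query(server, timeout=2.0):
    """Send one request to `server` over UDP port 123 and return its Unix time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _addr = sock.recvfrom(512)
    return transmit_time(packet)
```

Calling `query()` with the address of the time server should return a value close to `time.time()` on a host with a good clock; a timeout suggests the server is not answering that host.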

Then each switch will get a spanning tree priority set. We are going to try to be as minimally disruptive as possible, so we will keep the present (most often winning) switch as the spanning tree root. In the commands below, X is 1, 2, 3, or 4 depending on which switch it is.

config
spanning-tree clear-debug-counters
spanning-tree priority X
show spanning-tree
write mem
exit
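
To illustrate why explicit priorities matter, here is a small sketch of how a spanning tree root is chosen: the switch with the lowest bridge ID wins, where the bridge ID is the priority followed by the MAC address. When every switch has the same priority, the MAC address alone decides. The sketch assumes the 0-15 value given to `spanning-tree priority` is a multiplier of 4096, and the switch names and MAC addresses are hypothetical. It models the election only and does not by itself explain the flapping.

```python
def bridge_id(priority_step, mac):
    """Return a sortable (priority, mac_as_int) tuple for one switch.

    priority_step is the 0-15 value given to `spanning-tree priority`;
    it is assumed here to be multiplied by 4096 to get the bridge priority.
    """
    if not 0 <= priority_step <= 15:
        raise ValueError("priority step must be between 0 and 15")
    return (priority_step * 4096, int(mac.replace(":", ""), 16))


def elect_root(switches):
    """Return the name of the switch with the lowest bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


# Hypothetical names and MAC addresses, for illustration only.
equal_priority = {
    "site-a-sw1": (8, "00:1b:3f:00:00:40"),
    "site-a-sw2": (8, "00:1b:3f:00:00:30"),
    "site-b-sw1": (8, "00:1b:3f:00:00:20"),
    "site-b-sw2": (8, "00:1b:3f:00:00:10"),
}

explicit_priority = {
    "site-a-sw1": (1, "00:1b:3f:00:00:40"),
    "site-a-sw2": (2, "00:1b:3f:00:00:30"),
    "site-b-sw1": (3, "00:1b:3f:00:00:20"),
    "site-b-sw2": (4, "00:1b:3f:00:00:10"),
}

print(elect_root(equal_priority))     # lowest MAC wins the tie
print(elect_root(explicit_priority))  # lowest configured priority wins
```

With equal priorities the root is whichever switch happens to have the lowest MAC address; with explicit priorities the choice is ours.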

Part1.B Firmware upgrade to latest on our next Saturday maintenance window:

We will be applying the latest firmware to each switch. This is also a bit interesting, so we will follow up with HP support to answer the question: which firmware should we apply?

Current stable: W.15.08.0012
Support provided: W.15.10.0010
Early Available: W.15.12.0006
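
As a quick sanity check on how these three releases are ordered, here is a minimal sketch that parses the version strings and sorts them. It assumes the fields after the `W.` family prefix compare numerically from left to right; `parse_version` is a hypothetical helper, not an HP tool.

```python
def parse_version(text):
    """Split 'W.15.10.0010' into ('W', (15, 10, 10)) for comparison."""
    family, *numbers = text.split(".")
    return family, tuple(int(n) for n in numbers)


# The three candidates listed above.
candidates = {
    "Current stable": "W.15.08.0012",
    "Support provided": "W.15.10.0010",
    "Early available": "W.15.12.0006",
}

# Sort oldest to newest by the parsed version.
ordered = sorted(candidates.items(), key=lambda item: parse_version(item[1]))

for label, version in ordered:
    print(f"{version}  {label}")
```

Sorted this way, the support-provided build sits between the posted stable release and the early-availability release.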


** Maintenance Announcement – DEV VM service interruption anticipated **

We are upgrading our 2910al-24g switches to the latest firmware. During the upgrade, the switch that provides the storage for the DEV VMware cluster will be unavailable; as such, the DEV VMware cluster will also be shut down.

Production SANs are redundantly connected and maintenance will have no noticeable effect on these SANs, meaning that the OSU systems used by students, staff, and faculty will not experience a service interruption.

Start: 04/20/2013 at 9:00 PM

End:  04/20/2013 at 11:59 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

** Maintenance Announcement – No service interruption anticipated **

We will be applying a configuration change to the iSCSI switches that support our StoreVirtual SAN. This is the storage network that backs our VMware infrastructure.

We do not anticipate any service interruption. Our switching is redundant, we will change only one switch at a time, and the changes themselves should not interrupt service.

Start: 04/16/2013 10:00 PM

End: 04/16/2013 11:00 PM

If you have questions or concerns about this maintenance, please contact the Shared Infrastructure Group at osu-sig (at) oregonstate.edu or call 737-7SIG.

We will be working with NOC to add VLAN 414 to our blade centers' trunks. This will be done one trunk port at a time, and one data center at a time.

Start Time: 4/06/13 at 9:00 pm

End Time: 4/06/13 at 9:30 pm

If you have any questions or concerns about this maintenance, please contact OSU-SIG ( at ) oregonstate.edu or call 7-help