This site is maintained and updated by Colin Lynch, a Subject Matter Expert for all things Cisco UCS.
I have no Vendor affiliation, I just call it as I see it.
Colin Lynch CCIE#7064, VCP 4/5, EMCIE, NCIE-SAN
Subject Matter Expert for Integrated Systems and Converged Infrastructure.
Hi Colin, Thanks for the beautiful posts that you have, which create a lot of knowledge on UCS all around. I am a regular follower of your posts & appreciate them very much. I have had some questions for quite some time and did not know who to ask, so maybe you can help me with them. Here are a couple of questions to start with, but I have a lot of them (some stupid) as well.
1) UCS allows you to abstract identities in the form of the different types of pools that we can create, like UUID, WWNN, WWPN & others. How do we make sure that these entities still remain unique in the universe? Take UUID, for example: anyone can now assign any UUID to their blade, so how do we make sure that the initial concept that a UUID is universally unique still holds?
2) As I have studied there are 2 types of split brain scenarios with UCS. Partition in time & partition in space. Partition in space I understand happens when the cluster links are not working. What would be an example case for Partition in time?
3) How many WWNNs do I assign to a full width server? Since I don’t have access to one I was not able to ascertain. From what I know a WWNN is assigned to one HBA & WWPNs to the different ports on that HBA. So if I have 2 Menlo cards on my full width server I should have 2 WWNNs & 4 WWPNs. Is that right?
4) I see in some of the Cisco presentations the approach for VN-Link that is most suggested is the one which uses VIC with N1KV in UCS. Why is that the most recommended & what would be an example use case?
Thanks again for the nice posts & hoping for an answer to the above questions.
Regards,
-Tarun
Hi Tarun
Thanks for following and the great questions!
1) You are correct that we have to ensure each UCS defined address is unique, but the boundary of that uniqueness differs, i.e. MAC addresses only have to be unique within a subnet, UUIDs only have to be unique per UCS Domain, and WWNNs & WWPNs only have to be unique within the Switched Fabric.
In all my Cisco UCS designs I create a standard for that client defining how their UCS addresses are constructed. This not only ensures uniqueness (within that client at least) but also lets us encode some useful information into WWPNs and MAC addresses, a luxury not available when using Burned In Addresses (BIAs). I’m sure you will agree that being able to look at a MAC address from a debug or Wireshark capture and immediately know what that device is, where it is and what OS it is running reduces troubleshooting times by orders of magnitude. Likewise, if I see WWPNs with BB in them on my Fabric A, I immediately know someone has made a config / cabling error in the SAN somewhere.
I have shown some address naming conventions I use below.
2) Again you are correct: the “Partition in Space” split brain scenario is when a node is no longer receiving heartbeats across the L1 / L2 interconnect from the other cluster member. There is, however, a Serial Electrically Erasable Programmable Read Only Memory (SEEPROM) on the chassis, accessed via the Fabric Extender, which shows whether the upstream FI is still reachable. So in the very unlikely event that you lose both your copper cluster links, each Fabric Interconnect will still know whether or not its peer is alive and forwarding by checking the SEEPROM on the chassis, and NO failover will occur. You will, however, get alerts in UCSM, and no new state information will be synchronized between the FIs.
The other type of split brain is, as you mention, referred to as a partition in time. This occurs when an FI attempts to start UCSM on an outdated configuration. Again, Cisco UCS Manager detects and resolves this type of split brain using the SEEPROM.
3) Again you are right: if you are using the Menlo (M71KR / M72KR) then you have 2 vHBAs per adapter, and if you have 2 of them in your full width blade, that blade will require 2 x WWNN and 4 x WWPN. If you use the Palo adapter (M81KR) or VIC 1280/1240 then you can have a lot more vHBAs and will require more WWPNs for them.
e.g. Host 1 fc0 (Fabric A) = WWPN 20:00:00:25:B5:AA:00:01, Host 1 fc1 (Fabric B) = WWPN 20:00:00:25:B5:BB:00:01,
I generally assign 128 WWPNs per pool. That way it scales well, doesn’t take up excessive entries in the XML database, and each host gets the same last 2 digits on both fabrics, so the only difference between them is the fabric designation, which makes troubleshooting, aliasing and zoning a dream.
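As a rough illustration of the convention described above, the following Python sketch builds the Fabric A and Fabric B WWPNs for a host and reads the fabric designation back out of a WWPN. The function names, the `pool_id` byte and the 0 to 127 suffix range are assumptions made for the example (this is not a UCS Manager API); only the prefix and the AA / BB fabric bytes come from the example WWPNs above.

```python
# Illustrative sketch only: helper names and the pool_id byte are hypothetical.
# The prefix and the AA/BB fabric bytes are taken from the example WWPNs above.
WWPN_PREFIX = "20:00:00:25:B5"
FABRIC_BYTE = {"A": "AA", "B": "BB"}
POOL_SIZE = 128  # WWPNs per pool, as described above


def wwpn_for(host_suffix: int, fabric: str, pool_id: int = 0) -> str:
    """Build a WWPN whose last byte is the same for a host on both fabrics."""
    if fabric not in FABRIC_BYTE:
        raise ValueError(f"unknown fabric: {fabric!r}")
    if not 0 <= host_suffix < POOL_SIZE:
        raise ValueError(f"host suffix must be in 0..{POOL_SIZE - 1}")
    if not 0 <= pool_id <= 0xFF:
        raise ValueError("pool_id must fit in one byte")
    return f"{WWPN_PREFIX}:{FABRIC_BYTE[fabric]}:{pool_id:02X}:{host_suffix:02X}"


def fabric_of(wwpn: str) -> str:
    """Return 'A' or 'B' based on the sixth byte of the WWPN."""
    octets = wwpn.upper().split(":")
    if len(octets) != 8:
        raise ValueError(f"expected 8 colon-separated bytes: {wwpn!r}")
    for fabric, byte in FABRIC_BYTE.items():
        if octets[5] == byte:
            return fabric
    raise ValueError(f"no fabric designation found in {wwpn!r}")


def on_wrong_fabric(wwpn: str, seen_on_fabric: str) -> bool:
    """True if a WWPN shows up on a fabric other than the one it encodes."""
    return fabric_of(wwpn) != seen_on_fabric


if __name__ == "__main__":
    print(wwpn_for(1, "A"))  # Host 1 fc0
    print(wwpn_for(1, "B"))  # Host 1 fc1
    print(on_wrong_fabric(wwpn_for(1, "B"), "A"))
```

A check like `on_wrong_fabric` is the scripted equivalent of spotting BB in a WWPN logged in on Fabric A.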
4) You have 2 choices when it comes to using VN-Link: software (feature rich, but software switched in the hypervisor) or hardware (VM-FEX, previously called VN-Link in hardware), which gives hardware switching performance and hypervisor bypass (if used in VMDirectPath mode).
Nexus 1000v is generally my preferred option as it is very flexible and feature rich, and gives a nice NX-OS command line, so it is very familiar to Nexus admins. VM-FEX on the other hand uses UCS Manager for network administration, so it keeps the GUI admins happy. Bottom line: if low latency is a big requirement go VM-FEX, if features are the main requirement go N1KV.
The figure below shows the feature comparison between N1KV and VM-FEX. Also see my VM-FEX videos.
Hope this clears things up for you, if you need any more info you know where I am
Regards
Colin
Hi Colin,
Thanks a lot on this long & detailed reply & I am sorry to keep asking you all these question, but, honestly I really appreciate your answers because I’ve been pondering on them for so long & haven’t got an answer from anyone on them.
1) It is clear now. I understand now that every implementation of UCS has to take care to implement a unique format of identifier. It just came as a little odd to me that by default UCSM does not enforce uniqueness for identities that are supposed to be universally unique, like UUIDs.
2) Thanks again for the detailed explanation on “Partition in Space”. I also understand a little bit on how can “Partition in Time” happen, but, I am not able to imagine what sequence of events can lead to a Partition in Time kind of a situation? I mean what kind of situation would we need this safeguard?
3) Thanks for the detailed explanation on this. It is crystal clear now.
4) Your videos on VM-FEX are just out of this world. I have gone through them & I love them. I learned a few things after watching those videos. What I am still not sure about though is what will be the use case for using a VIC along with a N1KV in a UCS solution? I have seen that as a recommended implementation from Cisco. Why?
Once again, even before you attempt to answer those question I want to thank you again for helping me out here. I can’t emphasize that more.
Regards,
-Tarun
1) Cisco have to leave something for us to make a living.
2) You don’t really have to safeguard against a partition in time; UCS does this for you via the SEEPROM. If a Fabric Interconnect is “out of state” (its config is behind) and for whatever reason UCSM tries to go active on that out-of-state FI, that is where the potential for a partition in time arises, but again UCS mitigates this via the SEEPROM in the chassis. There is a bit about this in the following doc: http://www.cisco.com/en/US/prod/collateral/ps10265/ps10281/white_paper_c11-525344.html
3) Good
4) Using a VIC in conjunction with an N1KV does provide some benefits but is not mandatory by any means. The VIC in this case only provides the uplinks for the Nexus 1000V. The benefit is that these uplink interfaces are virtual, so you can configure / reconfigure them as you wish. As you know, whenever you create a vNIC on the Cisco VIC, this creates a corresponding Veth on the Fabric Interconnect and connects them together with a virtual cable; this is the Virtual Network Link (VN-Link). The VIC really comes into its own when using VM-FEX, as you know.
Regards
Colin
Hi Colin,
Another question just came in my mind.
For a 4-link topology, if 1 link fails & we re-ack the chassis. Documentation says re-pinning would happen & it would pin odd slots to odd uplink & even slots to even uplink. What if uplinks 1 & 3 on the IOM go down then what would be the pinning like because both the odd uplinks from the IOM are now down?
Hi Tarun
If you have a 4 link topology and 1 link fails, then the adapters pinned to that link will fail over (if configured to do so) and the remaining 3 links will continue to be used. (3 links in discrete pinning mode is not a valid startup topology but is a valid running topology, as in the above scenario.)
If you then re-ack the chassis, UCS will say 3 links is not valid, reduce the links to two and re-pin all adapters accordingly, i.e. odd blade slots to the first active link and evens to the second.
If links 1 and 3 went down leaving only links 2 and 4, I would expect Odd blades to PIN to Link 2 and Even blades to PIN to Link 4 when the chassis is Re-ACK’d.
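The expected re-pinning behaviour described above can be sketched in a few lines of Python. This is only a model of the reasoning in this reply, not how UCS Manager is implemented: the function name `repin` is hypothetical, and it assumes that the valid link counts are 4, 2 and 1, that the lowest-numbered active links are the ones kept, and that slots are spread across the kept links in order.

```python
# Illustrative model only, not UCS Manager behaviour taken from documentation.
# Assumptions: valid discrete-pinning link counts are 4, 2 and 1 (3 is not
# valid, as noted above), and the lowest-numbered active links are kept.
VALID_LINK_COUNTS = (4, 2, 1)


def repin(active_links, slots=range(1, 9)):
    """Map each blade slot to an IOM uplink after a chassis re-ack."""
    active = sorted(set(active_links))
    if not active:
        raise ValueError("no active links to pin to")
    # Use the largest valid link count that the active links can satisfy.
    count = next(c for c in VALID_LINK_COUNTS if c <= len(active))
    usable = active[:count]
    # With 2 usable links: odd slots -> first link, even slots -> second link.
    return {slot: usable[(slot - 1) % count] for slot in slots}


if __name__ == "__main__":
    print(repin([1, 2, 4]))  # one link failed: 3 active links reduced to 2
    print(repin([2, 4]))     # links 1 and 3 down: odd -> 2, even -> 4
```

Under these assumptions, the "odd" and "even" in the rule refer to the order of the surviving links (first, second) rather than to their port numbers.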
Regards
Colin
Hi Colin,
I am loving it! So nice, to have a reply from a UCS expert. Thanks again for your time. That’s what I was also thinking (& I am proud I think on the lines of a UCS expert), but, then if odd servers are pinned to Link2 would it not negate the rule for 2 link topology which is odd slot servers to odd numbered uplink & even slot servers to even numbered uplink?
I am sure there is going to be an exception for this kind of situation, but, I could not find any documentation for the same.
Regards,
-Tarun