Leave your question in the comments section.
Before this section was added many questions were left in the About section above, so why not also check there to see if your question has already been asked.
Hi Colin,
I might be asking a stupid question & I might need to do a little more research before asking this question, but, this post of yours is enticing me into asking this question. I had been thinking UCS can do a per packet/flow based load-balancing by doing an active-active NIC teaming between the two vNIC’s created for the same H/W server. Is it not?
I just came across a PDF which says “Per flow/packet load balancing at the host level, is not allowed on UCS B-Series”.
If the above is true it means I can have one h/w server using FI-A (FI-B for failover) & other h/w server using FI-B (FI-A for failover) but I can’t have the same h/w server using both FI-A & FI-B at the same time which I earlier thought was possible using 2vNIC’s on the same h/w server one using FI-A another using FI-B & teaming them together for both b/w & HA.
-Regards
-Tarun Lohumi
Hi Tarun
Again the answer is: it depends. If you are using a hypervisor like VMware you can select the hash algorithm to use, and if you would like to load balance active/active as you suggest you would need to specify virtual Port-ID based load balancing. This ensures that each VM is consistently mapped to the same UCS vNIC, so VM 1 may go out via Fabric A and VM 2 via Fabric B. This is fine and provides good balance.
I think the issue arises when you use a per-packet / per-flow algorithm, as the upstream LAN will see the host’s MAC address on Fabric A one minute then on Fabric B the next, and the host will “flap” between the two.
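To illustrate the flapping issue, here is a minimal Python sketch. The two selection functions are toy stand-ins (a modulo on the virtual port ID, and an XOR of the last IP octets), not the actual vSphere algorithms, and the port number and IP addresses are made up for the example.

```python
FABRICS = ["A", "B"]

def uplink_by_port_id(virtual_port_id):
    """Pin a VM to one fabric using only its virtual switch port ID."""
    return FABRICS[virtual_port_id % len(FABRICS)]

def uplink_by_flow(src_ip, dst_ip):
    """Pick a fabric per flow by XOR-ing the last octets (toy hash)."""
    src_last = int(src_ip.rsplit(".", 1)[1])
    dst_last = int(dst_ip.rsplit(".", 1)[1])
    return FABRICS[(src_last ^ dst_last) % len(FABRICS)]

# One VM (virtual port 7, IP 10.0.0.7) talking to 100 different destinations.
destinations = [f"10.0.1.{i}" for i in range(1, 101)]
port_id_fabrics = {uplink_by_port_id(7) for _ in destinations}
flow_fabrics = {uplink_by_flow("10.0.0.7", dst) for dst in destinations}

print(sorted(port_id_fabrics))  # the VM's MAC is only ever seen on one fabric
print(sorted(flow_fabrics))     # the same MAC turns up on both fabrics
```

With Port-ID pinning the upstream switches learn the VM's MAC on a single fabric; with the per-flow choice the same MAC appears on both, which is the "flap".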
With regards to a bare metal install, there is now an M81KR (Palo) NIC teaming driver for Windows Server 2008 R2 available, which I have found works really well.
The Windows teaming driver supports the following:
Supported teaming modes:
• Active-Backup (with or without failback)
• Active-Active (outbound load balancing only)
• 802.3ad LACP (currently only supported on C-Series with the P81E)
Supported load balancing methods:
• TCP connection
• Source and destination MAC address
• MAC address and IP address
Supported hashing options for load balancing:
• XOR hash
• CRC hash
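As a rough idea of what "source and destination MAC address" balancing with an XOR or CRC hash means, here is a small Python sketch. It is not the Cisco driver's implementation: the folding scheme, the use of CRC-32 and the example MAC addresses are all assumptions made for illustration.

```python
import zlib

def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into its six raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))

def xor_hash_link(src_mac, dst_mac, link_count):
    """XOR both MACs byte by byte, fold into one byte, then pick a link."""
    folded = 0
    for s, d in zip(mac_to_bytes(src_mac), mac_to_bytes(dst_mac)):
        folded ^= s ^ d
    return folded % link_count

def crc_hash_link(src_mac, dst_mac, link_count):
    """Run CRC-32 over both MACs and pick a link from the result."""
    return zlib.crc32(mac_to_bytes(src_mac) + mac_to_bytes(dst_mac)) % link_count

# The same address pair always lands on the same team member.
print(xor_hash_link("00:25:b5:00:00:01", "00:25:b5:00:00:02", 2))
print(crc_hash_link("00:25:b5:00:00:01", "00:25:b5:00:00:02", 2))
```

Either way the point is the same: a given conversation is consistently hashed onto one link, and different conversations spread across the team.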
A pleasure as always.
Colin
Thanks for clarifying Colin. Appreciate your speedy responses.
-Regards,
-Tarun Lohumi
Here’s my recommendation when teaching UCS: Load sharing is usually done to a) scale bandwidth and b) provide redundancy in case a link fails. For hypervisors and UCS, I typically recommend using virtual Port ID as the load sharing mechanism. Some VMs will get pinned to the A interface, and some will get pinned to the B interface. Since we’re dealing with 20 to 80 Gbits per second per server, you don’t need to spread load across multiple links for scaling. For redundancy, VMware does a great job of handling link failure scenarios, or alternatively you can use Fabric Failover, which will turn an A-connected vNIC into a B-connected vNIC (or vice-versa) in case of failure. So either VMware
Thanks for the input Tony, and I completely agree: Port ID load sharing is the way to go for ESX hosts with dual-uplinked vSwitches / dVS.
And Fabric Failover for single-uplinked vSwitches if you want to keep east/west traffic (like vMotion) within the FI.
Colin
What is the best practice (and why) regarding network load balancing within vSphere when using UCS? Maybe, expand that to a networking best practice for UCS (NIOC, shares, QOS, etc)?
Thanks!
GS
Hi GS
Thanks for the great question, and one (as you might expect) with potentially several answers depending on the implementation, i.e. whether you are using standard vSwitches / vDS, Nexus 1000V or VM-FEX. Let’s take the most common implementation I tend to do, which is vSphere using standard vSwitches.
OK, that’s narrowed us down, but it’s still a lengthy topic, so I’ll concentrate on the Cisco UCS specific aspects and not so much on the standard VMware config, I/O control etc., which are equally relevant whatever platform is used and which I’m sure you are familiar with.
So the first question I tend to address with customers is how they want their host networking to look. What I mean by that is that the client may well have a networking standard for their ESXi hosts, or want to use their standard host templates, which is fine. But Cisco UCS does have some nice features which could greatly simplify the host networking. One you may already be aware of is hardware Fabric Failover, where you present a single vNIC to the OS / hypervisor and that vNIC is backed by hardware fabric failover, i.e. if there is any break in the traffic path on the primary fabric that the vNIC is mapped to, UCS Manager will immediately switch the data path to the other fabric without the OS ever seeing the vNIC go down. This, as you may have guessed, could potentially halve the number of network interfaces in your hosts (i.e. you could leave out all the uplink interfaces which are there purely for redundancy), and you can salt and pepper the remaining single vNICs so that some are mapped primarily to Fabric A and some to Fabric B, providing load balancing across both fabrics.
The potential situation to be aware of here, though, is that if a VM whose active traffic flow goes via an uplink mapped to Fabric A is communicating with a VM whose traffic flow is mapped via Fabric B, then that flow has to be forwarded beyond the Fabric Interconnects to the upstream LAN switches to be switched at Layer 2 between fabrics, even if both VMs are on the same VLAN.
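The rule in that last paragraph can be written down in a few lines of Python. This is only a sketch of the reasoning; the VM names and their fabric mappings are hypothetical.

```python
# Hypothetical mapping of each VM's active uplink to a fabric.
VM_FABRIC = {"vm1": "A", "vm2": "B", "vm3": "A"}

def switching_point(src_vm, dst_vm, vm_fabric=VM_FABRIC):
    """Return where same-VLAN traffic between two VMs gets switched."""
    src, dst = vm_fabric[src_vm], vm_fabric[dst_vm]
    if src == dst:
        # Both ends hang off the same fabric: switched locally in the FI.
        return f"Fabric Interconnect {src}"
    # Different fabrics: the frame has to go up to the LAN and back down.
    return "upstream LAN switches"

print(switching_point("vm1", "vm3"))
print(switching_point("vm1", "vm2"))
```

Two VMs on the same fabric stay inside the Fabric Interconnect, while a pair split across fabrics takes the longer path via the upstream switches.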
So what I tend to do is use a mixture of both: single vNICs backed by hardware fabric failover, and dual teamed vNICs for vSwitch uplinks which I would like to load balance across both fabrics.
But let’s assume the customer wants to retain their physical host networking standard so vSphere admins have a consistent view and config for all hosts, whatever platform they are hosted on.
So a typical ESXi Host would look something like:
2 x Teamed vNICs for Management vSwitch
  eth 0 mapped to fabric A
  eth 1 mapped to fabric B
1 x vNIC for VMware user PortGroups uplinking to a dVS
  eth 2 mapped to fabric A
1 x vNIC with Fabric Failover enabled for vMotion
  eth 3 mapped to fabric B
Of course you can add other vNICs if you have more networking requirements or require more than simple port-group (802.1q tag) separation, e.g. adding an iSCSI vSwitch, a backup vSwitch etc.
So the setup would look something like this:

The reason I go with a single fabric failover vNIC for vMotion is the potential “issue” pointed out above. If I had 2 vNIC uplinks to my vMotion vSwitch and was using them in an active/active team for redundancy and load balancing, I would map one to Fabric A and one to Fabric B. That could mean that vMotion traffic takes a very suboptimal route across the network, i.e. having to go via the upstream switches. By using only 1 vNIC and mapping it to a single fabric, all my east/west vMotion traffic is locally switched within the Fabric Interconnect and does not have to be sent to the upstream LAN at all. And in the event of a failure within the primary fabric path, UCS would switch this traffic to the other fabric transparently to the ESXi host, and that fabric would again locally switch all vMotion traffic.
It is also important, when teaming the vNICs within vSphere, to use Port-ID as the hash. This prevents hosts “flapping” between fabrics in the eyes of the upstream LAN switches.
OK, once the above is set up you have the option of mapping UCS QoS policies to each of the above vNICs within UCS Manager (by default all traffic is placed in a “best effort” policy).
As a standard I generally set a 1Gbps reservation for the vMotion vNICs and leave the others as default, bearing in mind that these are multiple 10Gbps links and the QoS would only kick in in the event of congestion.
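To show what "only kicks in in the event of congestion" means, here is a generic work-conserving weighted-share sketch in Python. It is not the actual UCS scheduler: the 1:9 weights are simply my way of modelling a 1Gbps reservation on a 10Gbps link, and the class names and demand figures are made up.

```python
def allocate(capacity, demands, weights):
    """Share capacity by weight; bandwidth a class does not need is re-offered."""
    allocation = {name: 0.0 for name in demands}
    active = {name for name, demand in demands.items() if demand > 0}
    remaining = capacity
    while active and remaining > 1e-9:
        total_weight = sum(weights[name] for name in active)
        satisfied, granted = set(), 0.0
        for name in active:
            share = remaining * weights[name] / total_weight
            give = min(share, demands[name] - allocation[name])
            allocation[name] += give
            granted += give
            if allocation[name] >= demands[name] - 1e-9:
                satisfied.add(name)
        remaining -= granted
        if not satisfied:
            break  # everyone is capped by their share: the link is full
        active -= satisfied
    return allocation

weights = {"vmotion": 1, "vm_traffic": 9}
# No congestion: both classes simply get what they ask for.
print(allocate(10.0, {"vmotion": 3.0, "vm_traffic": 2.0}, weights))
# Congestion: vMotion is held to, but guaranteed, its 1Gbps share.
print(allocate(10.0, {"vmotion": 3.0, "vm_traffic": 12.0}, weights))
```

In the uncongested case vMotion can use well over its reservation; the reservation only matters once the total demand exceeds the link.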
NB) FCoE traffic is inherently prioritised by 802.1Qbb Priority-based Flow Control, a sub-component of the Data Center Bridging (DCB) standards, which Cisco UCS inherently uses between the mezzanine card on the blade and the Fabric Interconnect.
OK, so with regards to northbound load balancing: as you may know, when you create a vNIC on the mezzanine card, what you are actually creating is a vEth port within the Fabric Interconnect, as the mezzanine card (Cisco VIC) is an adapter Fabric Extender.
So creating your teamed pair of vNICs within vSphere will only get your load-balanced traffic as far as the Fabric Interconnects. Now, assuming you are running your Fabric Interconnects in the default end host mode (where the FIs appear to the upstream LAN as one big server), the FIs obviously need load-balanced uplinks into the LAN.
Now for redundancy you will likely have a pair of LAN switches, hopefully capable of running a Multi-Chassis Ethernet service like Nexus vPC or Catalyst VSS. If that’s the case you just size your uplinks to what you want, dual connect your FIs to the upstream switch pair and channel them at both ends (standard LACP).
As shown below

The end to end result is that load balancing is done safely and optimally and East/West traffic is maintained within the UCS Infrastructure as much as possible.
Hope that answers your question, if not fire back at me. After all, us Gurus need to stick together.
Regards
Colin
Where did you get the VMware icons like portgroup and vswitch?
Hi Arjan
Most likely from one of the Visio Cafes or perhaps VMware.
I have a few Visio stencils and PowerPoint slide icons. I have zipped them up and posted them here
Regards
Colin
Great article (and website!), thanks for the useful info. One thing I’m not clear on is the 2 connections for management (I am using vSphere). Are they active-active or active-passive? For east-west traffic you suggest one vNIC to ensure traffic is locally switched on one FI. Are you happy to use 2 for Management as the traffic is north-south? Would 1 Management vNIC be sufficient?
Would you ever recommend having all traffic go through one fabric (except FC) and hardware failover box ticked for the vNIC?
Hi Andy
Thanks for the question.
My preference is to use a single vNIC and vSwitch for vMotion with hardware fabric fault tolerance enabled. This way all vMotion traffic will be locally switched in the FI.
Now you could do the above for management, however vSphere complains if it does not think you have redundancy for your management uplink. vSphere would obviously be unaware that this was being provided at the hardware level.
While you can suppress these vSphere complaints, I generally just keep it happy by giving it another vNIC mapped to the other fabric and configuring them as active/standby. Active/active would cause issues as you can’t port-channel across different fabrics.
Then if I was using a vDS I would just have 2 vNICs mapped to different fabrics and teamed in ESXi using Port-ID as the hash. This way you get active/active load balancing with no MAC address flapping in the upstream LAN switches.
The only time I have ever sent all traffic through a single fabric is when a customer insisted they did not want any reduction in bandwidth in the event of a fabric failure. That meant that either they would have to ensure they only ever went to 50% capacity on each fabric, or they would just use one active fabric and know that if that fabric failed they would get identical performance on the other one. But it is not something I would do generally.
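The 50% rule is just arithmetic: the surviving fabric has to carry both loads. A minimal Python sketch, with a made-up per-fabric capacity of 40Gbps:

```python
def survives_fabric_failure(load_a_gbps, load_b_gbps, fabric_capacity_gbps):
    """True if one fabric alone can carry the combined load of both."""
    return load_a_gbps + load_b_gbps <= fabric_capacity_gbps

# Each fabric at or below 50% of a hypothetical 40Gbps: no loss on failure.
print(survives_fabric_failure(18, 20, 40))
# Fabric A at 62.5%: the surviving fabric would be oversubscribed.
print(survives_fabric_failure(25, 20, 40))
```

If both fabrics stay at or below half of their capacity the check always passes, which is why the customer had to choose between a 50% ceiling and a single active fabric.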
Hope that clears things up.
Regards
Colin
Hi, I have a C200 M2 with 2 x E5620 CPUs, 28 GB RAM, 1 x 500 GB 7.2k SATA HDD (no RAID), latest firmware (1.4.3) for CIMC and BIOS, VMware ESXi 4.1 with LRO disabled, and CUCM 8.5 for 1000 users deployed from an OVA template, with a single NIC, one VLAN, a vSwitch with traffic shaping disabled, and VMware Tools installed, but the appliance is too slow. What would you recommend to resolve this issue?
Regards.
Hi Albert
Sounds like there are a lot of variables in there. I guess it’s going to come down to where your bottleneck currently is. I doubt you are CPU constrained, which leaves disk, memory and I/O as possible “culprits”, assuming that the VMs and backend are all working OK.
VMware is generally pretty good at telling you if you have memory or CPU bottlenecks. I guess you’ve checked the utilisation of the memory/CPU and the IOPS of the disk? Obviously a single SATA disk isn’t going to give you much performance and has no redundancy. Also check the utilisation of the switch port to which your server connects for high average utilisation or packet drops/runts/giants etc., which could indicate a speed/duplex issue.
Does your hardware meet the recommended spec for a 1000 user CUCM deployment?
Also what are you comparing it to?
Did it used to run fine and has slowed down recently?
Does it “slow down” at particular times of the day or is it always slow?
All good questions to ask and explore. This is probably not the right forum; it is more one for your own support teams, and then escalate to TAC if the issues cannot be identified.
There’s a good presentation on CUCM on UCS here which details sizing and optimisation.
http://www.cisco.com/web/CA/events/pdfs/CNSF2011-Planning-and-Designing-Virtual-UC-Solutions-on-UCS-Platform-Joseph-Bassaly.pdf
Regards
Colin
We are testing mapping EMC LUNs from UCS, and have created a service profile and assigned it to a blade, but we are not able to see the vHBA WWPNs; only one WWPN is visible in the storage zone. Not sure what the issue is. Need help / suggestions.
Hi thanks for the question.
The first thing I would check is that all your vHBAs have WWNN/WWPNs assigned, either manually or from a pool. Check this by expanding the vHBAs in the service profile.
Assuming they have their addresses, check the vHBAs are in the correct VSAN (if the default VSANs are not being used).
If you are booting from SAN, confirm your SAN Boot Policy has the correct WWPNs for the correct targets on the correct fabric.
(At this stage you can just use dummy WWPNs for the targets; the vHBAs will still FLOGI into the fabric.)
If the above checks out, SSH into the FIs directly, run “connect nxos”, and check whether your WWPNs are in the FLOGI table of your FIs with “show npv flogi-table” from NX-OS mode.
If they are, confirm the correct WWPN is showing on the correct Fabric Interconnect. If that’s good, check your fibre channel uplinks between your FIs and your SAN switches; make sure they are in the correct VSAN and are connected to the correct fabric.
If that’s good, check the FLOGI database of your SAN switches, and confirm they have NPIV enabled.
Also turn off quiet boot in the default BIOS settings and have a look during boot as to whether your vHBAs are seeing the targets you expect.
That’s the order I would do it in, and you should rapidly work out where the issue is or isn’t.
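The order of checks above can be kept as a simple checklist that stops at the first failure. This Python sketch only records the sequence; the step wording and the `first_failure` helper are my own, and the pass/fail results have to come from you checking each item by hand.

```python
from typing import Dict, Optional

# Checks in the order described above, from the blade outwards.
CHECKS = [
    "vHBAs have WWNN/WWPNs assigned",
    "vHBAs are in the correct VSAN",
    "SAN boot policy has the correct target WWPNs per fabric",
    "WWPNs appear in the FI FLOGI table",
    "FC uplinks are in the correct VSAN and on the correct fabric",
    "SAN switches have NPIV enabled and show the logins",
]

def first_failure(results: Dict[str, bool]) -> Optional[str]:
    """Return the first check that failed (or was never run), else None."""
    for check in CHECKS:
        if not results.get(check, False):
            return check
    return None

# Example: the first three checks pass, the FLOGI table is empty.
results = {check: True for check in CHECKS[:3]}
print(first_failure(results))
```

Working outwards like this means the first failing step tells you which hop to look at next.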
Good luck, feel free to come back to me with any other questions.
Regards
Colin
Hi UCS Guru. Long time reader, first time poster. Something that has been bugging me. A 6120 Fabric Interconnect has two physical management ports, mgmt0 and mgmt1. The gui and CLI will only allow you to configure one mgmt ip (mgmt0) it seems. Can you configure the second management port? I have tried researching and came up empty.
Hi Jim
Thanks for reading and getting involved.
The reason there are 2 MGMT ports on the 6100 series controllers will no doubt come as a real anti-climax to you.
It’s simply because the 6100 series uses the same tin as the Nexus 5000 series (6120 = 5010, 6140 = 5020). There’s no great technical answer I’m afraid.
So the 2nd MGMT port on an FI is not used.
The Gen 2 FIs similarly are Nexus 5500s painted a different colour and running read-only NX-OS with all the UCS goodies on top.
6248UP = Nexus 5548UP, 6296UP = Nexus 5596UP. You will notice that all these models now only have a single MGMT port.
Makes sense really; why should Cisco re-invent the wheel if they already have a great product that fits the bill nicely?
You may have noticed that UCS products and features trail the Nexus BU by approx 6 months, so whatever products / features get released for Nexus you could expect to see in UCS within 6 months or so.
So hope that puts you out of your misery.
Look forward to your future posts.
Regards
Colin
Hi Colin, we have a WWN presentation problem from a UCS 5100 series with 2 FI 2208XP connected to 2 N5Ks. Right now we have managed to make the networking functional, but the fibre channel part doesn’t work as we don’t see any WWNs on the Nexus. What do we need to be able to use the FC part of the FCoE interfaces? Do we need the NPV feature enabled? Also we see the FC port on the Nexus as “initializing”.
Thanks in advance
SGT
Hi SGT
Thanks for the comment.
No, your upstream Nexus should not be in NPV mode; your 6248UPs however are in NPV mode by default. What you will need to do is enable “feature npiv” on your Nexus 5500s. This will allow multiple fabric logins over the same F port on the Nexus.
Are you sure you have configured some of your Nexus ports as native fibre channel? You need to split your 5500s into an Ethernet side and a native FC side using the below commands:
slot 1
  port 1-24 type ethernet
  port 25-32 type fc
If you’re using some ports on the expansion module for FC, use the above commands but choose slot 2 instead.
You may also want to ensure you have a storage licence on the N5ks which covers the number of fc ports you have.
If the above is all OK, I would confirm you can see the WWPNs in the Fabric Interconnects, by dropping to the UCS CLI and doing a:
connect nxos
show npv flogi-table
This should show that your servers have at least logged in to the FIs, which from the servers’ point of view are FC switches (NPV mode just means that the FIs are switches to the UCS servers but appear as hosts to the upstream SAN switches, in your case the N5500s).
Once the WWPNs of your servers are visible in the local FI’s FLOGI table, ensure that the VSAN ID on your FI FC uplinks matches the VSAN on the port on the Nexus you are connecting to (by default 1 on both ends). Also, as mentioned, check that “feature npiv” is enabled on the Nexus switches.
Going through the above you should rapidly sort your issues out.
Regards
Colin
Hi Colin,
Thanks for this great page, now we have a definitive place to find answers to our questions. A small & might be dumb question I had, but I would still post it to you.
The UCS system is designed to operate in multi-tenant & cloud computing kinds of environments. Keeping this in mind, what happens if there are two customers both using VLAN 10 and both of them are terminating on the same FIs? How would the FI handle traffic from two different customers but the same VLAN terminating on them?
-Regards,
-Tarun
Hi Tarun
You are right that UCS can work well in multi-tenant situations and I have set many up.
One thing to remember is that the administration of UCS is still not all that suitable for multi-tenant environments. What I mean by that is it’s currently not possible to negate the view privilege in UCSM, so if you wanted your tenants actually logging into UCS or the KVM Launcher etc., they would have read-only privileges to the other tenants’ Orgs and Locales etc.
That said, tenants should not need access to UCS Manager anyway, just the virtual estate that has been provisioned for them.
So going back to your question about VLAN overlap: you are right, UCS is only a Layer 2 device with no concept of VDCs (or even VRFs from a usable perspective). As such VLAN 10 is VLAN 10 globally and should only be given to a single tenant. Tenants could of course share a common VLAN and be separated by a context-based gateway like VSG, which can filter on much more than just IPs; for example, all VMs prefixed by TenantAxxxxx cannot talk to any VMs prefixed by TenantB etc.
Still, as a provider you should be the one in control of the VLANs, not the tenant. This would only come into play if you are uplinking to multiple tenants’ own networks, which is not really a likely scenario, and even then you could bridge between different VLAN IDs with untagged uplinks.
So in summary, if I were going multi-tenant on UCS I would give unique blocks of VLANs to each tenant, create UCS Organisations for each to control resources, and certainly be using N1KV and VSG (no excuses now: N1KV Essential Edition will be free and VSG is bundled with the Advanced Edition).
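Since VLAN 10 is VLAN 10 globally, it is worth sanity-checking a VLAN allocation plan before handing blocks out. A minimal Python sketch, where the tenant names and VLAN ranges are invented for the example:

```python
from itertools import combinations

def overlapping_tenants(vlan_blocks):
    """Return pairs of tenants whose VLAN blocks share at least one VLAN ID."""
    clashes = []
    for (tenant1, block1), (tenant2, block2) in combinations(vlan_blocks.items(), 2):
        if set(block1) & set(block2):
            clashes.append((tenant1, tenant2))
    return clashes

# Hypothetical plan: TenantC's block was carved out of TenantA's by mistake.
blocks = {
    "TenantA": range(100, 200),
    "TenantB": range(200, 300),
    "TenantC": range(150, 160),
}
print(overlapping_tenants(blocks))
```

An empty result means every tenant has a unique block; any pair returned needs re-planning before the VLANs are created.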
A pleasure as always
Colin
Hi Colin, thanks for answering, but I’m still missing something. We would like to use FCoE, meaning we want Ethernet and FC on the same interface on the N5K. We don’t want to connect through FC uplink ports directly. Is it possible?
Thanks
SGT
Ah, think I’m with you now.
If you are after going FCoE between the Fabric Interconnects and the Nexus 5500, this is not currently possible. It is due for inclusion with the next major UCSM update, v2.1 (still can’t talk too much about it).
I have drawn how you need to connect this up today along with how it should be in UCSM 2.1
Hope that clears things up for you.
Nice info about the 2.1 release.
Great blog!
What is the difference between a “full state” and “all configuration” backup? From the description in UCS, “all configuration” pretty much sounds like everything….so what is “full state” backing up that “all configuration” isn’t and why would I choose one or the other?
Thanks!
Hi Nick
As the name might suggest, the full state backup, as well as the complete UCS config, also backs up the current STATE of the system, i.e. which Service Profiles are associated to which blades etc.
Another big difference is that the full state backup is a binary file which is not easily read. Also the only way of restoring this file is via a full system restore after defaulting the Fabric Interconnects. So it is a last resort if you have to use it.
The All_Config backup is an XML file, which is just the entire config of the system. It can be restored on a running system and will warn if the restore conflicts with a current setting, giving you a couple of options to overwrite, merge configs or ignore etc. You can also open and read this file in any text editor, XML viewer or web browser.
I always take both periodically, especially before doing any major work like upgrades etc.
The other backup options are partial All_Config backups, which I don’t really tend to use much if at all.
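Because the All_Config backup is plain XML, you can also inspect it with a few lines of script. The Python sketch below counts elements by tag; the sample document and its tag names are invented for illustration and are not the real UCS backup schema, so expect different names in an actual file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented stand-in for a backup file; the real schema will differ.
SAMPLE = """<config>
  <org name="root">
    <serviceProfile name="esx-host-01"/>
    <serviceProfile name="esx-host-02"/>
  </org>
</config>"""

def count_elements(xml_text):
    """Count how many times each element tag appears in the document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())

print(count_elements(SAMPLE))
```

For a real backup you would read the file contents in and pass them to the same function; a binary full state backup cannot be inspected this way.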
Hope that clears things up.
Thanks for getting involved.
Regards
Colin
Why doesn’t the ucs support community VLANs – technical reason? Also while using isolated VLANs in combination with the 1000v, the configuration is very convoluted (mark them isolated on the UCS server uplinks, configure the pvlan uplink port-profile trunk with the primary vlan as a native VLAN). Can you help explain the encapsulation/decapsulation within 6140s.
Hi Ajay
The NX-OS on top of which UCSM sits certainly does support community VLANs, so it is perhaps a restriction in UCSM.
Another great question, I have wondered that too.
Luckily whenever I have set up PVLANs natively in UCS, isolated VLANs were all I was after.
You mention you also have Nexus 1000v, so I would suggest you pass all VLANs as regular VLANs to the VEMs and configure the PVLANs on the N1KV only. They do support community VLANs, so hopefully that will work for you.
Re encap/decap, if we’re still talking about PVLANs there is no encapsulation involved; the traffic flow is based on VLAN tags.
Regards
Colin
http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0_4_s_v_1_3/port_profile/configuration/guide/n1000v_portprof_6pvlan.html
Great Blog!
My question is about mobility of profiles between a B200-M2 with an M81KR, and a B200-M3 with a vic 1240. I believe we can create a service profile with host firmware package and management firmware package that has both adapter and bios packages for both blade types (True?). What I am wondering about is how would ESX 5.0 handle booting up on an M3 with a vic 1240, when it was installed on a B200-M2 with an M81KR.
Hi Duane
Correct, you can have a single firmware policy specifying every type of hardware in your environment if you wish, i.e. just a single Host Firmware Package, called 2.0(4a)_Update for example, with the boxes ticked for B200 M2 and B200 M3 along with the boxes for the M81KR and VIC 1240. Then whenever that SP is associated with either model of blade, the blade would be up/downgraded to that version prior to association.
From an OS point of view the driver is the same for the M81KR and VIC 1240/1280 so there will be no issues there.
The only time you have to think about moving service profiles between servers with different mezzanines is if you were using a non-VIC adapter (M7xKR) or an Ethernet-only mezz, but the system would fail the association if the resource requirements of the SP are not met by the hardware.
The only issue I have found when using Cisco VICs is when moving a service profile from a half-width blade to a full-width blade with two mezzanine adapters. UCS tries to be clever and will rightly distribute your virtual adapters across both mezzanines, i.e. vNIC0 on Mezz 1 and vNIC1 on Mezz 2. This can cause your Ethernet adapters to get re-ordered each time the SP is moved between them. Of course you can just quickly KVM into the host, select the correct vNIC MAC addresses as your management NICs and you’re back up and running again, but that’s when you could also use a “placement policy” to try and keep the adapters consistent from an OS point of view.
Hope that helped
PS Am jealous (still waiting on my M3s for my lab)
Regards
Colin
Hi Colin,
Thanks for all the previous answers. I am back to bug you with a couple of more questions.
1. When defining static server to FI uplink pinning using a pin group, once I define a FI port as part of a pin group for a specific vNIC configuration, can I use the same port (which was earlier used in the pin group) for some other pin group? I guess the answer to that is Yes. Please correct me if I am wrong. Also, is that port (already used in a pin group) available to other servers which do not have any static pin group defined for dynamic uplink pinning?
2. If, using dynamic or static pinning on the FI uplinks, some servers are pinned to FI-A uplinks under normal operations but, because of an uplink failure on FI-A, get pinned to FI-B uplinks, would they fail back automatically on restoration of the FI-A uplink? If not, is there a way I can do that manually?
3. When an IOM to FI link failure happens, there can be different behaviours depending on whether I ack the chassis or DO NOT ack the chassis. I know the pros & cons of ack & no-ack. What I want to know is, since everything is in favour of not acking, what would be the scenario under which one would ack after failure of an uplink from the IOM to the FI?
-Regards,
-Tarun
Hi Tarun
1) I don’t tend to use static pin groups very often (if at all) unless in very specific use cases where I need to guarantee dedicated uplink bandwidth to a particular workload. But in answer to your question, yes, you can use the same uplink ports (targets) in different pin groups (just tested it).
2) If a vNIC was statically pinned to an uplink and that uplink failed, then that vNIC would NOT be dynamically pinned to another uplink. When the uplink target on Fabric Interconnect A goes down, the corresponding failover mechanism of the vNIC goes into effect, and traffic is redirected to the target port on Fabric Interconnect B.
3) The thing to bear in mind is that re-acking a chassis will cause approx 15 seconds of downtime to all blades in that chassis as all FI-IOM ports are reset. A failed link, once re-established, will come back without a re-ack. The only time you would need to re-ack the chassis is if you were reducing / increasing the number of FI to IOM links, which of course you can plan in a maintenance window.
As a side note, I have seen a chassis re-ack take out an entire enterprise for over an hour! However that was down to a “perfect storm” of misconfigurations and poor practices, but I’ll share it with you.
A chassis was re-ack’d, and rather than the 30 or so seconds of expected outage we lost all VMs and vCenter. What had happened was that there were two ESXi hosts in the same chassis which were clustered; they had a keepalive timeout of 30 secs, they were only referencing each other’s IP, and they were set to power down all VMs in case of being isolated. So you can see where this is going. The hour’s work of KVM’ing into ESXi hosts, disabling lockdown mode (as vCenter was a VM), powering up SQL servers, then AD, then vCenter, then all VMs etc. was a real pain.
Hence the best practices: reference default gateways in clusters, spread hosts across chassis, a minimum of 3 hosts per cluster, yada, yada, yada, any one of which would have prevented this issue.
Absolutely no relevance to your question, but a real world scenario of how a re-ack can “cause” or be the catalyst for unforeseen issues.
Regards
Colin
For B440s with dual vnics can mezz 1 A path be used for 1 vmnic, and mezz 2 B path be used for the other vmnic, say if the function is for the mgmt port?
Hi Fred
Thanks for the question. The answer is yes, Cisco UCS will actually do this for you automatically. If you want to influence which mezzanine a particular vNIC/vHBA is placed on then you can use a “placement policy”.
Each mezzanine card is represented as a Virtual Connection (“vCon”) in UCS Manager, so a half-width blade would have all of its vNICs and vHBAs on vCon1 (Mezz 1). As you are using a B440 with 2 mezzanine adapters, you will have vCon1 (Mezz 1) and vCon2 (Mezz 2) as available options. So you can use a “placement policy” to distribute your vNICs and vHBAs as you wish.
When setting up your placement policy you will likely see other vCons available, 3 and 4 perhaps. These are for the new M3 blades, which have a LAN on Motherboard (mLOM), and for C-Series servers that can have multiple PCI adapters, so don’t worry about them.
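As a way of picturing what a placement policy does, here is a small Python sketch. Plain round robin is my simplification of the automatic distribution, not the algorithm UCS Manager actually uses, and the vNIC names and policy dictionary are hypothetical.

```python
def place_vnics(vnics, vcons, policy=None):
    """Place vNICs on vCons: explicit policy entries win, the rest round robin."""
    policy = policy or {}
    placement = {}
    unpinned = 0
    for vnic in vnics:
        if vnic in policy:
            placement[vnic] = policy[vnic]
        else:
            placement[vnic] = vcons[unpinned % len(vcons)]
            unpinned += 1
    return placement

vnics = ["eth0", "eth1", "eth2"]
# Automatic: the adapters are spread across both mezzanines.
print(place_vnics(vnics, ["vCon1", "vCon2"]))
# With a policy: eth2 is forced onto Mezz 2 wherever the profile lands.
print(place_vnics(vnics, ["vCon1", "vCon2"], {"eth2": "vCon2"}))
```

Pinning the adapters you care about keeps their position stable from the OS point of view, while anything left unpinned is still spread automatically.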
Hope that helps.
Regards
Colin
Hi Colin,
Thanks again for some excellent answers to my questions. Your blog is like a wish come true for me where I can get answers to all my questions
1. From my previous post, if a FI uplink port is already being used in a pin group, would it still be used for dynamic pinning. If Yes, then how does the server for which we define static pin group get dedicated bandwidth which is the primary purpose of static pin groups.
2. When using VN-link in HW in either PTS or Hypervisor bypass modes, can I bundle more than one vNIC’s to a single VM. If Yes, can you help me understand how (may be you can point me to a document if it exists, or create one which you are expert in doing)?
3. Chassis Discovery Policy – Is it per IOM, or per chassis? I know what it does, but in what case would I set it to more than 1 link? Maybe in circumstances where I want to guarantee that if a chassis is online it has 4 links connected per IOM, or it should not come online. What do you think?
4. Power cap on individual servers – this option was included after the initial release of UCS – what is its primary use case? Is there any whitepaper or document that mentions which features would be disabled if power is not available or something like that?
5. I have repeatedly seen this question in most of my trainings – HP VC(virtual connect) allows you to hard partition a 10G NIC into 4 NIC’s each with different bandwidth like NIC1 – 2G, NIC2 – 4, NIC3 – 3G, NIC4 – 1G, however, in UCS even if I have 4 vNIC’s all of them share the same 10G(active/standby), 20G(active/active) pipe in the background & there is no way to hard partition except for partition based on QoS. How do I effectively answer this question?
Once again, a very big heart felt thank you for taking all the pain to answer my endless queries
-Regards,
-Tarun
Hi Tarun
1) No, if an uplink port is the target in a static pin group it will not be used for dynamic pinning. That’s the point of static pinning: to give a vNIC or group of vNICs exclusive use of the uplink(s).
2) Yes you can, just add however many network adapters you want to your VM in vSphere and select the VM-FEX port group you want them in. Simples!
3) Chassis Discovery Policy is SYSTEM WIDE, and you’re correct: if your company policy is to have a minimum of 2 or 4 links then you can enforce it with the CDP. (3 links is supported in port-channel mode, as is any number up to 8 with the 2208XP.)
4) It is very common in the real world that data centres have a per-rack power constraint, in my experience usually between 3 and 7 kW. A single UCS chassis running hot could easily draw 5 kW, so you would use power capping to keep the chassis within your constraints (you may be paying a hosting cost per kW, for example, and this is a good way to stay within your SLA).
Re power capping on a server basis, you can use power cap policies within Service Profiles. Each service profile can be assigned a power cap policy that defines the relative power priority of a physical server that is associated with that power profile, and the power capping group to which the server belongs. When there is adequate power for all servers, the priorities do not come into play. In the event that the servers in a given power cap group begin to exceed their group allocation, power is allocated according to the priorities defined in the attached power cap group policy, ensuring that critical loads are throttled last. Additionally, there is an option to designate a server as having no power cap, for workloads whose performance cannot be compromised even to the minor extent that power capping entails.
The WP that covers this can be found at:
http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns944/white_paper_c11-627731.html
5) You did very well at answering your own question. QoS is exactly how you would do it: you could do precisely what you mentioned, i.e. “chop up” the 10Gbs CNA into 1Gbs, 3Gbs etc. pipes for particular traffic (by vNIC), with the added benefit that if there is no congestion each vNIC can use the full bandwidth anyway.
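To put some numbers on point 5, here is a minimal Python sketch of a generic weighted-share model. It is not the actual UCS QoS scheduler; the function name, the vNIC names and the weights (borrowed from the 2G/4G/3G/1G split in the question) are all assumptions for illustration.

```python
def bandwidth_shares(weights, active, link_gbps=10.0):
    """Split link_gbps between the active vNICs in proportion to their weights.

    Generic weighted-share model (an assumption, not the UCS scheduler):
    a vNIC only competes for bandwidth while it is actually sending.
    """
    active = [v for v in active if v in weights]
    total = sum(weights[v] for v in active)
    if total == 0:
        return {}
    return {v: link_gbps * weights[v] / total for v in active}


# Hypothetical weights, mirroring the 2G/4G/3G/1G split from the question.
WEIGHTS = {"vnic1": 2, "vnic2": 4, "vnic3": 3, "vnic4": 1}

if __name__ == "__main__":
    # All four vNICs busy: each gets its weighted minimum of the 10Gbs pipe.
    print(bandwidth_shares(WEIGHTS, list(WEIGHTS)))
    # Only vnic4 busy: no congestion, so it can use the whole pipe.
    print(bandwidth_shares(WEIGHTS, ["vnic4"]))
```

The difference from a hard partition is the second call: with no competing traffic, the vNIC with the smallest weight is not held to its share.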
I’ve got a follow-up question to number 1) above that I think I know the answer to, but I thought I would post it to make sure. Assuming you have one static pinned uplink and two dynamically pinned uplinks (in a port channel). I realize that the traffic to the static uplink would only see that which is in the pin group. Everything else would crowd out the dynamic ports. What happens if both of the dynamically pinned uplinks go down? I assume the traffic will use the only uplink left… the static pinned uplink. Any port in a storm I guess. Is that correct?
Hi Troy
Sounds great in theory, but I am pretty sure that if all dynamically pinnable uplinks go down, all vNICs that are NOT pinned to the remaining static uplink will go down, and then rely on fabric failover or teaming to retain connectivity.
The remaining static target uplink will not suddenly get mobbed with dynamically pinned vNICs.
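The behaviour described above can be modelled in a few lines of Python. This is only a sketch: the function name, the port names and the round-robin spread of dynamic vNICs are assumptions for illustration, and `None` stands for a vNIC that is down on this fabric.

```python
from itertools import cycle


def pin_vnics(vnics, uplinks_up, static_pins):
    """Return {vnic: uplink or None} for one fabric.

    vnics       -- list of vNIC names
    uplinks_up  -- set of uplink ports that are currently up
    static_pins -- {vnic: uplink} for vNICs that belong to a static pin group
    """
    # Uplinks that are static pin targets are excluded from dynamic pinning.
    static_targets = set(static_pins.values())
    dynamic_pool = sorted(u for u in uplinks_up if u not in static_targets)
    # Round-robin over the dynamic pool (assumed policy for this sketch).
    next_dynamic = cycle(dynamic_pool) if dynamic_pool else None

    pinned = {}
    for vnic in vnics:
        if vnic in static_pins:
            target = static_pins[vnic]
            pinned[vnic] = target if target in uplinks_up else None
        else:
            # No dynamic uplink left: the vNIC goes down, it does NOT
            # move onto the static target uplink.
            pinned[vnic] = next(next_dynamic) if next_dynamic else None
    return pinned


if __name__ == "__main__":
    vnics = ["vnic-a", "vnic-b", "vnic-c"]
    static = {"vnic-a": "1/1"}
    print(pin_vnics(vnics, {"1/1", "1/2", "1/3"}, static))
    print(pin_vnics(vnics, {"1/1"}, static))  # both dynamic uplinks failed
```

In the second call only vnic-a keeps its uplink; vnic-b and vnic-c come back as `None` and would rely on fabric failover or teaming.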
Regards
Colin
Hello Colin,
In the first place thanks for the page. It really clears many questions I have. I have a very simple question here. What is wire 1,2 and 4 when we talk about the chassis and blades. Why is 3 not a number to be used when we discuss about wire in UCS?
Regards
Dhruva S Kolli
Hi Dhruva
Thanks for the question. Historically Cisco UCS, as you rightly say, only supported 1, 2 and 4 links. This was due to the pinning relationship between the blade slots and the IOM network ports, i.e. if you have a single link between the IOM and the FI, all blades obviously have to use that single link. If you have 2 links, the odd blade slots pin to link 1 and the even blade slots to link 2. 3 links was not supported, and if 4 links were used, slots 1&5 use link 1, 2&6 use link 2, 3&7 use link 3 and 4&8 use link 4.
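For anyone who prefers to see the slot-to-link rule written down, here is a small Python sketch of the mapping just described. The function names are made up for illustration, and the sketch only covers the 1, 2 and 4 link cases for an 8-slot chassis.

```python
def pinned_link(slot, links):
    """Return the IOM-to-FI link a blade slot pins to in discrete pinning mode."""
    if links not in (1, 2, 4):
        raise ValueError("only 1, 2 or 4 links are covered by this sketch")
    if not 1 <= slot <= 8:
        raise ValueError("blade slots are numbered 1 to 8")
    # Slots wrap around the available links: with 4 links, slot 5 lands
    # back on link 1; with 2 links, odd slots use link 1 and even slots link 2.
    return (slot - 1) % links + 1


def pinning_table(links):
    """Return {slot: link} for all eight blade slots."""
    return {slot: pinned_link(slot, links) for slot in range(1, 9)}


if __name__ == "__main__":
    for links in (1, 2, 4):
        print(links, "link(s):", pinning_table(links))
```

Asking for 3 links raises an error here, which mirrors the fact that 3 links was not a supported starting configuration in discrete pinning mode.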
Now while 3 links was not a supported startup configuration, it is a supported running configuration. What I mean by that is: if you had 4 links and 1 link failed, the server slots using that link would fail over to the other fabric, and the remaining servers would continue on the 3 remaining links. (However, if you then re-acknowledged the chassis, the system would revert to 2 links.)
Now the above assumes you are in what’s called “Discrete Pinning Mode”, which is the default. However, we now have mezzanine cards which can drive up to 40Gbs of I/O per fabric, so having a server pinned to a single 10Gbs link is obviously an immediate potential bottleneck.
To address this since UCSM v2.0 you have been able to port-channel all links between the IOM and the FI to give all servers access to the entire bandwidth between the IOM and FI (80Gbs with the 2208XP).
In port-channel mode 3 links is supported (in fact any number of links if using the 2208XP).
Hope that clears things up.
Regards
Colin
Hello Colin,
Thanks a lot. That helps.. Can you explain about the hybrid view of the port-channel mode with 3 links? I am a bit confused here.
Regards
Dhruva
Hi Dhruva
Basically, in port-channel mode any number of FEX to FI connections is permissible. (As it is just a LACP port-channel)
If you are using any VICs with greater than 10Gbs per fabric connectivity (VIC1240 / VIC1280) then you should be port-channeling between the FEX and FI.
Regards
Colin
Colin,
Thanks for the reply on my question regarding port channel. I’ve got one more question here. Let’s say I have 5 UCS blades spread across 2 chassis (say two blades in one chassis and three in the other) and we have 8 uplink ports on each chassis. How does the pinning of servers to these uplinks behave? On each chassis, is this how the blades are pinned?
Blade slots 1,5 pin to uplink port1, Blade Slot 2 pins to uplink port2, blade slot 3 uses uplink port3
If this is not the case then request you to please explain the pinning in this scenario with and without port-channel
Regards
Dhruva
Hi Dhruva
Yep you got it.
In discrete pinning mode it’s slots 1,5 use link 1, 2,6 use link 2, 3,7 use link 3 and slots 4 and 8 use link 4, as shown below
I would like to know if you can connect multiple 6100 FI to each other to limit the uplink connections to a 7K? For instance: there are 4 6100 FI and 2 UCS systems connecting to 2-7K. I would like to connect 6100-1 to 6100-3 to enable L2 switching between the FI. Can this be done via switch mode and is vPC supported between the FI?
Thanks, Steve
Hi Steve
Not 100% sure what you are trying to achieve with that setup, and that’s certainly not the way Cisco UCS is designed or intended to work.
There are now very few if any valid reasons to put the FI’s in switch mode.
The FI’s, although they look like a Nexus 5k that’s been strategically painted a different colour, do not act like them (even in switch mode). They do not have the same rich L2/L3 features (like vPC) and I’d certainly not recommend this setup.
As to whether it would work: likely yes, but without vPC etc. You cannot configure STP priorities etc., so to prevent your FI’s becoming the root bridge over your 7k’s you would need to reduce the priorities on all your other switches.
In summary with the kit and functionality these days, there will always be a far better solution using End Host mode, than switch mode.
Regards
Colin
I would like to know if it is possible to check the power consumption in amps inside UCS Manager. Another question is how much power one 6140XP fabric interconnect plus 8 M1 blades consume? I am using the UCS power calculator, but I can’t get the information in amps. I have 32A; is that enough for one 6140 fabric interconnect with an FC module plus 8 M1 blades? I’m using 4 power units. Thanks
Hi Thanks for the question.
I’m pretty sure that new Cisco UCS power calculator does provide the consumption in Amps (I’ll check)
https://express.salire.com/Modules/Analyses/Edit/Analysis.aspx
In the meantime there are plenty of kW to Amps calculators like the one below.
http://www.rapidtables.com/calc/electric/kW_to_Amp_Calculator.htm
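The conversion those calculators do is simple enough to sketch in Python. The formulas are the standard ones for real power; the supply voltages, the power factor of 1.0 and the 5 kW load are assumptions chosen purely for illustration, so plug in the figures for your own supply and the wattage from the power calculator.

```python
import math


def kw_to_amps_single_phase(kw, volts, power_factor=1.0):
    """Current in amps for a single-phase load: I = 1000 * kW / (V * PF)."""
    return 1000.0 * kw / (volts * power_factor)


def kw_to_amps_three_phase(kw, volts_line_to_line, power_factor=1.0):
    """Current in amps per phase: I = 1000 * kW / (sqrt(3) * V_LL * PF)."""
    return 1000.0 * kw / (math.sqrt(3) * volts_line_to_line * power_factor)


if __name__ == "__main__":
    load_kw = 5.0  # assumed load, not a measured figure
    print(round(kw_to_amps_single_phase(load_kw, 230.0), 2), "A at 230 V single phase")
    print(round(kw_to_amps_three_phase(load_kw, 400.0), 2), "A per phase at 400 V three phase")
```

Compare the result against the rating of your feed, leaving whatever headroom your site rules require.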
Regards
Colin
Full disclosure – I’m the Cisco PM for these tools.
The correct URL for the current version of the UCS Power Calculator is http://express.salire.com/Go/Cisco/Cisco-UCS-Power-Calculator.aspx. The URL above is actually a generic redirect URL generated by the vendor’s framework depending upon which tool you use (they also host the Public UCS TCO/ROI Calculator (http://express.salire.com/Go/Cisco/Cisco-UCS-TCO-ROI-Advisor.aspx)).
Hope this helps and saves some confusion.
Thanks Bill
I’ll get the link updated.
Regards Colin
Hi Colin,
I’ve installed Windows 2008 Enterprise on a bare-metal UCS B-200 M2 server with a VIC M81KR mezzanine card. UCS Manager version is 1.3(c).
The problem that I have is that I cannot define VLANs on the NIC adapter inside Windows. Is this a problem with the NIC driver?
(Other NIC cards in other machines, for example, have an Advanced tab in the network configuration where we can define VLANs.)
(In this situation, Windows is unaware of the VLANs, so all the traffic going out will be untagged and the FI sends it to the native VLAN on the network.)
Thanks in advance,
Mohammad
Hi Mohammad
I’ll check this out on the kit I have in the lab next week.
First things first: are you using the latest M81KR drivers and utilities?
http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/vic_drivers/install/Windows/2.0/b_Cisco_VIC_Drivers_for_Windows_Installation_Guide.pdf
Also as a side note, perhaps start thinking about an upgrade to 2.0x, as you’re missing out on lots of cool features.
Colin
Hi UCSGuru:
What is the best method to capture Flow or nbar information from nexus 6296 or nexus 5596 switches?
Hi Thomas
We are talking about two different boxes here, neither of which inherently supports NBAR.
With the Fabric Interconnects (6100/6200), set up a SPAN/ERSPAN session and either capture the output or forward it to an NBAR-capable device like the Network Analysis Module (NAM).
Best doc detailing this is:
http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/141/UCSM_GUI_Configuration_Guide_141_chapter45.html
A similar setup is required for the Nexus 5500 which is detailed in the below link.
http://www.cisco.com/en/US/docs/switches/datacenter/nexus5000/sw/system_management/503_n1_1/b_cisco_n5k_system_mgmt_cg_rel_503_n1_1_chapter_01111.pdf
Regards
Colin
Hi Colin,
I have a question that seems very hard to find an answer to…
Is it possible to vmotion a virtual machine that is configured to use VM-FEX across UCS management domains?
Thanks
Tom
Hi Tom
If you mean vMotion between two different UCSM domains when using VM-FEX, I’m afraid not, as the ports which the VMs connect to are assigned by UCS Manager and need to remain consistent post the vMotion (hence why all policies and interface statistics are maintained).
You can however vMotion between any blades in any chassis within a Cisco UCS Domain if using VM-FEX even if using VM-FEX in VMDirectPath Mode.
If being able to vMotion between UCS Domains is a requirement you need, then go for Nexus 1000v or just use standard VMware vDS.
Regards
Colin
That’s cool, I thought as much, I couldn’t find that info anywhere, but it makes sense.
Looks like all my ESXi hosts that utilize VM-FEX need to stay in the same UCS management domain.
Thanks for your response
Tom
I am not able to see the blade WWNs on the MDS 9500. I can see the WWNs of the FC uplink ports, but the blade servers’ virtual WWNs assigned from the pool do not appear in the Cisco MDS FLOGI database.
Hi Faisal
There was an identical question to yours asked in the “About” page of this site. (One of the main reasons I created an “Ask a question” category was to keep all the questions in one place.) Anyway, have a look; I gave quite a detailed answer with all the required troubleshooting steps.
The first thing I would check is that all your vHBA’s have WWNN/WWPNs assigned either manually or from a pool, check this by expanding the vHBAs in the service profile.
Assuming they have their addresses check the vHBA’s are in the correct VSAN (if no default VSANs are being used)
If you are booting from SAN confirm your SAN Boot Policy has the correct WWPNs for the correct targets on the correct fabric.
(At this stage you can just use dummy WWPN of the targets, the vHBA’s will still flogi into the fabric)
If the above checks out, ssh into the FI’s directly, connect nxos, and check whether your WWPNs are in the flogi database of your FI’s with “show npv flogi-table” from NXOS mode.
If they are, confirm the correct WWPN is showing on the correct Fabric Interconnect. If that’s good, check your fibre channel uplinks between your FI’s and your SAN switches; make sure they are in the correct VSAN and are connected to the correct fabric. The FI uplinks need to be in or carrying the VSAN you put your vHBAs in.
If that’s good, check the flogi database of your SAN switches, and confirm the switches have NPIV enabled, that the correct VSANs have been created on them, and that the MDS ports that connect to the FI’s are in the correct VSAN (or carrying the correct VSANs if trunking VSANs).
Also turn off quiet boot in the default BIOS settings and have a look during boot as to whether your vHBAs are seeing the targets you expect.
That’s the order I would do it in, and you should rapidly work out where the issue is or isn’t.
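If you have a lot of vHBAs to check, the "is my WWPN in the flogi database" step can be done with a few lines of Python against a saved copy of the CLI output. This is only a sketch: it just looks for anything shaped like a WWPN in the text rather than parsing any particular command format, and the sample capture and WWPN values below are made up.

```python
import re

# Eight colon-separated hex bytes, e.g. 20:00:00:25:b5:00:00:0a
WWPN_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}\b")


def wwpns_in(text):
    """Return the set of WWPN-shaped strings found in text, lower-cased."""
    return {match.group(0).lower() for match in WWPN_RE.finditer(text)}


def missing_wwpns(expected, captured_text):
    """Return the expected WWPNs that do not appear in the captured output."""
    seen = wwpns_in(captured_text)
    return sorted(w.lower() for w in expected if w.lower() not in seen)


if __name__ == "__main__":
    # Hypothetical capture pasted from a switch session (not real output).
    capture = """
    vfc705  1  0xe81a08  20:00:00:25:B5:00:00:0A  20:00:00:25:B5:00:00:0F
    """
    expected = ["20:00:00:25:b5:00:00:0a", "20:00:00:25:b5:00:00:0b"]
    print("Not logged in:", missing_wwpns(expected, capture))
```

Run it once against the FI output and once against the SAN switch output; the first place a WWPN goes missing tells you which step to look at.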
Come back to me if you still have issues.
Regards
Colin
Hi Colin,
I didn’t get the answer to my question. I have created SP’s, and WWNs are now assigned to the blade servers. I have configured 2 ports, 31 & 32, on the FI as FC uplink ports. Now on the MDS 9500 I can see the 2 WWNs of ports 31 & 32, but can’t see the WWNs of the blade servers which were delivered from the WWN pool. Is there some further step required?
Hi Faisal
Completely understand what you have done and where you are, and have seen your issue many times. It will be that you have not created the VSAN on the MDS (if using any VSAN other than 1) or have not put your MDS ports in the same VSAN that your vHBA’s are in.
If you go through the troubleshooting steps in my earlier response you will find and rectify your issue.
Also confirm you have NPIV enabled on your MDS switches.
switch(config)# npiv enable
Regards
Colin
Hi Colin,
In MDS they are using VSAN 1, which is the default, and I am using the default VSAN as well in the SP. On the MDS they are using NPIV mode, so the FC uplink ports are up on the FI. But still I am not able to see the blade servers’ WWNs on the MDS, while I can see the FI physical port WWNs on the MDS. Do I need to do any connectivity between the FI FC uplink ports (31 & 32) and the blade FC ports?
The SAN team is saying to enable NPIV mode on the FI as well, but I can’t see whether NPIV is enabled or not.
Hi Faisal
OK, if you’re using the default VSAN on the FI’s and VSAN 1 on the MDS that should be fine.
You would know if NPIV was not enabled on the SAN switches, as you get a very helpful message in UCS Manager that NPIV is not enabled on the upstream SAN switch (and the FC FI uplink will not come up).
Your issue most likely is then that your servers are not trying to flogi into the fabric.
Are you booting from SAN? Have you created your boot policy to boot off your vHBA’s? (The interface name must match in the boot policy and SP, i.e. fc0/fc1.)
Have you put some WWPN’s in as boot targets to initiate the flogi? Have you turned quiet boot off so you can watch the vHBA’s initialise and discover the targets? Have you checked that you can see your WWPN’s in the flogi database of your FI’s?
Again, I can’t think of another way to word this, but if you step through the steps in my first response you should quickly resolve your issue.
Regards
Colin
I have several fc-node ‘named policy unresolved’ faults. I have no vHBAs configured but do have WWPN/WWNN default pools configured. 2.04a code. Any ideas how to clear? thanks.
Fred
Hi Fred
That error usually means you are referencing a pool in a policy that is not / no longer there.
In my experience this is usually a default pool that has been deleted but is still being referenced in a Service Profile somewhere.
To be sure have a look at the fault description and code
e.g.
affected object (This will be the pool in question)
code: F4525239
description: Policy reference identPoolName does not resolve to named policy
Have also seen this after an upgrade or if your pools are in child Orgs other than root.
As a test, try re-selecting the pool in question under your affected SP/SP Templates and see if that cures it.
If not, try creating a pool of the same name under root if it’s not already there.
If all else fails give TAC a call.
Good Luck
Colin
You nailed it. Created a ‘node-default’ pool (empty) under WWNN. Faults cleared. Thanks for your help.
Hi Colin,
Which service profile settings require a reboot of a blade?
Where do I find a full list of the same?
Hi
There is no list (as far as I am aware), and there are far too many to list here.
But essentially, if it’s a change that modifies the hardware, i.e. BIOS, adding a vNIC/vHBA etc., then a planned reboot would be required.
If it is a soft change, like adding/removing a VLAN to a vNIC etc., then no reboot is required.
Why not start a list and share it with the Cisco UCS Community?
Regards
Colin
We are thinking about purchasing a new UCS infrastructure but have had mixed reviews around reliability and bugs. Please can you let me know if the system is reliable enough for a highly available production environment?
Hi Barry
Thanks for the comment. I don’t really hear these sorts of concerns these days (2 years ago was a totally different matter). Now most customers know the many benefits that Cisco UCS gives them, even if they have not adopted the technology yet.
The “endorsement”, if you will, that Cisco UCS is certainly stable in production environments comes from the customer base I have been deploying it into. 2 years ago it was all the trendy, early adopter type clients, but in the last 12 months I have designed and deployed Cisco UCS in extremely risk-averse financial institutions and major banks.
I would certainly recommend you try and get a PoC set up and have a play and see for yourself. The main thing I would suggest is that you have a UCS engineer give you a good overview of the tech and walk you through the setup. While not complicated, it is a different mindset to what you are likely used to.
UCSguru.com is always here to give you impartial advice and assistance.
Good luck and have fun on your journey to convergence.
Regards
Colin
Hi,
I am doing boot from SAN. The UCS B200 M3 blade started and I installed VMware 5.1. After completing the installation it rebooted, and after scanning the SAN disk it shows a black screen with a blinking cursor. The VMware installation is complete and I can see the Symmetrix DMX3 disks as well, but I have the same problem on all my servers. What could be the reason?
Hi, I am still not able to boot from SAN. I see the following on the FI:
adapter 1/2/1 (fls):3# lunlist
vnic : 16 lifid: 5
– FLOGI State : flogi est (fc_id 0xe81a08)
– PLOGI Sessions
– WWNN 50:06:04:8c:52:a6:69:49 WWPN 50:06:04:8c:52:a6:69:49 fc_id 0x34002b
– LUN’s configured (SCSI Type, Version, Vendor, Serial No.)
LUN ID : 0x0000000000000000 (0x0, 0x4, EMC , 101669060000)
– REPORT LUNs Query Response
LUN ID : 0x0000000000000000
LUN ID : 0x0031000000000000
– Nameserver Query Response
– WWPN : 50:06:04:8c:52:a6:69:49
Is this correct? I can install ESXi 5.1 but can’t boot it. Urgent support required.
Hi Faisal
As you need urgent support, the best thing to do is open a Service Request with Cisco.
Regards
Colin
Is there a way from the UCS side to tell if the upstream VLANs are configured on the port-channel trunks, say coming from a Nexus 5k?
Hi Allen
Not as such. There are things you can do to help test it, i.e. put a VM into that VLAN and see if you can ping the default gateway on or beyond the upstream switch. If you’re sure you are carrying the VLAN on the vNIC and the FI uplink and there is no connectivity, chances are the VLAN is not enabled on the upstream switch port(s).
But to be sure you would have to check the upstream switch.
Regards
Colin
We are trying to benefit from Microsoft Server 2012 Hyper-V 3.0 “Virtual Fibre Channel adapters”.
When we wanted to build a HA-cluster within VMs we needed to use iSCSI for this, but I assume we should be able to use FC as well now.
UCS should be the ideal solution for this and the Cisco VIC FCoE Storport drivers version 2.2.0.9 seemed to support this.
Unfortunately that gave me NPIV errors, and from what I understood I needed to wait for the official UCS 2.1 release.
It’s there and it’s up and running, but using the latest drivers (2.2.0.17) it says “the device or driver does not support virtual Fibre Channel”.
We’re using a NetApp SAN directly connected to our Interconnects using FC (switch mode).
Is it possible (and if so, how) to benefit from these Virtual Fibre Channel adapters?
Hi Peter
Thanks for the question,
I have also been in the situation of requiring vHBA passthrough from a Cisco VIC to a VM (tape library access etc.), and you are right, this was not supported and did not work, for example when trying to use VMware’s VMDirectPath I/O on a vHBA.
What is needed is either a VM-FEX type setup for vHBA’s or supporting passthru of the vHBA of the host.
I know that SR-IOV enhancements in the Hypervisors should make this possible, but have not had any time to play around with this as yet.
I don’t have a huge amount of experience on the Hyper-V side (but am getting more interest in it these days) Rather than me lab up your environment, I would suggest you talk to Cisco TAC.
Sorry not to be able to advise more on this one.
Regards
Colin
Hi Colin,
Greetings of the day!
I am here with another simple, yet difficult question because I am not sure of the correct answer (or maybe I am!)
I would try to explain that using an example.
Step 1 – Let’s say I create a Service Profile Template by the name Oracle_RAC_Template. Within this service profile template I define a boot policy for SAN boot which has primary/secondary boot target WWPN’s & a boot LUN (‘0’ most of the time).
Step 2 – I create 5 Service Profiles using the above template Oracle_RAC1 through Oracle_RAC5.
Now my question is, since all the 5 service profiles are created using the same template, all of them would have the same primary/secondary boot target WWPN & the same boot LUN for SAN boot. Which in turn means they boot from the same place in the SAN storage box, which does not make sense as each physical server should see a separate boot area on the SAN box.
So, as I see there are 2 options here as follows:
1. I manually change the primary/secondary boot targets for all 5 service profiles which I created out of the same template initially, which would kind of defy the very purpose of using templates for rapid deployment of SP’s.
2. I do some magic on my storage side with zoning & masking which allows all of the 5 service profiles to use the same primary/secondary boot targets & the same LUN number, but, still depending on the source WWPN take it to the correct SAN boot area.
I am kind of inclined more towards option 2 being correct (maybe because of my SAN ignorance), but I would love to be corrected!
Thanks again for spending so much time in reading & answering all the questions.
-Regards,
-Tarun
Hi Colin,
Looks like you missed this one
-Regards,
-Tarun
Hi Tarun
Thanks for picking me up on missing this one, in my defence I was on Holiday as well as it being the day before my Birthday
OK
So Your Step 2 is not quite right. You are right that if you had a common ‘Boot from SAN’ policy for all of your service profiles they would try the WWPN of the targets in the same order and potentially all hit the same WWPN of the array at boot time.
The boot LUN, as you say, should indeed be 0; however, this does not mean that all LUN 0’s on the target WWPN are the same LUN. You will have a separate small boot LUN for each host. These LUNs will have a unique Array Logical Unit (ALU) but then also have a Host Logical Unit (HLU) which you can assign, in this case 0 for all boot LUNs. So what you end up with is many LUNs, all with unique ALU’s from the array’s point of view, but in the case of boot LUNs all presented to the hosts as LUN 0. So how do we ensure the host boots off its correct LUN? That’s where masking on the array comes in.
We use Zoning on the SAN Switches to only allow the Host HBA WWPN’s access to the Target WWPN’s we want but as I have shown above the Host would potentially see all of the LUNs on the array, even the ones that are not relevant to that host.
So to prevent this we “mask” at the array, which basically means creating a group (usually named after the server’s hostname) and then putting both the LUN and the WWPN of the host (initiator) in this group. That way each host can only see the LUNs that have been masked to it. In networking terms you can think of a zone like a VLAN and a mask like an access control list (ACL).
So Host A booting from LUN0, will be booting from a completely different LUN than Host B booting from LUN 0 even if using the same Target WWPN (Like in a Boot Policy)
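To make the ALU/HLU idea concrete, here is a toy Python model of masking groups. It is a sketch only: the class, the WWPNs and the ALU numbers are invented for illustration and do not correspond to any array vendor’s API.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroup:
    """A masking group: which initiators may see which LUNs, and as what number."""
    name: str
    initiators: set = field(default_factory=set)  # host (initiator) WWPNs
    luns: dict = field(default_factory=dict)      # HLU seen by host -> ALU on array


def resolve(groups, initiator_wwpn, hlu):
    """Return the array-side ALU behind the HLU this initiator asks for, or None."""
    for group in groups:
        if initiator_wwpn in group.initiators:
            return group.luns.get(hlu)
    return None  # initiator is not masked to anything


# Hypothetical hosts: both boot from "LUN 0", but each has its own ALU.
GROUPS = [
    StorageGroup("Oracle_RAC1", {"20:00:00:25:b5:00:00:0a"}, {0: 101}),
    StorageGroup("Oracle_RAC2", {"20:00:00:25:b5:00:00:0b"}, {0: 102}),
]

if __name__ == "__main__":
    print(resolve(GROUPS, "20:00:00:25:b5:00:00:0a", 0))  # ALU for RAC1's LUN 0
    print(resolve(GROUPS, "20:00:00:25:b5:00:00:0b", 0))  # ALU for RAC2's LUN 0
```

Both hosts can share one boot policy (same target WWPN, same LUN 0) because the array decides which ALU sits behind LUN 0 based on the initiator WWPN.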
OK, All that said, in larger environments you may want to spread the server boot I/O load across multiple Array WWPN’s and the only way to do this is to use multiple boot policies with the target WWPN’s in different orders. (I generally don’t as I prefer a single boot policy and most servers do not tend to get rebooted that often)
The scenario when I would definitely consider this is in VDI environments, when all virtual desktops for a company get booted at 09:00 in the morning. This could potentially cause a “Boot Storm”: intense, concentrated storage I/O that can easily overwhelm a storage subsystem.
This could make the brand new VDI Solution you just installed slow and potentially unusable, not a good place to be.
A typical desktop VM running Windows 7 will generate from 50-100 IOPS while it is booting, but this drops to about 5-10 once booted and running normal workloads.
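A quick back-of-the-envelope calculation in Python shows why that matters. The per-desktop figures are the ones quoted above; the estate of 500 desktops is an assumption picked only for illustration.

```python
def iops_range(desktops, per_desktop_low, per_desktop_high):
    """Return the (low, high) aggregate IOPS if every desktop is in the same state."""
    return desktops * per_desktop_low, desktops * per_desktop_high


DESKTOPS = 500  # hypothetical estate size

if __name__ == "__main__":
    boot_low, boot_high = iops_range(DESKTOPS, 50, 100)
    run_low, run_high = iops_range(DESKTOPS, 5, 10)
    print("All booting at once:", boot_low, "to", boot_high, "IOPS")
    print("Steady state:       ", run_low, "to", run_high, "IOPS")
```

In other words, a storage design sized only for steady state is roughly ten times too small for the moment everyone boots together.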
There are numerous design options to prevent issues from Boot Storms, with the most common being intelligent use of SSD or Flash drives for the boot image(s).
Once again I’ve gone a bit off topic but all good relevant info.
Regards
Colin
Thanks a ton Colin! As usual amazing depth.
-Regards,
-Tarun
Hi Colin,
I have a question regarding connecting a standalone server (for example C series server or any other non Cisco server) directly to Fabric Interconnects. Is this setup supported?
Can I connect the C Series rack server redundantly to Fabric Interconnects without FEX? I understand that the UCS Manager would not be able to manage devices in that scenario.
Thanks for your time.
Best Regards,
Adi
Hi Adi
The quick answer to your question would be No, the correct way of attaching Cisco UCS C Series servers would be via the 2232PP FEX as I’m sure you are aware.
Prior to version 2.0.2 you could connect the C Series data ports directly to the FI’s, with management via a pair of 2248 FEXs, but with version 2.0.2 and above this is no longer supported and only 2232PP FEXs are supported.
Re Connecting 3rd Party Servers directly to the FI’s
You could put your FI’s into switch mode and connect servers directly to the FI’s, but this would not be an overwhelming reason to put the FI’s into switch mode in my opinion; far better to just buy an upstream LAN switch.
You could leave the FI’s in End Host Mode and connect a server into an appliance port (designed for NAS direct attach), but again this is not ideal, as you cannot prune VLANs off an appliance port, so the server would receive lots of unwanted broadcast traffic from other VLANs. While it would work (appliance ports learn MAC addresses), I wouldn’t recommend it, and that setup is likely not supported.
In short, I wouldn’t recommend directly attaching any servers to the FI’s, at least in a production environment.
Regards
Colin
Thank you for the detailed reply. I appreciate it.
Regards,
Adi
Hi, Colin,
Do you know if it’s possible to connect a 5108 (w/2208XP) to 2232PPs hanging off some 6296 FIs ? Or do they _have_ to connect directly to the 6296s ? I’m guessing this sort of FE “stacking” can’t be done, but I thought I’d double check.
Cheers
Also, is it possible to have C-Series UCSM integration with _only_ 1GbE available ? Eg: 2248s hanging off 6248s, but then only 1GbE copper to the servers themselves ?
Hi
The 2248’s are no longer supported as from UCSM version 2.0(2x), so if you have any you will either need to replace them with 2232PP’s or not upgrade to UCSM 2.0(2) or beyond. If you do upgrade UCSM to 2.0(2) or beyond, your 2248’s will no longer be recognised.
All supported options for C Series integration require 10Gbs connections for the data path. If you only have a C Series with 1Gbs ports, integration with UCS Manager is not an option; you need a 10Gbs expansion card (P81E / VIC1225 etc.).
You can of course use a 1Gbs C Series in standalone mode.
All supported C Series integration options can be found here
Regards
Colin
Hi, thanks for the question, and you are right, you cannot “daisy chain” Nexus 2000’s (the N2232 and UCS IOM are both Nexus 2000’s in essence).
The FEX standard (802.1BR, Bridge Port Extension) does however allow for hierarchical port extenders, à la VM-FEX and Adapter FEX, in which the Nexus 2000 (the IOM in the case of UCS) acts like a passthrough between the UCS mezzanine card, which in itself is a FEX (Port Extender), and the Fabric Interconnect, which is the Controlling Bridge.
Regards
Colin
Thanks, that sort of answers the question but raises another.
Basically, would it be possible to connect 6296 -> 2232PP -> 2208/5108, or does it have to be 6296 -> 2208/5108 ?
Cheers
Hi
The only supported options for the components you mention are:
6296 -> 2208/5108
6296 -> 2232PP -> C Series (single wire option with UCSM 2.1)
Regards
Colin
Can you help me understand why, when using SAN switches, each Fabric Interconnect (FI) must be connected in the following manner:
FI-A –> SANSwitchA
FI-B –> SANSwitchB
(each SAN switch has a connection to the “A” and “B” side of the storage array)
Why can’t I hook FI-A to SANSwitchA and B, so that a single switch failure doesn’t take out all storage connections through a single FI? I understand that you can design the service profiles so that traffic goes out each FI, but don’t understand why Cisco doesn’t want you to do this or why you can’t use the same VSAN on both FIs.
Thanks.
-Bill
Hi Bill
Thanks for the great question, and one I find myself explaining to customers a lot.
Historically Storage Networks have always been 2 separate networks, SAN A and SAN B, this as I’m sure you are aware is to provide two distinctly separate paths for the storage traffic in order to provide full resiliency.
These two separate networks obviously provide multiple paths between the Host (Initiator / Server) and the Array containing the logical disk (Target), and all these paths can be intelligently used by multipathing aware drivers (MPIO, EMC PowerPath etc..)
OK so back to your question about why don’t we dual attach the Fabric Interconnects to the SAN Switches like we do the Array Controllers.
Well as you may know in the default N Port virtualization (NPV) mode the Fabric Interconnects act like an Initiator (N Port) as far as the upstream SAN Switches are concerned.
So as with any initiator it is best practice to have one HBA to SAN A and a Separate HBA to SAN B, so you can think of the Fabric Interconnects as HBA’s. So that answers your first question.
Your second question was around why, if the point is to have two completely separate SAN networks, the array controllers are often dual attached to both fabrics.
This actually provides several benefits the main one being around fail over.
As you may know whenever you create a LUN on an array, that LUN is active on a single Storage Processor (SP), if the active SP ever failed then the secondary SP would take ownership of that LUN.
However imagine the scenario in the above diagram if Fabric Interconnect A or SAN Switch A failed. If the Blade was using a LUN which was being owned by SP A and SP A was only attached to SAN Switch A, the server would now be isolated from its LUN, as all paths to SP A would be broken.
Now depending on what failed and what array you have, this at best would cause the LUN to fail over to the other SP, which, while it works well, is a “moving part” in the failover process that should be avoided as a best practice.
Hence the best practice to dual attach the SP’s in the array to both fabrics, as this enables the host to have a path to its LUNs across both fabrics regardless of the SP that owns them.
I mentioned this also has other benefits; the one I capitalise on most is when performing a migration of a host from one SAN to another or even a SAN replacement.
For example a few weeks back I moved a client from a Brocade SAN to a Nexus SAN and had to move all their Servers and Arrays to the new Nexus Fabric with no downtime.
This was a fairly easy task as all the Arrays (VNX and Clariion in this case) were dual attached to both fabrics. So it was a case of, install the Nexus Fabric, configure all the aliasing and zoning etc..
Then move each host and array off the Brocade SAN A and onto the Nexus SAN A. This obviously breaks SAN A, but as long as all paths across both fabrics were active this will not cause the Host to lose connectivity to its LUNs (regardless of which SP owns them).
Then once done confirm all is well and all paths are back up and then do the same for SAN B.
Appreciate we got a bit off topic, but I thought it was a good opportunity to provide some context and examples around the topic for other readers.
Regards
Colin
I would like to clarify on this answer a bit more off your diagram above.
We are running 2.1.1a using FCoE. Our SAN shows “Partially Connected” after the migration from directly connected to using a SAN Switch. We do not use VSAN 1 and do not have it configured. We use VSAN 11 for Fabric A and VSAN 12 for Fabric B. Our questions are as follows (any help would be greatly appreciated):
VSAN Configuration on the SAN Switches
1. Should each Fabric -> SAN Switch follow a separate VSAN? i.e. Fab A (VSAN 11) to SAN Switch A (VSAN 11), and Fab B (VSAN 12) to SAN Switch B (VSAN 12).
VSAN Configuration inside UCSM
2. Should we just have 2 vHBAs (vHBA 1 configured for Fabric A VSAN 11 (Cloud) and vHBA 2 configured for Fabric B VSAN 12 (Cloud))?
Hi Todd
The answers to your questions are as follows:
1) Yes, each Fabric should have a unique VSAN ID or IDs (if multiple VSANs per fabric), and if using FCoE only trunk the FCoE VLANs to the correct Fabric. i.e. in your example, if you use VLAN 3011 for VSAN 11 (Fabric A) and VLAN 3012 for VSAN 12 (Fabric B), then VLAN 3011 should only be mapped to the direct link or port-channel between FI A and FCoE Switch A, and VLAN 3012 should only be mapped to the direct link or port-channel between FI B and FCoE Switch B.
2) Yes, that’s correct. I only tend to add additional vHBAs when using multiple VSANs within a single Fabric (rather than relying on upstream Inter VSAN Routing (IVR)).
Regards
Colin
Hi Colin ,
Does the same apply when using direct attached SAN? Is it still best practice to connect SPA/SPB to Fabric Interconnect A and B?
Hi Craig
My view would be yes, as the FIs then become the FC switches, and if you lost an FI your host would still have a path to both SPs. So no failover of the SPs within the array would be required.
Colin
Hello Colin,
I know I have asked you this question earlier, but it would be great if you could bear with me one more time. I would like to clear my doubts about Discrete Pinning Mode and Port-Channel Mode.
Here is a scenario, we have a mezzanine card for our blades, two 5108 Chassis, two 6248UP FIC, and 5 blades. First three blades are in Slot 1,2 and 3 respectively in Chassis 1 and Blades 4 and 5 on slots 1&2 on Chassis 2.
I have 4 ports going from each Chassis to the FI’s. (2 connections from FEX A to FI A and 2 connections from FEX B to FI B. This is done on each of the Chassis)
So how does the connectivity from Blade to FEX to FI work?
Regards
Dhruva
Hello Colin, I just wanted to understand how the port channelling can be done, that’s it… In discrete mode I understand the whole connectivity when we have all the blades connected and all the FEXs connected. I just wanted to understand the connectivity in the scenario I described above in Discrete Mode.
For Port Channel Mode, I guess that would completely depend on the configuration that we do.
Hi Dhruva
I generally use port-channel mode in most cases, as the VIC 1240 / 1280 provide 20 and 40Gbps per fabric respectively. If you left your FIs in discrete pinning mode, each blade would only have access to the 10Gbps link it was pinned to, which as you can see would be an immediate bottleneck.
Hence port-channel all the IO Module to FI links to give any server access to the full bandwidth of the port-channel (80Gbps if using the 2208XP), which eliminates this potential bottleneck.
The only time I leave an FI in Discrete Pinning mode when using a Gen 2 VIC is where a customer wants deterministic and consistent failover.
i.e. if a link fails in a port-channel no failover occurs. In fact, in the highly unlikely event that you lost 7 of your 8 IOM to FI links, still no failover would occur as the port-channel would still be up. So you would now have all of your vNICs mapped to that fabric contending for the single remaining 10Gbps link.
If the FI was left in Discrete Pinning mode then failover would occur in the event of a single link failure (for vNICs with fabric failover enabled, on the servers pinned to that particular failed link). This would provide a more deterministic failure pattern. But I would still prefer going down the port-channel route unless this deterministic failover was paramount to a customer’s requirements, as otherwise you would never get the full use of the additional bandwidth that the Gen 2 VICs are capable of, due to the server only having access to the single 10Gbps link.
Regards
Colin
Hi Dhruva
If you are using Discrete Pinning Mode (Default) with 2 Links per IO Module all odd Blade slots (1,3,5,7) will pin to Link 1 and all even Blade Slots (2,4,6,8) will pin to Link 2.
Then it’s just a case of which Fabric you have mapped each vNIC to, to see which FI they are using.
You can also get all this info from having a look at the VIF paths under each service profile. (see one of my previous posts “Understanding VIF Paths”)
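As a rough illustration of the pinning rule above, here is a minimal Python sketch. The function name `pinned_link` is made up for this example, and the round-robin formula is an assumption that simply reproduces the odd/even split described for 2 links; the VIF paths in UCSM remain the authoritative view of what is pinned where.

```python
def pinned_link(slot: int, links: int = 2) -> int:
    """Return the IOM-to-FI link a blade slot pins to in discrete pinning mode.

    Assumption: slots are spread round-robin across the links, which gives
    odd slots -> link 1 and even slots -> link 2 when there are 2 links.
    """
    if not 1 <= slot <= 8:
        raise ValueError("a 5108 chassis has blade slots 1 to 8")
    if links < 1:
        raise ValueError("at least one link is required")
    return (slot - 1) % links + 1


# Chassis 1 in the question has blades in slots 1, 2 and 3, with 2 links per IOM.
for slot in (1, 2, 3):
    print(f"slot {slot} -> link {pinned_link(slot)}")
```

The same mapping applies on each fabric, so the fabric a vNIC is mapped to decides which FI the pinned link leads to.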
Hope that clears this up for you.
Regards
Colin
Thanks a lot Colin..
I think the following link has explained the whole thing to me clearly… Thanks Colin… I didn’t want you to spend too much time on the question I asked.
http://www.definethecloud.net/tag/ucs
Thanks & Regards
Dhruva S Kolli
Glad Joe cleared that up for you.
Regards
Colin
Hey Colin-
What is the max number of VM’s I should be able to run via VM-FEX on each ESX host assuming the following:
6296 FI’s
4x uplinks per 2208XP IOM in port channel
5x B200 M3 running vSphere 5.1
All gear running UCS 2.1 (1a)
I have everything setup and running great in high-performance mode with up to 23-24 VM’s per blade, however, once I add an additional VM to the blade it’s not reachable. It seems I’m incurring a limitation somewhere. I’ve tried disabling high-performance mode on the port profiles, but that didn’t work either. Currently open TAC case, but haven’t gotten to the bottom of it yet
I guess I should also mention my vNIC configuration:
2x vNIC for management
2x vNIC for NFS storage
2x vNIC for UCS DVS uplink
50x dynamic vNIC
Hi Jeff
Assuming you are using the VIC 1280, these can support 256 virtual interfaces. The 6200 FIs support 63 Virtual Interfaces per downlink, so your setup with 4 downlinks would support 252 Virtual Interfaces on your VIC 1280. Subtracting the number of static Virtual Interfaces you have (6) leaves you a maximum of 246 Dynamic Virtual Interfaces available for VM-FEX.
So you should be fine with your 50x Dynamic vNICS.
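The on-paper sum can be sketched in a few lines of Python. The constants are just the figures quoted in this answer, and the function name is invented for the example; treat it as a back-of-envelope check, not a statement of what any particular firmware release enforces.

```python
VIC_1280_MAX_VIFS = 256   # adapter limit quoted above
VIFS_PER_DOWNLINK = 63    # per IOM-to-FI link on the 6200 FIs, as quoted above


def dynamic_vnics_available(downlinks: int, static_vnics: int) -> int:
    """Dynamic vNICs left for VM-FEX once the static vNICs are subtracted."""
    total = min(VIC_1280_MAX_VIFS, VIFS_PER_DOWNLINK * downlinks)
    return max(total - static_vnics, 0)


# 4 downlinks per IOM and 6 static vNICs (2 management, 2 NFS, 2 DVS uplink).
available = dynamic_vnics_available(downlinks=4, static_vnics=6)
print(available, available >= 50)
```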
But I’m sure you are aware of all this and how it works on paper.
But as we all know, what’s fine in theory can be quite different in practice; that’s where things like this blog and the Cisco Support Community really add value.
So not sure what would be causing your issue. But I would be very interested to find out.
So once TAC have hopefully worked with you to resolve the issue, please post the solution back to this thread.
Good Luck
Regards
Colin
Hi Colin-
Just got off the phone with TAC and have a resolution. The problem in my configuration was as follows – I hadn’t configured any UCS DVS static vNIC uplinks on my second physical adapter in my blades. Basically, the 2x vNIC for UCS DVS uplink that I had configured were both assigned to adapter 1 (VIC 1240 in the B200 M3). Since my dynamic vNIC policy had created about 25 dynamic vNICs on each adapter 1 and adapter 2, once my VMs had utilized all 25 on the first adapter, VMs couldn’t connect to the dynamic vNICs on the second adapter. Kind of hard to explain… All I had to do to fix this was create 2 additional static vNICs for DVS uplinks and assign them to be on adapter 2 (VIC 1280 in the B200 M3). Then, on the ESX host configuration, add all 4 static vNICs as uplinks. This allowed me to utilize the dynamic vNICs that were created on both adapter 1 and adapter 2. Whew.
Thanks,
-Jeff
Great news Jeff
And a great real world gotcha for everyone to be aware of.
Makes sense when you think about it as they are separate adapters. I doubt you would have had the issue if you had used the Port Expander with the VIC 1240, as all of your Dynamic vNICs would be on your VIC 1240.
Makes more sense when you look at the diagrams below.
In some respect I’m really envious of the guys in TAC and Cisco, all the great people and info they have at their disposal
I’d be like a “Kid in a Candy Store” there
Regards
Colin
Hi Colin,
I want to ask something. Usually the connectivity between the chassis and the FIs is like this: FEX slot 1 (left) to FI A, FEX slot 2 (right) to FI B.
What if FEX slot 1 (left, seen from the rear of the chassis) is connected to FI B and FEX slot 2 (right) is connected to FI A with a 4 uplink port channel? Will it cause any problems?
Waiting for your answer soon.
Thanks and have a good day.
Hi ZT
The short answer to your question is Yes it would work.
Nothing at all stops you from having your FEX A in the IOM 2 (Middle) position and FEX B in IOM Slot 1.
My question would be why would you want to do that? There is a huge amount to be said for best practice, and the best practice is to have FI A connected to FEX 1 (the left of the chassis from the back) and FI B to FEX 2 (middle slot).
It would wind me up if I was called in to troubleshoot an issue and had to spend time deciphering the topology before I got stuck in.
This job can be tough enough as it is, without us adding to it voluntarily,
so let’s all try and keep to the de facto standards and best practices.
I have attached a quick cheat sheet for you below.
Regards
Colin
Hello Colin,
Back with another simple question I guess… Well it is regarding the Management Port on the FI’s.
While configuring the FI’s we give a cluster Ip as well as an Individual IP to each of the FI’s? Why do we need these management ports? Is it for a fall back? And did we have any such situations of the cluster IP failing once configured?
So to access this mgmt port, do we need to have it connected to the Layer 3 switch directly? In case we do not want to connect it directly to a Layer 3 switch, we need to have a Layer 2 switch that connects to Layer 3. Here again there is more usage of equipment, right?
So the mngmt port?
Hi Dhruva
The Management ports on the FIs are the physical ports in the Management VRF to which you assign the physical IPs of the FIs’ MGMT0 interfaces. The same as on the Nexus switches, if that helps.
If you did not connect the MGMT interfaces you would not be able to manage or even PING the FIs.
These are Layer 2 interfaces and as such should be connected to a Layer 2 port (so either any port on a Layer 2 switch or a Layer 2 port (switchport) on a Layer 3 switch). This is because the MGMT0 port on an FI obviously has to be in the same VLAN as the MGMT0 port of the other FI.
The cluster address is shared between them and is owned by whichever FI is acting as the primary FI.
Also remember that your KVM access to all the CIMCs in the blades goes through these MGMT0 ports, so also remember to have a UCS Management subnet large enough for your 3 FI addresses (one per FI plus the cluster address) and then enough for each blade (Mandatory) and each Service Profile (Optional).
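To put numbers on that subnet sizing, here is a small Python sketch. The function names and the example subnet are made up for illustration, and the count assumes the 3 FI addresses are one per FI plus the shared cluster address; it does not allow for a default gateway or any other devices sharing the subnet.

```python
import ipaddress


def mgmt_ips_needed(blades: int, service_profiles: int = 0) -> int:
    """Addresses needed: 3 for the FIs, one per blade, one per service profile."""
    fi_addresses = 3  # assumed to mean FI A, FI B and the cluster address
    return fi_addresses + blades + service_profiles


def subnet_fits(cidr: str, needed: int) -> bool:
    """Check whether an IPv4 subnet has enough usable host addresses."""
    net = ipaddress.ip_network(cidr)
    usable = net.num_addresses - 2  # minus network and broadcast addresses
    return usable >= needed


# Example: 5 blades, each also given a service profile management address.
needed = mgmt_ips_needed(blades=5, service_profiles=5)
print(needed, subnet_fits("192.168.10.0/28", needed))
```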
Re your question about failure of cluster IP addresses: the IP addresses themselves do not fail, but in the 100 or so Cisco UCS upgrades I have done I have had 4 occasions when unplanned outages occurred. These were outages to UCS Manager during the UCS Manager upgrade, NOT to the forwarding of traffic, which would be more serious. This is one of the reasons I always do UCS upgrades (or any upgrades for that matter) within a planned outage window, even if no outage is anticipated, for as we all know sometimes Sh*t just happens.
Regards
Colin
Thanks a lot Colin. Another question I have is about the planning of vNICs on ESX. We have got two FIs, two 3750 switches, two MDS switches and two chassis with 5 blades (3 blades in one and two blades in the other). We do not use a Nexus switch in our environment.
Here comes the question of deciding how many vNICs and vHBAs we need on every Blade. Please guide me on this, and I also request you to share the prerequisites to consider when designing this.
Hope your 3750s are connected to your FIs via 10Gb, otherwise you may have a bit of a bottleneck even if using a port-channel of multiple 1Gb links.
Re: how to present your vNICs and vHBAs to your ESX Server, you have many options but it’s pretty much up to your preference.
Some like to mirror the ESX servers in their physical environment, for example 2 vNICs for Management, 2 for vMotion, 2 for VM Traffic etc.
Some like to simplify their environment as much as possible by just having 2 vNICs (one Fabric A and one Fabric B), teaming them and then just separating traffic by Port-Groups in a DvS or Nexus 1000v. (Cisco have a white paper on UCS with ESX and recommend this method.)
My preference is a bit of both. I like to use separate vNICs for Management and vMotion as it’s very easy to define QoS on a separate vNIC.
I use a single vNIC on a separate vSwitch with a guarantee of 1Gbps for vMotion. (I enable fabric failover on this vNIC; this means that vMotion traffic is always locally switched within the Fabric Interconnect.)
I use 2 x vNICs for Management on a separate vSwitch (1 Fabric A and 1 Fabric B, teamed), as I have had issues in the past of locking myself out of the management of my ESX server by making config errors on the N1kv. I use two because ESXi complains if it thinks you have no redundancy for your management links, and while these messages can be suppressed I just “Humour” ESX with 2 x vNICs.
Then I have a Teamed pair of vNICs (1 Fab A and 1 Fab B) as the uplinks for my DvS (Usually Nexus 1000v, especially now the essentials bundle is free).
I wrote a lot more about this in a previous response to a question by GS, so you may want to have a look through all the Q&A above and have a read.
Regards
Colin.
Hello Colin,
I keep getting the following error when configuring the boot from SAN option on a blade.
Affected Object: sys/chassis-1/blade-2/fabric-A/path-1/vc-875
Description: ether VIF 1 / 2 A-875 down, reason: Bound Physical Interface Down.
ID: 141549
Code: F0283
Original Severity: Major
I guess this is because of some configuration error on the MDS switch after zoning and masking. Please guide me on this. Let me know any inputs you would need to proceed further to explain.
Also what is vc-875 in this message.
Regards
Dhruva
Hello Colin,
I have a question regarding the SAN supported configuration on the UCS. We have a need to create Hyper-V Virtual Fibre Channel (http://technet.microsoft.com/en-us/library/hh831413.aspx) so guest mssql server could have a direct link to storage to share a cluster disk. We already have directly connected storage (UCSM version 2.1 running) connected to FIs in FC switching mode. I see that “feature npiv” is running but when we try to create a virtual san switch inside HyperV host we get an error that hba firmware does not support this feature. Is this supported on the UCS system. Mezzanine cards are VIC1240.
Thanks in advance
Best Regards,
Adi
Hi Adi,
From version 2.1.2 and up this should be supported. It was intended for version 2.1.1 but was removed at the last minute. 2.1.2 is scheduled for March 2013 if I’m correct.
Kind regards,
Peter
Thanks for picking that one up Peter!
Regards
Colin
Thank you
Regards,
Adi
Hello
I am confused over how much bandwidth each blade has and how much bandwidth go out on the FIs. I have 2, 10GB links per iom on each chassis. So that makes it 40GB? Also I have B200 M2 servers with the 1280 vic card. how much total bandwidth does each blade get then?
Also, after installing esxi, I see that each vnic has 40000. Where is the 40GB coming from?
And
Do you see any advantage of going with less or more vnics on esxi networking?
since I think the maximum vnics is 256.
should I go with something like this
2 vnics for mgmt
2 vnics for vmotion
2 vnics for vm data
2 vnics for iscsi
2 vnics for FT
2 vnics for MSCS clustering
or
2 vnics for mgmt and vmotion (use vlan tagging on port groups)
2 vnics for vmdata
2 vnics for iscsi
Is 1 vnic of 40GB and 8 vnics of 40GB each still going to be the same since any vnic traffic is going through that 1280vic of 40GB max?
kinda confused
thanks
Hi Tony
If you have a B200M2 with a VIC 1280 then each blade will have a potential of 80Gb (40Gb Fabric A and 40Gb Fabric B)
But you also need to consider your IO Module as there are a differing number of traces to each blade slot depending on the model.
So given your setup of a B200 M2 with a VIC 1280:
with a 2104XP IO Module (which has 1 x 10Gb trace per blade slot) you will get 20Gb total to your blade (10Gb A and 10Gb B).
If you use a 2204XP you’ll get double, as it has 2 x 10Gb traces per blade.
And if you use the 2208XP you will get to use the full bandwidth of your VIC1280 (40Gb Fabric A and 40Gb Fabric B), as it has 4 x 10Gb ports per blade which marry up with all 4 ports per fabric on the VIC1280, giving you your 80Gb total per blade.
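The three IO Module cases above boil down to a small lookup, sketched here in Python. The dictionary and function names are invented for the example, and the figures are only the trace counts quoted in this answer for a blade with a VIC 1280.

```python
# 10Gb traces from each IO Module model to each blade slot, as listed above.
TRACES_PER_SLOT = {"2104XP": 1, "2204XP": 2, "2208XP": 4}
TRACE_GB = 10
VIC_1280_GB_PER_FABRIC = 40


def blade_bandwidth_gb(iom_model: str) -> tuple[int, int]:
    """Return (Gb per fabric, Gb total across both fabrics) for one blade."""
    per_fabric = min(TRACES_PER_SLOT[iom_model] * TRACE_GB, VIC_1280_GB_PER_FABRIC)
    return per_fabric, per_fabric * 2


for model in TRACES_PER_SLOT:
    per_fabric, total = blade_bandwidth_gb(model)
    print(f"{model}: {per_fabric}Gb per fabric, {total}Gb total")
```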
I get asked about different bandwidth combinations a lot, so I think my next post will cover all the different combinations and what bandwidth you get.
Although Cisco do a great job covering all the different permutations for the B200M3 in the Network Connectivity section of the paper below:
http://www.cisco.com/en/US/prod/collateral/ps10265/ps10280/B200M3_SpecSheet.pdf
You also mention you have 2 x 10Gb connections from each IOM, which are obviously shared between all of your blades in that chassis, so you just have to work out your contention ratios. i.e. if you have 8 blades each with 40Gb of I/O per fabric (320Gb), with your 2 links you will have a 16:1 maximum contention ratio. Now that assumes that all of your servers are running at full line rate at the same time (never gonna happen). I just suggest you monitor your IOM to FI links and ensure they are not a bottleneck for you.
You should also consider your FI to Upstream LAN / SAN switches and work out your full end to end contention rates.
Example below. (I try and keep full end to end contention rate under 15:1)
I have covered “Best Practice” for how many vNICs to create on an ESX Host in this Q&A section a couple of times;
see the question from Dhruva above (December 16, 2012 at 10:49 am) or GS Khalsa August 9, 2012 at 5:49 pm.
And you are right, aggregated bandwidth is not a valid reason for creating multiple vNICs (unless you map them to different fabrics and load balance). The main reason for creating separate vNICs is to define different QoS or policies per vNIC, as well as to increase the separation of traffic by using VN-Tags rather than just VLAN tags (Port Group separation).
Hope that clarifies things for you.
Regards
Colin
Thank you for your response. Can you explain how you arrived at those ratios, 16:1 etc.?
I do have 2208XP IOMs on the chassis.
From the FI to the Nexus 7k, it’s 4 links per FI.
From IOM to FI it’s 2 links.
Thanks
Hi Tony
Ratios in diagram:
8x Blades with VIC1280 used with 2208XP = (8 x 40Gb) = 320Gb per fabric
2 x Links from IO Module to FI = 20Gb
This gives you a 16:1 Ratio or contention rate between the chassis and the FI.
The Contention Ratio just between the FI and the upstream LAN is worked out by dividing your downlinks by your uplinks. Normally you would have multiple chassis; however, for simplicity I have only shown and calculated based on a single chassis, hence the inverse ratio. There are 2 downlinks to the chassis and 6 uplinks to the LAN (1 downlink for every 3 uplinks), hence 1:3.
But as mentioned in the previous answer you need to consider your full end to end contention ratios i.e server to core. which is where the last contention ratio of 5.3:1 comes in.
Total potential Bandwidth south of FI = 320Gb in this case, divided by the Bandwidth between the FI and the LAN, in this case 60Gb so 320/60 = 5.3
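The arithmetic behind those three ratios can be written out as a short Python sketch. The variable names are made up for the example and the inputs are just the figures used in this answer (8 blades, 40Gb per blade per fabric, 2 IOM to FI links, 6 LAN uplinks, all 10Gb).

```python
LINK_GB = 10


def contention_ratio(downstream_gb: float, upstream_gb: float) -> float:
    """Potential bandwidth below a point divided by the bandwidth above it."""
    return downstream_gb / upstream_gb


blades = 8
per_blade_gb = 40      # VIC 1280 with a 2208XP, per fabric
iom_to_fi_links = 2
fi_to_lan_links = 6

server_gb = blades * per_blade_gb            # total potential bandwidth south of the FI
chassis_gb = iom_to_fi_links * LINK_GB       # IOM to FI bandwidth
lan_gb = fi_to_lan_links * LINK_GB           # FI to LAN bandwidth

print(f"Chassis to FI: {contention_ratio(server_gb, chassis_gb):.1f}:1")
print(f"FI downlinks to uplinks: 1:{lan_gb // chassis_gb}")
print(f"End to end: {contention_ratio(server_gb, lan_gb):.1f}:1")
```

Swapping in your own link counts (for example 4 uplinks per FI) gives the equivalent figures for a different topology.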
But as I’m sure you know, all servers pushing line rate 40Gb at the same time is not a likely scenario, added to the fact that not all traffic will be North/South. (A good design will try and keep a lot of the East/West Layer 2 traffic, like vMotion and traffic between servers, within the vSwitch / Nexus 1000v and within the Fabric Interconnect.)
But still, having a diagram like the above in your design not only shows you have considered this but can also sometimes blatantly point out a potential bottleneck.
Regards
Colin
So to be super clear in regards to multiple vNICs… Can you theoretically provision vNIC0 on Fabric A and vNIC1 on Fabric B (B200M3/2208/VIC1280/6200/4 or 8 links per FEX) and be able to take advantage of the full 80Gb per blade, or will you need to create more vNICs? Also, do you need to port channel your FEX links to the FIs? I hope this question makes sense.
Hi Frank
Yes, with the kit you list you will get up to 80Gb per blade. You will definitely need to port-channel your FEX to FI links, otherwise your blade’s bandwidth would be limited by the single 10Gb link it would be pinned to.
Regards Colin
Hi Colin,
Thanks for being a great resource! My question seems simple… (maybe not)
I have a UCS currently configured to connect to my access switches (4506s) over two 10 gig connections to move all my traffic back to the core. I’m trying to connect direct to the core (3750x stack) with 2 new 10g uplinks. Uplinks seem to be functional as the 6120s see the 3750x’s and vice versa. I’ve set an unused VLAN on the new uplinks, and it seems fine. The tricky part: when I add the unused VLAN to the trunk vNIC, it drops traffic over the old uplink for existing VLANs and does not appear to move it over the new uplink. If I add the unused VLAN to its own vNIC, and put it on a host, it works fine over the new link. Shouldn’t I be able to move VLANs from uplink to uplink, assuming the VLAN is available on the uplink? Am I missing an obvious way to do this with minimum downtime?
Jerry
Hi Jerry
I think I understand what you are trying to do, and you can do it. You must ensure you are using at least UCSM version 2.0 and ensure that within UCSM you map the correct VLANs to the correct uplinks.
If your links between the 4506s and the 3750 Core are Layer 3 then that will be causing the issues, as you require L2 adjacency if you are using the SAME VLAN IDs on the UCS FIs. It is less of an issue if you are trunking all the required VLANs between your 4506s and your Core 3750s.
If your links are L3 then you will have to take the VLANs off the uplink to the 4506s as you move them to the uplinks up to the 3750 Core.
Regards
Colin
Colin
Also can you explain this
>> And if you use the 2208XP you will get to use the full bandwidth of your VIC1280 (40Gb Fabric A and 40Gb Fabric B) as it has 4 x 10Gb Ports per blade which marry up with all 4 Ports on the VIC1280 giving you your 80Gb total per blade total.
I only have 2 physical links (10GB Each?) on each 2208XP IOM to the FI, where is the 40GB coming from?
So the VIC 1280 is 80GB max (40GB FAb A) (40GB FAB B). But how much I get going to the FI depends on how many ports I use from the IOM to the FI right?
So it’s 40GB going from the 1280 VIC to the IOM and then to the FI? How many physical links from the IOM to the FI do I need to get 40GB? I have 8 physical ports on the 2208XP but only 2 are used.
The IOM has 2 sides to it: the Network Interfaces (NIFs) and the Host Interfaces (HIFs). The 2208XP is just a Nexus 2232 in a different form factor (if you are familiar with the Nexus portfolio).
This means it has 8 uplinks to the Controlling Bridge (the Fabric Interconnect in our case) and 32 downlinks to the servers (4 x 10Gb ports to each of the 8 server slots).
The VIC1280 has 4 x 10Gb traces per fabric, hence with a VIC1280 the 4 x 10Gb traces all match up to the 4 x 10Gb HIFs on the 2208XP, giving you your 40Gb per server per fabric.
As you rightly point out, you have 2 x 10Gb links between your IOM and your FI (shared by all blades), so this will obviously be the limiting factor. Assuming you are running your IOM to FI links in port-channel mode, any server could push 20Gb of I/O but has the potential to run at 40Gb if you just added 2 more IOM to FI links.
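Put another way, the per-server ceiling on each fabric is the smaller of the HIF bandwidth and the IOM to FI bandwidth the server can reach. A minimal Python sketch of that reasoning follows; the function name and parameters are invented for the example, and it ignores sharing with the other blades in the chassis.

```python
def max_server_gb_per_fabric(iom_fi_links: int,
                             hif_gb: int = 40,
                             link_gb: int = 10,
                             port_channel: bool = True) -> int:
    """Upper bound on one server's bandwidth per fabric.

    In port-channel mode a server can use every IOM-to-FI link; in discrete
    pinning mode it only has the single link its slot is pinned to.
    """
    reachable_gb = iom_fi_links * link_gb if port_channel else link_gb
    return min(hif_gb, reachable_gb)


print(max_server_gb_per_fabric(2))                      # 2 links, port-channel
print(max_server_gb_per_fabric(4))                      # after adding 2 more links
print(max_server_gb_per_fabric(2, port_channel=False))  # discrete pinning
```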
Hope that clears that up for you.
Regards
Colin
Hi Guru,
I am new to UCS. I have been given a task to build a new server on a UCS blade. How do I manage a server which is on a UCS blade (like iLO or DRAC)? An IP, username and password have been provided to me. I don’t know how to start. Can you please help me with this?
Hi Thajj
There are numerous ways to KVM onto a UCS Blade or Service Profile (The logical server that can move between blades)
Each Cisco UCS Blade has a Cisco Integrated Management Controller (CIMC) into which out of band KVM connections can be made, and virtual media mounted and installed etc..
The easiest way is just to browse to the Fabric Interconnect cluster IP address and click KVM Launcher, log in with the details you have been given, then just select the Service Profile name you want to KVM into. All the various methods are further explained in this link.
Merry Christmas.
Regards
Colin
Thank you very much Colin
Hi All,
Happy Holidays. I have started my new year already early.
I have lately started working with UCS and there is a lot to learn. I have a question that I am still trying to figure out.
Can someone explain to me what the actual difference is between a Service Profile (creating one) and a Service Profile Template (creating one)? I basically understand that if we have templates we can re-use them to create different Service Profiles. But if I am starting from zero config, then I create pools, then policies, and after that should I create a “Service Profile” or a “Service Profile Template”?
Thanks in advance for your help.
Warm Regards
Sandeep
Hi Sandeep
Welcome to the world of Cisco UCS, I think you’ll really enjoy it.
Service Profile Templates are really useful and can be hugely powerful and I use them in most designs and installations.
The first thing to decide is whether you want to use an Initial Template or an Updating Template.
SPs created from an Initial Template are “on their own” once created and are individually editable.
SPs created from an Updating Template, on the other hand, remain bound to the template and cannot be edited after creation (unless they are unbound from the template).
I mentioned Templates can be hugely powerful; imagine updating the firmware of all the servers in your entire estate in a single action, or adding an Ethernet adapter to all of your ESXi Servers in a couple of clicks. Such things are possible with Updating Templates.
And with UCSM 2.1 all service profiles created from Templates can now be freely renamed (I usually just call them the same as the hostname of the server)
I have attached a link to the UCSM 2.1 Config Guide, which covers Templates as well as all other aspects of Cisco UCS Configuration.
Regards
Colin
Hi Colin,
Thank you for the response. I actually had the config guide, I was just a little lazy to read through it.
I will study it and fiddle around with the UCSPE to get some initial hands-on before a real lab environment.
Appreciate all your posts in this blog, it is very informative.
I am really enjoying Cisco UCS; my long term plan is towards CCIE DC. Lots of learning and I am thrilled. I wanted to know if this blog also includes Nexus topics.
Cheers,
Sandeep
Hi Sandeep
Yes I tend to deal with Nexus as far as Connectivity to UCS Goes.
But I’ve been thinking of having a Category dedicated to CCIE Data Center Preparation, that would contain broader Nexus info as well as UCS, MDS, ACE, N1kV.
But then again Tony Bourke already has a great section on it at Datacenter Overlords.
Regards
Colin
Hi Colin,
Thank you. That sounds great, and Datacenter Overlords is a wonderful blog.
As of now trying to figure out things on UCSPE which seems to be very effective for initial learning and hands-on on the UCS manager.
I have UCSPE as a VM on an ESX server and load it across the network from firefox, and it works well. Was wondering if there is a limit for the number of users it can handle. I can launch the UCS manager and key in any combination of user and password and it takes me in. But not sure is there a limit of number of users into it, as all the users are considered as admin (according to release notes http://developer.cisco.com/documents/2048839/ba79fb92-a536-4de6-855b-65dcf49dbfc0)
Warm Regards,
Sandeep.
Hi Colin,
Happy Christmas and Happy New Year,
Can you help me for solving this warning message?
I have this warnings at all blade servers.
Code:F0283
Description:fc VIF 1 / 2 B-958 down, reason: Vlan not FCoE Enabled.
I’m using FC switching mode, and only have the default VSAN. Default zoning is enabled and in the properties the FCoE VLAN ID is set to 4048. The SAN storage is directly connected to the FIs using FCoE storage ports. This warning never appeared before. Can you explain it to me and how to solve this warning?
Thank you so much in advance
Regards,
Jacky
Hi Jacky
When you say this “Warning has never appeared before”, I am guessing you have just done an upgrade or something and now this message is showing up.
VLAN 4048 is used as the default FCoE VLAN from UCSM 2.0 and as such is “reserved”. I have seen messages appear for FCoE VLANs after an upgrade (commonly around overlapping VSAN and VLAN IDs, which were permitted in UCSM 1.x but give warning messages in UCSM 2.0 and up).
The default FCoE VLAN varies according to the type of VSAN and whether Cisco UCS is a fresh installation or an upgrade, as follows:
After an upgrade to Cisco UCS, release 2.0: The FCoE storage port native VLAN uses VLAN 4048 by default. If the default FCoE VSAN was set to use VLAN 1 before the upgrade, you must change it to a VLAN ID that is not used or reserved. For example, consider changing the default to 4049 if that VLAN ID is not in use.
After a fresh install of Cisco UCS, release 2.0: The FCoE VLAN for the default VSAN uses VLAN 4048 by default. The FCoE storage port native VLAN uses VLAN 4049.
I usually prefix all my VSAN ID’s with 30 for the FCoE VLAN ID. i.e. if I use VSAN 10 I use VLAN 3010 as the FCoE VLAN etc..
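That naming convention is easy to express as a tiny helper, sketched below in Python. The function name is hypothetical, and limiting it to VSAN IDs 1 to 99 is my assumption for the example, so that the “30” prefix always yields a 4-digit VLAN ID well inside the valid VLAN range.

```python
def fcoe_vlan_for_vsan(vsan_id: int) -> int:
    """Derive an FCoE VLAN ID by prefixing the VSAN ID with 30.

    Assumption for this sketch: only VSAN IDs 1 to 99 are handled.
    """
    if not 1 <= vsan_id <= 99:
        raise ValueError("this convention is only sketched for VSAN IDs 1 to 99")
    return int(f"30{vsan_id:02d}")


for vsan in (10, 11, 12):
    print(f"VSAN {vsan} -> FCoE VLAN {fcoe_vlan_for_vsan(vsan)}")
```

Keeping the VSAN ID visible inside the VLAN ID makes it obvious at a glance which FCoE VLAN belongs to which fabric.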
I would suggest trying a different VLAN ID for your FCoE VLAN, 4049 for example, and seeing if that clears the message.
This should be a non-disruptive change (I have done this on the fly many times), but as ever, if you want to be 100% sure and this is a Production Environment, best shut down any hosts using the VSAN or at least confirm you have active paths via the other FI.
Regards
Colin
Under the equipment policy, can you help me understand the chassis discovery policy and the link grouping preference? Does the “link” have to match the exact number of links from each chassis to fabric interconnect, or can it be less (since it’s only for discovery)? In addition, what benefits does selecting “port channel” give me?
Thanks!
Tom
Hi Tom
The Chassis discovery policy allows you to set a minimum number of links a chassis must have (per fabric) in order to be discovered, just a way of enforcing a standard within your org. By default the Chassis Discovery Policy is set to 1 link, which means a chassis with at least 1 link will be discovered. Then a Chassis Acknowledgment is required to activate any additional links.
I generally leave it at 1 as the Chassis view shows you a nice diagram of the number of links anyway. But the option is there if you want to enforce a certain number of links in your setup.
With 1.x Code if you tried adding a Chassis with a lower number of links than the policy specified, the Chassis would still be discovered but UCSM would alert with a message saying the Chassis did not conform to the Policy. From UCSM version 2.x Chassis with a lower link count than the policy are no longer even discovered.
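To make the 1.x versus 2.x behaviour concrete, here is a small Python sketch of the decision as described above. The function name and return strings are purely illustrative and it only models what is stated in this answer.

```python
def chassis_discovery_outcome(links_per_fabric: int,
                              policy_min_links: int,
                              ucsm_major_version: int) -> str:
    """Model the discovery policy check described above (illustrative only)."""
    if links_per_fabric >= policy_min_links:
        return "discovered"
    if ucsm_major_version < 2:
        # 1.x: chassis still discovered, UCSM raises a policy warning
        return "discovered with policy warning"
    # 2.x and later: chassis below the policy link count is not discovered
    return "not discovered"


print(chassis_discovery_outcome(2, 1, 2))
print(chassis_discovery_outcome(1, 2, 1))
print(chassis_discovery_outcome(1, 2, 2))
```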
Benefits of Port-Channel between IOM and FI
In UCSM 2.x code the option was introduced to port-channel the IOM to FI links. This is to take advantage of the additional bandwidth available on the Gen 2 VIC 1240s and VIC 1280s.
If you did not use port-channel mode and instead used the default (Discrete Pinning), then each blade slot would be pinned to a particular IOM cable, so no server would have access to more than a single 10Gb cable, which for a 40Gb per Fabric adapter is an immediate bottleneck. (You obviously need to use the 2208XP IOM to get the max bandwidth.)
I always port-channel these days; the only reason not to would be if you wanted a very deterministic amount of bandwidth in the event of a link failure, but even then I’m clutching at straws to advise it.
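As a rough back-of-envelope illustration of the difference, the Python sketch below compares the per-fabric ceiling for one blade in each mode. It assumes 10Gb IOM links and treats the port-channel figure as an upper bound across many flows (a single flow will still hash to one link); the function name is made up.

```python
LINK_GBPS = 10  # each IOM to FI cable is assumed to be 10Gb


def max_blade_bandwidth_gbps(iom_links: int,
                             adapter_gbps_per_fabric: int,
                             port_channel: bool) -> int:
    """Upper bound on one blade's bandwidth on one fabric (sketch only)."""
    if port_channel:
        fabric_side = iom_links * LINK_GBPS   # all links shared by all slots
    else:
        fabric_side = LINK_GBPS               # slot pinned to a single cable
    return min(adapter_gbps_per_fabric, fabric_side)


print(max_blade_bandwidth_gbps(4, 40, port_channel=False))
print(max_blade_bandwidth_gbps(4, 40, port_channel=True))
```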
I have gone into more detail and provided diagrams in a previous question on this page so well worth having a look back through previous questions.
Hope that clears things up for you.
Regards
Colin
Hi Colin,
I have a 40Gb uplink port channel (2 port channels with 2x10Gb links). In terms of capacity management , do you think this current setup of ours would suffice in the long run if we currently have more than 5 chassis with around 3 servers each which have 4 nics (Cisco UCS M81KR)? Or would adding links to let’s say 80Gb uplink port channel would do?
Thanks in advance.
Hi Marie
The answer to your question, I’m afraid, is “It Depends”.
There are numerous elements involved: the amount of traffic from the blades, and how much of that traffic will be North/South, or East/West but inter-Fabric or inter-VLAN (basically the amount of traffic that needs to go via the Uplink ports).
I answered a question submitted on 2012/12/24 at 10:26 am in reply to Tony, in which I show how to work out your end-to-end contention ratios, so have a look at that post.
In real terms I think 40Gb from each FI to your LAN should be ample for you, but if you want to prove it you can monitor your uplinks or even configure a threshold alert on your port-channels if they reach a certain level.
Vallard Benincosa talks you through how to set this up here and here
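For a quick worst-case feel of the uplink contention, something like the Python sketch below can help. It is not the method from the earlier post, just simple arithmetic per fabric, and the 4 IOM links per chassis is an assumed figure since the question does not state it.

```python
def uplink_oversubscription(chassis: int,
                            iom_links_per_chassis: int,
                            uplinks_per_fi: int,
                            link_gbps: int = 10) -> float:
    """Worst-case ratio of chassis-facing to LAN-facing bandwidth on one FI."""
    server_side = chassis * iom_links_per_chassis * link_gbps
    uplink_side = uplinks_per_fi * link_gbps
    return server_side / uplink_side


# 5 chassis, 4 IOM links each (assumed), 4 x 10Gb uplinks per FI
print(uplink_oversubscription(5, 4, 4))
```

A ratio like this only tells you the theoretical worst case if every blade pushed line rate North/South at once, which is why monitoring the real uplink utilisation is the better proof.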
Regards
Colin
hi
I have 2 Brocade DCX switches that are connected to my 2 FIs. dcx1 is connected to FI-A, dcx2 is connected to FI-B.
Then I have a service profile template that has 2 vHBAs. One is on fabric A and one is on fabric B.
But now some servers when they are doing a initial boot from san, the wwpns for fabric A is showing up on the dcx2 switch and vice versa. It does not happen for every server. All the servers are using the same template.
I have vsanA for fabric A and vsanB for fabric B. So the wwpns for B should not be showing up on the other brocade switch.
any ideas?
thanks
Hi Tony
So as I am sure you know you must have a cross in your fabrics somewhere. If all your servers are using the same template and thus should all have consistent VSAN and Fabric assignments this should not be the cause.
As this is only happening for a subset of your servers, it sounds to me like you most likely have a link from FI A to DCX2, and thus any servers that have vHBAs pinned to that link are ending up on the wrong fabric. So I suggest double-checking your patching.
To be sure, SSH into the FIs directly, run “connect nxos”, and check which WWPNs have logged in on each FI with “show npv flogi-table”.
If you see all the A’s on FI A and all the B’s on FI B then you know that the config south of the FIs is fine, and that you have a cross connection from the FC uplinks on FI A to dcx2 somewhere. Have a look at which F-Port on dcx2 the WWPNs from Fabric A are being learned on, trace it back and ensure it goes back to FI B.
If you are still seeing WWPNs from Fabric A in the NPV flogi table of FI B then you know the issue IS south of the FI; most likely some vHBAs are bound to the wrong Fabric (although unlikely if all your SPs are referencing the same template, it is worth checking each SP to be sure).
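If you have a lot of service profiles, the comparison above is easy to script once you have copied the logged-in WWPNs out of each FI. The Python sketch below is only an illustration: the function names and WWPN values are hypothetical, and it expects you to supply the expected fabric per WWPN yourself (for example from your WWPN pool naming).

```python
def misplaced_wwpns(expected_fabric: dict, seen_on_fi: dict) -> dict:
    """Return {wwpn: (expected_fi, actual_fi)} for logins on the wrong FI."""
    misplaced = {}
    for fi, wwpns in seen_on_fi.items():
        for wwpn in wwpns:
            expected = expected_fabric.get(wwpn)
            if expected is not None and expected != fi:
                misplaced[wwpn] = (expected, fi)
    return misplaced


def diagnose(expected_fabric: dict, seen_on_fi: dict) -> str:
    """Apply the two-way reasoning from the answer above."""
    if misplaced_wwpns(expected_fabric, seen_on_fi):
        return "south of FI: check the fabric binding of each vHBA"
    return "FI logins consistent: check FC uplink cabling to the SAN switches"


expected = {"20:00:00:25:b5:0a:00:01": "A", "20:00:00:25:b5:0b:00:01": "B"}
seen = {"A": ["20:00:00:25:b5:0a:00:01"], "B": ["20:00:00:25:b5:0b:00:01"]}
print(diagnose(expected, seen))
```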
Anyway follow the above and you should rapidly find what’s causing your issue.
Regards
Colin
Thank you Colin. You are right. There was a problem with one of the FI links going to the second brocade. That explains why I was seeing wwpn B pool addresses on the 1st brocade switch.
What is the vNIC/vHBA placement policy for? I only have B200 M2 half height blades with the VIC 1280, so it’s a single mezzanine card. Does it make any sense to use this policy on my blades?
thanks
Hi Tony
I covered a bit on Placement Policies in my response to Fred in the question posted October 13, 2012 at 12:37 pm so worth having a quick read of that.
But in short you are correct, in your situation having only a single Mez adapter you will only have vCon1 available to you.
A vCon as listed in the placement policy represents a Mezzanine card, so if you have a full width blade with 2 Mez cards you could define which vNICs/vHBAs are placed on which Mez by assigning some to vCon1 and some to vCon2. You may see higher vCon numbers as an integrated C-Series can take more than 2 cards.
The only time you may want to define a placement policy for a single Mez Blade is when using dynamic vNICs with VMFEX and you want to ensure they appear to the blade higher up the PCI Bus than the Static vNICs and vHBA’s. Or of course if you want your vNICs / vHBAs in a user defined specific order on your single Mez adapter.
Regards
Colin
hi
I did see one vcon1 but under the vcon1, I see vcon1-4. why is that?
Hi Tony
As mentioned, all vCons in your case will be assigned to your only Mez Card (vCon1). The system is intelligent enough to know that even if you did have a placement policy specifying multiple vCons (Mez Adapters) it will only ever place them on vCons you actually have, vCon1 in your case.
Hope that makes sense.
Colin
Hi there,
I have seen the recommendation from cisco to have different vsans to each path on each Fabric interconnect. What is the effect of having the same vsan ( like the default vsan 1 ) for the whole cluster?
Hi Noorani
Yes that is correct. There is of course nothing stopping you using the default VSAN 1 on both Fabrics, however it is a long-standing best practice amongst SAN administrators not to use VSAN 1 (for production traffic at least) on their SAN switches (just like the LAN best practice not to use VLAN 1).
But using the default VSANs certainly works, I use them when connecting to upstream SAN switches that do not understand VSANs (Brocade for example)
But as you say best practice is to use VSANs other than VSAN1 and have a different VSAN on both Fabrics, this just makes administration and troubleshooting easier on many levels.
Sometimes the VSAN ID’s will be predetermined by the Storage Admin and they will just tell you them, but if not or this is a new installation always go for user defined unique VSANs on MDS and Nexus Fabrics.
Regards
Colin
Thanks Colin
Hi Colin
When connecting UCS to two brocade switches is it best to :
a – use the default vsan
b – create two different vsans one on each fabric with different vsans and fcoe id’s and assume brocade will ignore/strip the vsan.
Thanks
Hi Mike
When uplinking to SAN switches that have no concept of VSANs I generally take your option A and just use the default VSAN for each Fabric in UCSM.
Regards
Colin
hi colin,
I got a situation where 1 host will see all 8 paths on the storage array, but another host will see only 4 paths to the same storage array. Both hosts are using the same service profile. Have you come across this before?
thanks
Hi Tony
I assume you mean the Service Profiles are created from the same Service Profile Template (and as such have all the same HBA settings, bar the unique WWPN’s)
Your issue sounds to me most likely to be a zoning issue on the SAN switches.
I would have a look in your flogi database and active zoneset and ensure you see all WWPNs from the host that can only see 4 paths and that the Zones are correct and have the correct WWPN’s or alias in with no typo’s.
You could also check the initiators (WWPNs) are registered and active from the Array side.
Regards
Colin
Hi Colin,
I have 2 questions.
- Do you know how to clear power failure alarm on a chassis? We have 2 different power source to our equipments and for testing purposes we turned off one source to analyse the reaction. Then we brought it back online but UCS is stuck with the power failure error on the chassis. I havent found a way to clear it.
- Also another question concerning power, i have read that if you had 2 power sources like most datacenters have, its recommended to have the redundancy mode from N+1 to Grid mode? You have any experience/recommendation for this?
Assuming the power has indeed been restored your event should clear, if not you could try manually clearing the System Event Log (SEL)
Go to the system event log in the SEL Logs tab, click Clear.
Re your second question.
Power configurations in Cisco UCS can be a complex topic in its own right once you get into the details and topics like running in High Density Mode, but if you stick to the below “rules” you will be fine.
In UCS terms treat N as 2, so N+1 would equal 3.
This means you have one more power supply than you need to run the chassis and can stand the loss of a single PSU.
As you point out most Data Centres have redundant power feeds (Grids) so if you had 3 power supplies running in N+1 you would have to connect 2 power supplies to Grid A and only 1 to Grid B. Now while you are covered for a single power supply failure you are NOT covered for a failure of Grid A, as that would take out two power supplies and leave you with only one. And remember you need N to run the box with no redundancy and as we know N=2
So to run in fully Grid redundant mode, you would need 4 power supplies and Grid Redundancy selected; that way you can suffer the loss of any 2 power supplies or any Grid.
If you have 4 power supplies and you leave the tick box as N+1 your chassis will automatically run in what’s called High Density Mode (HDM), which means it will actually be using 3 power supplies, allowing a higher power draw to each blade slot (600W rather than 550W in standard mode). But in my experience I have never managed to get a blade to run hot enough to require HDM (you would need to max the memory, use the highest CPU and then max all of this out with the workload on the server).
In the above scenario N becomes 3 so if you were to have 2 power supplies fail you would experience an outage.
In real terms if your chassis is running production workloads always go for 4 Power supplies and run with Grid Redundancy.
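The rules above boil down to a couple of one-line checks, sketched here in Python. The function names are made up, and N defaults to 2 as per the rule of thumb in this answer (pass 3 to model High Density Mode).

```python
N = 2  # power supplies needed to run the chassis in standard mode


def survives_grid_loss(psus_on_grid_a: int, psus_on_grid_b: int,
                       needed: int = N) -> bool:
    """True if losing either grid still leaves enough PSUs to run."""
    return min(psus_on_grid_a, psus_on_grid_b) >= needed


def tolerated_psu_failures(total_psus: int, needed: int = N) -> int:
    """How many individual PSU failures the chassis can absorb."""
    return max(total_psus - needed, 0)


print(survives_grid_loss(2, 1))               # 3 PSUs, N+1: Grid A loss hurts
print(survives_grid_loss(2, 2))               # 4 PSUs, Grid redundancy
print(tolerated_psu_failures(4, needed=3))    # 4 PSUs in High Density Mode
```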
Regards
Colin
Hi UCS guru
In your reply to Tarun
2) If a vNIC was statically pinned to an uplink and that uplink failed then that vNIC would NOT be dynamically pinned to another uplink, When the uplink target on Fabric Interconnect A goes down, the corresponding failover mechanism of the vNIC goes into effect, and traffic is redirected to the target port on Fabric Interconnect B.
Is this the same behavior for vHBA?
Will the vHBA failback to the fc-uplink even if it was not statically pinned?
example:
fcuplink1 – vhba1,vhba3
fcuplink2 – vhba2,vhba4
fcuplink1 – offline
fcuplink2 – vhba1,vhba2,vhba3,vhba4
what happens if the fcuplink1 comes back online.
Hi Ashok
Great Question
When fcuplink1 comes back up the vHBAs would not automatically re-pin back, but fcuplink1 again becomes available as an uplink for new connections, or if a server was rebooted or a vHBA flapped, etc.
If SAN Pin groups were used my feeling is it would fail back if the pinned target becomes available again (But I’ll Lab it on Monday, and if there is any different result I’ll update this response)
Bear in mind whenever a vHBA is re-pinned this will disrupt traffic on that path as the FI needs to send a flogi on the new port.
That’s why it’s good practice to use SAN Port-Channels (if you have Nexus or MDS SAN switches), as the server WWPNs on the SAN switches are associated to the port-channel and not the FC interface, so the FI would not need to issue another flogi to update the upstream SAN switch.
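The dynamic pinning behaviour in the example can be modelled with a small Python sketch. This is a toy model only: the round-robin choice of a new uplink is an assumption made for illustration, not how the FI necessarily picks one, and the function name is made up.

```python
def repin(pins: dict, uplink_up: dict) -> dict:
    """Move only the vHBAs whose current uplink is down; leave the rest alone."""
    available = sorted(name for name, up in uplink_up.items() if up)
    if not available:
        raise RuntimeError("no FC uplinks available")
    result = {}
    for index, vhba in enumerate(sorted(pins)):
        current = pins[vhba]
        if uplink_up.get(current, False):
            result[vhba] = current            # no automatic fail-back
        else:
            result[vhba] = available[index % len(available)]
    return result


pins = {"vhba1": "fcuplink1", "vhba2": "fcuplink2",
        "vhba3": "fcuplink1", "vhba4": "fcuplink2"}
after_failure = repin(pins, {"fcuplink1": False, "fcuplink2": True})
after_restore = repin(after_failure, {"fcuplink1": True, "fcuplink2": True})
print(after_failure)
print(after_restore)
```

In this model everything stays on fcuplink2 after fcuplink1 returns, which matches the dynamic pinning case described above.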
Regards
Colin
Hi ,
I’m using two Fabric Interconnects (6296UP) with 5108 Blade Chassis. One of the Fabric Interconnects is broken and Cisco has sent a new one, but I couldn’t find the replacement procedure on the Internet. How can I replace it?
Hi Indat
I assume you are now just running on the single FI, in which case the addition of the new FI is the same as when you first installed the solution.
Connect your new FI into the infrastructure and power it on. An install wizard will ask you if you want to set up the new FI; say “Yes”, then choose to join an existing cluster, enter your admin password when prompted, and Fabric A/B and IP information when prompted, and the new FI will import the config from your working one and rejoin the cluster.
Regards
Colin
I have a question. There is one Cisco 5108 UCS chassis, populated with B220 UCS blades in slots 1 to 4. I want to install a B440 blade in the same chassis. What is the recommended procedure?
Hi Aamir
No problem at all you can put your B440 in slots 5&6 or 7&8
You just need to take the guide bar out from between whichever of the above slots you use. (At the front of the guide bar there’s 2 small tabs you push one down and one up and then the bar just slides out).
Regards
Colin
Thanks Guru,
One more question, about the WWN pool: how can we edit a WWN pool in a production chassis? Will it affect the WWNs/WWPNs already allocated to the blades in production, or can we just add a new pool with a new block size and mac:00:00:00 change?
Thanks in Advance!!
Aamir
Hi Muhammad
Yes you can extend the range of your existing WWPN pool with no impact to your already assigned WWPN’s
Obviously the extended addresses should be unique in your environment and not overlap with other pools if you have any.
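A quick way to sanity check a new block against your existing ones before adding it is sketched below in Python. The WWPN values and function names are hypothetical examples, and the check only covers the pools you feed into it.

```python
def wwpn_to_int(wwpn: str) -> int:
    """Convert a colon separated WWPN such as 20:00:00:25:B5:0A:00:00."""
    return int(wwpn.replace(":", ""), 16)


def block_range(first_wwpn: str, size: int) -> tuple:
    """Inclusive (start, end) range covered by a block of `size` addresses."""
    start = wwpn_to_int(first_wwpn)
    return (start, start + size - 1)


def blocks_overlap(a: tuple, b: tuple) -> bool:
    """True if two inclusive ranges share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


existing = block_range("20:00:00:25:B5:0A:00:00", 64)
extension = block_range("20:00:00:25:B5:0A:00:40", 64)
print(blocks_overlap(existing, extension))
```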
Colin
Hi Colin – I have a low latency app and my DBA is concerned that traffic going between two baremetal blades (no hypervisor) needs to go all the way from Blade#1 to the FI and then come back down to Blade#2 (they are on the same chassis). I always understood that the FEX is not a local switch, but that depending on the way you do mac-pinning you can potentially provide local switching at the FEX level. I am sure this is not best practice, but is this feasible?
Hi Andre
This is a concern I hear occasionally, and you are right the FEX is not a switch and will not provide any local switching.
But don’t think of the FEX as a switch; it merely extends the ports of the Fabric Interconnect. Think of the cables between the FEX and FI as the “Backplane” of the switch. This has many benefits in reducing cabling and management points.
The Cisco 6200 FIs provide the same low latency hardware switching as the Nexus 5500 (2µs port to port) so are well suited to low latency workloads.
You will also want to ensure that all of your East/West low latency traffic is mapped to the same FI and as such can be locally switched inside the FI. You can do this by having 1 vNIC per baremetal blade, mapping it to a single fabric and protecting it with fabric failover.
(The FI can locally switch L2 within the same fabric.)
Hope this addresses the concerns of your DBA.
Hi Colin,
Great site! I’m starting to look into UCS and I have a question about mixing virtual and bare metal workloads in a UCS environment. I would be looking to replace an HP c7000 infrastructure with UCS. We currently have a mix of ESX boxes (which seem like they would make perfect sense in UCS), and mix of dedicated boxes for processor intensive work. Would these different workloads coexist well inside a UCS environment? In one of Joe Onisick’s posts (http://www.definethecloud.net/why-cisco-ucs-is-my-a-game-server-architecture) he mentions true workload portability as one of the game changing features of UCS. Does this mean a bare metal server can essentially “drag-and-drop” between physical UCS blades?
Thanks,
Brandon
Hi Brandon
Thanks for the question.
As we all know, not all workloads are suited to a Virtual Environment. This can be for any number of reasons: vendor support, licensing, performance, security, direct access to Fibre Channel devices like tape libraries, as well as politics, to name a few.
Also not all workloads are suited to a Blade Environment, and not a week goes by that I don’t have a “Blade Vs Rackmount” conversation with someone.
And this is really one of the many strengths of Cisco UCS.
Being able to mix Blades and Rack Mounts within the same cohesive system without adding management points, and even move workloads between them, now that certainly is taking flexibility to a whole new level.
This makes upgrading a dream or even temporarily “flexing” a workload up or down as demand dictates, for example you could move your payroll app to a Quad Socket Rackmount for the week that it is in most demand, and then move it back to an efficient performance 2 socket Blade for the rest of the month when its less utilised.
The keyword here is choice. Why choose if you don’t have to? And to quote Killian from The Running Man, “If you can’t decide don’t decide”.
To give you a real world Enterprise example, I recently did a UCS design for a financial institution here in the UK, which was to migrate their entire Enterprise onto Cisco UCS, and just like your requirement this was from HP C7000s and standalone Rack Servers.
After 2 months of discovering and analysing the clients environment the below Cisco UCS kit was recommended and has just been installed.
6 x Cisco UCS Pods (each Pod being in a different security zone)
131 x B230M2 20C 256GB RAM (for ESXi Clusters)
99 x B22M3 6C 32GB RAM (For Bare Metal Windows and RHEL)
15 x C200M2 (For Bare Metal workloads with older O/S like RHEL 4.9)
15 x C220M3 (For Bare Metal workloads with O/S’s supported on M3 Blades)
So as you can see there is a real mix here, with each platform “flavour” suited and “pooled” to a particular use case.
So yes, you can essentially “Drag and Drop” workloads between Cisco UCS Blades/RackMounts. In UCS terms this is called a “Disassociation” from one bit of tin and a “Reassociation” to another bit of tin, the server in Cisco UCS terms being the “Service Profile”, which is just a logical entity that can be freely moved around.
In order to get the most out of this “Statelessness” definitely look to Boot from SAN or iSCSI, as that removes any dependency that a workload might have on a particular bit of tin, and with Cisco UCS configuring SAN boot for your whole environment takes about 10 mins with the creation of a system wide boot policy.
Good luck in your journey and UCSguru.com is always here to give you a helping hand along the way.
Regards
Colin
Hi Colin !!!
I like your blog !!, I’m really new in UCS, so I want to know how to start in this theme, any link or tutorial is very welcome,, so please help me.
Thanks a lot
Thanks
which gives you a good intro
I have just mailed you across a copy of my UnOfficial “UCS for Dummies”
I think you are going to enjoy your Cisco UCS Journey.
Regards
Colin
Great blog!
Can you explain fabric failover for me? On all my Windows hosts, I have configured just one NIC with fabric failover enabled. If I lose both uplinks on Fabric Interconnect A, will my NIC failover to Fabric Interconnect B? Or does failover only happen when a Fabric Interconnect completely dies?
In addition, my ESXi host are configured for 6 nics. 2 for mgmt, 2 for vmotion and 2 for VM traffic. vnic0 goes to FI-A, vnic1 goes to FI-B and so on with no fabric failover. Since ESXi will handle NIC failover in this scenerio, am I correct in thinking we should keep fabric failover for these servers disabled? Something just seems wrong with enabling NIC failover on multiple levels.
Thanks in advance!
-Sharishma
Hi Sharishma
Thanks for the great question, and certainly one that comes up a lot.
Re Fabric Failover:
Whenever you tick the fabric failover box for a vNIC, a virtual Ethernet port (vETH) is created on BOTH fabric interconnects with a different identifier (you can see this by looking under the VIF Paths tab of your Service Profile).
Then if the primary Fabric fails, the FI will do a subsecond failover to the other fabric (the standby virtual circuit is already established and just waiting to go active) The FI containing the failover circuit will then issue a GARP containing all MAC addresses that are behind the vNIC, up the new circuit to inform the upstream switches.
Now what can cause the failover? basically the loss of the primary FI as you say, or the loss of all FI uplinks that a vNIC could be pinned to.
Generally UCS best practice really minimises the scenarios that can initiate a Fabric Failover i.e. the use of Port-Channels combined with Multi-chassis Etherchannel (MEC) technologies like vPC and VSS.
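The two triggers described above can be written down as a tiny Python check. This is only a restatement of this answer in code form; the function and parameter names are made up.

```python
def fabric_failover_triggered(primary_fi_up: bool,
                              pinnable_uplinks_up: int,
                              failover_enabled: bool = True) -> bool:
    """True if a vNIC with fabric failover would move to the other fabric."""
    if not failover_enabled:
        return False
    # Trigger 1: the primary FI is lost.
    # Trigger 2: every uplink the vNIC could be pinned to is down.
    return (not primary_fi_up) or pinnable_uplinks_up == 0


print(fabric_failover_triggered(True, 2))    # healthy, no failover
print(fabric_failover_triggered(True, 0))    # both uplinks on FI-A lost
print(fabric_failover_triggered(False, 2))   # FI-A itself lost
```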
Re: Fabric Failover or Redundancy via Teaming
As a rule I use a single vNIC with fabric failover enabled in any situation where teaming or load balancing is not an option or a requirement, or if I want to confine traffic within a certain fabric for low latency L2 switching within the FI (vMotion traffic for example).
But for uplinks to vSwitches, DvSs, etc. I always go for one on Fabric A and one on Fabric B and let the Hypervisor or Nexus 1000v handle the redundancy.
I do not enable fabric failover on these already redundant vNICs as I have seen that mask issues (i.e. if there is a failure I want to know about it).
Hope that clears things up for you.
Regards
Colin
Hi Colin,
Great Blog. I have a strange issue with PVLans. I have ESXi 5.1 configured on the UCS Blades. I have dVSwitch setup with multiple VLANs. We are trying to configure Private VLAN ( PVLAN ) on the dVSwitch. The steps to taken to create the PVLANs are as follows
1. Create the Master (vlan id 110) the Secondary PVlan (vlan id 115) in the Physical Switches.
2. We have created both Master and Secondary VLANS in the UCS as normal VLANS
3. On the Distributed vSwitch, Created the Primary VLAN id (110) and Created a secondary VLAN (115) The Type for the Secondary VLAN is Isolated.
4. Created a Port Group and the VLAN type chosen is Private VLAN and the Private VLAN Entry as Isolated (110,115).
Deployed a VM in the Port Group and we are not able to ping the Gateway. For troubleshooting purposes, changed the VLAN Entry in the Port Group from Isolated (110,115) to Promiscuous (110,110). Then tried to ping the Default Gateway from the VM, and it started working. Went back to the Port Group settings and changed the Private VLAN Entry from Promiscuous (110,110) to Isolated (110,115). Tried to ping the gateway from the VM, and the ping is still working. But after a few hours, the ping stops working. If we go back to the Port Group settings and change the Private VLAN entry from Isolated to Promiscuous, the ping starts to work, but only for a few hours. Do I need to create Private VLANs in the UCS before creating them in VMware? I am thinking this is a bug in VMware. The symptoms are the same either with a regular dvSwitch or when using Nexus 1000v.
Any help is appreciated.
thanks
-Vijay
Hi Vijay
I usually use one or other i.e either use PVLANs in the UCS or in the DvS (Unless of course you have a combination of physical and virtual workloads that need to share the Private VLANs)
I would suggest, if you are only using Virtual Machines in your private VLANs, that you pass all VLANs through the UCS as classic VLANs and then do all the PVLAN config on the N1kv, which has more PVLAN functionality like Community PVLANs, etc.
Failing that if you are still experiencing issues then raise a Service Request.
Regards
Colin
hi Collin, wow amazing your blog and i love it. anyway i’m newbie for with UCS can you help me design with this material :
Options A :
- UCS 6296UP = 2
- 10GBASE-SR SFP Module = 4
- 10GBASE-CU SFP+ Cable 3 Meter = 12
- 8 Gbps Fibre Channel SW SFP+, LC = 8
- UCS 5108 Chassis = 2
- UCS 2208XP I/O Module = 4
- UCS B200 M3 With 2650 8x16GB Dual VIC = 4
- UCS VIC 1280 dual 40GB = 4
- UCS VIC 1240 Modular LOM = 4
Options B :
- UCS 6248UP = 2
- 10GBASE-SR SFP Module = 4
- 10GBASE-CU SFP+ Cable 3 Meter = 8
- 8 Gbps Fibre Channel SW SFP+, LC = 12
- UCS 5108 Chassis = 1
- UCS 2208XP I/O Module = 2
- UCS B200 M3 With 2650 8x8GB Dual VIC = 4
- UCS VIC 1280 dual 40GB = 4
- UCS VIC 1240 Modular LOM = 4
thank you very much if you helping me, because i dont know about UCS and how to design that material
Hi Pedro
Normally you would do a design before you define the Bill of Materials as one should dictate the other.
Designing to a kit list is definitely the wrong way around.
I can’t really advise as to a design with the kit you list, as there are far too many variables involved.
As a general rule, I always start with the use case and application requirements then work back to ensure everything is optimally designed for the application.
There are lots of design tips and examples in the “Ask the Guru” section of this blog, so it is well worth having a read through as many of your questions have likely already been asked and answered.
If you have any specific questions that have not already been asked, then feel free to fire back.
All the best with your UCS journey.
Regards
Colin
Hi
I have vMotion vNIC templates created in UCS and have set these to an MTU of 9000. However vMotion is very intermittent and failing most of the time, but when I set them back to 1500, vMotion seems to work. My upstream Nexus 7Ks are enabled for jumbo frames already. Is there anything else in UCS that I need to enable for vMotion to work with jumbo frames?
I also have 2 iscsi vnics that is also having issues. I suspect it has to do with the jumbo frames setting.
any ideas?
thanks
Hi Tony
I’m really glad you asked that, as it is a really common question / issue.
I can certainly see why users get a little confused, as to why when they set the MTU on a vNIC to say 9000, that they are experiencing a lot of fragmentation as the Fabric Interconnects are still sending packets with an MTU of 1500.
This is because the Fabric Interconnects at NX-OS level (under the hood) are essentially a Nexus 5500, and as you may know in NX-OS the MTU is defined on a per CoS basis. So your issue will be your vNIC is sending packets out with your defined jumbo MTU but if you have not defined a CoS Class for them, the traffic will default to the “Best Effort” class which has an MTU defined of 1500.
So all you have to do is define a QoS policy in UCS Manager and set the MTU of that Policy to 9216
My Advice would be just to change the MTU of the predefined “Gold” Class to 9216 (if 9216 supported in your environment) and then just assign Class “Gold” to your vMotion vNICs which you have already set as MTU 9000.
Procedure as follows:
Configure the Gold Class for MTU 9216
LAN Tab > LAN Cloud > Qos System Class > General Tab
Enable the Gold Class and set the MTU to 9216
You may also want to set the “Weight” for Gold to 10%, to always reserve 1Gb of bandwidth for vMotion
Create a QoS Policy
LAN Tab > LAN Policies > Create QoS Policy
Give it a name i.e. QoS_Gold_vMotion or something
Select “Gold” from the Priority drop down list.
You then can just reference this Qos Policy in your Service Profile or your vMotion vNIC Template, then hey presto you’ve configured your UCS for Jumbo Frames!
Remember to confirm your MTU is supported all the way to the Array. I usually do a PING test with -f (Don’t Fragment) and -l (buffer size) set just under 9000,
i.e. PING x.x.x.x -l 8900 -f. If you get a reply your MTU is supported end to end; if you get a response saying “Fragmentation required but Don’t Fragment bit set” or similar, your MTU is not supported end to end, and something en route is trying to fragment the packet.
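If you want to test right at the limit rather than just under it, remember the ping buffer size is only the ICMP payload, so the IPv4 header (20 bytes, assuming no options) and ICMP header (8 bytes) have to fit inside the MTU as well. A small Python sketch of the arithmetic, with made-up function names and a documentation-range IP address as a placeholder target:

```python
IPV4_HEADER_BYTES = 20   # IPv4 header without options
ICMP_HEADER_BYTES = 8    # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ping buffer size that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def windows_ping_command(target: str, mtu: int) -> str:
    """Build a Windows style ping with Don't Fragment set (-f)."""
    return f"ping {target} -f -l {max_ping_payload(mtu)}"


print(max_ping_payload(9000))
print(windows_ping_command("192.0.2.10", 9000))
```

So for a 9000 byte MTU the largest payload that should pass unfragmented is 8972 bytes, and 1472 bytes for a standard 1500 MTU.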
Good luck
Regards
Colin
Hi Colin,
thanks for your response. Do I need to do some QoS/Cos maps on the upstream switch where the iscsi device is connected to? I heard that return packets going from the ucs server through the FI and then to the core upstream switch is tagged with the CoS/Qos value but returning back its untagged.
thanks a lot
Hi Guru, please can you tell me if it’s possible to connect two Catalyst 3750s to the fabric interconnects through 10 Gigabit? Thanks a lot
Hi Dave
Yes, absolutely no issue. You can either have a single port or better still a port-channel from each FI to the 3750 Stack, or a single link or port-channel from each FI to each 3750 if they are not stacked (but you would need a trunk link between them).
Regards
Colin
Guru!! thank you very much from Costa Rica, I learned a lot in this blog, go ahead, greetings.
Hi Guru,
Great posting and answers in your blog since the start of the year. Thanks a bunch for your answers and thanks to all who post questions as well
, my learning is going in steady pace.
While creating service profiles, vNIC templates etc and other policies and so on there are many options where in, we need to either set something or we can leave it default/not set. Wanted to know what is the difference between “default” and “not set”, do we have a document from cisco to know what are these defaults. Thank you.
Warm Regards
Sandeep
hi UCSGURU, I have 2 Cisco UCS 6140XP fabric interconnects and an MDS 9000 as SAN. I’m working with VMware ESXi 5.0 and I just want to enable NPIV on my UCS to pass N WWNs through this path. I opened a case with Cisco TAC, and they tell me that I need to upgrade my UCSM version from 1.4 to 2.1 to pass N WWNs through this path, or the other option is to change end host mode to switch mode, but I can’t do this. What do you recommend? Please let me know if you do not understand me.
Hi Jose
I think you may be confusing NPIV with NPV.
You do not enable NPIV on the UCS side, but rather on the MDS side with “feature npiv” if using NX-OS or “npiv enable” in SAN-OS.
The default mode of the FI is, as you say, End Host Mode, or NPV in SAN terms. This means each UCS FI acts like a big server with multiple HBAs and thereby multiple WWPNs. By default your MDS F ports only expect 1 FLOGI and 1 WWPN behind the port and thus will only assign a single FCID. But as you have many WWPNs and require an FCID for each of them, the F port on the MDS needs to support NPIV to allow this to happen.
The first Flogi occurs as normal but subsequent Logins are changed to Fabric Discoveries (FDISCs)
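To picture the login sequence on a single F port, here is a deliberately simplified Python sketch. It just follows the description above (first login is a FLOGI, later ones become FDISCs, and without NPIV the extra logins get nowhere); the function name and WWPN values are hypothetical.

```python
def login_sequence(wwpns: list, npiv_enabled: bool) -> list:
    """Return (wwpn, login_type) pairs for logins arriving on one F port."""
    sequence = []
    for index, wwpn in enumerate(wwpns):
        if index == 0:
            sequence.append((wwpn, "FLOGI"))      # first login as normal
        elif npiv_enabled:
            sequence.append((wwpn, "FDISC"))      # further logins via NPIV
        else:
            sequence.append((wwpn, "rejected"))   # F port expects one WWPN
    return sequence


logins = ["20:00:00:25:b5:0a:00:01", "20:00:00:25:b5:0a:00:02"]
print(login_sequence(logins, npiv_enabled=True))
print(login_sequence(logins, npiv_enabled=False))
```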
There’s lots of info in the question section on this page which addresses setting this up and troubleshooting.
Good luck with it.
Re versions: All the above is supported on 1.4, but you may want to consider an upgrade anyway as there have been loads of cool features added since 1.4.
Regards
Colin
Hi Guru,
Suppose there is a Cisco UCS system working wonderfully with a VMware hypervisor. Can I use a third-party blade system for vMotion, disaster recovery and load sharing purposes? What I mean is, do Cisco provisioned VMs, network attributes and UCS management permit or prevent vMotion to another non-Cisco blade system?
Thank you.
Regards,
SSC
Hi, yes that’s fine, just make sure you meet the usual prerequisites for vMotion, i.e. CPU compatibility etc. You will likely need to configure Enhanced vMotion Compatibility (EVC).
Regards Colin
Hi Guru, can I have two RAID arrays, RAID 5 and RAID 1, on a Cisco UCS C220? If it’s possible, which controller do I need? Thanks.
Hi Dave Yes you can have a R1 and R5 using the Embedded MegaRAID controller as long as you meet the minimum disks required for each raid level. Have a read through the below doc which gives full details.
http://www.cisco.com/en/US/docs/unified_computing/ucs/c/hw/C220/install/raid.html#wp1015625
Regards Colin
Awesome blog!!
The new zoning feature in 2.1 firmware…what’s its significance? By that I mean, one of the purposes of zoning is an added layer of security…you can’t just plug something into a SAN switch, or in this case, a blade chassis and have it automatically start hitting your array. Zoning creates an added layer of security where you have to specifically allow and configure access to the array. But with service profiles, this was already happening: a blade in a UCS system configured for direct attached storage couldn’t just communicate with the array, you had to first specify a service profile which then had configuration items which allowed communication to the array. What does the new zoning feature do differently besides add a couple more policies where you configure storage connectors? There doesn’t appear to be a significant difference and I was wondering if you could explain this for me.
Thomas
Thanks Thomas
The FI zoning feature is really only intended for specific use cases, i.e. no other FC SAN switches in the environment, thus directly attaching FC targets to the FIs.
In this setup I’m sure you can now see the benefit of being able to zone the FIs to prevent initiators seeing other initiators and targets seeing other targets. Basically, being able to adhere to the best practice of Single Initiator / Single Target, or at least Single Initiator / Multi Target (SIZ), so it is more geared around aiding fabric stability and performance than security.
Regards
Colin
Thanks for the reply, Colin. I had thought, however, that initiators couldn’t see each other anyway. That, via adapter FEX, that traffic was all isolated anyway. And if I’m only plugging into one array, there is no worry about multiple targets seeing each other. What am I missing here? Are you saying this is incorrect and prior to the zoning in 2.1, all initiators and targets could see each other in direct attached scenarios?
Thomas
Hi Thomas
If multiple initiators are in the same zone, or you are using the FI in Fibre Channel Switch mode with no upstream Cisco FC switch from which the FI can inherit the zone info, then there is nothing to stop the initiators seeing each other at FC level. Adapter FEX doesn’t prevent this potential communication; that’s just the way FC works.
Always best practice to only have one initiator per zone.
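As a rough illustration of the one-initiator-per-zone idea, here is a minimal Python sketch that builds zone membership lists from a set of initiators and targets. The function name, the zone naming scheme and the WWPNs are all made up for the example; this is not how UCS Manager or an MDS generates zones, just a way to picture the two layouts.

```python
from itertools import product


def single_initiator_zones(initiators, targets, multi_target=True):
    """Return {zone_name: [wwpn, ...]} with exactly one initiator per zone.

    initiators / targets: dicts mapping a friendly name to a WWPN string.
    multi_target=True  -> one zone per initiator containing every target.
    multi_target=False -> one zone per initiator/target pair.
    """
    zones = {}
    if multi_target:
        for i_name, i_wwpn in initiators.items():
            zones[f"z_{i_name}"] = [i_wwpn, *targets.values()]
    else:
        pairs = product(initiators.items(), targets.items())
        for (i_name, i_wwpn), (t_name, t_wwpn) in pairs:
            zones[f"z_{i_name}_{t_name}"] = [i_wwpn, t_wwpn]
    return zones


if __name__ == "__main__":
    # Placeholder WWPNs, purely illustrative.
    inits = {"esx01_fc0": "20:00:00:25:b5:0a:00:01",
             "esx02_fc0": "20:00:00:25:b5:0a:00:02"}
    tgts = {"spa0": "50:00:00:00:00:00:00:01",
            "spb0": "50:00:00:00:00:00:00:02"}
    for name, members in single_initiator_zones(inits, tgts).items():
        print(name, members)
```

Either layout keeps initiators from sharing a zone; the pairwise form simply produces more, smaller zones.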
Regards
Colin
Hi Colin,
How do you go about troubleshooting an IO module that is down, where the Fault tab displays “link failed or not connect” and a similar message is displayed on the primary FI?
How do I know whether it is the FI SFP, the IOM or the cable that is faulty?
Thanks.
JE
Hi Jia
“Link failed” would generally indicate an SFP or cable issue! If the IOM shows green in UCS Manager (or just look at the LED on the back of it), then the IOM will be OK. Also, if the IOM had failed I would expect a critical alert to that effect.
Ensure your FI port is configured as a Server Port and check the cabling.
Regards
Colin
Hi Colin ,
My question is on UCS Central. I can add domains, that is fine. When I look at the faults or events of those domains in UCS Central it says that it cannot connect to UCSM using the IP address. It almost looks like it’s trying to use a certificate bound to an IP rather than a name. This happens whether you’re using the default keyring or a CA certificate keyring. Why would it be trying to connect by IP rather than using DNS, given that during setup we specify a DNS server in UCS Central? Do I need to create certs based on IP and not name?
Thanks UCS Guru
Chris
Hi Chris
I personally haven’t seen this issue, every time I have setup UCS Central I have just used IP addresses to add in each UCS domain.
When I get some lab time I’ll see if I can replicate your issue. If you need this resolved ASAP then open a case with TAC.
Please update this thread if you get your issue resolved.
Regards
Colin
Question: On a 6200, when does a port consume a port license? And how do you release a port license from a port that is not in use? My system shows some ports in grace period when I have more licensed than are being consumed.
Hi Mike
Great question, port licences are required for every port on a UCS FI, but all FI’s come with several port licenses that are factory installed and shipped with the hardware.
Cisco UCS 6248 fabric interconnect—pre-installed licenses for the first twelve unified ports enabled in Cisco UCS Manager. Expansion modules come with eight licenses that can be used on the expansion module or the base module.
Cisco UCS 6296 fabric interconnect—pre-installed licenses for the first eighteen unified ports enabled in Cisco UCS Manager. Expansion modules come with eight licenses that can be used on the expansion module or the base module.
To emphasise: port licences are assigned to the first ports you activate, NOT to specific ports. If you have a licensed port that is unused, simply disable it and that licence becomes available again. This is especially relevant to unused FC ports, as they will always consume a licence unless disabled.
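To make the counting concrete, here is a small Python sketch of the rule as described above: every enabled port consumes one licence, and the licence pool is the factory-installed count plus eight per expansion module. It is a simplified model for reasoning about grace-period warnings, not the actual UCS Manager licensing logic; the function and field names are invented for the example.

```python
# Factory-installed licence counts quoted above.
BASE_LICENSES = {"6248": 12, "6296": 18}
EXPANSION_LICENSES = 8  # per expansion module


def license_status(model, expansion_modules, ports, extra_licenses=0):
    """ports: list of dicts like {"id": "1/1", "enabled": True}.

    Simplifying assumption: each enabled port consumes exactly one licence,
    whether or not anything is plugged into it.
    """
    installed = (BASE_LICENSES[model]
                 + EXPANSION_LICENSES * expansion_modules
                 + extra_licenses)
    used = sum(1 for p in ports if p["enabled"])
    return {"installed": installed,
            "used": used,
            "over": max(0, used - installed)}


if __name__ == "__main__":
    # 14 enabled ports on a 6248 with no expansion module: 2 over.
    ports = [{"id": f"1/{n}", "enabled": True} for n in range(1, 15)]
    print(license_status("6248", 0, ports))
    # Disable two unused FC ports and the count is back in compliance.
    ports[12]["enabled"] = ports[13]["enabled"] = False
    print(license_status("6248", 0, ports))
```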
Hope that clears things up.
Regards
Colin
Also I believe that in a 6248, once a port has been designated as FC, it consumes a license independent of whether the port is actually used or not.
noorani, I don’t believe that to be correct. I had to disable some FC ports for a client running a 6296 to bring the license count back into compliance.
What I mean is that once you use the slider and designate the port to be an FC uplink, it uses a license, independent of whether you actually have an FC cable plugged in.
That is true, once you configure the port for FC it will consume a license but you can just disable the port to get that license back, so to speak.
I tried to do the hot swap for HDD on UCS Blade system and I need to re-ack the chassis for new HDD. Is this the hotswap behavior? Can we have any explanation on this?
Hi
You certainly should not have to Re-Ack a chassis after swapping a hard disk.
The only time you need to Re-Ack a chassis is when you have changed the number of FI to FEX links or during an initial chassis discovery if you are using more links than that defined in your chassis discovery policy.
This certainly needs more investigation.
Colin
Hey Guru,
I’ve read through all of these comments but I don’t think I saw anything in here address the vSphere 5 configurable maximums. I’ve done some research but I can’t get a clear answer on this. Here is a quick background of my setup:
2204 Fabric Extenders
B200 M3 Blades with 1240 VIC
No additional mezz cards.
So now the question is – vSphere 5.x has 8 10Gbps NIC configurable maximum. If I carve up 8 10Gbps vNICs for VMware to use, does it actually count the 8 vNICs against the configurable maximum or does it only see the 2 physical NICs from the 1240 VIC?
Thanks in advance!
Hi Jeff
The two 10Gb traces per fabric on the VIC 1240 are completely transparent to ESXi. ESXi will only see the vNICs you have created in the Service Profile (this uses a similar technology to SR-IOV), what I crudely call “Virtual Physical NICs”.
However there is some Cisco “special sauce” that does not seem subject to the usual vSphere configuration maximums. While I’m not privy to exactly how this works, or how ESXi allocates memory to these “Virtual Physical NICs”, I know you can go a lot higher than the usual 8. From memory I think 32 static vNICs are possible, with many more dynamic vNICs possible (up to 116) if using VM-FEX.
All this said, I have rarely needed to use more than 8 static vNICs, as my preference is to use an HA pair of uplinks on a DVS like the Nexus 1000v and then perhaps another couple of pairs for management and vMotion vSwitches (if you don’t want the N1KV to handle these).
Regards
Colin
Hi,
I am not sure which management monitoring policy to choose in UCS Manager. The choices are quite easy to understand except for Media Independent Interface status; I don’t quite get what it is or what it does exactly. If you could shed some light on this for me, I would appreciate it.
Thanks,
Hi Noorani
I would recommend the “Ping Gateway” option as that in my opinion is the most logical option.
You can think of the “Mii Status” option as monitoring the internals of the MGMT 0 interface itself, or more accurately the interface between the Physical Layer chip (PHY) and the Ethernet MAC control chip.
Regards
Colin
Hi Colin,
I have a problem with UCS uplinked via FCoE from FIs to Nexus 5548UPs and want to see if you have encountered anything similar. After configuring a FCoE Uplink port-channel (2 interfaces, via twinax cables) from the FIs to Nexus switches and disabling the native FC Uplinks, storage performance is severely degraded (multipath software shows big queues).
Ethernet and vfc interfaces show no drop or errors on both FIs and Nexus switches.
But after reviewing “show queuing interface ethernet 1/5” I see a lot of packets discarded on ingress:
qos-group 1
q-size: 79360, HW MTU: 2158 (2158 configured)
drop-type: no-drop, xon: 20480, xoff: 40320
Statistics:
Pkts received over the port : 809739
Ucast pkts sent to the cross-bar : 743529
Mcast pkts sent to the cross-bar : 0
Ucast pkts received from the cross-bar : 67599
Pkts sent to the port : 67599
Pkts discarded on ingress : 66210
Per-priority-pause status : Rx (Inactive), Tx (Inactive)
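A quick way to sanity-check counters like these is to parse them and work out the discard ratio. The Python sketch below does that for the statistics lines quoted above; the function names are invented for the example, and the observation that received minus sent-to-cross-bar equals the ingress discards is simply what these particular numbers show, not a documented definition of the counters.

```python
import re

# Statistics lines copied from the output above.
SAMPLE = """\
Pkts received over the port : 809739
Ucast pkts sent to the cross-bar : 743529
Mcast pkts sent to the cross-bar : 0
Ucast pkts received from the cross-bar : 67599
Pkts sent to the port : 67599
Pkts discarded on ingress : 66210
"""


def parse_counters(text):
    """Turn 'name : number' lines into a {name: int} dict."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def ingress_discard_ratio(counters):
    """Fraction of received packets that were discarded on ingress."""
    received = counters["Pkts received over the port"]
    discarded = counters["Pkts discarded on ingress"]
    return discarded / received if received else 0.0


if __name__ == "__main__":
    c = parse_counters(SAMPLE)
    forwarded = (c["Ucast pkts sent to the cross-bar"]
                 + c["Mcast pkts sent to the cross-bar"])
    print("received - forwarded:", c["Pkts received over the port"] - forwarded)
    print(f"ingress discard ratio: {ingress_discard_ratio(c):.1%}")
```

On these figures roughly 8% of the packets received on the port are being dropped on ingress, which is a lot for a class that is configured as no-drop.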
I’ve found the same problem on Cisco Support Community: https://supportforums.cisco.com/thread/2189106 but no viable solution.
Could you point me in the right direction to troubleshoot.
Thanks in advance
Best Regards,
Adi
Hi Adi
Thanks for flagging this to me. When I tested out multi-hop FCoE in my lab I just used a single FCoE link for all LAN and SAN traffic between FIA and IOM A (no vPC) and the same for Fabric B. Must admit I didn’t do too much in the way of performance testing. But your issues and those listed on the CSC sound pretty dire.
Has anyone opened an SR on this? If not suggest you do, so this issue is officially recorded by TAC.
In the meantime I’ll see if I can replicate the issues in my Lab.
I’ll update this thread if I find anything, and would appreciate it if you would do the same if TAC resolves the issue.
I know the first Maint Release for 2.1 is due out shortly, so this may well be addressed if it was a known issue.
Colin
Hi Colin,
thanks for taking interest.
I’ve opened a service request and will update the thread if TAC resolves the problem.
Regards,
Adi
Hi,
here is a small update; I’ve spoken to TAC but they said we need to turn up the FCoE Uplinks so they could do the troubleshooting but unfortunately we were unable to since this is a production environment now. I’ve asked if they could recreate the problem in a LAB, still waiting for reply. They also suggested we update the Win2012 server HBA drivers since they were not the latest and test again.
Have you been able to test this problem in a lab environment, Colin?
Regards,
Adi
Just to update the thread, it seems that the problem was a QoS bandwidth reservation mismatch between UCS and Nexus.
Cisco community thread: https://supportforums.cisco.com/message/3897188#3897188
Adi
I have a UCS version 2.1 with (4) B200-M3′s and (2) 6248′s. My core switch is a Nexus 7000 with 10G ports available. I currently have FC storage presented and that is running fine. I want to add an iSCSI storage appliance with 10G ports and don’t know where to start. I don’t need to boot from iSCSI. Connect the iSCSI to a port on the Nexus 7K on its own VLAN and create an iSCSI vNIC in the Service Template?
Hi Brian
I appreciate your frustration; the focus certainly is on booting from an iSCSI target these days and plain old iSCSI access is often just passed over.
The simplest method is to create an additional regular vNIC (only use iSCSI vNICs for iSCSI boot), call it iSCSI, give it an MTU of 9000 and assign it to a QoS system class configured with an MTU of 9000.
Create a separate VLAN for iSCSI and a corresponding iSCSI subnet.
Then just configure your iSCSI via your OS as normal.
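One thing worth checking with jumbo frames is that every hop agrees on the MTU, since the usable frame size is bounded by the smallest MTU on the path. Here is a tiny Python sketch of that check; the hop names and the 1500 value for the upstream switch are hypothetical, chosen only to show a mismatch.

```python
def effective_mtu(hop_mtus):
    """Return (mtu, hop_name) for the smallest MTU along the path."""
    hop, mtu = min(hop_mtus.items(), key=lambda item: item[1])
    return mtu, hop


if __name__ == "__main__":
    # Hypothetical path: everything set to 9000 except the upstream switch.
    path = {
        "os_nic": 9000,
        "ucs_vnic": 9000,
        "qos_system_class": 9000,
        "upstream_switch": 1500,
    }
    mtu, hop = effective_mtu(path)
    print(f"effective MTU {mtu}, limited by {hop}")
```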
Regards
Colin
Have you seen bugs in UCS Central? I can barely get it to do anything after log in. There have been no updates since release and I have yet to see anyone report the problems I have been seeing. I was assuming I am the only one using it. Things like “Communications error, click OK to restart”.
Hi Daniel
I have had some intermittent connectivity issues with UCS Central, i.e. all is well, then perhaps when I choose to launch UCS Manager I get a communications failure or something. I resolved these by de-registering and re-registering the UCS Domain from UCS Central. My UCS Central has been fine since.
Regards
Colin
hello
I have a service profile that consists of 10vnics. this is because I am running 2 vnics for nexus 1000v. and the other 8 for 4 dvs switches – vmware mgmt, vmotion, vms and iscsi.
I needed this because I can change a vm from a dvs to the nexus anytime I want.
Do you see any issues with this service profile config?
thanks
Hi Colin,
Your tips for IT people are really great.
I have one more question regarding the FI.
The current FIs in our UCS infrastructure run an old version (4.1) and the UCSM version is 1.3. One of the FIs is out of order; it is dead now. We received a new one and it comes with the latest version.
Can we downgrade the IOS version of the new FI to match the current UCS infrastructure?
What is the best solution in this kind of situation? We are in production and downtime is a concern.
Thanks & Regards,
Aamir
Hi UCS guru
what a great blog, thanks a lot for your work here… absolutely fantastic.
I’ve also an unanswered question. Because I haven’t found anything which helps solve my problem, I will ask you to help me.
I want to install Windows Server 2012 Hyper-V on UCS B200 M3 blades with iSCSI boot from a NetApp iSCSI target. I couldn’t find a solution anywhere for how to configure it. In the past I configured iSCSI boot for vSphere ESXi 5.0 and it works great.
Can you help me, what steps I have to do, to implement iSCSI boot for a Windows Server 2012 Hyper-V host?
thanks and regards
Reto
Hi Reto
Apologies for the delay, I’ve been working away all week.
I’ve not done too much around Hyper-V as yet, I’m much more of a VMware guy
so I’ve had a quick look around too, most of the iSCSI boot for Windows docs / posts I have found apply to W2K8R2.
Sounds like you need to write a “how to” guide for Booting Windows server 2012 Hyper-V
Colin
Hi Colin, we are planning to build a small cloud with KVM and OpenStack,
we’d like to use five UCS C240s as hypervisors, and NetApp as storage.
What is the basic configuration you suggest using UCS, to reach the goal of having 10GbE from the beginning and a network that is scalable and ready to grow?
The matter is, we have a small budget to start, but we’d like to keep the possibility to grow by making some good choices!
Regards
Mark
Hi Mark
Can’t really give you a kit list without really going into your requirements.
But based on what you mention above, you may be better off looking at a small B-Series setup or even something like an ExpressPod or small FlexPod (if you haven’t already bought the storage). The key fact you mention is “ready to grow”: there is a tipping point where B-Series becomes a better value proposition than C-Series, which can be around the 5 rack mounts you mention.
Regards
Colin
Hi Colin,
I was wondering if you can provide some visibility on the process for configuring connectivity between a Cisco FI 6296, an MDS and a SpectraLogic T120 Tape Library? We are currently using a 5108 Chassis and B200 M3 blades, with vSphere 5.1 installed and RHEL 6.2 as the guest operating system. I would like to use a virtual machine for accessing the tape drives via 8Gb FC. From my understanding, these would be the steps required. Would you mind clarifying the process?
1) Connect Fabric from FI to MDS
2) Connect Fabric from Tape Library to MDS
3) Configure FI Ports as “FC Uplink”
4) Provide WWN and Zone Accordingly
5) Create vHBA Pool and Template
6) Assign SP Template to Blades
7) Confirm NPIV is enabled on MDS Switch
8) Configure VSAN on Both Sides {Default or Custom}
9) Configure Virtual Machine As Pass Through
10) Validate Connectivity
Cheers,
Mike
Hi Mike
I have had similar requirements in the past.
Last time I checked (and I don’t think things have changed) the vHBA’s on a Cisco VIC did not support pass through (VMware DirectPath I/O)
Whenever I have required tape drive access I have used a small blade with a Baremetal O/S
Regards
Colin
Hi, with all this UCS 2.1 goodness I’d love to get there. I am currently running 1.4 and would like to upgrade to 2.1. Are there any upgrade advisors out there to check the current system, and what (if any) gotchas are there going to 2.1? I realize there are Cisco docs out there and I have read them; albeit a bit dry, the intent is nonetheless to save your bacon. But how about a Reader’s Digest version of the upgrade?
I am running (2) chassis with 16 blades and I do have (2) FI’s
p.s.
(if it existed)
I like the idea of an upgrade advisor tool to run against it
Hi Pete
You’re doing the right thing. Dry though they may be, follow the correct upgrade guide, 1.4 to 2.1 in your case.
I still do, even after what has probably been 50 or so UCS production upgrades.
The main thing is to ensure your environment is stable and configured for fabric resiliency (so when the FIs get rebooted there are no unpleasant surprises).
Also, even though this process should not cause an unplanned outage, always perform the upgrade in a maintenance window.
Good luck
Colin
Thanks for the reply Colin. One of the things I was reading is something about no longer supporting the default VLAN. I looked at my current config and it shows the default ID is disabled. From seeing that I will assume it should be no problem for the upgrade to 2.1 from 1.43l?
Hi,
I have following devices with me and need your help on design.
2 UCS 5108 Chassis
16 B200 M2 blades
1 UCS 6100 Fabric Interconnect switch
1 Core Switch Nexus 5548 UP chassis 32 10 GbE Ports
1 SAN Switch MDS 9148
1 Storage EMC VNXe 3100 with 12 TB (3 TB SAS + 9 TB SATA)
Hi, the above is fine, but obviously you will have no redundancy in the Fabric Interconnects and will only have half the available bandwidth of your server mezzanine adapters.
While you could use your MDS as SAN Fabric A and portion off a block of your Nexus 5548UP as SAN switch B, I don’t really see the point given you only have one FI.
I guess this is a lab setup or Test / Dev or something.
Anyway, your kit list pretty much dictates your design, which would be something similar to the below.
Regards
Colin
Thanks for your reply. There are some changes in my devices. Now I have following devices with me.
2 UCS 5108 Chassis with 4 2204XP I/O modules
16 B200 M2 blades
16 Cisco UCS VIC 1240 modular LOM for M3 blade servers
16 Cisco UCS VIC 1280 dual 40Gb capable Virtual Interface Card
1 UCS 6248UP Fabric Interconnect switch
1 Core Switch Nexus 5548 UP chassis 32 10 GbE Ports
1 SAN Switch MDS 9148
1 Storage EMC VNXe 5100 with 12 TB
My confusion is: can I connect 2 FEX from one chassis to 1 Fabric Interconnect, and what is the use of the 1240 and 1280 VIC if I cannot connect 2 FEX to 1 FI?
Regards
Hi Mike, I have a question regarding the speed/bandwidth of the vNICs we create out of the VIC 1240/1280. Let’s say I am using 6248 FIs and 2208 IOMs and I connect all 8 ports from each I/O module to the FI, so I will have 80Gb of bandwidth from each IOM to the FI. I have a VIC 1240 adapter on all 8 blades in the chassis and I have ESXi installed on all 8 blades. Let’s say I want to have 4 NICs on each ESXi host. What will be the speed of each NIC on each host? Is it 10Gb (that’s what we see in vCenter), or is it less than 10Gb, or does this speed depend on the number of connections between the IOM and FI?
Thanks,
Bhargav
Hi Bhargav
If you are using a VIC 1280 with a 2208 and are using all 8 FI to FEX links, then each one of your blades will be able to drive up to 40Gb of I/O per fabric (up to 80Gb total).
As far as your vNICs are concerned, yes they will be reported by the OS as 10Gb, but the traffic emanating from those vNICs will be load balanced across each of the 4 x 10Gb traces (KR ports) on the VIC 1280 for the fabric to which the vNIC is mapped. EACH of these load-balanced flows is limited to 10Gb, so your single vNIC could in theory drive up to 40Gb, but this is likely more dependent on the OS driver and OS policy. (I feel a network traffic generator test coming on.)
The fact that you have 80Gb of bandwidth between the FEX and FI means that there is no inherent bottleneck between them for a blade driving 40Gb of I/O.
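The arithmetic behind those figures is simple enough to write down. The short Python sketch below just multiplies out the numbers quoted in this answer (4 traces of 10Gb per fabric on the VIC 1280 with a 2208, and 8 FI to FEX links of 10Gb each); the function names are invented for the example and nothing here queries real hardware.

```python
TRACE_GBPS = 10  # each KR trace / FEX uplink is 10Gb


def blade_bandwidth(traces_per_fabric, fabrics=2):
    """Return (per-fabric, total) bandwidth in Gb for one blade."""
    per_fabric = traces_per_fabric * TRACE_GBPS
    return per_fabric, per_fabric * fabrics


def iom_uplink_bandwidth(fex_links):
    """Bandwidth in Gb between one IOM (FEX) and its FI."""
    return fex_links * TRACE_GBPS


if __name__ == "__main__":
    per_fabric, total = blade_bandwidth(traces_per_fabric=4)
    uplink = iom_uplink_bandwidth(fex_links=8)
    print(f"blade: {per_fabric}Gb per fabric, {total}Gb total")
    print(f"IOM to FI: {uplink}Gb")
    # A single flow is still capped at the speed of one trace.
    print(f"single flow ceiling: {TRACE_GBPS}Gb")
```

Bear in mind the 80Gb of IOM uplink is shared by all eight blades in the chassis, so it removes the bottleneck for one blade driving 40Gb, not for all of them at once.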
Hello,
Wondering if you can help, I am a UCS novice, however I have been tasked to set up boot from iSCSI SAN. I am currently using firmware 2.1(1a). Any help or pointers would be greatly appreciated.
Hi Neil
The below link details the setup of iSCSI boot.
http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/2.0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_011101.html#concept_CFF6B18F18684915816935F89B62CCAC
Craig over at RealWorldUCS has also done a nice iSCSI boot video.
Regards
Colin
Hi there,
so I have been working on Vblock installs for a while now, typically just handling the storage and fabric portion. My question is: when doing a manual blade add it asks for a vHBA for the boot policy, and I usually use whatever WWPN I have zoned to the new hosts, such as VNX SPA or SPB ports. I am a little confused about
Hi Lee
When you are creating a manual Service Profile, that’s when you specify how many vHBAs you want in your server, commonly 2; I tend to call them fc0, mapped to Fabric A, and fc1, mapped to Fabric B.
The server will then want a single WWNN for the blade and each vHBA will want a WWPN. You can either pull these from a pool you create (by far the best and easiest method) or you could manually define them.
This is great as you are able to code a lot of granular detail into your WWPNs, which can make troubleshooting a dream (example below).


[Image: WWPN naming example]
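As a minimal sketch of what coding detail into a WWPN can look like, the Python below packs a site number, the fabric and a host index into the last three octets. The 20:00:00:25:b5 prefix is the one commonly recommended for UCS WWN pools; the meaning given to the remaining octets is purely a hypothetical scheme for this example, so adapt it to whatever convention you actually use.

```python
CISCO_WWPN_PREFIX = "20:00:00:25:b5"  # commonly recommended UCS pool prefix

# Hypothetical convention: octet 6 = site, octet 7 = fabric, octet 8 = host.
FABRIC_TO_BYTE = {"A": 0x0A, "B": 0x0B}
BYTE_TO_FABRIC = {value: key for key, value in FABRIC_TO_BYTE.items()}


def build_wwpn(site, fabric, host_index):
    """Encode site (0-255), fabric ('A'/'B') and host index (0-255)."""
    if not (0 <= site <= 0xFF and 0 <= host_index <= 0xFF):
        raise ValueError("site and host_index must fit in one octet")
    fabric_byte = FABRIC_TO_BYTE[fabric.upper()]
    return f"{CISCO_WWPN_PREFIX}:{site:02x}:{fabric_byte:02x}:{host_index:02x}"


def decode_wwpn(wwpn):
    """Recover the fields encoded by build_wwpn."""
    octets = wwpn.lower().split(":")
    site, fabric_byte, host_index = (int(o, 16) for o in octets[5:8])
    return {"site": site,
            "fabric": BYTE_TO_FABRIC[fabric_byte],
            "host_index": host_index}


if __name__ == "__main__":
    wwpn = build_wwpn(site=1, fabric="A", host_index=16)
    print(wwpn)
    print(decode_wwpn(wwpn))
```

With a scheme like this, seeing an "0a" or "0b" in a WWPN on the array or switch tells you straight away which fabric the initiator should be logged into.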
Once the Servers vHBAs are configured and have addresses, you can then create your Boot from SAN Policy. (example below)
[Image: example Boot from SAN policy]
You would generally add two targets per vHBA (these would be the WWPNs from your VNX SP ports).
The diagram below shows a typical UCS SAN setup

[Diagram: typical UCS SAN setup]
There is loads of info in previous answers on this page about how to setup and troubleshoot boot from SAN.
Good Luck and keep them Vblocks rolling!
Regards
Colin
Hello Colin,
First of all, I’m really impressed by your knowledge and all the different topics you are talking, or writing, about on Cisco UCS! Great site! Thanks a lot!
Actually I have two questions about Cisco UCS, and hopefully you have an answer to each of them for me.
Just to give a brief overview of our environment:
2 x datacenters with each of them having:
- 2 x UCS FI 6248UP (in Cluster)
- Each FI has 2 x 10GE-LAN-Uplinks and 6 x 8Gbit/s-SAN-Uplinks (to Brocade DCX Fabrics)
- 4 x UCS 5108 Chassis with each having two 2204XP I/O Modules (two ports of each 2204XP are wired)
- Each Chassis has 8 x B200 M3 Blades
- Each Blade Server has two E5-2680, 192GB RAM (24×8), one VIC 1240 LOM, no local storage
Ok, let me come to my questions:
1) As we have 32 blade servers per datacenter, I have to assign 32 management IPs for the blade CIMCs plus 32 management IPs for the Service Profiles. Due to the fact that the management IPs are placed in a pooled IP range, I thought I just had to reserve something like 10 IPs and UCS would take care of assigning an IP to a blade server or a Service Profile that should be viewable through KVM. If I just assign 10 IPs to the pool and connect the FI to UCS Central, it does not even allow the association of newly created Service Profiles to blade servers… It gets stuck in the creation process. If I disconnect UCS Central it allows me to create SPs just until the pool is full.
2) For troubleshooting reasons I wanted to disable consecutively the assigned vHBAs to a Service Profile. Each SP has two vHBAs (vHBA0 and vHBA1). As I didn’t know how, and the complete environment is still under construction, I was able to disable all SAN-Uplinks, first of fabric A, then of fabric B. But in production usage I think this won’t be possible. So is there a possibility to shut down a virtual interface (may it be veth or vfc)?
Thanks a lot!
Regards
Hannes
Hi Hannes
Thanks for the question
Not sure I fully understand the first question,
The Management IP address is mandatory for the Blades (CIMC) and as soon as you create the pool All blades will take one.
The Management IP addresses for Service Profiles are optional (use them if you need a consistent KVM address if the SP moves to a different blade).
I certainly haven’t had any issues around UCS Central caused by Management IP Pools (every UCS Domain I have added has always had plenty of Management IP addresses in it).
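To put numbers on the pool sizing, here is a small Python sketch that compares the size of a management IP block with what is needed: one address per blade CIMC, plus one per Service Profile only if you choose to give Service Profiles their own addresses. The IP range is a made-up example and the helper names are invented; only the standard library `ipaddress` module is used.

```python
import ipaddress


def pool_size(first_ip, last_ip):
    """Number of addresses in an inclusive first..last block."""
    first = ipaddress.ip_address(first_ip)
    last = ipaddress.ip_address(last_ip)
    if last < first:
        raise ValueError("last_ip must not be lower than first_ip")
    return int(last) - int(first) + 1


def required_mgmt_ips(blades, service_profile_ips=0):
    """One IP per blade CIMC is mandatory; Service Profile IPs are optional."""
    return blades + service_profile_ips


if __name__ == "__main__":
    have = pool_size("10.0.0.11", "10.0.0.20")  # hypothetical 10-address block
    need = required_mgmt_ips(blades=32)
    print(f"pool has {have}, need at least {need}, short by {max(0, need - have)}")
```

For the 32-blade environment described, a 10-address pool is 22 short before any Service Profile addresses are considered.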
2) No, afraid you can’t shut down a veth or vfc within an FI (would be good if you could). What I have done in the past as the next best thing is disable the NIC under the host OS for the vNIC, and put the vHBA into a VSAN that does not exist in the upstream fabric (this brings down the HBA).
Feel free to fire back with some more info.
Regards
Colin
Hi Colin,
thanks for the answer.
For question 2) Thank you for the workaround. A quick possibility of “shut” and “no shut” would also be great from Cisco. Let’s see what future versions bring with them…
For question 1)
I wasn’t aware that the Management IP addresses are mandatory for the Blades (CIMC)…
My understanding of a Management IP-Pool was, that I e.g. have 10 IP addresses and these 10 IP addresses are mostly unused. But when a KVM connection is needed, UCS should associate one free IP of the 10 IPs to the requested Blade. And after the KVM session is closed, the IP should be released from the blade.
So to say, to “save” IP addresses…
So if I got you right I must have at minimum the exact number of management IP addresses as the number of blades I’ve got. Right?
But I don’t have to have furthermore reserved IP address for Service Profiles, if I don’t want to have static management IPs for ServiceProfiles, do I?
Thanks in advance
Hannes
Hi Colin,
Like everyone here, I really appreciate the clarity and depth to this product line. I also appreciate your responsiveness to this venue. Bravo!
I posted a message further up in response to one of your responses, but I wasn’t sure if you’d catch it so I thought I would re-post it here with the most recent questions and provide a little more detail.
On October 14th 2012 you answered Tarun’s question about Static and Dynamic Pin Groups. My question is along those lines.
I think I know the answer, but I thought I would post it to make sure. Starting with the assumption that you have one static pinned uplink and two dynamically pinned uplinks (in a port channel). I realize that the only traffic destined for the static uplink would only be that which is in the pin group. Everything else would crowd out the dynamic ports. What happens if both of the dynamically pinned uplinks go down? I assume the traffic will use the only uplink left… the static pinned uplink. Any port in a storm I guess. Is that correct?
Here’s why I ask the question… we have a local 8509 in our datacenter that we will have both fabrics attach to via port channels. For fail-over use only, we’d like to connect single links from each fabric to another 8509 that we’ve got in another part of the building using long-range optics. We don’t want any traffic to use the fail-over link unless there is a failure of the datacenter 8509. Since I’m not aware of any ability to put a cost on the paths, I thought a statically pinned uplink along with an empty pin group associated to the fail-over uplink would do it and offer us the greatest amount of flexibility. Does that sound reasonable?
Thanks for any help you can offer!
Troy
A quick follow-up to let you know where I came up with this idea. @2:10 in Brad Hedlund’s Part 7a video on End-Host Mode Pinning indicates this is the case, but I just wanted to verify this because I wasn’t able to find it in the documentation. Brad’s videos: http://bradhedlund.com/2011/03/08/cisco-ucs-networking-videos-in-hd-updated-improved/
Hi Troy
Firstly, apologies for my lateness in answering, I’ve been working away all week and have been totally maxed out.
Anyway, I did answer your first question about an hour ago, but thanks for adding in the additional info, it gives the question the context it needed.
But afraid the answer is still the same: the Dynamic vNICs will not fail over to the Static Target, but instead will either fail over to the other Fabric (if configured for fabric failover) or just rely on having a teamed member in the other fabric.
You could have a link or channel to the non DC switch but you would need a L2 port-channel between your DC switch and your Remote switch. Which may have a sub-optimal effect on your traffic paths.
Perhaps better to put the link in and disable it at the FI end, and enable it in a DC switch failure scenario (if you can still get to it that is)
Regards
Colin
I have a question: why am I not able to set the BIOS backup version for some blade models? For example, I have a B230 M2 blade with a VIC 1280 and UCSM firmware 2.0(4b). I am unable to update the BIOS firmware or set the backup version; it appears as N/A. Why? Does the hardware not support upgrading the BIOS via the GUI?
Hi
It may well be that your running BIOS version is the first version that supports your hardware, thus a backup version is unavailable.
Regards
Colin
Colin,
I and my implementation partner have been scratching our heads about an error that we received. First, let me define my environment. I have a Cisco UCS 5108 with 6 UCS B200 M3 blades. We are using an EMC VNXe 3100 for storage (iSCSI), and VMWare 5.1 Standard for virtualization.
We have the blades installing the VMWare without any problems. The problem comes on the reboot. We receive an error that Bank5 and Bank 6 are not VMWare boot bank, and no HyperVisor is installed.
The common answer is that did we change the boot from UEFI to BIOS, and the answer is no.
Any advice or direction would be greatly appreciated.
Thanks in advance for your assistance. Jimmy
Hi Jimmy
Must admit have not seen that issue before.
Are you using the Cisco OEM version of 5.1? available here.
If you still have issues, fire back or open a Service Request.
Colin
Colin,
Yes, we are on that version. The problem was the fact that the iSCSI NIC was set for jumbo frames with an MTU 9600. The underlying NIC was set for the same MTU. In some other research, we found that even if the iSCSI NIC is set with an MTU 9600, the underlying nic needs to be set at 1500. Once we made that change, the boot from an iSCSI SAN worked.
Jimmy
Total newbie question…created a server profile, specifying local disk in the boot policy as 1st in order, with CD-ROM as 2nd in order. (The boot policy instance specifies a Raid-1 configuration).
The server boots to the CD-ROM as expected (RHEL 6.4 install iso); but after successful installation and reboot the server does not/will not boot from local disk despite the policy. Am I missing a step somewhere?
I was able to install/boot another blade without a problem. Interesting that in comparing Bios settings, the problem blade does not see the disk but the successful blade does. I’m suspecting a hardware/controller problem; however, the bios will not let me disable quiet mode so I can’t get into the controller setup. If anyone has any other observations/thoughts I’d welcome them.
Thanks,
Peter
Hi Peter
No, sounds like you’re doing everything right. If your hard disk is above your CD-ROM in the boot policy and it has a bootable OS on it, then that should be all that is required.
If your same Service Profile works on one blade but not another, then as you say sounds like that blade may have an issue.
Sounds like the best thing to do is open a service request.
Colin
Hi ucsguru,
I wonder if you can give me a quick explanation. We have Vethernet interfaces created for the vNIC paths to the servers; however, when a blade is decommissioned our monitoring tool sees these as down. When we put a new blade in and commission it, UCS dynamically creates new Vethernet interfaces but the old ones are not removed, and our monitoring tool keeps logging alerts for these interfaces showing as down.
Should we stop monitoring Vethernet interfaces, as these are dynamically created by UCS and cannot be statically monitored? Does UCS not remove old unused Vethernet interfaces?
Thanks in advance.
Simon
Hi Simon
You are correct that vEths are dynamic and will be created and deleted as Service Profiles are associated and disassociated from the blades.
First a quick word on vEth behaviour.
The vNICs, and therefore their corresponding vEths, are created when the Service Profile is associated to the blade (during VIC programming, while automatically PXE booted from the UUOS ISO on the primary FI). Once created, the vEth remains down (nonParticipating) until the server O/S boots.
If the server is shut down, the vEth remains but is brought to a down state (BoundIFDown).
If the Service Profile is disassociated from a blade, then the vEth is deleted from the Fabric Interconnect. (If yours are not, it may be a bug in your UCSM version.)
If the Service Profile is then re-associated (whether to the same blade or not), UCS Manager will assign different channel IDs, and therefore different vEth IDs, to the vNICs.
So now we understand the vEth behaviour, you can make the decision whether to monitor them or not. If your environment is static and Service profiles don’t get disassociated very often then there is probably no harm monitoring them. If you have a more dynamic environment then you will probably get a lot of “False Positives”.
That said, all the causes I can think of that could bring a vEth unexpectedly down will almost certainly generate another alert, i.e. failed FI port / failed VIC. And there is not much chance of a virtual cable being pulled out.
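To make the lifecycle above concrete, here is a minimal Python sketch of it. It is only a toy model for reasoning about monitoring, not UCS Manager code: the class name, the starting ID of 700 and the "up" state label are assumptions, while nonParticipating and BoundIFDown are the states mentioned above.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict


@dataclass
class FabricInterconnect:
    """Toy model of vEth handling on one FI (hypothetical, for illustration)."""
    _next_id: count = field(default_factory=lambda: count(700))  # arbitrary start ID
    veths: Dict[int, str] = field(default_factory=dict)          # vEth ID -> state

    def associate(self) -> int:
        """Service Profile associated: vEth is created but stays down."""
        veth_id = next(self._next_id)
        self.veths[veth_id] = "nonParticipating"
        return veth_id

    def os_boot(self, veth_id: int) -> None:
        """Server O/S boots: vEth comes up ("up" is an assumed label)."""
        self.veths[veth_id] = "up"

    def shutdown(self, veth_id: int) -> None:
        """Server shut down: vEth remains but goes down."""
        self.veths[veth_id] = "BoundIFDown"

    def disassociate(self, veth_id: int) -> None:
        """Service Profile disassociated: vEth is deleted from the FI."""
        del self.veths[veth_id]


fi = FabricInterconnect()
first = fi.associate()
fi.os_boot(first)
fi.shutdown(first)
fi.disassociate(first)
second = fi.associate()   # re-association gets a different vEth ID
print(first, second, fi.veths)
```

The point for monitoring is that a tool polling the first ID by number will never see it come back after a re-association, which is where the false positives come from.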
Hope that all helps guide you.
Regards
Colin
Hi Colin,
Many thanks. I have another question: if I add new RAM into a blade, does the service profile need to be disassociated/re-associated for the changes to take effect?
No, you don’t. Think of it the same way as adding RAM to a regular physical server. Power down, install the RAM and power back up.
Okay well that causes me an issue, as the server got stuck in a reboot cycle and had to be hard shutdown and disassociated/re-associated from profile before it worked.
Thanks UCSguru for your reply. There are some changes in my devices. Now I have following devices with me.
1 UCS 6248UP Fabric Interconnect switch
2 UCS 5108 Chassis with 4 2204XP I/O modules
16 B200 M2 blades
16 Cisco UCS VIC 1240 modular LOM for M3 blade servers
16 Cisco UCS VIC 1280 dual 40Gb capable Virtual Interface Card
1 Core Switch Nexus 5548 UP chassis 32 10 GbE Ports
1 SAN Switch MDS 9148
1 Storage EMC VNXe 5100 with 12 TB
My confusion is: can I connect 2 FEX from one chassis to 1 Fabric Interconnect, and what is the use of the 1240 and 1280 VIC if I cannot connect 2 FEX to 1 FI?
Also can you please tell me what kind of cable (part number) I need to connect MDS to Fabric Interconnect and VNX 5100.
Can I use VMware Free Hypervisor in above environment?
Regards
Hi
In answer to your questions.
Q) “Can i connect 2 FEX from one chassis to 1 Fabric Interconnect”
A) Absolutely not! Both fabrics must remain isolated from each other (see below)
Q) what is the use of 1240 and 1280 VIC if I can not connect 2 FEX to 1 FI
A) You will still get 20Gb of usable Server bandwidth with a VIC 1280 and your 2204XP IOM over your single Fabric
Q) Can I use VMware Free Hypervisor in above environment?
A) Yes, standalone ESXi (Free) is fully supported.
Regards
Colin
Thanks Guru.
As per your explanation, even though I have 2 FEX per chassis, I can connect only the left-side IOM from each chassis to the single Fabric Interconnect, and the right-side FEX from each chassis will be idle.
Do I need both 1240 and 1280 VIC? Can I work with only VIC 1280 and get 20 GB of usable server bandwidth?
Is 1240 and 1280 for adapter redundancy? If yes how to configure it?
With above setup in future can I add one more FI and get redundancy easily?
Regards,
Hello Guru,
Was reading through multiple posts of yours. Great stuff. The vmware port-group diagram with vNIC is an awesome depiction.
Can you please share the Dropbox link with me? The link you have provided doesn’t seem to work.
https://dl.dropbox.com/u/36029701/VMware-Stencil1-UCSguru.zip
Thank you
Cheers,
Sandeep
Hi Sandeep
Yes, apologies. I only have a 2GB Dropbox and needed the space; maybe one day I’ll buy some more space.
I have added the file back in.
https://dl.dropbox.com/u/36029701/VMware-Stencil1-UCSguru.zip
Regards
Colin
Hi Colin,
I have a couple of good questions about direct-connected SAN cabling.
1. When we connect VNX directly to Fabric Interconnects and want to use native FC, which TwinAx cables do we use? “Active”?
2. When we connect VNX directly to Fabric Interconnects and want to use FCoE, which TwinAx cable should we use?
3. When we connect VNX directly to Fabric Interconnects and want to use iSCSI protocol, which TwinAx cables do we use?
Thanks!
Hi Alexey
1) TwinAx is Ethernet, not native FC.
2) / 3) The VNX range supports active TwinAx cables.
There is a section on VNX and TwinAx in the whitepaper below.
http://www.emc.com/collateral/hardware/white-papers/h8217-introduction-vnx-wp.pdf
Regards
Colin
Hello UCSGuru,
How to connect FC Tape library to UCS environment to take data backup of UCS blades? Can we connect it to Fabric Interconnect?
Regards,
Hi. Yes you can, and with UCSM 2.1 you can now create zones on the FIs.
You will need to put the FC portion of the FIs in switch mode for direct N Port connection and configure the FI port which connects to the tape library as an “FC Storage Port”.
Have a read of the below doc which covers all the FC Zoning configuration http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/2.1/b_UCSM_GUI_Configuration_Guide_2_1_chapter_011011.html
Regards Colin
Hello UCSGuru, I have a problem that is really puzzling me. I have a new VM on the UCS chassis that is utilizing a newly created vlan on the Nexus 1kV, 5k, and Nexus 7k devices. After configuring the ip information on the VM it will not gain network connectivity. I have stand alone servers on the same vlan outside of the UCS working fine. The 1kV, which is connected directly to the UCS chassis can ping the vlan gateway(10.130.1.254) just fine, but it cannot ping the VM ip address (10.130.1.16). All other vlans are working on the UCS chassis. Any help would be appreciated.
Hi
Nexus 1000v Side
I would confirm that the VLAN you are using for your VM (the vEth port-profile) is allowed on your physical system uplink (Eth port-profile).
Also check that your VLAN exists in the VLAN database of the Nexus 1000v.
UCS Side
Also check that the VLAN is created within the Cisco UCS VLAN database and tagged on the ESXi host vNICs that provide connectivity to the VEM.
The fact that the Nexus 1000v can ping the gateway only means the N1kv management is OK. The N1kv is only a layer 2 switch, so unless your VM is actually on the same subnet as your N1kv MGMT0 interface, this test doesn’t help much.
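The checks above can be written down as a small Python sketch. Everything in the example data is an assumption for illustration: VLAN ID 130, the layer names, the /24 mask and the mgmt0 address 10.10.1.5 are invented, only the two 10.130.1.x addresses come from the question.

```python
import ipaddress
from typing import Dict, List, Set


def layers_missing_vlan(vlan_id: int, layers: Dict[str, Set[int]]) -> List[str]:
    """Return the names of the layers where the VLAN is not defined/allowed."""
    return [name for name, vlans in layers.items() if vlan_id not in vlans]


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """True if both IPv4 addresses fall inside the same subnet."""
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Hypothetical inventory: VLAN 130 is an invented ID for the new VLAN.
layers = {
    "N1kv VLAN database": {10, 20, 130},
    "N1kv Eth (uplink) port-profile": {10, 20},   # 130 not allowed here
    "UCS VLAN database": {10, 20, 130},
    "ESXi host vNICs": {10, 20, 130},
}
missing = layers_missing_vlan(130, layers)

# Assumed mgmt0 address and mask: a ping from mgmt0 only says something
# about the VM's VLAN if both sit in the same subnet.
ping_test_is_meaningful = same_subnet("10.10.1.5", "10.130.1.16", 24)
print(missing, ping_test_is_meaningful)
```

Any layer returned in the list is a place where the new VLAN still has to be added before the VM can be expected to reach its gateway.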
Good luck
Regards
Colin
We are planning some migrations to the UCS platform. There will be virtual as well as physical servers. Have you successfully completed any sort of bare metal restore (or P2P) to the Cisco UCS blades?
Hi Jason
Yes, many, of most flavours. My product of choice for this is PlateSpin Migrate. I have had issues getting the Cisco VIC drivers into the PlateSpin boot ISO, but found our PlateSpin supplier fairly helpful in assisting with this if we sent them the Cisco drivers.
Regards
Colin
Update capability catalog?
Since I am still pretty new with UCS: I was adding a C rack mount server to our UCS, which is still on 1.4. I have the FEX 2232PP on there and notice that they are stuck in “identifying”; if the FIs can’t see those, they aren’t going to see the C server. I read in a fault message (can’t think of the number off the top of my head) that the catalog needs updating. I assume this is as simple as downloading the catalog and updating it? It should not disturb anything in the environment? Firmware is at 1.4 and the catalog is at 1.039 (or something like that). Any gotchas I need to know about?
Another question I have concerns a potential upgrade from 1.4 to 2.1. The Cisco doc states:
Default Zoning is Not Supported in Cisco UCS, Release 2.1(1a) Onwards
Default zoning has been deprecated from Cisco UCS, Release 2.1(1a) onwards. Cisco has not supported default zoning in Cisco UCS since Cisco UCS, Release 1.4 in April 2011. Fibre Channel zoning, a more secure form of zoning, is available from Cisco UCS, Release 2.1(1a) onwards. For more information about Fibre Channel zoning, see the Cisco UCS Manager configuration guides for the release to which you are planning to upgrade.
Caution
All storage connectivity that relies on default zoning in your current configuration will be lost when you upgrade to Cisco UCS, Release 2.1(1a) or a later release. We recommend that you review the Fibre Channel zoning configuration documentation carefully to prepare your migration before you upgrade to Cisco UCS, Release 2.1(1a) or later. If you have any questions or need further assistance, contact Cisco TAC.
What is that exactly? Is it basically saying “If you use the default zone that Cisco has when you install UCS it will no longer be supported, and if you use your own VLANs then you have no worries”?
Sorry for the ignorance. I just like this stuff, and unfortunately an upgrade like this is nothing I can practice on except for the production equipment.
I also notice it says it’s not supported in 1.4, which is the version I am on. I am curious as I want to stay away from any issues.
Hi Pete
What is meant by “Default Zoning” is that the default zone in each VSAN can be either “enabled” (fully open) or “disabled” (fully closed). Enabled means that all traffic is permitted among members of the default zone, so it relies solely on masking at the array to prevent initiators seeing LUNs they shouldn’t.
Zone inheritance was supported, in that if you uplinked to a Cisco MDS/Nexus SAN switch the FI could “inherit” the zone info, but in reality, if you have an MDS or Nexus SAN switch in your environment, why wouldn’t you just connect your targets to them rather than direct to the FI? (If you have run out of ports / capacity perhaps, but it is still not a great solution.)
Default zoning is not recommended for production environments, due to the openness and the potential for initiator-to-initiator traffic, target-to-target traffic, or any combination of each, which may cause issues.
If you upgraded a working default-zoned FI on <2.0x to 2.1, all of your member ports within that default-zoned VSAN would suddenly stop being able to talk to each other, and the FI would have to be zoned, as per a normal SAN switch, to re-establish the required fc connectivity.
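A rough Python sketch of the difference between an open default zone and explicit zoning may help here. It is a simplified model of zoning logic in general, not of how the FI implements it, and the WWPN strings are made-up placeholders.

```python
from typing import FrozenSet, List


def can_communicate(a: str, b: str,
                    zones: List[FrozenSet[str]],
                    default_zone_permit: bool) -> bool:
    """Two ports may talk if they share an explicit zone, or if both are
    unzoned (i.e. in the default zone) and the default zone is open."""
    if any(a in zone and b in zone for zone in zones):
        return True
    zoned = set().union(*zones)
    both_in_default_zone = a not in zoned and b not in zoned
    return default_zone_permit and both_in_default_zone


# Placeholder WWPNs, invented for the example.
initiator = "20:00:00:25:b5:00:00:0a"
target = "50:06:01:60:00:00:00:01"

# Default zoning enabled (fully open): traffic flows with no zones defined.
before_upgrade = can_communicate(initiator, target, [], True)

# Default zone closed and nothing zoned yet: connectivity is lost.
after_upgrade = can_communicate(initiator, target, [], False)

# After creating an explicit zone with both members, connectivity returns.
zone = frozenset({initiator, target})
after_zoning = can_communicate(initiator, target, [zone], False)
print(before_upgrade, after_upgrade, after_zoning)
```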
Hope that clears things up for you.
Regards
Colin
If you have a multihop FCoE design (edge-core-edge) and you use QoS System Classes in the UCS, do all the upstream switches need to support the CoS values as well, so that the markings on the packets are honoured?
Hi Saied
I would think that if all of your ports end to end are FCoE ports, and as such support the DCB standards (in particular 802.1Qbb, Priority-based Flow Control), then the default CoS assigned to fc traffic (3) will be honoured with the standard no-drop class and 50% weight.
If at any point your traffic traverses native Ethernet links, then you would need to define a policy for CoS 3 to have jumbo frames and the weight you require.
Anyone else got any input?
Colin
Hi Colin!
I just had an odd event on one of my UCS installations. Both FI connections to my core switches blipped several times over an 8 minute period, dropping for right around a second each time. I don’t seem to see anything in the logs on the UCS. Is there a good place to go looking for that? Thanks!
Jerry
Hi Jerry
Not something I have seen before.
I’m assuming your upstream switch ports are configured and channelled correctly, i.e. “mode active”, and set to “Edge Trunk” if Nexus and carrying multiple VLANs.
We would certainly need more info to troubleshoot this, i.e. UCS code level, config of UCS and upstream switches, whether this was a one-off or occurs regularly, etc.
This is probably not the best forum for it, but definitely check the upstream switch config.
Regards
Colin
Hello,
We need to enable jumbo frames in our 7k/UCS/1k environment, and I have read the docs that explain fully how to accomplish this (by changing the QoS system class), there is nothing mentioned about any impact to traffic. I know, for instance, that when you change the MTU on physical links (vPC links, or even regular links) the ports bounce, so I am wondering if anyone here can point to documentation, or has explicit experience with whether any traffic disruption will occur when we change the QoS class to an MTU of 9216.
We are using updating vNIC templates in our environment, which call on the QoS policies defined.
Hi James
I have certainly changed the default MTU on the fly on live production 5500s with no impact, and would expect the same on a 7K. Just remember that if there are multiple hops in the path, your MTU needs to be set end to end, otherwise excessive fragmentation could occur, which may actually reduce performance or cause connectivity issues.
Procedure for enabling Jumbo Frames on 5k and 7k
http://www.cisco.com/en/US/products/ps9670/products_configuration_example09186a0080b44116.shtml
The default MTU on the Nexus 7k is 9216 anyway (look for the line “system jumbomtu 9216” in your running config). If you need to change the default MTU on L3 interfaces (1500), just use the mtu command under that interface.
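The end-to-end point is easy to sanity check on paper before touching anything. Below is a small Python sketch, assuming you have noted down the MTU configured at each hop; the hop names and the 1500 left on the L3 interface are invented for the example.

```python
from typing import Dict, List


def path_mtu(hop_mtus: Dict[str, int]) -> int:
    """The usable end-to-end MTU is the smallest MTU along the path."""
    return min(hop_mtus.values())


def undersized_hops(hop_mtus: Dict[str, int], wanted: int) -> List[str]:
    """Hops whose MTU is below the frame size we want to carry."""
    return [hop for hop, mtu in hop_mtus.items() if mtu < wanted]


# Hypothetical path, values noted down by hand from each device.
hops = {
    "UCS QoS system class": 9216,
    "Nexus 5k": 9216,
    "Nexus 7k L3 interface": 1500,   # left at the L3 default
    "Nexus 7k L2 port": 9216,
}
print(path_mtu(hops))
print(undersized_hops(hops, 9000))
```

Any hop the second function returns is one that would fragment or drop a 9000 byte frame, so it needs fixing before jumbo frames will help.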
Regards
Colin
How does UCS work with Hyper-V? It seems that all I can find is with VMWare. I am specifically looking for information with integrating NetApp storage and Blade servers virtualized with Hyper-V.
Hi Mike
I’m similar in that 99% of my Cisco UCS Work is vSphere based, but I have done some Hyper-V deployments.
I’ve certainly not had any major issues with it, and there are a fair number of docs out there around Hyper-V on UCS.
Try this one for an overview and access to additional white papers.
http://www.cisco.com/en/US/solutions/collateral/ns340/ns517/ns224/ns1150/ns1154/white_paper_c22-713212.html
Regards
Colin
Hi,
We are doing a POC / lab of UCS with two 6248 FIs and 3 blades. We are having issues connecting the uplinks to a 3750x with a port-channel group of two 1 Gig ports on both FIs. The port channel is configured as active on the 3750x, but still no luck seeing the FIs on the network. VLANs are the same on both sides.
Thanks
Zubair Rahman
Hi Zubair
I would check you have set your uplink ports on the FIs to 1Gb.
Double-click a port > Show Interface > click the radio button for 1Gb
http://www.cisco.com/en/US/docs/unified_computing/ucs/sw/gui/config/guide/2.0/b_UCSM_GUI_Configuration_Guide_2_0_chapter_0101.html#task_544C54884DAD4B4982CC499A9EF41AA0
Regards
Colin
What are WWxN Pools?
Hi Saied
Just my Lazy way of representing WWPN and WWNN Pools at the same time
Colin
Does single wire management for c series rack mount require twinax or can I use fiber ?
Thanks
Hi Pete
Yes, you can use the standard fibre SFP+s.
Regards
Colin
I have tried with no success. Firmware is at 2.1(1d) on the 2232PP FEX and the endpoints of the C240 M3.
Initially I had it with the 1Gb connected to the LOM ports. I decommissioned the server and removed the LOM connectivity. I powered on the server, and the server does not discover unless I put the LOM port connectivity back in play.
Dear Colin,
I have a UCS 5108 with 4 blade servers. I have installed ESXi 5.0 on each blade. One of my blades does not ping the gateway if I choose NIC 0; I have to reconfigure the network and choose the second NIC before it starts pinging the gateway. And when I try to add it to my VMware cluster, it keeps saying the host can’t see the HA isolation IP address, e.g. 192.168.222.1.
Hi Sam
It sounds to me like your NICs are ordered differently on each of your hosts.
I would suggest you check the order of the NICs in your service profile and reorder as required, i.e. your HBAs would generally be 1 & 2, so set your NICs to use 3, 4, 5. Just ensure your NICs have the same numbers across hosts, i.e. vMotion = 3, MGT = 4 and so on, so that on your hosts you know the NICs are in the same consistent order. Then double check you are tagging the right VLANs on the correct NIC.
You could also bind these service profiles to an updating template to ensure consistency.
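If you have more than a handful of hosts, the comparison can be done with a few lines of Python once you have written down the vNIC order from each service profile. This is just a sketch: the host names, the vNIC names and the order numbers are hypothetical, and the data is typed in by hand rather than pulled from UCS Manager.

```python
from typing import Dict, List


def hosts_out_of_order(nic_order: Dict[str, Dict[str, int]],
                       reference: str) -> List[str]:
    """Return hosts whose vNIC order differs from the reference host."""
    expected = nic_order[reference]
    return sorted(host for host, order in nic_order.items() if order != expected)


# Hypothetical vNIC order per service profile (HBAs assumed to be 1 & 2).
nic_order = {
    "esxi-01": {"vMotion": 3, "MGT": 4, "VM-Data": 5},
    "esxi-02": {"vMotion": 3, "MGT": 4, "VM-Data": 5},
    "esxi-03": {"MGT": 3, "vMotion": 4, "VM-Data": 5},   # swapped
}
mismatched = hosts_out_of_order(nic_order, reference="esxi-01")
print(mismatched)
```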
Regards
Colin
Hello Colin,
Thanks for the prompt reply. Would you help me check the NIC order in the service profile, and could you please tell me how to change it?
Hello Colin,
I have a production UCS environment with a single chassis that was originally provisioned without FEX port-channels, but the customer would now like to change this. Changing the global policy requires a re-ack of the chassis to pick up the change, but causes the whole chassis to lose network and SAN connectivity for a period of time.
Is the downtime period a serious concern to VMware’s access to its datastores, or will the hosts successfully buffer datastore access for a short period?
If the full disconnect from datastores is alarming, is there a method to modify the uplinks on a per-FI basis so that SAN connectivity can be maintained as the fabrics are updated one at a time?
There will be an off-hours maintenance window for this task, but the customer has indicated that they are not willing to do a full powerdown of the environment.
Hi Dave
I have done the same thing on many live systems. A chassis re-ack certainly does drop and re-establish all of the virtual network links to that chassis and will cause approx 30 seconds of disruption.
In my experience I have never had a server crash through re-acking a chassis (most environments I build are boot from SAN). I have had a cluster shut down all of its VMs as HA was set up incorrectly (both hosts in the same chassis, referencing only the other host and set to shut down all VMs if isolated).
Rather than re-ack the chassis, I have had success in shutting down the uplinks from an FI. As there are then no uplinks for the vNICs and vHBAs to pin to, the system shuts down all VIFs on that fabric; once you bring the uplinks back up again, the system picks up that the FEX to FI links are now a port channel. You might want to try this before the re-ack to see if that works for you. (Obviously check you have fabric fault tolerance correctly configured for your Service Profiles first.)
Regards
Colin
What would cause a FC-Uplink port to fail with “initializing”? The MDS side sees the port and is showing up (green). The FC port on the UCS side is set as FC and is licensed.
Hi Michael
Could be a few things (besides dodgy hardware / cables): a VSAN mismatch, or another mismatch with the settings at one of the ends. I most commonly see this when users are trying to trunk multiple VSANs down the link, or port-channel multiple links, or both.
My advice would be to take all of the ports out of the channel, if using PCs at both ends, and just check the link comes up with a straight F to N port setup; then you can add the links into the channel one by one.
Funnily enough I had this issue last week on a 12 port fc port-channel with VSAN tagging. All links were up; we then shut down the fc port-channel at the UCS end (Fabric A) to test the multipathing. This was all fine, but when I re-enabled the fc port-channel at the UCS end, all fc ports just stuck on initialising.
I shut the ports down several more times and confirmed all settings, even just tried a single port, but that single port also stayed in initialising. Despite numerous Shuts/No Shuts at the MDS and the FI end (and a reboot of the FI) what finally cleared the fault was a reset of the line card at the MDS end.
This was traced to be a bug with the MDS code we were running.
Anyway hope that helps.
Regards
Colin
We have our UCS environment in a pre-production state and are doing some FC failure simulations for testing. We have (2) 6296 Cisco interconnect switches on the “A” and “B” side of the fabric. We have 8 NPIV connections from each interconnect to Brocade 6510s. We have 15-16 vWWNs (representing ESX servers) on each of the 8 ports on both sides of the fabric as it stands. Our test was to disable 1 of the 8 ports on the “A” side of the 6296 to see if the vWWNs successfully migrate to the 7 surviving ports on the Brocade 6510. That test was successful. All the vWWNs moved across the surviving ports. We then re-enabled that port and saw it come back on-line on the Brocade 6510. We kind of expected vWWNs to make their way back to the port that has now been re-enabled but that did not happen. The port came back up successfully but no WWNs moved back onto that port. We decided to disable another port on the “A” side to see if vWWNs would go back to that first disabled port. When that second port was disabled, all of those vWWNs moved to the original port we disabled. Excellent!
My question is, what (other than another link failure) will migrate vWWNs back to that failed port? After this test, we now have 7 active connections on the “A” side of the fabric and 8 on the “B”. I’m assuming that will hurt us bandwidth wise on the A side with 7 connections rather than 8.
Hi Keith
That is the correct behaviour for how UCS manages fc link failures. It can be made a bit more graceful using fc port-channels, but as you probably know fc port channels are not supported between the FI and a Brocade switch.
The reason, I guess, that there is no auto failback is that you want to be as non-disruptive to fc traffic as possible. If a link has failed and the FI has had to do an FDISC (fabric discovery; think FLOGI on an NPIV-enabled port), then it’s best to just leave it alone rather than fail it back.
As soon as you have an event that requires a vHBA to be pinned to an fc uplink (a link failure, as you found, a service profile reboot, or a new server brought online), your “unused” link, once brought back up, will quickly be used again. Don’t worry; best just let the system handle this for you.
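The behaviour Keith saw can be reproduced with a small Python simulation. Treat it as a sketch under assumptions: it pins each vHBA to the least-loaded live uplink, which is my assumption and not documented UCS behaviour, and the uplink names and the 16 vHBAs per uplink are invented to match the test roughly.

```python
from collections import Counter
from typing import Dict, Iterable, Set


class FabricPinning:
    """Toy model of vHBA-to-uplink pinning with no automatic failback."""

    def __init__(self, uplinks: Iterable[str]) -> None:
        self.up: Set[str] = set(uplinks)
        self.pinned: Dict[str, str] = {}          # vHBA -> uplink

    def _least_loaded(self) -> str:
        load = Counter({u: 0 for u in self.up})
        load.update(u for u in self.pinned.values() if u in self.up)
        return min(sorted(self.up), key=lambda u: load[u])

    def pin(self, vhba: str) -> None:
        """Pin (or re-pin) a vHBA; assumed policy is least-loaded uplink."""
        self.pinned[vhba] = self._least_loaded()

    def link_down(self, uplink: str) -> None:
        """Only the vHBAs on the failed uplink are moved."""
        self.up.discard(uplink)
        for vhba in [v for v, u in self.pinned.items() if u == uplink]:
            self.pin(vhba)

    def link_up(self, uplink: str) -> None:
        """Uplink is available again, but nothing is failed back."""
        self.up.add(uplink)

    def load(self) -> Counter:
        return Counter(self.pinned.values())


fabric_a = FabricPinning([f"fc{i}" for i in range(1, 9)])
for n in range(128):                      # 16 vHBAs per uplink to start
    fabric_a.pin(f"vhba{n}")

fabric_a.link_down("fc1")                 # first test: survivors absorb fc1
fabric_a.link_up("fc1")                   # re-enabled, but stays empty
load_after_reenable = fabric_a.load()

fabric_a.link_down("fc2")                 # second failure: fc1 is used again
load_after_second_failure = fabric_a.load()
print(load_after_reenable["fc1"], load_after_second_failure["fc1"])
```

In this model the re-enabled uplink sits empty until the next pinning event, at which point it picks up everything that needs a new home, which matches what the test showed.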
Regards
Colin
Gents,
Great content!! Quick question on the C series and managing through UCSM. I’m pretty sure I know the answer but want to double check. Is there a way to connect the C series directly to its Fabric Interconnects without going through a 2232 FEX? The question applies to both single and dual wire management.
Thanks!
John
Hi John
No. Back in the UCSM 1.4 days you connected the 10Gb ports of the C Series directly to the FI and the on-board 1Gb ports to a 2248 FEX; however, in UCSM 2.0 this was dropped, and from then on only connection (single wire or otherwise) via a 2232 FEX is supported.
Regards
Colin
What up Guru!?
Awesome site – thanks for the all the time you clearly put into it!
Can you please affirm my thinking? I have a test lab with Controller A of a NetApp system directly attached to the interconnects of a UCS – so the interconnects are configured in FC Switch Mode. The interconnects will have FC uplinks to two Nexus 55xx’s. For testing native FC on the Nexus switches, I’d like to configure NPIV using a vSphere VM. All blades in my chassis are ESXi hosts. I’ve already moved Controller B of our NetApp system to the Nexus’. I’ve included a link to a ridiculously awesome mspaint picture of where I am currently.
http://www.evernote.com/shard/s290/sh/45cb1fc8-dddc-4325-8c40-3bfe7f54d459/a3ae983ec889d15282f15974d6340137
So my Nexus switches will be configured in NPIV mode. I understand that the interconnects will need to be configured in End-Host Mode in order for a VM on UCS to use NPIV. Is it possible to keep Controller A directly connected to the interconnects and switch to FC End-Host Mode? I know the interconnects will need to bounce, but will everything still function? Just remember, I’m holding you to this
All the best,
Mike
http://VirtuallyMikeBrown.com
https://twitter.com/VirtuallyMikeB
http://LinkedIn.com/in/michaelbbrown
Hi Mike
Thanks for the detailed question and masterful artwork
In short, no. As I’m sure you are aware, in order to use direct attach fc storage on the FI, you must be in fc switch mode.
You are right that the best way forward would be to put your FIs in fc N port virtualization (NPV) mode (think End Host Mode for fc).
Basically this turns your FIs into a host, which will then send FDISCs rather than FLOGIs to your Nexus switch F ports.
The Nexus switches will then have multiple WWPNs/FCIDs associated with a single F port and be receiving FDISCs rather than FLOGIs; this is why you need to enable feature NPIV on your Nexus switches.
If you leave Controller A directly connected to your FIs after your reboot to switch back to NPV mode, Controller A will no longer have connectivity, as fc storage ports are only valid in fc switch mode.
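For anyone who finds the FLOGI / FDISC distinction easier to follow in code, here is a tiny Python model of an F port. It is a deliberate simplification and not how NX-OS is implemented: the uplink itself logs in once, each server vHBA behind it then arrives as an FDISC, and the WWPN strings are placeholders.

```python
from typing import List


class FPort:
    """Toy model of a switch F port with or without NPIV enabled."""

    def __init__(self, npiv_enabled: bool) -> None:
        self.npiv_enabled = npiv_enabled
        self.logged_in: List[str] = []    # WWPNs currently logged in

    def flogi(self, wwpn: str) -> bool:
        """First login on the link; only one is accepted per port."""
        if self.logged_in:
            return False
        self.logged_in.append(wwpn)
        return True

    def fdisc(self, wwpn: str) -> bool:
        """Additional login on an already logged-in link; needs NPIV."""
        if not self.logged_in or not self.npiv_enabled:
            return False
        self.logged_in.append(wwpn)
        return True


# Placeholder names standing in for real WWPNs.
with_npiv = FPort(npiv_enabled=True)
with_npiv.flogi("fi-a-uplink")
npiv_results = [with_npiv.fdisc(w) for w in ("blade1-vhba", "blade2-vhba")]

without_npiv = FPort(npiv_enabled=False)
without_npiv.flogi("fi-a-uplink")
no_npiv_result = without_npiv.fdisc("blade1-vhba")
print(npiv_results, no_npiv_result)
```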
PS: Had a look at your blog; I like your down-to-earth writing style.
Regards
Colin
Hi Colin,
Thanks for the prompt reply. That’s excellent news – thanks for the affirmation. And thanks for checking me out
All the best,
Mike