UCS Manager 2.2 (El Capitan) Released

Last week saw the latest major update to UCS Manager, in the form of version 2.2, codenamed “El Capitan”.

It certainly doesn’t seem a year since I wrote the summary for the then eagerly awaited 2.1 “Del Mar” release, but I guess time really does fly when you’re having fun!

UCSM 2.2 will be the last major version to include support for Generation 1 hardware: 6100 FIs, 2104 IOMs, M1 servers and M1-only adapters. As such it is expected to be a long-lived release, so expect patches and major bug fixes for approximately 12 months longer than normal major releases (circa 4 years).

Remember that Cisco offer the “UCS Advantage Trade-In Program”, which provides an easy path to upgrade Generation 1 hardware to the latest versions.

UCSM 2.2 Features Overview

UCSM 2.2 Features

Fabric Enhancements

  • Fabric Scaling:
    As you may expect, UCSM 2.2 supports more of most things: VLANs, VIFs, IGMP Groups and Adapter Endpoints (physical network adapters across all servers in the UCS domain). This is possible because UCSM 2.2 syncs to an updated underlying NX-OS code base. Up until now I have never done a design constrained by any of the above, but more is always better, right? :-) The table below shows the config maximums for UCSM 2.2 and previous releases.

Fabric Maximums

  • IPv6 Management Support:
    All 3 IP addresses (2 physical and 1 cluster) can now have IPv6 addresses, as can the new CIMC “in-band” addresses. Services such as NTP and DNS are also reachable via IPv6.
  • Uni-Directional Link Detection (UDLD) Support:
    Rapidly detects and optionally disables/resets broken bidirectional links. We’ve had this for a long time in Nexus and now it’s an option on the Fabric Interconnects. It can be enabled via either a global or a per-port policy.
  • User Space NIC (usNIC) for Low Latency:
    Designed for High Performance Compute (HPC) applications that require a low latency fabric and host adapters. usNIC allows latency-sensitive MPI (Message Passing Interface) applications running on bare-metal host OSes to bypass the kernel (supported on 6200 FIs with “Sereno” based adapters only: VIC1240, VIC1280, VIC1225).
  • Virtual Machine Queue (VMQ) Support:
    Enables support for MS Windows VMQs on the Cisco UCS VIC adapter and improves VM I/O performance in cases where VM-FEX cannot be used for I/O acceleration.

Operational Enhancements

  • Direct Connect C-Series To FI without FEX:
    Probably one of the biggest enhancements for me, this one, and one Cisco have been gradually working towards. With UCSM 2.2 it is now possible to directly connect a C-Series rack mount to the Fabric Interconnect by a single cable, without the need for a 2232PP FEX. You still have the option of using an external FEX, which would still be the way to go for a solution with a larger number of integrated C-Series, as there will come a point where several 1:1 FI/port licences to C-Series will be less cost effective than just buying the 2232PP FEX. But for an environment with just 1 or 2, the “No FEX” option is a clear winner.
C-Series no FEX Option

  • Two-Factor Authentication for UCS Manager Logins:
    This is one to make the Security Admin happy. Support for strengthened UCSM authentication (requiring a second factor of authentication after the username + password) such as RSA SecurID or Symantec VIP Enterprise Gateway.
  • VM-FEX for Hyper-V Mgmt with Microsoft SCVMM:
    VM-FEX support on Hyper-V hosts was added in UCSM 2.1, but it lacked centralized VM network management (SCVMM integration). A Cisco provider plugin gets installed into SCVMM, fetches all network definitions from UCSM and periodically polls for configuration updates.
VM-FEX Hyper-V SCVMM

  • CIMC In-band Management:

If you have ever been a bit frustrated that loading a huge bare metal ISO to a CIMC took a while, as you had to go via the 1Gbps FI management port, then this should make you happier. With UCSM 2.2 it is now possible to optionally access the CIMC of M3 blades over the same in-band network as the data path, giving access to all those lovely 10Gb uplinks. You may also have a requirement to separate UCSM management traffic from CIMC management traffic; well, now you can. CIMC out-of-band access is the same as it was, you just have the option of connecting to either the in-band or out-of-band CIMC address. CIMC in-band access supports the KVM console, vMedia and Serial over LAN (SoL).

In-band CIMC

  • Server Firmware Auto Sync:
    Server firmware can now be automatically synchronized and updated to the version configured in the new ‘Default Host Firmware Package’ without the need for an associated Service Profile.

Compute Enhancements

  • Secure Boot:
    Establish a chain of trust on the secure boot enabled platform to protect it from executing unauthorized BIOS images.
    UEFI Secure Boot utilizes the UEFI BIOS to authenticate UEFI images before executing them.
    UCSM GUI will expose:
    * Boot Mode radio button (Legacy/UEFI)
    * Boot Security check box (visible only when UEFI is selected)

    Secure Boot

  • Enhanced Local Storage Management:
    Thanks to a new out-of-band communication channel developed between the CIMC and the RAID controller, there are now:
    * Enhanced monitoring capabilities for local storage
    * Real-time monitoring of local storage without the need for host-based utilities
  • Precision Boot Order Control:
    Enables the creation of boot policies with multiple local boot devices.
    Provides precision control over the actual boot order.
Precision Boot

  • FlexFlash (Local SD Card) Support:
    Customers can now manage the FlexFlash Controller configuration from UCSM.
  • Flash Adapters and HDD Firmware Management:
    UCSM Firmware bundles now contain Flash Adapter firmware and local disk firmware.

  • Trusted Platform Module (TPM) Inventory:
    Allows access to the inventory and state of the TPM module from UCSM (without having to access the BIOS via the KVM).

TPM
  • DIMM Blacklisting and Correctable Error Reporting:
    Improved accuracy in identifying “Degraded” DIMMs. DIMM blacklisting, if enabled, will forcefully map out a DIMM that hits an uncorrectable error during host CPU execution.

Well that’s about it, hope there is something in this update for you, there sure is for me :-)


Under the Cisco UCS Kimono

If you have ever wanted a sneaky peek under the UCS Kimono (GUI) then this post’s for you.

The goal of this post is to clarify the end-to-end path from a Cisco UCS vNIC through the UCS Infrastructure to the point we egress from the Cisco UCS Fabric Interconnects.

Having this information, and being able to check utilization and statistics of all virtual and physical interfaces within the Cisco UCS environment, will save you a lot of time and give you a much better understanding of how all the elements tie together.

This post builds on a previous post, “Understanding UCS VIF Paths”, where we used a combination of the GUI and CLI to establish the end-to-end traffic path used by a vNIC/vHBA. In this post we exclusively use the CLI, so if you haven’t done so already it is perhaps worth checking the previous post out first.

http://ucsguru.com/2012/05/18/understanding-ucs-vif-paths/

Anyway, I was troubleshooting an intermittent performance issue the other day, from a Cisco UCS blade all the way back to the storage array, and thought it would make a useful post to document this part of the process.

And certainly if you ever get as far as needing to open a Service Request (SR) with Cisco, being able to provide the below information will save you and TAC a lot of time.

During this process I will be attaching directly to the ASICs within the IO Modules and these ASICs differ depending on whether you are using Generation 1 or Generation 2 Hardware.

As a nice “Cheat Sheet” I have provided the below table and graphic to show the relevant Cisco UCS ASICs and code names, some of which we will need for this process.

UCS ASICs

In this example we will confirm the end-to-end path of a vNIC named vNIC_FabB1 of Service Profile DCN4PBKW001, which is in Chassis 3 Slot 2.

First, determine the Virtual Interface (VIF) and which fabric the vNIC/vHBA is currently using (if Fabric Failover is enabled):

Command: “show service-profile circuit server <Chassis #>/<Slot #>”

Active Fabric and Assigned VIF

As we can see vNIC_FabB1 is Active and Primary on Fabric B and Passive and Standby on Fabric A. Therefore we can determine that the Active Fabric for this vNIC is Fabric B.

We can also see that the VIF associated with this vNIC is VIF 2024.

Side Note) This vNIC connects via a Virtual Network Link (VN-Link) to a vEth port of the same number, Veth2024, on Fabric Interconnect B, which can be viewed and statistics collected via the connect nxos command at the UCSM CLI.
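
For example, something like the below will attach you to NX-OS on Fabric Interconnect B and show the stats for that Veth (illustrative, worth double-checking on your own version):

connect nxos b
show interface vethernet 2024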

The next thing to determine is which internal IOM (FEX) port VIF 2024 is using.

From the UCSM CLI

connect iom <Chassis #>

show platform software woodside sts (use “redwood” for the IOM 2104)

FEX Diagram

I love the above command because it shows a representation of exactly how the FEX is being used. You can see all the internal blade-facing “Satellite” ports or Host Interfaces (HIFs) and all the external FEX network ports or Network Interfaces (NIFs).

As can be seen, Blade 2 has access to internal FEX ports 3 & 4 but has only one active connection to the FEX, on FEX port 3, which maps to HIF 27 (outlined in red).

NB) The reason FEX port 4 is disabled is that the ports of a 220x FEX alternate between the mLOMs and the Mezzanine slots of the blades, the Mezzanine slots in the above example being empty (hence all alternate (even) ports display as “–” for Disabled).

Now we know which Host Interface (HIF) we are using, we next need to determine which FEX Network Interface (NIF) is being used.

If you are using a Port-Channel between the FEX and the FI, all servers will be mapped to the port-channel and distributed over the members by the port-channel hashing algorithm.

In this case the FEX links have been left at the default setting, which is “Discrete Pinning” mode, and as such the relationship between server slot and FEX Network Interface is as follows:

HIF to NIF Mapping (4 FEX Links)

So as can be seen above FEX Port 3 maps to Network Interface 2.

The HIF to NIF mapping differs depending on the IOM used and how many FEX cables are actually connected. The above shows all four links of a 2204XP connected; the below example shows how the HIF to NIF mapping occurs if 2 FEX cables are used:

HIF to NIF Mapping (2 FEX Ports used)

So Blade 2 (FEX ports 3 & 4) maps to Network Interface 2 (NIF 2).
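
If you prefer a formula to the tables, the discrete pinning behaviour boils down to a simple modulo rule. The below Python snippet is just my own illustration of that rule, assuming the standard discrete pinning described above (valid link counts being 1, 2, 4 or 8):

def slot_to_nif(slot, active_links):
    """Return the FEX Network Interface (NIF) a blade slot is pinned to."""
    # Discrete pinning: slot S pins to NIF ((S - 1) mod N) + 1 with N links.
    return ((slot - 1) % active_links) + 1

# Both examples in this post check out: Blade 2 -> NIF 2
print(slot_to_nif(2, 4))  # 4 FEX links: prints 2
print(slot_to_nif(2, 2))  # 2 FEX links: prints 2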

OK, so next we establish which Server Interface (SIF) on the Fabric Interconnect we are using, which we do with the below command:

show fex <Chassis #> detail

FEX Port to Fabric Interconnect Server Port Mapping

So as you can see, FEX port Eth3/1/3 is using FI fabric port Eth1/10.

The last thing we need to know is which FI uplink port we are pinned to:

show pinning server-interfaces | inc Veth2024

vEth to Uplink Port Pinning

NB) show pinning border-interfaces active can also be used to see the information from another perspective.

As you can see Veth 2024 is pinned to FI Uplink Port-Channel 11

So, armed with all the above information, you can draw out all the ports in the Cisco UCS traffic path. This in itself will save a lot of time if you need to engage TAC.
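
Pulling the example together, the complete path we established for vNIC_FabB1 is:

  • vNIC_FabB1 (Service Profile DCN4PBKW001, Chassis 3 Slot 2) – active on Fabric B
  • IOM internal FEX port 3 / HIF 27
  • FEX Network Interface 2 (Eth3/1/3)
  • Fabric Interconnect B server port Eth1/10
  • Veth2024 on Fabric Interconnect B
  • FI uplink Port-Channel 11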

End-to-End Traffic Path within the UCS Infrastructure

Have Fun!

NB) For further information on advanced Cisco UCS troubleshooting at the CLI, I would strongly suggest checking out the recorded session by Robert Burns (first CCIE DC, TAC and Cisco Community legend), available for free at https://www.ciscolive.com – just set up an account.

BRKCOM-3002 – UCS Performance Troubleshooting

https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=8196&tclass=popup


Cisco UCS Processor Journey

While not perhaps the most interesting topic for some, this is a post I have been meaning to do for some time, and the recent Intel E5-2600 v2 CPU additions to the Cisco UCS lineup have kicked my butt into writing it.

Like most blogs, this site started off purely as an online repository for my own reference, and if the information helped someone else, then hey, happy days.

One of the most enjoyable aspects of my job is training internal staff and external customers, and as such not only am I required to have good practical skills but also good classroom theory.

In every Cisco UCS course I deliver, I always give a session on Intel processor architecture: how the Intel CPUs have evolved and how that evolution maps onto the Cisco UCS product line.

In the “old days” this was easy; an M1 blade = Intel Xeon 5500 (Nehalem) and an M2 blade = Intel Xeon 5600 (Westmere). Then came the Nehalem EX (6500/7500), the Westmere EX (E7-2800 and E7-4800), the Sandy Bridge E5s and now the Ivy Bridge E5 v2s. And with all these numbers and codenames flying around it is no surprise that people can get a bit confused.

This prompted me to knock up a nice little “Crib Sheet” of what processors are used in what models, along with their codename and official launch name designators.

For information:

Intel’s processor evolution happens in two steps: a “Tock”, which is a microarchitecture change, and then a “Tick”, which is the same microarchitecture only made smaller. For example, the Cisco UCS journey began using the Nehalem microarchitecture “Tock” on a 45nm High-K process; then came the Westmere “Tick”, where the process was shrunk to give us the same Nehalem microarchitecture but this time on a 32nm High-K process. This reduction in process size is usually coupled with an increase in core count, due to the fact that as the technology is made smaller, Intel can fit more cores onto the die.
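
So, for the CPUs in the Cisco UCS journey so far:

• Nehalem – “Tock” – new microarchitecture on 45nm
• Westmere – “Tick” – Nehalem shrunk to 32nm
• Sandy Bridge – “Tock” – new microarchitecture on 32nm
• Ivy Bridge – “Tick” – Sandy Bridge shrunk to 22nm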

Intel also have certain “Segments” or types of CPUs, which are EN, EP and EX:

EN = Entry Level (Used in B22M3)

EP = Efficient Performance (2 Socket)

EX = Expanded (up to 4 Socket with Expanded memory architecture)

So all the above leads nicely into the below crib sheet, which details the microarchitecture, process size, the Cisco UCS server it is used in, and the maximum core count/memory it can support.

Enjoy :-)

Cisco UCS CPU’s

Joined at the Chip


Response to the Video “HP OneView + HP BladeSystem: Faster, Simpler, Smarter than Cisco UCS Manager”

Now I don’t usually involve myself in vendor hype and “FUD spreading”, including that from Cisco. I understand it’s the world we live in, and my role and value as an independent consultant is to cut through all that and be a trusted advisor to my client.

So what’s changed? Well, nothing really, and I don’t see this becoming a habit, but I do think I need to call HP out on their latest competitive video of HP OneView vs Cisco UCS Manager on YouTube.

See below link

HP OneView Vs Cisco UCS Manager

Now I’m not going to list every inaccuracy or inefficiency by time stamp (although I am tempted), but it is obvious that HP are not showing UCS in its best light, or, put more bluntly, are not using the product correctly.

For instance if I were to make the below statement:

“Look how long it takes to cut the lawn!”

And then proceed to use scissors to cut each blade of grass, I’m sure you would all immediately spot my flawed logic. But use the right tool, and see how long it takes with an Ultra Power Mower!

And this was my main issue with the video: they failed to use the right tool for the right job. Chris Bradley was doing everything manually and commenting on how long it would all take, whereas if he had used the correct tools (vNIC and Updating Templates) he would have been done in no time.

And re: moving a Service Profile to another blade, you certainly do not have to validate that the hardware matches! That’s how I do a lot of stateless upgrades (and downgrades for that matter); moving Service Profiles between different blade types/specs is a great way to flex up or flex down a workload or host as your needs change.

The video specifically called out the complexity involved in “pattern matching” compatible servers for a Service Profile move; again, using the right tool (Server Pools) this doesn’t even require thinking about.

Now I haven’t played with HP OneView yet, and on the face of it, it looks interesting and certainly a big step in the right direction, but trying to score “cheap shots”, and inaccurate ones at that, doesn’t seem to me the best marketing strategy.

I for one will be comparing the two products, and will engage one of our internal HP experts to ensure that both products are shown and demoed in their best light.

Come on HP, you’re better than that; this video is your biggest “Own Goal” since Tolly.

Colin


ScienceLogic Enterprise Manager 7 (EM7) Review

ScienceLogic

For more information, and to obtain your own EM7 eval, visit the ScienceLogic website.

OK, so after what seems like an age, I have finally managed to get round to blogging my initial thoughts on the first monitoring solution I have evaluated.

Disclosure: This review has not been sponsored in any way, and is just my opinion

Firstly, big thanks to Mike Riley and Ray Wood of ScienceLogic for coming in and running me through the setup and initial config of EM7. While not particularly difficult, it does make my life a whole lot easier, and gives me an opportunity to ask all the questions I might have. Plus, in my view this shows a good indication of customer service, and makes a change from the “just download an eval and get back to us with any questions” type attitude.

Lab Setup:
The Lab setup I will use for all these evaluations is shown below.
• Cisco UCS Manager Version 2.1(1a)
• Cisco UCS B Series 2 Chassis (4xB200M2, 4xB250M2)
• Cisco UCS C 200 Running the ScienceLogic and Cisco UCS Central Appliances
• EM7 Version Evaluated: 7.2.2.6

Lab Setup

Ease of install:
ScienceLogic EM7 supports a distributed model or a standalone “All-in-One” model for smaller environments. I chose the All-in-One option, where all components reside on the same virtual appliance.
I created a virtual machine with the recommended specs (8GB RAM), mounted the supplied ScienceLogic ISO, then went through the very easy wizard-driven initial install to define settings like hostname, admin username and password, and IP address.

Ease of Licensing
Again, very easy: just browse to the IP address of the virtual appliance on port 7700, download your reg file, e-mail the reg file off to ScienceLogic and they mail back the licence file, which you then just upload in the same screen.
EM7 is licensed on a “per monitored entity” basis, so as an example the system will detect and monitor a Cisco UCS blade and a VMware ESXi host. Now these 2 entities may both represent the same logical workload, but they will be separately licensed and monitored for statistics pertinent to that particular entity, i.e. the blade will be monitored for X and the ESXi host monitored for Y. So while this may be seen as a licensing “inefficiency”, I can see the logic and value in having these entities split out.

Now one of my questions was: “if I had an issue/fault with a blade, would that issue show up as a potential impact to my application?” i.e. is the system clever enough to know that my Exchange server is running on Blade X, which has just started reporting memory problems?
The answer was “Yes”, but at present this would be a manual merge exercise, i.e. I would need to make a manual association between the blade and the Exchange entity.

But just to clarify, EM7 does a thorough job of monitoring the application and running test transactions against it to ensure the application is running within the defined parameters.

Cisco “UCSishness”
ScienceLogic has a Cisco UCS “Power-Pack”, which is a pre-configured template that knows how to discover and monitor Cisco UCS. Once the Cisco UCS system has been discovered it will appear as a monitored device, as shown below:

UCS View

I really liked the layout of the Cisco UCS topology as it clearly shows all the components and their current state via colour codes. Each entity can be drilled into for ever-increasing granular information. There is also a “Cisco UCS Central” Power-Pack which gives visibility and statistics from a UCS Central instance. Once the Cisco UCS system is discovered, statistics and parameters of each entity are collected.

Each entity has an associated “Asset Record” which contains info like:

• Model, serial number
• Maintenance details
• Owner details
• IP, upstream switch
• And loads of other info, as well as a free text section for other vital or relevant info

I noticed that most of the above fields needed to be manually populated, which I can fully appreciate for variable details like “Owner”, but I was surprised that details like serial number were not auto-populated. I have been informed, however, that assuming these details are available from the API, ScienceLogic will be adding additional asset information in a new release of the UCS Power-Pack.

Once all the Asset Record information has been populated, EM7 can be configured to populate standard asset management solutions like a Configuration Management Database (CMDB).

Likewise, EM7 can integrate with a ticketing system like Remedy.

The figure below shows the ScienceLogic VMware view, from the vCenter server down through the hosts to the individual VMs.

vCenter View

VM View

EM7 is a big product; I’m sure in the time I had I only scratched the surface of it.

EM7 also has preconfigured templates (“Power-Packs”) for:

• Cisco Nexus
• NetApp
• EMC

as well as being FlexPod and Vblock ratified.

Rough costs, based on 1,000 managed components (large enterprise):

ScienceLogic is licensed per managed “device” at a cost of $12 (£8) per device, per month. So an environment of 1,000 managed entities would cost around $144,000 (£96,000) per year (1,000 × $12 × 12 months). Volume discounts are available and have not been applied to this price.

Scores:
A bit difficult to grade some of these as this was the first product I reviewed, so I may tweak scores up or down as I review more products and can make comparisons.

Scores


My initial thoughts on SDN

Hi All

As you all know, I have been a Cisco UCS specialist for the past 3 years, but I have recently also been made the Subject Matter Expert (SME) for Software Defined Networking (SDN). Now don’t worry, I am still SME for Cisco UCS, so I’ll carry on blogging about that, but as this site says “Cisco UCS And Complimentary Technologies” I thought I would dump down my initial thoughts on SDN.

Just to clarify: in the 24 years I have been in IT I have been a server specialist, a storage specialist, a virtualization specialist and a network specialist, so I have pretty much covered all of the bases within the datacenter. All this experience gave me a great background for Cisco UCS, and equally now for working on what SDN and Network Virtualization can bring to the Enterprise Datacenter.

Unlike Cisco UCS, SDN is a topic I am certainly no expert in (yet), but I have a huge passion for it and find it really interesting. As such, at present this is just my take on it, and how it may benefit the majority of my customer base (the Enterprise Datacenter).

SDN, What you need to know about it (At the moment)

OK, so I’m sure you have all heard of Software Defined Networking (SDN) by now, and if you haven’t, you need to be aware of it. We all should at least have an opinion on it.

I have been following the evolution of SDN for about 18 months now, and I’ve always felt it will have a major impact on how we design, build and manage networks. But I (like most) thought that the realities of SDN were probably still a good 5 years away; recent events and acquisitions have dramatically altered my view, and SDN (or variations of it) is already changing our industry.

In short, if you believe the hype, “The Iron Age” may soon be over.

What I hope to do with this “primer” is cut through the ever-growing hype and misinformation around SDN and answer the simple question that few seem to be asking or answering: what will SDN actually do for the Enterprise Datacenter?

So What is SDN?

Simply put, SDN is the separation of the Data Plane (packet forwarding) and the Control Plane (intelligence) of the network, with dynamic programmability provided by a central controller. Basically, an intelligent, dynamically programmable network.

What Problems is SDN Trying to Solve?

Moving packets from one point to another quickly and efficiently does not need addressing; the networks as we know them today do this really well.

Moving them intelligently and adapting to dynamic changes in the network, on the other hand, can be a complexity nightmare or at least a big challenge, e.g. splitting flows by sending voice or trading events down the lowest latency path and data down another path, or secure tenant separation in a dynamic multi-tenant environment; these are just some of the current challenges SDN could help with.

But the current main pain points around networking are the flexibility, agility and management of the network. In essence the network is now perceived as “in the way”, as it has not evolved to provide for the dynamic requirements of today’s virtualized workloads.

VLANs, VRFs, NAT, ACLs and QoS are at present quite manual tasks, which need to be configured across multiple devices, usually by CLI.

So at present, if a user wants an application/server stood up, through virtualization we can do this within minutes; however the connectivity, QoS, security, load balancing etc. that the workload needs then becomes the bottleneck, as these are presently quite complex manual tasks which can take weeks to implement, sometimes requiring several specialists. And if that workload wants or needs to move to another location or datacenter, oh man, that’s another big headache.

Sure, we can use expensive proprietary solutions to address some of these issues, but if we could do this simply, cheaply, dynamically and safely using a software overlay, well now that’s the promise of SDN and Network Virtualisation.

I certainly get what SDN brings to the party in areas I don’t really get too involved in, i.e. the Service Provider and hyper-scale datacenter arenas. Many of these companies are already using SDN or a derivative of it, and several created their own versions or helped define the current SDN standards when they found that they had outgrown the capabilities of many current technologies. But there are also compelling use cases for my particular sweet spot, the Enterprise Datacenter.

Particularly around Datacenter Interconnection (DCI) and Enterprise Network Virtualization. Now, Network Virtualization by strict definition is not SDN, as there is no central controller involved, but it is where the revolution of our industry will start.

Having been heavily involved in all aspects of the Datacenter, I can certainly see the end to end picture and why Network Virtualization has so much potential.

VMware, as I’m sure you all know, developed ESX, which has revolutionized how quickly servers can be provisioned, deployed and dynamically moved within the environment.

During this time the Network has remained almost static with regards to its ability to adapt to this huge change and flexibility in the compute layer.

Just as with ESX, where vCPUs, vDisks, vRAM and vNICs can be combined to present a logical x86 environment for a virtual machine to consume, within NSX a virtual network can be defined; this virtual network can contain VLANs, vSwitches, vRouters, vLoadBalancers etc.

NSX is a new product announced by VMware, due for launch later this year, which combines the best elements from Nicira (acquired last year) and VMware. The main components of each, which form the core of NSX, are:

Nicira: Distributed Controller Cluster (Layer 2 – 4 Programmable vSwitch)

VMware: the VMware vCloud Networking and Security (vCNS) portfolio (vLoadBalancers, vFirewalls, VPN, VXLAN etc.)

While NSX is a VMware product, it is vendor, hardware and hypervisor independent!

As mentioned, NSX is a software OVERLAY which relies on having a “dumb” low latency IP network beneath it, with all the intelligence defined in software.

I for one did not study my butt off to be an “UNDERLAY fitter”, so I am obviously interested in how this progresses, to ensure I am always where the fun is!

This is not “pie in the sky”; in my view VMware with NSX has the serious potential to revolutionize the network in the same way it has the server industry with ESX.

Anyway, I have managed to dump down my thoughts at present, which may well change once I get more knowledgeable on the subject and the offerings.

If you have a view or disagree with mine, please leave a comment.

Regards

Colin


20Gb + 20Gb = 10Gb? UCS M3 Blade I/O Explained

There comes a time when, if I have to answer the same question a certain number of times, I think “this obviously requires a blog post”, so I can just tell the next person who asks to go and read it.

This is such a question.

“Ok so I have a VIC 1240 mLOM on my M3 Blade which gives me 20Gb of Bandwidth per Fabric, Correct?

Correct!

Cool, I also have a 2204XP IO module that gives me 20Gb of Bandwidth per Fabric to each of my blade slots, Correct?

Correct!

Fantastic, so if I use one with the other I get 20Gb of I/O per Fabric per Blade, Correct?

Wrong!

Huh?

OK, let’s grab a whiteboard marker and let’s go!

I can really understand the confusion around this, because at first the above logic makes perfect sense; it’s only when you open the UCS Kimono that you see the reason for this behaviour.

So, as we all know, the M3 blades give us a nice Modular LAN on Motherboard (mLOM), which is a VIC 1240; this gives us 2 x 10Gb traces (KR ports) to each IO Module.

We also have a spare Mezzanine adapter slot, which can be used for a Port-Expander (to effectively turn the VIC1240 into a VIC1280), or for any other compatible UCS I/O Mezzanine card, or an I/O flash module like the ioDrive2 from Fusion-io.

This Mezzanine slot also provides 2 x 10Gb Traces (KR Ports) to each IO Module.

OK, now the “issue” is that the ports of the I/O Module alternate between the onboard VIC1240 and the Mezzanine slot. So, to use a blade in Slot 1 as an example with a 2204XP I/O Module: I/O Module backplane port 1 goes to the VIC1240, and port 2 on the I/O Module goes to the Mez slot. This is why you only get 10Gb of usable I/O per fabric with this combination.

I'm not sure why Cisco did not trace I/O Module ports 1 and 2 to the mLOM and 3 and 4 to the Mez; I guess the way they have done it allows you to always have access to the Mez slot even if using a 2204XP I/O Module (as mentioned above, the Mez slot can be used for other cards, not just CNAs).

So as you can see, when using a 2204XP and the VIC1240 with no Mez adapter, only one of the two 10Gb traces actually matches up (see below).

B200M3 VIC 1240, No MEZ, 2204XP

OK, so how do you get your extra bandwidth? Well, one of two ways: either add a Mezzanine adapter, or use the 2208XP IO Module, or both.

If you were using a 2208XP I/O Module with your VIC1240: backplane port 1 on the I/O Module goes to the VIC1240, port 2 goes to the Mez slot, port 3 goes to the VIC1240 and port 4 goes to the Mez. So as you can see, this combination does give you both 10Gb traces to your VIC1240.
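
To make the alternation concrete, below is a little Python sketch of the backplane wiring just described (my own illustration, not a Cisco tool): odd backplane ports trace to the mLOM (VIC1240), even ports to the Mezzanine slot, with a 2204XP offering 2 backplane ports per half-width slot and a 2208XP offering 4.

def usable_bandwidth_gb(iom_ports_per_slot, mezz_fitted):
    """Usable I/O per fabric for one half-width blade slot."""
    total = 0
    for port in range(1, iom_ports_per_slot + 1):
        to_mlom = (port % 2 == 1)        # odd backplane ports -> VIC1240 mLOM
        if to_mlom or mezz_fitted:       # even ports need a Mezz card fitted
            total += 10                  # each KR trace is 10Gb
    return total

print(usable_bandwidth_gb(2, False))  # 2204XP, VIC1240 only: 10Gb per fabric
print(usable_bandwidth_gb(2, True))   # 2204XP plus Mezz:     20Gb per fabric
print(usable_bandwidth_gb(4, False))  # 2208XP, VIC1240 only: 20Gb per fabric
print(usable_bandwidth_gb(4, True))   # 2208XP plus Mezz:     40Gb per fabric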

2 B200M3 VIC1240 no Mez 2208

The other combinations of modules and resulting bandwidth are explained below.

For clarity only the backplane ports of the I/O Module that map to Blade slot 1 are shown.

2204XP Combinations

3 B200M3 with 2204XP

2208XP Combinations

4 B200M3 with 2208XP

Note that while the resulting bandwidth may be the same with certain combinations, the hardware-based port-channels are different. Obviously, the more ports in the same port-channel, the more efficient the traffic distribution.

Also bear in mind that when using the port-expander UCS Manager sees the VIC1240 and the port-expander as a single VIC.

If a VIC1280 is used in conjunction with the VIC1240, they are completely independent adapters, and are treated as such by UCS Manager.

As ever, comments most welcome.


Monitoring Cisco UCS

I am sure I’m opening up a huge can of worms with this post, but all those who know me, will know that I am never one to shy away from controversy or from encouraging debate.

In my role as Subject Matter Expert for Cisco UCS and integrated systems, I am often asked by customers for my views on how best to monitor them and the applications that run on them. To which my historical response was something like “what is it you use now? If you’re happy with it, we’ll look to integrate Cisco UCS into your existing solution”, or perhaps “just use UCS Manager for the UCS components and use something else for workload and application monitoring”.

The fact is, Cisco UCS, and other converged offerings for that matter, have heralded a new age of how workloads are delivered, operated and managed within a datacenter, across datacenters and across cloud infrastructures, whether private, public or hybrid.

And if this is a new age of converged and cloud offerings, surely we need a new age of monitoring solutions for them. Just because a customer or vendor has always done something a particular way does not necessarily mean that they should just carry on in that fashion. Customers deserve, and indeed are demanding, better!

There’s a lot to be said for having the view of “what good is it being alerted that I have a CPU running hot, or errors increasing on a particular DIMM indicating a possible imminent failure? I want to know what impact that has, or could have, on my application or service”.

This mind-set is ever more relevant now we are in a world of statelessness and Cloud, where workloads can be mobile and running across infrastructure that may or may not be under your own control.

Having this granular and detailed visibility into the cloud is essential in this new age of ever-increasing demands to reduce cost and increase efficiency through consolidation and multi-tenancy.

My intention with this post is to have a comparison summary of different monitoring solutions I have installed, tested or played around with.

Now each monitoring product will get a blog post in their own right but as I test a new one I will add it to this comparison summary for a nice quick single point of reference.

Comparing monitoring solutions is always difficult, as there are few instances where there can be a genuine “apples with apples” comparison. Each generally has its own strengths, weaknesses and focus. Some may monitor the hardware, some won’t; some may only monitor the hosts and not the guests; some may focus more on the application, and so on.

Rather than have numerous sub categories I will list all the solutions I have tested in the same comparison table and score each accordingly, which should make it pretty obvious which solutions fit where.

It is certain that not all of these solutions compete with each other, but there are undoubtedly overlaps in many cases; some may complement each other and integrate nicely, others may not. Whichever way you slice it, it may well be that to get the full solution you want, you may require a combination of products.

So, all that said, the first “yard stick” I will be looking at is what these products give me over and above what I get with UCS Manager and UCS Central, which, as I’m sure you will agree, are great at monitoring the Cisco UCS hardware, components and configuration.

So, as a starter for 10, I have listed my own view of where UCS Manager (including the added functionality of UCS Central) sits as the first column, to which all future solutions can be “compared and contrasted”.

The second is how much functionality I get “out of the box”. Now, I’m no developer, script or API guru, so while I have seen some immense bespoke monitoring solutions fronted by cool bespoke apps, I have neither the time nor the skill to go to that level with my testing. I want to click install and then start getting some cool, useful information. OK, perhaps that’s a bit unrealistic, but you get the point.

I will also look at cost and licensing. For me it’s a simple equation: Cost should = Value. If I get a lot of value from a product over that of UCSM and UCS Central, then I look at the cost, and if that cost is reasonable for the value I get, then in my book that’s a viable product.

I’m sure most of these products will come with reams of info on how they reduce TCO and the ROI they promise, usually by reducing troubleshooting time or identifying an issue before it becomes an issue (proaction rather than reaction); the engineer in me has never cared much for marketing, but rather just for the facts and tangible results.

I will also look at how these products may aid with compliance to either internal or regulatory policies and standards, like PCI DSS.

So stay tuned; I'm hoping to write up the review of the first product over the next week or so.

I have posted the below spreadsheet with the solutions and categories that have come to mind for testing and scoring, so that the community has a chance to give me comments and suggest additions/alterations prior to the testing.

I'm not sure of the timescales on this, as I am kept very busy with my day job, but every so often I get some lab time to do some testing, or even better, a booking to design/install one of the products I am evaluating.

So I see this as an ongoing project.

And as always, this is just my view and my testing; no vendors have sponsored any of these tests or will influence any of my opinions, and I will try to minimise any harm to animals during my research.

UCS Monitoring Comparisons


UCS Manager 2.1

As you may be aware, a major UCS Manager update has been in development for the past 12 months or so. I have been keeping a keen eye on this, as there are several aspects of the new release which I have wanted for a long time.

As some of my blog readers will know, about 8 months ago I wrote a post entitled “UCS the perfect solution?” where I detailed my top five gripes, or features I would like to see in Cisco UCS Manager. Well, with the imminent release of UCSM 2.1, they are now all pretty much crossed off.

This release, previously only referred to by the Cisco internal code name “Del Mar”, has been allocated the version number 2.1, currently due for general release Q4 this year.

UCSM 2.0x Features

The above shows the maintenance releases for Capitola (UCS Manager 2.0), including the current 2.0(4) release required to support the new B420 M3 4-socket Intel E5 “Romley” blade.

I have summarised the features of Del Mar below and picked out some of the key ones.

Del Mar Features

1. Multi-Hop FCoE
So first off, and one of the most eagerly awaited features: full end-to-end FCoE. This means we will no longer have to split Ethernet and native Fibre Channel out at the Fabric Interconnect, but have the option of continuing the FCoE connectivity northbound of the FI into a unified FCoE switch like a Nexus and beyond, or even plugging FCoE arrays directly into the FI itself, as shown below.

Multi-Hop FCoE

Main benefits: further cost reduction in cabling etc.; no dedicated native Fibre Channel switches required; full I/O convergence in the DC now available.

2. Zoning of Fabric Interconnects
Full zoning configuration is now supported on the FI. Previously the FI could only inherit zone information from a Nexus or MDS switch; with UCSM 2.1 the FI will support full Fibre Channel zoning.

Benefits: Fabric Interconnect could now also be used as a fully functional FC switch for smaller deployments negating the requirement for a separate SAN fabric.

3. Unified Appliance Ports.
You will now be able to run both block and file data over a single converged cable directly into your FCoE storage array (NetApp will be the only array supported initially), as shown below.

Unified Appliance ports

Benefits: Further cost reductions by consolidating ports and cabling, and running both Block and file data over the same cable.

4. Single Wire C series Integration
C-Series integration is now where it should be, i.e. a single 10Gbps connection to each fabric by way of a 10Gbps external Fabric Extender (Nexus 2232PP). This single 10Gbps connection to each fabric carries both data and management (in the same way as the B-Series blades). Prior to 2.1 you had to cable the C-Series with separate cables for data and management.

In essence you are creating a blown-out chassis, with external FEXs and compute nodes.

I’m a great believer in the right tool for the job, and not all roads lead to a blade form factor. So having tight, seamless rack mount integration is great. And if for whatever reason you want to move a workload from a blade to a UCSM-integrated rack mount, it’s just a few short clicks to accomplish.

C Series
(Supported Single Wire platforms C22M3, C24M3, C220M3, C240M3)

5. Firmware Auto Install
Anyone who has done a UCS infrastructure firmware upgrade knows it is a bit of a procedure, and obviously has to be done in a particular order to prevent unplanned outages. UCSM 2.1 comes with a Firmware Auto Install wizard which automates the upgrade.

The Firmware Auto Install below upgraded my entire UCS infrastructure in 35 minutes, in the correct order, with only a Fabric Interconnect reboot user acknowledgement required.

Firmware Auto Install

Benefits: Should provide a consistent upgrade process and outcome, reduce margin for human errors, speed up upgrade time.

Firmware Auto Install

6. Rename Service profiles
Hurray, been waiting for this for a long time.
You will now be able to non-disruptively rename Service Profiles.

This puts the power back into using Service Profile Templates, as I found myself cloning SPs rather than generating batches from templates, purely because I did not want a generic prefix that I could not change.

Service Profile Templates cannot be renamed, nor will you be able to move Service Profiles between organisations. But hey, that’s no real biggy; they are easy enough to clone into a different org and then just change the addresses manually (pools will update themselves with these manual address assignments).

Rename Service Profiles

7. Fault Suppression

I’m sure you have all at some point rebooted a blade or made planned config changes to an SP, only to see UCSM display a plethora of errors while the change is being applied. Obviously if this was planned, you don’t want your Call Home or monitoring system to alert on these “phantom errors”.
Worry not! You will now be able to put an SP into “Maintenance Mode”, and while in Maintenance Mode UCSM will not report any errors for that SP.

Also, existing error conditions that are “expected” will no longer raise faults, i.e. VIF flaps during Service Profile association/de-association etc.

8. Support for UCS Central
UCS Central, previously known under the Cisco internal code name “Pasadena”, is due out later this year. UCS Central will allow full management and pooling of addressing between separate UCS domains, and will be released in two functional phases. Phase 1: able to pool and share resources between multiple UCS domains.
Phase 2: able to move Service Profiles between multiple UCS domains.

See my full post on UCS Central Here.

9. VM-FEX Supported in Hyper-V

VM-FEX will be supported in Microsoft Hyper-V, as will Single Root I/O Virtualisation (SR-IOV), where the hypervisor will support dynamic creation of PCI devices on the fly (currently this is done via UCSM).

10. VLAN Groups
You will now be able to group VLANs and associate these groups with certain uplinks (this will be a nice feature when using disjoint Layer 2).

11. Org Aware VLANs
Another nice feature is that organisations can now be given permissions to particular VLANs, so in essence Service Profiles can be limited to only using VLANs assigned to the organisation they are in. In fact, when creating a Service Profile, the admin only has visibility of the VLANs granted to the org they are creating the Service Profile in.

Great for multi-tenancy environments as well as reducing the possibility of misconfigurations and enforcing security policy.

Anyway that’s my summary, lots of good stuff coming.

Regards
Colin


HA with UCSM Integrated Rack Mounts

Hi All
One of the founding members of the Cisco UCS Avengers, Fabricio Grimaldi (who also happens to be the cat who first introduced me to Cisco UCS), came up with a great question:

“How does HA (Split Brain avoidance) work with UCS Manager integrated Rack Mounts when there is no Chassis and therefore no SEEPROM?”

Great question and one I had to admit I did not know the answer to.

Luckily, another 2 Cisco UCS Avengers founding members were also CC’d on the question, Scott Hanson and Sean McGee, and it was Sean who came back with the answer.

But before we get into the answer, a quick recap on how this works in a B-Series environment.

B Series Split Brain

If you are ever in the unlikely scenario that both of your Fabric Interconnect cluster links fail (L1 and L2), the active UCS Manager remains active, but the standby UCS Manager no longer sees heartbeats from the active UCS Manager and as such would try to go active, resulting in two isolated active brains. This is referred to as “Split Brain” or a “Partition in Space”.

Luckily the smart folks in Cisco anticipated this and added something that prevents this from happening.

There is a Serial EEPROM (SEEPROM) on the mid-plane of each Cisco UCS chassis that is used as shared storage and updated by both Fabric Interconnects, so each can be aware that the other FI is still active by checking the SEEPROM for updates from the other FI.

In this scenario both FI’s will go into standby state and try and claim as many Chassis as possible and the FI that claims the most chassis will promote itself to active.

In order to prevent a tie breaker i.e. both FI’s claiming the same amount of chassis, again there is a mechanism in place to prevent this from happening.

If there is an odd number of chassis in the UCS Domain, no problem as one FI will always claim more than the other.
If there is an even number of chassis then there is a potential for a tie breaker. So each chassis is designated whether it can be claimed or not in the event of a Split Brain; these “claimable” chassis are designated as Quorum Chassis and their SEEPROMs are marked as such.

The UCS domain always ensures there is an odd number of Quorum Chassis, i.e. if there is an odd number of chassis then all chassis SEEPROMs will be marked as Quorum Chassis; if there is an even number of chassis then all but one will be designated as Quorum Chassis, to ensure an odd number.
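
If it helps, the quorum rule can be written down in a couple of lines of Python (again, just my own sketch of the logic described above):

def quorum_chassis(total_chassis):
    """Number of chassis marked as Quorum Chassis (always odd)."""
    return total_chassis if total_chassis % 2 == 1 else total_chassis - 1

# e.g. 4 chassis -> 3 Quorum Chassis, so one FI can always claim a
# majority (2 of 3) and promote itself to active.
for n in (1, 2, 3, 4):
    print(n, "chassis ->", quorum_chassis(n), "Quorum Chassis")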

OK hope that’s all clear.

So coming back to the main question in the opening paragraph: how the heck does this work in a UCS Manager integrated C-Series rack mount environment, where the servers do not have SEEPROMs?

Well…… Tune in next week, same UCSguru time, same UCSguru channel, OK Just kidding

I’m sure we are all familiar with the diagram below, which shows the current method of integrating a UCS C-Series rack mount server into a UCS Manager domain (it gets simpler in UCSM 2.1, as you only need the 10Gb connections from the server). In essence it is an exploded chassis with external FEXs and compute nodes, but one thing is missing from this exploded chassis……. Yes, that’s right, the SEEPROM.

C Series Integration

OK, so while a C-Series rack mount does not have a SEEPROM, it does have a Cisco Integrated Management Controller (CIMC), previously referred to as a Baseboard Management Controller (BMC).

In a mixed environment of chassis and rack mounts, a file in /mnt/jffs2 on the CIMC of the rack mount has the same layout as the SEEPROM in a chassis (there was an update to UCS Manager to recognise these “fake” SEEPROMs).

The below output shows a UCS system that is using both chassis and rack mounts as shared storage to prevent Split Brain:

UCS-DEMO-A# show cluster extended-state
Cluster Id: 0x70af642e8d7811e1-0x8d99547fee0d3804

Start time: Mon Nov 5 17:08:55 2012
Last election time: Tue Nov 6 10:20:47 2012

A: UP, PRIMARY
B: UP, SUBORDINATE

A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK

INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP

HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1530G861, state: active
Server 1, serial: FCH1525V01T, state: active
Server 8, serial: WZP1615000E, state: active

I hope you found this post as interesting to read as I did to write and again big thanks to Sean, Scott and Fab for their input!

Regards
Colin
