My Journey with VMware NSX

A few times recently I have been asked how I went about expanding my skill set to include software defined networking solutions after being a “traditional networker” for the past 20 years. So here is my story so far.

Three years or so ago, having achieved my 2nd CCIE, I was looking for my next challenge. Software Defined Networking (SDN) was already gaining momentum, so I looked at what I could do in that space. I already had a fairly good handle on Cisco Application Centric Infrastructure (ACI), but at the time there were no certification tracks geared around Cisco ACI.

VMware NSX seemed the obvious choice. I was already familiar with the Nicira solution prior to the VMware acquisition, and NSX, being a Network Function Virtualisation (NFV) based solution, uses constructs that are very easy for “traditional networkers” to understand. If you know what a physical router does and how to configure it, then it isn’t much of a departure to understand and configure a distributed logical router (dLR) in NSX, and the same goes for NSX logical switches and firewalls.

If you’re familiar with setting up emulators like GNS3 and Cisco VIRL then again you’re already adept at setting up virtual networks so the gap to understanding and configuring NSX really isn’t that much to bridge.

Like most people trying to learn something new, I started playing with NSX in a home lab environment. Just a couple of low grade dual core servers with 64GB RAM in each was plenty to create a nested NSX environment, but I quickly found the VMware Hands on Labs were so available and functional that I pretty much just used them instead.

I progressed to VCP-NV (VCP-Network Virtualisation) and then attended the “NSX Ninja” 2 week boot camp, on the back of which I took and passed (2nd time round) the VCIX-NV (Implementation Expert), an intense 3hr practical assessment on building and troubleshooting an NSX solution.

The NSX Ninja course was great! It was taught by Paul Mancuso @pmancuso and Chris McCain @hcmccain and gave a great insight into the process of submitting and defending a VCDX-NV (Design Expert) design. VCDX-NV is my main goal for this year; it requires the submission of an NSX design, which you then defend to a panel of experts. The NSX Ninja course was possibly one of the best courses I have ever attended, purely for the amount of interaction and constructive feedback.

Of course what also stood me in great stead was the invaluable experience I had picked up, having spent 3 years working with NSX day in and day out, and having now delivered 3 large production deployments in multi vCenter cross site topologies. No matter how much training you do, nothing burns in that knowledge quite like working in the field, delivering solutions that meet real customers’ business requirements.

As with most expert level certifications it is not reaching the destination that makes you an expert, it’s what you learn and the scars you pick up along the path.

This year I was very proud to be selected as a vExpert and am very much looking forward to participating in the program.

Good luck on your own journeys.



Introducing Cisco UCS S-Series

Today Cisco announced the Cisco UCS S-Series line of storage servers.

Now the more eagle eyed among you may think that the new Cisco UCS S3260 Storage Server looks very much like the Cisco UCS C3260 Rack server (Colusa), and you wouldn’t be too far off. However, the S3260 has been well and truly “Pimped” to address the changing needs of a modern storage solution, particularly as an extremely cost effective building block in a Hybrid Cloud environment.

The C3160/C3260 was particularly suited to large, cost effective, cooler storage solutions, that is to say the retention of less active or inactive data on a long-term or indefinite basis at low cost, with use cases such as archive or video surveillance. The fact is data is getting bigger and warmer all the time, and it shows no signs of slowing down anytime soon. Even on these traditional colder storage solutions, the requirement for real time analytics on this data is demanding an ever increasing amount of compute coupled with this storage.

So Cisco have created this next generation of Storage Server to meet these evolving needs.

If there is a single word to describe the new Cisco UCS S-Series it is “Flexibility” as it can be configured for:

Any Performance: Right sized to any workload
Any Capacity: Scale to Petabytes in minutes
Any Storage: Disk, SSD or NVMe
Any Connectivity: Unified I/O and Native Fibre Channel


  • Fully UCS Manager Integrated


Since UCS Manager 3.1 (Granada), all Cisco UCS products are supported under a single code release, including the S-Series storage servers (UCSM 3.1.2).

  • Modular Components to allow independent scaling.

As we know, different components generally have different development cycles, so the S-Series storage servers are built with a modular architecture to allow components to be upgraded independently. For example, as and when the 100Gbps I/O module is released it’s a simple I/O module replacement; similarly, when the Intel Skylake Purley platform (E5-2600 v5) is available it’s just a server module upgrade.

  • Up to 600TB in a 4U Chassis, then scales out beyond that in additional 4U Chassis
  • 90TB SSD Flash

As can be seen below, the Cisco UCS S3260 can house up to 2 x Dual Socket M3 or M4 Server nodes, but you also have the option of using a single server node and then adding either an additional Disk Expansion module or an additional I/O Expansion module.



Server Node Options.



System I/O Controller


The 2 Fabric (SIOC) modules map to each of the server nodes, i.e. Fabric Module 1 to Server Node 1 and Fabric Module 2 to Server Node 2. This provides up to 160Gb of bandwidth to each 4U chassis.


Disk Expansion Module.


Adds up to another 40TB of storage capacity to reach the 600TB maximum, with support for 4TB, 6TB, 8TB and 10TB drives.


I/O Expansion Module

In order to allow the maximum amount of flexibility with regards to connectivity or acceleration, Cisco offer an I/O expansion module to allow additional Ethernet or Fibre Channel (Target and Initiator) connectivity options.

Flash Memory from Fusion-io or SanDisk is also supported.



I/O Expansion Module 3rd Party Options.




Cisco UCS S-Series Configuration Options.

The figure below shows the various configuration options depending on how you wish to optimize the server.



Where does Cisco UCS S-Series fit?

Cisco are positioning the S-Series as a pure infrastructure play. They are not bundling any Software Defined Storage (SDS) software on it, as that space is filled by the Cisco HyperFlex Solution, but perhaps the S-Series could be an option for a huge storage optimized HyperFlex node in the future.

That does not of course preclude you from running your own SDS software on the S3260, like VMware vSAN for example.

And for clients that want an off the shelf pre-engineered solution, offerings like Vblock/VxBlock or FlexPod are still there to fill that need.


One thing’s for sure, there is still lots of innovation planned in this space. In the words of Captain Jean-Luc Picard, “Plenty of letters left in the alphabet”.

For more information refer to the Cisco UCS S-Series Video Portal





Sizing your Cisco HyperFlex Cluster


Update 31st July 2018

Since my post below, Cisco have released a HyperFlex profiler and sizer tool. For more info on where to get it and how to install and use it, refer to the video by Joost van der Made.


But still feel free to read my original post below, as it will teach you the under the covers Math.

/End of update


Original Post:

Calculating the usable capacity of your HyperFlex cluster under all scenarios is worth taking some time to fully understand.

The most significant design decision to take is whether to optimise your HyperFlex cluster for Capacity or Availability. As with most things in life, there is a trade-off to be made.

If you size your cluster for maximum capacity, obviously you will lose some availability; likewise, if you optimise your cluster for availability you will not have as much usable capacity available to you.

The setting that determines this is the Replication Factor (RF).

As the replication factor is set at cluster creation time and cannot be changed once set, it is worth thinking about. In reality you may decide to choose a different replication factor based on the cluster use case, e.g. availability optimised for production and capacity optimised for Test/Dev.

It is absolutely fine to have multiple clusters with different RF settings within the same UCS Domain.

So let’s look at the two replication factors available:

Replication Factors

As you can no doubt determine from the above, the more nodes you have in your cluster the more available it becomes, and the better able it is to withstand multiple failure points. In short, the more hosts and disks there are across which to stripe the data, the less likely it is that multiple failures will affect replicas of the same data.



There is also an 8% capacity overhead that needs to be factored in for the metadata ENOSPC* buffer.

*No space available for writing onto the storage device


Now for some good news: what the Availability Gods taketh away, HyperFlex can giveth back! Remember that “always on, nothing to configure, minimal performance impacting” Deduplication and Compression I mentioned in my last post? Well, it really makes great use of that remaining available capacity.

VDI VMs will take up practically no space at all, and will give circa 95% capacity savings. For persistent user data and general Virtual Server Infrastructure (VSI) VMs, the capacity savings are still a not to be sniffed at 20-50%.


The last figure to bear in mind is our old friend N+1, so factor in enough capacity to cope with a single node failure, planned or otherwise, while still maintaining your minimum capacity requirement.


As in most cases, it may be clearer if I give a quick manual example.

I’m only going to base this example on the storage requirement,  as the main point of this post is to show how the Replication Factor (RF) affects capacity and availability. Obviously in the real world vCPU to Physical Core ratios and vRAM requirements would also be worked out and factored in.

But let’s keep the numbers nice and simple (and again not necessarily “real world”) and plan for an infrastructure to support 500 VMs, each requiring 50GB of storage.

So 500 x 50GB gives us a requirement of 25TB capacity.

So let’s say we spec up 5 x HyperFlex Nodes, each contributing 10TB RAW capacity to the cluster, which gives us 50TB RAW. But let’s allow for that 8% of metadata overhead (4TB):

50TB – 4TB = 46TB

So we are down to 46TB. We now need to consider our Replication Factor (RF). Let’s say we leave it at the default of RF 3, so we need to divide our remaining capacity by our RF, in this case 3, to allow for the 2 additional copies of all data blocks.

46TB / 3 = 15.33TB

So that puts us down to 15.33TB actual usable capacity in the cluster.

So remember that 25TB of capacity we need for our 500 Virtual Servers? Well, let’s be optimistic and assume we will get the upper end of our Dedupe and Compression savings, so let’s reduce that by 50%:

25TB / 2 = 12.5TB

And that’s not even taking into account thin provisioning which, although variable and dependent on a particular client’s comfort factor, would realistically reduce this capacity requirement by a further 30-50%.

So let’s now say our 12.5TB on disk capacity is realistically likely to be 6.25-8.75TB, meaning we are OK with our available 15.33TB on our 5 node cluster.

So the last consideration is could we withstand a single node failure or planned host upgrade and still meet our minimum storage requirement?

So either we could work out all the above again but just using 4 nodes (contributing a total of 40TB RAW to the cluster), or, since we know that 5 nodes contribute a total of 15.33TB usable, we know that each node is contributing approx. 3.06TB of usable capacity. So if we take a node out of the cluster:

15.33TB – 3.06TB = 12.27TB usable remaining, which is still above our 8.75TB realistic requirement.
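For anyone who would rather script the arithmetic above than do it by hand, here is a minimal Python sketch of the same calculation. The function and parameter names are my own, purely for illustration (this is not the Cisco sizer), and it assumes, as in the example, an 8% metadata overhead, decimal units (1TB = 1000GB) and savings that apply as simple percentages.

```python
def usable_capacity_tb(nodes, raw_per_node_tb, replication_factor,
                       metadata_overhead=0.08):
    """Usable capacity after the metadata buffer and replica copies."""
    raw_tb = nodes * raw_per_node_tb
    after_overhead_tb = raw_tb * (1 - metadata_overhead)
    return after_overhead_tb / replication_factor


def required_capacity_tb(vm_count, gb_per_vm, dedupe_compression_saving,
                         thin_provisioning_saving=0.0):
    """On-disk requirement after the assumed space savings are applied."""
    logical_tb = vm_count * gb_per_vm / 1000  # decimal TB, as in the example
    after_dedupe_tb = logical_tb * (1 - dedupe_compression_saving)
    return after_dedupe_tb * (1 - thin_provisioning_saving)


def survives_node_loss(nodes, raw_per_node_tb, replication_factor,
                       required_tb):
    """N+1 check: is the requirement still met with one node removed?"""
    remaining_tb = usable_capacity_tb(nodes - 1, raw_per_node_tb,
                                      replication_factor)
    return remaining_tb >= required_tb


if __name__ == "__main__":
    usable = usable_capacity_tb(5, 10, 3)
    # Pessimistic end of the range: 50% dedupe/compression, 30% thin provisioning
    worst_case = required_capacity_tb(500, 50, 0.5, 0.3)
    print(f"usable: {usable:.2f}TB, required: {worst_case:.2f}TB")
    print("N+1 OK:", survives_node_loss(5, 10, 3, worst_case))
```

Swapping the replication factor from 3 to 2 in the calls above is a quick way to see how much capacity the availability optimised setting costs for the same hardware.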

Luckily to make all this sizing a walk in the park, a HyperFlex sizing calculator will soon be available on CCO.



Cisco HyperFlexes its muscles.

If “Software Defined X” has been the hot topic over the last couple of years, then “Hyperconvergence” is certainly set to be one of the hottest topics of 2016. Like most buzzwords it’s a very overused term; the word even has “hype” in the name. Hell, even as I type this post, auto complete finished off the word after only 2 characters.

This market has been so confusing with so many overlapping offerings and players coming and going, alliances forged and broken and like all intensely competitive markets there has been a long line of casualties along the way, but from all this competition hopefully it should be the consumer that emerges the real winner. I did think about doing an elaborate “Game of Thrones” esque opening credits video showing all the key players, but my CGI skills aren’t up to much so just pretend I did it.

So before we get stuck in, what is Hyperconvergence?

Well, traditionally a hyper-converged infrastructure (HCI) is modular compute building blocks with internal storage, plus a magical hyperconverged software layer that manages the compute and storage elements, abstracts them, and presents them to the Application as virtual pools of shared resources. This maximizes resource utilization while minimizing wastage and inefficiency. It is this magical hyperconverged software layer that is generally the differentiating factor in the plethora of offerings out there. I say “traditionally” as you may notice there is a critical element missing from the above definition, which I will cover later in this post.

The rise in popularity of hyperconverged offerings is also due, in part, to the ability to scale from small to very large deployments, as well as negating the requirement for a complex enterprise shared storage array. This minimizes the initial upfront Capex costs and allows a “Pay as you Grow” cost model, which actually increases in efficiency and performance the larger it scales due to its distributed nature.

While Cisco are well established in the integrated systems market, contributing network and compute to the converged infrastructure offerings of Vblock and VxBlock and to integrated systems like FlexPod, there has always been a bit of a gap in their hyperconverged portfolio. Sure, there is the OmniStack partnership with SimpliVity, but nothing as far as a complete Cisco HCI offering goes.

Introducing The Cisco HyperFlex System





Today Cisco announced their new hyperconverged offering in the form of the Cisco HyperFlex System, a complete hyperconverged solution combining next generation Software Defined Compute, Storage and Networking, thus providing a complete end-to-end software-defined infrastructure, all in one tightly integrated system built for today’s workloads and emerging applications.

I say complete hyperconverged offering as the Cisco HyperFlex System also comes with full network fabric integration. This is one of the significant competitive advantages the Cisco HyperFlex System has over other HCI offerings that do not integrate or even include the network element. In fact, if the fabric isn’t part of the solution, is the solution really even hyperconverged?

HyperFlex is built from the ground up for hyperconvergence, leveraging the Cisco UCS platform, along with software provided by Springpath, a start-up founded in 2012 by VMware veterans. This hyperconverged software is fully API enabled and has been branded the HX Data Platform.

If being a bit late to the hyperconverged party has had one advantage, it’s that Cisco have had time to listen to customers about what they felt was lacking in the current generation of hyperconverged offerings, and to properly address the architectural shortcuts and design trade-offs made by some other vendors in order to get to market quickly.

And with HyperFlex, Cisco feels they have leap-frogged any other HCI offering out there by some 2 – 4 years!

Key features of the Cisco HyperFlex System

There are so many features covered in the announcement today, each worthy of a blog post in their own right, which I will no doubt cover here, once more details are released and I can actually get my hands on one to play with. But until then, here is the list of the HyperFlex features that most caught my eye.

  • Simplicity And Easy Independent Scaling.

Cache and Capacity can be scaled up within nodes, or additional nodes can be added, thus allowing compute and capacity to be scaled up and out completely independently as required. Traditional hyperconverged solutions, by contrast, only scale in a linear form, i.e. you are forced to add both compute and storage in fixed ratios, even if you only need to scale one of them.


New cluster nodes are automatically recognized and are added to the cluster with a few mouse clicks.

Cisco claim that it is possible to stand up a Cisco HyperFlex System, including all the networking, in under 1 hour. Well, I’ll certainly look forward to testing that claim out.


  • The HX Data Platform

The HX Data Platform is implemented using a Cisco HyperFlex HX Data Platform controller, which runs as a VM on each cluster node. This controller implements the distributed file system and intercepts and handles all I/O from guest virtual machines.

HX Data Platform Controller



The HX nodes connect to the presented hyperconverged storage via 2 vSphere Installation Bundles (VIBs) within the hypervisor, IO Visor and VAAI, which provide a network file system (NFS) mount point to the distributed storage.

The IO Visor VIB can also be loaded on a non HyperFlex node to provide access to the hyperconverged storage, adding additional compute power in a Hybrid solution.


  • Superior Flash Endurance.

Being built upon a log-structured file system enables superior flash endurance by significantly optimizing writes and reducing program/erase cycles.

  • Dynamic Data Distribution

Unlike systems built on conventional file systems which first need to write locally then replicate, creating hot spots, the HX Data Platform stripes data across all nodes simultaneously. It does this by first writing to the local SSD cache, the replicas are then written to the remote SSD drives in parallel before the write is acknowledged.

For reads, if the data happens to be local it will usually be read locally; otherwise the data will be retrieved from the SSD of a remote node, thus allowing all SSD drives to be utilised for reads and eliminating I/O bottlenecks.

  • Continuous Data Optimization.

The always-on inline deduplication provides up to 30% space saving followed by inline compression which provides up to an additional 50% space saving and all with little to no performance impact. And did I mention it’s always-on? Nothing to turn on or configure.

And these figures do not even include the additional space savings achieved by using native optimized clones and snapshots; if they did, the overall space saving would be circa 90% or more.

This, combined with thin provisioning, gives the most efficient use of the storage you have, so you only need to buy new storage as you need it.
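As a rough illustration of how those two headline figures might combine, the short Python sketch below assumes the compression saving applies to whatever data is left after deduplication, i.e. the savings compound rather than add. That is my reading of “an additional 50%”, not a published Cisco formula, and the function name is purely illustrative.

```python
def combined_saving(dedupe_saving, compression_saving):
    """Overall saving when compression acts on the data left after dedupe."""
    remaining = (1 - dedupe_saving) * (1 - compression_saving)
    return 1 - remaining


if __name__ == "__main__":
    # "Up to" figures quoted above: 30% dedupe, then 50% compression
    best_case = combined_saving(0.30, 0.50)
    print(f"best case combined saving: {best_case:.0%}")
```

Under that assumption the “up to” figures of 30% and 50% work out to a best case of around 65%, before clones, snapshots and thin provisioning are taken into account.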

  • High resilience and fast recovery

Depending on the chosen replication mode, which maximizes either availability (Replica Mode 3) or capacity (Replica Mode 2), the platform can withstand the loss of up to 2 HX nodes without data loss. Virtual machines on failed nodes simply redistribute to other nodes via the usual vSphere methods, with no data movement required. Then, with the combined functionality of stateless service profiles and the built in self-healing within the HX Data Platform, the replacement node is simply and dynamically replicated back in, again with no data movement required. This eliminates the issue of sessions pausing or timing out in solutions which rely on data locality, i.e. which attempt to locate the data on the hosts that are using it.

  • Management Simplicity

100% Administered via the Cisco HyperFlex HX Data Platform Administration Plug-in for vCenter. This plugin provides full management and monitoring of the data platform health as well as providing data which can be used to determine when the cluster needs to be scaled.

The initial UCS Manager elements can also be managed via the often forgotten UCS Manager Plugin for vCenter.

There will also be a UCS Manager wizard to guide the user through the initial UCS Manager configuration of pool address population and Service Profile creation, something I’m sure we will see in UCS Classic not long after.

  • Flexible

At FCS vSphere with File based storage will be supported on the Cisco HyperFlex System, with Block and Object based storage planned for the future, along with Hyper-V, Bare Metal and Container support.


  •  Built on industry leading Cisco UCS Technology

Cisco UCS is now tried, tested and trusted by over 50,000 customers worldwide.

The Cisco HyperFlex System will come with Gen 2 6248UP or 6296UP Fabric Interconnects (FW 2.2(6f)), with the Gen 3 Fabric Interconnects already released and waiting to provide 40Gbps connectivity to the Cisco UCS as and when data throughput demand increases within the HyperFlex system.

While the network with many HCI offerings is at best an afterthought, or at worst not even included, with the Cisco HyperFlex System the network is fully integrated and optimized for the large amount of east/west traffic required in a hyperconverged system. Every HyperFlex node is just a single hop away, providing deterministic and consistent performance.

Having Cisco UCS as the solid foundation for the platform also provides a single central management system for both integrated and hyperconverged infrastructure as well as offering integration with Cisco UCS Director and UCS Central.



As can be seen from the above diagram, there are 2 models of HyperFlex rack mount nodes, each requiring a minimum cluster size of 3 nodes: the 1U HX220c, ideal for VDI and ROBO use cases, and the 2U HX240c for capacity heavy use cases. There is also a third, hybrid option combining Blade and Rack mounts for compute heavy workloads.

HX220c M4


Each HX220c Node contains:

2 x Intel Xeon E5-2600 v3 Processors (up to 16 Cores per socket)
Up to 768 GB DDR4 RAM
1 x 480GB 2.5inch SSD Enterprise Performance (EP) for Caching
1 x 120GB 2.5inch SSD Enterprise Value (EV) for logging.
2 FlexFlash SD cards for boot drives and ESXi 6.0 hypervisor (ESXi 5.5 also supported)
Up to 6 x 1.2TB SFF SAS Drives contributing up to 7.2 TB to the cluster.
1 Cisco UCS Virtual Interface Card (VIC 1227)

HX240c M4



Each HX240c M4 Node contains:

1 or 2 x Intel Xeon E5-2600 v3 Processors (up to 16 Cores per socket)
Up to 768 GB DDR4 RAM
1 x 1.6TB 2.5inch SSD Enterprise Performance (EP) for Caching
1 x 120GB 2.5inch SSD Enterprise Value (EV) for logging.
2 FlexFlash SD cards for boot drives and ESXi 6.0 hypervisor (ESXi 5.5 also supported)
Up to 23 x 1.2TB SFF SAS Drives contributing up to 27.6 TB to the cluster.
1 Cisco UCS Virtual Interface Card (VIC 1227)


Common Use Cases

Looking at what the Early Access Customers are doing with HyperFlex, by far the main use case looks to be VDI. The low up front cost, consistent performance and user experience, along with predictable scaling, certainly make HyperFlex an ideal solution for VDI.

 Also high on the list was Test/Dev environments, features like Agile Provisioning, instant native cloning and native Snapshots make a compelling case for entrusting your Test/Dev environment to HyperFlex.

 And while the above are two compelling use cases and sweet spots for HyperFlex I’m sure as customers experience the ease, flexibility and scalability of the HyperFlex System we will see it used more and more for mixed workload general VM deployments as the resilience and performance is certainly there for critical applications.

Remote Office Branch Office (ROBO) was also mentioned, although I would think this would likely be a larger remote office, as any use case requiring only 2 or 3 servers would likely be more cost effectively served with the current UCS C-Series in conjunction with StorMagic SvSAN.


With an initial bundle price for 3 x HX220c nodes, including a pair of Fabric Interconnects and the first year’s software subscription, expected to be circa $59,000, Cisco are obviously dead set on making this a compelling solution based not only on outstanding Next Gen functionality, performance and agility but also on cost.

Other Questions you may be thinking about

Now as with all new products, a line has to be drawn somewhere, for that First Customer Ship date. Only so much validation, testing of various hardware combinations, features and scale limits can be conducted.

Now I’m a curious chap, and I like to ask a lot of questions, particularly the questions I know my readers would like the answers to.

The running theme in the answers from Cisco to most of my “Could I” questions was that they wanted to get it right, and ensure that the product was as optimized as possible and that Cisco were not prepared to make any compromises to user experience, performance or stability by casting the net too wide from day 1.

All answers are paraphrased.

Q) Will HX Data Platform be available as a software only option?
A) No, the HX Data Platform will only be offered preinstalled on the badged HyperFlex HX nodes.

Q) Can I just load the HX Data Platform on my existing UCS Servers if I use the exact spec of the HyperFlex branded ones?
A) No (see above answer)

Q) Are there any hardware differences between the HX nodes and their equivalent C-Series counterparts?
A) No, but the specification and settings are highly tuned in conjunction with the HX Data Platform

Q) Will I be able to mix HX220c and HX240c in the same HyperFlex Cluster?
A) Not at FCS, all nodes within the same cluster need to be identical in model & spec.

However, each Cisco UCS Domain supports up to 4 separate clusters, and each of those clusters could be optimised for a particular use case or application, for example:

Cluster 1:  Replica Mode 2 on HX220c to support Test/Dev workloads
Cluster 2:  Replica Mode 3 on HX240c to support Capacity heavy workloads
Cluster 3:  Replica Mode 3 on HX240c and B200M4 to support Compute heavy workloads

Q) Why is the maximum HX cluster size 8?
A) 8 seemed a reasonable number to start with, but it will certainly increase with additional validation testing. While the initial cluster size is limited to 8 HX nodes per cluster, with the Hybrid option an additional 4 classic B200M4 Blades can be added for additional compute power, giving a total of 12 servers in a Hybrid cluster. In the Hybrid solution the B200M4 local storage is not utilized by the Cisco HyperFlex System.

Q) Will I be able to have a mixed HyperFlex and non HyperFlex node UCS Domain?
A) Not at FCS, HX Nodes will require a separate UCS Domain, except for the 4 supported blades in the hybrid model

Q) Are FEXs supported to connect HyperFlex nodes to the FIs?
A) Not at FCS. There is no technical reason why not once validated, but oversubscription of FEX uplinks needs to be considered.

Q) Will the 3000 Series Storage Optimized Rack Mount servers (Colusa) like the 3260 be available as HyperFlex nodes?
A) Not at FCS. These servers are more suited to lower performance, high capacity use cases like archiving and cold storage. Also, the 3000 series servers are managed via CIMC and not UCSM.

Q) Can I setup a Cisco Hybrid HyperFlex System by directly connecting the HX nodes to my UCS Mini?
A) Not at FCS

Closing thoughts

Both the Converged and Hyperconverged markets continue to grow and will co-exist, but with HyperFlex Cisco have certainly strengthened what was the only chink in their armour, meaning that there is now a truly optimized solution based on a single platform under a single management model for all requirements and use cases, providing many HCI features not available until now.


One platform


One thing is clear: the Hyperconverged game changes today!


Until next time.


Keep up to date with further HyperFlex announcements on social media by following the hashtag #CiscoHX


Cisco UCS Generation 3 First Look.

I noticed last week that Cisco have just released the eagerly awaited Cisco UCS 3.1 code “Granada” on CCO.

Whenever a major or a minor code drop appears, the first thing I always do is read the Release Notes to see what new features the code supports or which bugs it addresses. Well, reading the 3.1 release notes was like reading a Christmas list. There are far too many new features and enhancements to delve into in this post, so I will just be calling out the ones that most caught my eye.

Cisco UCS Gen 3 Hardware

At the top of the release notes is the announcement that 3.1 code supports the soon to be released Cisco UCS Generation 3 Hardware.

With the new Gen 3 products the UCS infrastructure now supports full end to end native 40GbE.

The Generation 3 Fabric Interconnects come in 2 models: the 6332, which is a 32 Port 40G Ethernet/FCoE Interconnect, and the 6332-16UP, which as its model number would suggest comes with 16 Unified ports that can run as 1/10G Ethernet/FCoE or as 4/8/16G Native Fibre Channel.

The 6 x 40G Ports at the end of the Interconnects do not support breaking out to 4 x 10G ports, and are best used as 40G Network uplink ports.

The Gen 3 FIs are a variant of the Nexus 9332 platform, and although the Application Leaf Engine (ALE) ASIC, which would allow them to act as a leaf node in a Cisco ACI Fabric, is present, that functionality is not expected until the 4th Gen FI.



Cisco 6332 Fabric Interconnect




Cisco 6332-16UP Fabric Interconnect



6300 Fabric Interconnect Comparison



Now that there is a native 40G Fabric Interconnect we obviously need a native 40G Fabric Extender / IO Module, and the UCS-IOM-2304 is it. It has 4 x Native 40G Network Interfaces (NIFs) and 8 x Native 40G Host Interfaces (HIFs), one 40G HIF to each blade slot. These 40G HIFs are actually made up of 4 x 10G HIFs, but with the correct combination of hardware they are combined into a single native 40G HIF.

Now, as the purists will note, in Network speak 1 x native 40G is not the same as 4 x 10G; it’s a speed vs. bandwidth comparison. The way I like to explain it is: if you have a tube of factor 40 sun lotion and 4 tubes of factor 10, they are not the same thing. 4 x 10 in this case does not equal 40, you just have 4 times as much 10 🙂 It’s a similar analogy when comparing native 40G speed and bandwidth with 40G of bandwidth made up of 4 x 10G speed links. Hope that makes sense.
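To put some numbers on the analogy, here is a small Python sketch. It assumes the usual port-channel behaviour where any single flow is hashed onto one member link, so one flow can never run faster than that link; the function names are mine and purely illustrative.

```python
def aggregate_bandwidth_gbps(link_speeds_gbps):
    """Total bandwidth available across all member links."""
    return sum(link_speeds_gbps)


def single_flow_ceiling_gbps(link_speeds_gbps):
    """Best case for one flow, which is pinned to a single member link."""
    return max(link_speeds_gbps)


if __name__ == "__main__":
    options = (("1 x native 40G", [40]), ("4 x 10G", [10, 10, 10, 10]))
    for name, links in options:
        print(f"{name}: {aggregate_bandwidth_gbps(links)}G bandwidth, "
              f"{single_flow_ceiling_gbps(links)}G per flow")
```

Both options offer 40G of bandwidth in total, but under this assumption only the native 40G link lets a single flow go beyond 10G.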


Cisco 2304 IO Module


As mentioned above, the hardware combination you have will give differing speed and bandwidth options. The three diagrams below show the three supported options.

2304 all 3

One Code to Rule Them All.

Next up is that UCSM 3.1 is a Unified code, which supports B-Series, C-Series and M-Series all within the same UCS Domain as well as support for UCS Mini and all this in a single code release.

HTML5 Interface.

I’m sure you have all shared my pain with Java in certain situations, so you will be glad to hear that UCSM 3.1 offers an HTML5 user interface on all supported Fabric Interconnects (6248UP, 6296UP, 6324, 6332 and 6332-16UP). 6100 FI and UCS Gen 1 hardware is no longer supported as of UCSM 3.1(1e), so if you still have any, do not upgrade to 3.1; you will need to stay on UCSM 2.2 or earlier until you have removed all of your Gen 1 hardware! (Refer to the release notes for the full list of deprecated hardware.)

Support for UCS Mini Secondary Chassis.

I have deployed UCS Mini in various situations, mainly DMZs, specialised pods like PCI-DSS workloads, or branch offices. While 8 blades and 7 rack mounts has always been enough, in many cases an “all blade solution” would have been preferred. Well, now UCS Mini can scale to 16 blades. You can now daisy chain a second UCS Classic chassis from the 6324 FI, allowing for an additional 8 blades. You can either use the 10G ports for the secondary chassis or, with the addition of the QSFP Scalability Port License, you can break out the 40G scalability port to 4 x 10G, as shown below.

UCS Mini Secondary Chassis


Honourable Mentions.

  • There is now an Nvidia M6 GPU for the B200M4 which would provide a nice graphic intensive VDI optimized blade solution.
  • New “On Next Boot” maintenance policy that will apply any pending changes on an operating system initiated reboot.
  • Firmware Auto Install will now check the status of all vif paths before and after the upgrade and report on any difference.
  • Locator LED support for server hard-disks.

Well, that about covers it in this update. I am looking forward to having a good play with this release and the new hardware when it comes out.

Until next time





CCIE Program Refresh, My thoughts.

The world of technology as we know is changing fast, faster than many predicted. This is certainly true in the data center. In the “Old days” (more than 3 years ago) most of my time was spent evangelising a particular product or adjudicating a bake off between two or more vendor platforms.

These days the infrastructure conversations tend to be far shorter, the fact is infrastructure these days is a given, and the true differentiator is how easy that infrastructure is to consume, automate and orchestrate in a cloud stack or converged solution.

Gone are the days of product led engagements (and rightfully so); these days it’s all about solution led engagements: taking a business requirement and translating that into a technical solution which truly drives business outcomes.

This solutions led approach inevitably leads to a closer collaboration between teams across all elements of the cloud stack: portal developers, applications developers, automation/orchestration, backup, infrastructure, storage etc.

The Human API

Just like all the above elements use various APIs to hook seamlessly into the other components of the solution, the true modern day consultant needs “Human APIs” in their skill set to design, deliver and integrate their elements into the overall business solution.

It is unrealistic to expect a networking consultant to be an expert in all the various solutions into which the network could integrate for example OpenStack, the automation/orchestration and Cloud Management Platforms like Cisco UCS Director, or to fully understand a culture like DevOps.

What the modern day networker does need to know however, is how all these various elements interact and consume the network. They need the “Human API” into each of these technologies, which means knowing enough about the other teams, to talk a common language in order to design and implement a truly holistic and optimised solution.

With this in mind, Cisco have revamped the CCIE program to address this shift in skill set and the need to align certifications much closer to these evolving job roles. Cisco have done this by adding in these “Human APIs” across all CCIE tracks. No longer can you afford to be isolated in a particular technology track with no consideration of the bigger picture. Sure, you still need to be an expert in your chosen technology, but there is now a framework common to all tracks, made up of skills that are essential in the modern day consultant or engineer.

So what’s changing?

Cisco is adding in a common framework around these evolving technologies (E.Ts) across all CCIE tracks. This framework will make up 10% of the CCIE Written Exams, with the other 90% focused on the specific track. The Lab exams, however, will not include these E.Ts and remain 100% focused on the particular track.


As Cisco always give at least 6 months’ notice of any blueprint changes, this additional common Evolving Technologies section will come into effect in the written exams in July 2016.


CCIE Cloud?

I’m sure like me, when you see the Certification road map below, you cannot help but notice the missing box in the top right hand corner.


This has led to the speculation of an imminent CCIE Cloud track, incorporating Nexus 9000 and Cisco Application Centric Infrastructure (ACI). I for one was certainly hoping for one, in order to pursue adding a 3rd CCIE to my resume. This however is not the case, but I’m sure this news will come as a great relief to my wife.

However, Nexus 9K and Cisco ACI will be added to the CCIE Data Center track in the CCIE DC 2.0 refresh due July 2016, which makes sense, and updates the DC Lab blueprint in line with the skills required to design, implement and integrate a cloud ready data center.

UCS Director, UCS Central, REST API, and Python are all now listed in the ‘Data Center Automation and Orchestration’ section of the Written and Lab Blueprints. Up until now these have never been covered in the CCIE DC program, but they do form a huge part of almost every discussion I have with my clients, so it is great to see they are now included, aligning the certification much closer to the actual modern day job role.

The Data Center Lab 2.0 exam will also add a 60min “Diagnostic” section in which no console access is given but you need to ascertain likely causes of issues from various evidence trails like E-mails, diagrams and screenshots. This will be followed by the 7hr Troubleshooting and Configuration section. You need at least a minimum score in both sections to pass the overall exam.

Security Everywhere

Scarcely a day goes by without the press reporting a hacking attempt or a security compromise at a household-name company.

The fact is, as we embrace the innovation that cloud brings along with the borderless network and IoT, security can no longer be thought of as a set of products but instead it needs to be a mind set and be holistically integrated throughout the entire solution.

The threat landscape is ever changing: cyber criminals, along with deploying very advanced techniques, now have widely distributed attack surfaces to target.

This coupled with the fact that there is a huge industry shortage of trained security professionals has led Cisco to also revamp the CCNA Security certification to include all these modern security concerns, by expanding the scope of the certification to cover topics like Cloud, Web and Virtualisation Security, BYOD, ISE, Advanced malware protection as well as including the FirePOWER and FireSIGHT product portfolio.

All this begins at the associate level and rightly so as Security needs to be woven into everything that we do, security needs to be everywhere.

Final Thought

I for one welcome this announcement from Cisco, as there are an awful lot of traditional networkers out there wondering what skills they will need to stay relevant in the new world order of software defined networking and Cloud.  And it’s great to see Cisco addressing the needs of its core advocates and bringing them on this new and exciting journey with them. Once again Cisco raises the bar for industry talent at every level.



Why did “Renaming” an unused VLAN bring down my entire production management environment?

I was recently asked to investigate an issue which on the face of it sounded very odd.

We had installed a fairly large FlexPod environment running VMware NSX across a couple of datacentres.

During pre-handover testing the environment suffered a complete loss of service to the vSphere and NSX management cluster.

When I asked if anything had been done immediately prior to the outage, the only thing they could think of was that a UCS admin (to protect his identity let’s call him “Donald”) had renamed an unused VLAN which had no VMs in it, so this was almost certainly not the cause and just a coincidence.

Hmmmm I’ve never really been one to believe in coincidences, and armed with this information, I had a pretty good hunch where to start looking.

As I suspected both production vNICs (eth0 & eth1) of all 3 hosts in the management cluster were now showing as down/unpinned in UCS manager.

This was obviously why the complete vSphere production and VMware NSX management environment was unreachable: all the management VMs, including vCenter and NSX Manager, along with an NSX Edge Services Gateway (ESG) that protected the management VLAN, resided on these hosts, all of which had effectively just had their networking cables pulled out.

So what had happened?

As you may know, you cannot rename a VLAN in Cisco UCS, so Donald had deleted VLAN 126 and recreated it with the same VLAN ID but a different name (“spare” in this case). This wasn’t perceived as anything important, as there were not yet any VMs in the port-group for VLAN 126.

Donald then went into the updating vNIC template to which the 3 vSphere management hosts were bound and added in the recreated VLAN 126.

And that is when all management connectivity was lost.

The issue was that, as per best practice when using vPCs on Cisco UCS with NSX, there were two port-channels northbound from each Fabric Interconnect. One carried all the production VLANs and connected to a pair of Nexus switches running virtual port-channels (vPC). The other was a point-to-point port-channel carrying the VLAN used for the layer 3 OSPF adjacency between the Nexus switches and the virtual NSX Edge Services Gateways (ESGs), as running a dynamic routing adjacency over a vPC is not supported.

So obviously VLAN groups have to be used to tell UCS which uplinks carry which VLANs (just like a disjoint L2 setup).

Cisco UCS then compares the VLANs on each vNIC to those tagged on the uplinks and thus knows which uplink to pin the vNIC to.

Unfortunately, as Donald found out, this is an “all or nothing” deal: unless ALL of the VLANs on a vNIC exist on a single uplink, that entire vNIC and ALL its associated VLANs will not come up or, as in this case, will just shut down.

So when VLAN 126 was deleted and recreated with a new name, this new VLAN did not exist on the main vPC UCS uplinks (105 & 106). Hence all hosts bound to that updating vNIC template immediately shut down all their production vNICs (eth0 & eth1), as there was no longer an uplink carrying ALL their required VLANs to which to pin. (Cisco UCS 101 really)

As soon as I added the recreated VLAN to the vPC uplink VLAN group, all vNICs re-pinned, came up and connectivity was restored. (I could also have just removed this new VLAN from the vNIC template.) Either way, the “all or nothing” rule was now happy.
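The “all or nothing” rule and the fix can be modelled in a few lines of Python. This is a deliberately simplified sketch of the behaviour as described in this post, not how UCS Manager implements pinning. The function `pin_vnic`, the uplink names and every VLAN name other than “spare” are hypothetical.

```python
from typing import Dict, Optional, Set


def pin_vnic(vnic_vlans: Set[str], uplinks: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first uplink that carries every VLAN on the vNIC, else None."""
    for uplink, carried in uplinks.items():
        if vnic_vlans <= carried:  # all or nothing: must be a full subset
            return uplink
    return None  # no single uplink carries them all, so the vNIC goes down


# Hypothetical VLAN groups: one for the vPC uplinks, one for the OSPF link.
uplinks = {
    "vpc-uplinks": {"mgmt", "vmotion"},
    "ospf-uplink": {"ospf-peering"},
}

vnic = {"mgmt", "vmotion"}
before = pin_vnic(vnic, uplinks)

# Donald adds the recreated VLAN ("spare") to the vNIC template only.
vnic_after_change = vnic | {"spare"}
during_outage = pin_vnic(vnic_after_change, uplinks)

# The fix: add the recreated VLAN to the vPC uplink VLAN group.
uplinks["vpc-uplinks"].add("spare")
after_fix = pin_vnic(vnic_after_change, uplinks)

print(before, during_outage, after_fix)
```

The subset test is the whole story: one VLAN missing from the uplink takes down every VLAN on that vNIC, not just the new one.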

As per best practice, all the client’s user workloads and supporting vSphere and NSX infrastructure were located on different vSphere clusters and thus were unaffected by this outage.

There are numerous ways to avoid the above issue. For example, you could take out the vPC element and just have a single-homed port-channel carrying all VLANs from FI A to Nexus A and the same from FI B to Nexus B.

Or, as was done in this case, the run book was updated and everyone informed that VLAN groups are in use in this environment, so a newly created VLAN must be added to the relevant VLAN group before it is added to the vNIC template.

I would like to see a feature added to UCS that changes this behaviour to perhaps only isolate the individual VLAN(s) rather than the whole vNIC, but I can think of a few technical reasons as to why it likely is how it is. Failing that, at least a warning could be added if the action will result in a vNIC unpinning.
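Until such a warning exists, the same check can be done by hand or scripted as a pre-change step in the run book. The sketch below is one hypothetical way to express it in Python; it works on plain dictionaries that you would have to populate yourself, and it is not a UCS Manager feature or API. All names in it are assumptions for illustration.

```python
from typing import Dict, List, Set


def templates_that_would_unpin(
    new_vlan: str,
    templates: Dict[str, Set[str]],
    uplinks: Dict[str, Set[str]],
) -> List[str]:
    """List the vNIC templates that would have no single uplink carrying
    all of their VLANs if new_vlan were added to them."""
    at_risk = []
    for name, vlans in templates.items():
        wanted = vlans | {new_vlan}
        if not any(wanted <= carried for carried in uplinks.values()):
            at_risk.append(name)
    return at_risk


# Hypothetical inventory of vNIC templates and VLAN groups.
templates = {"mgmt-hosts": {"mgmt", "vmotion"}}
uplinks = {
    "vpc-uplinks": {"mgmt", "vmotion"},
    "ospf-uplink": {"ospf-peering"},
}

warn_before = templates_that_would_unpin("spare", templates, uplinks)
if warn_before:
    print("WARNING: adding 'spare' would unpin:", ", ".join(warn_before))

# Run book order: VLAN group first, vNIC template second.
uplinks["vpc-uplinks"].add("spare")
warn_after = templates_that_would_unpin("spare", templates, uplinks)
```

An empty list means the change is safe to make under this simplified model; anything else is a reason to stop and fix the VLAN group first.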

In a previous post, UCS Fear the Power?, I quoted Spider-man: “with great power comes great responsibility”.

This was certainly true in this case: seemingly minor changes can have major effects if the full ramifications of those changes are not completely understood.

And before anyone comments: no, I am not Donald :-), but I have done this myself in the past so knew of this potential “Gotcha”.

But if this post saves just one UCS Admin from an RGE (Résumé Generating Event) then it was worthwhile!

Don’t be a Donald! and look after that datacentre of yours!






Cisco UCS Integration with Cisco ACI (Manual Method)

This video gives a complete walkthrough of configuring a Cisco UCS domain into a Cisco ACI Fabric, and then extending the Cisco ACI Policy into a VMware vSphere environment within that UCS infrastructure.

There has also been an update to this video showing the same configurations using the Wizards and Canvas available in later versions of Cisco ACI (Link below)


Have fun!



Where’s the Tyrrell gone?

Those that know me will be well aware that I’m no stranger to the inside of a saloon, and it was during one of these evenings that I got into a debate on Formula 1. Now, I don’t really follow Formula 1, and don’t usually have an opinion on it, but one of my comments on this occasion was something like “I miss the good old days when the cars were all different, and I always wanted the 6 wheeler to win”.

Now some of my Formula 1 following colleagues are a little younger than myself, and had no idea what I was talking about, but with the aid of a smartphone 30 seconds later they were all amazed.

The car, of course, was the Tyrrell P34.








Nowadays all cars are computer developed and optimised in wind tunnels, and not surprisingly it turns out that there is only one optimum design, which is why all the cars now look practically identical.

Sadly, gone are the days of wacky ideas and cars that truly differed from each other.

The same can be said of IT. Not all that long ago it was easy to compare vendors, mock the “wacky” ones, and promote the features that we as independent consultants felt would truly benefit our clients.

But as with race cars, in the science of IT there is generally a single optimised method of doing or making something, to which most hardware vendors are now aligned, hence the influx of merchant silicon products and commoditised integrated systems.

In the “old days” I could often be found at a white board, chatting speeds and feeds with a client, and while I occasionally still get dragged into those discussions, the fact is if we’re at that point the client is really barking up the wrong tree.

So the question is: How can we help our customers truly differentiate themselves from their competition in a world of ever increasing IT commoditisation?

Well the answer, as in Formula 1, is in the “Variables”

Taking out of the equation the odd technical failure, the key to winning most F1 races is the skill and consistency of the driver.
The driver in F1 equates to the Software stack in IT, and that is where most solutions can really differentiate themselves.

Most of my discussions with clients these days start with topics like what they consider to be their key differentiators from their competition, and the pain points and barriers that they are currently experiencing. From these discussions we can translate the business requirements into technical solutions.

An example being, telling a client we can improve the de-duplication within their storage array, may not mean much, but telling them we can increase the number of tenants they can host on the same infrastructure by 20% may sound a lot more compelling.

All this is just another example of how the IT industry is changing for the “Traditional Networker”: the differentiation lies less and less in the hardware, and increasingly in the intelligence and software that consumes it.

Fun times ahead.
