I am sure I’m opening up a huge can of worms with this post, but all those who know me will know that I am never one to shy away from controversy or from encouraging debate.
In my role as Subject Matter Expert for Cisco UCS and integrated systems, I am often asked by customers for my views on how best to monitor them and the applications that run on them. My historical response was something like “What is it you use now? If you’re happy with it, we’ll look to integrate Cisco UCS into your existing solution” or perhaps “Just use UCS Manager for the UCS components and use something else for workload and application monitoring.”
The fact is, Cisco UCS, and other converged offerings for that matter, have heralded a new age in how workloads are delivered, operated and managed within a Data Center, across Data Centers and across Cloud infrastructures, whether Private, Public or Hybrid.
And if this is a new age of converged and Cloud offerings, surely we need a new age of monitoring solutions for them. Just because a customer or vendor has always done something a particular way does not necessarily mean that they should just carry on in that fashion. Customers deserve, and indeed are demanding, better!
There’s a lot to be said for the view of “What good is it being alerted that I have a CPU running hot, or errors increasing on a particular DIMM indicating a possible imminent failure? I want to know what impact that is having, or could have, on my application or service.”
This mind-set is ever more relevant now that we are in a world of statelessness and Cloud, where workloads can be mobile and running across infrastructure that may or may not be under your own control.
Having this granular and detailed visibility into the Cloud is essential in this new age of ever-increasing demands to reduce cost and increase efficiency through consolidation and multi-tenancy.
My intention with this post is to have a comparison summary of different monitoring solutions I have installed, tested or played around with.
Now each monitoring product will get a blog post in its own right, but as I test a new one I will add it to this comparison summary for a nice, quick, single point of reference.
Comparing monitoring solutions is always difficult, as there are few instances where there can be a genuine “Apples with Apples” comparison. Each generally has its own strengths, weaknesses and focus. Some may monitor the hardware, some won’t; some may only monitor the hosts and not the guests; some may focus more on the application, and so on.
Rather than have numerous sub-categories, I will list all the solutions I have tested in the same comparison table and score each accordingly, which should make it pretty obvious which solutions fit where.
It is certain that not all of these solutions compete with each other, but there are undoubtedly overlaps in many cases. Some may complement each other and integrate nicely, others may not. Whichever way you slice it, it may well be that to get the full solution you want, you require a combination of products.
So, all that said, the first “yardstick” I will be looking at is what these products give me over and above what I get with UCS Manager and UCS Central, which, as I’m sure you will agree, are great at monitoring the Cisco UCS hardware, components and configuration.
So, as a starter for 10, I have listed my own view on where UCS Manager (including the added functionality of UCS Central) sits as the first column, against which all future solutions can be “compared and contrasted”.
The second is how much functionality I get “out of the box”. Now, I’m no developer, script or API guru, so while I have seen some immense bespoke monitoring solutions fronted by cool bespoke apps, I have neither the time nor the skill to go to that level with my testing. I want to click install and then start getting some cool, useful information. OK, perhaps that’s a bit unrealistic, but you get the point.
I will also look at cost and licensing. For me it’s a simple equation: cost should = value. If I get a lot of value from a product over that of UCSM and UCS Central, then I look at the cost, and if that cost is reasonable for the value I get, then in my book that’s a viable product.
I’m sure most of these products will come with reams of info on how they reduce TCO and the ROI they promise, usually by reducing troubleshooting time or identifying an issue before it becomes an issue (proaction rather than reaction). The engineer in me has never cared much for marketing, only for the facts and tangible results.
I will also look at how these products may aid compliance with either internal or regulatory policies and standards, such as PCI DSS.
So stay tuned; I am hoping to write up the review of the first product over the next week or so.
I have posted the spreadsheet below with the solutions and categories that have come to mind for testing and scoring, so that the community has a chance to give me their comments on them and suggest additions or alterations prior to the testing.
I’m not sure of the timescales on this, as I am kept very busy with my day job, but every so often I get some lab time to do some testing or, even better, a booking to design or install one of the products I am evaluating.
So I see this as an ongoing project.
And as always, this is just my view and my testing. No vendors have sponsored any of these tests or will influence any of my opinions, and I will try to minimise any harm to animals during my research.