What’s that server doing?
Like me, you’ve probably been asked that question many times, usually when looking at reclaiming resources, planning migrations or simply enforcing your company’s security policy.
If those questions could not be answered in a timely manner, the old method was simply to power the workload down and see what broke or who screamed. Unfortunately, in this day and age that is no longer a reasonable approach.
In a world where we are ever moving towards network white-list models, where all flows are denied by default except those specifically allowed in the policy, visibility into what the applications are doing and how they interact with each other is now critical. The last two projects I have done enforced a network white-list policy, and mapping all the required application flows involved a lot of detective work and traffic capturing to establish what all the application dependencies were.
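To make the white-list idea concrete, here is a minimal Python sketch of a default-deny check. The `Flow` tuple, the tier names and the `WHITELIST` contents are all made up for illustration and do not come from any Cisco product.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src: str    # source application tier (hypothetical name)
    dst: str    # destination application tier (hypothetical name)
    port: int
    proto: str


# Hypothetical policy: the only flows that are explicitly allowed.
WHITELIST = {
    Flow("web", "app", 8080, "tcp"),
    Flow("app", "db", 5432, "tcp"),
}


def is_allowed(flow, whitelist=WHITELIST):
    """Default deny: a flow passes only if it is explicitly listed."""
    return flow in whitelist


print(is_allowed(Flow("web", "app", 8080, "tcp")))  # True
print(is_allowed(Flow("web", "db", 5432, "tcp")))   # False
```

The check itself is trivial. The hard part, as those projects showed, is knowing what belongs in the list in the first place.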
Customers face the same challenge when wanting to harness the full benefit of Cisco Application Centric Infrastructure (ACI): how can they define all the connectivity contracts between their applications if they do not fully understand them? There is no magic wand that can work all this out for you and make intelligent contract recommendations… or is there?
The best we have done up to now is use combinations of Syslog, SNMP, server logs and perhaps NetFlow. The issue with NetFlow is that it only takes a sampling of the data, and in a world where we are seeing 40Gbps and 100Gbps becoming the norm, NetFlow simply cannot keep up.
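A toy Python example shows why sampling is a problem when you need the complete picture. It assumes a simple deterministic 1-in-N packet sampler and an invented packet stream; real samplers and traffic mixes vary.

```python
def sampled_flows(packets, rate):
    """Return the flow IDs seen when only every `rate`-th packet is kept."""
    return {flow_id for i, flow_id in enumerate(packets) if i % rate == 0}


# Hypothetical stream: one long "bulk" flow with two single-packet flows mixed in.
packets = ["bulk"] * 10 + ["dns"] + ["bulk"] * 10 + ["ssh"] + ["bulk"] * 10

all_flows = set(packets)
seen = sampled_flows(packets, rate=8)   # keeps packets 0, 8, 16 and 24
missed = all_flows - seen

print(sorted(seen))    # ['bulk']
print(sorted(missed))  # ['dns', 'ssh']
```

The short flows never show up in the sample, and those are often exactly the dependencies you need to know about before writing a white-list.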
Those that watch Game of Thrones will be familiar with a character called the Three-Eyed Raven.
The Three-Eyed Raven is what the program calls a “Greenseer”. A Greenseer is a being that is all-knowing and has visibility into all things, past, present and future. They have the power to go back in time, replay the events of the past and use this information to influence the future.
Now I’ve always thought, how cool would it be to have a “Greenseer” within the network, an entity that is all knowing, sees everything in real time, with no compromises, and makes very intelligent recommendations based on the complete picture it has built up.
Well, today Cisco announced the Cisco Tetration Analytics system, which addresses all the above and so much more!
The Cisco Tetration Analytics system is made up of 3 components:
- Data Collection
- Analytics Engine
- User Access
Any analytics system can only ever be as good as its collectors, as without comprehensive and rich information to analyse, the process is flawed before you even begin. The Cisco Tetration Analytics system supports 3 types of collectors.
As previously mentioned, NetFlow, great though it is, does have limitations around performance and scalability. Cisco have now created an ASIC that captures all telemetry and metadata from every single packet, at line rate, with zero impact on latency, regardless of whether the packet is encapsulated or not. This functionality has been baked into the Generation 2 N9K ASICs announced last week and now shipping in the Nexus 9300EX range and Nexus 9200-X (ACI and Standalone modes). Try doing that on merchant silicon.
Now while I’m sure Cisco would love every client to have a Nexus 9300EX/9200-X edge network, the reality is there are an awful lot who will not. God forbid, these customers may not even have a Cisco-based network at all, but that of a competitor. And what about extending these analytics into a public cloud? Well, for these use cases Cisco have released a server-based agent for Windows and Linux operating systems at First Customer Ship (FCS), with more to follow along with support for containers.
These server-based sensors have been written to add minimal load to the server (circa 0.3% of CPU), but can also be restricted within strict parameters if required. The Server Sensors are also restricted to forwarding their information only to the Analytics Engine they have a mutual trust with, and all communication between them is encrypted. The management and upgrading of all these Server Sensors is handled by the Analytics Engine, so there is no additional management overhead regardless of how many servers you may have.
3rd Party Sources
The Cisco Tetration Analytics system can also collect information from 3rd party sources like load balancers and Configuration Management Databases (CMDB). This information can add valuable context to reports, for example by replacing IP addresses with CMDB-defined names.
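As a rough illustration of that kind of enrichment, the Python sketch below swaps IP addresses for names from a lookup table. The `CMDB` dictionary, the flow record layout and the `enrich` helper are hypothetical and are not Tetration's data model or API.

```python
# Hypothetical CMDB export: IP address -> asset name.
CMDB = {"10.0.0.5": "web-01", "10.0.1.9": "db-01"}


def enrich(flow, cmdb=CMDB):
    """Return a copy of the flow with IPs replaced by CMDB names where known."""
    return {
        **flow,
        "src": cmdb.get(flow["src"], flow["src"]),  # unknown IPs are left as-is
        "dst": cmdb.get(flow["dst"], flow["dst"]),
    }


flow = {"src": "10.0.0.5", "dst": "10.0.1.9", "port": 5432}
print(enrich(flow))  # {'src': 'web-01', 'dst': 'db-01', 'port': 5432}
```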
This is the brains of the system, where all the analysis is done.
As you can imagine, capturing and processing the metadata from every single packet in the network and from all servers in real time will produce a huge amount of information. The Analytics Engine is capable of processing 1,000,000+ events in real time and storing billions of records, which equates to several months of data.
The Analytics Engine is a logical appliance, prebuilt by Avnet, which comes pre-wired with all software pre-installed. A guided setup wizard asks a few questions, then has the whole system up and running and capturing data in approx. 3 hours.
The Analytics Engine is currently a 39U rack containing 3 Cisco Nexus 9300 switches and 36 Cisco UCS C220 servers. There is no option to change the number of nodes, but a half-rack 11U option is being looked at.
The Cisco Tetration Analytics system can be driven by a GUI or REST API, and can be configured to push events to other systems if certain criteria are met.
There are several modules that run on the Analytics Engine, each providing a particular function.
The Application Insights module learns everything about the applications in use, how they interact with other applications and any dependencies they have. From this, a complete application policy can be automatically created, recommended and, if approved, pushed to the infrastructure. This in itself could be worth the price of admission, as I have seen clients pay huge amounts of money and spend many months establishing a baseline before moving from a traditional blacklist to a far more secure white-list or zero trust model.
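To give a feel for the general idea of turning observed flows into a recommended policy, here is a deliberately naive Python sketch. It is not how Tetration computes its recommendations; the flow tuples, the `recommend_policy` helper and the `min_count` threshold are assumptions made purely for illustration.

```python
from collections import Counter


def recommend_policy(observed, min_count=1):
    """Collapse observed flows into unique allow rules seen at least `min_count` times."""
    counts = Counter(observed)
    return sorted(rule for rule, n in counts.items() if n >= min_count)


# Hypothetical captured flows as (source, destination, port).
observed = [
    ("web", "app", 8080),
    ("web", "app", 8080),
    ("app", "db", 5432),
    ("app", "db", 5432),
    ("laptop", "db", 22),  # seen only once, so left for a human to review
]

policy = recommend_policy(observed, min_count=2)
print(policy)  # [('app', 'db', 5432), ('web', 'app', 8080)]
```

Even in this toy form, the recommendation is only as good as the capture behind it, which is why the quality of the collectors matters so much.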
Before pushing this new policy to the infrastructure you can first assess the effect it will have, and that is the job of the Policy Impact Assessment module. Will it break some critical connectivity? Far better to find that out first. This module can even run “what if” impact assessments, i.e. what if I had pushed this policy 6 months ago? Would my application have survived the exploit that caused it to fail last month? Would the security breach we experienced previously have been prevented had this policy been in place? So as you can see, this information could be invaluable.
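The “what if” idea boils down to replaying recorded flows against a candidate policy. The Python below is a minimal sketch under that assumption; the `impact` helper, the flow history and the candidate policy are all invented, and the real module will be far more sophisticated.

```python
def impact(history, policy):
    """Replay recorded flows against a candidate policy; return (allowed, blocked)."""
    allowed = [f for f in history if f in policy]
    blocked = [f for f in history if f not in policy]
    return allowed, blocked


# Hypothetical flow history as (source, destination, port).
history = [
    ("web", "app", 8080),
    ("app", "db", 5432),
    ("backup", "db", 5432),    # legitimate, but missing from the candidate policy
    ("attacker", "db", 5432),  # the breach we would like to have stopped
]
candidate = {("web", "app", 8080), ("app", "db", 5432)}

allowed, blocked = impact(history, candidate)
print(blocked)  # [('backup', 'db', 5432), ('attacker', 'db', 5432)]
```

The blocked list answers both questions at once: the policy would have stopped the attacker, but it would also have broken the backup job, so it needs another rule before it goes live.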
The Automated Whitelist Policy module can then export this policy in many standard formats compatible with most infrastructure devices, like SDN controllers and firewalls, and as you would expect there are predefined processes for pushing these policies to the Cisco ACI APIC. Once pushed, the policy is continually monitored to ensure compliance, and if even a single packet somehow violates this policy, real-time alerts are given and an automatic remediation plan can be applied.
Over and above the obvious analogy of having complete visibility, it’s probably no coincidence that Tetration was launched on the 102nd floor of the One World Observatory in New York, as its main initial market is likely to be large financial institutions.
Back in 2008 I was called in to do some work for a huge investment bank that sadly went into administration, and while I was engaged I worked alongside a digital forensics team that spent several years piecing together exactly what happened and accounting for all the bank’s financials and assets. I can only imagine how much that would have cost, and the benefit to the administration process had they had a “digital forensics time machine” that could go back in history, show any information at any point in granular detail, and view or replay the flows that interested them. Well, the Forensics module is just that.
The use cases for Tetration are endless, whether it be a healthcare institute that needs to prove regulatory compliance, or a high-frequency trading environment that requires hop-by-hop latency analytics to ensure the optimum latency path for all flows.
Working for a Solutions Provider myself, my thoughts also turn to providing a service around Tetration Analytics, perhaps hosting the Analytics Engine off premises and deploying sensors to your clients or tenants. So my next topic of research will be: can this safely be done within the same Analytics Engine, or would you need a 1:1 ratio of AEs to customers?
Initially Tetration will have a perpetual licence based on the number of workloads being monitored rather than the amount of traffic processed, but Cisco are also looking at a more subscription-based licence model.
I haven’t seen any pricing as yet, but whatever it is, the answer from any company that appreciates, values and needs the analytics Cisco Tetration provides will be “Shut up and take my money”.