A very basic rule of thumb when it comes to system monitoring is that Red is Bad, and Green is Good. Yellow/Orange/Amber, then, means that something is not quite right and needs attention before it goes Red.
The VxRail ACE Health Score helps customers prioritize those issues that need attention.
// Update 08/10/2020 VxRail ACE has been re-branded to VxRail SaaS Multi-cluster Management via the MyVxRail portal
VxRail ACE initially provides an overall VxRail system health score, from where the user can identify issues requiring attention.
The Health Score Overview tab displays 3 primary sections:
- Health Score Summary
  - Indicates the overall Red / Yellow / Green status of the system
  - Categorized by Datacenters, Clusters, and Hosts
- Health Score Issues
  - Categorizes the Issue type
- Storage Capacity
  - Categorizes the Issue type
- Health Score Timeline
  - Charts the rise/fall of issues over a user-defined period of time
  - Highlights the top offending VxRail Cluster
  - Displays the overall Health Score figure
So looking at that overall Health Score figure, we want to understand what is impacting the desired score of 100, that figure which brings ultimate peace and happiness.
By selecting the Health Score Details tab, or the View Details hyperlink, VxRail ACE will present us with the offending issues.
When selecting any one of the listed issues, VxRail ACE will provide details of the issue, its source, as well as guidance to resolve it.
Taking a closer look at each line item, we can see that each item/issue has an associated Impact number. Hover the mouse over the i icon next to Impact for an explanation of the weighting of this number.
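To make the relationship between the Impact numbers and the overall figure a little more concrete, here is a rough sketch in Python. It is not VxRail ACE's actual calculation. It simply assumes the score starts at a baseline of 100 and that each issue's negative Impact is added to it, with the result never dropping below 0. The function name `health_score` and the sample weightings are made up for illustration.

```python
def health_score(impacts, baseline=100):
    """Illustrative only: assumes the score is the baseline plus the sum
    of the (negative) Impact weightings, floored at 0. The real VxRail ACE
    formula may well differ."""
    return max(0, baseline + sum(impacts))


# Hypothetical Impact weightings for three open issues.
open_issue_impacts = [-1, -2, -5]

print(health_score(open_issue_impacts))  # 92
print(health_score([]))                  # 100, nothing is dragging the score down
```

Under that assumption, clearing every listed issue is what brings the figure back to 100.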
While this explains what issues are causing our overall Health Score to decrease, the more important detail here, and the Main Takeaway from this post, is that VxRail ACE is telling us WHICH issue to remediate first. With multiple issues being flagged, the user needs to understand which issue should be prioritized.
Many times we see issues that we recognize and know how to resolve, so we take action against those issues first. Low-hanging fruit, and so on. But that’s not necessarily what we should do in the greater scheme of things.
Rather than take a whack-a-mole approach to resolving individual issues as they pop up, what VxRail ACE Health Score is telling us to do is to fix the item with the heaviest weighting first, as THAT issue is having the greatest impact on the system.
When we see weightings of -1, -2, and -5, VxRail ACE will present the -5 weighted issue as the top priority to be remediated. That issue is causing the most damage, or presents the most risk to the system.
For example, a system may require more capacity but also require a code upgrade. VxRail ACE will indicate which should be resolved first, by placing a greater weighting on the priority issue.
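The "heaviest weighting first" idea can be sketched in a few lines of Python. This is only an illustration of the ordering, not anything exported by or run against VxRail ACE. The `Issue` class, the `prioritize` function, and the issue names are all hypothetical, and the weightings are just the -1, -2, and -5 from above.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str    # hypothetical label for the issue
    impact: int  # negative weighting; more negative means heavier impact


def prioritize(issues):
    """Return the issues ordered heaviest weighting first (most negative Impact)."""
    return sorted(issues, key=lambda issue: issue.impact)


# Made-up issues, listed in the order we happened to notice them.
issues = [
    Issue("Hypothetical issue A", -1),
    Issue("Hypothetical issue B", -5),
    Issue("Hypothetical issue C", -2),
]

for issue in prioritize(issues):
    print(f"{issue.impact:>3}  {issue.name}")
```

The point of sorting rather than working top to bottom is that the issue we spotted first (A, at -1) ends up last, and the -5 item moves to the front of the queue.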
Today, these issues should be actioned and resolved through their respective usual/native channels.
Looking forward, once VxRail ACE provides Active Management, the user will be able to take action against these issues directly from the VxRail ACE portal. At the time of writing, VxRail ACE is only available in Early Access mode, which does not include Active Management.
That’s it for this post on what your VxRail ACE Health Score is really telling you. Hope that helps!
For more VxRail ACE information, please see VxRail ACE – A Technical Overview