Published August 1st 2019
VxRail ACE provides the ability to identify performance anomalies on the VxRail system, while at the same time learning which anomalies are to be expected as the norm.
// Update 08/10/2020: VxRail ACE has been re-branded to VxRail SaaS Multi-cluster Management via the MyVxRail portal.
First, a quick Google of ‘Anomaly’ (only after I mis-typed it several times, as usual):
Odd, Strange and Peculiar are descriptions that a lot of us will have used when troubleshooting performance issues over the years!
An Anomaly doesn’t necessarily mean that something is Wrong. What VxRail ACE does is identify something in the system performance that is different from what it has seen before, where ‘what it has seen before’ is considered the operating norm.
VxRail ACE is always learning, always using the VxRail system data-points to build a picture of how the system is operating and performing, how it may be optimised, and how to be pro-active about system health.
In the VxRail ACE UI (existing VxRail customers go to https://myvxrail.dell.com), expand and select the VxRail Appliance in the navigation pane, then click the Performance tab and select Performance Details. You can view a Custom time range if required, as shown below.
When examining the performance metrics, be aware that the Anomalies option may not be available to you. Not to worry: this simply means that no Anomalies were detected in the selected time range.
See the example below where Anomalies are available to view for VxRail Memory and Disk, but not for Networking.
An Anomaly will be highlighted (in Blue) for its duration across the Performance Timeline. It is often easier to zoom into the Anomaly. Just select and drag the cursor in the graphic as required.
The example above highlights a specific increase in the Usage Average for Disk on the VxRail. The metrics for each VxRail node can be displayed from the data-point, immediately showing us that it is the disk usage on the first VxRail node (vxrail-node-01) that has spiked above the expected or known norm.
This Anomaly can be viewed from the perspective of the Disk Latency metric, as shown below. This again identifies the first VxRail node (vxrail-node-01) as the node hosting the increase in the performance metric being highlighted as an Anomaly.
Aside from any specific performance problem, an Anomaly could be caused initially by something like a spike in resource utilisation from a VDI boot storm or a heavy processing window. VxRail ACE will learn that these types of performance spikes are to be expected, maybe as part of a weekly or month-end event. Once VxRail ACE recognises these events as the operating norm, then they will no longer be highlighted or identified as Anomalies.
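To make the "learning the norm" idea a bit more concrete, here is a minimal sketch in Python. To be clear, this is not how VxRail ACE is implemented (the actual algorithm is not described in this post); it is just a simple illustration, assuming a baseline kept per recurring time slot and a basic deviation check. The class name `SeasonalBaseline`, the `observe` method, the threshold, the window size and the sample values are all hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean, stdev


class SeasonalBaseline:
    """Hypothetical baseline tracker; NOT the VxRail ACE algorithm."""

    def __init__(self, window=8, threshold=3.0, min_samples=4):
        self.threshold = threshold
        self.min_samples = min_samples
        # One short rolling history per recurring slot, e.g. ("Mon", 9).
        self.history = defaultdict(lambda: deque(maxlen=window))

    def observe(self, slot, value):
        """Return True if value is unusual for this slot, then learn from it."""
        past = self.history[slot]
        anomalous = False
        if len(past) >= self.min_samples:
            mu = mean(past)
            # Floor the spread at 1.0 so a perfectly flat history
            # does not flag every tiny change.
            spread = max(stdev(past), 1.0)
            anomalous = abs(value - mu) > self.threshold * spread
        past.append(value)  # the new value becomes part of the norm
        return anomalous


baseline = SeasonalBaseline()
slot = ("Mon", 9)  # assumed slot: Monday 09:00, e.g. a weekly VDI boot storm
weekly_usage = [20, 22, 21, 19, 90, 88, 92, 91]  # made-up usage percentages
flags = [baseline.observe(slot, v) for v in weekly_usage]
print(flags)  # [False, False, False, False, True, False, False, False]
```

The first spike to 90 is flagged because it is far from anything seen before in that slot. Once it is part of the history, the same spike in later weeks is treated as the operating norm and is no longer flagged.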
In terms of frequency of data points, the VxRail ACE Adaptive Data Collector (ADC), a service residing local to the VxRail nodes, polls and collects information every 5 minutes. This telemetry data package is then aggregated and sent back to the VxRail ACE platform over the secure SRS transport mechanism. More info here
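As a quick bit of arithmetic on what a 5-minute polling interval means for the graphs, the sketch below counts how many data points land in a given time range. It assumes exactly one sample per metric per poll and no missed polls, which is my simplification rather than something stated about the ADC; the helper name `samples_in` is made up for illustration.

```python
from datetime import timedelta

POLL_INTERVAL = timedelta(minutes=5)  # from the post: ADC polls every 5 minutes


def samples_in(period):
    """Number of whole polling intervals that fit in the given period."""
    return period // POLL_INTERVAL


print(samples_in(timedelta(hours=1)))  # 12
print(samples_in(timedelta(days=1)))   # 288
print(samples_in(timedelta(weeks=1)))  # 2016
```

So a one-day Custom time range would hold up to 288 points per metric under these assumptions.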
Hope that helps. Personally, I am traumatised by the fact that I mis-typed ‘anomaly/anomalies’ every.single.time during this post. Are these fingers even mine? I’m not sure!
For more VxRail ACE information, please see VxRail ACE – A Technical Overview