This post takes a look at some of the monitoring and alerting options available for VxRail administrators, including native, on-premises, and off-premises Software as a Service options.
The purpose of this post is not to comprehensively state everything required to monitor and manage your VxRail systems, but rather to outline the basics of what information and alerting is available within the platform, and how to gain further visibility into day-to-day operations.
When you need information, you want to be able to find and view it as quickly and easily as possible. And while nobody wants to be watching the systems all of the time, you do want to be safe in the knowledge that IF something does start to go awry, you will be notified appropriately.
This post will highlight some of the monitoring and alerting capabilities of the following components:
- vCenter Server
- vRealize Suite
  - vRealize Operations Manager (vROps)
  - vRealize Log Insight (vRLI)
- VxRail SaaS Multi-Cluster Management (MyVxRail)
Every VxRail system is managed by a single vCenter Server. The vCenter Server to VxRail relationship can be 1:1 or 1:many, and the vCenter Server may be internal to the VxRail or external (customer supplied). Either way, vCenter Server is the primary tool for managing and monitoring your VxRail!
Using the VxRail plugin in vCenter, users can access a hardware view of the appliances which will display any related alerts, as shown below:
Also via this plugin, Dell EMC Secure Remote Services (SRS) should be configured to enable the VxRail system to send health data back to VxRail Customer Support so that they may proactively address potential issues, as shown below. With the exception of Dark Sites, this SRS connection is configured as part of standard VxRail deployment services.
vCenter Server contains a wealth of performance and monitoring information for all components involved from top to bottom of vSphere. The vSphere Monitoring and Performance document provides great information on monitoring the status of the various objects, their associated performance metrics, and notification of alerts.
One of the most common areas of the platform that requires basic monitoring is capacity management, specifically storage utilisation. vCenter makes it very easy to monitor all objects by providing out-of-the-box Alarms which can be configured to trigger Email, SNMP traps, or Scripts to run in response to the Alarm, as shown below:
Custom Alarms can also be configured as required, whereby a user can determine the exact conditions under which they require an Alarm to be triggered. As shown below, the user first sets out the condition(s) for the Alarm to trigger, then determines the associated Severity, before finally selecting the Action to be taken upon the Alarm being triggered.
- IF Datastore Disk Usage is above 70%
- THEN trigger the Alarm, and show as Warning
- Send email notification, with Subject and Recipient
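The condition, severity, and action steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not how vCenter implements Alarms: the `Datastore` class, the `evaluate_alarm` function, and the recipient address are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Datastore:
    """Hypothetical stand-in for a datastore object in the inventory."""
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def usage_pct(self) -> float:
        """Percentage of capacity currently in use."""
        return 100.0 * self.used_gb / self.capacity_gb


def evaluate_alarm(ds: Datastore, recipient: str,
                   warning_pct: float = 70.0) -> Optional[dict]:
    """Return an email-style notification if usage is above the threshold."""
    if ds.usage_pct > warning_pct:          # IF Datastore Disk Usage is above 70%
        return {
            "severity": "Warning",          # THEN trigger the Alarm as Warning
            "subject": f"Datastore {ds.name} usage is {ds.usage_pct:.1f}%",
            "recipient": recipient,         # Send email with Subject and Recipient
        }
    return None                             # Condition not met, no Alarm


# Assumed example values: a 1000 GB datastore with 750 GB in use.
alert = evaluate_alarm(Datastore("vsan-datastore", 1000.0, 750.0),
                       "vxrail-admins@example.com")
print(alert)
```

In practice vCenter evaluates the condition and sends the notification for you; the point of the sketch is simply that each Alarm boils down to a condition, a severity, and an action.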
Alarms can be configured across any component in the vCenter inventory, simply by selecting the object in the navigation tree, and on the Configure Tab, check under Alarm Definitions.
This level of monitoring and alerting in vCenter is the absolute minimum for VxRail admins. Regardless of what anybody may tell you about ‘set it and forget it’ or an ‘easy button’, if you have paid out good money for your VxRail, you need to make the effort to monitor it!
While monitoring your VxRail systems via vCenter Server may be seen as the minimum effort, many customers also choose to go a bit further and leverage products within the VMware vRealize Suite to gain further visibility into their systems. Two of these products are vRealize Operations Manager (vROps) and vRealize Log Insight (vRLI). Both of these products are deployed locally in a customer datacenter, consuming resources in the same way as every other virtual machine or appliance, and connect to vCenter Server(s) as their data source.
vRealize Operations Manager (vROps)
vROps provides a wealth of information on an environment, from Health and Risk, to Capacity Management, Cost Comparisons, and Performance Optimisations. Information for vCenter, ESXi, vSAN, and Networking is presented to users via a large number of pre-built dashboards and widgets. vROps is also capable of providing multiple Views, Dashboards and Reports of VxRail and associated VxRail components via a VxRail Management Pack.
vROps is heavily customisable, offering users the ability to create their own custom widgets and dashboards to suit their own business use case. An overview of the VxRail vROps Management Pack can be found here as well as an example of how to create your own customised VxRail widget and dashboard here.
vROps provides an Alerting function similar to vCenter Server, whereby users can choose to configure alert responses (Email, SNMP trap, etc) based on pre-built or user-customised Alert Definitions.
Where possible, vROps also provides suggested remediation actions, including direction to associated Knowledge Base articles. A wide variety of Notification Methods are available to be configured based on their respective plugins for vROps, as shown below.
As of the most recent VxRail Management Pack for vROps (v1.1), a total of 205 Alert Definitions, sourced from VxRail Manager, VxRail, and iDRAC, are available out of the box, as shown below in the VxRail Management Pack Content tab:
These alert definitions and associated notifications can be applied to the VxRail systems as appropriate. Duplication of alerting between vCenter and vROps is not required, and a decision could be made to standardise all alert notifications on vROps if preferred.
vRealize Log Insight (vRLI)
vRLI comes bundled with VxRail, and provides a consolidated view of all system logs collected and forwarded from various components associated with the VxRail platform, such as vCenter, ESXi, VxRail Manager, and iDRAC. Much like vROps uses Management Packs, vRLI uses Content Packs to provide out-of-the-box integration with pre-built widgets and Dashboards. Some of the more relevant Content Packs and Dashboards include vSphere, vSAN, and iDRAC, as well as vROps (if in use).
Custom dashboards can also be created, such as the custom VxRail dashboard shown below, displaying VxRail Manager event details as a result of configuring VxRail Manager to forward its syslog to vRLI.
While monitoring logfiles may not be everybody’s idea of fun, vRLI does have a very nice integration point with vROps called ‘Launch in Context’, whereby a user in vROps can select an Alert to examine in more detail in vRLI, launching the vRLI UI directly to the associated logfiles for the object and its Alert, as shown below.
This works in the opposite direction also, where a vRLI user may select a particular event in the Interactive Analytics view and choose to view the wider context of that object in vROps, as shown below.
In the context of this post, vRLI is not intended to be positioned as an alerting tool, rather a (very powerful) log inspection tool that can assist in more detailed logfile analysis and troubleshooting as required.
VxRail SaaS Multi-Cluster Management
Last, and by no means least, we take a look at VxRail systems from a much higher vantage point via the MyVxRail portal provided by VxRail Software-as-a-Service Multicluster Management.
This is an off-premises software as a service offering, included as part of the VxRail HCI System Software, entirely decoupled from the customer datacenter, requiring zero deployment time or compute resources. The primary dependency here is that a customer has an active SRS connection back to Dell EMC support. All that a customer need do is point their browser at https://myvxrail.dell.com and enter their MyService 360 login details.
Upon login, a customer is immediately presented with a consolidated view of their entire VxRail estate, whether that is local or spread nationally or globally.
From a monitoring perspective, MyVxRail elevates the user above the various individual vCenter Servers, providing a view of all VxRail systems and vCenter Servers at the same time, as shown below, whereas a single vCenter UI will only show its respective VxRail cluster(s).
While the MyVxRail portal does not provide a direct alerting mechanism today, the information and detail available within the MyVxRail portal, backed by Machine Learning and Advanced Analytics, ranges from high level system Summary Information, simplified Health Scores, Alarms, Performance Anomaly Detection, and Capacity Planning to VxRail Lifecycle Management and Upgrades.
Purely from a VxRail Health perspective, the Health Score provided for each VxRail is based upon alarms generated from vCenter, ESXi, VxRail Manager and the iDRAC, where the Alarms are individually weighted relative to their severity in the overall context of the VxRail system. Where possible, an associated Knowledge Base article will be provided for each System Issue.
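To make the idea of severity-weighted alarms a little more concrete, here is a minimal sketch of how a score of this kind could be calculated. To be clear, this is not MyVxRail's actual algorithm: the weights, the 0 to 100 scale, and the `health_score` function are all assumptions made up for illustration.

```python
from typing import Dict, List

# Hypothetical severity weights. The real MyVxRail weighting is not
# described in this post, so these values are assumptions only.
SEVERITY_WEIGHTS: Dict[str, int] = {"critical": 25, "warning": 10, "info": 1}


def health_score(alarms: List[Dict[str, str]]) -> int:
    """Start from 100 and subtract a weighted penalty per active alarm."""
    penalty = sum(SEVERITY_WEIGHTS[alarm["severity"]] for alarm in alarms)
    return max(0, 100 - penalty)  # Never drop below zero


# Assumed example alarms from the sources mentioned above.
active_alarms = [
    {"source": "iDRAC", "severity": "critical"},
    {"source": "vCenter", "severity": "warning"},
    {"source": "VxRail Manager", "severity": "info"},
]
print(health_score(active_alarms))  # 100 - (25 + 10 + 1) = 64
```

The takeaway is that a single critical alarm pulls the score down far more than several informational ones, which is what lets a simplified score point you at the system that needs attention first.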
This post does not intend to cover ALL features of MyVxRail but please do read more about MyVxRail’s Fix This First approach, how it helps detect Performance Anomalies, and what’s new with the latest release.
MyVxRail is complementary to vCenter Server and vROps, taking the lead for overall monitoring of VxRail systems from a top-down approach. The intention is not that the MyVxRail portal is a one-stop shop for real-time monitoring of your VxRail environment, rather it’s intended to be the first stop, from where the user can quickly view the status of their VxRail estate, and quickly identify where further investigation is required.
And that is where vCenter, vROps and vRLI come back into play, where more detailed performance analysis or troubleshooting is required. In many cases vCenter Server will be sufficient as the next step, and VxRail Customer Support are always available too, but other tools such as vROps and vRLI are there to provide more detailed analysis!
Some customers may not have the in-house skillsets or resources to deploy vROps, or maybe they just do not have the use case today. It’s not an Either/Or decision between MyVxRail and vROps. While both address some common high level use cases, both are very different in what they ultimately provide in terms of features and overall product direction.
And let’s not forget that VxRail Customer Support are always available to help out! One of the goals of the MyVxRail portal is to give VxRail customers more insight into their VxRail operations, so that if and when a customer does need to contact Customer Support, they have better visibility into the issues.
BOTTOM LINE: Regardless of whether it is vCenter Server, vROps, or anything else that is used for alerting, it is vital that users configure appropriate system thresholds, alerts and notifications for their VxRail.
Hope that helps,
Update: I was later fortunate enough to be invited onto the VMware Community Podcast to discuss this blog post with the guys. It was a fun chat. You can watch it below: