Thursday, April 12, 2012

Troubleshooting HP Systems Insight Manager (HPSIM) High CPU Usage

Scenario:


A service provider monitoring 750+ systems on a single HP SIM box experiences high CPU utilization (close to 100%, or 3 GHz), mainly due to the processes sqlservr.exe and mxdomainmgr.exe. The HP SIM box runs in a VMware environment and has 1 vCPU. As in all good environments, they don't want to just chuck resource at it, especially as it is only a monitoring box and otherwise works fine, albeit a bit slowly. Instead, they want to see whether some performance optimization can be done.


Resolution:


Optimize the HP SIM Scheduled tasks


1: Log in to HP SIM at https://<HPSIM Server>:50000 with an authorized account
2: From the menu bar → Tasks & Logs → View All Scheduled Tasks...







3: Enable the 'Delete Events Older Than 90 Days' task (by default it runs once a week, which is fine)





4: Configure the task 'Hardware Status Polling for non Servers' to poll using the schedule guidelines below


Number of systems -> Hardware Status Polling for non Servers


Less than 250 -> Use default of 30 minutes
250 to 500 -> Change to 1 hour
501 to 2000 -> Change to 2 hours
2001 to 5000 -> Change to 4 hours or greater


5: Configure the task 'Hardware Status Polling for Servers' to poll using the schedule guidelines below


Number of systems -> Hardware Status Polling for Servers


Less than 250 -> Use default of 5 minutes
250 to 500 -> Change to 15 minutes
501 to 2000 -> Change to 30 minutes
2001 to 5000 -> Change to 1 hour


*These are based on the suggested polling intervals from HP
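
For anyone managing several HP SIM boxes, the two guideline tables can be written down as a small lookup. This is only an illustrative Python sketch: the names NON_SERVER_POLLING, SERVER_POLLING and suggested_interval_minutes are made up for this example and are not part of HP SIM. The intervals still have to be set by hand in the scheduled task.

```python
# Polling-interval guidelines transcribed from the two tables above.
# Each entry is (upper bound on number of systems, interval in minutes).
# These names are hypothetical helpers, not anything HP SIM provides.
NON_SERVER_POLLING = [(249, 30), (500, 60), (2000, 120), (5000, 240)]
SERVER_POLLING = [(249, 5), (500, 15), (2000, 30), (5000, 60)]


def suggested_interval_minutes(system_count, table):
    """Return the suggested polling interval in minutes for a system count."""
    if system_count < 0:
        raise ValueError("system_count must be non-negative")
    for upper_bound, minutes in table:
        if system_count <= upper_bound:
            return minutes
    # The guidelines stop at 5000 systems, so refuse to guess beyond that.
    raise ValueError("the guidelines only cover up to 5000 systems")


# The scenario above: 750+ systems on a single HP SIM box.
print(suggested_interval_minutes(750, NON_SERVER_POLLING))  # 120
print(suggested_interval_minutes(750, SERVER_POLLING))      # 30
```

For the 750-system box in the scenario, this gives 2 hours for non-servers and 30 minutes for servers. Note that for the largest band the non-server figure of 240 minutes is a minimum ("4 hours or greater").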


Note: Configure the tasks so that their schedules do not cross over; don't have both running at the same time.
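
One rough way to sanity-check the no-crossover rule is to test whether two repeating schedules can ever start in the same minute. The Python sketch below assumes each task starts at a fixed offset (in minutes) and then repeats at its interval; the function name and the offsets are hypothetical, and it only compares start times, so a long-running poll could still overlap the other task.

```python
from math import gcd


def starts_coincide(period_a, offset_a, period_b, offset_b):
    """Return True if two periodic tasks ever start in the same minute.

    Periods and offsets are in minutes. Task A starts at
    offset_a + i * period_a and task B at offset_b + j * period_b.
    The two share a start minute exactly when the difference between
    the offsets is a multiple of gcd(period_a, period_b).
    """
    return (offset_a - offset_b) % gcd(period_a, period_b) == 0


# Servers every 30 minutes, non-servers every 2 hours, both starting on the hour:
print(starts_coincide(30, 0, 120, 0))   # True, the start times collide

# Same intervals, but the non-server task is shifted by 15 minutes:
print(starts_coincide(30, 0, 120, 15))  # False, the start times never collide
```
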
The chart below shows CPU usage before and after the changes.







Postscript:


i: Although this was done on HP SIM 5.3, the same applies to other versions of HP SIM (HP SIM 5.X, HP SIM 6.X ...).


ii: The service provider in question runs 5 Insight boxes, managed by different internal teams, monitoring over 2500 systems.


iii: Semi-related: if operators report that the web UI is slow, try disabling script scanning in the workstation's anti-virus (at least for the internet browser process).
