While working in a customer environment, I found some wmiprvse processes consuming a lot of CPU time. This customer runs most of its workload on large bare-metal servers, so each OS instance hosts more users than usual (between 80 and 150). WMI is quite complex to troubleshoot because it is a “black box” called by almost every part of the environment (user or system processes, monitoring tooling, inventory software…). After a few hours (days?) spent on it, I had to acknowledge that I needed help from “someone who knows better”. When I think about this kind of knowledge, my first thought is to call my fellow CTP Remko (https://www.linkedin.com/in/remkoweijnen)! On top of being a good friend, he is the guy I can listen to for hours talking about reverse engineering or API hooking!
The first thing I provided to Remko was this screenshot:
Basically, it shows an error event in the WMI-Activity event log reporting an “unknown failure” for this WMI query:
Select * from Win32_OperatingSystem
This query looks pretty simple, as it requests all properties of the Win32_OperatingSystem WMI class (https://docs.microsoft.com/en-us/windows/win32/cimwin32prov/win32-operatingsystem). If you run it on your laptop or on a machine with few users, it shouldn’t take more than a few milliseconds. When we tried it on a loaded server, it took between 15 and 20 seconds. That is because this class includes the number of processes running on the machine (the NumberOfProcesses property), and the query needs to count each running process. As a reference, we measured similar execution times for “select * from win32_operatingsystem” and “select * from win32_process”.
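To reproduce this kind of comparison, here is a minimal Python sketch. It is not what we used on site: `time_call` and `run_wql` are hypothetical helper names, and the sketch assumes PowerShell with the `Get-CimInstance` cmdlet is available on the server. Each measurement includes PowerShell's own startup time, so only differences of several seconds are meaningful.

```python
import platform
import subprocess
import time


def time_call(run, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` calls to `run`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        best = min(best, time.perf_counter() - start)
    return best


def run_wql(query):
    """Run a WQL query through PowerShell's Get-CimInstance (Windows only)."""
    command = [
        "powershell", "-NoProfile", "-Command",
        f'Get-CimInstance -Query "{query}" | Out-Null',
    ]
    subprocess.run(command, check=True)


# Only attempt the real queries on Windows; elsewhere the helpers stay importable.
if __name__ == "__main__" and platform.system() == "Windows":
    for query in ("SELECT * FROM Win32_OperatingSystem",
                  "SELECT * FROM Win32_Process"):
        seconds = time_call(lambda: run_wql(query))
        print(f"{seconds:8.3f}s  {query}")
```

Taking the best of several runs reduces noise from other activity on a busy multi-user server.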
Looking at the event more closely, you can see an “IDProcessusClient” field (the French localization of ClientProcessId), and this is the real surprise here! After checking in Task Manager which process was behind this query, we found out it was winword.exe.
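If you have many of these events to go through, extracting the client process ID by hand gets tedious. The sketch below parses a message made of `key = value` pairs separated by semicolons. The `SAMPLE` text is a made-up example modelled on that layout, not a real event from this customer, and the function names are hypothetical. It assumes the query text itself contains no semicolon.

```python
# Hypothetical sample message (machine, user and PID are invented).
SAMPLE = (
    "ClientMachine = SERVER01; User = CONTOSO\\jdoe; ClientProcessId = 4242; "
    "Operation = Start IWbemServices::ExecQuery - root\\cimv2 : "
    "Select * from Win32_OperatingSystem"
)

# The field name is localized: English and French variants seen in this post.
PID_KEYS = ("ClientProcessId", "IDProcessusClient")


def parse_event(message):
    """Split a 'key = value; key = value' message into a dictionary."""
    fields = {}
    for part in message.split(";"):
        key, separator, value = part.partition("=")
        if separator:  # skip fragments that contain no '='
            fields[key.strip()] = value.strip()
    return fields


def client_pid(fields):
    """Return the client process ID as an int, or None if it is missing."""
    for key in PID_KEYS:
        if key in fields:
            return int(fields[key])
    return None


if __name__ == "__main__":
    event = parse_event(SAMPLE)
    print(client_pid(event), "->", event["Operation"])
```

Once you have the PID, `tasklist /FI "PID eq 4242"` (with your own PID) gives the process name from a command prompt, as long as the process is still running.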
What? Microsoft Word is doing WMI queries? But why?
This was the part of the question I was not able to answer, and Remko suggested that I run a troubleshooting tool: API Monitor (http://www.rohitab.com/apimonitor)
From the official website: “API Monitor is a free software that lets you monitor and control API calls made by applications and services. It’s a powerful tool for seeing how applications and services work or for tracking down problems that you have in your own applications.”
There is a filter that lets you track WMI activity, so I just selected “Windows Management Instrumentation (WMI)”:
To start monitoring a process, you just need to press “Ctrl+M” and select the process you want to launch (here Word):
After clicking OK, the process is started and API Monitor shows you the API calls:
Boom! We now know the name of the DLL responsible for the WMI query (mso20win32client.dll), and it seems that it’s not the only WMI query Word is doing during the launch process!
Interesting fact: this capture is from version 1902 of Office. We did some testing with 2002 and 2009, and here are the results for the “Win32_OperatingSystem” WMI class:
- 1902 – Select * from Win32_OperatingSystem is run at every launch of any Office component (tested: Outlook, Word, Excel, PowerPoint, Skype)
- 2002 – No trace found of any call to the Win32_OperatingSystem WMI class (same tested)
- 2009 – SELECT BuildNumber, Caption, Version FROM Win32_OperatingSystem at every launch of any Office component (same tested) (as you can see, there is no more reference to NumberOfProcesses)
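The difference between the 1902 and 2009 queries can be checked mechanically. Here is a small Python sketch that extracts the property list from a WQL `SELECT` and flags queries that would pull in NumberOfProcesses. The function names are hypothetical, the regular expression only covers simple queries like the two above, and the conclusion assumes the provider only computes the properties that are requested.

```python
import re

# The two queries observed in the captures, keyed by Office version.
QUERIES = {
    "1902": "Select * from Win32_OperatingSystem",
    "2009": "SELECT BuildNumber, Caption, Version FROM Win32_OperatingSystem",
}

# Matches simple "SELECT <properties> FROM <class>" statements, case-insensitive.
SELECT_RE = re.compile(r"^\s*select\s+(.+?)\s+from\s+(\w+)", re.IGNORECASE)


def selected_properties(wql):
    """Return (class_name, properties); properties is None for 'SELECT *'."""
    match = SELECT_RE.match(wql)
    if match is None:
        raise ValueError(f"not a simple SELECT query: {wql!r}")
    columns, wmi_class = match.groups()
    if columns.strip() == "*":
        return wmi_class, None
    return wmi_class, [column.strip() for column in columns.split(",")]


def touches_process_count(wql):
    """True if the query asks for NumberOfProcesses, explicitly or via '*'."""
    _, properties = selected_properties(wql)
    if properties is None:
        return True
    return any(name.lower() == "numberofprocesses" for name in properties)


if __name__ == "__main__":
    for version, query in QUERIES.items():
        verdict = "expensive" if touches_process_count(query) else "cheap"
        print(f"Office {version}: {verdict}")
```

The same check could be run over every query captured in API Monitor to spot other `SELECT *` calls worth a closer look.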
So depending on the version, the WMI query is ugly, absent or optimized! In terms of performance, it means that depending on your Office version, the CPU time for the WMI provider will vary from a few milliseconds to 30 seconds. That is a huge impact!
For this specific customer, the migration to a newer version of Office still has to be validated, so I was a bit disappointed, but again, Remko had something to propose!
With some kind of magic (and a lot of skill), he managed to provide me with an “updated” DLL containing the “optimized” WMI query instead of the ugly one!
Just a part of our WhatsApp discussion to show you how it is to work with Remko:
It has been a lot of fun and I learnt a lot… Thanks again Remko for your help.
In the next post, I’ll show you the next step of my way to WMI troubleshooting! It will use a script from Remko:
A teaser from Twitter:
Why on hell Winword.exe needs to query the Win32_OS* and Win32_Disk* classes through WMI? The first one is long because it's retrieving all processes from every single session and the second because there are about 70 "physical drives" used by FSLogix… #Citrix #Troubleshooting pic.twitter.com/fAjnBBswuI
— Samuel Legrand (@legsam59) November 24, 2020