Performance Analysis Toolbox: Timeouts

Music is the space between the notes
– Claude Debussy

I recently spent some time looking at a slow shutdown issue. When product X was installed, the system was taking 34 seconds to shut down, vs 22 seconds when the product was not installed. Our automation team was able to reproduce this issue using the Full Boot Assessment in the Windows ADK. This gave us a very good place to start. In addition to a regular .etl trace file, the ADK also provides some extra information to help organize the millions of system events into a coherent narrative.

Image

After comparing this trace to a normal shutdown, it was clear the the IO Shutdown System section was much longer than normal — 10 seconds vs. close to 0 seconds in the normal case.

So, I got out my performance analysis toolbox and looked at:

– CPU usage
– Wait chains
– Disk usage

None of these approaches proved fruitful. There was very little CPU activity, nobody seemed to be waiting on anyone else, and disk usage was minimal. How could performance be impacted when nobody was doing anything? I reached out to a colleague and mentioned the major facts: slow shutdown, 10 second delay during the IO Shutdown phase, not much activity. His psychic response? “10 seconds? If it is exactly 10 seconds it sounds like a timeout”. I looked at a couple more traces and sure enough, the delay was always 10 seconds plus a tenth of a second or so.

We figured out the problem in the end but the big takeaway for me is a new tool for my performance analysis toolbox — timeout analysis, where the space between the notes is more important than the notes themselves.

Posted in Performance | Tagged , , | Leave a comment

Build 2014: Favourite Sessions

Build2014

Build 2014 is a distant memory at this point. Luckily, all sessions are still available for viewing on Channel 9. Here are my favourite sessions from the conference.

Modern C++: What You Need to Know
If you’re still in the dark about C++11/14, check out Herb Suter’s talk on the power and ease of using the new additions to the C++ language. There’s also a great section in the talk about the performance benefits of C++ arrays (and contiguous arrays in general).

Tips and Tricks in Visual Studio 2013
Cathy Sullivan presents an hour of ways to increase productivity and decrease frustration in Visual Studio 2013. If you use Visual Studio on any sort of regular basis, you will benefit from this hour. Ctrl+Q is your friend.

Diagnosing Issues in Windows Phone XAML Apps Using Visual Studio
I’ve never written a Windows Phone App but still found this session very interesting. Dan Taylor uses Visual Studio to analyze CPU, UI throughput, memory, and energy efficiency issues. Visual Studio debugging and analysis tools have come a *long* way

Windows and the Internet of Things
Steve Teixeira helped me understand what the internet of things is all about. Great presentation and great content. Most of Build 2014 was focused on convergence of Windows, Phone, and XBox; this session introduced me to some divergent flavours of Windows that I had not heard of.

Native Code Performance on Modern CPUs: A Changing Landscape
There was a noticeable lack of low-level sessions at Build 2014 and so the minor headache induced by Eric Brumer’s complex talk was most welcome. The big takeaway: with ever-faster CPUs, memory operations have become the bottleneck. Dovetails nicely with Herb Suter’s session above, but goes much more in-depth.

Avoiding Cloud Fail: Learning from the Mistakes of Azure with Mark Russinovich
As always, Russinovich is engaging and informative. Though the talk is focused on cloud there are lessons here for all types of development, for example, pragmatic exception handling strategies.

Posted in Performance, Training, Troubleshooting | Tagged , , , , | 2 Comments

Visual Studio Productivity Power Tools 2013 Kills Productivity

Update: This issue has been resolved and verified. You can get it from Visual Studio Gallery at http://visualstudiogallery.msdn.microsoft.com/dbcb8670-889e-4a54-a226-a48a15e4cace

At Build 2014, Microsoft did a suitable job of persuading me to run the latest version of Visual Studio. I updated to Visual Studio 2013 Update 2 RC and also installed the Productivity Power Tools 2013 extension. Soon afterwards, I started seeing long-lasting hangs in Visual Studio while editing my fairly modest C++ solution.

I initially tried to ignore the problem by killing and restarting Visual Studio, but this quickly compounded my frustration. My next tactic was to record one of these hangs using Windows Performance Recorder (see my previous post for background on this tool).

Upon opening the recorded trace in Windows Performance Analyzer, I came to the conclusion that Visual Studio had stopped pumping Windows messages. As we see in the graph below, thread 5959 went 179 seconds without checking for new messages. Why?

vs-hang1

One possibility was that the UI thread was waiting on another thread. However, when I opened the CPU Usage (Sampled) graph and filtered it to thread 5956, I could see that the thread was pegging the CPU. My machine has four logical processors and so 25% CPU usage means that one of my processors was being utilized completely.

vs-hang2

After loading symbols, I was able to drill down into the aggregated stack. I didn’t know exactly what I was looking for, but I eventually saw some enlightening function names on the stack.

vs-hang3

The aggregated stack usage showed that a lot of processing time was being spent in SolutionErrorFilter.dll. At this point, I remembered that I had installed the Productivity Power Tools 2013 extension so that I could see errors highlighted in Solution Explorer.

After proving to myself that SolutionErrorFilter.dll was not part of the base Visual Studio 2013 installation (it was installed to the extensions folder), I uninstalled the Productivity Power Tools 2013 extension. At this point, somewhat ironically, my productivity increased back to normal levels.

Big thanks to Microsoft for releasing symbols for Visual Studio, as well as for appropriately naming their binaries.

Despite this issue, I still recommend the Build 2014 talk entitled Tips and Tricks in Visual Studio 2013. I will post other favourite talks soon.

Update (from Microsoft): 

Thank you for your feedback. We are aware of the issue and are investigating a fix. We’re tracking this issue through connect id 816883.http://connect.microsoft.com/VisualStudio/feedback/details/816883/ide-becomes-unusable-if-error-list-contains-too-many-entries

We will resolve this issue as a duplicate of that one. You can vote 816883, but we’re already looking to make a fix.

Aside | Posted on by | Tagged , , | 4 Comments

Troubleshooting Windows Startup Using Autoruns

Recently, my Windows laptop started the odd behaviour of hanging every time I logged in. It didn’t blue screen; rather, it just froze completely every time the desktop came up. This is a pretty uncommon scenario so I figured that one of my installed applications was misbehaving. I didn’t have the time nor the inclination to muck around with the Run registry key and its ilk, so I turned to Autoruns for Windows to help me diagnose the issue.

Autoruns lets you view and disable pretty much anything that starts up with Windows — drivers, services, Explorer extensions, tasks that run at Logon, and many others.

Autoruns Main UI

Since I was seeing the hang on Logon, I rebooted into Safe Mode and used Autoruns to disable various programs listed under the Logon tab. After going through this cycle a couple of times, I figured out that my VPN client was causing the hang.

Unfortunately I needed the VPN client to connect to my corporate network so I had to keep digging. Eventually, I had a vague recollection of seeing this problem before, when enabling kernel debugging on a test machine, but not connecting the debugger. <SMACK> For some reason, I had enabled kernel debugging then forgotten about it. I quickly disabled this option via ‘bcdedit /debug off ‘and reboot my machine.

After getting my system up and running again, I used Autoruns to re-enable the programs that I had previously disabled. Now I just have to remember to not enable kernel debugging on my dev box…

Posted in Troubleshooting | Tagged | Leave a comment

Writing a WPA Regions of Interest File

In my previuos article I wrote a very simple Windows Performance Recorder profile to enable a particular ETW Provider. After recording a trace, the next step is to open it up in Windows Performance Analyzer (WPA) to see what is going on. There can be a lot of events in a trace and so anything that can help with pattern detection is most welcome.

One nice feature in WPA (starting in Windows 8.1) is called Regions of Interest. This allows you to link any two events so that they appear together in the UI. For example, if you log at the start and end of a method, it is quite natural that you would want to link these events together so that you can easily figure out the duration of the method, what else happened on the system during that time period, etc…

Example: I want to know how much CPU time my HCKServiceValidation function takes. I instrument the function with events at the start and end. Next, I record a trace, open it up in WPA and add my Regions of Interest file. After adding a couple of graphs and filtering on my process,  I get something very cool — I can see that the HCKServiceValidation function took 0.12 seconds to run, and maxed out at about 25% CPU usage.

Function CPU Usage

The Regions of Interest file is quite simple. You need only define a couple of GUIDs, and then you can link events via their event IDs.

<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<?Copyright (c) Microsoft Corporation. All rights reserved.?>
<InstrumentationManifest>
 <Instrumentation>
 <Regions>
 <RegionRoot Guid="{934E499F-A064-4DA4-8D96-AAEF60E37F52}"
 Name="WHQL Region File Root"
 FriendlyName="Root">
 <Region Guid="{0EC1D460-B7C7-47F8-987C-6FA51DCA24BC}"
 Name="Region-HCK-Service-Validation"
 FriendlyName="HCK Service Validation">
 <Start>
 <Event Provider="{7D0810B4-15F2-4D2F-9C88-C885B330D944}" Id="2" Version="0" />
 </Start>
 <Stop>
 <Event Provider="{7D0810B4-15F2-4D2F-9C88-C885B330D944}" Id="3" Version="0" />
 </Stop>
 </Region> 
 </RegionRoot>
 </Regions>
 </Instrumentation>
</InstrumentationManifest>

Once you open your trace in WPA, you can add your Regions of Interest file under Trace->Trace Properties.

Adding Regions of Interest File

If any of the configured regions show up in the trace, you will see an additional graph called System Activity->Regions of Interest, which is what I displayed above. More information on MSDN.

Posted in Performance | Tagged , | Leave a comment

Writing a Very Simple WPR Profile

wpa1

In a previous post I discussed the joys of using the Windows Performance Toolkit to record and analyze Windows slowdowns. In summary, you can record a system trace using Windows Performance Recorder (WPR), and then use Windows Performance Analyzer (WPA) to visually drill into CPU, memory, & disk usage to figure out what is causing your slowdown.

The power of WPT does not end there. You can instrument and register your own Windows binaries to log events that can appear side by side with Windows system events. If you’ve ever tried to correlate your application logs with the Windows Event Viewer, or Process Monitor, you will immediately see the value here.

There are few steps involved in making this happen:

  1. Instrument your binary with Event Tracing for Windows (ETW) logging. ETW is a high performance logging mechanism that is baked into Windows. The instrumentation process is different for desktop apps, Windows Store apps, and drivers but all are supported.
  2. Register your binary as an ETW provider on the system.
  3. Configure WPR to enable your provider, and run your scenario. (You could also use xperf or an ETW controller of your choice.)
  4. Save the trace and open it with WPA.

The first two steps are out of scope of this article, but they are not particularly difficult thanks to some decent tools (ecmangen, mc, wevtutil). This article is a good starting point. [Update: Article is now only available as a .chm file. Download the April 2007 edition of MSDN Magazine from the link]

The next step is to create a Windows Performance Recorder Profile (wprp) file so that WPR knows how to enable your provider. MSDN has an article on authoring wprp profiles, but it is a little more complicated than we need.

I have cribbed from a couple of sources (including Bruce Dawson) and come up with a wprp profile that is about as simple as you can get. Comments inline.


 
  

 
 
 
 
 

 
 

 <!-- A fully specified .wprp file should have four profiles, with DetailLevel set to Verbose and Light and with Logging Mode set to Memory and File. WPR enforces that the name conforms to Profile.Level.OutputType --> 
 
  
 
 
 
 
 
 
 
 

  
 

 

  

 

You can add this profile to WPR and enable any other providers that you like.

WPR Profiles

Now you just need to start recording, run your scenario, and save your trace to disk. When you open the trace, you should magically see your provider and its associated events under System Activity -> Generic Events.

WPA Generic Events

Mission accomplished! I can now see exactly what happens on the system during my events of interest. Starting in WPT 8.1, you can even link events together using Regions of Interest. This is pretty well documented so I’ll save it for another time (or never).

Posted in Performance | Tagged , , | 3 Comments

Why does Windows Performance Analyzer Disappear When I Load a Trace?

WPA main window

As mentioned in a previous post, I’ve been tracking down Windows performance issues using the remarkable Windows Performance Toolkit (WPT). Today, I tried launching the Windows Performance Analyzer tool on a fresh Windows 8.1 system but it silently disappeared when trying to load a performance trace. Monday afternoon detective time!

My first step was to isolate the problem even further. On the affected system it turned out that even clicking the File menu in Windows Performance Analyzer resulted in a silent crash.

Perhaps the Windows Event Viewer could shed some light on the situation? Not much, in this case:

Application: wpa.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: exception code c0000005, exception address 0000000000000000

If only I had a stack trace. There are a few ways to get the stack trace of a crashing application; I chose to use my favourite debugger, WinDbg. Briefly, WinDbg is a Microsoft debugger that can be used to debug both user mode applications as well as kernel mode drivers. It is also portable — I actually keep a copy of it on a USB stick for cases like this.

Before launching WinDbg, I set the symbol path so that the debugger could download any necessary symbols from the Microsoft public symbol server. You can set the symbol path within WinDbg, but I’ve recently learned that you can instead use a system environment variable:

_NT_SYMBOL_PATH=SRV*C:\Symbols*http://msdl.microsoft.com/download/symbols

After launching WinDbg, choosing File -> Open… and selecting the crashing executable file, I ended up with the screen below. We see the debugger breaking into the application during startup. I didn’t need to configure anything else, so I just hit F5.

wpa_disappear_windbg1

Next, I reproduced the silent crash in my application. At this point, WinDbg broke into the application for a second time. Notice the access violation that the Event Viewer told us about.

wpa_disappear_windbg2

Finally, it was time for the magic WinDbg analysis command:

!analyze -v

I won’t reprint all the output here, but basically this command prompts WinDbg to spit out a bunch of diagnostic information about what has gone wrong. I was only interested in the stack trace. Here’s an excerpt:

0x0
 igdumdim64!OpenAdapter+0xe4a99
 igdumdim64!OpenAdapter+0xf1af5
 igdumdim64!OpenAdapter+0xb6f51
 d3d9!CD3DDDIDX10::TexBlt+0x7d
 d3d9!CD3DBase::TexBlt+0x51
 d3d9!CMipMap::UpdateDirtyPortion+0x10c
 d3d9!CResourceManager::UpdateVideoInternal+0x144
 d3d9!CResourceManager::UpdateVideo+0x31
 d3d9!CD3DBase::UpdateTextures+0x3dd
 d3d9!CD3DBase::DrawPrimitive+0x221
 wpfgfx_v0400!CD3DDeviceLevel1::DrawPrimitiveUP+0x512
 wpfgfx_v0400!CD3DDeviceLevel1::FlushBufferFan+0x30
 wpfgfx_v0400!CD3DDeviceLevel1::RenderTexture+0x252
 wpfgfx_v0400!CD3DDeviceLevel1::TestRenderTargetFormat+0x267
 wpfgfx_v0400!CD3DDeviceLevel1::CheckRenderTargetFormat+0x76
 wpfgfx_v0400!CHwDisplayRenderTarget::Create+0x96
 wpfgfx_v0400!CDesktopRenderTarget::Init+0x128
...

It was clear by looking at the stack trace that something was going wrong in the graphics stack. d3d is part of Direct3D and doing a quick search on igdumdim64 reveals that it is a driver for Intel HD Graphtics 4000.

I fired up Windows Device Manager to see if there was a new version of the Graphics Adapter driver.

wpa_disappear_device_manager

Yes! There was indeed a newer version of the driver. After updating the driver in Device Manager, I was finally able to click on the File menu in Windows Performance Analyzer. Wasn’t that exciting?! Now back to learning how to become a human data miner…

wpa_disappear_wpa2

Posted in Troubleshooting, Windbg | Tagged , , , | 3 Comments