Symbols Do Not Load in Windows Performance Analyzer (WPA)

The Windows Performance Toolkit (WPT) has been updated with the release of Windows 10 and I’ve just upgraded. There seem to be some minor improvements, but I noticed pretty quickly that I was unable to load symbols in Windows Performance Analyzer (WPA). This is not a new problem, but I had forgotten how to fix it. Here’s an article for future me and anyone else having this problem.

BTW, Bruce Dawson is the internet’s performance guru and has written a similar article that you may want to skip to. In this article, we conclude where his article begins.

How are Symbols Loaded?

The first symptom of this problem is that when I load symbols, WPA very quickly goes through all binaries and claims that symbol load is complete. I already have some cached symbols so I see symbols for some binaries, and ? marks for others.

A quick look at the WPA Diagnostic Console shows us the crux of the problem:

DBGHELP: SymSrv load failure: symsrv.dll

We know that applications access symbol functionality via the DbgHelp API and WPA is no exception. If we take a look using Process Monitor, we see that WPA is loading dbghelp.dll from the system directory. In turn, dbghelp.dll appears to load symsrv.dll from the same directory, which fails.

wpa_symbol_load_procmon

Fix it!

At this point, we have enough information to solve the problem. You should already have Debugging Tools for Windows installed. If not, the aforementioned article by Bruce Dawson might be a quicker fix.

The solution is simple:

  1. Figure out if you are using the x86 or x64 version of the Windows Performance Toolkit.

    This is easy on x86 builds of Windows. On x64 builds, you can check the Task Manager for the *32 tag. If it’s not there, then you’re running the x64 version.

    wpa_symbol_load_task_manager

    Note that WPT always installs to Program Files (x86) regardless of architecture.

  2. Copy the dbghelp.dll and symsrv.dll files from the correct debugger directory to the Windows Performance Toolkit directory. On my system, the source and destination directories are:
    C:\Program Files (x86)\Windows Kits\10\Debuggers\x64
    C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit
  3. Restart Windows Performance Analyzer so that the correct version of dbghelp.dll is picked up.
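
For future me, here is a minimal Python sketch of step 2. The two default paths are the ones from my system above and will likely differ on yours; the function name copy_symbol_dlls is just something I made up for illustration. Copying into Program Files (x86) normally needs an elevated prompt, and WPA should be closed first.

```python
import shutil
from pathlib import Path

# Paths from my system; adjust for your architecture and kit version.
DEBUGGER_DIR = Path(r"C:\Program Files (x86)\Windows Kits\10\Debuggers\x64")
WPT_DIR = Path(r"C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit")
DLLS = ("dbghelp.dll", "symsrv.dll")


def copy_symbol_dlls(debugger_dir, wpt_dir, dlls=DLLS):
    """Copy the symbol DLLs from the debugger directory into the WPT directory.

    Returns the list of destination paths. Raises FileNotFoundError before
    copying anything if one of the DLLs is missing from the source directory.
    """
    debugger_dir, wpt_dir = Path(debugger_dir), Path(wpt_dir)
    missing = [name for name in dlls if not (debugger_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Not found in {debugger_dir}: {', '.join(missing)}")
    copied = []
    for name in dlls:
        # copy2 preserves timestamps, which makes it easier to tell later
        # which build of dbghelp.dll ended up in the WPT directory.
        copied.append(Path(shutil.copy2(debugger_dir / name, wpt_dir / name)))
    return copied


# Example (run from an elevated prompt):
# copy_symbol_dlls(DEBUGGER_DIR, WPT_DIR)
```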
Posted in Performance

Recovering a Deleted Draft in Gmail

So, here’s a funny thing about Gmail. If you delete a regular email, it gets moved to the Trash folder. However, if you discard a draft, it just vanishes in a puff of smoke. My wife discovered this frustrating behaviour tonight after spending an hour writing a message to her sister via gmail.com. Something went wrong when she tried to insert a picture and she hit the Discard Draft button in haste.

Given that the draft is not moved into the Trash folder when this happens, there are only a few options for recovery:

1. The “Undo” text appears on-screen, but only until you click on another folder / message. It’s easy to panic and click on something else once you realize that you’ve just discarded your draft.

2. If you have another device hooked up to the same account, you can quickly put it into airplane mode and perhaps recover the draft.

3. In some cases, the back button *may* work.

None of these worked in our case and so I thought of a hail mary option:

4. Scan the browser process memory for the discarded draft. This can only work if you have not yet closed the Gmail browser tab, and if the process memory has not yet been overwritten or reclaimed. Also, there need to be some reasonably obscure words in the text you are looking for (‘bazaar’ was the winner in my wife’s draft).

There are lots of tools to read process memory; I used HxD to search the process memory for keywords from the draft.
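
If you would rather script the search than eyeball a hex editor, the idea can be sketched in a few lines of Python. This is not what I did on the night: it assumes you have already captured the process memory into a bytes object (for example by reading a dump file), and the function name find_keyword is hypothetical. I search for both UTF-8 and UTF-16LE because I don't know in advance how the browser stored the text; that is a guess on my part, not something I verified.

```python
def find_keyword(data, keyword, context=200):
    """Search raw bytes for keyword in UTF-8 and UTF-16LE.

    Returns a list of (offset, encoding, snippet) tuples, where snippet is
    the decoded text surrounding the hit. The match is case-sensitive.
    """
    hits = []
    for encoding in ("utf-8", "utf-16-le"):
        needle = keyword.encode(encoding)
        start = 0
        while True:
            pos = data.find(needle, start)
            if pos == -1:
                break
            lo = max(0, pos - context)
            if encoding == "utf-16-le" and (pos - lo) % 2:
                lo += 1  # keep the snippet aligned to 2-byte code units
            hi = min(len(data), pos + len(needle) + context)
            snippet = data[lo:hi].decode(encoding, errors="replace")
            hits.append((pos, encoding, snippet))
            start = pos + 1
    return hits


# Example, assuming memory.bin is a dump you captured yourself:
# with open("memory.bin", "rb") as f:
#     for offset, enc, text in find_keyword(f.read(), "bazaar"):
#         print(hex(offset), enc, repr(text))
```

An obscure keyword matters here for the same reason it does in HxD: a common word produces hundreds of hits in unrelated page content.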

hxd0

Actually, HxD is a little clunky since it doesn’t show you the process ID of the process you are viewing. Incidentally, you can get the process ID for a Chrome tab via the Chrome Task Manager (right-click on the tab bar).

hxd1

After about 9 attempts, I found the correct chrome.exe process and found the text of the discarded draft. Of course, by this time, my wife had already introduced another mitigation:

5. Rewrite the email.

hxd2

Posted in Data Recovery

Vmware-hostd Listening on HTTPS Port 443

Recently, I needed to stand up a web server on my development machine to do some testing. Unfortunately, when I tried to bind to the default HTTPS port (443), I found out that some other process on my machine was already using it.
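
A quick way to confirm that a port is already taken is simply to try binding to it. Here is a minimal Python sketch; the function name port_in_use is my own. It only tells you that the bind failed, not which process owns the port, and on some systems binding to a low port without the right privileges also fails, so treat a True result as a hint rather than proof.

```python
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding a TCP socket to (host, port) fails."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            # Typically "address already in use", but a permissions
            # error on a privileged port lands here as well.
            return True
        return False


# Example:
# print(port_in_use(443))
```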

I used the Sysinternals TCPView tool to figure out which process was using the port:

vmware-hostd1

Okay, now what? After a little googling, it turns out that VMware Workstation has a feature called “Shared VMs” that I have no need for.

I disabled this feature under VMware Workstation -> Edit -> Preferences and vmware-hostd.exe immediately stopped listening on port 443. Onwards with my testing!

vmware-hostd2

Posted in Sysinternals Tools, Troubleshooting

Debugging Windows Service Startup Using Procdump

tl;dr: Sleeping and attaching a debugger? Meh. Writing copious log files? Meh. In the case of a crashing service, it’s much easier to collect the crashdump and analyze.

If you’ve spent much time developing Windows services, you’ve probably run into the case where your service mysteriously crashes while it is starting. In cases like these, Windows isn’t always particularly helpful. For example, here’s what Windows has to say about my Crashy Service:

service_crash_eventlog

“The Crashy Service service terminated unexpectedly.” Thanks Windows!

Since the crash is happening during service startup, it’s a little tricky to attach a debugger (or downright impossible in customer environments). Instead, as an exercise, let’s use the Sysinternals procdump tool to collect the crash dump and Visual Studio 2013 to figure out what went wrong.

Here are the steps:

1. We set procdump as the system postmortem debugger using the -i flag. This allows procdump to write out the state of the process before it crashes.

service_crash_procdump

2. Now we reproduce the service startup crash. When the service crashes, procdump writes the crashdump to the directory where we ran the -i command in the previous step.

3. (Optional) Revert the system postmortem debugger to the previous value using the procdump -u command.

service_crash_procdump_uninstall

4. Open the .dmp crashdump file in Visual Studio 2013. When I do this, I see an initial screen that tells me a bit more about the exception that led to the crash. I then have the option of digging into the crashdump via the “Debug with Native Only” action.

service_crash_vs00

service_crash_vs0

5. When I click on Debug with Native Only, I get access to the call stack as well as the exact line of code that caused the crash (give or take; nobody’s perfect!). In this case it appears that I’m trying to lowercase a null string, which causes an access violation.

service_crash_vs1

service_crash_vs2
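
If the service crashes repeatedly, the dump directory from step 2 fills up quickly. A small Python sketch for picking out the most recent dump to open in step 4; the function name newest_dump is hypothetical, and it assumes the dumps carry the .dmp extension mentioned above.

```python
from pathlib import Path


def newest_dump(directory):
    """Return the most recently modified .dmp file in directory, or None."""
    dumps = sorted(Path(directory).glob("*.dmp"), key=lambda p: p.stat().st_mtime)
    return dumps[-1] if dumps else None


# Example, pointing at the directory where procdump -i was run:
# print(newest_dump(r"C:\dumps"))
```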

More info: I’m analyzing the binary on the same machine where I compiled it. In other situations, you will have to tell VS where your symbols are. Also, in general I would probably use WinDbg to analyze the crashdump as I have way more experience with it. !analyze -v is your friend.

Posted in Debugging, Sysinternals Tools

Performance Analysis Toolbox: Timeouts

Music is the space between the notes
– Claude Debussy

I recently spent some time looking at a slow shutdown issue. When product X was installed, the system was taking 34 seconds to shut down, vs 22 seconds when the product was not installed. Our automation team was able to reproduce this issue using the Full Boot Assessment in the Windows ADK. This gave us a very good place to start. In addition to a regular .etl trace file, the ADK also provides some extra information to help organize the millions of system events into a coherent narrative.

Image

After comparing this trace to a normal shutdown, it was clear that the IO Shutdown System section was much longer than normal: 10 seconds vs. close to 0 seconds in the normal case.

So, I got out my performance analysis toolbox and looked at:

– CPU usage
– Wait chains
– Disk usage

None of these approaches proved fruitful. There was very little CPU activity, nobody seemed to be waiting on anyone else, and disk usage was minimal. How could performance be impacted when nobody was doing anything? I reached out to a colleague and mentioned the major facts: slow shutdown, 10 second delay during the IO Shutdown phase, not much activity. His psychic response? “10 seconds? If it is exactly 10 seconds it sounds like a timeout”. I looked at a couple more traces and sure enough, the delay was always 10 seconds plus a tenth of a second or so.

We figured out the problem in the end but the big takeaway for me is a new tool for my performance analysis toolbox — timeout analysis, where the space between the notes is more important than the notes themselves.
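
To make the idea concrete, here is a minimal Python sketch of timeout analysis: given a list of event timestamps in seconds, it reports the quiet gaps that are suspiciously close to a round number. The candidate timeout values and the tolerance are assumptions I picked for illustration, and find_timeout_like_gaps is not part of any real tool; exporting timestamps from a trace is left out entirely.

```python
def find_timeout_like_gaps(timestamps, candidates=(1, 5, 10, 30, 60), tolerance=0.25):
    """Find gaps between consecutive events that look like timeouts.

    A gap matches a candidate timeout when it is at least that long and
    exceeds it by no more than tolerance seconds, since a timeout fires
    at or slightly after its deadline, never before.

    Returns a list of (gap_start, gap_length, matched_timeout) tuples.
    """
    results = []
    ordered = sorted(timestamps)
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        for timeout in candidates:
            if 0 <= gap - timeout <= tolerance:
                results.append((earlier, gap, timeout))
                break
    return results


# Example: a burst of activity, roughly 10.1 seconds of silence, then more activity.
# find_timeout_like_gaps([0.0, 0.4, 10.5, 10.6, 13.0])
```

The asymmetric check reflects what I saw in the traces: the delay was always 10 seconds plus a little, never a little less.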

Posted in Performance

Build 2014: Favourite Sessions

Build2014

Build 2014 is a distant memory at this point. Luckily, all sessions are still available for viewing on Channel 9. Here are my favourite sessions from the conference.

Modern C++: What You Need to Know
If you’re still in the dark about C++11/14, check out Herb Sutter’s talk on the power and ease of using the new additions to the C++ language. There’s also a great section in the talk about the performance benefits of C++ arrays (and contiguous arrays in general).

Tips and Tricks in Visual Studio 2013
Cathy Sullivan presents an hour of ways to increase productivity and decrease frustration in Visual Studio 2013. If you use Visual Studio on any sort of regular basis, you will benefit from this hour. Ctrl+Q is your friend.

Diagnosing Issues in Windows Phone XAML Apps Using Visual Studio
I’ve never written a Windows Phone App but still found this session very interesting. Dan Taylor uses Visual Studio to analyze CPU, UI throughput, memory, and energy efficiency issues. Visual Studio debugging and analysis tools have come a *long* way.

Windows and the Internet of Things
Steve Teixeira helped me understand what the internet of things is all about. Great presentation and great content. Most of Build 2014 was focused on convergence of Windows, Phone, and XBox; this session introduced me to some divergent flavours of Windows that I had not heard of.

Native Code Performance on Modern CPUs: A Changing Landscape
There was a noticeable lack of low-level sessions at Build 2014 and so the minor headache induced by Eric Brumer’s complex talk was most welcome. The big takeaway: with ever-faster CPUs, memory operations have become the bottleneck. Dovetails nicely with Herb Sutter’s session above, but goes much more in-depth.

Avoiding Cloud Fail: Learning from the Mistakes of Azure with Mark Russinovich
As always, Russinovich is engaging and informative. Though the talk is focused on cloud there are lessons here for all types of development, for example, pragmatic exception handling strategies.

Posted in Performance, Training, Troubleshooting

Visual Studio Productivity Power Tools 2013 Kills Productivity

Update: This issue has been resolved and verified. You can get the fixed version of the extension from the Visual Studio Gallery at http://visualstudiogallery.msdn.microsoft.com/dbcb8670-889e-4a54-a226-a48a15e4cace

At Build 2014, Microsoft did a suitable job of persuading me to run the latest version of Visual Studio. I updated to Visual Studio 2013 Update 2 RC and also installed the Productivity Power Tools 2013 extension. Soon afterwards, I started seeing long-lasting hangs in Visual Studio while editing my fairly modest C++ solution.

I initially tried to ignore the problem by killing and restarting Visual Studio, but this quickly compounded my frustration. My next tactic was to record one of these hangs using Windows Performance Recorder (see my previous post for background on this tool).

Upon opening the recorded trace in Windows Performance Analyzer, I came to the conclusion that Visual Studio had stopped pumping Windows messages. As we see in the graph below, thread 5959 went 179 seconds without checking for new messages. Why?

vs-hang1

One possibility was that the UI thread was waiting on another thread. However, when I opened the CPU Usage (Sampled) graph and filtered it to thread 5956, I could see that the thread was pegging the CPU. My machine has four logical processors and so 25% CPU usage means that one of my processors was being utilized completely.

vs-hang2
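
The arithmetic behind that reading, as a tiny Python sketch (the function name is mine): total CPU usage is reported as a percentage of all logical processors, so it has to be scaled by the processor count to see how many processors' worth of work is being done.

```python
def busy_processors(total_cpu_percent, logical_processors):
    """Convert system-wide CPU usage into the equivalent number of fully busy processors."""
    return total_cpu_percent / 100 * logical_processors


# On my four-processor machine, 25% total usage is one processor fully
# utilized, which is what a single spinning thread looks like:
# busy_processors(25, 4)
```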

After loading symbols, I was able to drill down into the aggregated stack. I didn’t know exactly what I was looking for, but I eventually saw some enlightening function names on the stack.

vs-hang3

The aggregated stack usage showed that a lot of processing time was being spent in SolutionErrorFilter.dll. At this point, I remembered that I had installed the Productivity Power Tools 2013 extension so that I could see errors highlighted in Solution Explorer.

After proving to myself that SolutionErrorFilter.dll was not part of the base Visual Studio 2013 installation (it was installed to the extensions folder), I uninstalled the Productivity Power Tools 2013 extension. At this point, somewhat ironically, my productivity increased back to normal levels.

Big thanks to Microsoft for releasing symbols for Visual Studio, as well as for appropriately naming their binaries.

Despite this issue, I still recommend the Build 2014 talk entitled Tips and Tricks in Visual Studio 2013. I will post other favourite talks soon.

Update (from Microsoft): 

Thank you for your feedback. We are aware of the issue and are investigating a fix. We’re tracking this issue through connect id 816883. http://connect.microsoft.com/VisualStudio/feedback/details/816883/ide-becomes-unusable-if-error-list-contains-too-many-entries

We will resolve this issue as a duplicate of that one. You can vote 816883, but we’re already looking to make a fix.
