MEX is Your Friend: Analyzing 32-bit Processes in a 64-bit Kernel Dump

Introduction

In 2016, Microsoft publicly released a WinDBG extension called MEX. It has a number of commands that make life easier when kernel debugging or analyzing a crash dumps. You can download it here if you want to give it a try.

The Problem

I was recently asked to look a hang issue in Microsoft Outlook. I was given a full memory dump taken on a 64-bit system, which seemed like overkill (it wasn’t). It’s generally quite straightforward to analyze user mode apps from a memory dump — see the following OSR article for more info: Analyst’s Perspective: Analyzing User Mode State from a Kernel Connection.

I started my debugging session and quickly ran into a snag: Outlook was running as 32-bit process, leading to stacks with mostly wow64* calls in them.

0: kd> !process 0 0 outlook.exe
PROCESS fffffa800b6e6060
 SessionId: 1 Cid: 1408 Peb: 7efdf000 ParentCid: 1268
 DirBase: a9baa000 ObjectTable: fffff8a0083de430 HandleCount: 6880.
 Image: OUTLOOK.EXE

0: kd> .process /r /p fffffa800b6e6060
Implicit process is now fffffa80`0b6e6060
Loading User Symbols
.....
0: kd> !process fffffa800b6e6060
PROCESS fffffa800b6e6060
 SessionId: 1 Cid: 1408 Peb: 7efdf000 ParentCid: 1268
 DirBase: a9baa000 ObjectTable: fffff8a0083de430 HandleCount: 6880.
 Image: OUTLOOK.EXE
[...snip...]
 THREAD fffffa800b858060 Cid 1408.180c Teb: 000000007efaa000 Win32Thread: fffff900c1d3f010 WAIT: (UserRequest) UserMode Non-Alertable
[...snip...]
 fffff880`093dbec0 fffff800`03484e42 nt!KiSwapContext+0x7a
 fffff880`093dc000 fffff800`034918da nt!KiCommitThreadWait+0x1d2
 fffff880`093dc090 fffff800`0378b1af nt!KeWaitForMultipleObjects+0x272
 fffff880`093dc350 fffff800`037b8fc9 nt!ObpWaitForMultipleObjects+0x294
 fffff880`093dc820 fffff800`0348e693 nt!NtWaitForMultipleObjects32+0xec
 fffff880`093dca70 00000000`74cc2e09 nt!KiSystemServiceCopyEnd+0x13 (TrapFrame @ fffff880`093dcae0)
 00000000`0352ebe8 00000000`74cc283e wow64cpu!CpupSyscallStub+0x9
 00000000`0352ebf0 00000000`74d3d286 wow64cpu!WaitForMultipleObjects32+0x3b
 00000000`0352ecb0 00000000`74d3c69e wow64!RunCpuSimulation+0xa
 00000000`0352ed00 00000000`773898ec wow64!Wow64LdrpInitialize+0x42a
 00000000`0352f250 00000000`7734a36e ntdll! ?? ::FNODOBFM::`string'+0x22b74
 00000000`0352f2c0 00000000`00000000 ntdll!LdrInitializeThunk+0xe

The Painful Solution

After asking around, I found out that I could use .thread /w to look at 32-bit stacks but the method proved cumbersome, as I was looking for a particular thread out of 57 threads in the process. I was too lazy to write a script and so I spent over an hour going through each stack in the process like so:

0: kd> .thread /w fffffa800b858060 ; kb ; .effmach amd64
Implicit thread is now fffffa80`0b858060
x86 context set
 *** Stack trace for last set context - .thread/.cxr resets it
 # ChildEBP RetAddr Args to Child 
00 039afa64 755e171a 00000005 039afab4 00000001 ntdll_77500000!NtWaitForMultipleObjects+0x15
01 039afb00 76c019fc 039afab4 039afb28 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
02 039afb48 76a90882 00000005 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
03 039afb9c 6b16df8b 000003a0 039afbe4 ffffffff USER32!RealMsgWaitForMultipleObjectsEx+0x14d
WARNING: Stack unwind information not available. Following frames may be wrong.
04 039afbbc 2f7a5538 00000004 039afbe4 00000000 mso!MsoHrSetupHTMLImport+0x392
05 039afc10 62398646 00000000 0775e2d4 623985a6 OUTLOOK!HrMsgDownloadedNotification+0x3ff
06 039afc34 6b0a08e6 0775e2d4 00000000 572383a4 olmapi32!MSProviderInit+0x95f
07 039afc78 6b09e0fd 1191f900 039afd10 0039d16c mso!Ordinal3464+0x4a2
08 039afc98 6b09ddc3 039afd10 00000000 039afcf4 mso!Ordinal2771+0x450
09 039afcb4 6b09c027 039afd10 00000000 003997f0 mso!Ordinal2771+0x116
0a 039afce8 6b0962b7 003997f0 00000000 6b0962b7 mso!Ordinal2929+0x209
0b 039afd44 76c0336a 003997f0 039afd90 77539902 mso!Ordinal4724+0x67
0c 039afd50 77539902 003997f0 74fcff46 00000000 kernel32!BaseThreadInitThunk+0xe
0d 039afd90 775398d5 6b09625f 003997f0 ffffffff ntdll_77500000!__RtlUserThreadStart+0x70
0e 039afda8 00000000 6b09625f 003997f0 00000000 ntdll_77500000!_RtlUserThreadStart+0x1b
Effective machine: x64 (AMD64)

The MEX Solution

My solution worked but it was incredibly inefficient. The next morning I was still looking at the dump and decided to play around with MEX to see if it could help me out with something else. As it turned out, it was also able to help with my initial problem. Instead of spending an hour+ pasting commands into the debugger, I was able to get to the same point in about 1 minute.

0: kd> !mex.p fffffa800b6e6060
Name Address Ses PID Parent PEB Create Time Mods Handle Thrd User Name
=========== ======================== === ============= ============= ================ ========================== ==== ====== ==== ==========
OUTLOOK.EXE fffffa800b6e6060 (E|K|O) 1 1408 (0n5128) 1268 (0n4712) 000000007efdf000 04/26/2017 03:52:59.367 AM 281 6880 57 BOS\zy10os

Command Line: "C:\Program Files (x86)\Microsoft Office\Office14\OUTLOOK.EXE"

Memory Details:

VM Peak Work Set Commit Size PP Quota NPP Quota
 ========= ========= ========= =========== ======== =========
 963.62 MB 980.07 MB 281.02 MB 221.82 MB 1.36 MB 206 KB

Show LPC Port information for process

Show Threads: Unique Stacks !mex.listthreads (!lt) fffffa800b6e6060 !process fffffa800b6e6060 7

0: kd> !mex.lt fffffa800b6e6060
Process PID Thread Id State Time Reason Waiting On
=============== ==== ================ ==== ======= ============= ============== =================================================
OUTLOOK.EXE *32 1408 fffffa8007592060 fc0 Waiting 11s.138 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8007588840 1814 Waiting 9s.001 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b4dbb50 14bc Waiting 15ms UserRequest 
OUTLOOK.EXE *32 1408 fffffa80082c3b50 1808 Waiting 15ms UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b858060 180c Waiting 11s.138 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b76bb50 11bc Waiting 15ms WrQueue 
OUTLOOK.EXE *32 1408 fffffa800ad06b50 1910 Waiting 55s.551 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b552b20 1544 Waiting 11s.122 WrUserRequest 
OUTLOOK.EXE *32 1408 fffffa8007511060 15a4 Waiting 11s.122 WrUserRequest 
OUTLOOK.EXE *32 1408 fffffa8007404060 1670 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b71ab50 1930 Waiting 36m:05.449 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8007578060 1a5c Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8007581b50 1a0c Waiting 1m:28.733 WrQueue 
OUTLOOK.EXE *32 1408 fffffa8008fe9060 11b8 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b85b060 1590 Waiting 1s.747 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800a1a4060 cbc Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa80072f3b50 1820 Waiting 9s.984 WrLpcReply Thread: fffffa800c877060 in mcshield.exe (0n3120)
OUTLOOK.EXE *32 1408 fffffa80074a4b50 1228 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8007584b50 58c Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8009af1060 15c4 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800983e7c0 648 Waiting 6m:18.598 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b3b24b0 16b4 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b87f060 16d8 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800adc5060 1668 Waiting 1s.778 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b67e6c0 1788 Waiting 1m:39.949 UserRequest 
OUTLOOK.EXE *32 1408 fffffa80075a5b50 1658 Waiting 38s.017 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b52eb50 c20 Waiting 21m:28.505 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b40cb50 bd4 Waiting 15ms UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b375060 e00 Waiting 1m:31.993 WrUserRequest 
OUTLOOK.EXE *32 1408 fffffa800b2d67f0 10c0 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b6d8060 1990 Waiting 3m:59.196 WrUserRequest 
OUTLOOK.EXE *32 1408 fffffa800ba53b50 16e0 Waiting 6m:18.598 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800baa1060 1a98 Waiting 9m:41.197 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b697060 14a8 Waiting 15h:22:24.615 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800ba2e060 15ec Waiting 1m:30.511 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800bb68060 1924 Waiting 1m:30.511 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800ba74060 1130 Waiting 35s.271 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800bad2060 3e0 Waiting 1m:31.744 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800b700060 1900 Waiting 11s.122 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8007362640 18a0 Waiting 1m:29.528 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800bab5060 ce0 Waiting 1m:30.932 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8009e3f060 15fc Waiting 15h:22:24.615 WrQueue 
OUTLOOK.EXE *32 1408 fffffa800be1f640 1db8 Waiting 1m:31.416 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800689f060 1784 Waiting 6m:51.905 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8006b59710 1f88 Waiting 44s.304 WrUserRequest 
OUTLOOK.EXE *32 1408 fffffa800bd60900 21a0 Waiting 15m:41.044 UserRequest 
OUTLOOK.EXE *32 1408 fffffa8006c70060 1c5c Waiting 15ms UserRequest 
OUTLOOK.EXE *32 1408 fffffa800c253710 1850 Waiting 280ms UserRequest 
OUTLOOK.EXE *32 1408 fffffa800befbb50 2074 Waiting 55s.661 DelayExecution 
OUTLOOK.EXE *32 1408 fffffa8009e95840 2164 Waiting 9s.001 WrQueue 
OUTLOOK.EXE *32 1408 fffffa800c71a650 22b8 Waiting 4m:01.567 WrQueue 
OUTLOOK.EXE *32 1408 fffffa800c000870 2314 Waiting 15ms WrQueue 
OUTLOOK.EXE *32 1408 fffffa8009ede2b0 2014 Waiting 15ms WrQueue 
OUTLOOK.EXE *32 1408 fffffa800bef7460 1d24 Waiting 15ms WrQueue 
OUTLOOK.EXE *32 1408 fffffa800c28f060 1848 Waiting 11s.232 WrQueue 
OUTLOOK.EXE *32 1408 fffffa800c0a2b50 1258 Waiting 11s.122 UserRequest 
OUTLOOK.EXE *32 1408 fffffa800bfde060 21a8 Waiting 15ms WrQueue

Thread Count: 57

0: kd> !mex.t fffffa800b858060
Process Thread CID TEB UserTime KernelTime ContextSwitches Wait Reason Time State COM-Initialized
OUTLOOK.EXE *32 (fffffa800b6e6060) fffffa800b858060 (E|K|W|R|V) 1408.180c 000000007efaa000 312ms 94ms 73230 UserRequest 11s.138 Waiting APTKIND_MULTITHREADED (MTA)

[...snip...]

# Child-SP Return Call Site
0 fffff880093dbec0 fffff80003484e42 nt!KiSwapContext+0x7a
1 fffff880093dc000 fffff800034918da nt!KiCommitThreadWait+0x1d2
2 fffff880093dc090 fffff8000378b1af nt!KeWaitForMultipleObjects+0x272
3 fffff880093dc350 fffff800037b8fc9 nt!ObpWaitForMultipleObjects+0x294
4 fffff880093dc820 fffff8000348e693 nt!NtWaitForMultipleObjects32+0xec
5 fffff880093dca70 0000000074cc2e09 nt!KiSystemServiceCopyEnd+0x13
0 00000000039afa64 00000000755e171a ntdll_77500000!NtWaitForMultipleObjects+0x15
1 00000000039afa6c 0000000076c019fc KERNELBASE!WaitForMultipleObjectsEx+0x100
2 00000000039afb08 0000000076a90882 kernel32!WaitForMultipleObjectsExImplementation+0xe0
3 00000000039afb50 000000006b16df8b USER32!RealMsgWaitForMultipleObjectsEx+0x14d
4 00000000039afba4 000000002f7a5538 mso!MsoHrSetupHTMLImport+0x392
5 00000000039afbc4 0000000062398646 OUTLOOK!HrMsgDownloadedNotification+0x3ff
6 00000000039afc18 000000006b0a08e6 olmapi32!MSProviderInit+0x95f
7 00000000039afc3c 000000006b09e0fd mso!Ordinal3464+0x4a2
8 00000000039afc80 000000006b09ddc3 mso!Ordinal2771+0x450
9 00000000039afca0 000000006b09c027 mso!Ordinal2771+0x116
a 00000000039afcbc 000000006b0962b7 mso!Ordinal2929+0x209
b 00000000039afcf0 0000000076c0336a mso!Ordinal4724+0x67
c 00000000039afd4c 0000000077539902 kernel32!BaseThreadInitThunk+0xe
d 00000000039afd58 00000000775398d5 ntdll_77500000!__RtlUserThreadStart+0x70
e 00000000039afd98 0000000000000000 ntdll_77500000!_RtlUserThreadStart+0x1b

Switch to x86 - to switch back to x64/amd64 run !sw

Conclusion

MEX is your friend.

Posted in Debugging, Windbg | Leave a comment

Windows Subystem For Linux – File Reads

I’ve read a bit about the way that Microsoft has implemented the Windows Subsystem for Linux (WSL). A one sentence summary: all system calls in a WSL process are fulfilled by a special kernel driver called a pico provider. I wanted to flesh out this knowledge from the perspective of a minifilter sitting on the file system stack.

For this research, I am using the following Windows 10 RS2 preview build.
Windows 10 Kernel Version 15007 MP (2 procs) Free x64
Built by: 15007.1000.amd64fre.rs_prerelease.170107-1846

Disclaimer: This pre-release of Windows 10 RS2 was a publicly available build and I only used publicly available information in putting together this article.

Installation

Here is an article on how to get bash up and running on Windows 10 RS1 and later. One major point that I initially missed is that this is only supported on 64-bit versions of Windows.

Mostly Normal

Once everything is up and running, the bash process appears in Task Manager as normal.

bash_task_manager

First, let’s use Procmon to verify that WSL does use the normal file system stack.

$ read < "a.txt"

bash_read_procmon

That’s a good start but I had two questions after seeing the above output:

  1. Why is the process name missing?
  2. Why is there a CREATE with no permissions requested?

Pico Processes

I’m using v3.31 of Procmon and it’s not able to show the process name for a WSL process. We know that WSL processes are implemented as pico processes so this makes some sense. Even WinDBG gets confused:

0: kd> !process 0n440
Searching for Process with Cid == 1b8
PROCESS ffffa90bbef49080
 SessionId: 1 Cid: 01b8 Peb: 00000000 ParentCid: 074c
 DirBase: 79ab9000 ObjectTable: ffffc00afb7b7540 HandleCount: 0.
 Image: System Process
 VadRoot ffffa90bc07b7960 Vads 370 Clone ffffa90bc0d530f0 Private 88. Modified 924. Locked 1.
 DeviceMap 0000000000000000
 Token ffffc00afa9f0960
 ElapsedTime 00:39:46.612
 UserTime 00:00:00.000
 KernelTime 00:00:00.000
 …
 
 THREAD ffffa90bbd889080 Cid 01b8.1b24 Teb: 0000000000000000 Win32Thread: 0000000000000000 WAIT: (Executive) KernelMode Non-Alertable
 ffffa90bbeef81e0 NotificationEvent
 ffffa90bc0409bd0 NotificationEvent
 Not impersonating
 Owning Process ffffa90bbef49080 Image: System Process
 Attached Process N/A Image: N/A
 Wait Start TickCount 299722 Ticks: 2439 (0:00:00:38.109)
 Context Switch Count 1553 IdealProcessor: 1 
 UserTime 00:00:00.031
 KernelTime 00:00:00.906
 Stack Init ffffd400f8378c90 Current ffffd400f8378310
 Base ffffd400f8379000 Limit ffffd400f8373000 Call 0000000000000000
 Priority 8 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
 Child-SP RetAddr Call Site
 ffffd400`f8378350 fffff801`7fa77867 nt!KiSwapContext+0x76
 ffffd400`f8378490 fffff801`7fa76f90 nt!KiSwapThread+0x477
 ffffd400`f8378540 fffff801`7fa76583 nt!KiCommitThreadWait+0x160
 ffffd400`f83785e0 fffff80b`b452e1b6 nt!KeWaitForMultipleObjects+0x203
 ffffd400`f83786c0 fffff80b`b452e379 LXCORE!LxpThreadWait+0x126
 ffffd400`f8378770 fffff80b`b44c6d38 LXCORE!LxpThreadWaitEx+0x49
 ffffd400`f83787d0 fffff80b`b45480e4 LXCORE!LxpDevTerminalRead+0x170
 ffffd400`f8378860 fffff80b`b45251e1 LXCORE!VfsFileRead+0x1b4
 ffffd400`f8378900 fffff80b`b451e22c LXCORE!LxpSyscall_READ+0x101
 ffffd400`f83789d0 fffff80b`b453be76 LXCORE!LxpSysDispatch+0x16c
 ffffd400`f8378aa0 fffff801`800ac2c0 LXCORE!PicoSystemCallDispatch+0x16
 ffffd400`f8378ad0 fffff801`7fb8a392 nt!PsPicoSystemCallDispatch+0x20
 ffffd400`f8378b00 00007f06`a24e69b0 nt!KiSystemServiceUser+0x82 (TrapFrame @ ffffd400`f8378b00)
 00007fff`fcb27498 00000000`004aee97 0x00007f06`a24e69b0
 00007fff`fcb274a0 00000000`00000000 0x4aee97

Just for confirmation, this is indeed a Pico Process and so LXCORE must be the Pico Provider.

0: kd> dt _EPROCESS ffffa90bbf26a500

nt!_EPROCESS
 …
 +0x6d4 Minimal : 0y1
 …
 +0x718 PicoContext : 0xffffc00a`fa03c000 Void
 …
 +0x818 PicoCreated : 0y1

Bash Process

There is actually a win32 bash.exe process on the system but my understanding is that it is only responsible for launching the WSL pico processes.

PROCESS ffffa90bc17b9340
 SessionId: 1 Cid: 0c4c Peb: 4ac55f3000 ParentCid: 0f80
 DirBase: 1ebf2000 ObjectTable: ffffc00afbb466c0 HandleCount: 90.
 Image: bash.exe

0: kd> dt _EPROCESS ffffa90bc17b9340
nt!_EPROCESS
 …
 +0x6d4 Minimal : 0y0

I/O Pattern

In the Procmon output, we see two CREATE operations. The first CREATE opens the file without any permissions and the second opens the same file with a number of permissions. As it turns out, the two CREATE operations are very much related and not in an obvious way.

I used WinDBG to set a conditional breakpoint in my minifilter PreCreate operation:

bp Myfilter!PreCreate ".if (ffffa90bbef49080== $proc) {} .else {gc}"

The file object in the first CREATE operation looks completely normal. This is a relative open to a directory that has already been opened (my Procmon filter excluded this original open).

1: kd> dx -r1 (*((FLTMGR!_FILE_OBJECT *)0xffffa90bc183b530))
(*((FLTMGR!_FILE_OBJECT *)0xffffa90bc183b530)) [Type: _FILE_OBJECT]
 [+0x000] Type : 5 [Type: short]
 [+0x002] Size : 216 [Type: short]
 [+0x008] DeviceObject : 0xffffa90bbe7c8b50 : Device for "\Driver\volmgr" [Type: _DEVICE_OBJECT *]
 [+0x010] Vpb : 0x0 [Type: _VPB *]
 [+0x018] FsContext : 0x0 [Type: void *]
 [+0x020] FsContext2 : 0x0 [Type: void *]
 [+0x028] SectionObjectPointer : 0x0 [Type: _SECTION_OBJECT_POINTERS *]
 [+0x030] PrivateCacheMap : 0x0 [Type: void *]
 [+0x038] FinalStatus : 0 [Type: long]
 [+0x040] RelatedFileObject : 0xffffa90bc19e3df0 [Type: _FILE_OBJECT *]
 [+0x048] LockOperation : 0x0 [Type: unsigned char]
 [+0x049] DeletePending : 0x0 [Type: unsigned char]
 [+0x04a] ReadAccess : 0x0 [Type: unsigned char]
 [+0x04b] WriteAccess : 0x0 [Type: unsigned char]
 [+0x04c] DeleteAccess : 0x0 [Type: unsigned char]
 [+0x04d] SharedRead : 0x0 [Type: unsigned char]
 [+0x04e] SharedWrite : 0x0 [Type: unsigned char]
 [+0x04f] SharedDelete : 0x0 [Type: unsigned char]
 [+0x050] Flags : 0x20002 [Type: unsigned long]
 [+0x058] FileName : "a.txt" [Type: _UNICODE_STRING]
 [+0x068] CurrentByteOffset : {0} [Type: _LARGE_INTEGER]
 [+0x070] Waiters : 0x0 [Type: unsigned long]
 [+0x074] Busy : 0x0 [Type: unsigned long]
 [+0x078] LastLock : 0x0 [Type: void *]
 [+0x080] Lock [Type: _KEVENT]
 [+0x098] Event [Type: _KEVENT]
 [+0x0b0] CompletionContext : 0x0 [Type: _IO_COMPLETION_CONTEXT *]
 [+0x0b8] IrpListLock : 0x0 [Type: unsigned __int64]
 [+0x0c0] IrpList [Type: _LIST_ENTRY]
 [+0x0d0] FileObjectExtension : 0x0 [Type: void *]

The file object in the second CREATE is also a relative open, but relative to the file itself!

0: kd> dx -r1 (*((FLTMGR!_FILE_OBJECT *)0xffffa90bbe60bef0))
(*((FLTMGR!_FILE_OBJECT *)0xffffa90bbe60bef0)) [Type: _FILE_OBJECT]
 [+0x000] Type : 5 [Type: short]
 [+0x002] Size : 216 [Type: short]
 [+0x008] DeviceObject : 0xffffa90bbe7c8b50 : Device for "\Driver\volmgr" [Type: _DEVICE_OBJECT *]
 [+0x010] Vpb : 0x0 [Type: _VPB *]
 [+0x018] FsContext : 0x0 [Type: void *]
 [+0x020] FsContext2 : 0x0 [Type: void *]
 [+0x028] SectionObjectPointer : 0x0 [Type: _SECTION_OBJECT_POINTERS *]
 [+0x030] PrivateCacheMap : 0x0 [Type: void *]
 [+0x038] FinalStatus : 0 [Type: long]
 [+0x040] RelatedFileObject : 0xffffa90bc183b530 [Type: _FILE_OBJECT *] <-- Hey look, it's the first file object!
 [+0x048] LockOperation : 0x0 [Type: unsigned char]
 [+0x049] DeletePending : 0x0 [Type: unsigned char]
 [+0x04a] ReadAccess : 0x0 [Type: unsigned char]
 [+0x04b] WriteAccess : 0x0 [Type: unsigned char]
 [+0x04c] DeleteAccess : 0x0 [Type: unsigned char]
 [+0x04d] SharedRead : 0x0 [Type: unsigned char]
 [+0x04e] SharedWrite : 0x0 [Type: unsigned char]
 [+0x04f] SharedDelete : 0x0 [Type: unsigned char]
 [+0x050] Flags : 0x20002 [Type: unsigned long]
 [+0x058] FileName : "" [Type: _UNICODE_STRING]
 [+0x068] CurrentByteOffset : {0} [Type: _LARGE_INTEGER]
 [+0x070] Waiters : 0x0 [Type: unsigned long]
 [+0x074] Busy : 0x0 [Type: unsigned long]
 [+0x078] LastLock : 0x0 [Type: void *]
 [+0x080] Lock [Type: _KEVENT]
 [+0x098] Event [Type: _KEVENT]
 [+0x0b0] CompletionContext : 0x0 [Type: _IO_COMPLETION_CONTEXT *]
 [+0x0b8] IrpListLock : 0x0 [Type: unsigned __int64]
 [+0x0c0] IrpList [Type: _LIST_ENTRY]
 [+0x0d0] FileObjectExtension : 0x0 [Type: void *]

It is perfectly legal to open a file relative to itself, but it’s also easy to mistake this pattern for a volume open. Alex Carp has a great article that helped me understand the pattern a little better. An excerpt:

reopen (FileObject->FileName is empty (Length == 0 and Buffer == NULL) but RelatedFileObject is not null). This is used when the caller wants to open a new handle to an existing FILE_OBJECT. This is not the same as opening a new handle to the existing FILE_OBJECT (duplicating the handle) because the end result of this is to open a new FILE_OBJECT for the same underlying stream, and the two FILE_OBJECTs are not linked in any other way (for example, a filter might want to open its own handle to a user file without bothering to figure out if the user has enough access (if the minifilter might want to write to the file and the original handle didn’t allow it then duplicating the handle is more complicated) or without interfering with the additional information stored in the FILE_OBJECT (like the current pointer position) and so on. According to the FASTFAT source code this should work for volume opens as well. This actually happens occasionally so filters should be prepared to deal with it.

Conclusion

I’m happy that things look fairly normal. The WSL read pattern is a little strange, and may require some tweaking in a minifilter that tries to detect volume opens.

Posted in Debugging | Leave a comment

Why does Notepad++ hang when I open it?

I am generally happy with Notepad++ but every now and then I find that it takes a long time to open. Yesterday, I started seeing a 20-second hang and after opening Notepad++ a few times, I became frustrated enough to troubleshoot the problem. The good news is that it took about 10 minutes to figure out the problem, which is a good ROI — once I open Notepad++ about 30 more times, I will have made back the time spent fixing it.

You can read the basics of using Windows Performance Recorder (WPR) and Windows Performance Analyzer (WPA) for wait analysis here: Why does Explorer hang when I right-click on a folder? If you’ve never done this before, it may take a couple of hours to get everything set up but it’s a one-time task.

Let’s pick up after I’ve already collected the trace and opened it in WPA. My first step is to look at the System Activity\UI Delays graph, which shows that Notepad++ is failing to check for Window messages for 21.9 seconds. That matches the delay I’m seeing in opening it.

notepad++_hang_ui_delay

Next, I load up the Computation\CPU Usage (Precise) graph, since it has information on every context switch.  From the UI Delays graph, I know that the hanging UI thread ID is 94092 so I set up my table as below.

notepad++_hang_cpu_precise_startIn the table, we see that the total wait time for the thread is about what we expect – 21 seconds. The longest wait is about 1 second. Next, I drill down into the thread, each time following the stack with the longest total wait time, eventually stopping when I find something “interesting”:

notepad++_hang_cpu_precise_borneo3

Above, we see XMLTools-borneo3.dll is calling the WinSock select function. MSDN says that “The select function determines the status of one or more sockets, waiting if necessary, to perform synchronous I/O.” It does appear that the function is waiting, since we see the call to WaitForSingleObject.

The other strange thing is that borneo3 is the name of my home machine. A quick look at the System Activity\Images table shows me that XMLTools-borneo3.dll is being loaded from the Notepad++ plugins directory.

At this point, I connect the dots. I use OneDrive to sync my tools across different machines. Evidently, it had a conflict when syncing and backed up the XMLTools.dll to a new name.

Once I deleted the XMLTools-borneo3.dll, the hang during open went away. Of course, it’s not clear exactly what the problem is or why it doesn’t happen with the original XMLTools.dll but as the gambler says, you gotta know when to fold them. Back to work!

Posted in Hang, Performance, Troubleshooting | Leave a comment

Symbols Do Not Load in Windows Performance Analyzer (WPA)

The Windows Performance Toolkit (WPT) has been updated with the release of Windows 10 and I’ve just upgraded. There seem to be some minor improvements, but I noticed pretty quickly that I was unable to load symbols in Windows Performance Analyzer (WPA). This is not a new problem, but I had forgotten how to fix it. Here’s an article for future me and anyone else having this problem.

BTW, Bruce Dawson is the internet’s performance guru and has written a similar article that you may want to skip to. In this article, we conclude where his article begins.

How are Symbols Loaded?

The first symptom of this problem is that when I load symbols, WPA very quickly goes through all binaries and claims that symbol load is complete. I already have some cached symbols so I see symbols for some binaries, and ? marks for others.

A quick look at the WPA Diagnostic Console shows us the crux of the problem:

DBGHELP: SymSrv load failure: symsrv.dll

We know that applications access symbol functionality via the DbgHelp API and WPA is no exception. If we take a look using Process Monitor, we see that WPA is loading dbghelp.dll from the system directory. In turn, dbghelp.dll appears to load symsrv.dll from the same directory, which fails.

wpa_symbol_load_procmon

Fix it!

At this point, we have enough information to solve the problem. You should already have Debugging Tools for Windows installed. If not, the aforementioned article by Bruce Dawson might be a quicker fix.

The solution is simple:

  1. Figure out if you are using the x86 or x64 version of the Windows Performance Toolkit.

    This is easy on x86 builds of Windows. On x64 builds, you can check the Task Manager for the *32 tag. If it’s not there, then you’re running the x64 version.

    wpa_symbol_load_task_manager

    Note that WPT always installs to Program Files (x86) regardless of architecture.

  2. Copy the dbghelp.dll and symsrv.dll files from the correct debugger directory to the Windows Performance Toolkit directory.  On my system, the relevant directories are:
    C:\Program Files (x86)\Windows Kits\10\Debuggers\x64 and 
    C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit
  3. Restart Windows Performance Analyzer so that the correct version of dbghelp.dll is picked up.
Posted in Performance | Tagged , , , | Leave a comment

Recovering a Deleted Draft in Gmail

So, here’s a funny thing about Gmail. If you delete a regular email, it gets moved to the Trash folder. However, if you discard a draft, it just vanishes in a puff of smoke. My wife discovered this frustrating behaviour tonight after spending an hour writing a message to her sister via gmail.com. Something went wrong when she tried to insert a picture and she hit the Discard Draft button in haste.

Given that the draft is not moved into the Trash folder when this happens, there are only a few options for recovery:

1. The “Undo” text appears on-screen, but only until you click on another folder / message. It’s easy to panic and click on something else once you realize that you’ve just discarded your draft.

2. If you have another device hooked up to the same account, you can quickly put it into airplane mode and perhaps recover the draft.

3. In some cases, the back button *may* work.

None of these worked in our case and so I thought of a hail mary option:

4. Scan the browser process memory for the discarded draft. This can only work if you have not yet closed the Gmail browser tab, and if the process memory has not yet been overwritten or reclaimed. Also, there need to be some reasonably obscure words in the text you are looking for (‘bazaar’ was the winner in my wife’s draft).

There are lots of tools to read process memory; I used HxD to search the process memory for keywords from the draft.

hxd0

Actually, HxD is a little clunky since it doesn’t show you the process ID of the process you are viewing. Incidentally, you can get the process ID for a Chrome tab via the Chrome Task manager (right-click on the tab bar).

hxd1

After about 9 attempts, I found the correct chrome.exe process and found the text of the discarded draft. Of course, by this time, my wife had already introduced another mitigation:

5. Rewrite the email.

hxd2

Posted in Data Recovery | 120 Comments

Vmware-hostd Listening on HTTPS Port 443

Recently, I needed to stand up a web server on my development machine to do some testing. Unfortunately, when I tried to bind to the default HTTPS port (443), I found out that some other process on my machine was already using it.

I used SysInternals TCPView tool to figure out who was using the port:

vmware-hostd1

Okay, now what? After a little googling, it turns out that VMWare Workstation has a feature called “Shared VMS” that I have no need for.

I disabled this feature under VMWare Workstation -> Edit -> Preferences and vmware-hostd.exe immediately stopped listening on port 443. Onwards with my testing!

vmware-hostd2

Posted in Sysinternals Tools, Troubleshooting | Leave a comment

Debugging Windows Service Startup Using Procdump

tl;dr: Sleeping and attaching a debugger? Meh. Writing copious log files? Meh. In the case of a crashing service, it’s much easier to collect the crashdump and analyze.

If you’ve spent much time developing Windows Service, you’ve probably run into the case where your service mysteriously crashes while it is starting. In cases like these, Windows isn’t always particularly helpful. For example, here’s what Windows has to say about my Crashy Service:

service_crash_eventlog

“The Crashy Service service terminated unexpectedly.” Thanks Windows!

Since the crash is happening during service startup, it’s a little tricky to attach a debugger (or downright impossible in customer environments). Instead, as an exercise, let’s use the Sysinternals procdump tool to collect the crash dump and Visual Studio 2013 to figure out what went wrong.

Here are the steps:

1. We set procdump as the system postmortem debugger using the -i flag. This allows procdump to write out the state of the process before it crashes.

service_crash_procdump

2. Now we reproduce the service startup crash. Before the service crashes, procdump will write the crashdump to the directory where we ran the -i command in the previous step.

3. (Optional) Revert the system postmortem debugger to the previous value using the procdump -u command.

service_crash_procdump_uninstall

4. Open the .dmp crashdump file in Visual Studio 2013. When I do this, I see an initial screen that tells me a bit more about the exception that led to the crash. I then have the option of digging into the crashdump via the “Debug with Native Only” action.

service_crash_vs00

service_crash_vs0

5. When I click on Debug with Native Only, I get access to the call stack as well as the exact line of code that caused the crash (give or take — nobody’s perfect!) In this case it appears that I’m trying to lower case a null string, which throws an access denied exception.

service_crash_vs1

service_crash_vs2

More info: I’m analyzing the binary on the same machine where I compiled it. In other situations, you will have to tell VS where your symbols are. Also, in general I would probably use WinDbg to analyze the crashdump as I have way more experience with it. !analyze -v is your friend.

Posted in Debugging, Sysinternals Tools | Leave a comment