Complete system hang of AVD VM #1311

**Check List**
  - [X] I checked my issue [doesn't exist yet](https://github.com/dokan-dev/dokany/issues?utf8=%E2%9C%93&q=is%3Aissue)
  - [ ] My issue is valid with mirror default sample and not specific to my user-mode driver implementation
  - [ ] I can always reproduce the issue with the provided description below.
  - [X] I have updated Dokany to the [latest version](https://github.com/dokan-dev/dokany/releases) and have rebooted my computer afterwards.
  - [ ] I tested one of the last [snapshot](https://github.com/dokan-dev/dokany/wiki/Build#user-snapshot) from appveyor CI

**Describe the bug**
We have a customer who is using our Dokan2-based network drive on a pool of about 70 Azure Virtual Desktop machines.
Every day, 5-6 of them hang completely (the whole system stops responding), which stops about 60 people from working.
Azure support connected to a hanging system through the (virtual) serial port and created a kernel dump by forcing a BSOD (the only thing still possible).
They found an Outlook process that was trying to read from our drive and was blocking many other kernel threads.


**To Reproduce**
Steps to reproduce the behavior:

Unfortunately, we cannot reproduce the issue ourselves. So far it only happens in the customer's environment.

**Expected behavior**
System should not hang.

**Logs**
Please attach in separate files: mirror [output](https://github.com/dokan-dev/dokany/wiki/How-to-Debug-Dokan#mirrorfs), library logs and kernel [logs](https://github.com/dokan-dev/dokany/wiki/How-to-Debug-Dokan#get-logs).
In case of BSOD, please attach minidump or dump [analyze output](https://github.com/dokan-dev/dokany/wiki/How-to-Debug-Dokan#crash-report-bsod).

The memory dump is about 1 GB. I'll ask the customer for permission to upload it.

**Environment:**
- Windows version: Microsoft Windows 11 Enterprise multi-session
- Processor architecture: x64
- Dokany version: 2.3.0.1000
- Library type (Dokany/FUSE): Dokany

**Additional context**
Microsoft has already analyzed the dump, and they will work with us to solve the issue.
Their analysis can also be provided.

Mainly, I need some information about the internals of the driver so that I can analyze the dump myself.
For example, I don't see any DokanProcessAndPullEvents thread running in the dump, and I'm not sure whether this is normal (because it is currently working on a request?) or not.
Also, is there a list of the IRPs/requests that have been pulled to user land?
I know about the PendingIRP list, but I think that list contains all pending IRPs, regardless of whether they have already been pulled.

We see a lot of "No matching IRPs found for a reply" messages just before the issue occurs.
As far as I know, this means that a response to a request was delivered from user land to the driver after the IRP had already been canceled because of a timeout.
But this would mean that requests are still being pulled and executed by the user process, just too slowly, right?
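
To make clear what I mean, here is a minimal sketch of my understanding as a toy Python model. This is not Dokany code: the class, the method names, the log strings and the timeout value of 15 are all hypothetical and only illustrate the sequence "request queued, request times out and is canceled, late reply finds no matching request".

```python
"""Toy model of a reply arriving after its request has timed out.

NOT Dokany code: every name and value here is hypothetical.
"""


class ToyDriver:
    def __init__(self, timeout):
        self.timeout = timeout
        # serial -> deadline. Holds every pending request, whether or not
        # it has already been pulled by the user-mode process.
        self.pending = {}
        self.log = []

    def queue_request(self, serial, now):
        """Register a new request with a deadline of now + timeout."""
        self.pending[serial] = now + self.timeout

    def expire(self, now):
        """Cancel every request whose deadline has passed."""
        for serial in [s for s, d in self.pending.items() if d <= now]:
            del self.pending[serial]
            self.log.append(f"request {serial} timed out and was canceled")

    def complete(self, serial, now):
        """Handle a reply from user land; return True if a request matched."""
        self.expire(now)
        if serial not in self.pending:
            self.log.append(f"no matching request for reply {serial}")
            return False
        del self.pending[serial]
        return True


driver = ToyDriver(timeout=15)  # arbitrary value for the illustration
driver.queue_request(1, now=0)
driver.queue_request(2, now=0)
fast = driver.complete(1, now=5)   # reply arrives within the timeout
slow = driver.complete(2, now=20)  # reply arrives after the timeout
print(fast, slow)
print(driver.log)
```

In this model the second reply is still produced by the user process, so the process is alive and working, but the reply arrives too late to match anything. That is the situation I suspect in the customer's environment.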


