Merged
35 changes: 6 additions & 29 deletions docs/admin/administration.md
@@ -642,41 +642,18 @@ This LED activity visually indicates a fault and that the device needs to be rep
longer in use by DAOS.
The LED of the VMD device will remain in this state until replaced by a new device.

!!! note
Full NVMe hot plug capability will be available and supported in DAOS 2.6 release.
Use is currently intended for testing only and is not supported for production.

- To use a newly added (hot-inserted) SSD it needs to be unbound from the kernel driver
and bound instead to a user-space driver so that the device can be used with DAOS.

To rebind a SSD on a single host, run the following command (replace SSD PCI address and
hostname with appropriate values):
- If VMD is not enabled, then in order to use a newly added (hot-inserted) SSD it needs to be
Contributor
It could be helpful to have more details on this procedure, such as how to use the SPDK scripts to perform this task.

Contributor Author
But you don't need to use SPDK, only the DMG command mentioned (nvme-rebind). Am I misunderstanding your suggestion?

Contributor Author
Do the changes make it clearer?

Contributor
Thanks for the update, which makes it clearer to me.

unbound from the kernel driver and bound instead to a user-space driver so that the device can be
used with DAOS. To rebind an SSD on a single host, run the following command (replace SSD PCI
address and hostname with appropriate values):
```bash
$ dmg storage nvme-rebind -a 0000:84:00.0 -l wolf-167
Command completed successfully
```

The device will now be bound to a user-space driver (e.g. VFIO) and can be accessed by
DAOS I/O engine processes (and used in the following `dmg storage replace nvme` command
as a new device).

- Once an engine is using a newly added (hot-inserted) SSD it can be added to the persistent
NVMe config (stored on SCM) so that on engine restart the new device will be used.

To update the engine's persistent NVMe config with the new SSD transport address, run the
following command (replace SSD PCI address, engine index and hostname with appropriate values):
```bash
$ dmg storage nvme-add-device -a 0000:84:00.0 -e 0 -l wolf-167
Command completed successfully
```

The optional [--tier-index|-t] command parameter can be used to specify which storage tier to
insert the SSD into, if specified then the server will attempt to insert the device into the tier
specified by the index, if not specified then the server will attempt to insert the device into
the bdev tier with the lowest index value (the first bdev tier).

The device will now be registered in the engine's persistent NVMe config so that when restarted,
the newly added SSD will be used.
DAOS I/O engine processes. Now the new device can be used in the following
`dmg storage replace nvme` command.

- Replace an excluded SSD with a New Device:
```bash
11 changes: 5 additions & 6 deletions utils/config/daos_server.yml
@@ -198,16 +198,16 @@
#socket_dir: ./.daos/daos_server
#
#
## Number of hugepages to allocate for DMA buffer memory
## Number of hugepages to allocate for DMA buffer memory (total value for all engines)
#
## Optional parameter that should only be set if overriding the automatically calculated value is #
## #necessary. Specifies the number (not size) of hugepages to allocate for use by NVMe through
## #SPDK. For optimum performance each target requires 1 GiB of hugepage space. The provided value
## Optional parameter that should only be set if overriding the automatically calculated value is
## necessary. Specifies the number (not size) of hugepages to allocate for use by NVMe through
## SPDK. For optimum performance each target requires 1 GiB of hugepage space. The provided value
## should be calculated by dividing the total amount of hugepages memory required for all targets
## across all engines on a host by the system hugepage size. If not set here, the value will be
## automatically calculated based on the number of targets (using the default system hugepage size).
#
## Example: (2 engines * (16 targets/engine * 1GiB)) / 2MiB hugepage size = 16834
## Example: (2 engines * (16 targets/engine * 1GiB)) / 2MiB hugepage size = 16384
#
## default: 0
#nr_hugepages: 0
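
The corrected example value (16384) can be sanity-checked with a short calculation. The sketch below is illustrative only and is not part of this PR or of DAOS: the function name `nr_hugepages` and its parameters are hypothetical. It assumes the 1 GiB-per-target guideline and the 2 MiB hugepage size quoted in the config comment. Rounding up when the total does not divide evenly is an added assumption.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def nr_hugepages(engines, targets_per_engine,
                 hugepage_size=2 * MIB, mem_per_target=1 * GIB):
    """Return the total hugepage count for all engines on one host.

    Hypothetical helper mirroring the formula in the config comment:
    (engines * targets/engine * memory per target) / hugepage size.
    """
    if min(engines, targets_per_engine, hugepage_size, mem_per_target) <= 0:
        raise ValueError("all arguments must be positive")
    total_bytes = engines * targets_per_engine * mem_per_target
    # Ceiling division (assumption): never allocate less than required.
    return -(-total_bytes // hugepage_size)


# The example from the config comment: 2 engines, 16 targets each, 2 MiB pages.
print(nr_hugepages(2, 16))  # prints 16384
```

With 1 GiB hugepages the same host would need `nr_hugepages(2, 16, hugepage_size=GIB)`, which is 32 pages. This is consistent with the comment's point that the setting is a page count rather than a size.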
@@ -228,7 +228,6 @@
#allow_numa_imbalance: true
#
#

## Allow DAOS server to run with transparent hugepages (THP) enabled on the host machine.
#
## WARNING: Transparent hugepages can conflict with how the DAOS server uses hugepages, and enabling