Commit f667f03

Remove cudnn/README.md.
And move its contents into its rustdocs. I also did a light line edit of the text while moving it, to improve readability.
1 parent 5982a89 commit f667f03

File tree

2 files changed: +130 −89 lines changed

crates/cudnn/README.md

Lines changed: 0 additions & 88 deletions
This file was deleted.

crates/cudnn/src/lib.rs

Lines changed: 130 additions & 1 deletion
@@ -1,5 +1,134 @@
//! # cudnn
//! Type-safe cuDNN wrapper for the Rust programming language.
//!
//! ## Project status
//!
//! The current version of cuDNN targeted by this wrapper is 8.3.2. You can refer to the official
//! [release notes] and to the [support matrix] by NVIDIA.
//!
//! [release notes]: https://docs.nvidia.com/deeplearning/cudnn/release-notes/index.html
//! [support matrix]: https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html
//!
//! The legacy API is mostly complete and usable, but the backend API is still a work in progress
//! and its usage is discouraged. Both APIs are under active development, so expect bugs and
//! occasional breaking changes while using this crate.
//!
//! The project is part of the Rust CUDA ecosystem and is actively maintained by
//! [frjnn](https://github.com/frjnn).
//!
//! ## Primer
//!
//! What follows is a quick overview of the key concepts of the API, intended as a handbook for
//! users of the crate. It is not the full documentation: each wrapped struct, enum and function
//! has its own docs. For a deeper view, you should refer both to the docs of each item and to the
//! [official ones] by NVIDIA. Furthermore, if you are new to cuDNN we strongly suggest reading
//! the [official developer guide].
//!
//! [official ones]: https://docs.nvidia.com/deeplearning/cudnn/api/index.html
//! [official developer guide]: https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html#overview
//!
//! ### Device buffers
//!
//! This crate is built around [`cust`](https://docs.rs/cust/latest/cust/index.html), the core
//! wrapper for interfacing with the CUDA driver API in the Rust CUDA ecosystem, which supplies
//! the device buffers that cuDNN operations read from and write to.
//!
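//! As an illustrative sketch (not part of the original docs), a device buffer can be allocated
//! and filled from host data via `cust`. This assumes a CUDA-capable device is available and
//! uses the `cust::quick_init` and `DeviceBuffer::from_slice` APIs:
//!
//! ```rust
//! use cust::memory::DeviceBuffer;
//!
//! // Initialize the CUDA driver API and keep the returned context alive.
//! let _ctx = cust::quick_init().unwrap();
//!
//! // Copy 100 f32 values from the host into a newly allocated device buffer.
//! let buffer = DeviceBuffer::from_slice(&[0.0_f32; 100]).unwrap();
//! ```
//!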
//! ### cuDNN statuses and Result
//!
//! All cuDNN library functions return their status. This crate uses
//! [`Result`](https://doc.rust-lang.org/std/result/enum.Result.html) to surface those statuses
//! through a leaner, idiomatic and easier to manage API.
//!
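//! This means cuDNN status codes can be handled with the usual `?` operator. A minimal sketch,
//! assuming the crate's error type implements `std::error::Error`:
//!
//! ```rust
//! use cudnn::CudnnContext;
//!
//! fn create_context() -> Result<CudnnContext, Box<dyn std::error::Error>> {
//!     // Any failing cuDNN status surfaces here as an `Err` and is propagated by `?`.
//!     let ctx = CudnnContext::new()?;
//!     Ok(ctx)
//! }
//! ```
//!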
//! ### cuDNN handles and RAII
//!
//! The main entry point of the cuDNN library is the `CudnnContext` struct. This handle is tied to
//! a device and is explicitly passed to every subsequent library function that operates on GPU
//! data. It manages resource allocations both on the host and the device and takes care of the
//! synchronization of all the cuDNN primitives.
//!
//! The handles, and the other cuDNN structs wrapped by this crate, implement the
//! [`Drop`](https://doc.rust-lang.org/std/ops/trait.Drop.html) trait, which implicitly calls
//! their destructors on the cuDNN side when they go out of scope.
//!
//! cuDNN contexts can be created as shown in the following snippet:
//!
//! ```rust
//! use cudnn::CudnnContext;
//!
//! let ctx = CudnnContext::new().unwrap();
//! ```
//!
//! ### cuDNN data types
//!
//! In order to enforce as much type safety as possible at compile time, we shifted away from the
//! original cuDNN enumerated data types and instead opted to leverage Rust's generics. In
//! practice, this means that specifying the data type of a cuDNN tensor descriptor is done as
//! follows:
//!
//! ```rust
//! use cudnn::{CudnnContext, TensorDescriptor};
//!
//! let ctx = CudnnContext::new().unwrap();
//!
//! let shape = &[5, 5, 10, 25];
//! let strides = &[1250, 250, 25, 1];
//!
//! // f32 tensor
//! let desc = TensorDescriptor::<f32>::new_strides(shape, strides).unwrap();
//! ```
//!
//! This API also allows for using Rust's own types as cuDNN data types, which we see as a
//! desirable property.
//!
//! Safely manipulating cuDNN data types that do not have any such direct match, such as
//! vectorized ones, whilst still performing compile-time compatibility checks can be done as
//! follows:
//!
//! ```rust
//! use cudnn::{CudnnContext, TensorDescriptor, Vec4};
//!
//! let ctx = CudnnContext::new().unwrap();
//!
//! let shape = &[4, 32, 32, 32];
//!
//! // in cuDNN this is equal to the INT8x4 data type and CUDNN_TENSOR_NCHW_VECT_C format
//! let desc = TensorDescriptor::<i8>::new_vectorized::<Vec4>(shape).unwrap();
//! ```
//!
//! The previous tensor descriptor can be used together with an `i8` device buffer, and cuDNN
//! will see it as a tensor of `CUDNN_TENSOR_NCHW_VECT_C` format and `CUDNN_DATA_INT8x4` data
//! type.
//!
//! Currently this crate does not support the `f16` and `bf16` data types.
//!
//! ### cuDNN tensor formats
//!
//! We decided not to check tensor format configurations at compile time, because doing so would
//! be too strong a requirement. As a consequence, should you mess up, the program will fail at
//! run time. A proper understanding of the cuDNN API mechanics is thus fundamental to properly
//! use this crate.
//!
//! You can refer to this [extract] from the cuDNN developer guide to learn more about tensor
//! formats.
//!
//! [extract]: https://docs.nvidia.com/deeplearning/cudnn/developer-guide/index.html#data-layout-formats
//!
//! We split the original cuDNN tensor format enum, which has 3 variants, into two parts: the
//! `ScalarC` enum and the `TensorFormat::NchwVectC` enum variant. The former stands for "scalar
//! channel" and encapsulates the `Nchw` and `Nhwc` formats. Both scalar channel formats can be
//! converted to the `TensorFormat` enum with
//! [`.into()`](https://doc.rust-lang.org/std/convert/trait.Into.html).
//!
//! ```rust
//! use cudnn::{ScalarC, TensorFormat};
//!
//! let sc_fmt = ScalarC::Nchw;
//!
//! let vc_fmt = TensorFormat::NchwVectC;
//!
//! let sc_to_tf: TensorFormat = sc_fmt.into();
//! ```
#![deny(rustdoc::broken_intra_doc_links)]
-#[doc = include_str!("../README.md")]

mod activation;
mod attention;
mod backend;
