Commit dc06cc9 ("mlp section")
1 parent 7da9465

1 file changed: +4 -0 lines changed

_pages/dat450/assignment2.md

Lines changed: 4 additions & 0 deletions
@@ -51,6 +51,8 @@ The relevant hyperparameters you need to take into account here are `hidden_size

**Sanity check.**

Create an untrained MLP layer and some 3-dimensional test tensor whose last dimension has the same size as `hidden_size` in your MLP. If you apply the MLP to the test tensor, the output should have the same size as the input.
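A minimal sketch of this sanity check. The `MLP` class here is only a stand-in for whatever module you built in the previous step: its layer sizes, activation, and the name `intermediate_size` are placeholder assumptions, not the assignment's specification.

```python
import torch
from torch import nn

class MLP(nn.Module):
    """Placeholder MLP: two linear layers around a nonlinearity.
    The exact architecture should follow your own implementation."""
    def __init__(self, hidden_size, intermediate_size):
        super().__init__()
        self.up = nn.Linear(hidden_size, intermediate_size)
        self.act = nn.GELU()
        self.down = nn.Linear(intermediate_size, hidden_size)

    def forward(self, x):
        return self.down(self.act(self.up(x)))

hidden_size = 64
mlp = MLP(hidden_size, 4 * hidden_size)

# A 3-dimensional test tensor: (batch, sequence length, hidden_size).
x = torch.randn(2, 5, hidden_size)
out = mlp(x)
assert out.shape == x.shape  # the MLP preserves the input shape
```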
### Normalization

To stabilize gradients during training, deep learning models with many layers often include some *normalization* (such as batch normalization or layer normalization). Transformers typically include normalization layers at several places in the stack.
@@ -63,6 +65,8 @@ If you want to make your own layer, the PyTorch documentation shows the formula
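If you do implement the layer yourself, it might look roughly like the sketch below, which follows the formula in the PyTorch `LayerNorm` documentation (normalize over the last dimension, then scale and shift with learned parameters). The class name and parameter names are placeholders.

```python
import torch
from torch import nn

class MyLayerNorm(nn.Module):
    """Layer normalization over the last dimension, following the
    formula in the PyTorch LayerNorm documentation."""
    def __init__(self, hidden_size, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(hidden_size))   # scale
        self.beta = nn.Parameter(torch.zeros(hidden_size))   # shift
        self.eps = eps

    def forward(self, x):
        mean = x.mean(dim=-1, keepdim=True)
        # Biased variance, as used by nn.LayerNorm.
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta

# With freshly initialized parameters it should agree with the built-in layer:
x = torch.randn(2, 5, 8)
torch.testing.assert_close(MyLayerNorm(8)(x), nn.LayerNorm(8)(x))
```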

**Sanity check.**

You can test this in the same way as you tested the MLP previously.
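Concretely, the same shape check might look as follows; `nn.LayerNorm` stands in here for your own normalization layer, and the tensor sizes are arbitrary choices.

```python
import torch
from torch import nn

hidden_size = 64
norm = nn.LayerNorm(hidden_size)  # or your own normalization layer

# Same test as for the MLP: a 3-dimensional tensor whose last
# dimension matches hidden_size.
x = torch.randn(2, 5, hidden_size)
out = norm(x)
assert out.shape == x.shape  # shape is preserved

# After layer normalization, each position has (near-)zero mean.
assert torch.allclose(out.mean(dim=-1), torch.zeros(2, 5), atol=1e-4)
```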
### Multi-head attention

Let's take the trickiest part first!
