Commit dc06cc9 ("mlp section")
1 parent 7da9465

1 file changed: +4 -0 lines changed

_pages/dat450/assignment2.md

Lines changed: 4 additions & 0 deletions
@@ -51,6 +51,8 @@ The relevant hyperparameters you need to take into account here are `hidden_size

**Sanity check.**

Create an untrained MLP layer and some 3-dimensional test tensor whose last dimension has the same size as `hidden_size` in your MLP. If you apply the MLP to the test tensor, the output should have the same size as the input.
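A minimal sketch of this sanity check. The `MLP` class here is only a stand-in for whatever module you built in the previous step: its layer sizes, activation, and the name `intermediate_size` are placeholder assumptions, not the assignment's specification.

```python
import torch
from torch import nn

class MLP(nn.Module):
    """Placeholder MLP: two linear layers around a nonlinearity.
    The exact architecture should follow your own implementation."""
    def __init__(self, hidden_size, intermediate_size):
        super().__init__()
        self.up = nn.Linear(hidden_size, intermediate_size)
        self.act = nn.GELU()
        self.down = nn.Linear(intermediate_size, hidden_size)

    def forward(self, x):
        return self.down(self.act(self.up(x)))

hidden_size = 64
mlp = MLP(hidden_size, 4 * hidden_size)

# A 3-dimensional test tensor: (batch, sequence length, hidden_size).
x = torch.randn(2, 5, hidden_size)
out = mlp(x)
assert out.shape == x.shape  # the MLP preserves the input shape
```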
### Normalization

To stabilize gradients during training, deep learning models with many layers often include some *normalization* (such as batch normalization or layer normalization). Transformers typically include normalization layers at several places in the stack.
@@ -63,6 +65,8 @@ If you want to make your own layer, the PyTorch documentation shows the formula
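If you do implement the layer yourself, it might look roughly like the sketch below, which follows the formula in the PyTorch `LayerNorm` documentation (normalize over the last dimension, then scale and shift with learned parameters). The class name and parameter names are placeholders.

```python
import torch
from torch import nn

class MyLayerNorm(nn.Module):
    """Layer normalization over the last dimension, following the
    formula in the PyTorch LayerNorm documentation."""
    def __init__(self, hidden_size, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(hidden_size))   # scale
        self.beta = nn.Parameter(torch.zeros(hidden_size))   # shift
        self.eps = eps

    def forward(self, x):
        mean = x.mean(dim=-1, keepdim=True)
        # Biased variance, as used by nn.LayerNorm.
        var = x.var(dim=-1, keepdim=True, unbiased=False)
        return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta

# With freshly initialized parameters it should agree with the built-in layer:
x = torch.randn(2, 5, 8)
torch.testing.assert_close(MyLayerNorm(8)(x), nn.LayerNorm(8)(x))
```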

**Sanity check.**

You can test this in the same way as you tested the MLP previously.
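Concretely, the same shape check might look as follows; `nn.LayerNorm` stands in here for your own normalization layer, and the tensor sizes are arbitrary choices.

```python
import torch
from torch import nn

hidden_size = 64
norm = nn.LayerNorm(hidden_size)  # or your own normalization layer

# Same test as for the MLP: a 3-dimensional tensor whose last
# dimension matches hidden_size.
x = torch.randn(2, 5, hidden_size)
out = norm(x)
assert out.shape == x.shape  # shape is preserved

# After layer normalization, each position has (near-)zero mean.
assert torch.allclose(out.mean(dim=-1), torch.zeros(2, 5), atol=1e-4)
```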
### Multi-head attention

Let's take the trickiest part first!
