Skip to content

Commit 92cae7e

Browse files
committed
swiglu
1 parent 09c9fb1 commit 92cae7e

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

_pages/dat450/assignment2.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ The figure below shows the design of the OLMo 2 Transformer. We will reimplement
4545

4646
Olmo 2 uses an MLP architecture called SwiGLU, which was introduced in [this paper](https://arxiv.org/pdf/2002.05202). (In the paper, this type of network is referred to as FFN<sub>SwiGLU</sub>, described on page 2, Equation 6. Swish<sub>1</sub> corresponds to PyTorch's [SiLU](https://docs.pytorch.org/docs/stable/generated/torch.nn.SiLU.html) activation.) The figure below shows the architecture visually.
4747

48-
<img src="https://raw.githubusercontent.com/ricj/dsai-nlp.github.io/refs/heads/master/_pages/dat450/olmo2_overview.svg" alt="Olmo2 overview" style="width:10%; height:auto;">
48+
<img src="https://raw.githubusercontent.com/ricj/dsai-nlp.github.io/refs/heads/master/_pages/dat450/swiglu.svg" alt="SwiGLU" style="width:10%; height:auto;">
4949

5050
**Sanity check.**
5151

0 commit comments

Comments
 (0)