
Commit aa9ac11

rework slides so that we do not start with optimiser (SGD) straight away
1 parent 2a379bf commit aa9ac11

File tree

1 file changed: +29 −15 lines


slides/slides.qmd

Lines changed: 29 additions & 15 deletions
@@ -113,14 +113,29 @@ Helping Today:
 
 # Part 1: Neural-network basics -- and fun applications.
 
+## Fitting a straight line I {.smaller}
 
-## Stochastic gradient descent (SGD)
+- Consider the data:
+
+| $x_{i}$ | $y_{i}$ |
+|:--------:|:-------:|
+| 1.0 | 2.1 |
+| 2.0 | 3.9 |
+| 3.0 | 6.2 |
+
+- Wish to fit a function to the above data.
+$$f(x) = mx + c$$
+
+- When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
 
-- Generally speaking, most neural networks are fit/trained using SGD (or some variant of it).
+## Fitting a straight line II - SGD
+
+- Simple problems like the previous one can be solved analytically.
+- Generally speaking, most neural networks are fit/trained using Stochastic Gradient Descent (SGD) - or some variant of it.
 - To understand how one might fit a function with SGD, let's start with a straight line: $$y=mx+c$$
 
 
-## Fitting a straight line with SGD I {.smaller}
+## Fitting a straight line III - SGD {.smaller}
 
 - **Question**---when we differentiate a function, what do we get?
 
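The reworked slides note that a fit this small can be solved analytically before SGD is introduced. As an editor's illustration (not part of the commit), a minimal sketch of the closed-form least-squares fit of $y = mx + c$ to the slide's toy table:

```python
# Closed-form least-squares fit of y = m*x + c to the slide's toy data.
# No SGD needed for a problem this small.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Slope: covariance(x, y) / variance(x); intercept recovered from the means.
m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
c = y_mean - m * x_mean
print(f"m = {m:.3f}, c = {c:.3f}")  # m = 2.050, c = -0.033
```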
@@ -137,7 +152,7 @@ $$\frac{dy}{dx} = m$$
 :::
 
 
-## Fitting a straight line with SGD II {.smaller}
+## Fitting a straight line IV - SGD {.smaller}
 
 - **Answer**---a function's derivative gives a _vector_ which points in the direction of _steepest ascent_.
 
@@ -164,10 +179,9 @@ $$-\frac{dy}{dx}$$
 :::
 
 
-## Fitting a straight line with SGD III {.smaller}
+## Fitting a straight line V - Cost fn {.smaller}
 
-- When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
-- We therefore need a way of measuring how well a model's predictions match our observations.
+- We need a way of measuring how well a model's predictions match our observations.
 
 
 ::: {.fragment .fade-in}
@@ -201,7 +215,7 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
 :::
 
 
-## Fitting a straight line with SGD IV {.smaller}
+## Fitting a straight line VI {.smaller}
 
 :::: {.columns}
 ::: {.column width="45%"}
@@ -210,18 +224,18 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
 
 - Data: \ $\{x_{i}, y_{i}\}$
 
-- Loss: \ $\frac{1}{n}\sum_{i=1}^{n}(y_{i} - x_{i})^{2}$
-
-:::
-::: {.column width="55%"}
-
-$$
+- Loss fn:
+$$
 \begin{align}
 L_{\text{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}\\
 &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (mx_{i} + c))^{2}
 \end{align}
 $$
+<!-- - Loss: \ $\frac{1}{n}\sum_{i=1}^{n}(y_{i} - x_{i})^{2}$ -->
 
+:::
+::: {.column width="55%"}
+![](https://images.squarespace-cdn.com/content/v1/5acbdd3a25bf024c12f4c8b4/1600368657769-5BJU5FK86VZ6UXZGRC1M/Mean+Squared+Error.png?format=2500w){width=65%}
 :::
 ::::
 
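The loss in this hunk, $L_{\text{MSE}} = \frac{1}{n}\sum_{i}(y_{i} - f(x_{i}))^{2}$, is straightforward to evaluate for a candidate line. A small sketch (editor's illustration, not part of the commit), evaluated on the slide's toy data:

```python
# MSE loss from the slide, L = (1/n) * sum_i (y_i - f(x_i))^2,
# for a candidate line f(x) = m*x + c.
def mse(m, c, xs, ys):
    """Mean squared error of y = m*x + c on the data."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
print(mse(2.0, 0.0, xs, ys))  # ~0.02 -- small, since m=2, c=0 is nearly right
```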
@@ -233,7 +247,7 @@ $$
 :::: {#placeholder}
 ::::
 
-$$m_{n + 1} = m_{n} - \frac{dL}{dm} \cdot l_{r}$$
+$$m_{t + 1} = m_{t} - \frac{dL}{dm} \cdot l_{r}$$
 
 :::: {#placeholder}
 ::::
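The update rule in this hunk, $m_{t+1} = m_{t} - \frac{dL}{dm} \cdot l_{r}$ (and its analogue for $c$), can be sketched as full-batch gradient descent on the toy data. This is an editor's illustration, not part of the commit; the learning rate and iteration count are assumptions.

```python
# Gradient descent with the slide's update rule, m_{t+1} = m_t - dL/dm * l_r,
# applied to both parameters of y = m*x + c on the toy data.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
m, c = 0.0, 0.0
lr = 0.05        # learning rate l_r (assumed value)
n = len(xs)

for _ in range(2000):  # iteration count chosen for comfortable convergence
    # Analytic gradients of L_MSE = (1/n) * sum_i (y_i - (m*x_i + c))^2.
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    dL_dm = -2.0 / n * sum(r * x for r, x in zip(residuals, xs))
    dL_dc = -2.0 / n * sum(residuals)
    m -= lr * dL_dm
    c -= lr * dL_dc

print(f"m = {m:.3f}, c = {c:.3f}")  # approaches the least-squares fit
```

Because the loss is evaluated on the whole (tiny) dataset each step, this is plain gradient descent; SGD proper would sample a subset of the data per update.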

0 commit comments