@@ -133,14 +133,29 @@ Helping Today:
133133
134134# Part 1: Neural-network basics -- and fun applications.
135135
136+ ## Fitting a straight line I {.smaller}
136137
137- ## Stochastic gradient descent (SGD)
138+ - Consider the data:
139+
140+ | $x_{i}$ | $y_{i}$ |
141+ | :--------:| :-------:|
142+ | 1.0 | 2.1 |
143+ | 2.0 | 3.9 |
144+ | 3.0 | 6.2 |
145+
146+ - Wish to fit a function to the above data.
147+ $$ f(x) = mx + c $$
148+
149+ - When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
138150
139- - Generally speaking, most neural networks are fit/trained using SGD (or some variant of it).
151+ ## Fitting a straight line II - SGD
152+
153+ - Simple problems like the previous one can be solved analytically.
154+ - Generally speaking, most neural networks are fit/trained using Stochastic Gradient Descent (SGD) - or some variant of it.
140155- To understand how one might fit a function with SGD, let's start with a straight line: $$ y=mx+c $$
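
For reference, the analytic route can be sketched for the three data points in the table on the first slide. This is a minimal illustration using the standard closed-form least-squares estimates for a straight line. The function name `fit_line_analytically` is an assumption made for this example, not something defined in the slides.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def fit_line_analytically(xs, ys):
    """Closed-form least-squares estimates of the slope m and intercept c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_mean - m * x_mean
    return m, c


m, c = fit_line_analytically(xs, ys)
print(f"m = {m:.3f}, c = {c:.3f}")
```

For this data the slope comes out at roughly 2.05 and the intercept at roughly -0.03, which gives a target to compare an SGD fit against.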
141156
142157
143- ## Fitting a straight line with SGD I {.smaller}
158+ ## Fitting a straight line III - SGD {.smaller}
144159
145160- **Question**---when we differentiate a function, what do we get?
146161
@@ -157,7 +172,7 @@ $$\frac{dy}{dx} = m$$
157172:::
158173
159174
160- ## Fitting a straight line with SGD II {.smaller}
175+ ## Fitting a straight line IV - SGD {.smaller}
161176
162177- **Answer**---a function's derivative gives a _vector_ which points in the direction of _steepest ascent_.
163178
@@ -184,10 +199,9 @@ $$-\frac{dy}{dx}$$
184199:::
185200
186201
187- ## Fitting a straight line with SGD III {.smaller}
202+ ## Fitting a straight line V - Cost fn {.smaller}
188203
189- - When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
190- - We therefore need a way of measuring how well a model's predictions match our observations.
204+ - We need a way of measuring how well a model's predictions match our observations.
191205
192206
193207::: {.fragment .fade-in}
@@ -221,7 +235,7 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
221235:::
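
As a rough sketch, the mean-squared-error loss can be evaluated for the straight-line model $f(x) = mx + c$ on the three data points from the first slide. The helper name `mse_loss` and the trial values `m = 2.0`, `c = 0.0` are assumptions chosen purely for illustration.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def mse_loss(m, c, xs, ys):
    """Mean squared error between observations y_i and predictions m*x_i + c."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


# Trial parameters (hypothetical guess): residuals are 0.1, -0.1 and 0.2.
loss = mse_loss(2.0, 0.0, xs, ys)
print(f"L_MSE = {loss:.4f}")
```

With these trial parameters the squared residuals sum to 0.06, so the loss is about 0.02. A better choice of $m$ and $c$ would drive it lower.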
222236
223237
224- ## Fitting a straight line with SGD IV {.smaller}
238+ ## Fitting a straight line VI {.smaller}
225239
226240:::: {.columns}
227241::: {.column width="45%"}
@@ -230,18 +244,18 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
230244
231245- Data: \ $\{x_{i}, y_{i}\}$
232246
233- - Loss: \ $\frac{1}{n}\sum_ {i=1}^{n}(y_ {i} - x_ {i})^{2}$
234-
235- :::
236- ::: {.column width="55%"}
237-
238- $$
247+ - Loss fn:
248+ - $$
239249\begin{align}
240250L_{\text{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}\\
241251 &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (mx_{i} + c))^{2}
242252\end{align}
243253$$
254+ <!-- - Loss: \$\frac{1}{n}\sum_{i=1}^{n}(y_{i} - x_{i})^{2}$ -->
244255
256+ :::
257+ ::: {.column width="55%"}
258+ ![](https://images.squarespace-cdn.com/content/v1/5acbdd3a25bf024c12f4c8b4/1600368657769-5BJU5FK86VZ6UXZGRC1M/Mean+Squared+Error.png?format=2500w){width=65%}
245259:::
246260::::
247261
253267:::: {#placeholder}
254268::::
255269
256- $$ m_{n + 1} = m_{n } - \frac{dL}{dm} \cdot l_{r} $$
270+ $$ m_{t + 1} = m_{t } - \frac{dL}{dm} \cdot l_{r} $$
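
The update rule can be sketched in plain Python for the three data points from the first slide. The assumptions here are:

- The gradients of the MSE loss are written out by hand, with the same rule applied to $c$ as to $m$.
- Every step uses all three points, so this is plain (full-batch) gradient descent rather than a stochastic variant.
- The learning rate `l_r = 0.05`, the step count, and the zero starting values are illustrative choices.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def gradients(m, c, xs, ys):
    """Hand-derived gradients of the MSE loss with respect to m and c."""
    n = len(xs)
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    dL_dm = -2.0 / n * sum(x * r for x, r in zip(xs, residuals))
    dL_dc = -2.0 / n * sum(residuals)
    return dL_dm, dL_dc


def gradient_descent(xs, ys, l_r=0.05, n_steps=5000):
    """Repeatedly apply m <- m - dL/dm * l_r (and likewise for c)."""
    m, c = 0.0, 0.0  # arbitrary starting guess
    for _ in range(n_steps):
        dL_dm, dL_dc = gradients(m, c, xs, ys)
        m = m - dL_dm * l_r
        c = c - dL_dc * l_r
    return m, c


m_fit, c_fit = gradient_descent(xs, ys)
print(f"m = {m_fit:.3f}, c = {c_fit:.3f}")
```

With these settings the parameters settle near the analytic least-squares values (a slope of about 2.05 and an intercept of about -0.03). A much larger learning rate would make the updates diverge.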
257271
258272:::: {#placeholder}
259273::::