@@ -113,14 +113,29 @@ Helping Today:
113113
114114# Part 1: Neural-network basics -- and fun applications.
115115
116+ ## Fitting a straight line I {.smaller}
116117
117- ## Stochastic gradient descent (SGD)
118+ - Consider the data:
119+
120+ | $x_{i}$ | $y_{i}$ |
121+ | :--------:| :-------:|
122+ | 1.0 | 2.1 |
123+ | 2.0 | 3.9 |
124+ | 3.0 | 6.2 |
125+
126+ - We wish to fit a function to the above data:
127+ $$ f(x) = mx + c $$
128+
129+ - When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
118130
119- - Generally speaking, most neural networks are fit/trained using SGD (or some variant of it).
131+ ## Fitting a straight line II - SGD
132+
133+ - Simple problems like the previous one can be solved analytically.
134+ - Generally speaking, most neural networks are fit/trained using Stochastic Gradient Descent (SGD) - or some variant of it.
120135- To understand how one might fit a function with SGD, let's start with a straight line: $$ y=mx+c $$
121136
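As a point of comparison for the SGD slides that follow, here is a minimal sketch of the analytic route, assuming the ordinary least-squares closed form for a straight line. The function name `fit_line_analytically` is hypothetical and not part of the slides; the data are the three points from the table above.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def fit_line_analytically(xs, ys):
    """Closed-form least-squares fit of f(x) = m*x + c (hypothetical helper)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    m = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    c = y_mean - m * x_mean
    return m, c


m, c = fit_line_analytically(xs, ys)
print(f"m = {m:.3f}, c = {c:.3f}")
```

For this data the fit comes out at roughly $m \approx 2.05$ and $c \approx -0.03$, which gives a reference value to check an iterative fit against.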
122137
123- ## Fitting a straight line with SGD I {.smaller}
138+ ## Fitting a straight line III - SGD {.smaller}
124139
125140- **Question** ---when we differentiate a function, what do we get?
126141
@@ -137,7 +152,7 @@ $$\frac{dy}{dx} = m$$
137152:::
138153
139154
140- ## Fitting a straight line with SGD II {.smaller}
155+ ## Fitting a straight line IV - SGD {.smaller}
141156
142157- **Answer** ---a function's derivative gives a _vector_ which points in the direction of _steepest ascent_.
143158
@@ -164,10 +179,9 @@ $$-\frac{dy}{dx}$$
164179:::
165180
166181
167- ## Fitting a straight line with SGD III {.smaller}
182+ ## Fitting a straight line V - Cost fn {.smaller}
168183
169- - When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
170- - We therefore need a way of measuring how well a model's predictions match our observations.
184+ - We need a way of measuring how well a model's predictions match our observations.
171185
172186
173187::: {.fragment .fade-in}
@@ -201,7 +215,7 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
201215:::
202216
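The mean-squared-error loss above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `mse_loss` and the trial values `m = 2.0`, `c = 0.0` are assumptions, and the data are the three points from the first slide.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def mse_loss(m, c, xs, ys):
    """Mean squared error between observations y_i and predictions m*x_i + c."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


# Trial parameters (assumed for illustration): residuals are 0.1, -0.1, 0.2.
loss = mse_loss(2.0, 0.0, xs, ys)
print(f"L_MSE = {loss:.4f}")
```

A smaller value means the line's predictions sit closer to the observations, so fitting amounts to finding the $m$ and $c$ that minimise this number.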
203217
204- ## Fitting a straight line with SGD IV {.smaller}
218+ ## Fitting a straight line VI {.smaller}
205219
206220:::: {.columns}
207221::: {.column width="45%"}
@@ -210,18 +224,18 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
210224
211225- Data: \ $\{x_{i}, y_{i}\}$
212226
213- - Loss: \ $\frac{1}{n}\sum_ {i=1}^{n}(y_ {i} - x_ {i})^{2}$
214-
215- :::
216- ::: {.column width="55%"}
217-
218- $$
227+ - Loss fn:
228+ - $$
219229\begin{align}
220230L_{\text{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}\\
221231 &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - mx_{i} - c)^{2}
222232\end{align}
223233$$
234+ <!-- - Loss: \$\frac{1}{n}\sum_{i=1}^{n}(y_{i} - x_{i})^{2}$ -->
224235
236+ :::
237+ ::: {.column width="55%"}
238+ ![](https://images.squarespace-cdn.com/content/v1/5acbdd3a25bf024c12f4c8b4/1600368657769-5BJU5FK86VZ6UXZGRC1M/Mean+Squared+Error.png?format=2500w){width=65%}
225239:::
226240::::
227241
233247:::: {#placeholder}
234248::::
235249
236- $$ m_{n + 1} = m_{n } - \frac{dL}{dm} \cdot l_{r} $$
250+ $$ m_{t + 1} = m_{t } - \frac{dL}{dm} \cdot l_{r} $$
237251
238252:::: {#placeholder}
239253::::
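The update rule above can be tried out on the three data points from the first slide. The sketch below makes several assumptions that are not in the slides: it uses plain full-batch gradient descent rather than a stochastic variant (every step uses all the data), it applies the analogous update to $c$, and the learning rate `0.05`, the step count, and the function names are illustrative choices.

```python
# Data from the table on the "Fitting a straight line I" slide.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def gradients(m, c, xs, ys):
    """Derivatives of the MSE loss with respect to m and c."""
    n = len(xs)
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    dL_dm = -2.0 / n * sum(x * r for x, r in zip(xs, residuals))
    dL_dc = -2.0 / n * sum(residuals)
    return dL_dm, dL_dc


def fit_line_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Full-batch gradient descent (assumed settings, not from the slides)."""
    m, c = 0.0, 0.0  # arbitrary starting guess
    for _ in range(steps):
        dL_dm, dL_dc = gradients(m, c, xs, ys)
        # m_{t+1} = m_t - dL/dm * lr, and the same form for c.
        m -= dL_dm * lr
        c -= dL_dc * lr
    return m, c


m, c = fit_line_gradient_descent(xs, ys)
print(f"m = {m:.3f}, c = {c:.3f}")
```

With these settings the parameters settle near the least-squares values for this data (about $m \approx 2.05$, $c \approx -0.03$). A stochastic version would estimate the gradient from a random sample or mini-batch at each step instead of the full data set.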