11---
22title: "Introduction to Neural Networks with PyTorch"
3- subtitle: "ICCS Summer School 2024"
3+ subtitle: "ICCS Summer School 2025"
44bibliography : references.bib
55format :
66 revealjs :
@@ -22,9 +22,8 @@ authors:
2222 - name : Matt Archer
2323 affiliations : ICCS/Cambridge
2424 orcid : 0009-0002-7043-6769
25- - name : Surbhi Goel
25+ - name : Isaac Akanho
2626 affiliations : ICCS/Cambridge
27- orcid : 0009-0005-0237-756X
2827
2928revealjs-plugins :
3029 - attribution
@@ -37,19 +36,18 @@ revealjs-plugins:
3736:::: {.columns}
3837::: {.column width=50%}
3938
40- * 9:00-9:30 - NN lecture
41- * 9:30-10:30 - Teaching/Code-along
42- * 10:30-11:00 - Coffee
43- * 11:00-12:00 - Teaching/Code-along
39+ ### Wednesday
40+ * 9:30-10:00 - NN lecture
41+ * 10:00-10:30 - Teaching/Code-along
42+ * 13:30-15:00 - Teaching/Code-along
4443
45- Lunch
4644
47- * 12:00 - 13:30
45+ ### Thursday
46+
47+ * 9:30-10:30 - Teaching/Code-along
4848
4949::: {style="color: turquoise;"}
50- Helping Today:
5150
52- * Person 1 - Cambridge RSE
5351:::
5452:::
5553::::
@@ -189,39 +187,33 @@ $$-\frac{dy}{dx}$$
189187- When fitting a function, we are essentially creating a model, $f$, which describes some data, $y$.
190188- We therefore need a way of measuring how well a model's predictions match our observations.
191189
190+ ## Fitting a straight line with SGD IV {.smaller}
192191
193- ::: {.fragment .fade-in}
194192
195- :::: {.columns}
196- ::: {.column width="30%"}
193+ ![](error-line.png)
194+
195+ - We can measure the distance between $f(x_{i})$ and $y_{i}$.
196+
197+
198+ <!-- :::: {.columns} -->
199+ <!-- ::: {.column width="30%"} -->
197200
198- - Consider the data:
201+ <!-- - Consider the data:
199202
200203| $x_{i}$ | $y_{i}$ |
201204|:--------:|:-------:|
202205| 1.0 | 2.1 |
203206| 2.0 | 3.9 |
204- | 3.0 | 6.2 |
207+ | 3.0 | 6.2 | -->
205208
206- :::
207- ::: {.column width="70%"}
208- - We can measure the distance between $f(x_{i})$ and $y_{i}$.
209- - Normally we might consider the mean-squared error:
209+ ## Fitting a straight line with SGD V {.smaller}
210210
211- $$ L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2} $$
212211
213- :::
214- ::::
215-
216- :::
217-
218- ::: {.fragment .fade-in}
219- - We can differentiate the loss function w.r.t. to each parameter in the the model $f$.
220- - We can use these directions of steepest descent to iteratively 'nudge' the parameters in a direction which will reduce the loss.
221- :::
212+ <!-- ::: {.column width="70%"} -->
222213
214+ - Normally we might consider the mean-squared error:
223215
224- ## Fitting a straight line with SGD IV {.smaller}
216+ $$ L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2} $$
225217
226218:::: {.columns}
227219::: {.column width="45%"}
@@ -233,19 +225,43 @@ $$L_{\text{MSE}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_{i} - f(x_{i})\right)^{2}$$
233225- Loss: \ $\frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}$
234226
235227:::
236- ::: {.column width="55%"}
237228
229+ ::: {.column width="55%"}
230+
231+ - We can differentiate the loss function w.r.t. each parameter in the model $f$.
238232$$
239233\begin{align}
240234L_{\text{MSE}} &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - f(x_{i}))^{2}\\
241235 &= \frac{1}{n}\sum_{i=1}^{n}(y_{i} - (mx_{i} + c))^{2}
242236\end{align}
243237$$
244-
245238:::
246239::::
247240
248- ::: {.fragment .fade-in}
241+
242+ ####
243+
244+ ## Fitting a straight line with SGD VI {.smaller}
245+
246+ - Partial derivatives of the loss:
247+
248+ $$
249+ \frac{\partial L}{\partial m}
250+ \;=\;
251+ \frac{1}{n}\sum_{i=1}^{n} 2\bigl(m\,x_{i}+c-y_{i}\bigr)\,x_{i}.
252+ $$
253+
254+ $$
255+ \frac{\partial L}{\partial c}
256+ \;=\;
257+ \frac{1}{n}\sum_{i=1}^{n} 2\bigl(m\,x_{i}+c-y_{i}\bigr).
258+ $$
259+
260+ - This gradient is used to find the parameters that **minimise the loss**, thereby reducing overall error.
261+
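As a rough numerical check of the two derivatives above, the sketch below evaluates them in plain Python and compares them with central finite differences. The three data points are the ones from the small example table in the source of these slides; the function names `mse_loss` and `mse_grad`, the test point `(m0, c0)`, and the step `eps` are assumptions made for illustration.

```python
# Data points (x_i, y_i) taken from the example table in the slides.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]


def mse_loss(m, c):
    """Mean-squared error of the line f(x) = m*x + c on the data."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


def mse_grad(m, c):
    """Analytic partial derivatives (dL/dm, dL/dc) of the MSE loss."""
    n = len(xs)
    dL_dm = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / n
    dL_dc = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / n
    return dL_dm, dL_dc


# Arbitrary test point and step size (assumed values for illustration).
m0, c0, eps = 1.5, 0.2, 1e-6

# Central finite differences approximate the same derivatives numerically.
fd_dm = (mse_loss(m0 + eps, c0) - mse_loss(m0 - eps, c0)) / (2 * eps)
fd_dc = (mse_loss(m0, c0 + eps) - mse_loss(m0, c0 - eps)) / (2 * eps)

dL_dm, dL_dc = mse_grad(m0, c0)
print(f"analytic: dL/dm={dL_dm:.6f}, dL/dc={dL_dc:.6f}")
print(f"numeric:  dL/dm={fd_dm:.6f}, dL/dc={fd_dc:.6f}")
```

If the analytic and numeric values agree to several decimal places, the hand-derived expressions are very likely correct. Both derivatives are negative at this test point, which says that increasing $m$ or $c$ would reduce the loss.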
262+
263+ ## Update Rule
264+
249265- We can iteratively minimise the loss by stepping the model's parameters in the direction of steepest descent:
250266
251267::: {layout="[0.5, 1, 0.5, 1, 0.5]"}
@@ -266,7 +282,6 @@ $$c_{n + 1} = c_{n} - \frac{dL}{dc} \cdot l_{r}$$
266282:::
267283
268284- where $l_{\text{r}}$ is a small constant known as the _learning rate_.
269- :::
270285
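The update rule can be sketched in a few lines of plain Python. Note that this is full-batch gradient descent rather than the stochastic variant, since every step uses all the data points. The data are the three example points from these slides; the starting values, the learning rate `lr = 0.05`, and the number of steps are assumptions chosen for illustration.

```python
# Data points (x_i, y_i) taken from the example table in the slides.
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
n = len(xs)


def loss(m, c):
    """Mean-squared error of the line f(x) = m*x + c on the data."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


m, c = 0.0, 0.0   # assumed starting parameters
lr = 0.05         # assumed learning rate
initial_loss = loss(m, c)

for step in range(2000):
    # Gradients of the MSE loss with respect to m and c.
    dL_dm = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / n
    dL_dc = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / n
    # Step both parameters against their gradients.
    m -= lr * dL_dm
    c -= lr * dL_dc

final_loss = loss(m, c)
print(f"m={m:.4f}, c={c:.4f}")
print(f"loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

With these settings the parameters should approach the least-squares line for this data, roughly $m \approx 2.05$ and $c \approx -0.03$. A much larger learning rate would make the updates overshoot and diverge, while a much smaller one would need many more steps.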
271286
272287## Quick recap {.smaller}
@@ -305,7 +320,7 @@ $$a_{l+1} = \sigma \left( W_{l}a_{l} + b_{l} \right)$$
305320:::
306321::::
307322
308- ![](https://3b1b-posts.us-east-1.linodeobjects.com//images/topics/neural-networks.jpg){style="border-radius: 50%;" .absolute top=35% left=42.5% width=65%}
323+ ![](https://web.archive.org/web/20230105124836if_/https://3b1b-posts.us-east-1.linodeobjects.com//images/topics/neural-networks.jpg){style="border-radius: 50%;" .absolute top=35% left=42.5% width=65%}
309324
310325::: {.attribution}
311326Image source: [3Blue1Brown](https://www.3blue1brown.com/topics/neural-networks)
@@ -329,9 +344,178 @@ Image source: [3Blue1Brown](https://www.3blue1brown.com/topics/neural-networks)
329344
330345- In this workshop, we will implement some straightforward neural networks in PyTorch, and use them for different classification and regression problems.
331346- PyTorch is a deep learning framework that can be used in both Python and C++.
332- - I have never met anyone actually training models in C++; I find it a bit weird.
347+ - Other frameworks include JAX and TensorFlow; PyTorch Lightning is a higher-level wrapper built on PyTorch.
333348- See the PyTorch website: [https://pytorch.org/](https://pytorch.org/)
334349
350+ # Datasets, DataLoaders & `nn.Module`
351+
352+
353+ ---
354+
355+ ## What a `Dataset` class does
356+
357+ - Provides a **uniform API** to your data
358+ - Handles
359+   - **Loading** raw files (images, CSVs, audio …)
360+   - **Train / validation / test** split logic
361+   - **Transforms / augmentation** per item
362+   - **Item retrieval** so the rest of PyTorch can stay agnostic
363+
364+ ---
365+
366+ ## Anatomy of a custom `Dataset`
367+
368+ ```python
369+ class MyDataset(torch.utils.data.Dataset):
370+     def __init__(self, root_dir, split="train", transform=None):
371+         # 1️⃣ load or download files / labels
372+         self.paths, self.labels = load_index_file(root_dir, split)
373+         self.transform = transform  # 2️⃣ save transforms
374+ ```
375+
376+ *The constructor is where you gather file paths, download archives, read CSVs, etc.*
377+
378+ ---
379+
380+ ## `__len__` & `__getitem__`
381+
382+ ```python
383+     def __len__(self):
384+         return len(self.paths)  # total #samples
385+
386+     def __getitem__(self, idx):
387+         img = PIL.Image.open(self.paths[idx]).convert("RGB")
388+         if self.transform:  # 3️⃣ apply transforms
389+             img = self.transform(img)
390+         label = self.labels[idx]
391+         return img, label  # 4️⃣ single example
392+ ```
393+
394+ With these two methods PyTorch knows **how big** the dataset is and **how to fetch** one record.
395+
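The same pattern can be tried without any image files. The sketch below is a minimal, self-contained example: `SquaresDataset` is a hypothetical toy dataset (not part of PyTorch or of this course's code) that holds $(x, x^2)$ pairs in memory, and the scaling `transform` is likewise just an illustration.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (x_i, x_i**2)."""

    def __init__(self, n_samples=10, transform=None):
        # "Load" the data: here it is simply generated in memory.
        self.x = torch.arange(n_samples, dtype=torch.float32)
        self.y = self.x ** 2
        self.transform = transform

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.x)

    def __getitem__(self, idx):
        # Fetch one sample and optionally transform the input.
        x = self.x[idx]
        if self.transform:
            x = self.transform(x)
        return x, self.y[idx]


# The transform only rescales the input; the label is left untouched.
ds = SquaresDataset(n_samples=5, transform=lambda t: t / 10.0)
print(len(ds))
x3, y3 = ds[3]
print(x3.item(), y3.item())
```

Indexing with `ds[3]` calls `__getitem__(3)` and `len(ds)` calls `__len__()`, which is all a `DataLoader` needs from a map-style dataset.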
396+ ---
397+
398+ ## Using the custom dataset
399+
400+ ```python
401+ from torchvision import transforms
402+
403+ train_ds = MyDataset(
404+     "data/cats_vs_dogs",
405+     split="train",
406+     transform=transforms.ToTensor()
407+ )
408+ print(len(train_ds))  # e.g. ➜ 20_000
409+ img, y = train_ds[0]  # one (tensor, label) pair
410+ ```
411+
412+ ---
413+
414+ ## The **DataLoader** at a glance
415+
416+ - Wraps any `Dataset` in an **iterable**
417+ - **Batches** samples together
418+ - **Shuffles** if asked
419+ - Uses **multiprocessing** (`num_workers`) to pre‑fetch data in parallel
420+ - Yields `(batch, labels)` tuples ready to be moved to the GPU
421+
422+ ---
423+
424+ ## Typical DataLoader code
425+
426+ ```python
427+ train_loader = torch.utils.data.DataLoader(
428+     dataset=train_ds,
429+     batch_size=64,
430+     shuffle=True,
431+     num_workers=4,  # 4 CPU workers
432+ )
433+
434+ for images, labels in train_loader:
435+     ...
436+ ```
437+
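To see what the loop actually yields, here is a small self-contained sketch using `TensorDataset` with made-up data. The tensor contents, the batch size of 4, and the variable names are assumptions for illustration; `num_workers=0` is used so that loading happens in the main process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 fake samples with 3 features each, and integer labels (made-up data).
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
toy_ds = TensorDataset(features, labels)

# num_workers=0 loads in the main process, which keeps the sketch portable.
toy_loader = DataLoader(toy_ds, batch_size=4, shuffle=True, num_workers=0)

batch_shapes = []
seen_labels = []
for xb, yb in toy_loader:
    batch_shapes.append(tuple(xb.shape))
    seen_labels.extend(yb.tolist())

print(batch_shapes)         # the last batch is smaller: 10 = 4 + 4 + 2
print(sorted(seen_labels))  # every sample appears exactly once per epoch
```

The final batch holds only two samples because `drop_last` defaults to `False`. Setting `drop_last=True` would discard it, which can be useful when a model expects a fixed batch size.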
438+
439+
440+ ---
441+
442+ ## Quick networks with `nn.Sequential`
443+
444+ ```python
445+ mlp = torch.nn.Sequential(
446+     torch.nn.Linear(784, 256), torch.nn.ReLU(),
447+     torch.nn.Linear(256, 64), torch.nn.ReLU(),
448+     torch.nn.Linear(64, 10)
449+ )
450+
451+ out = mlp(torch.rand(32, 784))  # 32‑sample batch
452+ ```
453+
454+ Great for simple feed‑forward stacks when no branching logic is needed.
455+
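A quick way to sanity-check a stack like this is to inspect the output shape and count its parameters. The sketch below rebuilds the same three-layer network so that it runs on its own; the variable name `n_params` is just for illustration.

```python
import torch

mlp = torch.nn.Sequential(
    torch.nn.Linear(784, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 10),
)

out = mlp(torch.rand(32, 784))  # random 32-sample batch
n_params = sum(p.numel() for p in mlp.parameters())

print(out.shape)  # one row of 10 outputs per sample
print(n_params)   # weights + biases of the three Linear layers
print(mlp[0])     # layers can be indexed like a list
```

The parameter count follows from the layer sizes: $(784 \cdot 256 + 256) + (256 \cdot 64 + 64) + (64 \cdot 10 + 10) = 218058$. The `ReLU` layers contribute no parameters.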
456+ ---
457+
458+ ## `nn.Module` overview
459+
460+ - The **base class** for *all* neural‑network parts in PyTorch
461+ - You **sub‑class**, then implement
462+   - `__init__(self)`: declare layers
463+   - `forward(self, x)`: define the forward pass
464+
465+ ---
466+
467+ ## Declaring layers in `__init__`
468+
469+ ```python
470+ class MyCNN(torch.nn.Module):
471+     def __init__(self, num_classes=2):
472+         super().__init__()
473+         self.features = torch.nn.Sequential(
474+             torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU(),
475+             torch.nn.MaxPool2d(2),
476+             torch.nn.Conv2d(32, 64, 3, padding=1), torch.nn.ReLU(),
477+             torch.nn.MaxPool2d(2)
478+         )
479+         self.classifier = torch.nn.Linear(64 * 56 * 56, num_classes)
480+ ```
481+
482+ ---
483+
484+ ## The `forward` pass
485+
486+ ```python
487+     def forward(self, x):
488+         x = self.features(x)  # conv stack
489+         x = x.flatten(1)  # N,…
490+         x = self.classifier(x)  # logits
491+         return x
492+ ```
493+
494+ Only **forward** is needed – back‑prop is handled automatically.
495+
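The classifier's input size of `64 * 56 * 56` fits only one image resolution. The sketch below assembles the two code fragments into a single class and checks the shapes, assuming 224×224 RGB inputs (an assumption, as the slides do not state the image size). Each 3×3 convolution with `padding=1` preserves height and width, and each `MaxPool2d(2)` halves them: 224 → 112 → 56.

```python
import torch


class MyCNN(torch.nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, 3, padding=1), torch.nn.ReLU(),
            torch.nn.MaxPool2d(2),
            torch.nn.Conv2d(32, 64, 3, padding=1), torch.nn.ReLU(),
            torch.nn.MaxPool2d(2),
        )
        self.classifier = torch.nn.Linear(64 * 56 * 56, num_classes)

    def forward(self, x):
        x = self.features(x)   # conv stack
        x = x.flatten(1)       # keep the batch dimension, flatten the rest
        return self.classifier(x)


model = MyCNN()
images = torch.rand(2, 3, 224, 224)  # assumed input size: 224x224 RGB

with torch.no_grad():  # shape check only, no gradients needed
    feature_shape = tuple(model.features(images).shape)
    logits_shape = tuple(model(images).shape)

print(feature_shape)  # (batch, channels, height, width) after the conv stack
print(logits_shape)   # (batch, num_classes)
```

Feeding images of a different size would change the flattened feature length, and the `Linear` layer would raise a shape mismatch error. The classifier's input size therefore has to be recomputed whenever the input resolution changes.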
496+ ---
497+
498+ ## Calling the model ≈ calling `forward`
499+
500+ ```python
501+ model = MyCNN()
502+ logits1 = model(images)  # preferred ✔
503+ logits2 = model.forward(images)  # works, but avoid
504+ ```
505+
506+ `model(input)` internally routes to `model.forward(input)` via `__call__`.
507+
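One concrete reason to prefer `model(x)` is that `__call__` also runs any registered hooks, whereas calling `forward` directly skips them. The sketch below illustrates this with a single `Linear` layer and a forward hook that records output shapes; the layer sizes and the `calls` list are assumptions for illustration.

```python
import torch

layer = torch.nn.Linear(4, 2)
calls = []

# A forward hook runs after forward() whenever the module is invoked
# through __call__, i.e. as layer(x).
handle = layer.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)

x = torch.rand(3, 4)
layer(x)          # goes through __call__, so the hook fires
layer.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()   # detach the hook again

print(calls)      # only the first call was recorded
```

Some debugging and profiling tools rely on hooks like this, so bypassing `__call__` can cause them to miss calls without any error being raised.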
508+ ---
509+
510+ ## Key Take‑Aways
511+
512+ 1. **Dataset** = organized access to *individual* samples
513+ 2. **DataLoader** = batching, shuffling, parallel I/O
514+ 3. `nn.Module` = reusable building block; override `__init__` & `forward`
515+ 4. `model(x)` is the idiomatic way to run a forward pass
516+ 5. Use `nn.Sequential` for quick layer chains
517+
518+
335519
336520# Exercises
337521
@@ -506,13 +690,13 @@ For more information we can be reached at:
506690
507691::: {.column width="25%"}
508692
509- {{< fa pencil >}} \ Surbhi Goel
693+ {{< fa pencil >}} \ Isaac Akanho
510694
511695{{< fa solid person-digging >}} \ [ICCS/UoCambridge](https://iccs.cam.ac.uk/about-us/our-team)
512696
513- {{< fa solid envelope >}} \ [sg2147[AT]cam.ac.uk](mailto:sg2147@cam.ac.uk)
697+ {{< fa solid envelope >}} \ [ia464[AT]cam.ac.uk](mailto:ia464@cam.ac.uk)
514698
515- {{< fa brands github >}} \ [surbhigoel77](https://github.com/surbhigoel77)
699+ {{< fa brands github >}} \ [isaacaka](https://github.com/isaacaka)
516700
517701:::
518702