|
213 | 213 | "source": [ |
214 | 214 | "<h2 style=\"color: #b51f2a\">What is automatic differentiation (autodiff)?</h2>\n", |
215 | 215 | "\n", |
| 216 | + "In numerical differentiation, derivatives are approximated with <u>finite differences</u>, which introduces approximation error, can be <u>computationally expensive</u>, and scales poorly with the number of inputs; symbolic differentiation is exact but the resulting expressions can grow unwieldy (see the short sketch after the list below).\n", |
| 217 | + "\n", |
216 | 218 | "`autodiff` is a way of **calculating gradients** that:\n", |
217 | 219 | "\n", |
218 | 220 | "- provides exact derivative values\n", |
219 | 221 | "- is more efficient than numerical methods (computing all $n$ partial derivatives costs $O(n)$ times the cost of evaluating the function itself)\n", |
220 | 222 | "- can handle complex functions and models\n", |
221 | 223 | "- might struggle with non-differentiable functions\n", |
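| | + "\n", |
| | + "As a quick illustration (a minimal sketch in plain Python, not tied to any particular library; the function and helper names are just for this example), a central finite difference only approximates a derivative and its accuracy depends on the step size $h$, whereas autodiff returns the exact value:\n", |
| | + "\n", |
| | + "```python\n", |
| | + "def f(x):\n", |
| | + "    return x**3\n", |
| | + "\n", |
| | + "def central_difference(f, x, h=1e-5):\n", |
| | + "    # numerical differentiation: approximate f'(x) with a finite difference\n", |
| | + "    return (f(x + h) - f(x - h)) / (2 * h)\n", |
| | + "\n", |
| | + "x = 2.0\n", |
| | + "approx = central_difference(f, x)  # close to 12, but only up to truncation/round-off error\n", |
| | + "exact = 3 * x**2                   # 12.0, the value an autodiff tool would compute exactly\n", |
| | + "print(approx, exact)\n", |
| | + "```\n", |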
222 | | - "\n", |
223 | | - "In symbolic or numerical differentiation the derivatives are approximated using <u>finite differences</u> and can be very <u>computationally expensive</u> and scale poorly with the number of inputs.\n" |
| 224 | + "\n" |
224 | 225 | ] |
225 | 226 | }, |
226 | 227 | { |
|
277 | 278 | "\n", |
278 | 279 | "$z = x^2 + 3xy +1 \\ ; \\ u_1 = x^2 \\ ; \\ u_2 = 3xy \\ ; \\ z = u_1 + u_2 + 1$\n", |
279 | 280 | "\n", |
| 281 | + " <img src=\"fig/auto_diff_comparison_portrait.png\" alt=\"Comparison of forward- and reverse-mode automatic differentiation\" style=\"width:60%; margin:auto; display:block;\"/>\n", |
| 282 | + "\n", |
280 | 283 | "### Forward mode\n", |
281 | 284 | "\n", |
282 | 285 | "- Propagates derivatives from inputs to outputs in a single forward pass through the computational graph, computing function values and their derivatives simultaneously.\n", |
|
292 | 295 | " - $\\frac{\\partial u_2}{\\partial x} = 3y, \\quad \\frac{\\partial u_2}{\\partial y} = 3x $\n", |
293 | 296 | " - Apply the chain rule:\n", |
294 | 297 | " - $\\frac{\\partial z}{\\partial x} = \\frac{\\partial z}{\\partial u_1} \\cdot \\frac{\\partial u_1}{\\partial x} + \\frac{\\partial z}{\\partial u_2} \\cdot \\frac{\\partial u_2}{\\partial x} = 2x + 3y $\n", |
295 | | - " - $\\frac{\\partial z}{\\partial y} = \\frac{\\partial z}{\\partial u_2} \\cdot \\frac{\\partial u_2}{\\partial y} = 3x $\n" |
| 298 | + " - $\\frac{\\partial z}{\\partial y} = \\frac{\\partial z}{\\partial u_2} \\cdot \\frac{\\partial u_2}{\\partial y} = 3x $\n", |
| 299 | + "\n", |
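| | + " The chain-rule bookkeeping above is exactly what a dual-number implementation of forward mode automates. Below is a minimal hand-rolled sketch (plain Python; the `Dual` class is illustrative, not the API of any specific autodiff library) for $z = x^2 + 3xy + 1$:\n", |
| | + "\n", |
| | + "```python\n", |
| | + "class Dual:\n", |
| | + "    # a value together with its derivative with respect to one chosen input\n", |
| | + "    def __init__(self, val, der=0.0):\n", |
| | + "        self.val, self.der = val, der\n", |
| | + "\n", |
| | + "    def __add__(self, other):\n", |
| | + "        other = other if isinstance(other, Dual) else Dual(other)\n", |
| | + "        return Dual(self.val + other.val, self.der + other.der)\n", |
| | + "    __radd__ = __add__\n", |
| | + "\n", |
| | + "    def __mul__(self, other):\n", |
| | + "        other = other if isinstance(other, Dual) else Dual(other)\n", |
| | + "        # product rule: (uv)' = u'v + uv'\n", |
| | + "        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)\n", |
| | + "    __rmul__ = __mul__\n", |
| | + "\n", |
| | + "def z(x, y):\n", |
| | + "    return x * x + 3 * x * y + 1\n", |
| | + "\n", |
| | + "# one forward pass per input: seed der=1 for the input we differentiate with respect to\n", |
| | + "dz_dx = z(Dual(2.0, 1.0), Dual(3.0, 0.0)).der  # 2x + 3y = 13.0\n", |
| | + "dz_dy = z(Dual(2.0, 0.0), Dual(3.0, 1.0)).der  # 3x = 6.0\n", |
| | + "print(dz_dx, dz_dy)\n", |
| | + "```\n", |
| | + "\n", |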
| 300 | + " Reverse mode, by contrast, needs only one backward pass for all inputs, gaining speed at the price of more memory use (intermediate values must be stored).\n" |
296 | 301 | ] |
297 | 302 | }, |
298 | 303 | { |
|
310 | 315 | "| Forward Mode | Few inputs, many outputs | O(n) per input | Physics simulations, sensitivity analysis |\n", |
311 | 316 | "| Reverse Mode | Many inputs, few outputs | O(n) per output | Deep learning, optimization problems |\n", |
312 | 317 | "\n", |
313 | | - "<a href=https://e-dorigatti.github.io/math/deep%20learning/2020/04/07/autodiff.html>\n", |
314 | | - " <img src=\"fig/compgraph.png\" alt=\"test\" style=\"width:60%; margin:auto; display:block;\"/>\n", |
315 | | - "</a>\n" |
| 318 | + "\n", |
| 319 | + "\n", |
| 320 | + " <img src=\"fig/auto_diff_comparison_portrait.png\" alt=\"Comparison of forward- and reverse-mode automatic differentiation\" style=\"width:60%; margin:auto; display:block;\"/>\n", |
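| | + "\n", |
| | + "For contrast with the forward-mode sketch above, reverse mode obtains the derivatives with respect to all inputs from a single backward pass. A minimal example using PyTorch's autograd (assuming `torch` is installed), again for $z = x^2 + 3xy + 1$:\n", |
| | + "\n", |
| | + "```python\n", |
| | + "import torch\n", |
| | + "\n", |
| | + "x = torch.tensor(2.0, requires_grad=True)\n", |
| | + "y = torch.tensor(3.0, requires_grad=True)\n", |
| | + "\n", |
| | + "z = x**2 + 3 * x * y + 1  # forward pass records the computational graph\n", |
| | + "z.backward()              # one reverse pass fills in the gradients of all inputs\n", |
| | + "\n", |
| | + "print(x.grad)  # dz/dx = 2x + 3y = 13\n", |
| | + "print(y.grad)  # dz/dy = 3x = 6\n", |
| | + "```\n", |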
| 321 | + "\n" |
316 | 322 | ] |
317 | 323 | }, |
318 | 324 | { |
|
325 | 331 | "source": [ |
326 | 332 | "<h2 style=\"color: #b51f2a\">What is autograd?</h2>\n", |
327 | 333 | "\n", |
328 | | - "`autograd` is the name of a particular autodiff implementation commonly used in:\n", |
| 334 | + "`autograd` is the name of a particular autodiff implementation; in practice the term is often used interchangeably with `autodiff`.\n", |
| 335 | + "It is commonly used in:\n", |
329 | 336 | "\n", |
330 | 337 | "- **Machine learning and deep learning**: frameworks like PyTorch and JAX leverage autograd for gradient-based optimization.\n", |
331 | | - "- **Scientific computing**: libraries such as TensorFlow use autodiff for numerical modeling and solving differential equations.\n", |
| 338 | + "- **Scientific computing**: autodiff is used for numerical modeling and solving differential equations.\n", |
332 | 339 | "- **Optimization problems**: used in engineering and economics for parameter tuning.\n", |
333 | 340 | "- **Physics simulations**: computes gradients in complex simulations like fluid dynamics.\n", |
334 | 341 | "- **Probabilistic programming**: helps with Bayesian inference using gradient-based samplers.\n" |
|
2840 | 2847 | "metadata": { |
2841 | 2848 | "celltoolbar": "Edit Metadata", |
2842 | 2849 | "kernelspec": { |
2843 | | - "display_name": "Python 3 (ipykernel)", |
| 2850 | + "display_name": "malapa-cheetah-tutorial-2025", |
2844 | 2851 | "language": "python", |
2845 | 2852 | "name": "python3" |
2846 | 2853 | }, |
|