
Commit 866d687

Preliminary update for 2025.

1 parent: 8afee28

File tree: 1 file changed (+4, −4 lines)

_pages/dat450/assignment1.md
@@ -9,6 +9,8 @@ nav_order: 4
 
 # DAT450/DIT247: Programming Assignment 1: Introduction to language modeling
 
+## <span style="color:red">[Still under construction as of Oct. 29]</span>
+
 *Language modeling* is the foundation that recent advances in NLP technologies build on. In essence, language modeling means that we learn how to imitate the language that we observe in the wild. More formally, we want to train a system that models the statistical distribution of natural language. Solving this task is exactly what the famous commercial large language models do (with some additional post-hoc tweaking to make the systems more interactive and to avoid generating provocative outputs).
 
 In the course, we will cover a variety of technical solutions to this fundamental task (in most cases, various types of Transformers). In this first assignment of the course, we are going to build a neural network-based language model that uses *recurrent* neural networks (RNNs) to model the interaction between words.
@@ -28,9 +30,7 @@ We expect that you can program in Python and that you have some knowledge of bas
 
 On the theoretical side, you will need to remember fundamental concepts related to neural networks, such as forward and backward passes, batches, initialization, and optimization.
 
-On the practical side, you will need to understand the basics of PyTorch such as tensors, models, optimizers, loss functions and how to write the training loop. (If you need a refresher, there are plenty of tutorials available, for instance on the [PyTorch website](https://pytorch.org/tutorials/).)
-
-https://docs.pytorch.org/tutorials/beginner/basics/optimization_tutorial.html
+On the practical side, you will need to understand the basics of PyTorch, such as tensors, models, optimizers, loss functions, and how to write a training loop. (If you need a refresher, there are plenty of tutorials available, for instance on the [PyTorch website](https://pytorch.org/tutorials/).) In particular, the [Optimizing Model Parameters tutorial](https://docs.pytorch.org/tutorials/beginner/basics/optimization_tutorial.html) covers more or less everything you need to know about PyTorch training loops for this assignment.
 
 ### Submission requirements
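As a reminder of what such a training loop looks like, here is a minimal sketch in the spirit of that tutorial; the function and variable names (`train_one_epoch`, `train_loader`, and so on) are illustrative placeholders, not part of the assignment code:

```python
import torch

# A generic one-epoch training loop; all names here are placeholders.
def train_one_epoch(model, train_loader, loss_fn, optimizer):
    model.train()
    for inputs, targets in train_loader:  # iterate over batches
        optimizer.zero_grad()             # clear gradients from the previous step
        outputs = model(inputs)           # forward pass
        loss = loss_fn(outputs, targets)  # compute the loss
        loss.backward()                   # backward pass: compute gradients
        optimizer.step()                  # update the parameters
```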

@@ -263,7 +263,7 @@ Starting from the skeleton Python code, your task now is to complete the missing
 The missing parts you need to provide are:
 - Setting up the optimizer, which is the PyTorch utility that updates model parameters during the training loop. The optimizer typically implements some variant of [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent). We recommend [`AdamW`](https://pytorch.org/docs/stable/generated/torch.optim.AdamW.html), which is used to train most LLMs.
 - Setting up the `DataLoader`s for the training and validation sets. The datasets are provided as inputs, and you can simply create the `DataLoader`s as in Part 2.
-- The training loop itself, which is where most of your work will be done. Recall how you iterated through the batches in Part 2.
+- The training loop itself, which is where most of your work will be done.

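For the first two items, a minimal sketch might look as follows; the model and datasets below are dummy stand-ins (in the assignment they are given), and the hyperparameter values are illustrative:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins so the sketch runs on its own; in the assignment,
# the model and the datasets come from the skeleton code.
model = torch.nn.Linear(8, 2)
train_dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))
val_dataset = TensorDataset(torch.randn(20, 8), torch.randint(0, 2, (20,)))

# AdamW optimizer, as recommended above (the learning rate is illustrative).
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

# DataLoaders for the training and validation sets, as in Part 2.
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64)
```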
 Hyperparameters that control the training should be stored in a [TrainingArguments](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.TrainingArguments) object. HuggingFace defines a large number of such hyperparameters, but you only need to consider a few of them. The skeleton code includes a hint that lists the relevant hyperparameters.
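For instance, such an object could be created as follows; which fields you actually need is listed in the skeleton code's hint, and the values here are illustrative:

```python
from transformers import TrainingArguments

# Illustrative values only; see the skeleton code's hint for which
# hyperparameters are actually relevant.
training_args = TrainingArguments(
    output_dir="out",                # where checkpoints and logs are written
    learning_rate=5e-4,
    per_device_train_batch_size=64,
    num_train_epochs=3,
)
```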
