
Commit cf9bb61

pa4 more edits
1 parent 9decfea commit cf9bb61

File tree

1 file changed: +0, -11 lines


_pages/dat450/assignment4.md

Lines changed: 0 additions & 11 deletions
@@ -75,14 +75,11 @@ Since these 52k data points make fine-tuning longer, for simplicity in this cour
To get a clear idea of how to complete the assignment, you can start with the skeleton code available here: `/data/courses/2025_dat450_dit247/assignments/a4`. It looks like this:
```bash
.
-├── __init__.py
├── data_utils.py
├── lora.py
├── main.py
├── predict.py
└── utils.py
-
-That's one directory with five files.
```

In short, you need to fill in the incomplete parts of `data_utils.py` and `lora.py`. The other files contain helpful functions to run the assignment, but it’s highly recommended to review the documented code to understand the structure of the project. To ensure your code works correctly, you can follow these instructions and run the code either:
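
Since `lora.py` is one of the two files to be completed, a minimal sketch of the standard LoRA idea may help orient the reader: a frozen linear layer plus a trainable low-rank update scaled by `alpha / r`. The class and argument names below are hypothetical and are not the skeleton's actual interface.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical sketch: wrap a frozen nn.Linear with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        # A starts with small random values, B with zeros, so the update is
        # initially zero and training starts exactly from the base model.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / r) * B A x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```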
@@ -92,11 +89,6 @@ Using the prebuilt environment:
python3 main.py
```

-Or, if you want to try new materials using **uv**:
-```bash
-uv run --python 3.12 python main.py
-```
-
## Step 0: Preprocessing
Create a Dataset by loading the Alpaca training set, which has already been downloaded for you.

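As a rough sketch of Step 0, the Alpaca data can be loaded with the Hugging Face `datasets` library. The local file path below is an assumption (the skeleton's `data_utils.py` defines the actual location); loading from the Hub is shown as an alternative.

```python
from datasets import load_dataset

# Assumption: the course keeps a local JSON copy of Alpaca; this path is illustrative only.
dataset = load_dataset(
    "json",
    data_files="/data/courses/2025_dat450_dit247/alpaca_data.json",  # hypothetical path
    split="train",
)
# Alternatively, straight from the Hugging Face Hub:
# dataset = load_dataset("tatsu-lab/alpaca", split="train")

print(len(dataset), dataset.column_names)  # Alpaca records have instruction / input / output fields
```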
@@ -227,7 +219,6 @@ Sample with prompt:
</div>
</details>

-
Pre-trained LLMs are simply autoregressive models (next-token predictors); they learn patterns in text, not how to follow instructions. SFT can therefore enhance LLMs by teaching them to answer tasks directly, structure their outputs, respond helpfully, and more. In real commercial systems such as ChatGPT and Claude, instruction tuning followed by reinforcement learning is a crucial step in making the models practically useful. As mentioned before, Alpaca serves as a starting point to help our simple LLM (OLMo) adopt similar behavior. To achieve this, we define templates based on the presence or absence of an input field in the Alpaca dataset.

```python
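
The Python block that begins here is truncated in the diff. For reference, the prompt templates from the original Alpaca release look roughly like the sketch below; the exact strings and variable names used by the course may differ.

```python
# Standard Alpaca-style templates; variable and function names here are illustrative.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that provides "
    "further context. Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def format_example(example: dict) -> str:
    """Pick the template based on whether the example has a non-empty input field."""
    template = PROMPT_WITH_INPUT if example.get("input") else PROMPT_NO_INPUT
    return template.format(**example) + example["output"]
```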
@@ -406,8 +397,6 @@ def make_trainer(
    return trainer
```

-
-
## Step 2: Full fine-tuning (SFT dataset)
Next, we train the pre-trained model with SFT over all of its parameters, then compute metrics and generate outputs to evaluate how well it follows instructions.

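For Step 2, full-parameter fine-tuning with the Hugging Face `Trainer` might look roughly like the sketch below. The model identifier, hyperparameters, output directory, and the `tokenized_train` argument are assumptions, not the assignment's actual settings.

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

def run_full_sft(tokenized_train, model_name: str = "allenai/OLMo-1B-hf"):
    """Hypothetical sketch of Step 2: fine-tune all model parameters on the SFT data."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)  # nothing is frozen: full fine-tuning

    args = TrainingArguments(
        output_dir="sft_full",            # hypothetical output directory
        per_device_train_batch_size=4,    # illustrative hyperparameters
        num_train_epochs=1,
        learning_rate=2e-5,
        logging_steps=50,
    )
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # causal-LM labels
    trainer = Trainer(model=model, args=args, train_dataset=tokenized_train, data_collator=collator)
    trainer.train()
    return trainer
```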
0 commit comments
