Commit 3bc9fb6

Correct merge issues in 01
1 parent 8dcc19e commit 3bc9fb6

File tree

exercises/01_penguin_classification.ipynb

1 file changed: 2 additions & 135 deletions
@@ -20,7 +20,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Setup\n",
+    "### Colab Setup\n",
     "Run the following cell to install the code and dependencies from github."
    ]
   },
@@ -37,7 +37,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Task 1: look at the data\n",
+    "### Task 1 -- Part (a): look at the data\n",
     "In the following code block, we import the ``load_penguins`` function from the ``palmerpenguins`` package.\n",
     "\n",
     "- Call this function, which returns a single object, and assign it to the variable ``data``.\n",
@@ -296,117 +296,6 @@
     "    return feats, tgt"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from typing import List, Tuple, Any\n",
-    "\n",
-    "# import some useful functions here, see https://pytorch.org/docs/stable/torch.html\n",
-    "# where `tensor` and `eye` are used for constructing tensors,\n",
-    "# and using a lower-precision float32 is advised for performance\n",
-    "# Task 4: add imports here\n",
-    "# from torch import tensor, eye, float32\n",
-    "\n",
-    "from torch.utils.data import Dataset\n",
-    "\n",
-    "from palmerpenguins import load_penguins\n",
-    "\n",
-    "\n",
-    "class PenguinDataset(Dataset):\n",
-    "    \"\"\"Penguin dataset class.\n",
-    "\n",
-    "    Parameters\n",
-    "    ----------\n",
-    "    input_keys : List[str]\n",
-    "        The column titles to use in the input feature vectors.\n",
-    "    target_keys : List[str]\n",
-    "        The column titles to use in the target feature vectors.\n",
-    "    train : bool\n",
-    "        If ``True``, this object will serve as the training set, and if\n",
-    "        ``False``, the validation set.\n",
-    "\n",
-    "    Notes\n",
-    "    -----\n",
-    "    The validation split contains 10 male and 10 female penguins of each\n",
-    "    species.\n",
-    "\n",
-    "    \"\"\"\n",
-    "\n",
-    "    def __init__(\n",
-    "        self,\n",
-    "        input_keys: List[str],\n",
-    "        target_keys: List[str],\n",
-    "        train: bool,\n",
-    "    ):\n",
-    "        \"\"\"Build ``PenguinDataset``.\"\"\"\n",
-    "        self.input_keys = input_keys\n",
-    "        self.target_keys = target_keys\n",
-    "\n",
-    "        data = load_penguins()\n",
-    "        data = (\n",
-    "            data.loc[~data.isna().any(axis=1)]\n",
-    "            .sort_values(by=sorted(data.keys()))\n",
-    "            .reset_index(drop=True)\n",
-    "        )\n",
-    "        # Transform the sex field into a float, with male represented by 1.0, female by 0.0\n",
-    "        data.sex = (data.sex == \"male\").astype(float)\n",
-    "        self.full_df = data\n",
-    "\n",
-    "        valid_df = self.full_df.groupby(by=[\"species\", \"sex\"]).sample(\n",
-    "            n=10,\n",
-    "            random_state=123,\n",
-    "        )\n",
-    "        # The training items are simply the items *not* in the valid split\n",
-    "        train_df = self.full_df.loc[~self.full_df.index.isin(valid_df.index)]\n",
-    "\n",
-    "        self.split = {\"train\": train_df, \"valid\": valid_df}[\n",
-    "            \"train\" if train is True else \"valid\"\n",
-    "        ]\n",
-    "\n",
-    "    def __len__(self) -> int:\n",
-    "        \"\"\"Return the length of requested split.\n",
-    "\n",
-    "        Returns\n",
-    "        -------\n",
-    "        int\n",
-    "            The number of items in the dataset.\n",
-    "\n",
-    "        \"\"\"\n",
-    "        return len(self.split)\n",
-    "\n",
-    "    def __getitem__(self, idx: int) -> Tuple[Any, Any]:\n",
-    "        \"\"\"Return an input-target pair.\n",
-    "\n",
-    "        Parameters\n",
-    "        ----------\n",
-    "        idx : int\n",
-    "            Index of the input-target pair to return.\n",
-    "\n",
-    "        Returns\n",
-    "        -------\n",
-    "        in_feats : Any\n",
-    "            Inputs.\n",
-    "        target : Any\n",
-    "            Targets.\n",
-    "\n",
-    "        \"\"\"\n",
-    "        # get the row index (idx) from the dataframe and\n",
-    "        # select relevant column features (provided as input_keys)\n",
-    "        feats = tuple(self.split.iloc[idx][self.input_keys])\n",
-    "\n",
-    "        # this gives a 'species' i.e. one of ('Gentoo',), ('Chinstrap',), or ('Adelie',)\n",
-    "        tgts = tuple(self.split.iloc[idx][self.target_keys])\n",
-    "\n",
-    "        # Task 4 - Exercise #1: convert the features to PyTorch Tensors\n",
-    "\n",
-    "        # Task 4 - Exercise #2: convert target to a 'one-hot' vector.\n",
-    "\n",
-    "        return feats, tgts"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -518,28 +407,6 @@
     "Instantiate the `torchvision.transforms.Compose` transformations and pass to the `PenguinsDataset` in [src/ml_workshop/_penguins.py](../src/ml_workshop/_penguins.py), instead of hardcoding as above. "
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Apply transforms to the data. See Task 4 exercise comments above.\n",
-    "\n",
-    "# Create train_set\n",
-    "\n",
-    "# Create valid_set\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### (Optional) Task 4b: \n",
-    "\n",
-    "Apply the `torchvision.transforms.Compose` transformations instead of hardcoding as above. "
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
