Commit 354b76f

committed: updated README.md
1 parent 64d96fc

File tree: 3 files changed (+13, −21 lines)


README.md

Lines changed: 12 additions & 8 deletions
@@ -1,13 +1,13 @@
# IDT - Image Dataset Tool

-## Version 0.0.3 alpha
+## Version 0.0.4 alpha

-![idb](https://user-images.githubusercontent.com/47995046/92179763-be275d80-ee1b-11ea-8063-aa2b565616f6.png)
+![idt-logo](https://user-images.githubusercontent.com/47995046/93012317-cac35880-f575-11ea-9cfb-8b6a8a3242eb.png)


## Description

-The image dataset builder is a CLI tool developed to make it easier and faster to create image datasets to be used for deep learning. The tool achieves this by scraping images from several search engines such as duckgo, bing and deviantart. IDB also optimizes the image dataset, although this feature is optional, the user can downscale and compress the images for optimal file size and dimensions. An example dataset created using idb that contains 23.688 files weights only 559,2MBs.
+The image dataset tool (IDT) is a CLI app developed to make it easier and faster to create image datasets for deep learning. The tool achieves this by scraping images from several search engines such as duckgo, bing and deviantart. IDT can also optimize the image dataset: this optional feature lets the user downscale and compress the images for optimal file size and dimensions. An example dataset created with idt that contains 23,688 files weighs only 559.2 MB.

## Installing

@@ -22,22 +22,23 @@ user@admin:~$ pip3 install idt


```console
-user@admin:~$ git clone https://github.com/deliton/idb.git && cd idb
+user@admin:~$ git clone https://github.com/deliton/idt.git && cd idt
user@admin:~$ sudo python3 setup.py install

```


## Getting Started

-The quickiest way to get started with IDB is running the simple "run" command. Just write in your facorite console something like:
+The quickest way to get started with IDT is to run the simple "run" command. Just type something like this in your favorite console:

```console
user@admin:~$ idt run -i apples
```

This will quickly download 50 images of apples. By default it uses the duckgo search engine to do so.
The run command accepts the following options:
+
| Option | Description |
| ----------- | ----------- |
| **-i** or **--input** | the keyword to find the desired images. |
@@ -50,7 +51,7 @@ The run command accept the following options:

## Usage

-IDB requires a config file that tells it how your dataset should be organized. You can create it using the following command:
+IDT requires a config file that tells it how your dataset should be organized. You can create it using the following command:

```console
user@admin:~$ idt init
@@ -62,7 +63,7 @@ This command will trigger the config file creator and will ask for the desired d
Insert a name to your dataset: : My favorite cars
```

-Then the tool will ask how many samples per search are required to mount your dataset. In order to build a good dataset for deep learning, many images are required and since we're using a search engine to scrape images, many searches with different keywords are required to mount a good sized dataset. This value will correspond to how many images should be downloaded at every search. In this example we need a dataset with 250 images in each class, and we'll use 5 keywords to mount each class. So if we type the number 50 here, IDB will download 50 images of every keyword provided. If we provide 5 keywords we should get the required 250 images.
+Then the tool will ask how many samples per search are required to build your dataset. A good deep learning dataset needs many images, and since we're using a search engine to scrape them, many searches with different keywords are needed to reach a good size. This value corresponds to how many images are downloaded per search. In this example we need a dataset with 250 images in each class, and we'll use 5 keywords per class. So if we type 50 here, IDT will download 50 images for every keyword provided; with 5 keywords we get the required 250 images.

```console
How many samples per seach will be necessary? : 50
@@ -135,7 +136,7 @@ chevrolet impala on the road, chevrolet impala vintage car, chevrolet impala con
Then repeat the process of filling in the class name and its keywords until all 4 required classes are done.

```console
-Dataset YAML file has been created sucessfully. Now run idb build to mount your dataset!
+Dataset YAML file has been created sucessfully. Now run idt build to mount your dataset!
```

Your dataset configuration file has been created. Now just run the following command and see the magic happen:
@@ -154,6 +155,9 @@ Downloading Chevrolet Impala 1967 car photos [#########################--------

At the end, all your images will be available in a folder with the dataset name. A CSV file with the dataset stats is also included in the dataset's root folder.

+![idt-results](https://user-images.githubusercontent.com/47995046/93012667-808fa680-f578-11ea-82fc-7ebcb8ce3c41.png)
+
+
## Split image dataset for Deep Learning

Since deep learning often requires you to split your dataset into training/validation folders, this project can also do that for you! Just run:
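The split command itself falls outside the hunks shown above, so it is not reproduced here. As a generic illustration of what a training/validation split does (this is not idt's implementation; the ratio, output layout, and helper name below are assumptions), the idea is simply to copy a fraction of each class folder's images into a parallel validation tree:

```python
import random
import shutil
from pathlib import Path

def split_dataset(dataset_dir: str, val_ratio: float = 0.2, seed: int = 42) -> None:
    """Copy a random val_ratio share of each class folder into <dataset>_split/valid,
    and the remaining images into <dataset>_split/train (hypothetical layout)."""
    random.seed(seed)
    src = Path(dataset_dir)
    out = Path(f"{dataset_dir}_split")
    for class_dir in (d for d in src.iterdir() if d.is_dir()):
        images = sorted(p for p in class_dir.glob("*") if p.is_file())
        random.shuffle(images)
        n_val = int(len(images) * val_ratio)
        for subset, files in (("valid", images[:n_val]), ("train", images[n_val:])):
            dest = out / subset / class_dir.name
            dest.mkdir(parents=True, exist_ok=True)
            for img in files:
                shutil.copy2(img, dest / img.name)  # copy keeps the original dataset intact

# e.g. split_dataset("My favorite cars", val_ratio=0.2)
```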

dataset.yaml

Lines changed: 0 additions & 13 deletions
This file was deleted.

idt/flickr_api.py

Lines changed: 1 addition & 0 deletions
@@ -47,6 +47,7 @@ def search(self):
results = response.json()
results = results['photos']
if results['total'] == 0:
+    progress.update(task1, advance=self.n_images)
    return 0

self.page += 1
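The one-line addition above advances the progress bar by the full expected image count when Flickr reports zero results, so the bar completes instead of stalling before the early return. Below is a minimal standalone sketch of that pattern; it assumes the progress object is a rich.progress.Progress instance and that the task and image count play the roles shown here, since only the update(..., advance=...) call appears in the diff.

```python
from rich.progress import Progress

def fetch_photos(n_images: int, total_results: int) -> int:
    """Sketch of the fixed control flow: if the search returns nothing,
    push the progress bar to completion before returning early."""
    with Progress() as progress:
        task = progress.add_task("Downloading photos", total=n_images)
        if total_results == 0:
            # The fix: advance by the whole expected count so the bar
            # reaches 100% instead of hanging at 0%.
            progress.update(task, advance=n_images)
            return 0
        downloaded = 0
        for _ in range(min(n_images, total_results)):
            # ... download one image here ...
            progress.update(task, advance=1)
            downloaded += 1
        return downloaded

if __name__ == "__main__":
    fetch_photos(n_images=50, total_results=0)  # completes the bar, returns 0
```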
