Commit 304002e

Merge branch 'master' into geotiff-download-80
2 parents f2d7ca0 + a694ecb commit 304002e

38 files changed, with 832 additions and 267 deletions

.circleci/config.yml

Lines changed: 24 additions & 0 deletions
@@ -27,3 +27,27 @@ jobs:
             sudo pip install .
             python -m unittest discover -v -s test/integration
 
+      - add_ssh_keys:
+          fingerprints:
+            - "79:16:39:74:e9:b3:39:52:87:2c:90:aa:ee:3c:09:13"
+
+      - run:
+          name: Deploy documentation
+          command: |
+            if [ "${CIRCLE_BRANCH}" == "${PRODUCTION_BRANCH}" ]; then
+              cd docs
+              make html
+              cd _build/html
+              git init
+              git config user.name "Devseed-CI"
+              git config user.email "[email protected]"
+              touch .nojekyll # Add this so GitHub doesn't try and build site
+              git add .
+              git commit -m "CI deploy [skip ci]"
+              git remote add origin [email protected]:developmentseed/label-maker.git
+              git fetch
+              git push origin --force --quiet HEAD:gh-pages
+              rm -rf .git
+            else
+              echo "Not the branch you're looking for, skipping documentation deploy"
+            fi

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -13,3 +13,4 @@ config.json
 stdout*
 /integration*
 .idea/
+docs/_build/

CHANGES.txt

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+0.3.2 (2018-05-14)
+------------------
+- Provide a default value of False for imagery_offset to preview function (#79)
+
 0.3.1 (2018-04-19)
 ------------------
 - Add colors for object detection and segmentation labels (#64)

README.md

Lines changed: 5 additions & 149 deletions
@@ -1,7 +1,7 @@
 # Label Maker
 ## Data Preparation for Satellite Machine Learning
 
-The tool downloads [OpenStreetMap QA Tile]((https://osmlab.github.io/osm-qa-tiles/)) information and satellite imagery tiles and saves them as an [`.npz` file](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) for use in Machine Learning training.
+The tool downloads [OpenStreetMap QA Tile]((https://osmlab.github.io/osm-qa-tiles/)) information and satellite imagery tiles and saves them as an [`.npz` file](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) for use in machine learning training.
 
 ![example classification image overlaid over satellite imagery](examples/images/classification.png)
 _satellite imagery from [Mapbox](https://www.mapbox.com/) and [Digital Globe](https://www.digitalglobe.com/)_
@@ -13,159 +13,14 @@ _satellite imagery from [Mapbox](https://www.mapbox.com/) and [Digital Globe](ht
 ## Installation
 
 ```bash
-pip install label_maker
+pip install label-maker
 ```
 
 Note that running this library this requires `tippecanoe` as a "peer-dependency" and that command should be available from your command-line before running this.
 
-## Configuration
-
-Before running any commands, it is necessary to create a `config.json` file to specify inputs to the data preparation process:
-
-```json
-{
-  "country": "togo",
-  "bounding_box": [1.09725, 6.05520, 1.34582, 6.30915],
-  "zoom": 12,
-  "classes": [
-    { "name": "Roads", "filter": ["has", "highway"] },
-    { "name": "Buildings", "filter": ["has", "building"] }
-  ],
-  "imagery": "http://a.tiles.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token=ACCESS_TOKEN",
-  "background_ratio": 1,
-  "ml_type": "classification"
-}
-```
-
-- `country`: The [OSM QA Tile](https://osmlab.github.io/osm-qa-tiles/) extract to download. The value should be a country string matching a value found in `label_maker/countries.txt`
-- `bounding_box`: The bounding box to create images from. This should be given in the form: `[xmin, ymin, xmax, ymax]` as longitude and latitude values between `[-180, 180]` and `[-90, 90]` respectively. Values should use the WGS84 datum, with longitude and latitude units of decimal degrees.
-- `zoom`: The [zoom level](http://wiki.openstreetmap.org/wiki/Zoom_levels) to create images as. This functions as a rough proxy for resolution. Values should be given as integers.
-- `classes`: An array of classes for machine learning training. Each class is defined as an object with two required properties:
-  - `name`: The class name
-  - `filter`: A [Mapbox GL Filter](https://www.mapbox.com/mapbox-gl-js/style-spec#other-filter) to define any vector features matching this class. Filters are applied with the standalone [featureFilter](https://github.com/mapbox/mapbox-gl-js/tree/master/src/style-spec/feature_filter) from Mapbox GL JS.
-  - `buffer`: The number of pixels to buffer the geometry by. This is an optional parameter to buffer the label for `object-detection` and `segmentation` tasks. Accepts any number (positive or negative). It uses [Shapely `object.buffer`](https://shapely.readthedocs.io/en/latest/manual.html#object.buffer) to calculate the final geometry. You can verify that your buffer options create the desired labels by inspecting the files created in `data/labels/` after running the `labels` command.
-- `imagery`: One of:
-  - A template string for a tiled imagery service. Note that you will generally need an API key to obtain images and there may be associated costs. The above example requires a [Mapbox access token](https://www.mapbox.com/help/how-access-tokens-work/)
-  - A GeoTIFF file location. Works with both local and remote files. Ex: `'http://oin-hotosm.s3.amazonaws.com/593ede5ee407d70011386139/0/3041615b-2bdb-40c5-b834-36f580baca29.tif'`
-- `background_ratio`: For single-class classification problems, we need to download images with no matching class. We will download `background_ratio` times the number of images matching the one class.
-- `ml_type`: One of `"classification"`, `"object-detection"`, or `"segmentation"`. For the final label numpy arrays (`y_train` and `y_test`), we will produce a different label depending upon the `type`.
-  - `"classification"`: An array of the same length as `classes`. Each array value will be either `1` or `0` based on whether it matches the class at the same index
-  - `"object-detection"`: An array of bounding boxes of the form `[xmin, ymin, width, height, class_index]`. In this case, the values are not latitude and longitude values but pixel values measured from the upper left-hand corner. Each feature is tested against each class so if a feature matches two or more classes, it will have the corresponding number of bounding boxes created.
-  - `"segmentation"`: An array of shape `(256, 256)` with values matching the class_index label at that position. The classes are applied sequentially according to `config.json` so latter classes will be written over earlier class labels.
-- `imagery_offset`: An optional list of integers representing the number of pixels to offset imagery. For example `[15, -5]` will move the images 15 pixels right and 5 pixels up relative to the requested tile bounds.
-
-## Command Line Use
-
-`label-maker` is most easily used as a command line tool. There are five commands documented below. All commands accept two flags:
-- `-d` or `--dest`: _string_ directory for storing output files. (default: `'data'`)
-- `-c` or `--config`: _string_ location of config.json file. (default `'config.json'`)
-
-Example:
-```bash
-$ label-maker download --dest flood-monitoring-project --config flood.json
-```
-
-### Download
-
-Download and unzip OSM QA tiles
-
-```bash
-$ label-maker download
-Saving QA tiles to data/ghana.mbtiles
-100% 18.6 MiB 1.8 MiB/s 0:00:00 ETA
-```
-
-### Labels
-
-Retiles the OSM data to the desired zoom level, creates label data (`labels.npz`), calculates class statistics, creates visual label files (either GeoJSON or PNG files depending upon `ml_type`). Requires the OSM QA tiles from the previous step. Accepts an additional flag:
-- `-s` or `--sparse`: _boolean_ if this flag is present, only save labels for up to `n` background tiles, where `n` is equal to `background_ratio` times the number of tiles with a class label.
-
-```bash
-$ label-maker labels
-Determining labels for each tile
----
-Residential: 638 tiles
-Total tiles: 1189
-Write out labels to data/labels.npz
-```
-
-### Preview
-
-Downloads example satellite images for each class. Requires the `labels.npz` file from the previous step. Accepts an additional flag:
-- `-n` or `--number`: _integer_ number of examples images to create per class. (default: `5`)
-
-```bash
-$ label-maker preview -n 10
-Writing example images to data/examples
-Downloading 10 tiles for class Residential
-```
-
-### Images
-
-Downloads all imagery tiles needed for training. Requires the `labels.npz` file from the `labels` step.
-
-```bash
-$ label-maker images
-Downloading 1189 tiles to data/tiles
-```
-
-### Package
-
-Bundles the satellite images and labels to create a final `data.npz` file. Requires the `labels.npz` file from the `labels` step and downloaded image tiles from the `images` step.
-
-```bash
-$ label-maker package
-Saving packaged file to data/data.npz
-```
-
-## Using the Packaged Data
-
-Once you have a packaged `data.npz` file, you can use [`numpy.load`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html) to load it. As an example, here is how you can supply the created data to a [Keras](https://keras.io) Model:
-
-```python
-# the data, shuffled and split between train and test sets
-npz = np.load('data.npz')
-x_train = npz['x_train']
-y_train = npz['y_train']
-x_test = npz['x_test']
-y_test = npz['y_test']
-
-# define your model here, example usage in Keras
-model = Sequential()
-# ...
-model.compile(...)
-
-# train
-model.fit(x_train, y_train, batch_size=16, epochs=50)
-model.evaluate(x_test, y_test, batch_size=16)
-```
-
-For more detailed walkthroughs, check out the [examples page](examples)
-
-## Contributing
-
-### Installation
-
-Install in development mode using
-```
-pip install -e .
-```
-
-### Testing
-
-Tests are run using `unittest`. Unit tests are at `tests/unit` and
-integration tests are at `tests/integration`.
-
-You can test a single file like:
-```
-python -m unittest test/unit/test_validate.py
-```
-or a folder with
-```
-python -m unittest discover -v -s test/unit
-```
-Full options [here](https://docs.python.org/3/library/unittest.html)
+## Documentation
 
+Full documentation is available here: http://devseed.com/label-maker/
 
 ## Acknowledgements
 
@@ -176,3 +31,4 @@ This library builds on the concepts of [skynet-data](https://github.com/developm
 [ODbL](http://opendatacommons.org/licenses/odbl/)
 - Mapbox Satellite data can be
 [traced for noncommercial purposes](https://www.mapbox.com/tos/#[YmtMIywt]).
+- Marc Farra's [tilepie](https://github.com/kamicut/tilepie) to asynchronously process vector tiles
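
Reviewer note: the Configuration text that this commit moves out of the README states a few concrete rules for `config.json` (a four-value `bounding_box` in longitude/latitude degrees, an integer `zoom`, and one of three `ml_type` values). As a rough illustration of those rules only, the sketch below checks a config dictionary against them. `check_config` is a hypothetical helper written for this note; it is not part of label-maker, and the library's own validation may differ.

```python
# Hypothetical helper: checks a config dict against the rules quoted in the
# removed README text. Not label-maker's actual validation code.
ML_TYPES = ("classification", "object-detection", "segmentation")


def check_config(config):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    bbox = config.get("bounding_box", [])
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in bbox
    )
    if len(bbox) != 4 or not numeric:
        problems.append("bounding_box must be four numbers: [xmin, ymin, xmax, ymax]")
    else:
        xmin, ymin, xmax, ymax = bbox
        # Longitudes (x) must lie in [-180, 180], latitudes (y) in [-90, 90].
        if not all(-180 <= x <= 180 for x in (xmin, xmax)):
            problems.append("longitudes must be within [-180, 180]")
        if not all(-90 <= y <= 90 for y in (ymin, ymax)):
            problems.append("latitudes must be within [-90, 90]")

    zoom = config.get("zoom")
    if not isinstance(zoom, int) or isinstance(zoom, bool):
        problems.append("zoom must be an integer")

    if config.get("ml_type") not in ML_TYPES:
        problems.append("ml_type must be one of: " + ", ".join(ML_TYPES))

    return problems


# The example values below are copied from the removed README snippet.
example = {
    "country": "togo",
    "bounding_box": [1.09725, 6.05520, 1.34582, 6.30915],
    "zoom": 12,
    "ml_type": "classification",
}
print(check_config(example))  # []
```

Returning a list of problems rather than raising on the first one is a design choice for this sketch, so that every issue in a config can be reported in a single pass.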

docs/Makefile

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS =
+SPHINXBUILD = sphinx-build
+SOURCEDIR = .
+BUILDDIR = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
