connect to Intake

I don't know whether you are aware of [intake](https://intake.readthedocs.io/en/latest/), but it is a data access and cataloguing package that aims to do much of what you have done here, for generic datasets rather than one specific example.

Firstly, the existing [npy](https://intake.readthedocs.io/en/latest/api_user.html#intake.source.npy.NPySource) data source type shows how you might use intake on array data. Note that the use of [open_files](http://dask.pydata.org/en/latest/bytes.html) ([here in the code](https://github.com/ContinuumIO/intake/blob/master/intake/source/npy.py#L54)) already allows access to data on remote file-systems (s3, gcs, http...) with optional compression, and the [caching system](https://intake.readthedocs.io/en/latest/catalog.html#caching-source-files-locally) handles download-on-first-use, again with various possible file layouts at the far end.

You would still need some of your code for the specifics of the MNIST data format, but I believe you could make your work much smaller and better structured, and allow it to be included in other catalogues, or indeed shipped as a conda package.
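
To illustrate how small the format-specific part could be, here is a rough sketch of a parser for the IDX layout that the MNIST files use, written with the standard library only. The names `read_idx` and `make_idx` are hypothetical and not taken from this repository or from intake, and the sketch only handles the unsigned-byte type code that MNIST uses. It takes any binary file-like object, so in principle the object could come from whatever layer handles remote access and decompression.

```python
import gzip
import io
import struct


def read_idx(fileobj):
    """Parse an unsigned-byte IDX stream and return (shape, raw_bytes).

    Hypothetical helper for illustration: `fileobj` is any binary
    file-like object that is already decompressed.
    """
    # Header: two zero bytes, one type code, one dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", fileobj.read(4))
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, fileobj.read(4 * ndim))
    count = 1
    for dim in shape:
        count *= dim
    data = fileobj.read(count)
    if len(data) != count:
        raise ValueError("truncated IDX payload")
    return shape, data


def make_idx(shape, payload):
    """Build a tiny IDX byte string so the parser can be tried offline."""
    header = struct.pack(">HBB", 0, 0x08, len(shape))
    header += struct.pack(">" + "I" * len(shape), *shape)
    return header + payload


# Round trip on synthetic data standing in for a gzipped MNIST file.
raw = gzip.compress(make_idx((2, 2, 3), bytes(range(12))))
with gzip.open(io.BytesIO(raw), "rb") as f:
    shape, data = read_idx(f)
```

Everything else (locating the files, downloading, caching, decompression) is the part I would expect a generic layer to take over, leaving only something like the parser above as dataset-specific code.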
