A demo of using GitHub Actions as an MLOps tool
- Running live at: https://crypto-predictor.tuukka.net
- Has Swagger UI API documentation
- Trains a machine learning model on real-time data to predict the price of Bitcoin at a specific time
- Wraps the model inside an HTTP API
- Creates a Docker image out of the API with descriptive tags
- Deploys the latest image to Render
- Does all of this in a GitHub Actions workflow every 24 hours to keep the model trained on the latest data
- Uses continuously updated historical price data from the Alpaca Markets Market Data API, fetched with their official Python SDK
- Trains a machine learning model with Facebook Prophet
- Saves the model to disk as a pickle binary file
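
The training and persistence steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's `train.py`: the function names and the `(timestamp, close_price)` row shape are assumptions, and the Alpaca download is left out entirely. Prophet expects a DataFrame with `ds` and `y` columns; the `pandas` and `prophet` imports are deferred so the pickle helpers can be tried on their own.

```python
import pickle
from pathlib import Path


def train_model(rows):
    """Fit a Prophet model on (timestamp, close_price) rows.

    The row shape is an assumption; timestamps must be timezone-naive.
    """
    # Deferred imports: only needed when actually training.
    import pandas as pd
    from prophet import Prophet

    # Prophet requires the columns to be named `ds` (time) and `y` (value).
    df = pd.DataFrame(rows, columns=["ds", "y"])
    model = Prophet()
    model.fit(df)
    return model


def save_model(model, path):
    """Serialize the model to disk as a pickle binary file."""
    with Path(path).open("wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model, for example once at API startup."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

Since `pickle.load` can execute arbitrary code, only load model files produced by your own build.
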
- Flask API served with Waitress
- The purpose of the API is to let any other application use the model over HTTP
- This microservice structure makes the model usable even when the client application is built with different technologies
- This also allows the model to be retrained or updated without needing to update the user applications
- The API loads the binary model at startup
- Listens to the `/bitcoin` endpoint, which takes a `date` query parameter
  - Example: `/bitcoin?date=2024-06-01`
- Runs the `date` parameter through the model and returns the predicted price in the following JSON format:
  - Example: `{ "date": "2025-01-01T00:00:00Z", "prediction": 56771.779953588826, "prediction_low": 53746.06414094623, "prediction_high": 60034.61668451781 }`
- CORS is enabled, so the API can be called from other web applications
- Flask-Cors is used to simplify this process
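
As an illustration of calling the API from another application, here is a small client sketch that uses only the Python standard library. The base URL is the live demo address given at the top of this README; the `Prediction` class and the function names are hypothetical, and the field names simply mirror the example response above.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://crypto-predictor.tuukka.net"  # live demo URL from this README


@dataclass
class Prediction:
    date: str
    prediction: float
    prediction_low: float
    prediction_high: float


def build_url(date, base_url=BASE_URL):
    """Build the request URL for the /bitcoin endpoint."""
    return f"{base_url}/bitcoin?{urlencode({'date': date})}"


def parse_prediction(body):
    """Parse the JSON response body into a Prediction."""
    data = json.loads(body)
    return Prediction(
        date=data["date"],
        prediction=data["prediction"],
        prediction_low=data["prediction_low"],
        prediction_high=data["prediction_high"],
    )


def fetch_prediction(date, base_url=BASE_URL, timeout=10):
    """Call the API over HTTP and return the parsed prediction."""
    with urlopen(build_url(date, base_url), timeout=timeout) as response:
        return parse_prediction(response.read().decode("utf-8"))
```

A call such as `fetch_prediction("2024-06-01")` performs a real network request, so it is kept separate from the URL building and parsing, which can be exercised offline.
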
- Uses a multi-stage Dockerfile
- Trains the model in the build stage
- Only the model binary, the API Python file, and the API's dependencies are copied into the final image
- This reduces the final image size and potentially reduces the attack surface
- Runs on every push to the main branch and on a cron schedule every 24 hours
- Tags a new commit with a semantic version
- On a scheduled run the commit is already tagged, so the existing version is simply read
- Builds a Docker image using the Dockerfile and publishes the image to GitHub Container Registry (ghcr.io) under this GitHub repository
- The image is tagged with the following tags:
  - `latest` for finding the latest version of the image
  - The semantic version of the commit (e.g. `0.3.1`) for finding the latest image of this specific commit
  - The semantic version combined with the timestamp of the build (e.g. `0.3.1-20240518121047`) for having a unique identifier for each build that also clearly states the software version and the build time
  - It's important to note that only one image can hold a specific tag at a time, so other images holding the same tags will lose those tags
- Triggers a Render deployment for the application
  - The Render web service is configured to run the image with the `latest` tag, so it will always deploy the newly built image
  - This image deployment allows most of the configuration to be done in the repository, with only minimal setup on the Render side
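
The three-tag scheme described above can be sketched in a few lines of Python. This only illustrates how the tag strings relate to each other; the workflow itself may compute them differently, and the function name and the use of UTC for the build timestamp are assumptions.

```python
from datetime import datetime, timezone


def build_image_tags(version, built_at):
    """Return the three tags for one build: latest, version, version-timestamp."""
    # Timestamp layout matches the README example: YYYYMMDDHHMMSS.
    timestamp = built_at.strftime("%Y%m%d%H%M%S")
    return ["latest", version, f"{version}-{timestamp}"]


# Assumed build time, expressed in UTC for illustration.
built_at = datetime(2024, 5, 18, 12, 10, 47, tzinfo=timezone.utc)
tags = build_image_tags("0.3.1", built_at)
print(tags)  # ['latest', '0.3.1', '0.3.1-20240518121047']
```

Only the last tag is unique per build; `latest` and the plain version move to the newest image each time the workflow runs.
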
- Python 3.x
- Required Python dependencies
- For running the training script (train.py)
- For running the API (api.py)
- For running the Jupyter notebook machine learning experiment (train.ipynb)