This repository contains tutorials for fine-tuning and applying MIST (Molecular Insight SMILES Transformer) foundation models to chemical problems.
MIST model checkpoints are available on HuggingFace and Zenodo.
The full code, including pre-training, model development, and full-scale application demos, can be found in the mist repository.
Complete fine-tuning workflow for MIST encoder models:
- Fine-tuning with LoRA (Low-Rank Adaptation) for parameter-efficient training
- Hyperparameter optimization for the task network
- Training on the QM9 dataset for molecular property prediction
- Model evaluation
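To illustrate why LoRA is parameter-efficient, here is a minimal pure-Python sketch of a LoRA-adapted linear layer. It is not the code used in the tutorials: the class name `LoRALinear`, the helper `matvec`, and the `alpha` and `seed` parameters are hypothetical, chosen only for illustration. The frozen weight `W` is left untouched, and only the two small matrices `A` and `B` would be trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Illustrative LoRA layer: y = W x + (alpha / rank) * B (A x).

    W is frozen; only A (rank x d_in) and B (d_out x rank) are trainable.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weight, d_out x d_in
        self.scale = alpha / rank
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        rank, d_in = len(self.A), len(self.A[0])
        d_out = len(self.B)
        return rank * d_in + d_out * rank


# Usage: a 4 x 6 frozen weight adapted with rank 2.
W = [[0.1 * (i + j) for j in range(6)] for i in range(4)]
layer = LoRALinear(W, rank=2)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(layer.forward(x) == matvec(W, x))  # True, because B is zero at initialization
print(layer.trainable_params(), "trainable vs", 4 * 6, "frozen")  # 20 trainable vs 24 frozen
```

The savings are negligible at this toy size but grow quickly with layer width: for a square layer of width d, LoRA trains 2·r·d parameters instead of d².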
Inference demonstrations using fine-tuned MIST models:
- Loading pretrained MIST checkpoints from HuggingFace
- Predicting boiling point, flash point, and melting point
- Analyzing property trends for alkenes and alcohols
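A property-trend analysis needs a series of related molecules as input. The sketch below generates SMILES strings for two simple homologous series, linear primary alcohols and terminal alkenes, which could then be passed to a fine-tuned model. The function names are hypothetical, and the tutorials may use a different set of molecules.

```python
def linear_alcohols(max_carbons):
    """SMILES for linear primary alcohols: methanol (CO), ethanol (CCO), ..."""
    return ["C" * n + "O" for n in range(1, max_carbons + 1)]


def terminal_alkenes(max_carbons):
    """SMILES for linear 1-alkenes: ethene (C=C), propene (C=CC), ..."""
    return ["C=C" + "C" * (n - 2) for n in range(2, max_carbons + 1)]


print(linear_alcohols(3))   # ['CO', 'CCO', 'CCCO']
print(terminal_alkenes(4))  # ['C=C', 'C=CC', 'C=CCC']
```

Plotting a predicted property such as boiling point against carbon count for each series gives a quick check on whether the model captures the expected trend with chain length.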
- Clone the repository:
    git clone <repository-url>
    cd mist-demo
- Create a virtual environment and install dependencies using uv:
    uv sync
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Launch Jupyter and open any notebook in mist-demo/tutorials:
    jupyter notebook