Chempleter is a lightweight generative sequence model based on multi-layer gated recurrent units (GRUs). It predicts syntactically valid extensions of a provided molecular fragment, or bridges two molecules or molecular fragments. It operates on SELFIES token sequences, which ensures syntactically valid molecular generation, and accepts SMILES notation as input. Due to its simple recurrent architecture and small vocabulary, the model runs efficiently on both CPUs and GPUs.
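To illustrate what "operating on SELFIES token sequences" means, here is a minimal sketch that splits a SELFIES string into its bracketed tokens and maps them to integer indices. It is not Chempleter's own tokenizer: the helper names `tokenize_selfies` and `build_vocab`, the `<pad>` entry, and the index layout are assumptions made for illustration. The example string is the usual SELFIES encoding of benzene.

```python
import re

# SELFIES strings are sequences of bracketed symbols, e.g. "[C][=C][Ring1]".
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]")


def tokenize_selfies(selfies: str) -> list[str]:
    """Split a SELFIES string into its bracketed tokens."""
    return TOKEN_PATTERN.findall(selfies)


def build_vocab(sequences: list[list[str]]) -> dict[str, int]:
    """Assign an integer index to every distinct token (hypothetical layout)."""
    vocab = {"<pad>": 0}  # assumed padding token, not taken from Chempleter
    for seq in sequences:
        for tok in seq:
            # setdefault keeps the first index assigned to each token
            vocab.setdefault(tok, len(vocab))
    return vocab


benzene = "[C][=C][C][=C][C][=C][Ring1][=Branch1]"
tokens = tokenize_selfies(benzene)
vocab = build_vocab([tokens])
indices = [vocab[t] for t in tokens]
print(tokens)
print(indices)
```

Because every token is drawn from a small, closed set of bracketed symbols, the vocabulary stays small, which is consistent with the efficiency claim above.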
-
What can Chempleter do?
-
Currently, Chempleter accepts an initial molecule or molecular fragment in SMILES format and generates a larger molecule that includes the initial structure while respecting chemical syntax. It also shows some interesting descriptors.
-
It can be used to generate a wide range of structural analogs that share the same core structure (by changing the sampling temperature) or to decorate a core scaffold iteratively (by increasing the number of generated tokens).
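As a rough illustration of why the sampling temperature changes the diversity of the generated analogs, the sketch below applies temperature scaling to a vector of logits before sampling a token index. This is the generic technique, not Chempleter's actual sampling code; the function names and the example logits are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, temperature=0.5)  # sharper distribution
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter distribution
print(cold, hot)
print(sample_token(logits, temperature=1.0))
```

A low temperature concentrates probability on the most likely token, giving more conservative completions. A high temperature flattens the distribution and yields more varied structures.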
-
It can be used to bridge two molecules/molecular fragments to explore linker chemistry.
-
In the future, it might be adapted to predict structures with a specific chemical property, using a regressor to rank predictions, and so transition towards more "goal-directed" predictions.
-
- Python >=3.12
- uv (optional but recommended)
See detailed installation instructions.
Visit Chempleter's docs.
These commands are valid for running Chempleter on CPU. For GPU, see installation instructions.
-
On Windows:
uvx --from "chempleter[cpu]" chempleter-gui.exe
On Linux/macOS:
uvx --from "chempleter[cpu]" chempleter-gui
The very first start of the GUI on your device might be a bit slow. To learn more about using the GUI and its options, see here.
-
uv pip install "chempleter[cpu]"
uv run chempleter-gui
-
Type in the SMILES notation for the starting structure, or leave it empty to generate random molecules. Click the GENERATE button to generate a molecule.
To learn more about using the GUI and its other options, see here.
-
To use Chempleter as a Python library:
from chempleter.inference import extend
generated_mol, generated_smiles, generated_selfies = extend(smiles="c1ccccc1")
print(generated_smiles)
>> C1=CC=CC=C1C2=CC=C(CN3C=NC4=CC=CC=C4C3=O)O2
To draw the generated molecule:
from rdkit.Chem import Draw
Draw.MolToImage(generated_mol)
-
For details on available parameters and inference functions, see generating molecules.
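Because generation is sampled, repeated calls with the same input can return different molecules. The sketch below collects distinct results from several calls. It relies only on the call shape shown above (`extend(smiles=...)` returning a molecule, a SMILES string, and a SELFIES string). The helper `collect_unique` and its `attempts` parameter are hypothetical and not part of Chempleter.

```python
from typing import Callable


def collect_unique(extend_fn: Callable, smiles: str, attempts: int = 10) -> list[str]:
    """Call extend_fn repeatedly and keep each distinct generated SMILES once."""
    seen: set[str] = set()
    unique: list[str] = []
    for _ in range(attempts):
        # Assumed return shape, taken from the usage example:
        # (generated_mol, generated_smiles, generated_selfies)
        _mol, generated_smiles, _selfies = extend_fn(smiles=smiles)
        if generated_smiles not in seen:
            seen.add(generated_smiles)
            unique.append(generated_smiles)
    return unique
```

With Chempleter installed, this could be called as `collect_unique(extend, "c1ccccc1")` after `from chempleter.inference import extend`. Note that the comparison is on raw strings, so two different SMILES spellings of the same molecule would both be kept unless they are canonicalized first.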
-
- src/chempleter: Contains the Python modules for the different functions.
- src/chempleter/processor.py: Contains functions for processing CSV files containing SMILES data and generating training-related files.
- src/chempleter/dataset.py: ChempleterDataset class
- src/chempleter/model.py: ChempleterModel class
- src/chempleter/inference.py: Contains functions for inference
- src/chempleter/train.py: Contains functions for training
- src/chempleter/gui.py: Chempleter GUI built using NiceGUI
- src/chempleter/data: Contains the trained model and vocabulary files
MIT License
Copyright (c) 2025-2026 Davis Thomas Daniel
Contributions, improvements, feature ideas, and bug fixes are always welcome.

