This environment is one of Gymnasium's five classic control environments. The environment is stochastic in its initial state, within a given range. A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The pendulum starts upright on the cart, and the goal is to balance the pole by applying forces to the cart in the left and right directions.
This is a PyTorch implementation in which we solve this environment using two models: a Deep Q-Network (DQN) and Q-learning.
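As a quick orientation, here is a minimal sketch (not code from this repo) of the interaction loop both agents build on, using the Gymnasium API:

```python
# Minimal sketch of the CartPole-v1 interaction loop (random actions only);
# assumes gymnasium is installed. Illustrative, not taken from this repo.
import gymnasium as gym

env = gym.make("CartPole-v1")
state, info = env.reset(seed=42)        # initial state is random within a small range
done, total_reward = False, 0.0
while not done:
    action = env.action_space.sample()  # 0 = push cart left, 1 = push cart right
    state, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
env.close()
print(f"Random-policy reward: {total_reward}")
```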
| Hyperparameter | Value |
|---|---|
| gamma | 0.99 |
| epsilon | 1 |
| epsilon_decay | 0.995 |
| epsilon_end | 0.05 |
| TAU | 0.001 |
| lr (learning rate) | 0.0001 |
| replayBufferSize | 10000 |
| batchreplayBufferSize | 128 |
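To make the table concrete, here is an illustrative sketch of how these hyperparameters typically enter a DQN training loop; it mirrors standard DQN practice and is an assumption, not necessarily this repo's exact code:

```python
# Illustrative use of the hyperparameters above (standard DQN conventions,
# not necessarily this repo's exact implementation).
import torch.nn as nn

gamma = 0.99                                          # discount factor for future rewards
epsilon, epsilon_decay, epsilon_end = 1.0, 0.995, 0.05
TAU = 0.001                                           # soft target-network update rate
lr = 0.0001                                           # optimizer learning rate

# Epsilon-greedy decay, typically applied once per episode:
epsilon = max(epsilon_end, epsilon * epsilon_decay)

# Soft (Polyak) update of the target network toward the policy network, using TAU:
def soft_update(policy_net: nn.Module, target_net: nn.Module, tau: float = TAU) -> None:
    for p, t in zip(policy_net.parameters(), target_net.parameters()):
        t.data.copy_(tau * p.data + (1.0 - tau) * t.data)
```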
Required libraries:

- gymnasium
- pytorch
- matplotlib
- pandas
- numpy
- moviepy
**Important:** To install conda or miniconda, follow this link.

If you have conda installed, then to install all the supported libraries, just run:
```bash
git clone https://github.com/davnish/smart_cartpole-v1 # cloning the repo
cd smart_cartpole-v1 # moving inside the repo
conda env create --name cartpole --file=conda_env.yml # installing the libraries
```
Then activate the environment:

```bash
conda activate cartpole
```
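Optionally, you can verify the setup with a quick import check (a generic sanity check, not a step from the repo; note the `pytorch` package imports as `torch`):

```bash
python -c "import gymnasium, torch, matplotlib, pandas, numpy, moviepy; print('all imports ok')"
```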
To test a model, go into the directory of the algorithm you want to test, then in `vis.py` run the function `Simulate_DQ_Strategy(<model_no>)` for DQN or `Simulate_Q_Strategy(<model_no>)` for Q-learning, where `model_no` is the number of the model you want to test. Running the function opens an environment simulating the learned strategy.
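For example, using the function names documented above (the model number `3` is just a placeholder):

```python
# Run from inside the DQN algorithm's directory.
from vis import Simulate_DQ_Strategy  # or Simulate_Q_Strategy for Q-learning

Simulate_DQ_Strategy(3)  # opens a window simulating model no. 3's learned strategy
```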
All the trained models are in their respective `models` directory.
If you want details about a model, `info.txt` contains information on every trained model, such as:

| Field | Description |
|---|---|
| Episode | Number of episodes trained on. |
| Time | Time taken by the model to train. |
| Ep Solved | Number of training episodes that achieved a reward >= 200. |
| High_Score | Number of training episodes that achieved a reward >= 300. |
| avg_intv | Average reward per interval (shown in the graphs above). |