
smart_cartpole-v1

This environment is one of Gymnasium's five classic control environments. The environment is stochastic in terms of its initial state, within a given range. A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The pendulum is placed upright on the cart, and the goal is to balance the pole by applying forces in the left and right direction on the cart.

This is a PyTorch implementation that solves this environment with two approaches: a Deep Q-Network (DQN) and tabular Q-learning.
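
For orientation, here is a minimal Gymnasium sketch (not taken from this repo) that creates CartPole-v1 and steps it with random actions until the episode ends:

```python
# Minimal Gymnasium usage sketch for CartPole-v1 (illustration only).
import gymnasium as gym

env = gym.make("CartPole-v1")
state, info = env.reset(seed=0)          # stochastic initial state within a small range
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()   # 0 = push cart left, 1 = push cart right
    state, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
env.close()
```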

Deep Q-Network (DQN)

| Hyperparameter | Value |
| --- | --- |
| gamma | 0.99 |
| epsilon | 1 |
| epsilon_decay | 0.995 |
| epsilon_end | 0.05 |
| TAU | 0.001 |
| lr (learning rate) | 0.0001 |
| replayBufferSize | 10000 |
| batchreplayBufferSize | 128 |
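
As a rough sketch of where these values plug in, the snippet below (an assumption, not the repo's actual code) shows epsilon-greedy action selection with decay and the TAU-weighted soft update of the target network; the replay buffer and full training loop are omitted for brevity:

```python
# Hypothetical DQN skeleton using the hyperparameters listed above.
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, n_states=4, n_actions=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_states, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, x):
        return self.net(x)

policy_net, target_net = QNet(), QNet()
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=0.0001)

gamma, epsilon, epsilon_decay, epsilon_end, TAU = 0.99, 1.0, 0.995, 0.05, 0.001

def select_action(state, epsilon):
    # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
    if random.random() < epsilon:
        return random.randrange(2)
    with torch.no_grad():
        return policy_net(torch.as_tensor(state, dtype=torch.float32)).argmax().item()

def soft_update():
    # target <- TAU * policy + (1 - TAU) * target, applied after each learning step.
    for t, p in zip(target_net.parameters(), policy_net.parameters()):
        t.data.copy_(TAU * p.data + (1.0 - TAU) * t.data)

# Decay exploration once per episode, never below epsilon_end.
epsilon = max(epsilon_end, epsilon * epsilon_decay)
```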

Q-Learning

| Hyperparameter | Value |
| --- | --- |
| alpha | 0.1 |
| gamma | 1 |
| epsilon | 0.2 |
| Discretized states | 30 |
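
A minimal sketch of how a discretized Q-table and the Q-learning update fit together, assuming 30 bins per state dimension and hypothetical clipping ranges (an illustration, not the repo's exact code):

```python
# Hypothetical tabular Q-learning on a discretized CartPole state.
import numpy as np

alpha, gamma, epsilon = 0.1, 1.0, 0.2
n_bins, n_actions = 30, 2

# One set of bin edges per state dimension: cart position, cart velocity,
# pole angle, pole angular velocity. The ranges below are assumptions.
lows  = np.array([-2.4, -3.0, -0.21, -3.0])
highs = np.array([ 2.4,  3.0,  0.21,  3.0])
bins = [np.linspace(lows[i], highs[i], n_bins - 1) for i in range(4)]

Q = np.zeros((n_bins,) * 4 + (n_actions,))

def discretize(state):
    # Map a continuous 4-D state to a tuple of bin indices.
    return tuple(int(np.digitize(s, b)) for s, b in zip(state, bins))

def update(state, action, reward, next_state):
    # Standard Q-learning target: r + gamma * max_a' Q(s', a').
    s, s2 = discretize(state), discretize(next_state)
    td_target = reward + gamma * Q[s2].max()
    Q[s + (action,)] += alpha * (td_target - Q[s + (action,)])
```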

Libraries Used

  • gymnasium
  • pytorch
  • matplotlib
  • pandas
  • numpy
  • moviepy

Setup

Important

To install Conda or Miniconda, follow this link.

If you have conda installed, install all the required libraries by running:

```bash
git clone https://github.com/davnish/smart_cartpole-v1  # clone the repo
cd smart_cartpole-v1                                    # move inside the repo
conda env create --name cartpole --file=conda_env.yml   # install the libraries
```

Then activate the environment with:

```bash
conda activate cartpole
```

Testing

To test a model, go into the directory of the algorithm you want to test; then, inside vis.py, call Simulate_DQ_Strategy(<model_no>) for DQN or Simulate_Q_Strategy(<model_no>) for Q-learning, where model_no is the number of the model you want to test. Run the function and an environment window will open, simulating the learned strategy.
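
For example, assuming vis.py exposes these functions at module level (the exact file layout is an assumption):

```python
# Run from inside the DQN directory; vis.py and the function name come
# from the instructions above, but the import style is assumed.
from vis import Simulate_DQ_Strategy

Simulate_DQ_Strategy(1)  # opens a window replaying trained model number 1
```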

All the trained models are stored in their respective models directories.

For details on every trained model, there is an info.txt file that records, for each model:

  • Episode: number of episodes trained on.
  • Time: time taken to train.
  • Ep Solved: number of episodes that achieved a reward >= 200.
  • High_Score: number of episodes that achieved a reward >= 300.
  • avg_intv: average reward per interval (shown in the graphs above).

References
