Skip to content

bobby060/pg-autohint

Repository files navigation

pg-autohint

AutoHint is system designed to enable using pg_hint_plan to implement heuristic, rules-based improvements to Postgres query plans.

Created as a class project for 15-799: Special Topics in Query Optimization at Carnegie Mellon University

At a high level, AutoHint either plans a query (or plans and executes a query) on Postgres, then uses rules to identify problems with the Postgres plan. Each rule generates a list of query hints designed to improve the plan.

Currently two rules are fully implemented with one in progress:

  1. NLJ to Hash Join: Identifies Nested Loop Joins whose estimated cardinality is off by a large factor from actual cardinality and converts to Hash Joins
  2. Cardinality injection. Injects the actual cardinality of joins back into the plan after analyzing
  3. Index selection (partially implemented). Tries to fix the case where an ORDER BY causes postgres to pick the wrong index. Currently only works on single indexes. Still in progress.

Additional patterns should be easy to implement.

We support both running hints that are in the query and also saving hints to the hint table so they are automatically applied when the query is run in the future.

AutoHint provides the ability to save hints in the hint table. This will allow future runs of a query to use the hints without being explicitly included in the query. Only works in Postgres 17.

Simple example:

    let mut conn = establish_connection("imdbload", "postgres", "postgres", "localhost", "5432");
    let mut opt = Optimizer::new(true, None);

    opt.add_rule(Box::new(rules::NljToHashJoin::new(1.0, 1500)));
    opt.add_rule(Box::new(rules::CardCorrection::new()));

    let query = "SELECT * FROM table;"


    opt.optimize(query, conn, true); 

See job_benchmark/src/main.rs for a more detailed example.

Results of running AutoHint on the Join Ordering Benchmark, single core Postgres 17. 9 queries time out in original, 11 time out in the AutoHinted version.

image

Distribution of improvement, binned by ratio of AutoHint/original image

Get started

  1. Install rust

Linux/unix: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

  1. Install postgres 17
sudo apt update
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
curl -fsSL https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/postgresql.gpg
sudo apt update
sudo apt install postgresql-17

sudo systemctl restart postgresql

sudo -i -u postgres psql


Create a user with your machines username

sudo su - postgres
psql


Run ALTER USER postgres PASSWORD 'postgres'; to set the password for the postgres user.

Run CREATE ROLE <username> superuser createdb login;, where username is your machines username.

Then copy the imdb_postgres_setup.sh to that user and run as postgres. I did this by touching a new file, then pasting the text, but there is probably a better way...

Now you can access imdb with psql -d imdb from your normal user

Tests will not pass unless you load the test dataset

Run all tests cargo test

Run specific test cargo test q10 -- --nocapture

  1. hint table To enable hint table feature, connect to the database, load pg_hint_plan, create hint table, and enable hint table by
LOAD 'pg_hint_plan';
CREATE EXTENSION pg_hint_plan;
set pg_hint_plan.enable_hint_table='on';

To get query identifier, run

SET compute_query_id = 'on';

Then EXPLAIN VERBOSE returns query identifier, which would be used to insert hint into hint table

Testing

Install code coverage tool cargo +stable install cargo-llvm-cov --locked

Redo code coverage test

cargo llvm-cov --html

JOB benchmark

To load JOB benchmark

git clone https://github.com/danolivo/jo-bench
cd jo-bench

export PGDATABASE=imdbload
export PGPORT=5432
export PGHOST=localhost
export PGUSER=postgres
export PGPASSWORD=<your password>

psql -f schema.sql
cp -r ./csv /tmp/csv
psql -vdatadir="'/tmp'" -f copy.sql
cd -

To run the original one pass

sudo vim /etc/postgresql/<version number>/main/postgresql.conf

add

shared_preload_libraries = 'pg_stat_statements, pg_hint_plan'

then

sudo systemctl restart postgresql
cd job_benchmark/
./run_job.sh --single-core
cd -

it will generate job-onepass-X.dat and explains-X.txt that contains execution time for each query and its plan from EXPLAIN (ANALYZE, VERBOSE, FORMAT JSON)

To run hinted version first generate hint table

cd job_benchmark/
cargo run --release <query dir>
cd -

Then run the hinted queries

cd job_benchmark/
./run_job_hinted.sh --single-core
cd -

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •