Building a Unix Shell from Scratch in Rust: Lessons Learned from Implementing Push

Over the past several months, I've been working with my team (Youssef Hajjaoui and Zakaria Salhi) on Push - a feature-rich Unix-like shell implementation written entirely in Rust. This project has been one of the most educational and challenging endeavors we've undertaken, pushing us to deeply understand systems programming, language design, and the intricate details of how shells actually work under the hood.

The Vision: Understanding the Foundation

Most developers use shells daily (bash, zsh, fish), but few understand what happens when you type ls | grep "test" and press Enter. We wanted to build a shell from scratch to truly understand:

How command parsing and execution works
The complexity of process management and job control
The nuances of variable expansion and quoting
How pipelines and redirections are implemented
The challenges of interactive terminal handling

Getting Started

Prerequisites

Rust (latest stable version recommended)
Cargo (comes with Rust)
Unix-like operating system (Linux, macOS, BSD)

Installation

Install Push using the Makefile:

make install

This will:

Build the shell and utilities in release mode
Install them to ~/.push/bin/
Create configuration files (~/.push/.pushrc, ~/.push/.push_history)
Add ~/.push/bin to your PATH by updating your shell configuration file

After installation, run:

source ~/.bashrc

or

source ~/.zshrc

Or simply restart your terminal. Then you can use Push with:

push

Running Push:

Interactive Mode: Just run push
Non-Interactive Mode (from stdin): echo "ls -la" | push
Command Mode: push -c "ls | grep test"

Project Structure

This section guides contributors on where to find and modify different components of the shell.

push/
├── shell/                    # Main shell implementation
│   ├── Cargo.toml           # Rust project configuration
│   └── src/
│       ├── main.rs          # Entry point - handles shell modes (interactive/non-interactive/command)
│       ├── lib.rs           # Library root - exports public modules
│       │
│       ├── lexer/           # LEXICAL ANALYSIS - Token generation
│       │   ├── mod.rs       # Lexer module exports
│       │   ├── tokenize.rs  # Main tokenizer - converts input to tokens
│       │   └── types.rs     # Token definitions (Token, Word, WordPart, State)
│       │
│       ├── parser/          # PARSING - AST construction
│       │   ├── mod.rs       # Parser module - main parser logic
│       │   ├── types.rs     # AST node definitions (AstNode, Redirect, ArithmeticExpr)
│       │   ├── parse_command.rs      # Command parsing
│       │   ├── parse_pipeline.rs     # Pipeline (|) parsing
│       │   ├── parse_sequence.rs     # Sequence (;) parsing
│       │   ├── parse_if.rs           # If-then-else parsing
│       │   ├── parse_while.rs        # While loop parsing
│       │   ├── parse_for.rs          # For loop parsing
│       │   ├── parse_function.rs     # Function definition parsing
│       │   ├── parse_group.rs        # Command grouping ({}) parsing
│       │   ├── parse_redirection.rs  # I/O redirection parsing
│       │   └── parse_assignment.rs   # Variable assignment parsing
│       │
│       ├── expansion.rs     # EXPANSION - Variable/arithmetic/command substitution
│       │
│       ├── envirement.rs    # ENVIRONMENT - Shell state management (variables, functions, jobs)
│       │
│       ├── executor/        # EXECUTION - AST interpretation
│       │   ├── mod.rs       # Main executor - dispatches to specific executors
│       │   ├── exec_command.rs       # Command execution
│       │   ├── exec_pipeline.rs      # Pipeline execution
│       │   ├── exec_sequence.rs      # Sequence execution
│       │   ├── exec_if.rs            # If statement execution
│       │   ├── exec_while.rs         # While loop execution
│       │   ├── exec_for.rs           # For loop execution
│       │   ├── exec_until.rs         # Until loop execution
│       │   ├── exec_and.rs           # Logical AND (&&) execution
│       │   ├── exec_or.rs            # Logical OR (||) execution
│       │   ├── exec_not.rs           # Logical NOT (!) execution
│       │   ├── exec_subshell.rs      # Subshell execution
│       │   ├── exec_group.rs         # Command group execution
│       │   ├── spawn_commande.rs     # Process spawning (fork/exec)
│       │   └── run_commande.rs       # Command execution helpers
│       │
│       ├── exec.rs          # Main execution entry point (legacy support)
│       │
│       ├── shell.rs         # Shell struct and core functionality
│       │
│       ├── commands/        # BUILT-IN COMMANDS
│       │   ├── cd.rs        # Change directory
│       │   ├── pwd.rs       # Print working directory
│       │   ├── echo.rs      # Print text
│       │   ├── ls.rs        # List directory (if implemented)
│       │   ├── cat.rs       # Concatenate files (if implemented)
│       │   ├── cp.rs        # Copy files
│       │   ├── rm.rs        # Remove files
│       │   ├── mv.rs        # Move/rename files
│       │   ├── mkdir.rs     # Create directories
│       │   ├── export.rs    # Export variables
│       │   ├── jobs.rs      # List jobs
│       │   ├── fg.rs        # Foreground job
│       │   ├── bg.rs        # Background job
│       │   ├── kill.rs      # Kill process
│       │   ├── exit.rs      # Exit shell
│       │   ├── type.rs      # Show command type
│       │   ├── test.rs      # Test command
│       │   ├── tru.rs       # True command
│       │   └── fals.rs      # False command
│       │
│       ├── features/        # FEATURES
│       │   ├── history.rs   # Command history management
│       │   └── jobs.rs      # Job control system
│       │
│       ├── shell_interactions/  # INTERACTIVE TERMINAL
│       │   ├── buffer.rs        # Input buffer management
│       │   ├── history_handler.rs  # History navigation
│       │   ├── rerender.rs      # Terminal re-rendering
│       │   └── utils.rs         # Terminal utilities
│       │
│       ├── events_handler.rs    # Compatibility layer (re-exports from shell.rs)
│       ├── shell.rs             # Shell struct and core functionality
│       ├── redirection.rs       # I/O redirection handling
│       ├── eval.rs              # Arithmetic expression evaluation
│       ├── error.rs             # Error types and handling
│       └── signal_handler.rs    # Signal handling
│
├── cat/                    # Cat utility (separate binary)
├── ls/                     # Ls utility (separate binary)
├── Makefile                # Build and installation scripts
└── README.md               # This file

Quick Reference for Contributors

Want to add a new built-in command? → Add implementation in shell/src/commands/your_command.rs and register it in shell/src/exec.rs (build_command function)

Want to modify tokenization? → Edit shell/src/lexer/tokenize.rs and shell/src/lexer/types.rs

Want to add new AST node types? → Edit shell/src/parser/types.rs and add corresponding parser in shell/src/parser/

Want to add new execution logic? → Add executor in shell/src/executor/ and register in shell/src/executor/mod.rs

Want to modify environment/variables? → Edit shell/src/envirement.rs

Want to improve interactive features? → Edit shell/src/shell_interactions/ and shell/src/shell.rs

Want to add expansion features? → Edit shell/src/expansion.rs

Architecture: A Multi-Stage Pipeline

The shell follows a classic compiler-like architecture with distinct phases:

1. Lexical Analysis (Tokenization)

The lexer is a state machine that processes raw input character by character. This was more complex than we initially anticipated. Handling edge cases like:

Variable substitution: $VAR, ${VAR}, $((arithmetic)), $(command)
Quoting: Single quotes (literal), double quotes (variable expansion), and escaping
Redirection operators: >, >>, <, <<, and file descriptor variants like 2>&1
Word boundaries: When does a word end? What about ls>file vs ls > file?

The state machine approach was crucial here. States like InWord, InSingleQuote, InDoubleQuote, MaybeRedirectOut2 allowed us to handle context-dependent parsing correctly.

2. Parsing: Building the Abstract Syntax Tree

The parser constructs an AST representing the shell command structure. This was fascinating because shell syntax is more complex than it appears:

Operator precedence: && vs || vs ; vs |
Grouping: { commands; } vs (subshell)
Control flow: if-then-elif-else-fi, while-do-done, for-in-do-done
Function definitions: name() { body; }

We implemented a recursive descent parser with lookahead, which made handling operator precedence and nested constructs manageable. The AST nodes (AstNode enum) represent everything from simple commands to complex control structures.

3. Expansion: The Magic of Variable Substitution

Variable expansion is where shells get their power. Implementing this correctly required understanding:

When to expand: Variables in double quotes expand, but not in single quotes
Field splitting: After expansion, words are split on whitespace (unless quoted)
Arithmetic expansion: $((2 + 2)) evaluates to 4
Command substitution: $(date) executes and substitutes output
Default values: ${VAR:-default} provides fallback values

The Word type stores expansion information, allowing the executor to know whether a word should be split or kept as-is.

3.5. Environment Management: The Shell's State

One of the most critical components we implemented was the shell environment. The ShellEnv struct manages the complete state of the shell:

Shell Variables: A HashMap<String, (String, bool)> storing variable names, values, and whether they're exported to child processes
Arithmetic Variables: Separate storage for numeric variables used in arithmetic expressions
User-Defined Functions: Functions defined with name() { body; } syntax, stored as AST nodes
Job Control: Integration with the job management system for tracking background processes
Exit Status Tracking: Maintaining $? (last command exit status)
Positional Arguments: Handling $0, $1, $2, etc. for script arguments

The environment initialization (ShellEnv::new()) sets up:

Inherited environment variables from the parent process
Standard variables: USER, HOME, SHELL, PWD
Special handling for ~ (home directory expansion)
Positional arguments from command-line invocation

We implemented proper separation between local variables (shell-only) and exported variables (passed to child processes). The set_local_var() and set_env_var() methods ensure correct scoping and inheritance behavior.

4. Execution: Where the Real Work Happens

The execution engine is the heart of the shell. This is where we learned the most about Unix process management:

Process Creation and Management:

Using fork() to create child processes (via the nix crate)
execve() to replace process image with external commands
Process groups for job control: setpgid() to create/manage groups
Terminal control: tcsetpgrp() to give terminal control to foreground processes

Pipelines: Implementing pipelines was particularly challenging. The key insight: all commands in a pipeline must run concurrently, connected via pipes. The implementation:

Creates pipes between consecutive commands
Spawns all processes concurrently (not sequentially!)
Connects stdin/stdout appropriately
Waits for all processes to complete
Returns the exit status of the last command

Job Control: Job control allows background execution and process management:

Background jobs: command & runs in background
Job tracking: Each job has a process group ID (PGID)
Status tracking: Running, Stopped, Done, Terminated
Job commands: jobs, fg, bg, kill
Signal handling: SIGTSTP (Ctrl+Z), SIGINT (Ctrl+C), SIGCONT

The job reaper thread periodically checks for completed child processes using waitpid() with WNOHANG, preventing zombie processes.

Redirections: I/O redirection required careful file descriptor management:

> redirects stdout (fd 1) to a file
>> appends to a file
< redirects stdin (fd 0) from a file
2>&1 redirects stderr to stdout
File descriptor variants: 2>file, 1>&2

The redirection system merges redirects from commands and groups, ensuring proper ordering and precedence.

5. Interactive Terminal Handling

Building an interactive shell required raw terminal mode:

Raw mode: Disable line buffering and echo for full control
Line editing: Insert, delete, cursor movement
History navigation: Up/Down arrows to browse command history
Signal handling: Ctrl+C, Ctrl+D, Ctrl+Z, Ctrl+L (clear screen)
Prompt display: Dynamic prompt with proper cursor positioning

Using the termion crate, we implemented a readline-like interface with proper cursor tracking and history management. The history persists to a file for future sessions.

Key Technical Challenges

1. Process Group Management

One of the trickiest aspects was understanding process groups. When you run ls | grep test, both processes must be in the same process group so signals (like Ctrl+C) affect both. The first process becomes the group leader, and subsequent processes join that group.

2. Terminal Control Transfer

When a foreground process runs, the shell must:

Give terminal control to the process group (tcsetpgrp)
Temporarily ignore SIGTTOU (to avoid blocking)
Wait for the process to complete
Reclaim terminal control

This dance ensures proper signal delivery and terminal behavior.

3. Pipeline Execution Order

Pipelines must execute all commands concurrently, not sequentially. This means:

All fork() calls happen before any wait()
Pipes are created before spawning processes
File descriptors are properly closed in parent and child
The shell waits for the entire pipeline, not individual commands

4. Variable Expansion Edge Cases

Handling variable expansion correctly required careful attention to:

Quoting: "$VAR" expands, '$VAR' doesn't
Field splitting: VAR="a b" splits into two arguments when unquoted
Empty variables: "" vs unset variables
Special variables: $? (exit status), $0 (positional arguments), $$ (PID)

5. Error Handling and Recovery

Rust's Result<T, E> type was invaluable for error handling. The ShellError enum captures:

Syntax errors: Unclosed quotes, unexpected tokens
Parse errors: Invalid command structure
Execution errors: Command not found, permission denied
IO errors: File operations, pipe creation

Proper error propagation ensures the shell can recover gracefully and provide meaningful error messages.

Rust-Specific Insights

Building this in Rust provided unique advantages and challenges:

Advantages:

Memory safety: No buffer overflows, use-after-free, or data races
Pattern matching: Exhaustive matching on enums caught many bugs at compile time
Ownership system: Made it clear who owns file descriptors, processes, etc.
Error handling: Result types forced explicit error handling
Zero-cost abstractions: The abstractions compile away, leaving efficient code

Challenges:

Lifetime management: Ensuring file descriptors and process handles live long enough
String handling: Rust's string types required careful conversion between &str, String, and OsString
separation of concerns: Separate between lexer job and parser job

Features Implemented

The shell supports a comprehensive set of features, all tested and verified:

Built-in Commands ✅

File operations: ls, cat, cp, rm, mv, mkdir
Navigation: cd, pwd
Utilities: echo, export, type, test, true, false
Job control: jobs, fg, bg, kill
Control: exit

Shell Constructs ✅

Command sequences: cmd1; cmd2; cmd3 (fully working)
Pipelines: cmd1 | cmd2 | cmd3 (multi-stage pipelines supported)
Logical operators: cmd1 && cmd2, cmd1 || cmd2, ! cmd (all working)
Background execution: cmd & (supported)
Variable assignments: VAR=value command (working)
I/O redirection: >, >>, < (working when directories exist)
Command grouping: { cmd1; cmd2; } (fully functional)

Control Flow ✅

Conditionals: if-then-elif-else-fi (all branches working)
Loops:
- for-in-do-done (fully working)
- while-do-done (working, arithmetic in conditions has limitations)
- until-do-done (working, arithmetic in conditions has limitations)
Loop control: break, continue (with optional levels, fully working)
Functions: name() { body; } (fully implemented with argument support)

Functions ✅ (Fully Implemented)

Function definitions: myfunc() { echo hello; }
Function calls: myfunc
Function arguments: greet() { echo hello $1; } followed by greet world
Multiple arguments: Functions support $1, $2, $3, etc.
Nested function calls: Functions can call other functions
Function redefinition: Functions can be redefined
Positional parameter scoping: Parameters are properly saved/restored
Functions with control structures: Functions can contain if/for/while
Functions with pipelines and redirections: Full support

Interactive Features ✅

Command history with persistent storage
Line editing (insert, delete, cursor movement)
History navigation (Up/Down arrows)
Terminal control (Ctrl+C, Ctrl+D, Ctrl+Z, Ctrl+L)
Dynamic prompt

Variable Expansion ✅

Basic expansion: $VAR, ${VAR}
Environment variables: export VAR=value and $VAR
Positional parameters: $1, $2, etc. (in functions)
Special variables: $0 (preserved in functions), $? (exit status)

Known Limitations ⚠️

Subshells: (cmd1; cmd2) - Parse error, not yet implemented
Arithmetic in test conditions: $((i+1)) in while/until loops has issues
File operation flags: mkdir -p and rm -rf don't support flags (flags treated as filenames)
Pipelines with control structures: Pipelines can only contain commands, not if/for/while inside them
Redirection to non-existent directories: Requires parent directory to exist first

What We Learned

This project taught us:

Systems Programming: Deep understanding of Unix process model, signals, file descriptors, and terminal I/O
Language Design: How shell syntax evolved and why certain design decisions were made
Compiler Techniques: Lexing, parsing, AST construction, and interpretation
Concurrency: Process management, job control, and signal handling
Rust Mastery: Advanced Rust features, FFI, ownership patterns, and error handling

The Result

Push is a very capable shell implementation that has been extensively tested and verified. It can:

✅ Execute external commands and built-ins - All basic commands work correctly ✅ Handle complex pipelines and redirections - Multi-stage pipelines fully functional ✅ Manage background jobs - Job control with proper process group handling ✅ Support control flow constructs - If/while/for/until all working ✅ Function support - Full function implementation with arguments and scoping ✅ Provide interactive editing - History, line editing, cursor movement ✅ Manage shell environment - Proper variable scoping and function storage ✅ Command sequences and logical operators - All combinations working ✅ Nested control structures - Nested loops and conditionals supported

Test Results Summary

Based on extensive testing (40+ test scenarios):

75% of features fully working (30/40 tests passed completely)
20% partially working (8/40 tests with minor limitations)
5% not yet implemented (2/40 tests - subshells and some edge cases)

The shell successfully handles:

All basic built-in commands
Complex command sequences (8+ commands)
Multi-stage pipelines
All logical operator combinations
Complete if-then-elif-else-fi structures
For, while, and until loops
Break and continue with levels
Full function implementation with arguments
Variable expansion and environment management
Command grouping and complex combinations

While it's not a complete replacement for bash or zsh (some advanced features like subshells and command substitution are still in development), we've built a solid, well-tested foundation that demonstrates deep systems programming knowledge and Rust expertise. The core functionality is robust and handles most day-to-day shell operations effectively. The architecture is well-designed and extensible, making it ready for further development.

Reflection

Building a shell from scratch is one of those projects that seems simple until you start implementing it. The devil is truly in the details: handling edge cases in parsing, managing process lifecycles correctly, ensuring proper signal delivery, and providing a smooth interactive experience.

This project has been invaluable for our growth as systems programmers. Working as a team, we divided responsibilities while maintaining close collaboration. We implemented the complete shell environment system together, ensuring proper variable scoping, function storage, and job tracking. The collaborative effort forced us to understand not just how shells work, but why they work the way they do. Every feature we implemented revealed new layers of complexity and taught us something new about Unix, Rust, and software engineering in general.

If you're interested in systems programming, language implementation, or just want to understand how the tools you use daily actually work, I highly recommend building a shell. It's challenging, educational, and deeply satisfying.

Open for Contributions

We're excited to share Push with the community and welcome contributions! While we have a solid foundation in place, there are still features to implement and improvements to make. Whether you're interested in:

Adding new built-in commands
Implementing advanced shell features
Improving error handling and user experience
Optimizing performance
Writing documentation or tests
Or just exploring how shells work

We'd love to have you contribute! The codebase is well-structured, and we're happy to help onboard new contributors. If you're passionate about systems programming, Rust, or just want to learn by contributing to a real project, reach out or check out our repository.

Tech Stack: Rust, Unix bindings, termion (terminal I/O), various standard library crates

Status: Production-ready core functionality with extensive testing - 75% of features fully working, open for contributions

Test Coverage: 40+ comprehensive test scenarios covering all implemented features

Team: Youssef Hajjaoui, Zakaria Salhi, and myself

#Rust #SystemsProgramming #Unix #Shell #SoftwareEngineering #OpenSource #Programming #Tech #ContributionsWelcome

Name		Name	Last commit message	Last commit date
Latest commit History 206 Commits
cat		cat
ls		ls
shell		shell
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
Makefile		Makefile
PROJECT_INDEX.md		PROJECT_INDEX.md
README.md		README.md
abc.txt		abc.txt
comprehensive_test.sh		comprehensive_test.sh
extensive_test.sh		extensive_test.sh
install-docker-rootless.zsh		install-docker-rootless.zsh
test_commands.sh		test_commands.sh
test_ctrl_z.sh		test_ctrl_z.sh

e-aub/push

Folders and files

Latest commit

History

Repository files navigation