jamesainslie/antimoji

Antimoji

A blazing-fast CLI tool and linter for detecting and removing emojis from code files and documentation.

Antimoji is a high-performance emoji detection and removal tool built with Go using functional programming principles. Designed primarily as a linter, it provides comprehensive emoji scanning and cleaning capabilities for maintaining professional, emoji-free codebases with seamless CI/CD and pre-commit integration.

Features

Core Capabilities

  • Unicode Emoji Detection: Comprehensive support for Unicode 15.0+ emojis
  • Text Emoticon Detection: Recognizes :), :(, :D and other emoticons
  • Custom Pattern Detection: Supports :smile:, :thumbs_up: style patterns
  • Multi-Rune Support: Handles skin tone modifiers and ZWJ sequences
  • Allowlist Filtering: Configurable patterns to preserve specific emojis
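
To make the multi-rune point concrete, here is a minimal Go sketch of how one visible emoji can span several runes. It is an illustration only and not Antimoji's actual detector: the names `isEmojiBase`, `isSkinTone`, and `countEmojiSequences` are hypothetical, and the code point ranges are a coarse subset rather than the full Unicode emoji data.

```go
package main

import "fmt"

const (
	zwj  rune = 0x200D // zero width joiner
	vs16 rune = 0xFE0F // variation selector-16 (emoji presentation)
)

// isEmojiBase reports whether r lies in a few well-known emoji blocks.
// Assumption: a coarse illustrative subset, not the full Unicode emoji data.
func isEmojiBase(r rune) bool {
	return (r >= 0x1F300 && r <= 0x1FAFF) || (r >= 0x2600 && r <= 0x27BF)
}

// isSkinTone reports whether r is a Fitzpatrick skin tone modifier.
func isSkinTone(r rune) bool { return r >= 0x1F3FB && r <= 0x1F3FF }

// countEmojiSequences counts emojis, treating a base rune plus any trailing
// skin tone modifiers, variation selectors, and ZWJ-joined emojis as one match.
func countEmojiSequences(s string) int {
	runes := []rune(s)
	count := 0
	for i := 0; i < len(runes); {
		if !isEmojiBase(runes[i]) {
			i++
			continue
		}
		count++
		i++ // consume the base rune
		for i < len(runes) {
			if runes[i] == vs16 || isSkinTone(runes[i]) {
				i++
			} else if runes[i] == zwj && i+1 < len(runes) && isEmojiBase(runes[i+1]) {
				i += 2 // consume the joiner and the emoji it attaches
			} else {
				break
			}
		}
	}
	return count
}

func main() {
	// Thumbs up followed by a skin tone modifier: two runes, one emoji.
	fmt.Println(countEmojiSequences("ok \U0001F44D\U0001F3FD"))
	// Family sequence: three emojis joined by ZWJ, still one emoji.
	fmt.Println(countEmojiSequences("\U0001F468\u200D\U0001F469\u200D\U0001F467"))
}
```

Counting sequences rather than runes keeps a reported emoji count in line with what a reader sees on screen.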

File Operations

  • Safe File Modification: Atomic operations prevent data corruption
  • Backup Creation: Automatic backups with timestamp naming
  • Permission Preservation: Maintains original file permissions
  • Streaming Processing: Memory-efficient handling of large files
  • Binary File Detection: Automatically skips non-text files

CLI Interface

  • Multiple Commands: scan for detection, clean for removal, generate for configuration, setup-lint for automated setup
  • Output Formats: Table, JSON, and CSV formats with colored user-friendly display
  • Configuration Profiles: Default, strict, and CI/CD profiles
  • Performance Statistics: Built-in benchmarking and metrics
  • Dry-Run Mode: Preview changes without file modification
  • Advanced Logging: OpenTelemetry compliant structured logging with multiple levels
  • Debugging Support: Comprehensive Unicode detection debugging and binary file analysis

Linting & Integration

  • CI/CD Linting: Designed for automated emoji policy enforcement
  • Pre-commit Hooks: Auto-clean emojis before commits with backup creation
  • Configurable Allowlists: Smart emoji filtering for legitimate use cases
  • Self-Linting: Antimoji lints its own codebase to keep it emoji-free
  • Threshold-based Policies: Fail builds when emoji limits are exceeded

Installation

From Source

git clone https://github.com/jamesainslie/antimoji.git
cd antimoji
make build
sudo cp bin/antimoji /usr/local/bin/

Using Go Install

go install github.com/jamesainslie/antimoji/cmd/antimoji@latest

Using Homebrew (macOS/Linux)

brew tap jamesainslie/antimoji
brew install antimoji

Using Docker

docker pull ghcr.io/jamesainslie/antimoji:latest
docker run --rm -v "$(pwd)":/app ghcr.io/jamesainslie/antimoji:latest scan /app

Quick Start

Automated Setup (Recommended)

# Zero-tolerance setup (no emojis allowed)
antimoji setup-lint --mode=zero-tolerance

# Allow-list setup (only specific emojis allowed)
antimoji setup-lint --mode=allow-list --allowed-emojis="✅,❌"

# Permissive setup (warns about excessive usage)
antimoji setup-lint --mode=permissive

# The setup-lint command automatically:
# - Generates .antimoji.yaml configuration
# - Updates .pre-commit-config.yaml with hooks
# - Installs pre-commit hooks (optional)

Scan for Emojis

# Scan current directory
antimoji scan .

# Scan specific files
antimoji scan file.go README.md

# Recursive scan with statistics and verbose output
antimoji scan --recursive --stats --verbose src/

# JSON output for automation with info-level logging
antimoji scan --format json --log-level=info .

# Debug emoji detection issues
antimoji scan --log-level=debug --verbose .

Remove Emojis

# Preview changes (safe)
antimoji clean --dry-run .

# Remove emojis with backup
antimoji clean --backup --in-place .

# Custom replacement text with verbose output
antimoji clean --replace "[EMOJI]" --in-place --verbose .

# Respect allowlist configuration with logging
antimoji clean --respect-allowlist --in-place --log-level=info .

# Debug emoji removal issues
antimoji clean --dry-run --log-level=debug --verbose .

Automated Linting Setup

Setup-Lint Command

The setup-lint command provides one-command configuration for emoji linting:

# Zero-tolerance mode (strictest)
antimoji setup-lint --mode=zero-tolerance
# Creates configuration that disallows ALL emojis in source code

# Allow-list mode (recommended)
antimoji setup-lint --mode=allow-list --allowed-emojis="✅,❌,⚠️"
# Allows only specified emojis with customizable list

# Permissive mode (development-friendly)
antimoji setup-lint --mode=permissive  
# Warns about excessive emoji usage but doesn't fail builds

# Advanced options
antimoji setup-lint --mode=allow-list \
  --allowed-emojis="🚀,⚡,✅,❌" \
  --output-dir=./config \
  --force \
  --skip-precommit

What setup-lint does:

  • ✅ Generates .antimoji.yaml with mode-specific profiles
  • ✅ Creates/updates .pre-commit-config.yaml with antimoji hooks
  • ✅ Installs pre-commit hooks (unless --skip-precommit)
  • ✅ Provides detailed usage instructions and next steps

Linting Policies & Configuration

Policy Enforcement

Antimoji supports multiple emoji policies for different use cases:

# Zero-tolerance policy (strict linting)
antimoji scan --threshold=0 --ignore-allowlist .

# Allowlist-based policy (recommended)
antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .

# Permissive policy (development)
antimoji scan --threshold=10 .
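
The three policies differ in which emojis count as violations and how many violations are tolerated. The Go sketch below models that decision under two assumptions that the text above does not spell out: allowlisted emojis are not counted, and a scan fails only when the number of remaining emojis is greater than the threshold. The function names `violations` and `passes` are hypothetical.

```go
package main

import "fmt"

// violations returns the found emojis that are not on the allowlist.
// A nil allowlist models --ignore-allowlist: every emoji is a violation.
func violations(found []string, allowlist map[string]bool) []string {
	var out []string
	for _, e := range found {
		if !allowlist[e] {
			out = append(out, e)
		}
	}
	return out
}

// passes reports whether the violation count stays within the threshold.
// Assumption: the check fails only when the count exceeds the threshold.
func passes(found []string, allowlist map[string]bool, threshold int) bool {
	return len(violations(found, allowlist)) <= threshold
}

func main() {
	found := []string{"\u2705", "\U0001F680", "\u2705"} // check mark, rocket, check mark
	allow := map[string]bool{"\u2705": true}

	fmt.Println(passes(found, nil, 0))    // zero tolerance, allowlist ignored
	fmt.Println(passes(found, allow, 0))  // allowlist policy, rocket still counts
	fmt.Println(passes(found, allow, 10)) // permissive policy
}
```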

Configuration Generation

# Generate configuration based on current project usage
antimoji generate --type=ci-lint --output=.antimoji.yaml .

# Available generation types:
# ci-lint    - Strict allowlist for CI/CD linting
# dev        - Permissive allowlist for development
# test-only  - Only allow emojis found in test files
# docs-only  - Only allow emojis found in documentation
# minimal    - Only frequently used emojis
# full       - All found emojis with categorization

Logging and Debugging

Antimoji provides comprehensive logging and debugging capabilities for troubleshooting and monitoring:

Logging Levels

# Silent mode (default) - no diagnostic output
antimoji scan --log-level=silent .

# Info mode - basic operation information
antimoji scan --log-level=info .

# Debug mode - detailed diagnostic information
antimoji scan --log-level=debug .

# Error mode - only errors and warnings
antimoji scan --log-level=error .

Logging Formats

# JSON format (default) - structured logs for monitoring
antimoji scan --log-format=json --log-level=info .

# Text format - human-readable logs for development
antimoji scan --log-format=text --log-level=info .

Debugging Emoji Detection Issues

# Debug specific file with detailed Unicode information
antimoji clean --dry-run --log-level=debug --verbose specific-file.go

# Monitor binary file detection
antimoji scan --log-level=debug . 2>&1 | grep "Binary file"

# Track emoji detection with Unicode code points
antimoji scan --log-level=debug . 2>&1 | grep "unicode_codepoints"

User Output vs Diagnostic Logging

Antimoji separates user-facing output from diagnostic logging:

  • User Output: Colored, formatted messages to stdout/stderr for human consumption
  • Diagnostic Logging: Structured JSON logs with OpenTelemetry compliance for monitoring

# Example combined output:
$ antimoji clean --dry-run --log-level=info --verbose .

# User output (colored, formatted):
INFO: File discovery completed for cleaning - files found: 42
✓ Summary: would remove 5 emojis from 42 files (2 modified, 0 errors)

# Diagnostic logs (structured JSON):
{"level":"INFO","msg":"Starting clean operation","operation":"clean","component":"cli"}
{"level":"DEBUG","msg":"Emoji detected","file_path":"test.go","unicode_codepoints":["U+1F600"]}

Configuration

Antimoji uses XDG-compliant configuration files:

# ~/.config/antimoji/config.yaml
version: "0.5.0"
profiles:
  default:
    # File processing
    recursive: true
    follow_symlinks: false
    backup_files: false
    
    # Emoji detection
    unicode_emojis: true
    text_emoticons: true
    custom_patterns: [":smile:", ":frown:", ":thumbs_up:"]
    
    # Allowlist (emojis to preserve)
    emoji_allowlist:
      - "✅"  # Checkmark for task completion
      - "❌"  # Cross mark for failures
      - "⚠️"  # Warning symbol
    
    # File filters
    include_patterns: ["*.go", "*.md", "*.js", "*.py", "*.ts"]
    exclude_patterns: ["vendor/*", "node_modules/*", ".git/*"]
    
    # Output
    output_format: "table"
    show_progress: true
    colored_output: true

Configuration Profiles

Default Profile

Balanced settings for general development use with common allowlisted emojis.

Strict Profile

Zero-tolerance policy - removes all emojis regardless of type.

CI Profile

Optimized for CI/CD pipelines with JSON output and specific error codes.

Usage Examples

Development Workflow

# Generate project-specific allowlist
antimoji generate --type=ci-lint --output=.antimoji.yaml .

# Check for emojis before commit
antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .

# Clean codebase maintaining allowlisted emojis
antimoji clean --config=.antimoji.yaml --respect-allowlist --backup --in-place .

# Generate report for code review
antimoji scan --config=.antimoji.yaml --format=json > emoji-report.json

Quick Start for Linting (Legacy)

# Manual setup (use setup-lint command instead for easier configuration)
go install github.com/jamesainslie/antimoji/cmd/antimoji@latest
antimoji generate --type=ci-lint --output=.antimoji.yaml .
antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .

Linting Integration

Antimoji is designed primarily as a linter for automated emoji policy enforcement in development workflows.

Generate Allowlist Configuration

# Analyze current project and generate strict allowlist
antimoji generate --type=ci-lint --output=.antimoji.yaml .

# Generate development-friendly allowlist
antimoji generate --type=dev --output=.antimoji-dev.yaml .

# Generate minimal allowlist (only frequently used emojis)
antimoji generate --type=minimal --min-usage=3 .

Pre-commit Integration

Automatic Setup:

# Install pre-commit framework and antimoji hooks
make install-pre-commit

# Generate antimoji configuration
make generate-allowlist

# Test the integration
make test-pre-commit

Manual Setup - Add to your .pre-commit-config.yaml:

repos:
  # Antimoji emoji linting with auto-cleaning
  - repo: https://github.com/jamesainslie/antimoji
    rev: v0.9.6  # Use latest release
    hooks:
      # Strict linting - fails if emojis found in source code
      - id: antimoji-lint
        files: \.(go|js|ts|jsx|tsx|py|rb|java|c|cpp|h|hpp|rs|php|swift|kt|scala)$
        exclude: .*_test\.|.*/test/.*|.*/testdata/.*
        
      # Auto-clean emojis with backup (recommended)
      - id: antimoji-clean
        files: \.(go|js|ts|jsx|tsx|py|rb|java|c|cpp|h|hpp|rs|php|swift|kt|scala)$
        exclude: .*_test\.|.*/test/.*|.*/testdata/.*
        
      # Documentation checking (permissive)
      - id: antimoji-docs
        files: \.(md|rst|txt)$

Local Repository Setup:

repos:
  - repo: local
    hooks:
      - id: antimoji-strict
        name: Antimoji Strict Linter
        entry: bin/antimoji scan --threshold=0 --ignore-allowlist --quiet
        language: system
        files: \.(go|js|ts|py|java|c|cpp|rs)$
        exclude: .*_test\.|.*/test/.*
        pass_filenames: true

CI/CD Integration

GitHub Actions Example:

name: CI

on: [push, pull_request]

jobs:
  emoji-lint:
    name: Emoji Linting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.21'
          
      - name: Install Antimoji
        run: go install github.com/jamesainslie/antimoji/cmd/antimoji@latest
        
      - name: Generate Allowlist
        run: antimoji generate --type=ci-lint --output=.antimoji.yaml .
        
      - name: Lint for Emojis
        run: antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 --format=json --log-level=info .

GitLab CI Example:

emoji-lint:
  stage: test
  image: golang:1.21
  script:
    - go install github.com/jamesainslie/antimoji/cmd/antimoji@latest
    - antimoji generate --type=ci-lint --output=.antimoji.yaml .
    - antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 --log-level=info .
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

Jenkins Pipeline Example:

pipeline {
    agent any
    stages {
        stage('Emoji Lint') {
            steps {
                sh 'go install github.com/jamesainslie/antimoji/cmd/antimoji@latest'
                sh 'antimoji generate --type=ci-lint --output=.antimoji.yaml .'
                sh 'antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 --log-level=info .'
            }
        }
    }
}

Docker Integration:

# In your Dockerfile for CI
FROM golang:1.21-alpine AS linter
RUN go install github.com/jamesainslie/antimoji/cmd/antimoji@latest
COPY . /app
WORKDIR /app
RUN antimoji scan --threshold=0 --ignore-allowlist --log-level=info .

Real-World Examples

Self-Linting Project

Antimoji uses itself for emoji linting! Here's how:

Configuration (.antimoji.yaml):

version: "0.5.0"
profiles:
  ci-lint:
    # Scan source code and build files
    include_patterns: ["*.go", "*.js", "*.py", "Makefile", "*.mk"]
    
    # Allow legitimate emojis from tests and docs
    emoji_allowlist: ["😀", "✅", "❌", ":)", ":(", ":smile:"]
    
    # Exclude files that legitimately contain emojis
    file_ignore_list: ["*_test.go", "*.md", "generate.go", "config.go"]
    
    # Fail build if emojis found in production code
    fail_on_found: true
    max_emoji_threshold: 0

GitHub Actions Integration:

# From .github/workflows/ci.yml
antimoji-lint:
  name: Antimoji Lint
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Build antimoji
      run: make build
    - name: Run antimoji linter
      run: |
        if [ -f ".antimoji.yaml" ]; then
          ./bin/antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .
        else
          ./bin/antimoji generate --type=ci-lint --output=.antimoji.yaml .
          ./bin/antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .
        fi

Makefile Integration:

# From Makefile
antimoji-lint: build
	@echo "Running antimoji linter..."
	@./bin/antimoji scan --config=.antimoji.yaml --profile=ci-lint --threshold=0 .

generate-allowlist: build
	@echo "Generating antimoji allowlist configuration..."
	@./bin/antimoji generate --type=ci-lint --output=.antimoji.yaml .

check-all: deps fmt vet lint antimoji-lint security-scan test-coverage-check
	@echo "All quality checks passed!"

Enterprise Integration

Multi-Language Repository:

# .antimoji.yaml for polyglot projects
profiles:
  strict:
    include_patterns: 
      - "*.go"      # Backend
      - "*.ts"      # Frontend  
      - "*.py"      # Scripts
      - "*.java"    # Services
      - "Makefile"  # Build
    emoji_allowlist: []  # Zero tolerance
    fail_on_found: true

Gradual Adoption:

# Phase 1: Documentation only
antimoji generate --type=docs-only --output=.antimoji.yaml .

# Phase 2: Add test files  
antimoji generate --type=test-only --output=.antimoji.yaml .

# Phase 3: Full source code
antimoji generate --type=ci-lint --output=.antimoji.yaml .

Large Repository Processing

# High-performance scanning
antimoji scan --recursive --stats --workers 8 .

# Memory-efficient cleaning
antimoji clean --stream --in-place large-repo/
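
A worker count such as `--workers 8` typically maps onto a fixed pool of goroutines that pull file paths from a shared channel. The sketch below shows that general pattern in Go. It is not Antimoji's scheduler: `scanAll` and the `scan` callback are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// scanAll applies scan to every path using a fixed number of worker
// goroutines and returns the sum of the per-file results.
// Hypothetical function for illustration only.
func scanAll(paths []string, workers int, scan func(string) int) int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				results <- scan(p)
			}
		}()
	}

	// Feed the queue, then signal that no more work is coming.
	go func() {
		for _, p := range paths {
			jobs <- p
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for n := range results {
		total += n
	}
	return total
}

func main() {
	paths := []string{"a.go", "b.md", "c.py"}
	// Stand-in scan function: pretend each file contains one emoji.
	total := scanAll(paths, 8, func(string) int { return 1 })
	fmt.Println("emojis found:", total)
}
```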

Performance

Antimoji is optimized for high-performance processing:

  • Small files (<1KB): >10,000 files/second
  • Medium files (1-100KB): >1,000 files/second
  • Large files (>1MB): >100MB/second throughput
  • Memory usage: <50MB for typical repositories
  • Startup time: <100ms cold start

Architecture

Antimoji follows clean architecture principles with functional programming and comprehensive observability:

CLI Layer          → Cobra commands, User output, Viper config
Application Layer  → Command handlers, Config manager, Context propagation
Business Logic     → Emoji detector, File processor, Allowlist manager
Infrastructure     → File system, Concurrency, Memory management
Observability      → OTEL logging, Context tracking, User output separation

Key Design Principles

  • Functional Programming: Pure functions and immutable data
  • Performance First: Zero-copy operations and memory pooling
  • Safety Emphasis: Atomic operations and comprehensive error handling
  • Test-Driven: 85% minimum test coverage requirement
  • Observability First: Structured logging and comprehensive debugging
  • Separation of Concerns: User output distinct from diagnostic logging

Development

Prerequisites

  • Go 1.21 or later
  • Make (for build automation)

Development Setup

git clone https://github.com/jamesainslie/antimoji.git
cd antimoji
make dev-setup
make test-watch

Running Tests

# Run all tests
make test

# Test with coverage
make test-coverage

# Run benchmarks
make benchmark

# Quality checks
make check-all

Build

# Development build
make build

# Release build
make build-release

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests first (TDD approach)
  4. Implement the feature while maintaining at least 85% test coverage
  5. Ensure all linting passes (make lint)
  6. Commit changes (git commit -m 'feat: add amazing feature')
  7. Push to branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

Development Standards

  • 85% minimum test coverage
  • Zero linting warnings
  • Functional programming principles
  • Comprehensive documentation

License

This project is licensed under the MIT License - see the LICENSE file for details.

Troubleshooting

Common Issues

Inconsistent Results Between Scan and Clean

If scan and clean report different emoji counts:

# Enable debug logging to identify the issue
antimoji scan --log-level=debug . 2>&1 | grep -E "(Binary file|Emojis detected)"
antimoji clean --dry-run --log-level=debug . 2>&1 | grep -E "(Binary file|Emojis detected)"

This shows whether scan and clean are treating binary files differently.

Binary Files Being Processed as Text

If you see emojis detected in binary files (executables, images, etc.):

# Check binary file detection
antimoji scan --log-level=debug suspicious-file 2>&1 | grep "Binary file"

Emoji Detection Issues

For detailed emoji detection debugging:

# Show Unicode code points for detected emojis
antimoji scan --log-level=debug file.txt 2>&1 | grep "unicode_codepoints"

# Show which Unicode ranges are matching
antimoji scan --log-level=debug file.txt 2>&1 | grep "matched_ranges"
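
To interpret `unicode_codepoints` values in the debug logs, it helps to see how a string breaks down into code points. The short Go sketch below prints them in the same `U+XXXX` notation. The `codepoints` helper is hypothetical and independent of Antimoji's logging code.

```go
package main

import "fmt"

// codepoints returns the Unicode code points of s in U+XXXX notation.
// Hypothetical helper for illustration only.
func codepoints(s string) []string {
	var out []string
	for _, r := range s {
		out = append(out, fmt.Sprintf("%U", r))
	}
	return out
}

func main() {
	fmt.Println(codepoints("\U0001F600"))   // grinning face: a single code point
	fmt.Println(codepoints("\u26A0\uFE0F")) // warning sign plus variation selector
}
```

A single visible emoji can therefore appear as several code points in the logs, which is expected for variation selectors, skin tone modifiers, and ZWJ sequences.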

Performance Issues

For performance analysis:

# Enable benchmarking and detailed metrics
antimoji scan --stats --log-level=info --verbose .

# Monitor processing patterns
antimoji scan --log-level=debug . 2>&1 | grep "patterns_applied"

Getting Help

If issues persist:

  1. Enable debug logging: --log-level=debug --verbose
  2. Check for binary file processing issues
  3. Verify configuration with --dry-run mode
  4. Create a minimal reproduction case
  5. File an issue with debug logs

Antimoji - Keeping your codebase clean and professional, one emoji at a time.
