8 months of AI support — what we learned #5651

@softhack007

This is a summary of a review of AI-supported issue reports, AI-supported code review and research, and contributions largely created by AI agents.

It's a CALL for DISCUSSION and REFLECTION!

Kind regards from @softhack007, @copilot and @coderabbitai 😃

(Questions and challenges to intermediate results: myself. Write-up and summary: our AI helpers.)


8 Months of AI Support — What We Learned

Context

Since late 2025, WLED has been using AI-assisted code review (CodeRabbit) and AI coding agents (GitHub Copilot) alongside human maintainers and contributors. After ~8 months of experience, this issue collects our observations, conclusions, and recommendations for how all participants — contributors, maintainers, and AI tools — can work together more effectively.

This is a discussion document. Feedback from all contributors is welcome.


Key Observations

What Works Well

  • AI catches mechanical issues reliably: null checks, buffer overflows, syntax errors, basic code style
  • AI is tireless at cross-referencing large codebases — finding where a function is used, checking flag consistency
  • AI scaffolding helps less-experienced contributors get started (usermod templates, issue formatting)
  • Encoding lessons into instruction files creates a persistent, improving knowledge base

What Doesn't Work Well

  • AI fabricates flag names, API behaviors, and version history with high confidence
  • AI lacks hardware domain knowledge (DMA constraints, PSRAM timing, ESP32 watchdog behavior)
  • AI over-applies generic software patterns that don't fit embedded/IoT constraints (suggesting abstractions that cost flash, recommending auth for protocols without it)
  • AI-generated "authoritative" PR descriptions can mask contributor uncertainty, making review harder
  • AI reviews can generate noise — many low-value suggestions that bury the important ones

The Fundamental Tension

Most WLED contributors are not software domain experts — they're LED enthusiasts, home automation hobbyists, and makers. They bring invaluable real-world hardware experience but may need help translating that into production-quality code. AI can bridge this gap, but only if all parties are honest about the boundaries of their knowledge.


Recommendations by Role

🧑‍🎨 For Non-Expert Contributors (Users Who Want to Share)

Non-expert contributors have real-world experience with the hardware that no amount of code reading can replace.

The core challenge: You know what's wrong or what would be cool, but may struggle to express it in code or in the project's technical vocabulary.

  1. Lead with your experience, not your diagnosis. Describe what you observed ("LEDs flicker when transition X is active at brightness < 20") rather than what you think the code is doing wrong. Your hardware observation is authoritative; code analysis may not be.

  2. Use AI as a translator, not an authority. Label AI involvement: "I used AI to help write this patch" or "AI suggested this might be the cause." This tells reviewers where to focus verification.

  3. Small scope, clear intent. A 5-line fix with clear before/after is better than a 200-line AI-generated refactor you can't fully explain.

  4. Don't let AI write your issue reports. Use AI privately to understand what you're seeing, then write in your own words. Authenticity builds trust.

  5. It's OK to say "I don't know why this works." Honest uncertainty invites collaboration. AI-wrapped false confidence invites rejection.

🛠️ For Maintainers (technically experienced)

Maintainers have architectural context and constraint knowledge that emerges from years of debugging edge cases, as well as a good understanding of what constitutes 'quality code'.

The core challenge: You need to efficiently separate signal from noise in AI-influenced contributions while remaining welcoming to less-expert contributors.

  1. Create "landing pads" for imperfect contributions. When intent is good but execution is flawed: "Your observation is correct — could you file this as an issue and I'll look at the proper fix?" This captures value without requiring production-quality code from everyone.

  2. Develop verification shortcuts for AI-generated code:

    • Check feature flag names against actual codebase (AI fabricates these)
    • Verify claimed behavioral changes between versions
    • Look for patterns that ignore ESP32 constraints (memory, DMA, PSRAM)
    • Check for platformio.ini modifications (frequent AI mistake)
  3. Invest in "guardrail documentation" over "how-to documentation." Document constraints (which functions must not block, which memory is DMA-capable, which globals are volatile). This is what AI consistently gets wrong.

  4. Teach the AI review tool incrementally. When you catch a pattern the AI missed, encode it into instruction files. Each correction improves the next cycle.

  5. Model "verify then trust" publicly. When you say "I checked and this flag doesn't exist" or "I traced this path and confirmed the bug," you teach contributors what good analysis looks like.
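
The first verification shortcut above (checking flag names against the actual codebase) lends itself to partial automation. The following is a minimal host-side sketch, not an existing WLED tool. It assumes that feature flags follow the WLED_ENABLE_* / WLED_DISABLE_* naming mentioned in this document and that the set of known flags has already been collected, for example by grepping the source tree. The function name `findUnknownFlags` is hypothetical.

```cpp
#include <regex>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: scans free text (a PR description, a review comment,
// a diff) and returns every WLED_ENABLE_* / WLED_DISABLE_* token that is NOT
// contained in `knownFlags`. Each unknown flag is reported once, in order of
// first appearance.
std::vector<std::string> findUnknownFlags(const std::string& text,
                                          const std::set<std::string>& knownFlags) {
  static const std::regex flagPattern(R"(\bWLED_(ENABLE|DISABLE)_[A-Z0-9_]+\b)");

  std::vector<std::string> unknown;
  std::set<std::string> alreadyReported;
  for (auto it = std::sregex_iterator(text.begin(), text.end(), flagPattern);
       it != std::sregex_iterator(); ++it) {
    const std::string flag = it->str();
    if (knownFlags.count(flag) == 0 && alreadyReported.insert(flag).second) {
      unknown.push_back(flag);
    }
  }
  return unknown;
}
```

Any flag this returns is only a candidate for a fabricated name. A maintainer would still need to confirm, since a flag could legitimately be introduced by the PR under review.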

🤖 For AI Tools (Configuration Principles)

AI tools have tireless pattern-matching and cross-referencing capability across the entire codebase simultaneously.

The core challenge: Being helpful without being authoritative on things you can't verify, and adapting to an embedded/IoT context that differs from mainstream web development.

  1. Confidence calibration by category:

    Category                                  | Confidence | Behavior
    Syntax, null checks, buffer overflows     | High       | Flag directly
    Flag/API existence                        | Medium     | Always grep-verify before commenting
    Architectural fit                         | Low        | Frame as questions, not assertions
    Hardware constraints (DMA, PSRAM, timing) | Very Low   | Defer to maintainers explicitly
  2. Verify-first rule: Before stating "flag X does Y" or "function Z changed in version N," search the codebase. If unverifiable, say "I couldn't verify this — a maintainer should confirm."

  3. Domain-appropriate suggestions only. Before suggesting a pattern, check: Does it fit in available flash/RAM? Is it relevant to an LED controller? Does the protocol support it?

  4. Prioritize ruthlessly. One high-confidence verified finding > five speculative suggestions. Never pad reviews with noise.

  5. Learn from corrections explicitly. Maintainer overrides should feed back into instructions:
    AI suggests X → Maintainer says "no, because Y" → Y gets encoded → AI stops suggesting X
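
To make the confidence calibration and the verify-first rule concrete, here is a small sketch of how the four categories could be mapped to review wording. It is purely illustrative: the enum names, the function names and the exact phrasing are assumptions, and real AI review tools are configured through instruction files rather than C++.

```cpp
#include <string>

// The four categories and confidence levels from the calibration table.
enum class Category { Mechanical, FlagOrApiExistence, ArchitecturalFit, HardwareConstraint };
enum class Confidence { High, Medium, Low, VeryLow };

Confidence confidenceFor(Category c) {
  switch (c) {
    case Category::Mechanical:         return Confidence::High;
    case Category::FlagOrApiExistence: return Confidence::Medium;
    case Category::ArchitecturalFit:   return Confidence::Low;
    case Category::HardwareConstraint: return Confidence::VeryLow;
  }
  return Confidence::VeryLow;  // not reached for valid input; most cautious default
}

// Turns a raw finding into a review comment whose tone matches the table.
// `verifiedInCodebase` records whether the claim was checked by searching
// the codebase (the verify-first rule).
std::string phraseFinding(Category c, const std::string& finding, bool verifiedInCodebase) {
  switch (confidenceFor(c)) {
    case Confidence::High:
      return finding;  // flag directly
    case Confidence::Medium:
      return verifiedInCodebase
                 ? finding
                 : "I couldn't verify this, a maintainer should confirm: " + finding;
    case Confidence::Low:
      return "Question: " + finding + "?";  // frame as a question
    case Confidence::VeryLow:
      return "Deferring to maintainers: " + finding;
  }
  return finding;
}
```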


For the Process Overall

  1. Contribution maturity ladder:

    • Level 1: Issue report (describe the problem)
    • Level 2: Reproduction case (minimal steps)
    • Level 3: Proposed fix (AI-assisted is fine if labeled)
    • Level 4: Complete PR (fix + validation + docs)

    Every level is valuable. Not everything needs to reach Level 4.

  2. "AI-assisted" label for PRs: not a stigma, but a signal for reviewers to apply verification shortcuts.

  3. Quarterly "lessons learned" updates to AI instruction files. Review recent AI failures, encode corrections.

  4. FMEA (failure mode and effects analysis) for complex proposals: when contributors propose complex changes, a simple failure-mode analysis forces everyone to think about what can go wrong, leveling the playing field between experts and non-experts.
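
One common way to keep such a failure-mode analysis simple is the classic FMEA risk priority number: each failure mode is rated for severity, occurrence and detection, and the product of the three ratings is used for ranking. The sketch below assumes the conventional 1 to 10 scale. The struct and function names are hypothetical, and nothing here is an agreed WLED process.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// One row of a simple FMEA worksheet. Ratings use a 1..10 scale.
struct FailureMode {
  std::string description;
  uint8_t severity;    // 1 = cosmetic, 10 = device unusable or unsafe
  uint8_t occurrence;  // 1 = very unlikely, 10 = almost certain
  uint8_t detection;   // 1 = always caught before release, 10 = practically undetectable
};

// Risk priority number: product of the three ratings (range 1..1000).
inline uint16_t riskPriority(const FailureMode& fm) {
  return static_cast<uint16_t>(fm.severity * fm.occurrence * fm.detection);
}

// Sorts the worksheet so the riskiest failure modes come first.
// stable_sort keeps the original order for rows with equal risk.
inline void sortByRisk(std::vector<FailureMode>& modes) {
  std::stable_sort(modes.begin(), modes.end(),
                   [](const FailureMode& a, const FailureMode& b) {
                     return riskPriority(a) > riskPriority(b);
                   });
}
```

The ratings themselves are where the discussion happens: a contributor with hardware experience may rate occurrence very differently from a reviewer who only reads the code, and that disagreement is useful input.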


Suggested AI Review Checklist

For maintainers reviewing AI-assisted PRs, or for AI review tools to self-check:

Before Approving Any AI-Reviewed/Generated Code:

  • Flag verification: All WLED_ENABLE_* / WLED_DISABLE_* flags mentioned actually exist in wled00/const.h or build configs
  • Memory constraints: No unbounded allocations, no VLAs, reserve() used for vectors/strings in hot paths
  • PSRAM awareness: Large buffers use p_malloc(), hot-path data stays in DRAM, DMA buffers not in PSRAM (ESP32 classic)
  • No blocking in effects: No delay(), no blocking I/O in FX.cpp or pixel pipeline
  • Platform guards: ESP32-specific code wrapped in #ifdef ARDUINO_ARCH_ESP32, ESP8266 in #ifdef ESP8266
  • Math functions: Uses sin8_t()/cos8_t() (not the removed sin8()/cos8()), and sin_approx()/cos_approx() rather than sinf()/cosf()
  • String handling: F() macros for constants, no String in hot paths, PSTR() for format strings
  • No fabricated references: Any cited functions, APIs, or version changes actually exist in the codebase
  • Scope proportionality: Change size matches problem size (5-line bug doesn't need 200-line refactor)
  • platformio.ini untouched: Or explicitly flagged for maintainer approval if modified
  • Attribution present: AI-generated blocks marked with // AI: comments per guidelines
  • Error handling: Return codes used (no exceptions), early-return guard clauses for invalid state
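
Two of the checklist items, "no blocking" and "return codes with early-return guard clauses", can be illustrated together. The following is a host-side sketch under stated assumptions: the `nowMs` parameter stands in for Arduino's `millis()`, and `IntervalTimer`, `UpdateResult` and `updateSensor` are hypothetical names, not WLED APIs.

```cpp
#include <cstdint>

// Non-blocking interval check. Instead of calling delay(intervalMs), the
// caller polls due() from its loop and returns immediately if nothing is due.
class IntervalTimer {
 public:
  explicit IntervalTimer(uint32_t intervalMs) : intervalMs_(intervalMs) {}

  // `nowMs` stands in for millis(). Unsigned subtraction keeps the comparison
  // correct across the 32-bit rollover (roughly every 49.7 days).
  bool due(uint32_t nowMs) {
    if (nowMs - lastRunMs_ < intervalMs_) return false;
    lastRunMs_ = nowMs;
    return true;
  }

 private:
  uint32_t intervalMs_;
  uint32_t lastRunMs_ = 0;
};

// Return codes instead of exceptions.
enum class UpdateResult : uint8_t { Ok, Disabled, Busy, NotDue };

// Guard clauses first: every invalid or not-ready state returns early, so
// the actual work at the bottom runs only when all preconditions hold.
UpdateResult updateSensor(bool enabled, bool stripBusy, IntervalTimer& timer, uint32_t nowMs) {
  if (!enabled) return UpdateResult::Disabled;
  if (stripBusy) return UpdateResult::Busy;
  if (!timer.due(nowMs)) return UpdateResult::NotDue;
  // ... short, non-blocking work would go here ...
  return UpdateResult::Ok;
}
```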

Suggested CodeRabbit Path Instructions

These path-specific instructions help CodeRabbit focus its reviews appropriately:

# .coderabbit.yaml path instructions (suggested)

path_instructions:
- path: "wled00/FX*.cpp"
  instructions: |
    This is the LED effects engine — the hottest code path in WLED.
    - Flag any use of delay(), yield(), or blocking I/O
    - Verify math uses sin8_t/cos8_t (NOT sin8/cos8 which are removed), sin_approx/cos_approx (NOT sinf/cosf)
    - Check that loop-invariant computations are hoisted outside pixel loops
    - Ensure IRAM_ATTR or WLED_O2_ATTR on performance-critical functions
    - No String objects or dynamic allocation inside effect functions
    - Use uint_fast16_t/uint_fast8_t for loop counters
    - perlin8/perlin16 replace inoise8/inoise16

- path: "wled00/bus_manager.*"
  instructions: |
    Bus management controls hardware LED output — DMA, timing, and pin allocation.
    - PSRAM buffers are NOT DMA-capable on ESP32 classic
    - Verify pin ownership via PinOwner enum and pinManager
    - Check for proper platform guards (#ifdef ARDUINO_ARCH_ESP32 / ESP8266)
    - No dynamic allocation in output path

- path: "wled00/cfg.cpp"
  instructions: |
    Configuration handling — JSON serialization, filesystem access.
    - ArduinoJson usage must respect document size limits
    - Verify all config keys match actual field names
    - Check for proper defaults when keys are missing
    - String handling: use F() for literals, avoid temporary String copies

- path: "wled00/json.cpp"
  instructions: |
    JSON API — state serialization and command parsing.
    - Validate all JSON keys against actual API documentation
    - Check bounds on numeric values before applying
    - Verify that state changes trigger appropriate notifications
    - No blocking operations in API handlers

- path: "usermods/**"
  instructions: |
    User-contributed modules — varying quality and maintenance levels.
    - Verify usermod follows the pattern: setup(), loop(), addToConfig(), readFromConfig()
    - Check that loop() has early-return guard: if (!enabled || strip.isUpdating()) return
    - Verify USERMOD_ID is registered in const.h ONLY if needed (pin ownership, inter-mod communication, or JSON identification)
    - Static strings must be PROGMEM: static const char _name[] PROGMEM = "..."
    - No delay() in loop — use millis() timing
    - Verify library.json dependencies are reasonable

- path: "wled00/data/**"
  instructions: |
    Web UI source files (HTML/JS/CSS).
    - Tab indentation (not spaces)
    - Reuse helpers from common.js — do not duplicate utilities
    - camelCase for JS functions/variables
    - Check for XSS: user-supplied values must be escaped before DOM insertion
    - Minimal footprint — every byte costs flash space

- path: "platformio.ini"
  instructions: |
    CRITICAL: Changes to this file require explicit maintainer approval.
    - Flag ANY modification to global build settings
    - Verify library versions haven't been arbitrarily bumped
    - Check that new environments don't conflict with existing ones
    - board_build and upload settings must match actual hardware specs

- path: "wled00/wled.h"
  instructions: |
    Main project header — included by everything.
    - Changes here affect ALL compilation units — flag any additions
    - Verify include order: project headers, then platform, then third-party
    - New globals must be justified — prefer encapsulation

- path: "wled00/const.h"
  instructions: |
    Constants and feature flags — the source of truth for WLED_ENABLE/DISABLE flags.
    - Verify new USERMOD_IDs don't conflict with existing ones
    - Feature flag names must exactly match usage elsewhere (preprocessor ignores typos silently)
    - Prefer constexpr over #define for new constants
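
The remark that the preprocessor ignores typos silently is worth a small demonstration, because it explains why fabricated or misspelled flag names are so hard to spot. The sketch below uses a made-up flag, `DEMO_ENABLE_FEATURE`, rather than a real WLED flag. Both functions compile without any diagnostic, but the misspelled guard quietly removes the feature.

```cpp
// Hypothetical feature flag, defined here only for the demonstration.
// In a real build it would come from a header or a -D compiler option.
#define DEMO_ENABLE_FEATURE

// Correctly spelled guard: the feature code is compiled in.
bool featureCompiledIn() {
#ifdef DEMO_ENABLE_FEATURE
  return true;
#else
  return false;
#endif
}

// Misspelled guard: DEMO_ENABLE_FEATRUE is never defined, so #ifdef is
// simply false. No error and no warning; the feature silently disappears.
bool featureCompiledInWithTypo() {
#ifdef DEMO_ENABLE_FEATRUE
  return true;
#else
  return false;
#endif
}
```

With GCC and Clang, writing the check as `#if FLAG` and building with `-Wundef` makes an undefined name produce a warning. Whether that style fits WLED's existing `#ifdef` conventions is a question for the maintainers.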

Discussion Questions

  • Should we formalize the "AI-assisted" PR label?
  • How often should we update AI instruction files — per-release, quarterly, or on-demand?
  • Are there specific areas where AI review should be silenced entirely (too much noise, too little value)?
  • Should we create a "contributor experience" template that guides non-experts through the maturity ladder?
  • What metrics (if any) should we track to measure whether AI assistance is helping or hurting review throughput?

Summary

Each role has a unique, unfakeable advantage:

  • Contributors have real-world hardware experience no code reading can replace
  • Maintainers have architectural context from years of debugging edge cases
  • AI tools have tireless cross-referencing across the entire codebase

--> Success = each role contributes its unique advantage while being honest about limitations.

The worst outcomes happen when any role pretends to have expertise it doesn't. The best contributions follow: "Here's what I know for certain, here's what I think, and here's where I need help."

Pinned by softhack007
