Introduce nob_lexer - A general purpose lexer #171

maddeye · 2025-12-11T09:48:56Z

A single-header lexer that builds on top of nob.h for tokenizing source code.
Requires C99 or later.

Features:

Identifiers (including UTF-8/Unicode)
Integers: decimal, hex (0x), binary (0b), octal (0o)
Floats: decimal (3.14, 1e10) and C99 hex floats (0x1.Fp+10)
String and character literals with escape sequences
Line (//) and block (/* */) comments
Location tracking (file:line:column)
Optional prefix stripping (NOB_LEXER_STRIP_PREFIX)
Optional comment skipping (NOB_LEXER_SKIP_COMMENTS)
Includes test suite in tests/lexer.c
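
To illustrate the integer forms listed above, here is a minimal sketch of prefix-based integer parsing. It is not taken from nob_lexer.h: the names `sketch_digit_value` and `sketch_parse_int` are hypothetical, and the handling of edge cases (lowercase prefixes only, a bare prefix such as `0x` rejected, no overflow check) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of nob_lexer.h): numeric value of one
// digit character, or -1 if the character is not a hex digit.
static int sketch_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper (not part of nob_lexer.h): parse an integer literal
// at the start of text[0..len). Recognizes the prefixes 0x, 0b and 0o;
// anything else is read as decimal. Returns the number of bytes consumed,
// or 0 if no digits were found. Overflow is not checked in this sketch.
static size_t sketch_parse_int(const char *text, size_t len, uint64_t *out)
{
    assert(text != NULL && out != NULL);

    unsigned base = 10;
    size_t i = 0;
    if (len >= 2 && text[0] == '0') {
        switch (text[1]) {
        case 'x': base = 16; i = 2; break;
        case 'b': base = 2;  i = 2; break;
        case 'o': base = 8;  i = 2; break;
        default: break; // plain decimal, possibly with a leading zero
        }
    }

    size_t first_digit = i;
    uint64_t value = 0;
    for (; i < len; ++i) {
        int d = sketch_digit_value(text[i]);
        if (d < 0 || (unsigned)d >= base) break; // not a digit in this base
        value = value * base + (uint64_t)d;
    }

    if (i == first_digit) return 0; // a prefix with no digits is rejected
    *out = value;
    return i;
}
```

Calling `sketch_parse_int("0b1010", 6, &v)` consumes all six bytes and stores 10 in `v`. A real lexer would also have to decide whether a following `.` or `p` turns the literal into a float, which this sketch leaves out.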

Comparison with stb_c_lexer.h

Aspect	nob_lexer.h	stb_c_lexer.h
Dependencies	Requires nob.h	Standalone
Memory	Zero-copy (tokens are views into source)	Requires separate string storage buffer
Location tracking	Built-in file:line:column on every token	Separate function call (stb_c_lexer_get_location) that rescans the input
Binary literals	Yes (0b1010)	No
C99 hex floats	Yes (0x1.Fp+10)	Optional
Unicode identifiers	Yes (UTF-8)	Yes (bytes >= 128 accepted as identifier characters)
Multi-char operators (++, --, ==, etc.)	No	Yes (C-complete)
Configuration	Simple defines	20+ compile-time Y/N flags
API style	Modern (dynamic arrays, string views)	Classic C (manual buffer management)
Comment handling	Tokens returned (optionally skip)	Always discarded
Suffixes (uLL, etc.)	No	Optional
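
The "Memory" and "Location tracking" rows can be sketched as follows. This is an illustration of the general technique only, not the nob_lexer.h API: every `Sketch_*` type and `sketch_*` function is hypothetical, tokens are simply whitespace-separated runs, and columns count bytes rather than Unicode code points.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

// All names below are hypothetical and exist only for this sketch.
typedef struct {
    const char *data; // points into the caller's source buffer, never copied
    size_t count;     // length in bytes
} Sketch_View;

typedef struct {
    const char *file;
    size_t line;   // 1-based
    size_t column; // 1-based, counted in bytes
} Sketch_Loc;

typedef struct {
    Sketch_View text;
    Sketch_Loc loc; // location of the token's first byte
} Sketch_Token;

typedef struct {
    const char *src;
    size_t len;
    size_t pos;
    Sketch_Loc loc; // location of src[pos]
} Sketch_Lexer;

static Sketch_Lexer sketch_lexer_init(const char *file, const char *src, size_t len)
{
    assert(src != NULL);
    Sketch_Lexer l = { src, len, 0, { file, 1, 1 } };
    return l;
}

// Consume one byte and keep line/column in sync with the position.
static void sketch_advance(Sketch_Lexer *l)
{
    if (l->src[l->pos] == '\n') {
        l->loc.line += 1;
        l->loc.column = 1;
    } else {
        l->loc.column += 1;
    }
    l->pos += 1;
}

// Produce the next whitespace-separated token as a view into the source.
// Returns false once the input is exhausted.
static bool sketch_next(Sketch_Lexer *l, Sketch_Token *out)
{
    while (l->pos < l->len && isspace((unsigned char)l->src[l->pos])) {
        sketch_advance(l);
    }
    if (l->pos >= l->len) return false;

    out->loc = l->loc;                // recorded once, no rescan needed later
    out->text.data = l->src + l->pos; // zero-copy: just a pointer and a length
    while (l->pos < l->len && !isspace((unsigned char)l->src[l->pos])) {
        sketch_advance(l);
    }
    out->text.count = (size_t)((l->src + l->pos) - out->text.data);
    return true;
}
```

Because each token is only a pointer and a length, the source buffer must outlive every token taken from it. That is the usual trade-off of a zero-copy design compared with copying token text into a separate storage buffer.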

With this you no longer have to complain about stb_c_lexer. This implementation should be more than sufficient for your purposes and follows the simplicity of nob.

Hope you find this helpful 😄.

Disclaimer

I used an LLM for a code review and small formatting changes. It also helped with the hex float implementation. I wrote the rest of the code myself!

marc-dantas · 2025-12-22T00:33:16Z

stb_c_lexer replacement lol. Genius

maddeye added 2 commits December 11, 2025 10:30

feat: introduced nob_lexer

e0f5206

feat: added unicode and hexfloat support

d443ce5
