perf: replace recursive closure in Text() with iterative subtree walk#556
Closed
jvoisin wants to merge 1 commit into
Closed
perf: replace recursive closure in Text() with iterative subtree walk#556jvoisin wants to merge 1 commit into
jvoisin wants to merge 1 commit into
Conversation
Text() built a self-referencing closure `var f func(*html.Node); f = ...` to recurse into descendants and accumulate text. The closure capture of its own variable forces the compiler to box `f`, and each recursive call pays a Go stack-frame setup. This commit replaces it with an iterative depth-first walk using FirstChild/NextSibling/Parent pointers, anchored at each root in s.Nodes so the walk never escapes the original subtree. Allocations are unchanged (the remaining 6/15/19 allocs come from strings.Builder grow-doubling), but ns/op drops by roughly 30-40% across shallow and deep subtrees: name old time/op new time/op delta Text-8 1.86µs 1.11µs -40.4% (h2, n=14) TextDeep-8 126µs 86µs -31.6% (whole body) TextManyShallow-8 40.9µs 29.1µs -28.9% (all anchors) All p=0.000 over 10 runs on arm64. B/op and allocs/op are bit-for-bit identical in all three benchmarks.
Member
|
I'm not sold on that one, the resulting implementation is harder to read and has subtle conditions. I think just getting rid of the closure could be enough, and keep the recursive approach. |
Contributor
Author
|
I guess it's a matter of taste, I prefer the explicit iterative version with comments. But alright, I'll try a version using an outlined function instead of a closure, benchmark it, and will send a pull-request if it's improving things. |
Contributor
Author
|
See #561 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Text() built a self-referencing closure
var f func(*html.Node); f = ...to recurse into descendants and accumulate text. The closure capture of its own variable forces the compiler to boxf, and each recursive call pays a Go stack-frame setup.This commit replaces it with an iterative depth-first walk using FirstChild/NextSibling/Parent pointers, anchored at each root in s.Nodes so the walk never escapes the original subtree.
Allocations are unchanged (the remaining 6/15/19 allocs come from strings.Builder grow-doubling), but ns/op drops by roughly 30-40% across shallow and deep subtrees:
name old time/op new time/op delta
Text-8 1.86µs 1.11µs -40.4% (h2, n=14)
TextDeep-8 126µs 86µs -31.6% (whole body)
TextManyShallow-8 40.9µs 29.1µs -28.9% (all anchors)
All p=0.000 over 10 runs on arm64. B/op and allocs/op are bit-for-bit identical in all three benchmarks.