[codex] Cache list indentation regexes#3969
Conversation
|
@astelmach01 is attempting to deploy a commit to the MarkedJS Team on Vercel. A member of the Team first needs to authorize it. |
There was a problem hiding this comment.
Code Review
This pull request introduces a cachedIndentRegex utility in src/rules.ts to cache regular expressions based on indentation levels (capped at 3), optimizing the creation of various block-level regexes. A review comment identifies a potential SyntaxError that could occur if the indent parameter is zero or less, as it would result in an invalid range in the regular expression. A code suggestion was provided to clamp the indentation value to a minimum of zero.
There was a problem hiding this comment.
Code Review
This pull request introduces a caching mechanism for indentation-based regular expressions in src/rules.ts to optimize performance by reusing regex objects for common indentation levels. The feedback identifies a potential issue where a zero indentation value could lead to a negative quantifier in the generated regex, causing a syntax error, and suggests adding a safety check to ensure the indentation is at least zero.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
UziTech
left a comment
There was a problem hiding this comment.
Thanks for helping make marked faster! 💯
| const cappedIndent = Math.max(0, Math.min(3, indent - 1)); | ||
| const cacheIndex = cappedIndent; | ||
| let regex = cache[cacheIndex]; | ||
| if (!regex) { | ||
| regex = createRegex(cappedIndent); | ||
| cache[cacheIndex] = regex; | ||
| } | ||
| return regex; |
There was a problem hiding this comment.
| const cappedIndent = Math.max(0, Math.min(3, indent - 1)); | |
| const cacheIndex = cappedIndent; | |
| let regex = cache[cacheIndex]; | |
| if (!regex) { | |
| regex = createRegex(cappedIndent); | |
| cache[cacheIndex] = regex; | |
| } | |
| return regex; | |
| const cacheIndex = Math.max(0, Math.min(3, indent - 1)); | |
| let regex = cache[cacheIndex]; | |
| if (!regex) { | |
| regex = createRegex(cacheIndex); | |
| cache[cacheIndex] = regex; | |
| } | |
| return regex; |
Seems like you could eliminate one of these variables since the indent and index are the same value.
Marked version:
18.0.3 /
69257e45Markdown flavor: CommonMark
Description
This caches the indentation-sensitive regexes used while scanning list items. These regexes are rebuilt for each list item today even though the effective indentation is bounded to
Math.min(3, indent - 1), so the same few regex shapes are reused repeatedly during list parsing.A CPU profile of a CommonMark parse loop showed samples in
nextBulletRegex,fencesBeginRegex,headingBeginRegex, andhtmlBeginRegex. After caching the bounded indentation regexes, those helper functions no longer showed self samples in the same profile pass.Local benchmark notes from my machine:
npm run bench:markedimproved from 1655 ms to 1513 ms on the CommonMark benchmark run.Contributor
npm testpassed locally.Committer
In most cases, this should be a different person than the contributor.