Add syllable-boundary braille text wrap using pyphen hyphenation#20186
Open
LeonarddeR wants to merge 2 commits into
Wires AT_WORD_OR_SYLLABLE_BOUNDARIES mode: language tracking via Region._languageIndexes selects the correct pyphen dictionary per locale. Breaks at syllable boundary closest to display edge within the last word, falling back to word boundary if no better split found. Depends on: pyphen-abstraction + braille-textwrap-refactor Part of nvaccess#17010
Contributor
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new braille “Text wrap” mode that can split long words at syllable boundaries using hyphenation dictionaries, including language-aware behavior, and documents/tests the feature.
Changes:
- Introduces the `AT_WORD_OR_SYLLABLE_BOUNDARIES` text wrap option and implements syllable-boundary splitting in braille window row calculations.
- Tracks language changes across region raw text to drive correct hyphenation dictionary selection.
- Updates user documentation / changelog and adds unit tests for language index tracking and syllable-boundary wrapping.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| user_docs/en/userGuide.md | Documents the new “At word or syllable boundaries” wrap option with an example. |
| user_docs/en/changes.md | Updates release notes to reflect the new 4-valued “Text wrap” option and hyphenation behavior. |
| tests/unit/test_braille/test_regionLanguageIndexes.py | Adds unit tests for region language index tracking/reset behavior. |
| tests/unit/test_braille/test_calculateWindowRowBufferOffsets.py | Adds tests for syllable-boundary wrap behavior; cleans up per-test AutoProperty overrides. |
| source/setup.py | Removes textUtils from a manifest/module list. |
| source/louisHelper.py | Adds getTableLanguage helper to read/normalize table language metadata. |
| source/config/featureFlagEnums.py | Adds the new BrailleTextWrapFlag enum value and label. |
| source/braille.py | Implements language index tracking and syllable-boundary wrap using hyphenation positions. |
Comment on lines +16 to +18
| """Build a TextInfoRegion without going through __init__ (which requires an NVDAObject).""" | ||
| region = braille.TextInfoRegion.__new__(braille.TextInfoRegion) | ||
| braille.Region.__init__(region) |
Collaborator
Author
This would be an interesting case if the unit tests actually failed, but they don't.
| def _getLanguageAtPos(self, pos: int) -> str: | ||
| """Get the language at a given position in L{rawText} based on L{_languageIndexes}.""" | ||
| keys = list(self._languageIndexes) |
Collaborator
Author
False positive IMO. Dictionaries keep insertion order since Python 3.7, and keys are always inserted sorted.
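For readers following this thread, here is a minimal sketch of the kind of lookup being discussed. It is not the code from the PR: the function name `get_language_at_pos` and the sample offsets are hypothetical, and it assumes, as the reply above states, that keys are always inserted in ascending order so the dict's insertion order is already sorted.

```python
from bisect import bisect_right


def get_language_at_pos(language_indexes: dict[int, str], pos: int) -> str:
    """Return the language in effect at raw-text offset ``pos``.

    Hypothetical stand-in for the lookup discussed above. Assumes the keys
    were inserted in ascending order, so list(dict) is already sorted.
    """
    keys = list(language_indexes)
    # Index of the last key that is <= pos; clamp to the first entry.
    idx = max(bisect_right(keys, pos) - 1, 0)
    return language_indexes[keys[idx]]


# Illustrative data only: English text with a Dutch span at offsets 12..19.
indexes = {0: "en", 12: "nl", 20: "en"}
```

If keys could ever arrive out of order, `bisect` would silently return wrong results, which is presumably the concern the review comment raised; sorting the keys before bisecting would remove that assumption at a small cost.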
Link to issue number:
Closes #17010
Follow-up for #20146 and #20145. This is the last of three PRs replacing #19916.
Summary of the issue:
Word wrap is sometimes pretty aggressive, especially on shorter braille displays. The previous two PRs added the text wrap infrastructure and continuation marks; this PR adds the final mode that splits long words at syllable boundaries using hyphenation dictionaries.
Description of user facing changes:
A fourth option, At word or syllable boundaries, is added to the Text wrap combo box in braille settings. Like "At word boundaries", it avoids splitting words mid-way, but when a word is too long to fit on the display it additionally tries to split at a syllable boundary (using hyphenation dictionaries from the `pyphen` library) so less of the word spills onto the next row. NVDA marks the split with the continuation mark (braille dots 7-8).

For locales without a pyphen dictionary, the mode falls back cleanly to word-boundary behaviour without any error.
Description of developer facing changes:
- `BrailleTextWrapFlag.AT_WORD_OR_SYLLABLE_BOUNDARIES` member added to `config.featureFlagEnums`.
- `Region._languageIndexes` (`dict[int, str]`) tracks language-span boundaries within a braille region. Populated during `_addFieldText` and `_addTextWithFields` when format fields carry a `language` attribute or when field text is in a different language than the surrounding content.
- `Region._getLanguageAtPos(pos)` looks up the language at a raw-text offset using a bisect on the (always-ascending) keys of `_languageIndexes`.
- `BrailleBuffer._getLanguageAtBufferPos(pos)` delegates to the region that owns that braille cell.
- `louisHelper.getTableLanguage(table)` queries `louis.getTableInfo` for the `"language"` key and normalises the result, providing the default language for a region when no format-field language is known.

Description of development approach:
When `AT_WORD_OR_SYLLABLE_BOUNDARIES` is selected and a word straddles a row boundary, `_calculateWindowRowBufferOffsets` already finds the last space before the display edge. This PR adds a second pass: it looks up the full word (from that space to the next space), retrieves the language at the word's braille position, and calls `textUtils.hyphenation.getHyphenPositions` (introduced in #20145) to obtain candidate hyphen offsets. It then iterates the candidates from the end (closest to the display edge) and picks the first that falls within the current row, updating `end` accordingly and setting `showContinuationMark`.

Language tracking in `Region` ensures that the correct pyphen dictionary is selected even when a braille region contains multilingual content (e.g. a paragraph with inline foreign phrases).

Testing strategy:
New unit tests in `test_calculateWindowRowBufferOffsets` cover:
- `end` and `showContinuationMark = True`.
- `getHyphenPositions` returns `()`, falls back to word boundary.

New unit tests in `test_regionLanguageIndexes` cover:
- `_addFieldText` inserts a switch/restore pair when field language differs.
- `_addTextWithFields` records a language index for a `formatChange` command carrying a `language` attribute.
- `TextInfoRegion.update` resets `_languageIndexes` to `{0: default}`, discarding stale entries from the previous update cycle.

Manual testing: confirmed the new option appears in the braille settings panel with the correct label, that long words are split at syllable boundaries on a 20-cell display, and that the continuation mark is shown at the split point.
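As an illustration of the second pass described under the development approach, here is a rough sketch. It is not the code from this PR: it works on plain text offsets rather than braille cells, `find_row_end` and `fake_hyphenator` are hypothetical names, the hyphenator is a hard-coded stand-in for `getHyphenPositions`, and it ignores any cell the continuation mark itself may occupy.

```python
from collections.abc import Callable, Sequence


def find_row_end(
    text: str,
    start: int,
    width: int,
    get_hyphen_positions: Callable[[str], Sequence[int]],
) -> tuple[int, bool]:
    """Return (end, show_continuation_mark) for the row beginning at ``start``."""
    limit = start + width
    if limit >= len(text):
        return len(text), False  # the rest of the text fits on this row

    # First pass: last space before the display edge.
    space = text.rfind(" ", start, limit)
    word_start = space + 1 if space != -1 else start
    word_end = text.find(" ", word_start)
    if word_end == -1:
        word_end = len(text)
    if word_end <= limit:
        return limit, False  # the last word ends exactly at the display edge

    # Second pass: try hyphen candidates, closest to the display edge first.
    word = text[word_start:word_end]
    for pos in reversed(get_hyphen_positions(word)):
        candidate = word_start + pos
        if candidate <= limit:
            return candidate, True

    # No usable syllable split: fall back to the word boundary, or to a
    # hard cut at the edge when the row holds a single over-long word.
    return (word_start if space != -1 else limit), False


def fake_hyphenator(word: str) -> Sequence[int]:
    # Stand-in data for illustration only (hy-phen-ation).
    return {"hyphenation": (2, 6)}.get(word, ())


sample = "braille hyphenation works"
end, mark = find_row_end(sample, 0, 12, fake_hyphenator)
print(repr(sample[:end]), mark)
```

With a 12-character row the sketch keeps `"braille hy"` on the first row and flags the continuation mark, whereas a hyphenator that returns `()` yields the plain word-boundary split after `"braille "`.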
Known issues with pull request:
None.
Code Review Checklist: