TODO

rewrite test functions from bool foo(x) ... to bool is_foo(x) ...
all about emoji flag sequences
bool is_mirrored(char32_t) noexcept (such as parentheses, curly braces, brackets, ...)
- also ability to get the mirroring codepoint
map codepoint to block (enum) - see Blocks.txt
map codepoint to plane (enum)
map block to codepoint range
map plane to codepoint range
provide C API binding for basic functionality
script_segmenter: add support for commonPreferredScript tracking with regard to brackets () [] {}.
script_segmenter: test "foo(λ);" -> {Latin, Greek, Latin}
orientation_segmenter (and integrate it into run_segmenter as well as its tests)
mktables: to_string builder
mktables: to_type builder
mktables: pylint into CI
clang-tidy into CI
META: cmake install target (header files and .a file, executable)
META: pkg-config file
word segmentation (UTS algorithm)
generic text segmentation (top level segmentation API suitable for text shaping implementations)
CLI tool: unicode-inspect for inspecting input files by code point, grapheme cluster, word, script, ...
unit tests for most parts (wcwidth / segmentation)
README: list all TRs that are being implemented
API for accessing UCD properties
UTF8 <-> UTF32 conversion
grapheme segmentation (UTS algorithm)
symbol/emoji segmentation (UTS algorithm)
wcwidth equivalent (unicode::width(char32_t))
script segmentation
out<T> helper to force explicit ref(val) for more readability.
operator<<(ostream&, T) for all UCD properties - in its own header file (ucd_ostream.h)
emoji_segmenter: test "x 😀 y" -> {Text, Emoji, Text}
make run_segmenter more templated / customizable
mktables: enum class builder
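
The two plane-related items above ("map codepoint to plane" and "map plane to codepoint range") could look roughly like the following. This is only a sketch: the names `plane`, `plane_of` and `codepoint_range` are hypothetical and not part of the existing API. It relies on the fact that a Unicode plane is a block of 0x10000 code points, so the plane index is the code point shifted right by 16 bits.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical enum; the enumerator names are an assumption, not existing API.
enum class plane : std::uint8_t
{
    basic_multilingual = 0,
    supplementary_multilingual = 1,
    supplementary_ideographic = 2,
    tertiary_ideographic = 3,
    // Planes 4 to 13 are currently unassigned and have no named enumerator here.
    supplementary_special_purpose = 14,
    supplementary_private_use_a = 15,
    supplementary_private_use_b = 16,
};

// Maps a code point to its plane; yields nullopt beyond U+10FFFF.
constexpr std::optional<plane> plane_of(char32_t codepoint) noexcept
{
    if (codepoint > 0x10FFFF)
        return std::nullopt;
    return static_cast<plane>(codepoint >> 16);
}

// Maps a plane to its inclusive [first, last] code point range.
constexpr std::pair<char32_t, char32_t> codepoint_range(plane p) noexcept
{
    auto const first = static_cast<char32_t>(static_cast<std::uint32_t>(p) << 16);
    auto const last = static_cast<char32_t>(first + 0xFFFF);
    return { first, last };
}
```

Returning `std::optional` is one possible way to handle values outside the Unicode code space; an `invalid` enumerator would work as well. The block mapping from Blocks.txt cannot be computed this way and would need a generated table.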

Integration TODO

integrate into contour
see if this makes sense: make use of this library in klex lexical scanner, to allow unicode input
