Commit 8084110

Merge pull request #57 from contour-terminal/fix/invalid-utf8-decoding
Fixes decoding invalid UTF-8 locking up.
2 parents 4ecdd8b + c859099 commit 8084110

3 files changed: +12 -0 lines changed

Changelog.md

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 ## 0.2.1 (unreleased)

 - Fixes unicode-query's output for "character width".
+- Fixes decoding invalid UTF-8 locking up.
 - unicode-query is now linked statically on UNIX platforms.

 ## 0.2.0 (2022-11-13)

src/unicode/utf8.h

Lines changed: 3 additions & 0 deletions
@@ -137,7 +137,10 @@ inline ConvertResult from_utf8(utf8_decoder_state& _state, uint8_t _byte)
             _state.character = _byte & 0b0000'0111;
         }
         else
+        {
+            _state.currentLength = 1;
             return Invalid {};
+        }
     }
     else
     {

src/unicode/utf8_test.cpp

Lines changed: 8 additions & 0 deletions
@@ -138,6 +138,14 @@ TEST_CASE("utf8.from_utf8", "[utf8]")
     CHECK(b32 == U"😖:-)");
 }

+TEST_CASE("utf8.from_utf8.invalid", "[utf8]")
+{
+    // Ensure invalid bytes are consumed and ignored accordingly.
+    auto const a8 = string { "Hi\xb1Ho" };
+    auto const a32 = from_utf8(a8);
+    CHECK(a32 == U"HiHo");
+}
+
 TEST_CASE("utf8.iter", "[utf8]")
 {
     auto constexpr values = string_view {
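
The behavior the new test expects (an invalid byte is consumed and dropped, and decoding continues) can be illustrated with a small standalone sketch. This is not the library's implementation: `decode_skipping_invalid` is a hypothetical helper written for this note, it works on a whole buffer rather than on the incremental `utf8_decoder_state`, and it does not reject overlong encodings or surrogate code points. The point it shows is that every branch, including the invalid one, advances by at least one byte, so the loop cannot stall on malformed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the library): decodes UTF-8 into UTF-32,
// silently dropping bytes that cannot start or continue a sequence.
inline std::u32string decode_skipping_invalid(std::string_view input)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < input.size())
    {
        auto const lead = static_cast<std::uint8_t>(input[i]);
        std::size_t length = 0;
        char32_t value = 0;

        if ((lead & 0b1000'0000) == 0)
        {
            length = 1;
            value = lead;
        }
        else if ((lead & 0b1110'0000) == 0b1100'0000)
        {
            length = 2;
            value = lead & 0b0001'1111;
        }
        else if ((lead & 0b1111'0000) == 0b1110'0000)
        {
            length = 3;
            value = lead & 0b0000'1111;
        }
        else if ((lead & 0b1111'1000) == 0b1111'0000)
        {
            length = 4;
            value = lead & 0b0000'0111;
        }
        else
        {
            // Invalid lead byte, e.g. a stray continuation byte such as 0xB1.
            // Consume exactly one byte so the loop always makes progress.
            i += 1;
            continue;
        }

        // Check that the sequence fits in the buffer and that every
        // following byte has the continuation form 10xxxxxx.
        bool valid = i + length <= input.size();
        for (std::size_t k = 1; valid && k < length; ++k)
        {
            auto const next = static_cast<std::uint8_t>(input[i + k]);
            if ((next & 0b1100'0000) != 0b1000'0000)
                valid = false;
            else
                value = (value << 6) | (next & 0b0011'1111);
        }

        if (valid)
        {
            out.push_back(value);
            i += length;
        }
        else
        {
            // Drop only the lead byte and resynchronize at the next byte.
            i += 1;
        }
    }
    return out;
}
```

Calling `decode_skipping_invalid("Hi\xb1Ho")` mirrors the input used in the `utf8.from_utf8.invalid` test case: the `0xB1` byte matches none of the lead-byte patterns, so it is skipped and the surrounding ASCII characters are kept.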
