26 Commits

Author SHA1 Message Date
Kovid Goyal
e8b19e08fa
Fix non-renderable combining chars causing some text to not be rendered on Linux
The test for non-renderable chars was broken and the variation selectors
were not included in the test. Fixes #4444
2022-01-05 22:33:53 +05:30
Kovid Goyal
d875615c03
Fix a regression in the handling of some combining characters such as zero width joiners
Fixes #4439
2022-01-05 08:50:55 +05:30
Kovid Goyal
fbf47f75d5
Fix soft hyphens not being preserved when round tripping text through the terminal
Also roundtrip all characters in the Cf category.

Characters with the DI (Default Ignorable) property are now
preserved but not rendered and treated as zero-width
as per the Unicode standard.
See https://www.unicode.org/faq/unsup_char.html
2021-10-07 12:44:22 +05:30
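
A minimal sketch of the "preserved but not rendered" behaviour described in the commit above, not kitty's actual implementation. It assumes the general category Cf as a stand-in for the Default_Ignorable property, since Python's unicodedata module does not expose that property; the names is_format_char, layout and cell_width are hypothetical.

```python
import unicodedata


def is_format_char(ch: str) -> bool:
    # Assumption: category Cf approximates Default_Ignorable here. The real
    # property is listed in DerivedCoreProperties.txt.
    return unicodedata.category(ch) == "Cf"


def layout(text: str) -> list[tuple[str, int]]:
    # Keep every character, but give format characters a width of zero.
    # Wide and combining characters are ignored in this sketch.
    return [(ch, 0 if is_format_char(ch) else 1) for ch in text]


def cell_width(text: str) -> int:
    # Total number of cells the text would occupy.
    return sum(width for _, width in layout(text))
```

With this, a soft hyphen (U+00AD) takes no cell but is still present when the stored characters are joined back together.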
Kovid Goyal
31e623afb3
Add support for Unicode 14
Fixes #3542
2021-10-04 14:00:35 +05:30
Kovid Goyal
3633049ba5
Forgot to include \r in the url regex
2021-07-19 18:09:00 +05:30
Kovid Goyal
ff1585acfe
Unicode input: Make diamond a synonym for gem
Fixes #3437
2021-04-02 12:53:58 +05:30
Kovid Goyal
d09666aba9
Unicode input kitten: Add symbols from NERD font
These are mostly Private Use symbols not in any standard,
however they are common enough to be useful.

Fixes #2972
2020-09-22 19:47:39 +05:30
Kovid Goyal
628b92f20b
Speed up is_ignored_char in the common case
2020-08-06 18:05:33 +05:30
Kovid Goyal
a835b56a51
Speed up is_combining_char() in the common case
2020-08-06 17:45:40 +05:30
Kovid Goyal
24197dc422
Render known country flags designated by a pair of unicode codepoints in two cells instead of four.
2020-04-06 22:16:59 +05:30
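
A rough sketch of the cell counting implied by the commit above, assuming each regional indicator symbol is two cells wide on its own and that a recognised pair collapses into a single two-cell flag. The KNOWN_FLAGS set and all function names are hypothetical placeholders, not kitty's data or API.

```python
REGIONAL_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
REGIONAL_Z = 0x1F1FF  # REGIONAL INDICATOR SYMBOL LETTER Z
KNOWN_FLAGS = {"IN", "US", "FR"}  # hypothetical subset for illustration


def is_regional(ch: str) -> bool:
    return REGIONAL_A <= ord(ch) <= REGIONAL_Z


def letter(ch: str) -> str:
    # Map a regional indicator symbol back to its ASCII letter.
    return chr(ord(ch) - REGIONAL_A + ord("A"))


def flag_cells(text: str) -> int:
    cells = 0
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if is_regional(ch) and nxt and is_regional(nxt) \
                and letter(ch) + letter(nxt) in KNOWN_FLAGS:
            cells += 2  # a known pair is drawn as one two-cell flag
            i += 2
        else:
            cells += 2 if is_regional(ch) else 1
            i += 1
    return cells
```

An unrecognised pair still occupies four cells in this sketch, which matches the "known country flags" wording of the commit.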
Kovid Goyal
bf4e8c490c
Update to Unicode 13.0
Fixes #2513
2020-04-06 18:59:35 +05:30
Kovid Goyal
b709ee6842
Add a function to check if a codepoint is a symbol
2019-10-01 18:57:06 +05:30
Kovid Goyal
8e1ed2f8c3
Update unicode data to 12.1
2019-08-02 14:48:18 +05:30
Kovid Goyal
facd353228
Update to using the Unicode 12 standard
2019-03-06 13:58:16 +05:30
Kovid Goyal
094ddd9333
Round-trip the zwj unicode character
Rendering of sequences containing zwj is still not implemented, since it
can cause the collapse of an unbounded number of characters into a
single cell. However, kitty at least preserves the zwj by storing it as
a combining character.
2018-08-04 18:29:45 +05:30
Kovid Goyal
000c1cf306
Implement support for emoji skin tone modifiers
Fixes #787
2018-08-04 10:06:25 +05:30
Kovid Goyal
61dd52b50f
Ignore the non-characters from the unicode standard in addition to ignoring the control characters
2018-06-14 10:20:13 +05:30
Kovid Goyal
0b93b85cf2
Don't use case range in names.h as it prevents compilation with Visual Studio
2018-05-01 11:27:10 +05:30
Kovid Goyal
f7001ea068
Fix character names for control characters not being read from unicode database
Also allow unicode_names.c to be compiled with python 2 so I can re-use
it in calibre.
2018-05-01 10:13:58 +05:30
Kovid Goyal
0b99bb534f
Unicode input: When searching by name search for prefix matches as well as whole word matches
So now hori matches both "hori" and "horizontal". Switched to a
prefix-trie internally.
2018-04-24 07:45:20 +05:30
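
To illustrate the prefix matching described in the commit above, here is a small sketch of a prefix trie over the words of character names. It is only an assumption about the general shape of such an index; the NameTrie class and build_index function are hypothetical and do not mirror kitty's internal data layout.

```python
class NameTrie:
    def __init__(self):
        self.children = {}
        self.codepoints = set()

    def add(self, word: str, codepoint: int) -> None:
        node = self
        for ch in word.lower():
            node = node.children.setdefault(ch, NameTrie())
            # Every node on the path records the codepoint, so any prefix
            # of the word finds it.
            node.codepoints.add(codepoint)

    def search(self, prefix: str) -> set[int]:
        node = self
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return set()
        return set(node.codepoints)


def build_index(names: dict[int, str]) -> NameTrie:
    # Index each word of each character name separately.
    trie = NameTrie()
    for codepoint, name in names.items():
        for word in name.split():
            trie.add(word, codepoint)
    return trie
```

Storing the codepoint set at every node trades memory for a simple lookup; a more compact layout would store them only at word ends and walk the subtree on search.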
Kovid Goyal
8c18486836
Module with all the data for unicode entry by character name
2018-02-09 19:56:25 +05:30
Kovid Goyal
ff2e5b3966
Avoid unnecessary calls to mark_for_codepoint
2018-02-06 11:23:39 +05:30
Kovid Goyal
fbe4d036d8
Have wcwidth() return 0 for marks instead of -1
Since kitty always treats marks as combining chars, this allows us to
remove a few unnecessary branches
2018-02-05 10:06:05 +05:30
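
A hedged sketch of why returning 0 for marks removes branches: callers can sum widths directly without special-casing a negative value for combining characters. This uses Python's unicodedata as a stand-in for kitty's generated tables; the function bodies are assumptions, and control characters are left out of scope.

```python
import unicodedata


def wcwidth(ch: str) -> int:
    # Marks (categories Mn, Mc, Me) occupy no cell of their own.
    if unicodedata.category(ch).startswith("M"):
        return 0
    # Wide and fullwidth East Asian characters occupy two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def string_width(text: str) -> int:
    # No branch needed for marks: they simply contribute zero.
    return sum(wcwidth(ch) for ch in text)
```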
Kovid Goyal
fc7ec1d3f7
Get rid of the option to use the system wcwidth
The system wcwidth() is often wrong. Not to mention that if you SSH into
a different machine, then you have a potentially different wcwidth. The
only sane way to deal with this is to use the unicode standard.
2018-02-04 21:02:30 +05:30
Kovid Goyal
32632264ee
Mapping that can be used to store unicode mark symbols in only two bytes
2018-01-18 16:06:07 +05:30
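
One way such a mapping could work, sketched under assumptions: enumerate all mark codepoints once and store a small index in place of the full codepoint. The table here is built from Python's unicodedata rather than kitty's generated data, index 0 is reserved to mean "no mark", and codepoint_for_mark is a hypothetical name for the inverse lookup.

```python
import sys
import unicodedata

# All codepoints whose general category is a mark (Mn, Mc, Me).
MARKS = [cp for cp in range(sys.maxunicode + 1)
         if unicodedata.category(chr(cp)).startswith("M")]
_INDEX = {cp: i + 1 for i, cp in enumerate(MARKS)}  # 0 means "no mark"


def mark_for_codepoint(cp: int) -> int:
    # Small integer that fits in 16 bits, or 0 if cp is not a mark.
    return _INDEX.get(cp, 0)


def codepoint_for_mark(mark: int) -> int:
    # Inverse lookup; 0 maps back to 0.
    return MARKS[mark - 1] if mark else 0
```

Because the number of marks is far below 65536, each index fits in two bytes even though the codepoints themselves need up to 21 bits.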
Kovid Goyal
5faa649452
Drop the dependency on libunistring
2018-01-18 00:09:40 +05:30