Charset

From Braindump
Revision as of 06:12, 11 August 2023 by Jan (talk | contribs)

Unicode isn't hard if you know its history and where it comes from.

https://mcilloni.ovh/2023/07/23/unicode-is-hard/

Baudot code: 5-bit telegraph encoding

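International encoding: International Telegraph Alphabet No. 2 (ITA2), the standardized 5-bit successor to the Baudot code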

IA5: International Alphabet No. 5, the 7-bit international variant of ASCII

ASCII: 7-bit. Codepoints 0-31 and 127 are control characters, 32-126 are printable (graphic) characters

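IBM code pages, Windows code pages, Latin-1 (ISO 8859-1): 8-bit. The lower half is ASCII; the upper codepoints (128-255) are used for additional characters, and which characters depends on the code page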
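A small sketch of why the upper half is a problem: the same byte value decodes to different characters depending on the code page assumed. The three codecs are picked as illustrations only (Latin-1, Windows-1252, and the original IBM PC code page 437); the notes do not name specific code pages.

```python
# One byte above 127, decoded under three different 8-bit encodings.
raw = bytes([0xE9])

as_latin1 = raw.decode("latin-1")  # ISO 8859-1
as_cp1252 = raw.decode("cp1252")   # Windows Western European
as_cp437 = raw.decode("cp437")     # original IBM PC code page

# Latin-1 and Windows-1252 agree on this byte, CP437 does not.
print(repr(as_latin1), repr(as_cp1252), repr(as_cp437))

# 0x80 is where Latin-1 and Windows-1252 disagree:
# a C1 control character in Latin-1, the euro sign in Windows-1252.
euro_byte = bytes([0x80])
print(repr(euro_byte.decode("latin-1")), repr(euro_byte.decode("cp1252")))
```

Bytes below 128 decode identically in all three, which is why plain ASCII text survives a wrong guess and accented text does not.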

Unicode: fixed-width encodings are 2-byte UCS-2 (covers only U+0000 to U+FFFF) or 4-byte UCS-4 (covers every codepoint)

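UTF-8: 8-bit code units, variable length (1 to 4 bytes per codepoint). 0xxxxxxx => single-byte ASCII character. 110xxxxx, 1110xxxx, 11110xxx => lead byte, followed by 1, 2 or 3 more bytes. 10xxxxxx => continuation byte, part of the character started by the preceding lead byte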
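The bit patterns can be made concrete with a hand-written encoder. This is a minimal sketch for illustration; `utf8_encode` is a hypothetical helper name, and in practice Python's built-in `str.encode("utf-8")` does the same job.

```python
def utf8_encode(codepoint: int) -> bytes:
    """Encode one Unicode codepoint to UTF-8 by hand (illustrative only)."""
    if codepoint < 0 or codepoint > 0x10FFFF:
        raise ValueError("not a Unicode codepoint")
    if 0xD800 <= codepoint <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if codepoint < 0x80:
        # 0xxxxxxx: plain ASCII, one byte
        return bytes([codepoint])
    if codepoint < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (codepoint >> 6),
                      0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (codepoint >> 18),
                  0x80 | ((codepoint >> 12) & 0x3F),
                  0x80 | ((codepoint >> 6) & 0x3F),
                  0x80 | (codepoint & 0x3F)])


for ch in ["A", "é", "€", "😀"]:
    encoded = utf8_encode(ord(ch))
    # Show each byte in binary so the lead and continuation prefixes are visible.
    print(ch, " ".join(f"{b:08b}" for b in encoded))
```

Because continuation bytes always start with 10 and lead bytes never do, a decoder that lands in the middle of a stream can resynchronize by skipping to the next byte that does not start with 10.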

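Byte order and BOM: multi-byte code units (UTF-16, UTF-32) can be stored big-endian or little-endian. The byte order mark is the codepoint U+FEFF at the start of the data: bytes FE FF mean big-endian, bytes FF FE mean little-endian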
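A short sketch of how the BOM disambiguates byte order, using Python's standard `codecs` constants and the generic `utf-16` decoder, which reads the BOM, picks the byte order, and drops the mark from the result.

```python
import codecs

text = "A"

big_endian = text.encode("utf-16-be")     # high byte first, no BOM added
little_endian = text.encode("utf-16-le")  # low byte first, no BOM added
print(big_endian.hex(), little_endian.hex())

# The BOM is U+FEFF serialized in the byte order of the data that follows.
print(codecs.BOM_UTF16_BE.hex(), codecs.BOM_UTF16_LE.hex())

# With a BOM in front, the generic "utf-16" codec decodes either order.
from_be = (codecs.BOM_UTF16_BE + big_endian).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + little_endian).decode("utf-16")
print(from_be == from_le == text)
```

U+FFFE, the byte-swapped value, is permanently reserved as a noncharacter, so a reader that sees FF FE can be sure it is looking at a little-endian BOM and not at some other character.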

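Java internal: originally UCS-2. Since Java 5 a char is a UTF-16 code unit, so codepoints above U+FFFF take two chars (a surrogate pair)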
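The consequence for string length can be sketched by listing UTF-16 code units. The example is in Python to stay consistent with the other sketches on this page; `utf16_units` is a hypothetical helper, and its result length corresponds to what Java's `String.length()` reports.

```python
def utf16_units(s: str) -> list:
    """Return the UTF-16 code units of s as integers (illustrative helper)."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_char = "€"     # U+20AC, inside the Basic Multilingual Plane
astral_char = "😀"  # U+1F600, outside it

print([hex(u) for u in utf16_units(bmp_char)])
print([hex(u) for u in utf16_units(astral_char)])

# Python counts codepoints, UTF-16 (and Java) counts code units.
print(len(astral_char), len(utf16_units(astral_char)))
```

The two units for the emoji fall in the ranges D800-DBFF (high surrogate) and DC00-DFFF (low surrogate), which are reserved in Unicode for exactly this purpose.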