  • Sql.mysql: Use/support UTF-8 encoded UTF-16. · 9c9a91ae
    Henrik (Grubba) Grubbström authored
    MySQL/MariaDB default to a "utf8" character set that can only
    encode the BMP (at most 3 bytes per character). MySQL/MariaDB 5.5
    and later add a character set "utf8mb4" that also supports
    code points outside the BMP. Using this new character set, however,
    requires redefining tables, etc.
    
    As a work-around we instead default to keeping the "utf8"
    character set while encoding characters outside the BMP as
    surrogate pairs. This works seamlessly with old table definitions,
    with the minor defect that characters outside the BMP do not
    collate as single characters.
    
    Fixes [PIKE-112].
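
The work-around described above (UTF-8 encoded UTF-16, an encoding commonly known as CESU-8) can be illustrated with a minimal sketch. This is Python rather than Pike, and the function names `encode_utf8_surrogates` and `decode_utf8_surrogates` are hypothetical; it is not the code in Sql.mysql, only an illustration of how a code point outside the BMP ends up as two 3-byte sequences that fit the "utf8" character set.

```python
def encode_utf8_surrogates(text: str) -> bytes:
    """Encode text so that no character needs more than 3 bytes.

    Code points above U+FFFF are split into a UTF-16 surrogate pair,
    and each surrogate is then written as a 3-byte UTF-8 sequence.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)    # high (leading) surrogate
            low = 0xDC00 | (cp & 0x3FF)   # low (trailing) surrogate
            for unit in (high, low):
                # "surrogatepass" lets Python emit surrogates as UTF-8.
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


def decode_utf8_surrogates(data: bytes) -> str:
    """Reverse the encoding: recombine surrogate pairs into one character."""
    # Step 1: decode, keeping surrogates as separate code units.
    units = data.decode("utf-8", "surrogatepass")
    # Step 2: a round trip through UTF-16 merges each pair.
    return units.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


emoji = "\U0001F600"  # a code point outside the BMP
encoded = encode_utf8_surrogates(emoji)
print(len(emoji.encode("utf-8")))  # 4 bytes in plain UTF-8 (needs utf8mb4)
print(encoded.hex(" "))            # ed a0 bd ed b8 80
```

Because the database sees two separate 3-byte characters rather than one 4-byte character, existing "utf8" tables accept the data unchanged, which is also why such characters do not collate as single characters.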