Tag: unicode

Quick Read on UTF-8 in Golang

Raw Strings: We create a "raw string" by enclosing it in back quotes, so it contains only literal text and escape sequences are not interpreted. Regular strings, enclosed in double quotes, can contain escape sequences as we showed above.

    package main

    import (
        "fmt"
    )

    func main() {
        // Raw string: the backslashes are printed as written.
        fmt.Println(`go\\n`)
        // Interpreted string: \\ is an escape for a single backslash.
        fmt.Println("escapedGo\\n")
    }

Output:

    go\\n
    escapedGo\n

Raw string is always UTF-8 because it is part of the Go source…

Database Collation and UTF8MB4

Database collation defines how characters are compared, and hence the order of rows in sorted query results.

Popular encodings:
UTF8 (MySQL's utf8, also known as utf8mb3): uses up to 3 bytes per character.
UTF8MB4: uses up to 4 bytes per character, so it can store more characters, such as emoji.

How to decode the collation name utf8mb4_unicode_520_ci:
utf8mb4: the UTF8MB4 character set.
unicode_520: comparison rules for characters based on Unicode 5.2.0.
ci: case-insensitive comparisons.

Reference:
https://stackoverflow.com/questions/37307146/difference-between-utf8mb4-unicode-ci-and-utf8mb4-unicode-520-ci-collations-in-m#
https://www.monolune.com/mysql-utf8-charsets-and-collations-explained/

Notes on String & Encoding Techniques

Strings and their encodings determine which languages the code can support. Introduction: Many languages have symbols that cannot be represented in a single 8-bit byte (ASCII itself defines only 128 characters). An encoding adds semantics to a set of bytes. Unicode is a table of characters and their numeric equivalents, called code points. Since there are more than 100k symbols, 8 bits are not enough. What is…
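As a rough illustration of the difference between Unicode code points and the bytes an encoding produces, the Go sketch below counts both for one string. The function name describe and the sample string are assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe (hypothetical helper) returns the number of bytes and the number
// of Unicode code points in s.
func describe(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "Go\u8a9e" // "Go" followed by the code point U+8A9E
	nBytes, nRunes := describe(s)
	fmt.Printf("%d bytes, %d code points\n", nBytes, nRunes)

	// Ranging over a string yields code points together with their byte offsets.
	for offset, r := range s {
		fmt.Printf("byte offset %d: %U\n", offset, r)
	}
}
```

The string holds 3 code points but 5 bytes, because U+8A9E takes 3 bytes in UTF-8 while each ASCII letter takes 1.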