For everybody aiming to handle UTF-8 correctly:
make sure that your code operates correctly on the Big List of Naughty Strings:
https://github.com/minimaxir/big-list-of...r/blns.txt
It contains, for example, the two letters "Ⱥ" (U+023A) and "Ⱦ" (U+023E), whose UTF-8 encoding GROWS from 2 to 3 bytes when they are lowercased to "ⱥ" (U+2C65) and "ⱦ" (U+2C66):
https://twitter.com/mikko/status/1059521508765298691
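If you want to see the size change for yourself, here is a minimal Python sketch. It assumes a Python 3 interpreter whose Unicode tables include these case mappings; utf8_len is just a helper name I made up for the illustration, not something from the list or the tweet.

```python
def utf8_len(s: str) -> int:
    """Return the number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


# U+023A (A with stroke) and U+023E (T with diagonal stroke), written as
# escapes so the example does not depend on the source file's encoding.
for ch in ("\u023A", "\u023E"):
    lower = ch.lower()
    print(
        f"U+{ord(ch):04X} -> U+{ord(lower):04X}: "
        f"{utf8_len(ch)} -> {utf8_len(lower)} bytes"
    )
```

The practical consequence: code that lowercases a UTF-8 buffer in place, or allocates the output with the same byte length as the input, can overflow or truncate on such input.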