While I was saving a string to a custom table in the database, MySQL
rejected it with an "Incorrect string value" error. The string contained
emoji characters, which the column's character set could not store. The
error looked like this:
.../vendor/yiisoft/yii2/db/Schema.php:677
Error Info:
Array
(
[0] => HY000
[1] => 1366
[2] => Incorrect string value: '\xF0\x9F\x8E\x93\\...' for column 'error' at row 1
)
Caused by: Exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x8E\x93\\...' for column 'error' at row 1'
The column was using MySQL's utf8 character set, which stores at most
3 bytes per character. Emoji take 4 bytes in UTF-8, so they only fit in a
utf8mb4 column, which supports up to 4 bytes per character.
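To make the byte counts concrete, here is a minimal sketch in plain PHP (7.0 or later, no packages). It inspects U+1F393, the character whose bytes appear in the error message above. The `needsUtf8mb4()` helper is a hypothetical name introduced only for illustration; it is not part of Craft, Yii, or LitEmoji.

```php
<?php
// U+1F393 is the character behind the \xF0\x9F\x8E\x93 bytes in the error.
$emoji = "\u{1F393}";

// strlen() counts bytes, not characters.
$bytes = strlen($emoji);
$hex = strtoupper(bin2hex($emoji));
echo $bytes, ' bytes: ', $hex, PHP_EOL; // 4 bytes: F09F8E93

// Hypothetical helper: true when the string contains a character outside
// the Basic Multilingual Plane, i.e. one that needs 4 bytes in UTF-8.
function needsUtf8mb4(string $value): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $value) === 1;
}

var_dump(needsUtf8mb4("Graduated! \u{1F393}")); // bool(true)
var_dump(needsUtf8mb4('héllo €'));              // bool(false)
```

A check like this could be used to log or flag values before they reach a 3-byte column, though converting them as described in the fix below is what actually avoids the error.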
The Fix
use LitEmoji\LitEmoji;
// Replace Unicode emoji with :shortcode: text that fits in a utf8 column
$utf8String = LitEmoji::unicodeToShortcode($utf8mb4String);
LitEmoji is a package that Craft CMS uses to normalize values before
saving them to the database. An example of this can be found in the
PlainText field class in the Craft CMS source code. If it's good enough for them, it's good enough for me. Plus, since Craft already requires the package, the LitEmoji class is available without any additional Composer requirements.
When you read this data back from the database, call the reverse function to turn the shortcodes back into Unicode emoji.
use LitEmoji\LitEmoji;
// Turn the stored :shortcode: text back into Unicode emoji
$utf8mb4String = LitEmoji::shortcodeToUnicode($utf8String);
Happy emoji-ing 🙂