This post covers the Encoding section of the Strings chapter, as part of studying for the Zend PHP 7 Certification.
There are a variety of PHP extensions that support character encoding. The Multibyte String extension (mbstring) provides string functions that help you deal with multibyte encodings. mbstring is a non-default extension, so it is not enabled by default. You must explicitly enable it when building PHP by passing --enable-mbstring as a configure option.
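Because the extension may be missing, it can help to check for it before calling any mb_* function. The following is a minimal sketch; hasMbstring() is a hypothetical helper name introduced here for illustration, while extension_loaded() is a core PHP function.

```php
<?php
// Hypothetical helper: reports whether the mbstring extension is available
// in the running PHP build.
function hasMbstring(): bool
{
    return extension_loaded('mbstring');
}

if (hasMbstring()) {
    echo "mbstring is enabled\n";
} else {
    // PHP would need to be built with --enable-mbstring (or have the
    // extension installed) before the mb_* functions can be used.
    echo "mbstring is not enabled\n";
}
```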
Some of the string functions that mbstring provides are covered below.
The mb_check_encoding() function checks whether a string is valid for the specified encoding. It can take two parameters: the string to check and the encoding to check it against.
mb_check_encoding($string, 'UTF-8');
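As a small illustration of what "valid for the specified encoding" means, the sketch below checks the same word stored in two different encodings. The variable names and sample strings are assumptions made for this example.

```php
<?php
// "café" as UTF-8: the é is stored as the two bytes C3 A9.
$utf8 = "caf\u{00E9}";

// "café" as ISO-8859-1: the é is stored as the single byte E9,
// which is not a well-formed UTF-8 sequence on its own.
$latin1 = "caf\xE9";

var_dump(mb_check_encoding($utf8, 'UTF-8'));        // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8'));      // bool(false)
var_dump(mb_check_encoding($latin1, 'ISO-8859-1')); // bool(true)
```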
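The mb_convert_encoding() function converts a string from one character encoding to another. It can take three parameters: the string, the target encoding and, optionally, the source encoding. If the source encoding is omitted, the internal character encoding is assumed.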
/* Convert internal character encoding to SJIS */
$str = mb_convert_encoding($str, "SJIS");
/* Convert EUC-JP to UTF-7 */
$str = mb_convert_encoding($str, "UTF-7", "EUC-JP");
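To make the effect of a conversion visible, here is a minimal round-trip sketch between UTF-8 and ISO-8859-1. The sample string and variable names are assumptions for illustration; the byte counts show how the same text occupies a different number of bytes in each encoding.

```php
<?php
$utf8 = "caf\u{00E9}"; // "café" in UTF-8

// Convert UTF-8 to ISO-8859-1, naming the source encoding explicitly
// rather than relying on the internal encoding.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo strlen($utf8), "\n";    // 5 (é takes two bytes in UTF-8)
echo strlen($latin1), "\n";  // 4 (é takes one byte in ISO-8859-1)
echo bin2hex($latin1), "\n"; // 636166e9

// Converting back restores the original bytes.
$back = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump($back === $utf8);   // bool(true)
```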
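The mb_detect_encoding() function detects the character encoding of a string. This function can also take three parameters: the string, an optional encoding list and an optional strict flag.
If the encoding list is omitted, the detection order set by mb_detect_order is used.
The strict flag defaults to false.
Note that if you use mb_detect_encoding() to check whether a string is valid UTF-8, you should enable strict mode; without it the result is unreliable.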
$str = "\xE1\xE9\xF3\xFA"; // 'áéóú' encoded as ISO-8859-1
mb_detect_encoding($str, 'UTF-8'); // 'UTF-8' (non-strict mode reports a false match)
mb_detect_encoding($str, 'UTF-8', true); // false
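The strict-mode advice can be wrapped in a small helper. This is a sketch under stated assumptions: isValidUtf8() is a hypothetical name introduced here, and the byte strings are written with escapes so the example does not depend on how the source file itself is saved. mb_check_encoding() is shown alongside as another way to ask the same question.

```php
<?php
// Hypothetical helper: reports whether $bytes is well-formed UTF-8,
// using mb_detect_encoding() in strict mode.
function isValidUtf8(string $bytes): bool
{
    return mb_detect_encoding($bytes, 'UTF-8', true) === 'UTF-8';
}

$latin1 = "\xE1\xE9\xF3\xFA";                   // 'áéóú' as ISO-8859-1
$utf8   = "\u{00E1}\u{00E9}\u{00F3}\u{00FA}";   // 'áéóú' as UTF-8

var_dump(isValidUtf8($latin1)); // bool(false)
var_dump(isValidUtf8($utf8));   // bool(true)

// mb_check_encoding() gives the same answer for these inputs.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```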
mb_detect_order() sets or gets the character encoding detection order. It takes one optional parameter, encoding_list, which is an array or comma-separated list of character encodings. If encoding_list is omitted, it returns the current character encoding detection order as an array.
/* Set detection order by enumerated list */
mb_detect_order("eucjp-win,sjis-win,UTF-8");
/* Set detection order by array */
$ary[] = "ASCII";
$ary[] = "JIS";
$ary[] = "EUC-JP";
mb_detect_order($ary);
/* Display current detection order */
echo implode(", ", mb_detect_order());
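Since the detection order is a global setting, one cautious pattern is to save it, change it for a specific task, and then restore it. The sketch below assumes that pattern; the variable names and sample string are illustrative only.

```php
<?php
// Remember the current detection order so it can be restored afterwards.
$previous = mb_detect_order();

// Temporarily try ASCII first, then UTF-8.
mb_detect_order(['ASCII', 'UTF-8']);
echo implode(', ', mb_detect_order()), "\n"; // ASCII, UTF-8

// With no encoding list passed, mb_detect_encoding() uses the order above.
$detected = mb_detect_encoding("caf\u{00E9}");
var_dump($detected); // string(5) "UTF-8"

// Put the original order back.
mb_detect_order($previous);
```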
Note: This article is based on PHP version 7.1.