Enumerations
enum UNICODE_ERROR
enum UNICODE_SINGLEBYTE
enum UNICODE_MULTIBYTE
Functions
unicode_check ()
_unicode_check ()
unicode_requirements ()
drupal_xml_parser_create (&$data)
drupal_convert_to_utf8 ($data, $encoding)
drupal_truncate_bytes ($string, $len)
truncate_utf8 ($string, $len, $wordsafe=FALSE, $dots=FALSE)
mime_header_encode ($string)
mime_header_decode ($header)
_mime_header_decode ($matches)
decode_entities ($text, $exclude=array())
_decode_entities ($prefix, $codepoint, $original, &$table, &$exclude)
drupal_strlen ($text)
drupal_strtoupper ($text)
drupal_strtolower ($text)
_unicode_caseflip ($matches)
drupal_ucfirst ($text)
drupal_substr ($text, $start, $length=NULL)
enum UNICODE_ERROR
Indicates an error during the check for PHP Unicode support.
Definition at line 7 of file unicode.inc.
enum UNICODE_MULTIBYTE
Indicates that full Unicode support with the PHP mbstring extension is being used.
Definition at line 18 of file unicode.inc.
enum UNICODE_SINGLEBYTE
Indicates that standard PHP (emulated) Unicode support is being used.
Definition at line 12 of file unicode.inc.
_decode_entities ($prefix, $codepoint, $original, &$table, &$exclude)
Helper function for decode_entities().
Definition at line 351 of file unicode.inc.
_mime_header_decode ($matches)
Helper function for mime_header_decode().
Definition at line 309 of file unicode.inc.
References drupal_convert_to_utf8().
_unicode_caseflip ($matches)
Helper function for case conversion of Latin-1. Used for flipping U+C0-U+DE to U+E0-U+FE and back.
Definition at line 450 of file unicode.inc.
_unicode_check ()
Perform checks about Unicode support in PHP, and set the right settings if needed.
Because Drupal needs to be able to handle text in various encodings, we do not support mbstring function overloading. HTTP input/output conversion must be disabled for similar reasons.
$errors | Whether to report any fatal errors with form_set_error().
Definition at line 38 of file unicode.inc.
References get_t().
Referenced by unicode_check(), and unicode_requirements().
decode_entities ($text, $exclude=array())
Decode all HTML entities (including numerical ones) to regular UTF-8 bytes. Double-escaped entities will only be decoded once ("&amp;lt;" becomes "&lt;", not "<").
$text | The text to decode entities in.
$exclude | An array of characters which should not be decoded. For example, array('<', '&', '"'). This affects both named and numerical entities.
Definition at line 331 of file unicode.inc.
Referenced by db_connect(), drupal_html_to_text(), and format_rss_channel().
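The single-pass behavior described above can be illustrated with PHP's built-in html_entity_decode(), used here only as a stand-in. This sketch does not call decode_entities() and does not show the $exclude handling; the variable names are made up for the example.

```php
<?php
// Stand-in demonstration using PHP's built-in html_entity_decode().
// This is not Drupal's decode_entities(), only an illustration of
// what decoding exactly once means.
$double_escaped = '&amp;lt;b&amp;gt;';

// One pass turns each "&amp;" into "&", leaving "&lt;b&gt;".
$once = html_entity_decode($double_escaped, ENT_QUOTES, 'UTF-8');

// A second, separate pass would be needed to reach the literal "<b>".
$twice = html_entity_decode($once, ENT_QUOTES, 'UTF-8');

// Numerical entities decode to UTF-8 bytes: "&#233;" is U+00E9.
$numeric = html_entity_decode('&#233;', ENT_QUOTES, 'UTF-8');
```

Decoding only once matters when the text is later output as HTML: a second pass would turn escaped markup back into live markup.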
drupal_convert_to_utf8 ($data, $encoding)
Convert data to UTF-8.
Requires the iconv, GNU recode or mbstring PHP extension.
$data | The data to be converted.
$encoding | The encoding that the data is in.
Definition at line 173 of file unicode.inc.
References watchdog().
Referenced by _mime_header_decode(), and drupal_xml_parser_create().
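A minimal sketch of such a conversion, assuming that at least one of the iconv or mbstring extensions is loaded. The helper name example_convert_to_utf8() is hypothetical, the GNU recode branch is left out, and nothing here is taken from the actual body of drupal_convert_to_utf8().

```php
<?php
/**
 * Hypothetical helper, not Drupal's drupal_convert_to_utf8().
 * Tries iconv first, then mbstring; returns FALSE if neither is loaded.
 */
function example_convert_to_utf8($data, $encoding) {
  if (function_exists('iconv')) {
    return @iconv($encoding, 'utf-8', $data);
  }
  if (function_exists('mb_convert_encoding')) {
    return @mb_convert_encoding($data, 'utf-8', $encoding);
  }
  return FALSE;
}

// "\xE9" is "é" in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
$utf8 = example_convert_to_utf8("caf\xE9", 'ISO-8859-1');
```

Checking function_exists() before each call keeps the sketch usable on installations where only one of the extensions is available.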
drupal_strlen ($text)
Count the number of characters in a UTF-8 string. This is less than or equal to the byte count.
Definition at line 401 of file unicode.inc.
Referenced by _form_validate(), theme_username(), and truncate_utf8().
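The difference between character count and byte count can be sketched without mbstring by ignoring UTF-8 continuation bytes. The helper name example_utf8_strlen() is hypothetical, and the sketch assumes the input is valid UTF-8.

```php
<?php
/**
 * Hypothetical helper: counts characters in a valid UTF-8 string by
 * dropping continuation bytes (0x80-0xBF). Every character contributes
 * exactly one byte outside that range.
 */
function example_utf8_strlen($text) {
  return strlen(preg_replace("/[\x80-\xBF]/", '', $text));
}

$text = "t\xC3\xA9st";                // "tést"
$chars = example_utf8_strlen($text);  // counts characters
$bytes = strlen($text);               // counts bytes
```

For "tést" the character count is 4 while the byte count is 5, because "é" occupies two bytes.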
drupal_strtolower ($text)
Lowercase a UTF-8 string.
Definition at line 432 of file unicode.inc.
Referenced by book_export(), parse_size(), and template_preprocess_page().
drupal_strtoupper ($text)
Uppercase a UTF-8 string.
Definition at line 415 of file unicode.inc.
Referenced by drupal_ucfirst(), and tablesort_sql().
drupal_substr ($text, $start, $length=NULL)
Cut off a piece of a string based on character indices and counts. Follows the same behavior as PHP's own substr() function.
Note that to cut a string at a known character or substring location, PHP's normal strpos() and substr() are safe and much faster.
Definition at line 470 of file unicode.inc.
Referenced by drupal_ucfirst(), theme_username(), and truncate_utf8().
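The contrast between character-based and byte-based cutting can be sketched as follows. The helper example_utf8_substr() is hypothetical and only approximates substr() semantics for common cases; it assumes valid UTF-8 input and is not how drupal_substr() is necessarily implemented.

```php
<?php
/**
 * Hypothetical helper, not Drupal's drupal_substr(): splits a valid UTF-8
 * string into characters and slices the resulting array.
 */
function example_utf8_substr($text, $start, $length = NULL) {
  $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
  return implode('', array_slice($chars, $start, $length));
}

$text = "h\xC3\xA9llo";                        // "héllo"
$by_chars = example_utf8_substr($text, 1, 3);  // three characters from index 1
$by_bytes = substr($text, 1, 3);               // three bytes from offset 1

// Cutting at a position found with strpos() is safe without any helper,
// because the match always starts on a character boundary.
$head = substr($text, 0, strpos($text, 'l'));
```

Here $by_chars holds "éll", whereas $by_bytes holds only "él" because the two bytes of "é" use up part of the byte budget.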
drupal_truncate_bytes ($string, $len)
Truncate a UTF-8-encoded string safely to a number of bytes.
If the end position is in the middle of a UTF-8 sequence, the function scans backwards to the beginning of that byte sequence and cuts there.
Use this function whenever you want to chop off a string at an unknown location. On the other hand, if you're sure that you're splitting on a character boundary (e.g. after using strpos() or similar), you can safely use substr() instead.
$string | The string to truncate.
$len | An upper limit on the returned string length.
Definition at line 209 of file unicode.inc.
Referenced by mime_header_encode().
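The backward scan described above can be sketched as follows. The helper name example_truncate_bytes() is hypothetical, and the sketch assumes the input is valid UTF-8, where continuation bytes have the bit pattern 10xxxxxx.

```php
<?php
/**
 * Hypothetical helper illustrating byte-safe truncation of valid UTF-8.
 */
function example_truncate_bytes($string, $len) {
  if (strlen($string) <= $len) {
    return $string;
  }
  if ($len <= 0) {
    return '';
  }
  // $string[$len] is the first byte that would be dropped. If it is a
  // continuation byte, the cut falls inside a sequence, so step back
  // until the cut sits in front of a lead byte or an ASCII byte.
  while ($len > 0 && (ord($string[$len]) & 0xC0) === 0x80) {
    $len--;
  }
  return substr($string, 0, $len);
}

$text = "t\xC3\xA9st";                      // "tést", 5 bytes
$two = example_truncate_bytes($text, 2);    // cut would split "é"
$three = example_truncate_bytes($text, 3);  // cut falls on a boundary
```

With a limit of 2 the result is "t", since keeping only the first byte of "é" would produce invalid UTF-8; with a limit of 3 the result is "té".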
drupal_ucfirst ($text)
Capitalize the first letter of a UTF-8 string.
Definition at line 457 of file unicode.inc.
References drupal_strtoupper(), and drupal_substr().
Referenced by system_modules(), and system_modules_confirm_form().
drupal_xml_parser_create (&$data)
Prepare a new XML parser.
This is a wrapper around xml_parser_create() which extracts the encoding from the XML data first and sets the output encoding to UTF-8. This function should be used instead of xml_parser_create(), because PHP 4's XML parser doesn't check the input encoding itself. "Starting from PHP 5, the input encoding is automatically detected, so that the encoding parameter specifies only the output encoding."
This is also where unsupported encodings will be converted. Callers should take this into account: $data might have been changed after the call.
&$data | The XML data which will be parsed later.
Definition at line 126 of file unicode.inc.
References drupal_convert_to_utf8(), and watchdog().
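The first step, extracting the encoding from the XML data, might look roughly like this. The helper name example_xml_declared_encoding() and the regular expression are assumptions made for illustration; the sketch neither converts the data nor creates a parser.

```php
<?php
/**
 * Hypothetical helper: reads the encoding named in the XML declaration,
 * falling back to UTF-8 when none is declared.
 */
function example_xml_declared_encoding($data) {
  if (preg_match('/^<\?xml[^>]+encoding=["\']([^"\']+)["\']/', $data, $match)) {
    return $match[1];
  }
  return 'utf-8';
}

$xml = '<?xml version="1.0" encoding="ISO-8859-1"?><root/>';
$encoding = example_xml_declared_encoding($xml);
```

A caller could then decide whether the declared encoding needs converting before the data is handed to the parser, which is why $data is passed by reference in the documented function.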
mime_header_decode ($header)
mime_header_encode ($string)
Encodes MIME/HTTP header values that contain non-ASCII, UTF-8 encoded characters.
For example, mime_header_encode('tést.txt') returns "=?UTF-8?B?dMOpc3QudHh0?=".
See for more information.
Notes:
Definition at line 279 of file unicode.inc.
References $output, and drupal_truncate_bytes().
Referenced by drupal_mail_send().
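The documented example output can be reproduced with a simplified sketch. The helper name example_mime_header_encode() is hypothetical, and it wraps the whole value in a single Base64 encoded-word; the reference to drupal_truncate_bytes() suggests the real function splits long values on character boundaries, a step omitted here.

```php
<?php
/**
 * Hypothetical, simplified helper: wraps the whole value in one Base64
 * encoded-word. It does not split long values into chunks.
 */
function example_mime_header_encode($string) {
  // Values made only of printable ASCII are returned unchanged.
  if (!preg_match('/[^\x20-\x7E]/', $string)) {
    return $string;
  }
  return '=?UTF-8?B?' . base64_encode($string) . '?=';
}

$encoded = example_mime_header_encode("t\xC3\xA9st.txt");  // "tést.txt"
$plain = example_mime_header_encode('test.txt');
```

The first call yields "=?UTF-8?B?dMOpc3QudHh0?=", matching the example given above, while the pure-ASCII value passes through untouched.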
truncate_utf8 ($string, $len, $wordsafe=FALSE, $dots=FALSE)
Truncate a UTF-8-encoded string safely to a number of characters.
$string | The string to truncate.
$len | An upper limit on the returned string length.
$wordsafe | Flag to truncate at last space within the upper limit. Defaults to FALSE.
$dots | Flag to add trailing dots. Defaults to FALSE.
Definition at line 234 of file unicode.inc.
References drupal_strlen(), and drupal_substr().
Referenced by _locale_translate_seek(), comment_admin_overview(), dblog_overview(), and dblog_top().
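How the two flags interact can be sketched as follows. The helper example_truncate_chars() is hypothetical and is not Drupal's truncate_utf8(); in particular, it assumes valid UTF-8 input and that the trailing "..." counts toward the $len limit, neither of which is stated in this reference.

```php
<?php
/**
 * Hypothetical helper, not Drupal's truncate_utf8(). Assumes valid UTF-8
 * and that the trailing "..." counts toward the $len limit.
 */
function example_truncate_chars($string, $len, $wordsafe = FALSE, $dots = FALSE) {
  $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
  if (count($chars) <= $len) {
    return $string;
  }
  if ($dots) {
    // Reserve room for the three dots.
    $len = max($len - 3, 0);
  }
  $cut = implode('', array_slice($chars, 0, $len));
  if ($wordsafe) {
    // Look one character past the limit so that a word ending exactly at
    // the limit is kept, then cut at the last space if there is one.
    $peek = implode('', array_slice($chars, 0, $len + 1));
    $last_space = strrpos($peek, ' ');
    if ($last_space) {
      $cut = substr($peek, 0, $last_space);
    }
  }
  return $dots ? $cut . '...' : $cut;
}

$s = "caf\xC3\xA9 au lait";                             // "café au lait"
$plain = example_truncate_chars($s, 6);                 // "café a"
$wordsafe = example_truncate_chars($s, 6, TRUE);        // "café"
$dotted = example_truncate_chars($s, 7, FALSE, TRUE);   // "café..."
```

Searching for the space with the byte-oriented strrpos() is safe here because the ASCII space byte never occurs inside a multi-byte UTF-8 sequence.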
unicode_check ()
Wrapper around _unicode_check().
Definition at line 23 of file unicode.inc.
References _unicode_check().
Referenced by _drupal_bootstrap_full(), and _drupal_maintenance_theme().
unicode_requirements ()
Return Unicode library status and errors.
Definition at line 79 of file unicode.inc.
References _unicode_check(), and get_t().