CESU-8
Encyclopedia : C : CE : CES : CESU-8
| Unicode |
|---|
| Encodings |
| UCS |
| Mapping |
| Bi-directional text |
| BOM |
| Han unification |
| Unicode and HTML |
| Unicode and e-mail |
| Unicode typefaces |
In practice, CESU-8 is often used to communicate with the Oracle database software, which in modern configurations apparently uses UTF-16 as an internal character representation. Oracle's "UTF-8" (actually CESU-8) codec rejects proper UTF-8 sequences for characters from outside the Basic Multilingual Plane, but happily accepts and generates technically invalid UTF-8 sequences for codepoints in the surrogate range (U+D800 .. U+DFFF), as specified in CESU-8.
External links
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
