Esperanto orthography
Twenty-two of the twenty-eight letters are identical in form to letters of the English alphabet (q, w, x, and y being omitted). The remaining six carry diacritic marks: ĉ, ĝ, ĥ, ĵ, ŝ, and ŭ (that is, c, g, h, j, and s with a circumflex, and u with a breve). The full alphabet is:
| A | B | C | Ĉ | D | E | F | G | Ĝ | H | Ĥ | I | J | Ĵ | K | L | M | N | O | P | R | S | Ŝ | T | U | Ŭ | V | Z |
| a | b | c | ĉ | d | e | f | g | ĝ | h | ĥ | i | j | ĵ | k | l | m | n | o | p | r | s | ŝ | t | u | ŭ | v | z |
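Because the diacritic letters sit immediately after their base letters in the alphabet, sorting by raw code point puts them in the wrong place (ĉ would land after z). The following is a minimal sketch of a sort key built from the table above; the name `esperanto_sort_key` and the treatment of characters outside the alphabet are assumptions made for illustration only.

```python
# The 28 letters in alphabetical order, as listed in the table above.
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def esperanto_sort_key(word: str) -> list[int]:
    """Map each character to its position in the Esperanto alphabet.

    Characters outside the alphabet (q, w, x, y, digits, punctuation)
    are ranked after z. That fallback is an assumption of this sketch.
    """
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


words = ["ĉapelo", "cent", "dek", "ĝenerala", "granda", "ŝanco", "sep"]
print(sorted(words, key=esperanto_sort_key))
```

With this key, cent sorts before ĉapelo and granda before ĝenerala, which a plain `sorted(words)` would not guarantee.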
With the exception of c (= [ts]) and the diacritic letters, the letters have approximately the sound values of the IPA. (See Esperanto pronunciation.) There is a nearly one-to-one correspondence of letter to sound. The only significant exceptions are the sequence kz, as in ekzemple, which is frequently pronounced [gz], and borrowed words such as ŭato, which use ŭ for initial [w], a sound that is normally an allophone of v. (See Esperanto phonology.)
In handwritten Esperanto, the diacritics pose no problem. However, since they do not appear on standard alphanumeric keyboards, various alternative methods have been devised for representing them in printed and typed text. The original method was a set of digraphs now known as the "h-system", but with the rise of computer word processing a so-called "x-system" has become equally popular. These systems are described below. With the advent of Unicode, however, the need for such workarounds has lessened.
Unique to the Esperanto script is the spesmilo (1000 specie) sign, an Sm monogram for a now-obsolete international unit of auxiliary Esperanto currency used by a few British and Swiss banks before World War I. It has no Unicode value, and in ordinary fonts is transcribed as Sm, usually italic.
Origin
The script is modeled after Western Slavic scripts such as the Czech or Sorbian alphabet. However, the use of circumflexes instead of carons for the letters ĉ, ĝ, ĥ, ĵ, ŝ avoids the appearance of any national version of the Latin alphabet, and the non-Slavic bases g, j of the letters ĝ and ĵ, rather than Slavic dž, ž, help preserve the printed appearance of Latinate and Germanic vocabulary such as ĝenerala "general" and ĵurnalo "journal". The letter v stands for either v or w of other languages. The letter ŭ of the diphthongs aŭ, eŭ appears to be from the Belarusian Łacinka alphabet, historically associated with the Polish-Lithuanian Commonwealth. (Today Łacinka is strikingly similar to the Esperanto alphabet, but in Zamenhof's day it was closer to Polish; the convergence came with orthographic reforms two decades after Zamenhof went public with Esperanto.)
Due to these Slavic origins, the spelling of geographic names is sometimes divergent from English. This is especially remarked upon when English has the letters x, w, qu, or gu, as in Vaŝintono "Washington", Meksiko "Mexico", or Gvatemalo "Guatemala". However, such spellings are normal to several languages of Central and Northern Europe, including Croatian, Icelandic, Slovenian, and Lithuanian, and are functionally identical to Polish and Russian. Compare the Esperanto forms with Croatian Vašington, Meksiko, and Gvatemala. In the case of "Mexico", the spelling Meksiko is found in languages as diverse as Indonesian, Turkish, and Swahili.
Non-printed variants
Esperanto versions of international Morse code and Braille include the six diacritic letters. In Braille, the circumflex is indicated by adding a point at position 6 (lower right), and the u-breve is the mirror image of u. An Esperanto Braille magazine, Aŭroro, has been published since 1920.
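The two Braille rules just described (add point 6 for the circumflex, mirror u for ŭ) can be sketched with the Unicode Braille Patterns block, where dot n corresponds to bit n-1 above U+2800. The dot patterns for the base letters c, g, h, j, s, u are taken from standard Latin-alphabet braille and are an assumption here, as are all function names.

```python
BRAILLE_BASE = 0x2800  # first code point of the Unicode Braille Patterns block


def cell(*dots: int) -> str:
    """Return the braille character with the given dots (1-6) raised."""
    mask = 0
    for dot in dots:
        mask |= 1 << (dot - 1)  # dot n is stored in bit n-1
    return chr(BRAILLE_BASE + mask)


def dots_of(char: str) -> set[int]:
    """Return the set of raised dots of a six-dot braille character."""
    mask = ord(char) - BRAILLE_BASE
    return {dot for dot in range(1, 7) if mask & (1 << (dot - 1))}


def add_circumflex(char: str) -> str:
    """Circumflex rule: raise point 6 (lower right) on the base letter."""
    return cell(*(dots_of(char) | {6}))


def mirror(char: str) -> str:
    """Mirror a cell left to right: the columns 1-2-3 and 4-5-6 swap."""
    swap = {1: 4, 2: 5, 3: 6, 4: 1, 5: 2, 6: 3}
    return cell(*(swap[dot] for dot in dots_of(char)))


# Base letters in standard Latin braille (assumed, not given in the text).
BASE = {
    "c": cell(1, 4),
    "g": cell(1, 2, 4, 5),
    "h": cell(1, 2, 5),
    "j": cell(2, 4, 5),
    "s": cell(2, 3, 4),
    "u": cell(1, 3, 6),
}

for letter in "cghjs":
    print(letter, BASE[letter], "->", add_circumflex(BASE[letter]))
print("u", BASE["u"], "->", mirror(BASE["u"]))
```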
There is a proposed manual alphabet as part of the Signuno project. Signuno itself, as signed Esperanto rather than a language in its own right, is a manual logographic Esperanto orthography. The Signuno alphabet deviates from international norms (that is, ASL with an Irish T) in that all letters are upright, with a straight wrist: the G is simply turned upright, while the H, P, Q are taken from Irish, the J from Russian, and the Z appears to be unique to Signuno. (It's shaped like an ASL 3, and appears to be derived from alphabetically adjacent V the way Ŭ was derived from adjacent U.) The diacritic letters Ŝ, Ĥ, Ĝ, Ŭ are derived from their base letters S, H, G, U; while Ĉ and Ĵ, like J, are Russian. Numerals 1-5 include the thumb, 6-9 do not, and 10, 100, 1000 are the Roman numerals X, C, M.
The names of the letters of the alphabet
Zamenhof simply tacked an -o onto each consonant to create the name of the letter, with the vowels representing themselves: a, bo, co, ĉo, do, e, fo, etc. The diacritics are frequently mentioned overtly. For instance, ĉ may be called ĉo ĉapela or co ĉapela, from ĉapelo (a hat), and ŭ may be called ŭo luneta or u luneta, from luno (a moon) plus the diminutive -et-.
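The naming rule is mechanical enough to express in a few lines. This is only an illustrative sketch; `letter_name` and `spell_out` are hypothetical helper names, and non-letters are simply skipped.

```python
VOWELS = set("aeiou")


def letter_name(letter: str) -> str:
    """Vowels name themselves; every consonant takes a final -o."""
    letter = letter.lower()
    return letter if letter in VOWELS else letter + "o"


def spell_out(text: str) -> str:
    """Read a string letter by letter, ignoring anything that is not a letter."""
    return "".join(letter_name(ch) for ch in text if ch.isalpha())


print([letter_name(ch) for ch in "abcĉdef"])
print(spell_out("ktp"))
```

The first line reproduces the sequence a, bo, co, ĉo, do, e, fo, and the initialism ktp is read out as kotopo.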
This is fine for initialisms such as ktp [kotopo] for etc., but can be problematic when spelling out names. This is especially true because several consonantal distinctions are difficult for some nationalities, who normally rely on the fact that Esperanto seldom uses these sounds to distinguish words. (That is, they don't form many minimal pairs.) Thus the pairs of letter names ĵo ĝo, ĥo ho (or ĥo ko), co ĉo (or co so, or co to), and ŭo vo are problematic. In addition, over a noisy telephone connection it quickly becomes apparent that voicing distinctions can be difficult to make out: noise confounds the pairs po bo, to do, ĉo ĝo, ko go, fo vo, so zo, ŝo ĵo, as well as the nasals mo no. In addition, lo ro is a difficult distinction for many Asians, Africans, and Pacific Islanders; whereas speakers of languages with no /b/-/v/ contrast, like non-Mexican Spanish or Japanese, have trouble with bo vo.
There have been several proposals to resolve this problem. The one closest to international norms (and thus the easiest to remember) that also clarifies all the above distinctions is a modification of a proposal by Kálmán Kalocsay. It uses the vowel e after a consonant by default, but places the e before sonorants and voiceless fricatives; uses a as the vowel for
As with most languages, punctuation is not completely standardized, but in Esperanto there is the additional complication of multiple competing national traditions.
Commas are required to introduce subordinate clauses (that is, before ke "that" or the ki- correlatives).
Question marks (?) and exclamation marks (!) are used at the end of a clause, and may be internal to a sentence. Question words generally come at the beginning of a question, obviating the need for Spanish-style inverted question marks.
Periods may be used to indicate initialisms: k.t.p. or ktp for kaj tiel plu (et cetera), but not abbreviations that retain the grammatical suffixes. Instead, a hyphen optionally replaces the missing letters: D-ro or Dro for Doktoro (Dr). With ordinal numerals, the adjectival a and accusative n may be superscripted: 13a or 13ª (13th). The abbreviation k is used without a period for kaj (and); the ampersand (&) is not found. Roman numerals are also avoided.
The hyphen is also occasionally used to clarify compounds, and to join grammatical suffixes to proper names that haven't been Esperantized or don't have a nominal -o suffix, such as the accusative on Kalocsay-n or Kálmán-on. Zamenhof used a hyphen to attach particles to correlatives, such as tiu-ĉi (this one here), but this has fallen out of use.
Quotation marks show the greatest variety of any punctuation. Before computerized word processing, they reflected what the printer had available, which was often the national standard of the country where the printer was located. — Dashes, « guillemets » (often »reversed«), “double apostrophes” (also often „reversed“), and more are all found. (However, the 「East Asian」 quotes are not used, as they were designed to fit Chinese characters.) Very occasionally characters in a novel will be distinguished by individualizing the quotation marks used for them. Quotations are introduced with a comma or colon.
Capitalization is used for the first word of a sentence and for proper names used as nouns. Names of months, days of the week, ethnicities, languages, and the adjectival forms of proper names, etc., are not typically capitalized [anglo (an English person), zamenhofa (Zamenhofian)], although national norms may override such generalizations. Titles are more variable: both the Romance style of capitalizing only the first word of the title and the English style of capitalizing all lexical words are found.
All capitals or small capitals are used for acronyms and initialisms of proper names, like TEJO, but not common expressions like ktp (etc.). Small capitals are also a common convention for family names, to avoid the confusion of varying national naming conventions: KALOCSAY Kálmán, Leslie CHEUNG Kwok Wing.
Camel case, with or without a hyphen, may occur when a prefix is added to a proper noun: la geZamenhofoj (the Zamenhofs), pra-Esperanto (Proto-Esperanto). It is also used for Russian-style syllabic acronyms, such as the name ReVo for Reta Vortaro (Internet Dictionary), which is homonymous with revo (dream). Occasionally mixed capitalization will be used for orthographic puns, such as espERAnto, which stands for the Esperanto radikala asocio (Radical Esperanto Association).
Zamenhof contrasted informal ci with formal, and capitalized, Vi as the second-person singular pronouns. However, lower-case vi is now used as the second-person pronoun regardless of number.
The original method of working around the diacritics was developed by the creator of Esperanto himself, L. L. Zamenhof. He recommended using u in place of ŭ, and using digraphs with h for the circumflex letters. For example, ŝ is represented by sh, as in shi for ŝi (she), and shanco for ŝanco (chance).
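A conversion from Unicode text to the h-system might look like the sketch below. The mapping follows the rule just stated (h-digraphs for the circumflex letters, plain u for ŭ); the function name and the choice to capitalize only the first letter of a digraph (Ŝ becomes Sh) are assumptions.

```python
H_SYSTEM = {"ĉ": "ch", "ĝ": "gh", "ĥ": "hh", "ĵ": "jh", "ŝ": "sh", "ŭ": "u"}


def to_h_system(text: str) -> str:
    """Replace each diacritic letter with its h-system spelling."""
    out = []
    for ch in text:
        replacement = H_SYSTEM.get(ch.lower())
        if replacement is None:
            out.append(ch)  # not a diacritic letter: keep as is
        elif ch.isupper():
            out.append(replacement.capitalize())  # Ŝ -> Sh (assumed style)
        else:
            out.append(replacement)
    return "".join(out)


print(to_h_system("ŝi"))
print(to_h_system("ŝanco"))
print(to_h_system("Ĉu vi venos hodiaŭ?"))
```

Note that the mapping is lossy: once ŭ has been written as u, a reverse conversion cannot tell it apart from an original u.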
Unfortunately this method suffers from two main problems:
A more recent system for typing in Esperanto is the so-called "x-system", which uses x instead of h for the digraphs, including ux for ŭ. For example, ŝ is represented by sx, as in sxi for ŝi and sxanco for ŝanco.
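Because x is not a letter of the Esperanto alphabet, an x-system text can be converted back to the diacritic letters mechanically. The following is a rough sketch of both directions; the function names and the capitalization style (Ŝ becomes Sx) are assumptions, not part of any standard.

```python
import re

X_SYSTEM = {"ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux"}
FROM_X = {digraph: letter for letter, digraph in X_SYSTEM.items()}


def to_x_system(text: str) -> str:
    """Replace each diacritic letter with its x-digraph."""
    out = []
    for ch in text:
        digraph = X_SYSTEM.get(ch.lower())
        if digraph is None:
            out.append(ch)
        elif ch.isupper():
            out.append(digraph.capitalize())  # Ŝ -> Sx (assumed style)
        else:
            out.append(digraph)
    return "".join(out)


def from_x_system(text: str) -> str:
    """Restore the diacritic letters from x-digraphs, preserving case."""

    def restore(match: re.Match) -> str:
        base = match.group(1)
        letter = FROM_X[base.lower() + "x"]
        return letter.upper() if base.isupper() else letter

    return re.sub(r"([cghjsuCGHJSU])[xX]", restore, text)


print(to_x_system("ŝanco"))
print(from_x_system("ehxosxangxoj cxiujxauxde"))
```

The round trip only holds for text that is purely Esperanto; any foreign word containing one of these letter pairs would also be rewritten.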
X-digraphs solve the two main problems of the h-system:
A practical problem of digraph substitution that the x-system does not completely resolve arises in bilingual texts. Ux for ŭ is especially problematic alongside French text, because many French words end in aux or eux. Aux, for example, is a word in both languages (aŭ in Esperanto). Any automatic conversion of the text will alter the French words as well as the Esperanto ones. A few English words, such as "auxiliary" and "Euxine", can also be damaged by such search-and-replace routines. A few people have proposed using "vx" instead of "ux" for ŭ to resolve this problem, but this variant of the system is rarely used.
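The bilingual problem is easy to reproduce. The sketch below applies a naive, language-blind replacement to an Esperanto word, a French phrase, and an English word, and then repeats the French case with the rarely used "vx" variant. Both function names are hypothetical, and only lowercase input is handled to keep the example short.

```python
import re

FROM_X = {"cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ"}
FROM_VX = {**{k: v for k, v in FROM_X.items() if k != "ux"}, "vx": "ŭ"}


def naive_from_x(text: str) -> str:
    """Replace every x-digraph, with no idea which language a word is in."""
    return re.sub(r"[cghjsu]x", lambda m: FROM_X[m.group(0)], text)


def from_vx_variant(text: str) -> str:
    """Same conversion, but with the proposed 'vx' spelling for ŭ."""
    return re.sub(r"[cghjsv]x", lambda m: FROM_VX[m.group(0)], text)


print(naive_from_x("hieraux"))         # Esperanto word, restored as intended
print(naive_from_x("aux chevaux"))     # French text, altered by mistake
print(naive_from_x("auxiliary"))       # English word, altered by mistake
print(from_vx_variant("aux chevaux"))  # French text left untouched
print(from_vx_variant("hieravx"))      # Esperanto word in the vx variant
```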
Another, less popular, system is the use of the caret character (^) to represent the diacritics, either before or after the letter to be accented. For example, ŝanco becomes ^sanco or s^anco. This shares the advantage of unambiguity with the x-system, and also has the advantage that the character itself resembles a circumflex accent, so that people unfamiliar with the system are likely to grasp what is meant. However, it has not caught on in many places, partially because it is regarded as aesthetically ugly, but also because it is cumbersome to type. It is primarily seen in introductions to Esperanto written in another language such as English, where introducing both the diacritics and the digraphs would be needlessly confusing.
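Both caret placements can be produced by one small function. This is a sketch only: the name `to_caret_system` and its `caret_first` flag are invented for illustration, and using the caret for the breve of ŭ as well is an assumption drawn from the description above.

```python
BASES = {"ĉ": "c", "ĝ": "g", "ĥ": "h", "ĵ": "j", "ŝ": "s", "ŭ": "u"}


def to_caret_system(text: str, caret_first: bool = True) -> str:
    """Write each diacritic letter as its base letter plus a caret."""
    out = []
    for ch in text:
        base = BASES.get(ch.lower())
        if base is None:
            out.append(ch)  # ordinary character: keep as is
            continue
        if ch.isupper():
            base = base.upper()
        out.append("^" + base if caret_first else base + "^")
    return "".join(out)


print(to_caret_system("ŝanco"))
print(to_caret_system("ŝanco", caret_first=False))
```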
A variant proposes sliding the caret onto the following vowel, since the circumflex vowels of French are widely supported.
So, one would write ehôsângôj cîujâude for eĥoŝanĝoj ĉiuĵaŭde. However, this proposal does not seem to have gotten off the ground.
Many new Esperantists perceive the diacritics to be a problem, and often propose reforms to Esperanto orthography, sometimes with substantial modifications. Such proposals are ignored by the community, both because they generally come from people who do not know the language well, and because reform projects tend to snowball, a fate that has destroyed several constructed languages.
The transliteration of Esperanto into ASCII is a topic known to cause flame wars with little constructive discussion. The reduction of such behaviour is sometimes indicated as one of the main reasons to go to the extra effort of using the proper diacritics. With the advent of Unicode, transliteration systems are no longer necessary on web pages. Nonetheless, the h- and x-systems remain common on Usenet and in e-mail, where encoding support is rare and the limited availability of keyboard configurations makes it difficult for many to type the diacritics.
The entire Esperanto alphabet is part of the Latin-3 and Unicode character sets, and is included in WGL4.
The HTML entities for the special Esperanto characters in Unicode are:
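The numeric character references can be derived directly from the Unicode code points of the twelve letter forms. The short sketch below prints decimal and hexadecimal references for each; the helper name `numeric_references` is hypothetical, and only numeric references are shown, not named entities.

```python
import html

LETTERS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"


def numeric_references(ch: str) -> tuple[str, str]:
    """Return the decimal and hexadecimal HTML references for one character."""
    code = ord(ch)
    return f"&#{code};", f"&#x{code:04X};"


for ch in LETTERS:
    decimal, hexadecimal = numeric_references(ch)
    # Sanity check: a standard HTML unescape must give the letter back.
    assert html.unescape(decimal) == ch == html.unescape(hexadecimal)
    print(f"{ch}  U+{ord(ch):04X}  {decimal}  {hexadecimal}")
```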
Microsoft Windows: A page that describes how to use the excellent tool [Keyman] (free for personal use) in conjunction with a special (free) "keyword file" is available [here]. It can be configured to automatically run at startup. The advantage of using Keyman is that you can easily deactivate it, so your "abbreviations" (such as "cx," which are automatically converted to the corresponding Esperanto letter as you type) are not accidentally converted.
A similar tool to Keyman is [Ek], which is free for any use.
You can also use [keyboard layout manager] to define special keys; the most elementary approach is to associate AltGr+g with ĝ, and similarly for the other letters. The program has a simple and intuitive interface, but it may be necessary to define a new keyboard to avoid interference from Windows' system-file protection, which may not permit modification of important system files such as keyboard drivers.
Many popular e-mail clients support Unicode, so you can happily use the tools described above to write e-mails using the Esperanto alphabet.
If you want to use a text editor that is Esperanto-compatible, make sure it supports Unicode, as do Editplus (UTF-8) and [UniRed].
If the Linux system is recent, or kept updated, then it probably already works with Esperanto keys. For X11 and KDE, it is only necessary to switch to a keyboard layout that has Latin dead keys (for example, the "US International" keyboard) whenever the user wants to write in Esperanto. Some keyboard layouts with dead keys are the following:
In GNOME and GTK+ software, the installed input methods don't currently accept Esperanto key compositions (as of June 2006, GNOME 2.14 and GTK+ 2.8). In many applications, however, one can use the standard X11 input method by right-clicking on a text box, selecting "Input Methods" on the popup menu, and then "X Input Method". Firefox is an example where this does not work, since it uses its own popup menu instead of the default one. In these cases it's better to fall back to the difficult solution above.
If necessary, install and use high-quality fonts that have Esperanto glyphs, such as the Microsoft Web core fonts (free for personal use) or DejaVu. (The Bitstream Vera glyphs are under the [Bitstream Vera license], and the DejaVu extensions are in the public domain.)
Swedish Esperantists who use a Mac have the advantage of being able to use the Finnish Extended layout, which comes with the OS. Finnish has exactly the same alphabet and keyboard layout as Swedish, and the extended layout adds functionality just as US Extended does, only with other key combinations (the breve appears when you type option+y, and the circumflex when you type a circumflex).
An Esperanto locale would use "." as the thousands separator and "," as a decimal point. Time and date format among Esperantists is not as standardized as number format, but 24-hour time with colon between hour and minutes, and for dates, either yyyy-mm-dd or dd-mm-yyyy, would be international and unambiguous.
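Since an Esperanto locale cannot be assumed to be installed on a given system, the conventions above can be approximated by hand. The sketch below formats a number with a comma as decimal point and a configurable thousands separator, and a timestamp in the yyyy-mm-dd, 24-hour style; `format_number` and `format_timestamp` are hypothetical helpers, not part of any locale library.

```python
from datetime import datetime


def format_number(value: float, decimals: int = 1, thousands: str = ".") -> str:
    """Format a number with ',' as decimal point and a custom thousands separator."""
    english = f"{value:,.{decimals}f}"  # e.g. "12,345,678.9"
    # Swap both separators in one pass so neither replacement clobbers the other.
    return english.translate(str.maketrans({",": thousands, ".": ","}))


def format_timestamp(moment: datetime) -> str:
    """Date as yyyy-mm-dd and 24-hour time with a colon."""
    return moment.strftime("%Y-%m-%d %H:%M")


print(format_number(12345678.9))
print(format_number(12345678.9, thousands="\u00a0"))  # non-breaking spaces
print(format_timestamp(datetime(2006, 6, 15, 14, 5)))
```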
From Wikipedia, the Free Encyclopedia.
(kaj means "and". The last two lines are:
Where letters are still confused, such as es vs eŝ, or a vs ha, mention can be made of the diacritic (eŝ ĉapela) or of the manner of articulation of the sound (ha brueta, "breathy aitch"). Quite commonly, however, people will use the ’aitch as in ’ouse strategy used in English.
Punctuation
The comma is also used for the decimal point, while thousands are separated by periods or non-breaking spaces: 12.345.678,9 or 12 345 678,9.
ASCII transliteration systems
The h-system
The x-system
The x-system has become as popular as the h-system, but many people dislike it for its perceived "odd" appearance. Proponents argue that it would look "odd" only if one is expecting the appearance to resemble that of other European languages.
The caret system
Unicode
Practical Unicode for Esperanto
Microsoft Windows
Adjusting a keyboard to type Unicode is relatively easy. All Windows variants of the Microsoft Windows NT family, such as 2000 and XP, support Unicode; Windows 9x does not support it natively.
Linux
In Linux systems, the difficulty of configuration depends largely on whether the system version is old or new, because on old systems it may be necessary to activate Unicode by setting the locale to a UTF-8 locale. There is a special eo_XX.UTF-8 locale available at [Bertil Wennergren's home page], along with a thorough explanation of how to set up Unicode and the keyboard in Linux.
| Keys / Layout | US International | Brazilian ABNT2 | Portuguese |
|---|---|---|---|
| ĉ | shift-6 c | shift-~ c | shift-~ c |
| Ĉ | shift-6 shift-c | shift-~ shift-c | shift-~ shift-c |
| ĝ | shift-6 g | shift-~ g | shift-~ g |
| Ĝ | shift-6 shift-g | shift-~ shift-g | shift-~ shift-g |
| ĥ | shift-6 h | shift-~ h | shift-~ h |
| Ĥ | shift-6 shift-h | shift-~ shift-h | shift-~ shift-h |
| ĵ | shift-6 j | shift-~ j | shift-~ j |
| Ĵ | shift-6 shift-j | shift-~ shift-j | shift-~ shift-j |
| ŝ | shift-6 s | shift-~ s | shift-~ s |
| Ŝ | shift-6 shift-s | shift-~ shift-s | shift-~ shift-s |
| ŭ | altgr-shift-9 u | altgr-shift-\ u | altgr-shift-~ u |
| Ŭ | altgr-shift-9 shift-u | altgr-shift-\ shift-u | altgr-shift-~ shift-u |
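A dead key works by holding back the accent until the next letter arrives and then emitting the combined character. The same combination can be imitated in software with Unicode combining marks and NFC normalization, as in this sketch. The `compose` helper and its two dead-key labels are invented for illustration and do not correspond to any keyboard driver interface.

```python
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"
COMBINING_BREVE = "\u0306"

# Hypothetical labels for the two dead keys needed for Esperanto.
DEAD_KEYS = {"circumflex": COMBINING_CIRCUMFLEX, "breve": COMBINING_BREVE}


def compose(dead_key: str, letter: str) -> str:
    """Combine a dead-key accent with the letter typed after it."""
    # NFC merges the base letter and the combining mark into one code point
    # whenever Unicode defines a precomposed form for the pair.
    return unicodedata.normalize("NFC", letter + DEAD_KEYS[dead_key])


print("".join(compose("circumflex", letter) for letter in "cghjs"))
print(compose("breve", "u"), compose("breve", "U"))
```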
Mac OS X
On Mac OS X systems, Esperanto characters can be entered by activating the "U.S. Extended" keyboard layout in the "Input Menu" pane of the "International" system preferences. When the U.S. Extended layout is active, Esperanto characters can be entered as follows:
The option characters can be remembered by mnemonics: the 6 key contains the caret character, so option-6 places a caret over the following character. Option-b stands for breve.
Locale
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
