Browsers stumble over six Esperanto letters
Browsers stumbled over six special letters

For years browser producers did not provide for the six special Esperanto letters, but around 2000, with Unicode ™, the problem seemed to be either solved or close to it. The technology to type the special letters from the keyboard has existed since at least 1998, and is being marketed by LingoMAIL in an e-mail system. Once LingoMAIL is set up, it is fairly easy to use. However, there are thousands of Esperanto Webpages that were devised before the Unicode ™ solution became generally accessible. Casual Websurfers who come across an Esperanto Webpage might see any of several systems of providing for the six special letters. The Esperantist who reads Esperanto Webpages and e-mails is sometimes amused by the various systems, which are sometimes mixed in the one document! First, please look at the Esperanto alphabet:

[Image: the Esperanto alphabet - jcm 03 & 04 Mar 2000]

The inventiveness of Esperantists has been strained in trying to provide for the special letters: Ĉ, ĉ, Ĝ, ĝ, Ĥ, ĥ, Ĵ, ĵ, Ŝ, ŝ, Ŭ, and ŭ or C^, c^, G^, g^, H^, h^, J^, j^, S^, s^, U~ and u~ on Webpages and in E-mails. Some people use the circumflex (^) and tilde (~) or apostrophe (') after, or before, the letters, but most add an "X" or an "H" after the relevant ordinary letter. All these systems off the keyboard used little disk space, AND they did not require up-to-date browsers, nor special typefaces nor settings.


A COMPARISON OF THE SOLUTIONS

A number of Esperantists download routinely-available programmes -- Latin-3 and Unicode ™ -- and sets of special typefaces, which allow them to view Esperanto on Webpages, as it ought to be displayed showing the special letters. Here is a display, USING SENTENCES SPECIALLY WRITTEN TO SHOW A LOT OF ACCENTED LETTERS, of what I can show so far. About July 2000 I learnt from a Webpage of John Wells how to use UNICODE ™ to show correct Esperanto, but for a time I could not understand how to make Unicode show in a Table on Netscape.

I was glad to write this at the time! "Note that in the nameplate Table elsewhere on this page, the special letter u~, i.e., ŭ, is showing on Netscape! - jcm 20Jul00."

  Ĵaŭdon matene ŝia fianĉo, forgesema de sia ĉirkaŭaĵoj, ĵetis ruĝan ŝuon al muŝo en la antaŭaĵo de preĝejo, sed ĝi batis ĥor-knabon.  Ĥor-knaboj malŝatas batiĝi de ŝuoj!  Li uzis malbonan lingvon! Unicode ™ display.
  Jxauxdon matene sxia fiancxo, forgesema de sia cxirkauxajxoj, jxetis rugxan sxuon al musxo en la antauxajxo de pregxejo, sed gxi batis hxor-knabon.  Hxor-knaboj malsxatas batigxi de sxuoj!  Li uzis malbonan lingvon! X or "Ikso" convention
  Jhauhdon matene shia fiancho, forgesema de sia chirkauhajhoj, jhetis rughan shuon al musho en la antauhajho de preghejo, sed ghi batis hhor-knabon.  Hhor-knaboj malshatas batighi de shuoj!  Li uzis malbonan lingvon! "Full H" or "Amplified Zamenhofan" system
  ¬aýdon matene þia fianæo, forgesema de sia æirkaýa¼oj, ¼etis ruøan þuon al muþo en la antaýa¼o de preøejo, sed øi batis ¶or-knabon.  ¦or-knaboj malþatas batiøi de þuoj!  Li uzis malbonan lingvon!  (NE mispresoj!) Latin-3 coded: (NOT misprints!  Ugly and looks unpronounceable unless a visitor has the correct settings.)
  J^au~don matene s^ia fianc^o, forgesema de sia c^irkau~aj^oj, j^etis rug^an s^uon al mus^o en la antau~aj^o de preg^ejo, sed g^i batis h^or-knabon.  H^or-knaboj mals^atas batig^i de s^uoj!  Li uzis malbonan lingvon! Circumflex "^" and Tilde "~" (following) style: (Ugly! Hard to read.)
  J'au'don matene s'ia fianc'o, forgesema de sia c'irkau'aj'oj, j'etis rug'an s'uon al mus'o en la antau'aj'o de preg'ejo, sed g'i batis h'or-knabon.  H'or-knaboj mals'atas batig'i de s'uoj!  Li uzis malbonan lingvon! Apostrophe style: (Takes some getting used to. Poets like to use apostrophes to show omissions, so this style does not win their approval.)
  On Thursday morning her fiance, forgetful of his surroundings, threw a red shoe at a fly in the front of a church, but it hit a choirboy. Choirboys dislike being hit by shoes!  He used bad language! English
  Ĵaŭdon matene ŝia fianĉo, forgesema de sia ĉirkaŭaĵoj, ĵetis ruĝan ŝuon al muŝo en la antaŭaĵo de preĝejo, sed ĝi batis ĥor-knabon.  Ĥor-knaboj malŝatas batiĝi de ŝuoj!  Li uzis malbonan lingvon! Unikodo ™ elmontro (denove).

  NE mispresoj en Latino-3!  NOT misprints in Latin-3!   "¬," "¦," "¶," "¼," and the rest are appearing in words on webpages as part of the Latin-3 coding, because under that system people keen enough to download the programme and the special typefaces can read them as correct Esperanto letters, and presumably they consider this result outweighs the unintended consequences of casual internet surfers accidentally coming across an Esperanto page, and seeing it as gibberish! Surely this harms the image of Esperanto!!!


 Esperanto-Ligo de Okcidenta Aŭstralio (Korp.)
 Esperanto League of Western Australia (Inc.)

A short background

Standard Esperanto uses a Latin-type alphabet, just as most of Western and Central Europe does, and other continents. However, when Dr Louis Zamenhof invented Esperanto, he wished to tidy up the letters that he thought caused most problems, and he needed six sounds more than the number of letters left.  Remember, he was in a town where the mother-tongues included German (with several special letters, and at that time written in blackletter similar to "Old English" style), Polish (with a few special letters), Russian (with a different alphabet in Cyrillic script of 40 letters), and Yiddish and Hebrew (with a completely different alphabet, written from right to left!).  

To tidy up, rightly or wrongly Dr Zamenhof rejected four letters -- "q," "w," "x," and "y."  His additional sounds were represented by putting diacritical marks over six letters -- C, G, H, J, S, and U, to give the sounds of ch, j as the g in gem, kkh as the last sound of loch, zh as z in azure, sh, and w or a quick oo in diphthongs like AUX (ow as in cow).  (He left out the "th" sound as in "think" and the voiced "dh" sound as in "then," which are in Greek, English and Spanish.)  Here is how his 28-letter alphabet looks in small letters:-

[Image: the Esperanto alphabet in lower-case letters]

The inventor might have imagined or hoped that his six special new letters would be quickly stocked by most printers, just as has been done for the various alphabets around him and in Scandinavia and now even in tiny Malta.  He may not have foreseen the invention of the computer and the internet.  If he had, he would not have expected that, although many unusual letters, even from the Icelandic language with only a limited number of speakers, can be put on the internet by typing in a "code" -- for example "&THORN;" to make "Þ" and "&Oslash;" to make "Ø" -- for years not one Esperanto letter could be easily put on the Internet. Requests to the controlling companies seemingly fell on deaf ears.

How did Esperantists provide for this? Were there 'workarounds'?

Yes, as you would expect, some "workarounds" were devised.  (By the way, the problem of putting Greek on the internet has led a news service to invent substitutes, using the ordinary United States computer keyboard and typefaces, for certain Greek letters, and they produce news quickly in a format they call "Greeklish."  With a little practice, a Greek reader can read these news bulletins, without having to buy special programmes and download special sets of typefaces.)

The H solution: Back in the days of Dr Zamenhof, the problem of representing the six letters drew a solution from him.  If the printer did not have the Esperanto letters, he recommended inserting an H after the corresponding ordinary letter, to signify that it was a special letter for the special sounds, except for the last one, the letter U.  In quite a number of compound words the addition of the H can cause problems; for instance, an airport is "flughaveno," made up of "flug" and "haveno."  The sound after the "u" is NOT the GH, GX, G^ or Ĝ sound as g in "gem," but there are two separate sounds, a G as in go and an H as in hurry.  However, the problem can be solved by the use of a hyphen, thus "flug-haveno."  Some Websites use the traditional Zamenhof system, and a few use Amplified Zamenhofan, that is, an H is put after every one of the six special letters, including U.
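The "flughaveno" problem can be seen in a short Python sketch. The function name h_to_unicode and the digraph table are made up for this illustration and are not taken from any Esperanto software; the function converts traditional H spelling to accented letters in the most naive way, and so it trips over the compound word unless the hyphen is present.

```python
# Hypothetical helper for illustration only.
# Traditional Zamenhof H system: the letter U is left unmarked.
H_DIGRAPHS = {
    "ch": "ĉ", "gh": "ĝ", "hh": "ĥ", "jh": "ĵ", "sh": "ŝ",
    "Ch": "Ĉ", "Gh": "Ĝ", "Hh": "Ĥ", "Jh": "Ĵ", "Sh": "Ŝ",
}


def h_to_unicode(text: str) -> str:
    """Naively replace every letter+h pair with its accented letter."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in H_DIGRAPHS:
            out.append(H_DIGRAPHS[pair])
            i += 2  # both characters of the pair are consumed
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(h_to_unicode("chirkauajho"))  # ĉirkauaĵo (the breve on the u cannot be recovered)
print(h_to_unicode("flughaveno"))   # fluĝaveno (wrong: g and h are separate sounds)
print(h_to_unicode("flug-haveno"))  # flug-haveno (the hyphen keeps g and h apart)
```

A real converter would need a word list to tell a genuine "gh" from a compound boundary; the hyphen simply makes that decision visible in the text.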

The X convention: Early in the days of internet e-mails, chat sessions, bulletin boards, etc., some Esperantists avoided the problems of the H system by inserting an X to show the six special sounds.  It is a great system, and for many Esperantists it is easier to read than the H system.  Because there is no X in Esperanto,  when an Esperantist sees one in a word s/he knows at once that it is marking the letter before it as a special letter with its different sound.  However, the appearance of the x can be a bit "off-putting" to a non-Esperantist, especially if two or three appear in one word, like these: "cxirkauxajxo" and "cxefnovajxojn"!!! (However, words like this are fairly uncommon.) These words look more readable as "chirkauhajho" (surroundings) and "chefnovajhojn" (main news).
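Because X never occurs in an Esperanto word, the X convention can be converted mechanically in both directions. The Python sketch below illustrates this; the names X_DIGRAPHS, x_to_unicode and unicode_to_x are invented for the example, and mixed forms such as "cX" are deliberately not handled.

```python
# Hypothetical helpers for illustration only.
X_DIGRAPHS = {"cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ"}


def x_to_unicode(text: str) -> str:
    """Turn X-convention spelling into accented letters."""
    for plain, accented in X_DIGRAPHS.items():
        text = text.replace(plain, accented)                       # cx -> ĉ
        text = text.replace(plain.upper(), accented.upper())       # CX -> Ĉ
        text = text.replace(plain.capitalize(), accented.upper())  # Cx -> Ĉ
    return text


def unicode_to_x(text: str) -> str:
    """Turn accented letters back into X-convention spelling."""
    for plain, accented in X_DIGRAPHS.items():
        text = text.replace(accented, plain)                       # ĉ -> cx
        text = text.replace(accented.upper(), plain.capitalize())  # Ĉ -> Cx
    return text


print(x_to_unicode("cxirkauxajxo"))  # ĉirkaŭaĵo
print(unicode_to_x("ĉefnovaĵojn"))   # cxefnovajxojn
```

No word list is needed here, which is the practical advantage of X over H: a foreign name containing a real "x" after one of the six letters is the only case that would be misread.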

Special programmes and sets of typefaces: Keen Esperantists have devised the use of special programmes, linked into special sets of typefaces.  The main systems are Latin 3, and Unicode ™, as mentioned above and below.  To use Unicode, it is recommended that Esperantists read: http://www.concinnity.se/bertilow/html/vindozo_unikodo.htm

Let's look more closely at the different methods of showing Esperanto

Below are some organised charts showing some of the methods of representing Esperanto on webpages.

H systems: The H systems are attractive because they look more natural for casual non-Esperantist Websurfers who happen to drop into Websites.  We ought to be trying to attract non-Esperantists.  Here is a key to the "Full H" system:

"Full H" or "Amplified Zamenhofan" style

Sur retpagho ke uzas la "Plena H" auh "Plivastige Zamenhofa" stilo, rigardu "ch", "gh", "hh", "jh", "sh"  kaj "uh" (anstatauh c^, g^, h^, j^, s^, kaj u~).

On a webpage that uses the "Full H" or "Amplified Zamenhofan" style, see "ch", "gh", "hh", "jh", "sh"  and "uh" (instead of  c^, g^, h^, j^, s^, and u~), pronounced as follows:-

"ch" (pron. "CH" as in "church"),
"gh" (pron. "J" sound of the "g" in "gem"),
"hh" (pron. "KKH" as the last sound in "loch"),
"jh" (pron. "ZH" as the "z" in "azure"),
"sh" (pron. "SH" as in "shop") and  
"uh" (pron. "W" or a quick "OO").

X convention:  The X convention is excellent for regular readers of it.

"X" or  "Ikso" style

Sur retpagxo ke uzas la "X" aux "Iksa" convencio, rigardu "cx", "gx", "hx", "jx", "sx"  kaj "ux" (anstataux c^, g^, h^, j^, s^, kaj u~).

On a webpage that uses the "X" or "Ikso" convention, see "cx", "gx", "hx", "jx", "sx"  and "ux" (instead of c^, g^, h^, j^, s^, and u~), pronounced as follows:-

"cx" (pron. "CH" as in "church"),
"gx" (pron. "J" sound of the "g" in "gem"),
"hx" (pron. "KKH" as the last sound in "loch"),
"jx" (pron. "ZH" as the "z" in "azure"),
"sx" (pron. "SH" as in "shop") and
"ux" (pron. "W" or a quick "OO").

Latin-3: Some Latin-3 is displayed on this website.  Below is an explanation and key:

Latin-3 version

Se retpaøo estas en "Latino-3" stilo, øi uzas specialaj skribsignoj kiuj povas aýtomate þanøata al la ses super-signataj literoj en majuskloj kaj minuskloj, jene:
CH = Æ, GH =  Ø, HH=  ¦, JH =  ¬, SH =  Þ, UH = Ý, kaj,
ch = æ,  gh = ø, hh =  ¶, jh =  ¼, sh =  þ, uh =  ý.

If a webpage is in "Latin-3" style, it uses special characters which can be automatically changed to the six super-signed letters in capitals and small letters, as follows:
Æ = CH  and æ = ch (pron. "CH" as in "church"),
Ø = GH  and ø = gh (pron. "J" sound of the "g" in "gem"),
¦ = HH  and ¶ = hh (pron. "KKH" as the last sound in "loch"),
¬ = JH  and ¼ = jh (pron. "ZH" as the "z" in "azure"),
Þ = SH  and þ = sh (pron. "SH" as in "shop") and
Ý = UH  and ý = uh (pron. "W" or a quick "OO").
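The substitute characters in this key are what results when Latin-3 (ISO 8859-3) bytes are displayed by a browser that assumes Latin-1. The Python sketch below reproduces that effect with the standard "iso8859_3" and "latin-1" codecs; the two function names are invented for the example.

```python
# Hypothetical helpers for illustration only.

def seen_without_latin3(text: str) -> str:
    """Encode as Latin-3, then display the bytes as if they were Latin-1."""
    raw = text.encode("iso8859_3")
    return raw.decode("latin-1")


def seen_with_latin3(garbled: str) -> str:
    """Undo the misreading: take the Latin-1 bytes and read them as Latin-3."""
    return garbled.encode("latin-1").decode("iso8859_3")


print(seen_without_latin3("Ĵaŭdon matene ŝia fianĉo"))  # ¬aýdon matene þia fianæo
print(seen_with_latin3("¶or-knabon"))                   # ĥor-knabon
```

The bytes on the page are the same in both cases; only the visitor's encoding setting decides whether they appear as Esperanto or as "misprints".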

I will be soon investigating the Official new-look Unicode ™ website, at http://www.unicode.org/ - jcm 20Jul00


Unicode ™ version

Se retpaĝo estas en "Unikodo ™" stilo, vidi la specialaj skribsignoj oni bezonas Unikoda-amike tiparoj. Dum mia fruaj eksperimentoj mi ne ĉiam povas montri Unikodo ™ en Tabloj per Netskapo. (Mi uzis multaj "&nbsp;" anstataŭ la Tabla sistemo.)

If a webpage is in "Unicode ™" style, to see the special signed letters you need Unicode-friendly Typefaces.  At first I could not always make Unicode work in Tables on Netscape, but now I can.  Below I have used a Table but inserted an earlier version that used "&nbsp;" to space the columns out. Also, in Netscape, if I insert in the Head UTF-8 (one of the systems' command lines), the following ordinary characters turn into empty rectangles: "clever quotation marks", one of the Latin-3 characters, and the Trade Mark (TM) or ™ symbol if it is written with the older numeric code. (The NEW Trade Mark (TM) or ™ coding is "&#8482;", or use "&trade;". - jcm 30 Aug 2000)

Another Netscape problem is that viewing the Source sometimes seems useless, because the vast majority of the page is sometimes filled with rectangles. Internet Explorer had the trick, at some stage, probably because I had changed some default settings, of turning the UTF-8 page all into Verdana typeface, whether I want that or not! - jcm 2257 20 Jul 00, revised 30 Aug 00

If you see question marks (?), strange symbols or squares when viewing the next line:
Ĉ, Ĝ, Ĥ, Ĵ, Ŝ, Ŭ,
please adjust your settings:
Netscape Navigator: View/Encoding/Unicode (UTF-8), or
Internet Explorer: View/Fonts/Universal Alphabet (UTF-8), or
similar changes with other Browsers.

Unicode ™ version

         Majusklaj (capitals)                        Minusklaj (small letters)
&#264; => Ĉ = C^              &#265; => ĉ = c^       (pron. "CH" as in "church")
&#284; => Ĝ = G^              &#285; => ĝ = g^       (pron. "J" sound of the "g" in "gem")
&#292; => Ĥ = H^              &#293; => ĥ = h^       (pron. "KKH" as the last sound in Scots "loch")
&#308; => Ĵ = J^              &#309; => ĵ = j^       (pron. "ZH" as the "z" in "azure")
&#348; => Ŝ = S^              &#349; => ŝ = s^       (pron. "SH" as in "shop")
&#364; => Ŭ = U~              &#365; => ŭ = u~       (pron. "W" or a quick "OO")
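One way to put these letters into HTML source using only ordinary keyboard characters is the decimal numeric character reference, which is simply "&#", the Unicode code point, and ";". The Python sketch below computes the references for the twelve letters; the names LETTERS, to_numeric_reference and escape_esperanto are invented for the example, while html.unescape is the standard-library function that turns a reference back into its letter.

```python
import html

# Hypothetical helpers for illustration only.
LETTERS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"


def to_numeric_reference(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"


def escape_esperanto(text: str) -> str:
    """Replace only the twelve accented letters, leaving everything else alone."""
    return "".join(to_numeric_reference(c) if c in LETTERS else c for c in text)


for letter in LETTERS:
    print(letter, to_numeric_reference(letter))

print(escape_esperanto("ŝuo"))    # &#349;uo
print(html.unescape("&#365;"))    # ŭ
```

A page written this way is pure ASCII, so it survives editors and mail systems that would otherwise alter the accented letters, although the visitor still needs a typeface that contains them.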


The Esperanto alphabet is:

A, B, C, Ĉ, D, E, F, G, Ĝ, H, Ĥ, I, J, Ĵ,
K, L, M, N, O, P, R, S, Ŝ, T, U, Ŭ, V, Z.

Copied from Curtis Whalen's webpage http://www.geocities.com/Heartland/Meadows/3044/EsperantoCharacterReference.html

20 July 00: For Unicode ™, the wording in the Head for UTF-7 and UTF-8 seems to be:
<META http-equiv="Content-Type" content="text/html; charset=x-UNICODE-2-0-UTF-7">
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">

STRANGE RESULTS

Before I understood that it is helpful if the Webpage Head has a Unicode ™ command line, OR that one could adjust the Browser settings to read an Esperanto Unicode page, the following strange results used to appear when I copied Unicode coding from Source HTML:
ÄÆ ÄÆ, Ĝ ÄÆ, Ĥ Ä¥, Ä´ ĵ, Ŝ ÅÆ, Ŭ Å­.  from http://www.concinnity.se/bertilow/unikodo_vindozo/index.htm  But later I learnt that these strange results were being created by the Pagemaker I was using being out of date, and actually changing what I had put onto the Webpage!
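Pairs such as "Ä¥" and "Ä´" in the line above are what the two UTF-8 bytes of a single Esperanto letter look like when each byte is displayed as a separate Latin-1 character. The Python sketch below shows only that general mechanism, using invented function names; it does not model whatever further changes the page-making programme applied.

```python
# Hypothetical helpers for illustration only.

def utf8_read_as_latin1(text: str) -> str:
    """Encode as UTF-8, then display every byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Reverse the misreading, provided no byte was altered along the way."""
    return garbled.encode("latin-1").decode("utf-8")


print(utf8_read_as_latin1("ĥ"))  # Ä¥
print(utf8_read_as_latin1("Ĵ"))  # Ä´
print(repair("Ä¥"))              # ĥ
```

The repair works only while the byte values are intact; once an editor has replaced a character with another, as described above, the original letter can no longer be recovered automatically.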

UNICODE ™ SAMPLER (with many many languages) is at: http://www.columbia.edu/kermit/utf8.html.   In this display Explorer 4+ or 5+ ? showed many more languages than Netscape, even managing to display Hebrew (and Yiddish), although not turning them into Right to Left.

20Jul00: It seems that Explorer can automatically turn itself on to UTF-8; try View/Fonts to see that. Explorer even turns itself to UTF-8 when the webpage is set for UTF-7 !! But Navigator SEEMS not to turn it on, even though UTF-8 and UTF-7 show in View/Encoding. Netscape Navigator also could not display the Esperanto letters in TABLES yesterday. Navigator seemed unable to do so today, until in one table on this webpage it decided to display the ŭ , i.e., the u~ - jcm 3.39pm 20Jul00.


Breakaway systems

There are other variants, including some which use the W to represent the U~, X to represent C^ or S^, Q to represent some sound, and some using Y instead of J.  One breakaway system is at: http://www.nautilus.com.br/~ensjo/misc/senakcenta.htm  as follows:- [Quote begins]

Senakcente...

The two-character solutions to represent the [previously] "undisplayable" letters break in a certain way the "one sound–one letter; one letter–one sound" principle of Esperanto. If you sort alphabetically a set of words in some of these conventions, some words will be missorted (ux, standing for u~, would come before uz, which is wrong).

One could however devise another alphabet for Esperanto, using all the 26 letters available on a keyboard and nothing else:

current letter   sound                      new writing   why?
j                y in "yes"                 y             found in English
u~               w in "how"                 w             found in English
j^               s in "vision"              j             found in French, Portuguese
g^               g in "gem"                 dj            g^ = dj^; found in French, Portuguese
s^               sh in "she"                x             found in Portuguese, Catalan
c^               ch in "cheek"              tx            c^ = ts^; found in Catalan, Basque
h^               Spanish j or German ch     q             q is the only letter left; also, everywhere it is used to represent a guttural sound, a class of sounds to which h^ belongs.

dj and tx are not "one letter" each, nor digraphs: they are in fact the juxtaposition of two letters, each having its own well-defined sound.  There are no single letters to pair the "old" Esperanto g^ and c^.  [End of quote]

Notes by John Massam: (In the International Phonetic Alphabet, "t" and "sh" spoken together are taken to represent "ch" as in "choke," and "d" and "zh" spoken together represent "j" as in "joke," just as the above chart shows.)
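The chart in the quotation can be applied mechanically. The Python sketch below is only an illustration with invented names (SENAKCENTA, to_senakcenta); it lower-cases the text for simplicity and uses a single-pass translation table so that the old "j" becoming "y" does not collide with the old "ĵ" becoming "j".

```python
# Hypothetical helper for illustration only.
# Mapping taken from the chart: old letter -> new writing.
SENAKCENTA = str.maketrans({
    "j": "y",
    "ŭ": "w",
    "ĵ": "j",
    "ĝ": "dj",
    "ŝ": "x",
    "ĉ": "tx",
    "ĥ": "q",
})


def to_senakcenta(text: str) -> str:
    """Rewrite standard Esperanto in the 26-letter breakaway alphabet."""
    # translate() looks at each original character once, so the
    # replacement "j" produced from "ĵ" is never turned into "y".
    return text.lower().translate(SENAKCENTA)


print(to_senakcenta("ĉirkaŭaĵo"))  # txirkawajo
print(to_senakcenta("ĝojo"))       # djoyo
```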

The disadvantage of a system that uses OTHER LETTERS is that it does not train the eye and brain to recognise ordinary standard Esperanto, thus separating Esperantists and others from the world's Esperanto magazines and the thousands of translated and original Esperanto books.  The H and X systems retain the Zamenhof alphabet as closely as they can.
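The sorting complaint in the quotation (that "ux" would come before "uz") can be checked with a short Python sketch. The names ALPHABET, RANK and esperanto_key are invented for the example, and the two sample words are chosen only because they share the beginning "reu"; the sort key ranks each letter by its place in the 28-letter alphabet.

```python
# Hypothetical helper for illustration only.
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def esperanto_key(word: str) -> list:
    """Sort key: the alphabet position of each letter (unknown characters last)."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]


# X convention, plain sort: "x" sorts before "z", so the ŭ word comes first.
print(sorted(["reuzi", "reuxmatismo"]))                    # ['reuxmatismo', 'reuzi']

# Accented letters with the alphabet-based key: u comes before ŭ.
print(sorted(["reŭmatismo", "reuzi"], key=esperanto_key))  # ['reuzi', 'reŭmatismo']
```

The same key could be applied to X-convention or H-system text by converting it to accented letters first, so the missorting is a limitation of naive sorting rather than of the conventions themselves.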


More information about CODING ESPERANTO is at http://irvdel.webjump.com/kodoj.html. There is information about UNICODE ™ (two types) and LATIN-3.

(Also included is an illustration and explanation of a method of using the ordinary QWERTY keyboard to type the special Esperanto letters.  Evidently it takes some getting used to, but the Q produces a S^, the W a J^, the Y a G^, and the X a C^. The U~ is produced "automate," and the H^ works off the H.  The author warns: "Foje HH/Hh/hh ne estas subtenata. Metodo entajpi oŭ aŭ sŭ ktp ne estas normigita.  Rimarku la sensuperhokajn vortojn kiaj balau, pereu, reutili, praulo, posteulo, ktp."  In English: "Sometimes HH/Hh/hh is not supported. The method of typing oŭ, aŭ, sŭ, etc. is not standardised.  Note the words without the breve such as balau, pereu, reutili, praulo, posteulo, etc.")


LANGUAGES ON THE WEB is an amazing site linking to dozens of languages, and asking help in providing Internet language training. See it at http://www.languages-on-the-web.com/

Esperantaj literoj: <b>ÄÆ ÄÆ, Ĝ ÄÆ, Ĥ Ä¥, Ä´ ĵ, Ŝ ÅÆ, Ŭ Å­</b>. from Source of some site now mislaid, BUT it changed from how it appeared in the Source!!!!!

Putting the Source of that site into the Source of this via AOL:-- Esperantaj literoj: ÄÆ ÄÆ, Ĝ ÄÆ, Ĥ Ä¥, Ä´ ĵ, Ŝ ÅÆ, Ŭ Å­.  When I put this in, the appearance of FIRST TWO ITEMS, the FOURTH, and the THIRD-LAST in the very first line changed.  Instead of it appearing as it had in Bertolow's page, A umlaut ^  A umlaut 0/00, the second character changed into joined AE, the fourth changed from a square to a joined AE, and the third-last changed from a square to a joined AE - jcm 8.31pm 12Mar2000

&Auml;&#198; &Auml;&#198;, &Auml;œ &Auml;&#198;, &Auml;&#164; &Auml;&#165;,
&Auml;&#180; &Auml;&#181;, &Aring;œ &Aring;&#198;, &Aring;&#172; &Aring;&#173;. from Source of the AOL version.


A small company can transmit 31 languages, including right-to-left ones with unique scripts – why not the big firms?

ESPERANTO  inkluzivigxas!   E-POSXTOJ  EN 31  LINGVOJ: Sendu e-posxtoj en Angla, Araba, Bjelorusa, Bulgara, Cxehxa, Dana, Esperanta, Finna, Franca, Germana, Greka, Hebrea, Hispana, Hungara, Islanda, Itala, Jida, Kroata, Latva, Nederlanda, Norvega, Pola, Portugala, Romana, Rusa, Serba, Slovaka, Slovena, Sveda, Turka, kaj Ukraina lingvoj per "LingoMAIL" (ne faras kun Makoj) che http://www.lingomail.com/

ESPERANTO included!   E-MAILS  IN  31  LANGUAGES: Send e-mails in English, Arabic, Byelorussian, Bulgarian, Czech, Danish, Esperanto, Finnish, French, German, Greek, Hebrew, Spanish, Hungarian, Icelandic, Italian, Yiddish, Croatian, Latvian, Dutch, Norwegian, Polish, Portuguese, Rumanian, Russian, Serbian, Slovakian, Slovenian, Swedish, Turkish, and Ukrainian languages with LingoMAIL (won't work with Macs) at http://www.lingomail.com/



This Webpage is mainly in English
Retpagxoj en "X" aux "Iksa" stilo uzas "cx", "gx", "hx", "jx", "sx" kaj "ux" (anstataux c^, g^, h^, j^, s^, kaj u~).
Webpages in "X" or "Ikso" style use "cx" (pronounced CH as in "church"),  "gx" (as the J sound in "gem"), "hx" (pron. KKH as the last sound in "loch"), "jx" (pron. ZH as the "z" in "azure"),"sx" (pron. SH), and "ux" (W or quick OO).  Note other pronunciations: "c" = TS, "j" = Y, "aux" = OW as in "cow," "ej" "aj" "oj" rhymes with "Hey my boy!" and the five vowels are heard in "Are there three or two?"

Esperanto League of Western Australia (Inc.), Perth, WA, Australia.


   • "Keyboarding Esperanto and other accented letters with Microsoft Word and Apple Macintosh" (composed 30 Nov. 2001), <keyboard.htm>
   • "Fulmoklavoj por supersignaj literoj sen transricevo" por Mikrosofto ® Vordo © kaj Aplo Makintoŝo ™ OS X <http://www.johnm.multiline.com.au/fulmoklavoj.htm> (Sendepende eltrovis je 30-a Novembro 2001).

   Coded using AOLPress/2 ™  01 Mar 2000,  (spellings checked 06Mar00, links 07Mar00), modified to 13 Mar 00, modified at "Breakaway Systems" and at "Senakcente," removed the TV teaching segment, and several other changes 29 Mar 00, last modified using Microsoft ® WordPad © at home on Monday 17 June 2013