
All possible syllables in Mandarin — approximately 400 (compared to well over 10,000 in English).
It’s hard enough convincing my two-year-old the word sea doesn’t begin with the letter c. Imagine explaining to non-native speakers spelling oddities like -gh in both through and cough.
English is a false promise. Hypothetically, pronouncing any word is as simple as decoding its consonants and vowels; practically, decoding causes errors. When my students stumbled over words like colonel and I corrected them, I could see their eyes fill with exasperation. English was filled with random landmines; who knew where the next mispronunciation lurked?
Chinese, in contrast, was cut and dried. Characters didn’t pose any threat of mispronunciation because they offered no phonetic information to begin with. A word like 北京 contained no clues about where to place lips or tongue, no clues at all about how to vocalize it; the spoken word Beijing could only be connected to the written characters 北京 through memorization. So when the Chinese came across characters they didn’t recognize, they weren’t able to sound them out and thereby guess at their meaning; when reading aloud they couldn’t gloss over unfamiliar words, faking them with a guessed pronunciation. Reading simply stopped until they asked for or looked up the meaning.
In a few cases, the shape or components of characters gave clues. 木 meant tree; two 木 were used for forest, 林; three 木 formed the word wooded, 森. It made pictographic sense. But each was pronounced differently (mu, lin, sen), and nothing beyond rote memorization tied these sounds to their symbols. English speakers have a good shot at recognizing words unfamiliar in written form (like a baby’s onesie), but Chinese speakers needed to look them up in the dictionary.
Korean, by the way, is entirely different. Each component of a Korean character connects to a spoken sound; syllables are built by stacking consonants and vowels together. In high school a friend once gave me an impromptu lesson that in twenty minutes had me “reading” aloud line after line of a Korean calendar, though I had no clue what I was saying. We can imagine this with languages such as Latin, Swahili, perhaps even Russian, but it was utterly impossible with Chinese characters since they lacked sounds. Even the Chinese couldn’t read aloud characters not already memorized. Was it any surprise, then, their education system emphasized rote memorization? Early years devoted to tying oral language, word by painstaking word, to visual symbols tended to flavor the rest.
Mandarin had relatively few syllables – around four hundred possible sound combinations (compared to well over ten thousand in English), due mainly to a lack of consonant endings (just –n and –ng). Fewer possible syllables resulted in more homophones (words like tow and toe), which in turn offered more opportunities for puns, but also required frequent clarifications: “Is that Dear Abby’s dear, or Bambi’s deer?” I overheard such questions in almost any extended conversation.
I was mystified by Chinese dictionaries (how do you look up a drawn character?) until my students demonstrated (scans below). Each character was drawn by a number of predetermined brush strokes, including a root shape called a radical. The character 京 had the radical 亠, which was drawn by two strokes. So to find 京 in a dictionary you would first scan the list of two-stroke radicals to find 亠, note its dictionary section number and turn there, where common characters using that radical were organized, in a Mandarin-English dictionary, by sound according to the English alphabet. (I supposed Mandarin-Mandarin dictionaries were organized according to the Mandarin oral alphabet: bo po mo fo de te ne le ge ke he ji qi xi zhi chi shi ri zi ci si.)
Along with meaning, dictionaries indicated pronunciation in the form of pinyin, a Westernized phonetic system. A Chinese sentence such as “Wo shi meiguoren (I am an American)” is written in pinyin (as opposed to characters: 我是美国人). In grade school, Chinese children learned pinyin as a set of training wheels before memorizing and writing actual characters.
Chinese typewriters also mystified me, but I was disappointed to learn that most people typed in pinyin. For the sentence “Wo shi meiguoren (I am an American)” they would begin with w-o, at which point the screen would display a list of all possible characters that correspond to this syllable, the most common (“I”) ranked first. They would then press 1, making the character appear on the screen. Then they would move on to the next word: s-h-i, then press 1 again (another common word) to confirm the character intended. Though it might seem a cumbersome method, typical typing speeds were between forty and fifty words a minute. (I use the same system typing characters for this book, but my speed is closer to one character per minute!) My students were so familiar with this system that their minds associated most words with numbers: “I” was always w-o-1, “am” always s-h-i-1. Thus the most complex characters required only four or five keystrokes – comparatively short considering that even this English sentence contains words requiring eight to ten keystrokes.
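The type-then-select routine is easy to mimic in a few lines of Python. What follows is only a rough sketch, not how any real input software works: the candidate lists and their ranking are invented for illustration, and the names CANDIDATES, candidates_for and type_text are hypothetical.

```python
# Hypothetical candidate table: pinyin syllable -> characters, most common first.
# The rankings are illustrative guesses, not real frequency data.
CANDIDATES = {
    "wo": ["我", "握", "窝"],
    "shi": ["是", "时", "十", "事"],
    "mei": ["美", "没", "每"],
    "guo": ["国", "过", "果"],
    "ren": ["人", "认", "任"],
}


def candidates_for(syllable):
    """Return the numbered pop-up list shown after a syllable is typed."""
    return list(enumerate(CANDIDATES.get(syllable, []), start=1))


def type_text(keystrokes):
    """Turn (syllable, number pressed) pairs into a string of characters."""
    chosen = []
    for syllable, number in keystrokes:
        options = CANDIDATES[syllable]
        chosen.append(options[number - 1])  # the menu is numbered from 1
    return "".join(chosen)


if __name__ == "__main__":
    print(candidates_for("wo"))
    # Pressing 1 after every syllable picks the top-ranked character each time.
    print(type_text([("wo", 1), ("shi", 1), ("mei", 1), ("guo", 1), ("ren", 1)]))
```

With this made-up table, pressing 1 after each syllable yields 我是美国人, which is why a typist could come to think of “I” as w-o-1.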
Another type of keyboard existed, my students said, one that allowed even faster typing. Although I never saw it, I can describe it as they did. Used by professional typesetters, it had over a hundred keys, each assigned to a specific type of stroke, so that typists built characters stroke by stroke until the computer used a process of elimination to determine the character intended. It took a long time to master this system, they explained, but most words could be typed in two or three keystrokes, yielding typing speeds of over a hundred characters per minute.
“Overly complicated,” we English speakers think. “Give us the straight phonetics.” To be sure, writing with blocks of sound has distinct advantages. My students found crossword puzzles, word finds and Scrabble fascinating for their near-mathematization of language. And they relished the ability to read aloud long, complicated English passages without having any clue as to meaning.
Yet our language contains archaic spellings, history preserved in strange non-phonetic habits that befuddle foreigners and ourselves alike. Why three forms of there/their/they’re? All three are pronounced exactly the same; each is clear enough in context. We correct some words (catsup -> ketchup) and simplify others (doughnut -> donut); why not streamline the rest? Sĕntĕnsĕz mīt lŏŏk līk this, hwĭch wŏŏd sēm ôfəlē fŭnē ăt fûrst bŭt wŏŏd bēkŭm fəmĭlyər ēnŭf ōvər tīm – probably only a matter of months.
Aside from the ch of church, the letter c would cease to exist, its sounds covered by s and k. Sea would be sē, through would be thrōō, and cough côf. Adopting a system like English phonemic representation would cause mispronunciations to disappear — no small accomplishment since dyslexia has been tied to the arbitrariness of English spellings. (Among speakers of languages with more consistent spelling, such as Spanish and Italian, dyslexia is less problematic.)
Of course this would never take in America; wordsmiths would prohibit such sacrilege. (We can’t even adopt the metric system.) Yet the Communist Party completed just such an overhaul. Chinese characters used to be quite elaborate – if you think 龙 is daunting, try 龍 (the original version). Not every character changed, but many revisions were striking: 飞 replaced 飛, and 厅 replaced 廳. Believing a complex writing system to be both cumbersome and elitist, the Communist Party simplified many characters to rudimentary recognizable strokes, making literacy more achievable for common citizens. Such an initiative could only succeed through the kind of forced coordination between publishers, businesses and schools possible within a one-party autocracy. But they pulled it off, enabling all Chinese language learners, native and foreign alike, to reap the benefit. And Chinese wordsmiths could still appreciate traditional calligraphy to their hearts’ content (perhaps even appreciating the increased esotericism).
The lack of phonetic information in Chinese characters offered a unique benefit. Although numerous Chinese dialects existed, all were able to use the same writing system. This meant that two people unable to converse could still write notes to one another fluently. It also simplified the efforts of both past emperors and present Communist Party leaders in maintaining control over a region much more linguistically and ethnically diverse than America.

Pocket Chinese-English dictionary. How do you look up a drawn character? Follow the steps below.

Step 1: Determine the character’s root shape, or radical. For the character 京, the radical was 亠, which took two brush strokes to draw. Scan the list of two-stroke radicals (near top left) to find 亠, then find its corresponding section number (6).

Step 2: Scan the section number (6) for common characters that use the radical to determine its pronunciation. Section 6, which lists common characters that use the radical 亠, began on the previous page; 京 appears near the top left. It is pronounced “jing.”

Step 3: Characters are organized throughout the dictionary in English alphabetic order. For the character 京, pronounced “jing,” look up “j-i-n-g” to find the definition (middle of far right column).
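
The three steps can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the table names are hypothetical, each table holds a single entry, and only the radical 亠, its two strokes, section number 6 and the pronunciation “jing” come from the scans. The gloss “capital” is added for the example.

```python
# Step 1 table: stroke count -> {radical: section number}.
# Only the entry for 亠 (two strokes, section 6) comes from the scans.
RADICALS_BY_STROKES = {2: {"亠": 6}}

# Step 2 table: section number -> {character: pinyin}.
SECTIONS = {6: {"京": "jing"}}

# Step 3 table: the main body, keyed by pinyin spelling.
# A printed dictionary keeps these in alphabetical order; a dict is enough here.
ENTRIES = {"jing": {"京": "capital"}}  # illustrative gloss


def look_up(character, radical, stroke_count):
    """Follow the three lookup steps and return (section, pinyin, definition)."""
    section = RADICALS_BY_STROKES[stroke_count][radical]  # Step 1: radical list
    pinyin = SECTIONS[section][character]                 # Step 2: radical section
    definition = ENTRIES[pinyin][character]               # Step 3: alphabetical body
    return section, pinyin, definition


if __name__ == "__main__":
    print(look_up("京", "亠", 2))
```

The point of the sketch is that the reader supplies the radical and its stroke count by eye; the pronunciation only turns up in the middle of the search, never at the start.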
March 3, 2008 at 8:47 am
Excellent post. That’s certainly answered some questions I had regarding alphabetical order (albeit in Japanese, not Chinese, but hey, they stole the Kanji anyway).
March 3, 2008 at 11:51 am
Fascinating. I had never put two and two together to realize that a non-phonetic written language must have a supporting phonetic language in order to function!
When you say >10,000 “syllables” in English, you’re doing something like 5*21 vowel-consonant combinations, plus 21*5 consonant-vowel, plus 21*5*21 consonant-vowel-consonant, right?
This system:
reminded me of the paradigm behind this keyless language entry system. Note this page too — can you tell from the screenshots how similar this is to the syllable-then-select-character typing system you are familiar with, or the more sophisticated 100-key advanced system?
March 8, 2008 at 8:44 pm
Naw, I wasn’t that clever — I just parroted something I heard when I was learning to teach ESOL. I did do a quick internet search to confirm whether or not it was correct. Can’t find that page now, but I will note this page, which lists 80,000 possible English syllables. Your math equation gives us only 2,415 syllables, but it leaves out variations in vowel pronunciation plus paired consonants. For example: sc-, sh-, sk-, sl-, sm-, sn-, sp-, st-, sw- — even triples like str-, and that’s just syllable beginnings. It leaves out endings that can be as complex as -ngth. A figure as large as 80,000 must take into account words from other languages that have been assimilated into English.
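For anyone who wants to check the 2,415 figure, here is the arithmetic spelled out as a small Python sketch. It counts letters rather than sounds (5 vowel letters, 21 consonant letters), which is exactly the simplification described above; the variable names are just labels for this example.

```python
VOWELS = 5        # a, e, i, o, u
CONSONANTS = 21   # the remaining letters of the alphabet

vowel_consonant = VOWELS * CONSONANTS                        # e.g. "at"
consonant_vowel = CONSONANTS * VOWELS                        # e.g. "to"
consonant_vowel_consonant = CONSONANTS * VOWELS * CONSONANTS  # e.g. "cat"

total = vowel_consonant + consonant_vowel + consonant_vowel_consonant
print(vowel_consonant, consonant_vowel, consonant_vowel_consonant, total)
# prints: 105 105 2205 2415
```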
When my first son began to speak, I played a syllable game with him. I’d start with a consonant like B and run it against the five major vowels: BAY, BEE, BI, BO, BOO. DAY, DEE, DI, DO, DOO. He’d repeat each after me. Eventually he got so good I could just say a letter and he’d run through the vowels on his own: PAY, PEE, PI, PO, POO! (We always shout the last one.) When I first started these games with him it occurred to me that some were whole words, and just about all were parts of longer words. Sure enough, after a while he began saying, “PAY, PEE .. hey, pee-pee! MAY, MEE, MI, MO, MOO … moo like a cow!”
Incidentally — just to brag — one of the proudest moments I ever had with him was when, at just over 2 years old, he adapted our game to paired consonants without missing a beat. I wrote about it on this page, Mon 3 Sep 07:
March 8, 2008 at 8:54 pm
Very similar! I remember when you first blogged about those — at the time I was reminded of Chinese typing, too. But Chinese typing isn’t dynamic like that — you just get a little pop-up list of numbered options, and your next keystroke (a number) selects one.
I’ve seen handheld devices that use similar systems (not the dynamic, just the predictive) for English typing, mainly for text messaging. Type in enough letters on a word and the device predicts which word you want by filling in the rest — just hit the spacebar to accept it, or keep typing if it’s not the one predicted. It’s smart that we’re designing such software. Language may be infinite, but its building blocks, though incredibly numerous, are finite. In the near future we’ll probably have typing prediction customized to our own individual vocabularies.
Which makes me wonder — since you’re such a Perl scriptmeister, can you write a script that would rank the words I use the most in my writing? It would be similar to the WordPress tag cloud, or those online tallies you see sometimes of the words used most often in Bush’s speeches. I’d be curious to know what words I overuse, and which oddball words I’ve only employed once ever.
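A script along these lines might look something like the sketch below. It is written in Python rather than Perl, it is not anyone’s actual script, and it makes some simplifying assumptions: a “word” is a run of the letters a to z (with optional internal apostrophes), and capitalization is ignored. The function names are hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Count how often each word appears, ignoring case."""
    # Treat curly apostrophes like straight ones so "didn’t" stays one word.
    normalized = text.lower().replace("’", "'")
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", normalized)
    return Counter(words)


def most_used(counts, n=10):
    """Return the n most frequent words with their counts, highest first."""
    return counts.most_common(n)


def used_once(counts):
    """Return, in alphabetical order, the words that appear exactly once."""
    return sorted(word for word, count in counts.items() if count == 1)


if __name__ == "__main__":
    sample = "The cat saw the other cat; the dog didn’t."
    counts = word_counts(sample)
    print(most_used(counts, 2))   # the overused words
    print(used_once(counts))      # the oddballs
```

To run it over real writing, the sample string would be replaced with the contents of a text file, for instance by reading it with open(...).read().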