That “certain cut”: towards a characterology of Mandarin Chinese

In his book Language: An Introduction to the Study of Speech, published in 1921, Edward Sapir wrote that ‘ … there is such a thing as a basic plan, a certain cut, to each language’. Linguists of the Prague school referred to the “characterology” of language; for example Vilém Mathesius, in his article ‘On linguistic characterology with illustrations from Modern English’ written in 1928, wrote that ‘… linguistic characterology deals only with the important and fundamental features of a given language at a given point in time’, noting as an example that ‘… Modern English shows a characteristic tendency for the thematic conception of the subject’. It is a concept that proves extremely difficult to make explicit; whatever it is that constitutes the unity of all the components of the picture seems elusive. Can we give any overall characterisation of that ‘certain cut’, or is it simply ineffable?

In a well-known passage from his book Language: An Introduction to the Study of Speech, published in 1921, Sapir wrote that ‘… there is such a thing as a basic plan, a certain cut, to each language’ (Sapir 1921). A similar notion was expressed by linguists of the Prague school, under the name of “characterology”; notably Vilém Mathesius, in his 1928 article ‘On linguistic characterology with illustrations from Modern English’ (Mathesius 1964). Many people will share the feeling that, in some undefined sense, every language is unique. Some will even insist that their own language is “more unique” than any other; but there is no known way of measuring uniqueness, and in any case this claim has been made about too many languages to be taken seriously. But the idea of the “certain cut” is very appealing; it is, so to speak, the limiting case of a typological grouping–ultimately, every language is the only exemplar of its particular type. The challenge is to make this certain cut explicit: can we identify it, or is it simply ineffable?

Mathesius put it like this: “ … linguistic characterology deals only with the important and fundamental features of a given language at a given point in time, analyses them on the basis of general linguistics and tries to ascertain relations between them” [1964: 59]. He gave as an example the fact that “ … Modern English shows a characteristic tendency for the thematical conception of the subject” [ibid: 61] – that is, for mapping the Subject on to the Theme of a clause rather than on to the Actor. Subsequently Mathesius was able to relate this to other changes that were taking place in the language in its evolution from late Middle to Early Modern English.a

I shall try to apply this notion of a “certain cut” to Modern Chinese; and specifically to Modern Mandarin, because some of the features I want to talk about are specific to Mandarin, and not found in all other forms of Chinese (other dialects, or other Sinitic languages, whichever way you wish to look at these). Mandarin is, obviously, a “world language”: it has more native speakers than English and Spanish combined, and – more importantly – it is now being widely taught as a foreign language in institutions around the globe. It is helpful for foreign learners to have some idea of what Mandarin is like, and how it resembles or differs from other well-known languages.

I am, of course, far from being a native speaker. I heard my first clause in Mandarin just after my seventeenth birthday; I became fluent and was a teacher of the language to foreign students for the early part of my career, though now, sadly, I have lost much of my earlier fluency. But I do not apologise for writing about the language as a foreigner. It has been clear for a long time that these two perspectives, that of the native speaker and that of the foreign linguist, when taken together are complementary to each other, and give a more rounded, dimensional picture of a language than either just taken by itself. (Mathesius was not a native speaker of English.) And to be aware of that certain cut, it may actually help if you are a foreigner.

For a start, we could describe Mandarin as a fairly typical East Asian language, part of–perhaps at one end of–a continuum formed, in terms of major languages, by Mandarin, Wu, Hokkien, Cantonese, Vietnamese, Khmer (Cambodian), perhaps Thai, and Malay. These languages have invariant word forms, without morphological variation; they have a constant syllabic structure in the morpheme, generally monosyllabic but disyllabic in Malay; and they have a fixed order of modification, the modifier preceding the modified throughout Sinitic, the other way round in Vietnamese and further south. In representing time, all these languages share a general preference for locating the process by aspect rather than by tense. Aspect is the contrast between latent or ongoing (grammaticalised as imperfective) and actualized or complete (grammaticalised as perfective); tense is deictic time, past, present or future by reference to the here-&-now. In these languages aspect is grammaticalised, while deictic time is unspecified, or realised lexically. Like all such broad generalisations in “areal linguistics” (the comparative study of languages within a given region), this one begs a number of relevant questions; but it will serve as a starting point for the present discussion.

So let me identify certain features of Mandarin Chinese which might form part of a character sketch of the language. It will not matter exactly where we start, because we are not looking for any kind of a causal chain. I shall come back to this point later; but it is important to introduce it here, as we begin. It will often be possible to select two features which, taken on their own, could be thought of as one “causing” the other. This can be a useful device for someone learning the language; but it is misleading. Such features may well be related to each other; but not as cause-&-effect. Rather, they are related elements in a network of interrelations whereby the language functions as a whole (as does every language) to construe human experience and to enact human relationships. Our task is to interpret these patterns within the context of the overall system of the language.

It may be helpful first to enumerate the features that I shall be referring to.

(1) “Chinese is a monosyllabic language”: there is a regular correspondence between morpheme and syllable.

(2) The number of distinct syllables is limited: they form a set which can be exhaustively enumerated.

(3) Morpheme boundaries are clear: we know where each morpheme begins and ends. Word boundaries are unclear.

(4) Morphemes are not assigned to syntactic classes. Words are–including words consisting of one morpheme.

(5) Words normally consist of one, two, three or four morphemes. Most verbs are monomorphemic; polymorphemic words tend to be nouns.

(6) Polymorphemic nouns (“noun compounds”) are typically constructed on the principle of strict taxonomy: ax, bx, cx are kinds of x.

(7) The syllable is structured prosodically rather than phonemically: it “consists of” an initial state, a final state, and a trajectory from one to the other.

(8) The syllable can be exhaustively described in a network of about thirteen systems; any one syllable selects in some subset of these.

(9) When Mandarin “borrows” expressions from other languages, it matches them to syllables (not to phonemes; cf. (7) above).

(10) Mandarin “borrows” on the content plane (by “calquing”) rather than on the plane of expression.

(11) There is a clear semantic relationship among the parts (morphemes) of a compound word.

(12) Morphemes retain their identity over the course of time; they do not diverge.

(13) The syntactic classification of words (primarily into verbs, nouns and others) is fairly strict.

(14) Location in space and time proceeds from distant (broader focus, less delicate) to close-up (narrower focus, more delicate).

(15) Modification in the nominal group proceeds from most deictic, least permanent properties to most permanent, least deictic.

(16) Ordering of elements in the clause follows the thematic principle. Selection in the system of MOOD is not thematised.

(17) Minor processes are analysed into relation plus facet; the former is construed by a verb, the latter is construed by a noun.

(1) “Chinese is a monosyllabic language”: this is often dismissed as a myth, but, as Y.R. Chao remarked, it is one of the truest myths in circulation. It just needs spelling out properly. Mandarin has innumerable polysyllabic words; but the morphemes of which they are made up are all monosyllables. Some early loanwords, known already in Old Chinese, were disyllabic in their original language; they were reinterpreted in Chinese as consisting of two morphemes. Throughout the known (reconstructable) history of the language there has always been a cross-stratal match, one morpheme being realized phonologically as one syllable.

(2) The number of distinct syllables is limited, and there is a high degree of agreement among different speakers. The number varies considerably among the different Mandarin dialects; but in the dialect of Beijing, which is the basis of standard Mandarin (pǔtōnghuà, the ‘common tongue’), the number of distinct syllables is 400 ± 2. There are four distinct syllabic tones; if tonal distinctions are taken into account, the number rises to about 1,150 (i.e. 1,600 minus about 450 combinations which do not occur). In other Mandarin dialects the number of syllables ranges between (I think) about 300 and 450.

In its citation form, every syllable is tonal. But in connected discourse only the salient syllables are tonal.
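The arithmetic behind the figures quoted in (2) can be sketched as follows. This is only an illustration of the calculation; the constants are the approximate values given in the text, and the function name is a hypothetical one introduced here.

```python
# Approximate figures quoted in the text for the Beijing dialect.
BASE_SYLLABLES = 400   # distinct syllables, tone disregarded (text: 400 +/- 2)
TONES = 4              # distinct syllabic tones
NON_OCCURRING = 450    # approximate number of syllable-tone combinations that do not occur


def tonal_inventory(base: int, tones: int, gaps: int) -> int:
    """Theoretical maximum (base x tones) minus the combinations that do not occur."""
    return base * tones - gaps


theoretical_max = BASE_SYLLABLES * TONES
attested = tonal_inventory(BASE_SYLLABLES, TONES, NON_OCCURRING)
print(theoretical_max, attested)
```

With these approximate inputs the theoretical maximum is 1,600 and the attested inventory about 1,150, matching the figures in the text.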

(3) The morpheme/syllable complex (that is, the complex element formed out of one (lexicogrammatical) morpheme and one (phonological) syllable) is clearly bounded syntagmatically: other than some fusions (mainly with the nominalising suffix ér, especially in Beijing where, e.g., mén ‘gate, door’ becomes mér), we know exactly where each one begins and ends. But while morpheme boundaries are clearcut, word boundaries are not. The uncertainty is not apparent in writing, because each morpheme is written with one character and word boundaries are simply ignored; but it becomes very obvious when Mandarin is written in Hanyu Pinyin, the official alphabetic transcription. Here there is no clear guidance on where a word should begin and end, and there seem to be two competing practices: one with shorter, English-style written words and one with long, German-style written words, the latter being favoured particularly in the labelling of industrial products.

We might compare here Vietnamese, a language that is typologically very similar to Chinese, and was written with Chinese characters until the time of the French colonial administration. When the charactery was replaced by an alphabetic script, each morpheme continued to be written with a space on either side, with no indication of which morphemes combined together to make a word. This practice was not adopted in spelling Mandarin, where the word has more of a phonological presence than it has in Vietnamese (or in Cantonese). In Mandarin, some sense of where words begin and end in a spoken text can be found in the intonation and rhythm; but that still leaves considerable uncertainty.b

(4) The morpheme, as such, is not assigned to any syntactic class. The word is. Words fall rather clearly into major syntactic classes (cf. (13) below); this is one reason for asserting that there is such a thing as a “word” in Chinese grammar, which has sometimes been called into question.

If a given morpheme also functions as a word, then of course as a word it does fall into a syntactic class. Many words do consist of only one morpheme. But not all morphemes occur as words on their own. This point is often misunderstood, because in Old Chinese there were many more monomorphemic words; since their modern descendants are written with the same characters, it is easy to forget that they no longer occur as words in the modern language.

(5) In Mandarin, words are normally made up of one, two, three or four morphemes. These may include a grammatical morpheme such as an aspect marker. Leaving those aside, there is some correlation between the length of a word, in terms of morphemes, and its grammatical class.

I am considering here just the open classes, lexical verbs and lexical nouns; for other classes of verb and noun, see (17) below. In everyday language, the majority of verbs are monomorphemic, whereas words of two or more morphemes tend to be nouns. This is less observable in modern technical and scientific registers, and also in “officialese”, because these are characterised by a high degree of grammatical metaphor. Processes (actions and events) offer less scope for being construed into taxonomies than do things (entities) (cf. next point).

(6) Noun compounds (i.e. polymorphemic nouns) tend to be organized in strict taxonomies: given a Head noun x and Modifiers a, b, c, then ax, bx and cx will all be kinds of x. Metaphoric compounds, those where ax is not a kind of x, are relatively uncommon (though they may be created in the course of lexical borrowing).

For comparison with English: Mandarin noun compounds strongly favour the pattern of carthorse, racehorse, rather than that of clotheshorse; or mailboat, tugboat, rather than sauceboat. A sauceboat is not a kind of boat; a clotheshorse is not a kind of horse. The distinction is not totally clearcut, of course: a carthorse clearly is a kind of horse, while a clotheshorse equally clearly is not; but in between the two we find hobbyhorse, in its original sense of a bicycle without pedals, and rockinghorse, a child’s wooden rocker shaped to look like a horse. In a strict taxonomy neither of these is a kind of horse as horse is prototypically defined; they are accepted as hyponyms on grounds of functional or formal similarity. In Chinese the taxonomic relation is more strictly maintained. Not that such metaphoric compounds are never found – they are; but they are rare, since they depart from the scope of semantic relationship that accords with the Chinese sense of ‘modifying’ a noun.c
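The contrast drawn in (6) between taxonomic and metaphoric compounds can be set out as a small lookup. This is a minimal sketch: the judgements are simply those stated in the text for the English examples, and the label "intermediate" (for hobbyhorse and rockinghorse) and the function names are assumptions introduced for illustration.

```python
# Each compound maps to (head noun x, is it a kind of x?).
# True  = a kind of x (taxonomic)
# False = not a kind of x (metaphoric)
# None  = in-between case, accepted as a hyponym only by functional or formal similarity
COMPOUNDS = {
    "carthorse": ("horse", True),
    "racehorse": ("horse", True),
    "clotheshorse": ("horse", False),
    "hobbyhorse": ("horse", None),
    "rockinghorse": ("horse", None),
    "mailboat": ("boat", True),
    "tugboat": ("boat", True),
    "sauceboat": ("boat", False),
}


def classify(word: str) -> str:
    """Return the compound type recorded for a word in the table above."""
    _head, is_kind = COMPOUNDS[word]
    if is_kind is True:
        return "taxonomic"
    if is_kind is False:
        return "metaphoric"
    return "intermediate"  # hypothetical label for the in-between cases


def by_head(head: str) -> dict:
    """Group the classified compounds that share a given head noun."""
    return {w: classify(w) for w, (h, _) in COMPOUNDS.items() if h == head}


print(by_head("horse"))
print(by_head("boat"))
```

The point of the sketch is only that the Mandarin pattern strongly favours entries of the first kind; it makes no claim about how such judgements would be derived automatically.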

(7) The syllable of Mandarin is structured prosodically rather than phonemically; it contains no segments smaller than itself. (It can of course be analysed “as if” phonemic; but such a description is both more complex and less predictive and explanatory.) It has a starting point and a finishing point, and the articulation makes a trajectory between them. Following the Chinese tradition, we can describe it as a structure consisting of Onset + Rhyme, with the trajectory included as a function of the Rhyme. The next section sets this out in detail, accompanied by an illustrative example.

(8) The entire syllabary is set out in Table 1, represented alphabetically in Hanyu Pinyin. Figure 1 is the system network which specifies just this inventory of syllables (to avoid clutter, I have left out the four-term system of TONE). The terms “initial” and “final” are used in the network as alternatives to “Onset” and “Rhyme”.

Table 1 Mandarin Chinese syllabary (in Pinyin spelling)

Figure 1 Network specifying total Mandarin (Pekingese) syllabary.

Table 2 gives a description of one particular syllable by showing the features in respect of which it contrasts with all others in the syllabary:

Table 2 A syllable defined by its contrasting features

The syllable qiān is [initial state] POSTURE: y-prosodic; MANNER: affricated; VOICE ONSET: late; [final state] POSTURE: y-prosodic; RESONANCE: nasal; APERTURE: open; TONE: high level. I have left out one feature, which represents a complex aspect of Mandarin phonology: qiān also selects “palatal” in the initial system of ALIGNMENT (PLACE): it is palatal, in contrast to chān (cerebral, or retroflex) and cān (dental). It thus selects the “y-prosody” three times over, which explains the quality of the vowel in the transition from initial state to final state. For a fuller account, see Halliday 2005: chapters 5, 6.

This example illustrates the way the phonology of Mandarin works. The entire syllabary can be specified as a network of systems whose features are essentially prosodic: they are paradigmatically distinct, but syntagmatically may be quite indeterminate. A syllable selecting “final RESONANCE: nasal” will end with a prosody of nasality; it doesn’t matter where the nasality starts, or whether or not it ends in closure – nasal consonant or nasal vowel or both. One effect of this is that the quality of the vowel, in the trajectory from initial state to final state, is strongly coloured by the initial and final prosodies (in contrast to English, where the consonants are coloured by the quality of the intervening vowel). Altogether, the Mandarin system is highly patterned, with remarkably few loose ends.
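The description of qiān as a bundle of selected features can be written out as a simple mapping from systems to features. This is a sketch under stated assumptions: the system and feature names are those given in the text, the grouping key "syllable" for TONE is introduced here, and counting "palatal" as one of the y-prosody selections follows the statement that the syllable selects the y-prosody three times over.

```python
# Features of the syllable qian (high level tone), keyed by (state, system).
QIAN = {
    ("initial", "POSTURE"): "y-prosodic",
    ("initial", "MANNER"): "affricated",
    ("initial", "VOICE ONSET"): "late",
    ("initial", "ALIGNMENT"): "palatal",
    ("final", "POSTURE"): "y-prosodic",
    ("final", "RESONANCE"): "nasal",
    ("final", "APERTURE"): "open",
    ("syllable", "TONE"): "high level",
}

# Assumption: these feature values all count as selections of the y-prosody.
Y_PROSODY_FEATURES = {"y-prosodic", "palatal"}


def count_y_prosody(syllable: dict) -> int:
    """Count how many of the syllable's selections realise the y-prosody."""
    return sum(1 for feature in syllable.values() if feature in Y_PROSODY_FEATURES)


def systems_selected(syllable: dict) -> list:
    """List the systems in which this syllable makes a selection."""
    return sorted(system for (_state, system) in syllable)


print(count_y_prosody(QIAN))
print(systems_selected(QIAN))
```

A mapping of this kind records only paradigmatic contrasts; it deliberately says nothing about where in the syllable a prosody such as nasality begins or ends.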

(9) This means that when foreign words are incorporated into the language (“borrowed”, in the original sense of taking over the sounds), the unit to which they are assimilated is not the phoneme–because there isn’t one–but the syllable. To this we may further relate the fact that when Mandarin does accept items from other languages it favours “calquing”, or loan translation: the “borrowing” takes place on the content plane rather than on the plane of expression.

For these two different ways of “borrowing” we may compare Mandarin with Japanese. In Japanese, a giraffe is called ziraafu; in Mandarin it is chángjǐnglù (‘long neck deer’). A typewriter is (or was, when such things existed) in Japanese taipuraitaa; in Mandarin, dǎzìjī (‘strike character machine’). The first type is clearly a form of borrowing; there is no way that Japanese ziraafu could have evolved except by derivation from (presumably) English (which had in turn borrowed the word from Arabic). But the second type is simply following the normal pattern of word formation in Chinese, whereby a compass is a zhǐnánzhēn (‘point south needle’), a chimney is a yāntǒng (‘smoke pipe’) and so on. Chinese tiělù ‘railway’ could be a calque on German Eisenbahn, or zìrán kēxué on English natural science; but Chinese diànhuà ‘telephone’ is not ‘distant speech’, it is ‘electric speech’ (or, in the earlier sense of diàn, ‘lightning speech’); a mobile phone, or cellphone, is shǒutí diànhuà ‘hand-held telephone’; and there is no plausible source language in which a giraffe is made up of ‘long + neck + deer’.

Mandarin does, of course, “borrow” sounds, in rendering foreign proper names, mainly personal and geo-political. These are accommodated into the Mandarin syllable structure. A few names assigned earlier to well-known countries have a positive spin, like Germany Déguó ‘land of virtue’, France Fǎguó ‘land of law’, England Yīngguó ‘land of heroes’, U.S.A. (America) Měiguó ‘land of beauty’; but most proper names clearly proclaim their foreignness. Likewise with personal names: foreigners familiar in China may be given names that could be the name of a Chinese man or woman, but otherwise they stand out, either by their length, like the Russian name Aleksandrovskaya ā + liè + kē + sān + dé + luó + fū + sì + jiā + yǎ, or else because the first syllable is not known as a Chinese surname (and subsequent syllables might not be acceptable as a personal name). All such borrowings are written in the charactery, which means that they can be read aloud in any dialect, Mandarin or non-Mandarin. Hence there may be little or no resemblance to the sound of the original. The British prime minister Churchill became in Mandarin qiū + jí + ěr, which is recognizably similar to the English; but in Cantonese he was yao + gat + yi – no concession is made to speakers of other kinds of Chinese!

The Chinese language is of a type that tends to resist phonological borrowing. There are early loanwords from Buddhism, but they remain associated with the esoteric; when modern loanwords were introduced, like àidìměidùn ‘ultimatum’ or démókèlàxī ‘democracy’, they were soon discarded and replaced by Chinese terms. Borrowed words contravene two principles of the language: one, that each syllable is a minimal lexicogrammatical unit (a morpheme); two, that the parts of a compound word define each other by mutual selection in some identifiable semantic relationship, most typically one of hyponymy like the long neck deer. Democracy is construed as mín + zhǔ + zhǔyì ‘people power principle’; principle, in turn, is zhǔ + yì ‘master idea’.

(10) Related to the last point: the days of the week, and the months of the year, in the international calendar as used today, are numbered, not named. Monday to Saturday are ‘week-1’, ‘week-2’ to ‘week-6’ (Sunday is outside the system, being made up of ‘week + day’); January to December are ‘1-month’, ‘2-month’ to ‘12-month’. This form of naming distinguishes them from the traditional Chinese calendar, while avoiding the need to “borrow” foreign names or to coin new terms in Chinese. (They could have been mapped into some semantic sequence, such as the names of the planets; but the planets are already compounds formed from the general word xīng ‘heavenly body’ – and there would be no natural connection between the two parts).
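The numbering principle described in (10) can be shown by generating the labels rather than listing them. This is a small sketch that uses only the English glosses given in the text; the function names are hypothetical.

```python
def weekday_labels() -> list:
    """Monday to Saturday are numbered 1 to 6; Sunday stands outside the numbering."""
    numbered = [f"week-{n}" for n in range(1, 7)]
    return numbered + ["week + day"]


def month_labels() -> list:
    """January to December are numbered 1 to 12."""
    return [f"{n}-month" for n in range(1, 13)]


print(weekday_labels())
print(month_labels())
```

That the whole set follows from two numeral ranges and a single exception is the point: no names need to be borrowed or coined.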

(11) Morphemes in Mandarin, and in Chinese in general, retain their identity over time; they do not split up into phonologically distinct forms. This is in contrast to English and other IndoEuropean languages, where they do. In English, the words bread, breed and brood, and probably also bird, are all variants of the same etymon (descendants of the same earlier form), although not identified as such by speakers of the language today. Many English place names end in -ingham, like Birmingham, Nottingham, Gillingham, Lastingham; these were originally distinct morphemes, ing meaning ‘household, dependents’ and ham meaning ‘home’; this -ham is in fact the same word as home, but again English speakers are not normally aware of this. Mandarin, likewise, has many place names ending in -jiāzhuāng; but it is quite clear to any speaker of the language that Lǐjiāzhuāng means the homestead (zhuāng) of the family (jiā) of somebody surnamed Lǐ.

For this reason – though unfortunately, for the user – most Chinese dictionaries today are still arranged according to the first morpheme, even when they are ordered alphabetically, in Pinyin transcription. Since there may be anything up to forty different morphemes all spelt alike, it can take a long time to track down an unknown word – especially when the tone marks (accents) are omitted, as they usually are.

(12) Any given morpheme, taken by itself, may have accumulated a great range of different meanings over the period since it was first recorded, perhaps about 3,000 years ago (it was at about 1,000 B.C. that the charactery had reached the stage where every morpheme had its written form). In the normal course of change in a language, many of the earlier meanings would be no longer current; and they would not figure in a dictionary of the modern language: a dictionary of Modern French, for example, will not include the meanings of the Latin words from which the French words are descended. In Chinese, on the other hand, the earlier meanings are in some sense still around; there are two reasons for this, one to do with the language, the other to do with the script. The first one is that classical Chinese (based on the language as it was about two thousand years ago) continued to be used as the norm for written texts up until the early twentieth century (whereas Latin had given way to French well before that time). The other reason is that the writing system maintains an illusion of continuity: the morphemes that have survived from the time of Old Chinese, though now pronounced quite differently, are still written with the same charactery (the forms of the characters have changed, but their identity is clearly preserved), so there is no clear separation between the classical and the modern language. Now that the great majority of adult speakers of Mandarin are literate, even if they haven’t studied classical Chinese they are likely to have some unconscious awareness of the historical depth of the language.

(13) As mentioned in (3) above, there is a significant difference in Mandarin between the morpheme and the word. Morphemes are not distributed into classes; words are. For example, the morpheme péng in the noun péngyǒu ‘friend’ belongs to no syntactic class; nor does the huān in the verb xǐhuān ‘to like’, or the morpheme yù in the verb yùbèi ‘to prepare’. Of course, if the same morpheme also functions as a word, then as a word it does fall into a syntactic class; but that does not restrict the class of compound word in which it may occur as a constituent. Thus rénlèi ‘the human species’ is a noun, and its two components rén ‘man (human being)’ and lèi ‘class’ are both (as words) nouns; suàn (as a word) is a verb ‘to reckon’, and it occurs as a component both of the verb jìsuàn ‘to calculate, compute’ and of the noun suànpán ‘abacus’.

The syntactic classification of words is fairly strict; among the three major classes (verbs, nouns and others) there are constraints on transcategorising, as can be illustrated by some comparisons with English. First, verbs into nouns (see Table 3):

Table 3 English: verbs into nouns

Then nouns into verbs (see Table 4):

Table 4 English: nouns into verbs

The rejected examples would I think be perfectly intelligible; they are just grammatically wrong. I am not suggesting, of course, that English has no comparable constraints; it has. There is probably about the same degree of syntactic flexibility in both these languages.d

Let me now go back over some of these thirteen points, and present them as if they formed a chain of causal relationships, starting with the Mandarin syllabary. (i) Mandarin has rather few distinct syllables: fewer than most other modern forms of Chinese, and considerably fewer than at earlier stages in its own evolution. (ii) “Because” there are fewer distinct syllables, words have got longer – if there are fewer distinct morphemes, you need more of them (there is syntagmatic compensation for paradigmatic constraint). We know from a study cited by Y.R. Chao (1972) that there is a consistent correlation whereby, given a particular passage of text rendered in different dialects, the fewer the number of distinct syllables in the dialect, the longer the version of the text. (iii) “Because” there is a great deal of homonymy among different morphemes, compound nouns tend to be formed on a strictly taxonomic principle: they are congruent rather than metaphoric (it takes more energy to decode metaphoric compounds).

We might even take this kind of reasoning further still. (iv) “Because” there are fewer syllables than there used to be, morphemes have retained their phonological integrity: any tendency to diverge morphologically has been counteracted by the tendency to converge phonologically.e (v) “Because” there is no morphological variation, words are assigned to clear syntactic classes, and their function in the clause is given by their place in sequence (in other words, experiential meaning is realised as the order of clausal elements, as in English: gǒu yǎo rén ‘dog bites man’, rén yǎo gǒu ‘man bites dog’).

But notice that this whole chain of reasoning could be reversed, as a story not of imposing constraints but of relaxing them. We could have said, picking out the main points, “because” syntactic classes are clearcut and the order of the elements of clause structure signals experiential meaning, there is no need for words to carry any morphological marking; and “because” words have got longer, and compounding is rather strictly taxonomic, there is no need to maintain an inventory of so many distinct syllables.

In other words, what we have are syndromes of coherent features, such that given any pair, either member can be said to be the cause of the other. And that shows that we are seeking the wrong kind of explanation. These are not material systems, they are semiotic systems; and such systems do not work as cause-&-effect. Their component parts are related not by causation but by realisation; they work as value-&-token. Neither of these can be pointed up as the cause of the other.

There are a number of other features that we might consider, to see if they seem to form part of a putative characterology of modern Mandarin. I will continue with the same numbering; they are not in any clearly designed sequence, but they are part of a general progression from the plane of expression to the plane of content. The next few points will relate more to the lexicogrammar.

(14) Location in time and space, including institutional space, proceeds from distant to close-up; there is always a move towards greater detail, increasing delicacy of focus. Dates are given in the order year–month–date–hour (time of day); addresses go from country to province to township to suburban district to street to apartment block to apartment number and finally to the identity of the addressee. Personal names are ordered as surname followed by “given” name followed by title. (It is ironic that just as the rest of the world was learning that in Chinese the surname comes first, many Chinese decided to accommodate to the “western” ordering and started turning their own names around, causing considerable confusion to publishers of journals and news media, together with numerous unsuspecting readers throughout the globe).

If you are giving a direction, in English, to someone who is fetching some item for you from its place, you are likely to say (for example) that it is in the left-hand corner of the bottom drawer of the cupboard next to the window in the bedroom at the back of the upstairs floor. In Mandarin you would start at the other end (and probably omit the locative expressions ‘in, at, on’).
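The "distant to close-up" ordering in (14) amounts to reversing the English sequence of locative steps and writing dates from the largest unit to the smallest. The sketch below illustrates this; the segmentation of the English example into five steps and the sample date are assumptions made for illustration, and the function names are hypothetical.

```python
from datetime import datetime

# The English direction from the text, segmented (by assumption) into steps
# running from the closest, most specific location to the most distant.
ENGLISH_ORDER = [
    "left-hand corner",
    "bottom drawer",
    "cupboard next to the window",
    "bedroom at the back",
    "upstairs floor",
]


def distant_to_close(close_to_distant: list) -> list:
    """Reverse a close-to-distant sequence so that it starts from the broadest location."""
    return list(reversed(close_to_distant))


def date_distant_to_close(moment: datetime) -> str:
    """Format a date in the order year, month, day, hour."""
    return moment.strftime("%Y-%m-%d %H")


print(distant_to_close(ENGLISH_ORDER))
print(date_distant_to_close(datetime(2005, 3, 9, 14)))  # arbitrary sample date
```

The same reversal applies to addresses, which in the Mandarin ordering run from country down to the addressee.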

(15) I referred in (6) above to the structure of compound nouns, pointing out that they were typically organized on the principle of strict taxonomy. Thus, every wheeled vehicle is a subtype of chē, every fish is a subtype of yú, every mechanical appliance is a subtype of jī, and so on. The Head noun occurs on its own (or in a further compound, usually of a different type) as the general member of the set: chē ‘wheeled vehicle’, yú ‘fish’, jīqì ‘machine’.f

All such modifying elements precede the Head noun. This is typically expressed as the principle that in Chinese “the modifier precedes the modified”. Interestingly, the same principle applies in Japanese, which is typologically a very different language from Chinese; whereas in Vietnamese, which is very similar to Chinese typologically, the principle is the opposite: the modifier follows the modified; and this principle is maintained, I think, in all the major languages from Hanoi to Singapore.

The sequence of elements within the modifier is very similar to that of English: ‘my aunt’s two most valuable Assyrian gold bracelets’ would come out in very much the same order in Chinese. Where the Modifier is a string of nouns, as in the names of institutions, and in scientific and technical terminology, the pattern is still the same; e.g. tiělù guǐdào jiǎncháchē ‘railway track inspection car’. In both languages, the nominal group is structured as Deictic + Numerative + Epithet + Classifier + Thing, as in nà liǎngge jiùshí-de zhédiéshì zhàoxiàngjī ‘those two antiquated folding cameras’.

Is there any relationship between features (14) and (15)? If we take English and Vietnamese as our two points of reference, then Chinese is like Vietnamese in (14) but like English in (15). It might be that the two features are simply unrelated; it is not to be expected that all the features of any one language will fit together like the pieces of a jigsaw puzzle. There are far too many intersecting dimensions for that to be possible.g But once we have put it like that, we have recognized an important point: that the question is not that of whether or not two particular patterns are connected – it can probably always be shown that, ultimately, they are. The question is, rather, which features are selected by the language as vectorial in the overall management of meaning. Everything is like everything else in some one way or another; among all the possible ways of being alike, which are the ones that matter in this language? That “certain cut” is the product of the interaction among those particular strands in the meaning potential (and also the “sounding potential”) that determine what goes with what to make the overall pattern – rather like the underlying motifs in a complex piece of woven fabric, or the patterning in a Persian carpet.

If we consider just (14) and (15), there would seem to be more consistency in the English and in the Vietnamese pattern than in that of Chinese. In (14), Vietnamese goes from a broad view with distant focus to a narrow view with close-up focus, while in (15) it goes from class to subclass to individual; and each of these can be seen as a progression from the general to the particular. The pattern in English is the same but in reverse: the direction is from most particular to most general. In Chinese, the two seem not to match.

And that may be the end of the story. But we might look at the ordering of the elements within the nominal group. In many languages the nominal groups are consistent in the way they are arranged in a sequence outward from the Thing: if the modifier precedes, then a typical sequence is Deictic + Numerative + Epithet + Classifier + Thing; if the modifier follows, then the sequence is reversed: Thing + Classifier + Epithet + Numerative + Deictic. With the latter, there is a counterpressure whereby the Deictic adheres closely to the Thing; here the principle of “general to particular” is overridden by another one whereby the Deictic element, that which locates the entity in its discursive context, tends to come at the beginning. With a premodifying language, such as Chinese, the two tendencies coincide: either way, the Deictic comes first.
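The mirror-image ordering just described can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary representation and the `premodifying` flag are assumptions introduced here, and the sketch deliberately ignores the counterpressure that draws the Deictic towards the beginning in postmodifying languages.

```python
# Functional slots of the nominal group, listed in premodifying order
# (the order given in the text for Chinese and English).
PREMODIFYING_ORDER = ["Deictic", "Numerative", "Epithet", "Classifier", "Thing"]


def order_nominal_group(elements, premodifying=True):
    """Return the words of a nominal group in sequence.

    `elements` maps a functional slot name to the word filling it.
    Slots that are not filled are simply skipped.
    """
    if premodifying:
        order = PREMODIFYING_ORDER
    else:
        # Hypothetical "pure" postmodifying sequence: the exact mirror image.
        order = list(reversed(PREMODIFYING_ORDER))
    return [elements[slot] for slot in order if slot in elements]


# The example from the text: 'those two antiquated folding cameras'.
cameras = {
    "Deictic": "nà",
    "Numerative": "liǎngge",
    "Epithet": "jiùshí-de",
    "Classifier": "zhédiéshì",
    "Thing": "zhàoxiàngjī",
}

mandarin_sequence = order_nominal_group(cameras)
mirrored_sequence = order_nominal_group(cameras, premodifying=False)
```

Here `mandarin_sequence` reproduces the attested Mandarin order, while `mirrored_sequence` is a purely schematic reversal with the Thing first; it is not a claim about the word order of any actual language.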

There are two motifs at work in the ordering of elements within the modifier: one is constructing the identity of the Thing; the other is specifying its attributes. By and large, the more permanent the attribute of some thing, the less that attribute contributes to its identification. In Chinese, as in English, the nominal group begins by identifying, giving the location of the thing in the context of the speech event: ‘this, that, the, my, your, any, some, all &c.’ It then proceeds through a chain of attributes which have less and less identifying potential but which, by the same token, are more and more permanent in their assignment. In postmodifying languages the progression is the same, but the movement is in the opposite direction.

Whatever the relation between them, each of the patterns displayed in (14) and (15) is consistent within itself; and any such regularity saves cognitive energy, since repeated patterns require less brain power both to produce and to understand. The question we are asking is: is there any way in which they conform to each other? We might think of them as each being a move towards the concrete. In (15), the features become more and more “thingified”, increasingly part of the entity’s inherent state; in (14), the location in space-time becomes more and more exact. Seen in this light the progression appears as textual rather than experiential, moving from a starting point that is thematic and given, to an endpoint that is maximally “loaded” with information (a complex of new and rhematic, like Fries’ “N-rheme” (Fries 1995)). This would then tie in with another feature, to be considered next.

(16) Mandarin displays the principle of the “textual” organization of the clause as a thematic progression, whereby the first element is typically functioning as Theme (Fang, McDonald & Cheng, 1995) and there is a strong tendency for the New to appear at the end, as noted many years ago by Y.R. Chao (1948).

Up to that level of delicacy, the Mandarin clause is similar in its textual structure to that of English. Beyond that, differences appear: for example, in Mandarin a lexical item that is Given does not lose its newsworthiness (as shown by phonological prominence) when it occurs in clause-final position, as it does in English. More generally, in English the clausal Theme is strongly tied to the system of MOOD: the choice of Theme (that is, the way it is mapped on to other elements) signals the mood of the clause, as declarative, yes/no interrogative or Wh-interrogative. This is not so in Mandarin, where there is no correlation between mood and theme.

Question elements do not come at the beginning of the clause – they stay in their place, so to speak, in accordance with the transitivity structure; and there is no contrast of ordering realising the choice between declarative and yes/no interrogative – in other words, no inversion like that of English Subject and Finite (and therefore no requirement that such elements should be made explicit).

(17) Minor processes – the prototypically locative expressions that are construed as circumstantial to the process of the clause – are factored out into two components: (i) how the process relates to the entity, ‘to(wards)’, ‘at’, ‘away from’ &c., and (ii) what facet of the entity it relates to, ‘top’, ‘side’, ‘inside’, ‘front’ &c. Examples are zài zhuōzi-shàng ‘at + table + top’, dào huāyuán-lǐ ‘to + garden + inside’ (English on the table, into the garden).

This relates to (13) above, the point that Chinese maintains a rather clear distinction between verbs and nouns: it recognises processes to be systemically distinct from entities, or things. Things exist in space; but they persist in material form through time; they are relatively stable. Processes happen in time; but they take place (in material form) in space. In Chinese, nouns tend to be the same (same etymon) all across the dialectal range (the Sinitic languages); whereas verbs may differ from one dialect to the next – almost from one village to the next, in the old days (Halliday 2005, ch. 4).

Because processes and entities inhabit different semiotic realms, they subcategorise in different ways. Verbal words are: (i) lexical verbs (including adjectives, which in Chinese are a kind of verb, not a kind of noun as in English); (ii) modal auxiliary verbs, which construe the likelihood and the desirability of the process; (iii) postpositive verbs, which show phases of the process, and (iv) prepositive verbs, which construe the relation between the process and some particular entity. (For comparison with English, prepositive verbs are similar to prepositions and postpositive verbs are similar to post-verbal adverbs.) Nominal words are: (i) lexical nouns; (ii) determiners (personals and demonstratives), which construe the identity of the thing and create cohesion with others; (iii) numeratives (numerals and “classifiers”), which itemize and construe quantity, and (iv) postpositive nouns, which indicate the facet of the thing. These are the most general categories; there are of course more delicate distinctions to be made within them.

The minor process, then, is construed as prepositive verb plus postpositive noun. If no prepositive verb is specified, the default meaning is normally ‘at’, e.g. Cháng Chéng wàimiàn ‘outside the Great Wall’. Certain classes of noun – those designating place and persons – do not ordinarily take facet words: cóng Běijīng ‘from Beijing’, gěi nǐ ‘for you’. With non-spatial relations there will often be no facet anyway, e.g. gēnjù ‘according to’, wèi ‘on behalf of’; if there is, it usually takes a special form combined with the morpheme yǐ as in èrbǎi kuài yǐshàng ‘above two hundred dollars’. The two-part construction is readily extended into abstract space; e.g. chúle wǒmen yǐwài ‘apart from us’, zài Tángdài yǐqián ‘before the Tang dynasty’.
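The two-part analysis of the minor process, relation plus facet, can be modelled as a small data structure. The Python sketch below is an illustration only: the class name, field names and the `DEFAULT_RELATION` constant are assumptions introduced here, and the sketch works with English glosses rather than with the Chinese forms themselves.

```python
from dataclasses import dataclass
from typing import Optional

# Default relation when no prepositive verb is specified (per the text: 'at').
DEFAULT_RELATION = "at"


@dataclass
class MinorProcess:
    """A minor process analysed as relation + entity + optional facet."""
    entity: str                      # gloss of the noun, e.g. 'table'
    relation: Optional[str] = None   # gloss of the prepositive verb, e.g. 'to'
    facet: Optional[str] = None      # gloss of the postpositive noun, e.g. 'top'

    def gloss(self) -> str:
        """Return a morpheme-by-morpheme gloss such as 'at + table + top'."""
        parts = [self.relation or DEFAULT_RELATION, self.entity]
        if self.facet:
            # Place and person nouns ordinarily take no facet word.
            parts.append(self.facet)
        return " + ".join(parts)


on_the_table = MinorProcess(entity="table", relation="at", facet="top")
into_the_garden = MinorProcess(entity="garden", relation="to", facet="inside")
outside_the_wall = MinorProcess(entity="Great Wall", facet="outside")
from_beijing = MinorProcess(entity="Beijing", relation="from")
```

Calling `gloss()` on the first two objects yields the glosses given under (17); the third shows the default relation being supplied, and the fourth shows a place noun with no facet.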

The prepositive verb is clearly marked out as a kind of verb. All prepositive verbs can also function as main verbs (i.e. as Process in the clause), and many of them can be marked for aspect, e.g. ràozhe shù pǎo ‘run round (circling) the tree’. To compare with English, where also many prepositions take an aspectually marked form (concerning, given, excepting &c.), we could consider the different ways of construing a complex process like chopping down a tree:

(a) he took an axe and felled the tree

(b) using an axe he felled the tree

(c) he felled the tree with an axe

(d) he felled the tree axewise

In English, the preferred type of construal of a minor process is (c), with the prepositional phrase following the verbal group. In Chinese it is (b), and the minor process typically precedes. This sequence, where the minor process precedes the main process of the clause, might be taken as another case of “the modifier precedes the modified”, whereby a circumstantial element has the function of modifying the general character of the process.

(18) I have been presenting these various features as patterns that are found in specific areas within the phonology and the lexicogrammar of Mandarin. Before coming to the last two in my list, I will try to suggest one more general characteristic regarding the management of sound and meaning in the language.

Phonetically, most of the complexity in the Mandarin syllable is carried in the trajectory from initial state to final state. The syllable progresses from a simple initial (there are no consonant clusters) to a simple final (there are no stops), by a complex movement involving both variation and shift in the quality of the mediating vowel. The organization of sound is entirely prosodic; and we know from the history of Mandarin phonology how the basic prosodic systems of palatalisation and labialisation have persisted over two thousand years – during which time the actual morphemes that selected within these systems have changed several times over (Wang Li, 1936).

Semantically, a consistent feature of the organization of meaning is that, as in other East Asian languages, many of the grammatical systems include an unmarked term. The choice is not that of ‘a or b?’, but that of ‘a or b or neither?’. That, at least, is how it presents itself in the grammar. But semantically that is the wrong alignment. The first choice is ‘marked in respect of feature x, or not?’; then, if the answer is ‘yes’, the next question is ‘marked as a or as b?’. The feature (feature x) is a kind of semantic prosody, a motif that may or may not be present at certain moments in the discourse. As with prosodic features in general, it is not always entirely clear where its domain begins and ends.

(19) An example would be the grammar’s construction of time. In Mandarin, as in many East Asian languages, time as construed grammatically is essentially aspectual: it is not anchored in the present, as in systems of tense where every instance is marked as either past, present or future, but either left unmarked or, if marked, then marked not as a kind of digital time but as some (broadly temporal) aspect of the process. Typically, this means a choice of one out of two possible marked states for which the grammatical terms are “perfective” and “imperfective”. The imperfective means foregrounding the process itself: it may be ongoing, or unbounded, or significant in its own right. The perfective means foregrounding the outcome of the process: it may be completed, or bounded, or significant in terms of its realization. But because aspect represents time as a kind of prosody, the ideational scope of aspect systems varies widely. It varies even among the different forms of Chinese: the meaning of perfective and imperfective is not the same in Mandarin as it is in Cantonese. It also varies from east to west across the Eurasian continent: it seems as if there is a gradual shift in perspective, such that at the eastern end, as in Chinese or Tagalog, aspect is the primary modelling of time; towards the centre, as in Russian or Hindi-Urdu, it intersects on an equal footing with tense, while at the western end, as in English or Spanish, only tense is fully grammaticalised and aspect takes second place. (What is called “aspect” in modern structural grammars of English is not really aspect – it is secondary, or serial, tense. Aspect in English is grammaticalised only in the non-finite verb.)

(20) Related to aspect, in Mandarin, is the temporal category of phase, where the basic opposition is that between “conative” and “reussive” – between process viewed as attempt and process viewed as success. English speakers make frequent use of the verb try, marking the process as conative, as in I tried to tell you but you wouldn’t listen. They try to find a similar verb in Chinese; but there isn’t one – because the process itself is construed as inherently conative. So while in English the process is inherently reussive, and can be marked as conative with try, in Mandarin it is inherently conative, and there is a large class of postpositive verbs marking it as reussive – as “completive” phase (Halliday and McDonald 2004; cf. Halliday and Ellis 1951). The class is fairly clearly defined, though it includes some rare ones which would be heard only in unusual discursive contexts; in general, the meaning of the whole construction, the process plus the particular respect in which it is successful (lexical verb + postpositive verb) is predictable from the meaning of the parts – not always, but notably more so than the meaning of its nearest equivalent (verb plus post-verbal adverb) in English.[h]

Should we try to discover general motifs or principles underlying these rather diverse observations? I suggested at the beginning that any such general principles might turn out to be ineffable – that even if there were any common underlying factors it might be impossible to tease them out, to see them from a distance as aspects of the overall architecture of meaning. But let me try.

Mandarin, it seems to me, displays a notably high degree of internal regularity. This can be seen in the lexicogrammar in the way it construes the “things” of human experience: in the principles of noun compounding, the ordering of attributes, even in the build up of the numeral system – itself an effect of the stability of the morpheme/syllable complex. Here especially we are aware of the contrast with Indo-European languages such as English and Hindi.

The same regularity can be seen in the phonology of the syllable; its internal organization as a postural trajectory from an initial state to a final state determines the phonetic values that arise from the combination of systemic features and enables us to predict where there is likely to be uncertainty and variation: for example, the varying realization of the open and half-close aperture, the alternation of [e]/[o] in certain half-close syllables, and the variation in how the final prosody of nasal resonance will be realised. Such regularity contrasts with the mixed phonological systems of Japanese, Thai and English (Chao 1934; Henderson 1951; Wang 1983; Halliday 1992).

But what we are seeing in the lexicogrammar is more than just internal regularity. Mandarin seems to take a consistently analytical approach to the construal of experience. This appears in all three primary elements of the clause: verbal group, nominal group and “prepositional” phrase. The last is in fact pre-/post-positional: it analyses out the domain of the minor process into the two factors of “relation” and “facet”. Typically the nominal group analyses the “thing” into a general class plus optional subclass; the verbal group analyses the process into an event plus optional culmination; and the meaning of the whole is derivable from the meaning of its parts.

Since there is much in common among all Sinitic languages (i.e. all varieties of Chinese), it is important to take account of some of the ways in which they differ. When we compare Mandarin with Cantonese, certain differences stand out. Cantonese deploys a large inventory of modal particles, coming at the end of the clause; these realise choices within the system of mood, in combination with a range of other interpersonal meanings comparable to those realized in English by the intonation system (Kwok, 1984; Halliday and Greaves 2008). Phonologically, Cantonese is a tone language, with relatively little work done by intonation. Mandarin, on the other hand, has few final particles – those that there are realize very general categories of mood and aspect; and it is a fairly equal mixture of tone and intonation, making considerable use of intonation in expressing interpersonal meanings.[i]

A language is the product of its heredity and its environment. I don’t mean its cultural environment; this has only a very long-term effect, commensurate with the “ages” of sociocultural history. I mean its environment in the sense of the other languages in its neighbourhood.

As far as its heredity is concerned, Mandarin is one of the Sinitic (or simply “Chinese”) group of languages forming one branch of the Sino-Tibetan family; this also includes numerous smaller languages spoken in China itself and elsewhere in eastern Asia. The major languages in its neighbourhood include Burmese and Tibetan, both cousins; Thai, possibly a more distant cousin; and four others, Mongolian, Korean, Japanese and Vietnamese, which show no evidence of common ancestry but with which different groups of Chinese speakers have had fairly extensive contact from time to time.

Three of these languages, Vietnamese, Japanese and Korean, have borrowed extensively from different varieties of Chinese at different periods in their history. Chinese, on the other hand, seems to have been rather little influenced by any of the others. This may be one of the conditions that has contributed to its high degree of internal regularity: it has been very little perturbed by pressures from the outside. There may be some very minor effects – the breathy quality of the low falling tone in Cantonese could have come in from Vietnamese, where voice quality (breathy/creaky) is a feature in the realisation of the tone system; but these have no significance on a general scale. It seems that Chinese – and Mandarin in particular – has evolved very much along its own lines.

Now conditions are changing. Mandarin is being learned as a “standard language”, by speakers of other dialects; and virtually the whole adult population is literate. It is also being learned as a “foreign language”, by speakers of other languages, who are looking at the Chinese language from the outside, and adapting it (no doubt in non-Chinese ways) to their own communicative needs. We do not know how, or how much, these developments will affect the way Mandarin continues to evolve. But in achieving the status of a world language it is unlikely to stay exactly as it was before.


[a] Mathesius’ further work on English was published in 1961, some sixteen years after he died. It was compiled and edited by Josef Vachek, from Mathesius’ typewritten notes. The English translation appeared in 1973 (Mathesius 1973). Mathesius makes detailed comparisons between English and other languages, primarily Czech (the language in which he was writing). The book displays a real insight into patterns of meaning making in English.

[b] The hotel where I stayed while writing this chapter was variously spelt Zi Jing Yuan (on the taxi card), Zi Jing yuan (on the laundry list), Zi Jingyuan (on the laundry bag), Zijingyuan (on the booklet) and Zijing Yuan (on the coaster). I would have opted for the last of these. There is considerable uncertainty about word boundaries in English; in Mandarin there is rather more, though not as much as these examples suggest provided it is recognised that words fall into syntactic classes.

[c] There are of course some compounds formed originally by analogic extension, e.g. those with kǒu ‘mouth’ as the general term: hékǒu ‘river mouth’, ménkǒu ‘doorway’, hùkǒu ‘household’, rénkǒu ‘population’ – and as a result (?) kǒu is no longer used for ‘mouth’, being supplanted by zuǐ. Similarly diàn ‘lightning’, now used in all contexts for ‘electric, electricity’, has been replaced in the sense of ‘lightning’ by shǎn.

[d] Syntactic transcategorisation in Chinese is not accompanied by morphological changes. But in modern technical and scientific registers two suffixes are commonly deployed, xìng for nominalising and huà for verbalising; e.g. xiànxìng ‘linearity’, rǔhuà ‘emulsify’.

[e] And there seems to be a continuing tendency to reduce the inventory still further. In Beijing in the 1940s I learnt ruá ‘gone soft, limp’ and èng (then pronounced ngèng) ‘tough (as meat)’; I understand that these are no longer heard today.

[f] Noun compounds are formed on all three principles of expansion: elaborating, e.g. jīqì ‘machine’; extending, e.g. shānshuǐ ‘scenery’; enhancing, e.g. huǒchē ‘train’. The vast majority are of the enhancing type, based on the relation of hyponymy.

[g] Linguistic typology distils general principles from comparison of different languages – to which there will usually be found to be exceptions. Greenberg showed many years ago (1966) that, in general, what he called “SVO languages” (those with the verb coming between subject and object in the clause) have prepositions, while “SOV languages” (those with the verb coming at the end) have postpositions. This makes sense, because the pre/post-position can often be seen as a kind of verb, realising a “minor process”. English illustrates the first type, Japanese the second. But Latin does not conform: it is SOV (verb final) but has prepositions. There are various other patterns to which these factors might be related; in any case, syntactic function in Latin was marked morphologically, and the ordering of elements in the clause was “free” (i.e. it carried textual rather than ideational meaning). In such cases the OV/VO ideational ordering can change relatively quickly in the course of time.

[h] There are two types of completive: directionals, which are limited in number and typically predictable in meaning, e.g. zǒu ‘walk’, zǒushàng ‘walk up’, zǒuguò ‘walk past’, zǒujìnlái ‘walk in here’; and resultatives, a large class including some that are less easy to predict. Both have affixed negatives: either unmarked for aspect with infixed negative bù, or perfective aspect with prefixed negative méi; contrasting with bù is a marked positive form with infixed dé. Some examples of resultatives, with lexical verb kàn ‘look at’: kànjiàn (look + perceive) ‘see’, kànbújiàn ‘can’t see’, kàndéjiàn ‘can see’, méi kànjiàn ‘didn’t see’, kàntòu (look + penetrate) ‘see through, understand’, kànbùqǐ (look + not + rise) ‘look down on, scorn’; kàndéqǐ (look + manage to + rise) ‘look up to, think highly of’; cf. mǎibùqǐ (buy + not + rise) ‘can’t afford’, qǐngbùqǐ (invite + not + rise) ‘can’t afford to invite’, shuōdélái (speak + manage to + come) ‘get along, be on good terms’.

[i] But not textual meanings, or not so much as English. I used to illustrate this point by referring to a contrast which in English is realized by intonation, where Mandarin would bring in a combination of intonation and lexis: yuánlái shì nǐ ‘so it was you!’, guǒrán shì nǐ ‘so it was you!’.


I would like to thank Professor Randy LaPolla and Professor Peng Xuanwei for their detailed and thoughtful comments on this paper.

