21 June 2009 @ 04:42 am


I apologize that posting here has been so light this year, but other demands on my time have taken priority. I have tried to adjust this post to the recent Google Books changes; please let me know if any of the links are misbehaving.

The first post here was a footnote to the history of the word vegan, which was coined around 1944. Just about a century before that, the word vegetarian was coined. It really took hold with the formation of The Vegetarian Society in 1847, but is attested before that.

Most authoritative etymologies form vegetarian irregularly from vegetable and -arian, somewhat along the lines of unitarian. So the OED, AHD and Skeat. Weekley says, “Currency of barbarously formed vegetarian dates from formation of Vegetarian Society at Ramsgate (1847).” Partridge has a slightly different take:

From ML vegetāre comes the ML adj vegetālis, whence EF-F végétal, whence E vegetal; EF-F végétal has derivative végétarien, whence végétarianisme: whence E vegetarian, vegetarianism. (s.v. vigor)

Though I am not sure on what evidence; most sources trace végétarien to English, not the other way around.

An alternative derivation is directly from Latin vegetus 'vigorous' without any intermediate vegetable. For example, in a letter to “Ask Ms. Natural” in the 1981 Vegetarian Times. The Souvenir of the XVth World Vegetarian Congress, India, 1957 (pp. 104-106) excerpts Carlos (Charles) Brandt's The Vital Problem (a translation of El fundamento de la moral), where he traces vegetarianism through vegetus and its uses in Latin to other cognates. (In the version on the IVU site, the editor inserts a disclaimer about the starting assumption.) He credits Meyers Großes Konversations-Lexikon (6th edition, not 18th; s.v. Vegetarismus) for being the only reference source he consulted that got the etymology and meaning right. The masthead of The Vegetarian, the organ of the Society from the 1880's, has a scroll beneath the title that reads, “Vegetus — Vital, Healthful, Vigorous.” I have not been able to find an image of this online, but an advertisement promising the inaugural issue on 19th December (1881) says similarly, “Vegetus — Signifying all that is Vital, Healthful, and Vigorous.”

One of the reasons for promoting this derivation was that the name made it easier to mock vegetarians, as Punch did shortly after the Society's foundation. (Although there would certainly have been something else to mock in any case; a review of such satire in various places, languages and times might make for another post.)

As the Wikipedia points out, this proposal is rather suspect. (For one thing, there are earlier uses than the society, like this.) In other words, it is a learned folk-etymology. But it does come with some interesting learned associations.

Thirteen Satires of Juvenal (leaving out, as often happens, II, VI, and IX, though the work isn't really intended for younger students) is, or was, a minor monument of Victorian scholarship. Its author, John E. B. Mayor, was Professor of Latin at Cambridge University. The commentary is intended less as an aid to understanding and more as an exploration of the environment through a collection of references to related works. Gilbert Highet's Juvenal the Satirist says, “A text with very learned notes on all satires except 2, 6, and 9; the comments consist chiefly of parallel passages, and do not go deeply into problems of text and interpretation.”

Mayor was a philologist and delighted in the details. He contributed five notes to the first volume of Notes and Queries, the Victorian group blog; and nine articles to the first volume of The Classical Review, which was edited by his brother, Joseph. He wrote an article on Latin lexicography for the Journal of Classical and Sacred Philology, in which he summed up his destiny, “there still remains work enough to keep the memory and the understanding employed to the end of life; there will still be new facts to collect, or forgotten facts to recover, to store up, and to classify.”

A. E. Housman succeeded Mayor as Latin Professor, a chair that was then renamed for Kennedy, who had refused that honour while he was alive. (Kennedy became Regius of Greek at Cambridge; in Stoppard's first scene, AEH tells this and has digs at him and Jowett, the Regius of Greek at Oxford; we still used an edition of Kennedy's public school primer in Form I Latin in the late '60s; I imagine they still use it today.) In his 1911 Inaugural Address, after attributing to Kennedy's Sabrinae Corolla (a collection of translations of English poems into Latin and Greek) his “genuine liking for Greek and Latin”, Housman said of Mayor:

Most good scholars are much fonder of learning than of teaching, and to Munro the duties of his office proved uncongenial and irksome. He resigned the Chair after a tenure of three years, and in 1872 it passed to the venerable man who left it vacant only last December; a scholar who in learning, if that word is taken to mean range and thoroughness of reading, had no equal in England and no superior in Europe. To dwell on the erudition of John Mayor is not merely superfluous but presumptuous; and I will now speak rather of a characteristic on which speech perhaps is not unnecessary. It is well known and sometimes lamented that for all his amplitude of knowledge he left behind him no complete work and no work having even the air of completeness. This regret I do not share; I am much more disposed to recommend for imitation the examples of one who recognized his own bent and followed it, and whose inclinations were exactly in harmony with his talents. Many a good piece of work has been spoilt by the vain passion for completeness. A scholar designs to edit a certain author, a complete edition of whom would involve the treatment of matters to whose study the editor has not been led by his own tastes and interests, and in which he therefore is not at home. The author discourses of philosophy, and the editor is no philosopher; or the author writes in complex metres, and the editor's metrical education stopped short at Porson's canon of the final cretic. It then sometimes happens that the editor, having neither the humility to acknowledge his deficiency nor the industry or capacity to repair it, scrapes a perfunctory acquaintance with the unfamiliar subject, and treats it incompetently rather than not treat it at all; so that his work, for the sake of ostensible completeness, is disfigured with puerile errors, and he himself is detected, not merely in ignorance, but in imposture.

It is the absence of any such vanity, the abstention from all misdirected effort, which redeems and even converts into merit what might else appear defective in the works of Mayor. The establishment and the interpretation of an author's text were not matters in which he took the liveliest interest nor tasks for which he felt in himself a special aptitude: his likings pointed the same way as his abilities, to the collections of illustrative material. I said while he was alive, and I shall not unsay it because he is dead, that this labour is labour bestowed upon the circumference and not the centre of the subject. But this also is work which must be done, and which no other could have done so thoroughly. ‘If a man read Richardson for the story’, said Johnson, ‘he would hang himself’; and much the same may be said not only of Mayor's Juvenal but of a still more celebrated book, Lobeck's Ajax of Sophocles. When you have finished Lobeck's commentary you have imbibed a vast deal of information, but your knowledge and understanding of the Ajax has not proportionally increased. Lobeck himself in his preface admits that this is so; τὸ μὲν πάρεργον ἔργον ὣς ποιούμεθα [, τὸ δ' ἔργον ὡς πάρεργον ἐκπονούμεθα. 'we treat our by-work as work, and perform our work as by-work. ' Agathon, frag. 11]. He in his commentary is not principally the critic nor the interpreter, but the grammarian; and Mayor in his is principally the antiquarian and the lexicographer: his main concern is not with what the author wrote or meant, but with the words he used and the things he mentioned. These he carried in his mind through the whole width of his incomparable reading, and brought back from the limits of the literature all the parallels and imitations and echoes which it contained. 
What he has bequeathed us is less an edition than a treasure of subsidies: there he saw his true business, and to that business he stuck: and ‘it is an uncontrolled truth’, says Swift, ‘that no man ever made an ill figure who understood his own talents, nor a good one who mistook them’.

(Housman, too, would edit a Juvenal, with just an English introduction and no notes.)

Around 1880, Mayor joined the Vegetarian Society and in 1884 was elected president. At Cambridge, he was not known as a particularly effective lecturer, tending to deliver unadorned citations, as in his commentaries. But he was a proselytizer. M. R. James recalled in his memoir Eton and King's: Recollections, Mostly Trivial, 1875-1925 (pp. 181-2):

There was never any translation, or any explanation of an interesting point. Of course Mayor, imagining that everybody was as conscientious as himself, thought that one would go home and look up all these references and copy out the passages in a neat hand. But! At the end of the lecture there was an oasis. I used to carry Mayor's books back to his rooms in St. John's, and he would reward me with a copy of the last number of the Vegetarian Magazine, or refresh me with the reading of a letter he had written to, or received from, one of the Old Catholic Bishops. Innocency, charity, the purest enthusiasm for learning were seen at their best in Mayor: accompanied by a want of sense of proportion (and humour) which could hardly be exaggerated.

In 1886, he greatly expanded the “Advertisement” at the head of his Juvenal, which had been previously unremarkable — the early editions just noting that he had purposely not read anyone else's English edition and containing the ultimately unfulfilled promise to do all sixteen satires, and the enlarged second edition just providing some more details on the text. In this fourth edition, it became a wide-ranging discourse on various matters, including in particular vegetarianism and temperance, only barely managing to return to the topic of Juvenal at the very end. There was so much new material that it was also published separately as a supplement to the first volume.

All of which explains why Mayor's Juvenal, in its final edition, contains a footnote on the etymology of vegetarian. To be specific, it refers to an address he had given, which was issued as a pamphlet, titled What is Vegetarianism? I have not had any luck tracking down a copy of this; it is just possible that it is included in the collection of essays, Plain Living and High Thinking (see here), but I have not been able to track down a copy of that, either. (Suggestions would be welcome.) Only one library lists it in OCLC, I suspect because others catalog it differently, perhaps because it is in a box with other Vegetarian Society ephemera. Still, it is possible to piece together much of it from quotations in the Mar. 1907 Vegetarian Magazine and a note by William E. A. Axon in the Christmas 1909 Notes & Queries (snippet only in GB; note that he says that as of then the word had only been traced back to 1845, just before the Society's founding).

The name was born with the Society. [...] No lexicographer has learnt our secret, ‘fruit and farinacea’. The vulgar error that we devour a wheelbarrow load of cabbages at a meal is fostered by definitions like these:

[Here everyone omits Mayor's inventory of “wrong” definitions from various lexica, which is too bad, since that is just the sort of thing this blog is all about.]

Would you be surprised to learn that as Vegetarians, looking at the word etymologically, not historically or in the light of our official definition, we are neither required to eat all vegetable products, nor vegetable products only, nor even vegetable products at all? Far from committing us to abstain from milk and eggs, the name derives its connexion with diet exclusively from the definition given to it by our Society.

When librarian means an ‘eater of books,’ antiquarian ‘an eater of antiques,’ even then vegetarian will not, cannot, mean ‘an eater of vegetables.’ Your learned townsman, my old friend Mr. Roby, has cited many nouns substantive and adjective ending in arius = Engl. arian. All of these are derived from nouns substantive or adjective, none from verbs. Prof. Skeat was misled by a borrowed definition. Antiquus, ‘ancient’; antiqua, ‘antiques’; antiquarius, ‘one who studies, deals in, has to do with, antiques an antiquary or antiquarian.’ So vegetarius, ‘one who studies, has to do with, vegeta.’ What vegetus means you shall hear from impartial lips :—

Vegetabilis is not used in good Latin at all. Cicero's word for plants is gignentia.

‘Vegetus, whole, sound, strong, quick, fresh, lively, lusty, gallant, trim, brave; vegeto, to refresh, recreate, or make lively, lusty, quick and strong, to make sound.’ Thomas Holyoke, ‘Latin Dictionary,’ London, 1677.

Ainsworth adds to the senses of ‘Vegetus,’ agile, alert, brisk, crank, pert, nourishing, vigorous, fine, seasonable; and renders the primitive ‘vegeo’ to be lusty and strong, or sound and whole; to make brisk or mettlesome; to refresh.

The word vegetarius belongs to an illustrious family. Vegetable, which has been called its mother, is really its niece. Vegetation, vigil, vigilant, vigour, invigorate, wake, watch, wax, augment; the Gr. ὑγιὴς (sound) ; Hygieia, the goddess of health; hygiene, the science of health; all these are more or less distant relatives.

The Vegetarian, then, is one who aims at wholeness, soundness, strength, quickness, vigour, growth, wakefulness, health. These must be won by a return to nature, and the natural food for man is a diet of fruit and farinacea, with which some combine such animal products as may be enjoyed without destroying sentient life.

In his clarification, and in specifically addressing his footnote to Sir Henry Thompson's work on diet, and Eduard von Hartmann's essay “Was sollen wir essen?” 'What should we eat?', Mayor is also addressing an issue which would be framed in modern terms as the difference between vegetarians and vegans. His opponents claim that vegetarians, transparently eaters only of foods with vegetable origins, are deceptive when they also eat animal protein like eggs or milk. So, there is a position to be won by showing that vegetarian actually refers to a healthy diet in which those are permitted. (This being before it was clear how one might manage to get complete protein without them.) This same idea is presented by Josiah Oldfield in reply to two articles by Thompson, and he gives a similar derivation from vegeto 'invigorate'. Eustace Miles rejects the name, since animal protein is needed for his athlete's vegetarian diet. Henry Salt hedges his bets a little:

Mind, I am not saying that the originators of the term “vegetarian” had this meaning in view, but merely that the etymological sense of the word does not favour your contention any more than the historical. (The Logic of Vegetarianism, p. 5)

(In other words, he goes one step beyond the argument that etymology exposes the true meaning of a word, because it does so even when the coiners did not know or intend it.)

In August, 1907, the Third Universal Congress of Esperantists was held in Cambridge. Mayor (then 83) took advantage of the opportunity to learn enough of the language to deliver a speech in Esperanto on the last morning (apparently he addressed them at other times in English that had to be translated). The Times (Mon., Aug. 19, 1907) reported:

Professor J. E. B. Mayor, professor of Latin at Cambridge, addressed the congress amid a scene of the greatest enthusiasm. He said that in their meetings miracle followed miracle, and he had ceased to be astonished at the mutual comprehensibility of all nations. It had come to be plainly seen that their Esperanto Congresses had resulted in the discovery of a new international nation, of which Dr. Zamenhof was the Christopher Columbus. They had witnessed a new Pentecostal festival, as shown by the different nationalities there represented. Professor Mayor proceeded to say he considered it a great mistake for people to suppose that the learning of Esperanto would interfere with the study of other languages. He was convinced that if a child of five learnt Esperanto he would afterwards learn with ease French, Latin, German, &c. Esperanto was, in fact, the lernigilo for all other languages.

See also here and here. I wonder whether a copy of this speech survives someplace.

For more information on Mayor, see the DNB, the obituary by J. E. Sandys in The Classical Review, the “Memoir” in Twelve Cambridge Sermons, and John Henderson's Juvenal's Mayor: the Professor who Lived on 2d a Day. (I do not agree with the Wikipedia that this last portrait is "unsympathetic," though it is indeed "idiosyncratic." It aims to strike some balance between Mayor as a useless old kook and ignoring his quirks. See also the review of it by one of Henderson's students, Susanna Morton Braund, who is herself a vegetarian and an editor of Juvenal.) Henderson has also edited a version of Mayor's Juvenal, adding commentary on the commentary. Someone should track down a copy of the Catalogue of the Library of J. E. B. Mayor, Deceased, Comprising Upwards of 18,000 Volumes of Books and get it started in LibraryThing's Legacy Libraries project.

Mayor's predecessor as president of the Vegetarian Society was Francis W. Newman, who is best remembered today for holding less orthodox religious views than his brother, John Henry, Cardinal Newman. (There was a third brother, Charles Robert, who was an atheist and a hermit. The Grammar of Assent sums up the situation, “Thus, of three Protestants, one becomes a Catholic, a second a Unitarian, and a third an unbeliever.”) Francis was Professor of — Latin! — at University College, London. (Charles Kingsley, Cardinal Newman's adversary in the debate on Catholicism and Truth which sparked, and is given in an appendix to some editions of, Newman's Apologia, had the same tutor at Cambridge as Mayor, Dr. Bateson.) Professor Newman was a polymath, writing on mathematics as well as religion, philology and vegetarianism. (See the bibliography here.)

Newman translated some English works into Latin, as part of a scheme for facilitating teaching the language:

  • Hiawatha:
    Juxta ripas Aequoris Maximi,
    Lacûs latissime relucentis,
    Nocomidis stabat tugurium,
    Nocomidis e Lunâ genitae.
    Nigra surgit pone silva,
    Atrâ contristata pino
    Atque abiete nucamentis squameâ.
    Clara jactatur in fronte unda
    Praeter ripam Lacûs Maximi,
    Unda aprica, late relucens. (p. 22)
  • Shorter Translations of English Poetry into Latin Verse, such as [15] “Erin's Days of Old”:
    Tempus Ierne revocet veterum,
    Prava priusquam sua progenies
    Infidè proderet ipsam:
    Quum colli decus aurea torquis,
    Derepta superbo invasori,
    Malachaeum laudibus auxit;
    Regesque sui, viridi elato
    Panno, miniâ fronde Quirites
    Ducebant per fera bella;
    Necdum regia nostra maragdus
    Maris Hesperii gemma refulsit
    Tempora circum peregrini. (p. 45)
    Note how the philologist cannot help inserting a footnote proposing that Curaidhe 'knights' and Quirites must be cognate, an idea that he also picks up in Regal Rome and which gets blasted by Donaldson here. Curaidh seems to be from a root *k̂ū-ro-s 'strong' and cognate with κύριος and शूर.
  • or [61] “Peace After War”:
    Nobis hiems morosa tandem splendida
    Evasit aestas sole sub Ebŏrāceo;
    Nubesque cunctae, quae domum obscuraverant,
    Evanuere, penitus immersae mari. (p. 147)
  • Robinson Crusoe (Rebilius Cruso; as much a retelling in Latin as a strict translation, and so actually missing most of the famous passages):
    209. Tamen neutiquam satiata est mea cupiditas. Ad cocos nuces demetendas falculam illam mecum apportavi; scalas novas ipsis in hortis relinquebam. Dum autem infra incedo, ananassas video multas, (mala pïnea vulgo nos vocamus): nunquam ego anteà has animadverti. Jam intelligo et plurimas esse et maximas, paene ex arenis cum cactis nascentes. Unam illicò vindemiavi, nec abstinui quin grande frustum comederim. (p. 55)

He prepared a text of the Iguvine Tables with interlinear Latin translation.

Of Newman's translation of the Iliad, Matthew Arnold wrote, “while for want of appreciating the fourth, [Homer's] nobleness, Mr. Newman, who has clearly seen some of the faults of his predecessors, has yet failed more conspicuously than any of them.” Newman replied with Homeric Translation in Theory and Practice and Arnold rejoined.

Newman wrote a Handbook and Dictionary of Modern Arabic, both avoiding the Arabic alphabet. He produced a number of monographs on Berber languages (I am not qualified to say how these compare to the ones listed recently at Jabal al-Lughat).

Newman's solution to the naming problem was V E M ('vegetables, eggs, milk'), suggested to him by his friend Thomas Jarrett, Professor of Arabic at Cambridge and then Regius of Hebrew. Of Jarrett, Edward James Rapson, Professor of Sanskrit at Cambridge, writing in the DNB, says:

As a linguist, Jarrett was chiefly remarkable for the extent and variety of his knowledge. He knew at least twenty languages, and taught Hebrew, Arabic, Sanskrit, Persian, Gothic, and indeed almost any language for which he could find a student. He spent much time in the transliteration of oriental languages into the Roman character, according to a system devised by himself; and also in promulgating a system of printing English with diacritical marks to show the sound of each vowel without changing the spelling of the word. (Vol. 10, p. 690)

Jarrett's New Way of Marking the Sounds of English Words Without Change of Spelling is online.

Newman sported the timeless look of a center part and long scraggly gray beard. For more information on him, see the DNB, the memoirs here, and the website of the Francis Newman Society.

Mayor's successor as president was, I believe, Ernest Bell, one of the sons of the publishers George Bell & Sons, and, as far as I know, not otherwise relevant to this post.

16 June 2009 @ 02:22 am

Open to interpretation
Songhay's lexical economy - the way it keeps its lexicon rather smaller than its neighbours' by using a single word to fulfill the functions of what in most languages would be several different words - has attracted the attention of several of those who have written about the language from the 1850s onwards. While Kwarandzyey (Korandje) is so full of Berber and Arabic loanwords that the size issue probably no longer applies, it still has many striking examples of polysemy. Take "open", for example.

fya (from Songhay *feeri) is best translated as "open" (its commonest sense). Of course, to open one's mouth can be to start eating - hence the frozen compound fya-mmi "open-mouth" means "breakfast". But opening is also what you do to release something from an enclosed space; hence to "open water (for something)" (fya iri), or just "open", is to irrigate, and to "open for an animal or person" is to release them. Likewise, to "open a rope (for something)" is to untie it. To release something from your grasp is to let it fall - hence to "open for something" is also to drop it. And for a man to release his wife from her obligations towards him is to end the marriage - hence to "open for a woman" is to divorce her.

We can map the connections between these easily enough, making it clear that they form a coherent network of meaning:

breakfast    untie
         \    / \
          open - release
              \ /       \
           irrigate    divorce

But not only will any single English translation applied literally and consistently yield ludicrous results for at least some of these cases - translating it differently in different circumstances will force you to choose a single meaning in cases where the text is ambiguous. "He opened for the woman" probably means he divorced her, but in principle it could mean he released her (eg from prison), or untied her, or (literally) dropped her; in fact, since Songhay has no gender distinctions in pronouns, it should even be able to mean "It (eg an automatic door) opened for her". And of course, this kind of ambiguity can be deliberately exploited for effect, as in puns.
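The sense network for fya can also be written down as a small graph, which makes the translator's problem concrete. This is only a rough sketch in Python: the names SENSE_LINKS and readings are invented here for illustration, the links are my reading of the diagram and the prose (in particular, "drop" is taken from the prose, since the diagram leaves it out), and nothing here is a claim about how the lexicon is actually organised.

```python
from collections import deque

# Hypothetical encoding of the sense links for Kwarandzyey "fya".
# Edges are read off the diagram and the surrounding prose; "drop" is
# mentioned only in the prose, so its placement is an assumption.
SENSE_LINKS = {
    "open": ["breakfast", "untie", "irrigate", "release"],
    "release": ["untie", "irrigate", "divorce", "drop"],
}


def readings(start="open"):
    """Return every sense reachable from `start`, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        sense = queue.popleft()
        for linked in SENSE_LINKS.get(sense, []):
            if linked not in seen:
                seen.append(linked)
                queue.append(linked)
    return seen


# Every sense a translator of the single word "fya" has to choose among.
print(readings())
```

Each item in the printed list is a candidate reading of the one Kwarandzyey word; picking an English verb means picking one node and discarding the rest.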

In Kwarandzyey, this is never likely to cause serious ambiguity - the language is almost never written down, and it's a small enough community that the context is usually known to everyone anyway. But imagine worrying about this kind of thing in a millennia-old text in a language that no one today speaks natively, and you can really see why even the most literal translation of such a text is unavoidably an act of interpretation.

09 June 2009 @ 05:01 am

Why dead snakes are like clothes
What would you say if, in some science-fiction novel, you read of a language where the situations that in English would be described as "The clothes blew down from the clothesline", "Push that dead snake away with a stick", and "I see where he's carrying the rabbits he killed hung from his belt" were all naturally expressed with the same root, plus nothing more than different affixes? What about "I slammed together the hunks of clay I held in either hand", "I slung away the rotten tomatoes, sluicing them off the pan they were in", and "I picked up in my mouth the already chewed gum from where it was stuck on the table"? My inclination would have been to dismiss it as a neat but implausible idea, placing some strain on the reader's suspension of disbelief. But - until no more than thirty years ago - such a language existed right in California. Go to Part III of Leonard Talmy's dissertation Semantic Structures in English and Atsugewi to get the data; here's a slightly less surprising example as a taster:

Subject=I, Object=3rd person | from a linear object moving axially [with one end] non-obliquely against the FIGURE | for a small shiny spherical object to move | out of a snug enclosure/a socket | factual
I poked his eye out (with a stick.)
Subject=I, Object=3rd person | from the mouth/interior of a person, working ingressively, acting on the FIGURE | for a small shiny spherical object to move | all about, here and there, back and forth | factual
I rolled the round candy around in my mouth.

Of course, people are people; after explanation, the similarities are easy enough to make out, and presumably given enough time anyone can learn to look at a situation and decompose it into elements like these, rather than the elements that "leap out" at an English speaker. In fact, I suspect that having to learn to see things the way the people you talk to do is one of the subtler drivers behind contact-induced language change. But cases like this provoke thought: just how much can the attributes of a situation most relevant to formulating a sentence vary from language to language?

Eastern Berber vocabularies on Google Books
Some digitised Eastern Berber vocabularies from around the first half of the 19th century for your perusal, if you're into that sort of thing. I was particularly impressed to find a Sokna vocabulary - I haven't yet read any other source on that language, though admittedly I haven't looked that hard.

* Lyon's vocabulary of the Berber of Sokna, from 1820
* Hornemann's vocabulary of Siwi, from 1798 (at my homepage)
* Minutoli's vocabulary of Siwi, from 1826
* Caillaud's vocabulary of Siwi, from 1827
* Koenig's vocabulary of Siwi, from 1839 (lots of other vocabularies in here - Somali, for example, and Nubian and even Fur)

09 May 2009 @ 06:21 am

Some Zenaga (Mauritanian Berber) words
Zenaga is the barely surviving Berber language of southwestern Mauritania around Boutilimit. Here are a few words I think are found only in Zenaga (and in some cases Tetserret.) Unfortunately, I haven't found any really comprehensive dictionaries of (for example) Tashelhit, so I could well be wrong. If I am (as I was with agwəḍ), I'd love to hear it!

  • ämkän "young herd animal (eg sheep, goat)"
  • ārwiy "scorpion" (< *arwəl)
  • täygaḌ "young she-goat" (< *talgaḍ)
  • agaḏ̣iy "Moor, bidani (white man)"
  • tašanḍuḍ "mirror" [transcription to be fixed later]
  • taʔgaṛḏ̣aS "paper". (Other varieties have similar forms, but without any final s.)
  • tämärwuS "bride" (Ahaggar Tuareg has rwəs "to be in rut" - obviously related, but not quite the same sense!)

28 April 2009 @ 05:02 am

French among Algeria's elite
The key issue in Algerian linguistic politics - substantially overshadowing the question of the role of Berber - is what should be the language of bureaucracy and education: Standard Arabic (the official language, and the primary pre-colonial language of literacy for all Algeria) or French (the colonial language, and hence ironically the language which most of the few educated Algerians at independence had studied in.) In practice, the country has settled on the one setup most certain to minimise social mobility: Standard Arabic is the primary language of education and symbolism, and French of bureaucracy and social climbing. On top of that, the language of everyday life is Algerian Arabic or Berber, from either of which reaching fluency even in Standard Arabic, let alone in the far more distant French, is an uphill struggle.

I recently came across a very illustrative quote from a survey specifically focusing on minor political actors in Algeria - party cadres, journalists, bureaucrats, businessmen, trade unionists, etc:
"To a limited extent, the only space open to [political] actors with little or no knowledge of French were independent unions, independent NGOs, the Arabic press and Islamist parties. This tendency was illustrated by the fact that third-generation elites barely speaking French - only one out of ten interviewees - came from one of these domains. Most other interviewees were either Francophone or bilingual, the latter having difficulties determining which language they considered to be their mother tongue [a footnote suggests she means "primary language"]. The same interviewee often gave different answers depending on whether he filled in this author's questionnaire prior to the interview, or whether he was asked in the course of an interview what language he felt most comfortable speaking and writing. A huge majority of the third-generation interviewees according to their own assessment were better with written French than Standard Arabic. As far as oral skills went, a third of the interviewees said they spoke Standard Arabic as well as or better than French. Over half the interviewees put their oral French skills at the same level as their command of Algerian Arabic or Kabyle Berber dialect, and one out of ten claimed to speak French better than anything else." (Isabelle Werenfels, Managing Instability in Algeria, pp. 85-6)
This kind of situation is a recipe for resentment. The government has spent years educating people to be better at Standard Arabic and telling them that it was everyone's duty to use it rather than French; but unfortunately their passion for reform, after creating legions of eager Standard Arabic-using job-seekers, stopped at the gates of the Civil Service. Check out Algerian government websites sometime - many of them don't so much as have Arabic versions (eg Energy, Health, CNRC), and most default to French.

As always, I think language skills should be a barrier only when they're necessary in themselves, not merely as a badge of class membership (and regionalism - people from Algiers or Kabylie are enormously more likely to speak good French than people from, say, the Sahara.) I'd certainly prefer Standard Arabic to French - it's much more like Algerian Arabic than French is, and more a part of Algeria's identity - but in the long run it would be better to create a situation where people could use their own mother tongue for official purposes.

24 April 2009 @ 03:41 am

Healed by the right words
We all know that placebos can be surprisingly effective. But - though it's not exactly surprising - I hadn't realised that there is experimental evidence that simply saying the right thing can have a curative effect.

"Two hundred patients with abnormal symptoms, but no signs of any concrete medical diagnosis, were divided randomly into two groups. The patients in one group were told 'I cannot be certain what is the matter with you', and two weeks later only 39% were better; the other group were given a firm diagnosis, with no messing about, and confidently told they would be better within a few weeks. 64% of that group got better in two weeks." (Bad Science, p. 75, citing Thomas 1987)

I can imagine a lot of factors that could affect the effectiveness of the doctor's words here - mainly anthropological, but some of them would certainly fall within the domain of linguistics. For example, the intonation pattern will affect the patient's perception of the doctor's confidence; does that affect the efficacy? Likewise, the accent and the choice of vocabulary could both affect comprehension and perceived competence, and hence presumably the efficacy. Not really my field, but it could be a line of research with unusually clear-cut potential benefits. The obvious problem with this example is that it involves doctors lying to patients, but if the effect could be reproduced without that it would certainly be worth doing.

Thomas KB. General practice consultations: is there any point in being positive? BMJ (Clin Res ed) (9 May 1987); 294 (6581): 1200-2.

"Political complexity predicts the spread of ethnolinguistic groups"
An interesting paper: Political complexity predicts the spread of ethnolinguistic groups. Two basically unsurprising claims that it's good to have calculations supporting: "pastoralists were found to have larger language areas than agriculturalists" and "languages associated with more politically complex societies cover significantly larger areas than those of less complex societies". The authors also present arguments that "although regions of high biological and cultural diversity do overlap to a striking degree, it is unlikely that biological diversity has any direct effect on cultural diversity on a global scale." Surprisingly, mountainousness was found to correlate with larger language areas, not smaller ones - that seems a little suspicious, though some mountainous areas are pretty un-diverse. Flaws: well, it relies on Ethnologue data and GMI maps, both of which are often unreliable, and systematically more splittist in some areas than in others; but it's not obvious that that would substantially affect the result. Also, ethnic groups, languages, and political units very often don't match up, and their measure of political complexity is based on data for ethnic groups rather than for languages.

(Via GNXP.)

18 April 2009 @ 05:43 am

How many words are there in a language?
In a recent discussion, the question came up of whether a language's vocabulary could be tallied (briefly addressed at Language Log a while back, and at FEL.) I have no firm answer to that (and it's logically independent of whether or not you can estimate the proportion of the vocabulary coming from a given language - that's a sampling problem.) But, notwithstanding the bizarre if occasionally entertaining acrimony of that discussion, it's actually a rather interesting question.

Clearly, any given speaker of a language - and hence any finite set of speakers - can know only a finite number of morphemes, even if you include proper names, nonce borrowings, etc. ("Words" is a different matter - if you choose to define compounds as words, some languages in principle have productive systems defining potentially infinitely many words. The technical vocabulary of chemists in English is one such case, if I recall rightly.) Equally clearly, it's practically impossible to be sure that you've enumerated all the morphemes known by even a single speaker, let alone a whole community; even if you trust (say) the OED to have done that for some subset of English speakers (which you probably shouldn't), you're certainly not likely to find any dictionary that comprehensive for most languages. Does that mean you can't count them?

Not necessarily. You don't always have to enumerate things to estimate how many of them there are, any more than a biologist has to count every single earthworm to come up with an earthworm population estimate. Here's one quick and dirty method off the top of my head (obviously indebted to Mandelbrot's discussion of coastline measurement):
  • Get a nice big corpus representative of the speech community in question. ("Representative" is a difficult problem right there, but let's assume for the sake of argument that it can be done.)
  • Find the lexicon size required to account for the 1st page, then the first 2 pages, then the first 3, and so on.
  • Graph the lexicon size for the first n pages against n.
  • Find a model that fits the observed distribution.
  • See what limit, if any, this model predicts for the lexicon size as n tends to infinity.
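
For anyone who wants to play with this, the steps above can be sketched roughly as follows. This is only an illustration under some loud assumptions: it counts whitespace-and-punctuation-delimited word forms rather than morphemes, it uses fixed-size chunks of tokens as a stand-in for "pages", and it fits one particular model (a power law, V = K * n^beta, fitted by least squares on the log-log points). The function names growth_curve and fit_power_law are made up for this example.

```python
import math
import re


def growth_curve(text, chunk_size=100):
    """Return [(tokens_seen, distinct_forms_seen), ...], sampled every
    chunk_size tokens and once more at the end of the text.

    Assumption: lower-cased word forms stand in for morphemes, and
    chunk_size tokens stand in for one "page".
    """
    tokens = re.findall(r"\w+", text.lower())
    seen = set()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        if i % chunk_size == 0 or i == len(tokens):
            curve.append((i, len(seen)))
    return curve


def fit_power_law(curve):
    """Fit V = K * n**beta by ordinary least squares on (log n, log V).

    Needs at least two points with different n. Returns (K, beta).
    """
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    K = math.exp(mean_y - beta * mean_x)
    return K, beta
```

Given a corpus as one long string, fit_power_law(growth_curve(corpus)) returns the fitted K and beta; whether a power law is actually the right model for a given corpus is exactly the thing you would need to check against the graph.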

A bit of Googling reveals that this rather simplistic idea is not original. On p. 20 of An Introduction to Lexical Statistics, you can see just such a graph. An article behind a pay wall (Fan 2006) has an abstract indicating that for large enough corpora you get a power law.

But if it's a power law, then (since the power obviously has to be positive) that would predict no limit as n tends to infinity. How can that be, if, for the reasons discussed above, the lexicon of any finite group of speakers must be finite? My first reaction was that that would mean the model must be inapplicable for sufficiently large corpus sizes. But actually, it doesn't imply that necessarily: any finite group of speakers can also only generate a finite corpus. If the lexicon size tends to infinity as the corpus size does, then that just means your model predicts that, if they could talk for infinitely long, your speaker community would eventually make up infinitely many new morphemes - which might in some sense be a true counterfactual, but wouldn't help you estimate what the speakers actually know at any given time. In that case, we're back to the drawing board: you could substitute in a corpus size corresponding to the estimated number of morphemes that all speakers in a given generation would use in their lifetimes, but you're not going to be able to estimate that with much precision.
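
To spell out the arithmetic, suppose (as an assumption for illustration, not necessarily the exact form in Fan 2006) that the fitted curve is V(n) = K n^β, where V is the lexicon size after n units of corpus and K, β and N_max are my own labels. Then:

```latex
V(n) = K\,n^{\beta}, \quad K > 0,\ \beta > 0
\;\Longrightarrow\; \lim_{n \to \infty} V(n) = \infty,
\qquad \text{whereas} \qquad
V(N_{\max}) = K\,N_{\max}^{\beta} < \infty
\ \text{ for any finite corpus size } N_{\max}.
```

So the model gives no finite limit, but it does give a finite figure for any finite corpus size you care to plug in; the difficulty is only in choosing N_max.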

The main application for a lexicon size estimate - let's face it - is for language chauvinists to be able to boast about how "ours is bigger than yours". Does this result dash their hopes? Not necessarily! If the vocabulary growth curve for Language A turns out to increase faster with corpus size than the vocabulary growth curve for Language B, then for any large enough comparable pair of samples, the Language A sample will normally have a bigger vocabulary than the Language B one, and speakers of Language A can assuage their insecurities with the knowledge that, in this sense, Language A's vocabulary is larger than Language B's, even if no finite estimate is available for either of them. Of course, the number of morphemes in a language says nothing about its expressive power anyway - a language with a separate morpheme for "not to know", like ancient Egyptian, has a morpheme for which English has no equivalent morpheme, but that doesn't let it express anything English can't - but that's a separate issue.

OK, that's enough musing for tonight. Over to you, if you like this sort of thing.

