Re the sweet-potato: thanks a lot for all your points, Pacal!  First: no, I have not been influenced by Thor Heyerdahl!  I was noting that in a few instances, including this one, where the words are of intermediate length and where there is some non-linguistic evidence, there might be a respectable (not certain) case for a diffusionist account of shared word-forms.  But yes, I should probably have said ‘Mesoamerican’ rather than ‘South American’ in respect of the origin of the sweet-potato (although opinions still seem to vary and I am not myself expert in such matters).  And I grant fully that most Central/South American words for the vegetable do not resemble Polynesian (k)umara (I never intended to be read as suggesting otherwise).  Re the word-forms which do appear partly shared: if this is not a coincidence (and I did and do not rule out this possibility; I said only that the hypothesis of an actual link was not implausible), it is conceivable that the plant itself diffused westwards from the Americas while the word-form later diffused eastwards from Polynesia (though other things being equal this is obviously a more complicated hypothesis than that of a single diffusion of word-and-thing).  But if this really is one of the ‘best’ cases for the hyper-diffusionists, their overall position is not promising – as Pacal and I would obviously agree.

I take on board Kenneth G’s point that in many cases it is not clear that anybody actually gets hurt by a strange belief about language.  But I would argue, with Pacal, that there are cases where harm of various kinds can arise from such beliefs (reinforcement of racism, religious fundamentalism or extreme nationalism; reliance upon ineffective therapies; etc.).  And of course I would hold, as a ‘modernist’, that (other things being equal) it is better to adhere to (probably) true ideas than to false ones.

The comment about amateur attempts to find ‘cognates’ at the phoneme level which I mentioned last time was made by Ken, not by Kenneth G (sorry for the mix-up!).

To return to my theme:

There are several especially interesting sub-sets of ‘fringe’ historical linguistic claims of the type I’ve been discussing.  The first of these sub-sets involves the alleged mutual intelligibility of languages generally believed to have had no common ancestor in historic times and no significant pre-modern contact.  For example, Cyclone Covey and Ethel Stewart have claimed that Uighur (Turkestan) and Navajo and the Na-Dene languages generally (USA) are (or recently were) mutually intelligible, explaining this in terms of some Asian groups having migrated to the Americas much more recently than is normally supposed.  Other such claims involve Basque or Gaelic being mutually intelligible with languages of the Americas, Crimean Tatar being mutually intelligible with Latvian, etc., etc.  In a broadly similar vein, Gavin Menzies claims that Chinese is widely spoken or at least understood in unexpected places such as Peru, because of the alleged circumnavigation and exploration of the globe by 15th-Century Chinese fleets.  But in no such case is there adequate evidence of genuine mutual intelligibility.

A related sub-set of cases is illustrated by Bruria Bergman’s claim that a temple chant used in Japan (with a trite and irrelevant meaning in Japanese, as is typical of such chants) is in fact in distorted Hebrew.  By way of a spoof, however, the chant can be interpreted much more easily as, for instance, late Latin introduced by ‘Dark-Age’ Christian missionaries who are known to have been active in neighbouring parts of China (I myself was able to invent such a reading in twenty minutes).

There are two further special, overlapping groups of claims about: a) the deliberate or semi-deliberate conspiratorial concoction of languages or language data out of other known languages or reconstructed (or invented) languages (often by churches and other bodies with an alleged interest in deceiving humanity); and b) ancestor languages of a specific type involving very short words.  For cases of sub-type a), the relevant statistical considerations are much more difficult, since these considerations assume normal unplanned change.  These particular theories, although they are typically both implausible and indemonstrable, are thus almost immune to effective disproof along these lines.  However unsystematic and/or otherwise implausible a set of changes might be, it could occur if it was deliberately planned as part of a project of language concoction.

Examples of this type of claim include those of Polat Kaya (almost all words of almost all languages are really Turkish, deliberately corrupted and in this case often ‘anagrammatized’ so as to conceal their origin), Ior Bock (the source language is Finnish Swedish), Isaac Mozeson and others (Hebrew), Edo Nyland (Basque), P.N. Oak and his followers (Sanskrit), Michal Tsarion (Irish Gaelic, seen as a ‘secondary Ursprache’ arising after the fall of Atlantis), etc.  As before, the language identified as source is usually one favoured by the author, typically his own language or its ancestor.

By way of an additional feature (b), some cases of this special type, and a few of the otherwise ‘normal’ type as discussed above, involve the re-analysis of known linguistic forms (especially of the alleged ancestral forms) into sequences of monophonemic (single-phoneme), monosyllabic or other very short morphemes (meaningful word-parts) or words.   For example, some of those who assert that Hebrew is close to the Ursprache, and that the ultimate origins of Hebrew and Hebrew-derived words have been concealed, also propose monophonemic or other very short morphemes for early Hebrew.

Another such claim is that of John J. White, who uses the usual amateur philological and etymological methods to trace all languages back to an Ursprache called ‘Earth Mother Sacred Language’ (EMSL).  This alleged language had only very short morphemes.  White proposes fifteen basic morphemes; two of these are monophonemic (the vowels a and u), nine are monosyllabic (eight of these have the form Consonant-Vowel, the ninth is en) and the remaining four are disyllabic (Consonant-Vowel-Consonant-Vowel).

Each of EMSL’s very short morphemes has variant phonological forms (allomorphs).  One morpheme, de (basic sense ‘the’) has as many as nine variant initial consonants and various variant vowels.  The total number of ‘morphophonemic’ forms (word-shapes) is thus substantial; but all forms are allomorphs of the basic fifteen morphemes and derive their meanings from these.  For instance, za is a variant of de and therefore retains its sense (‘the’).  The supposed existence of so many very short and often widely differing forms with shared meanings obviously increases the freedom of the inventor enormously.  In particular, the acceptance of many variant vowels in each morpheme generates a theory in which the vowels, as White acknowledges, are often irrelevant.

In addition, in EMSL the ordering of these short morphemes is itself said not to be significant; a given sequence of morphemes will normally have the same meaning regardless of their linear order as spoken.  Thus both ty-re and ra-za – the same two morphemes, in opposite orders and also in variant forms – mean ‘the earth’.  This stipulation obviously increases the freedom of the inventor even further.
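The growth in the inventor’s freedom can be made concrete with a rough count.  The sketch below is mine, not White’s: it simply multiplies the number of variant forms allowed for each morpheme and then multiplies again by the number of possible orderings.  The figures used (five variant vowels for de, ten variant forms for the second morpheme) are assumptions chosen purely for illustration; only the nine variant initial consonants come from White’s account.  The result is an upper bound, since some variant strings might coincide.

```python
from math import factorial


def max_surface_forms(allomorph_counts):
    """Upper bound on the number of surface strings sharing one meaning.

    allomorph_counts: number of variant forms allowed for each morpheme
    in the sequence. Order is treated as free, so every permutation of
    the morphemes counts as the same meaning.
    """
    total = 1
    for count in allomorph_counts:
        total *= count
    return total * factorial(len(allomorph_counts))


# 'de': nine variant initial consonants (from the text) times an
# ASSUMED five variant vowels gives 45 forms.
DE_FORMS = 9 * 5
# ASSUMED ten variant forms for a second morpheme such as 'earth'.
SECOND_FORMS = 10

two_morpheme_bound = max_surface_forms([DE_FORMS, SECOND_FORMS])
print(two_morpheme_bound)
```

On these assumed figures, up to 900 different two-morpheme strings could all be read as ‘the earth’, which gives some idea of how easily a match can be found for almost any short word.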

Some of the EMSL morphemes are transparently similar to short words with the same meaning in known languages; for example, ge (‘earth’) is very similar to the equivalent Ancient Greek form.  Others, such as ni (‘people’), are less obviously linked with known forms from familiar later languages and are more clearly the result of White’s very loose application of the comparative method.

White’s use of cross-linguistic data is typically naive.  For example, he deduces from the use of suffixed definite articles (the equivalents of ‘the’) in Albanian that this pattern may be of long standing in European languages generally (it is not), and he therefore frequently interprets final -de and its equivalents, in various languages, as a definite article; this includes the endings of English earth and the closely related German equivalent Erde.  He also interprets initial d(e)- in a similar way, for example treating de- in Latin deus (‘god’) and dea (‘goddess’) as an article – even though Latin has no definite article, meaning that these words, at least in their Latin guises, contain no material with that meaning.  In all such cases, White is of course implicitly rejecting known or very likely etymologies for the words in question.

More generally: it is possible for a few morphemes in a language to be very short, and indeed some may consist of only one phoneme.  These are usually grammatical morphemes/words rather than ‘lexical’ morphemes (normal dictionary words and word-roots); grammatical morphemes more generally are often especially short.  Most monophonemic morphemes involve vowel phonemes, as in English /ə/ = a (the indefinite article, as in a book), but this may also arise with certain types of consonantal phoneme, as in the case of Russian v, meaning ‘in’ or ‘to’.  (When speaking, linguists refer to the phone [ə] or the English phoneme /ə/ as schwa, from a Hebrew word.)

However, this pattern can apply to only a very few morphemes of any given language; and these monophonemic morphemes are still morphemes in precisely the same way as are other morphemes which consist of two or more phonemes.  Thus, English a contrasts grammatically and semantically with the diphonemic (two-phoneme) English definite article the, and is identified as a morpheme by the same methods of grammatical analysis.  (It itself also has a second, diphonemic allomorph: an before an initial vowel, as in an apple.)  Such morphemes are simply morphemes which happen to be monophonemic.  Where the phoneme /ə/ occurs in English words other than the indefinite article, for instance as the initial a- in around, it does not represent the indefinite article or its meaning.

And there are solid general linguistic reasons why the morphemes of a language cannot be predominantly monophonemic.  For instance, if most or all morphemes were monophonemic, the result would be a truly vast amount of homophony: different, unrelated morphemes/words with identical pronunciations.  Such a language would be unusable.  Even White’s system yields huge numbers of homophones.  (And even languages with predominantly monosyllabic morphemes and limited inventories of possible syllables – such as Chinese of various types – have had to develop special systems in order to have vocabularies of adequate sizes.)
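The homophony argument is essentially arithmetical, and a minimal sketch may help.  The inventory size (40 phonemes) and vocabulary size (10,000 meanings) below are round figures assumed for illustration, not data from any particular language; the calculation also ignores the fact that real languages permit only some phoneme sequences, so the figures for longer forms are optimistic.

```python
def possible_forms(num_phonemes, length):
    """Number of distinct phoneme strings of exactly the given length."""
    return num_phonemes ** length


def mean_homophones(num_meanings, num_forms):
    """Average number of meanings that must share each available form."""
    return num_meanings / num_forms


PHONEMES = 40      # ASSUMED inventory size, for illustration only
MEANINGS = 10_000  # ASSUMED vocabulary size, for illustration only

# If every morpheme is a single phoneme, there are only 40 forms.
mono = mean_homophones(MEANINGS, possible_forms(PHONEMES, 1))
# Allowing two-phoneme morphemes already gives 1,600 forms.
di = mean_homophones(MEANINGS, possible_forms(PHONEMES, 2))

print(mono)  # 250.0
print(di)    # 6.25
```

On these assumptions each single-phoneme form would have to carry some 250 unrelated meanings on average, against about six for two-phoneme forms; adding a third phoneme removes the problem altogether.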

Claims of this kind, like claims regarding the ‘concoction’ of languages, are highly implausible but very difficult to refute.  Very many longer words of known languages will contain each given sound, and it is not difficult to concoct accounts deriving the meanings of these words from those allegedly associated with each sound making up the longer, known word (especially where their order is said not to be significant, as in EMSL).  Where the etymologies do lend themselves to serious examination, the derivations are typically far-fetched and naturally in conflict with those generally accepted.

More next time on a particularly sensationalistic claim of this last type; then more special types of ‘fringe’ claim in this area.




  1. Navajo and the Na-Dene languages generally (USA)

    Slight nitpick: The Na-Dene languages should probably be described as North American, rather than “USA”, as the largest territory of speakers is in arctic and sub-arctic parts of Canada, and the Apachean branches of Navajo extend into Mexico. (Here’s a cool map.)

    Actually, the Navajo/Na-Dene connection is kind of germane to the discussion, as it represented an early (Sapir 1915) triumph of mainstream linguistics in demonstrating an actual linguistic diffusion across a fairly huge amount of geography separated by a large swath of unrelated languages.

    I wouldn’t be surprised if the Navajo/Na-Dene connection is what inspired a lot of the EXTREEM-linguistics-to-the-MAX! of the hyperdiffusionists.

    • marknewbrook says:

      Thanks a lot, Ernest! I agree that the Na-Dene languages should probably be described as North American, rather than ‘USA’; my apologies for this slip! And yes, this has been an important area for mainstream as well as ‘fringe’ historical linguistics. Cheers! Mark N

  2. Pacal says:

    Thanks for your comments regarding the sweet potato. The main reason I brought it up is that the Quechua word that is similar to the Polynesian word kumara is confined to a small part of highland Ecuador. Secondly, the early Spanish accounts, lexicons etc. of Peru do not record the natives using this word for sweet potato. In fact, the first notice taken of this word seems to date from the mid to late 19th century.

    I frankly have no idea why the word is so similar to the Polynesian word. I will note that, beginning in the early 19th century and continuing to c. 1870, Peruvian ships kidnapped large numbers of Polynesians for forced labour, doing death-dealing work on the Guano islands and on shore in Peru.

    You’re right, though: if this is the best case linguistic diffusionists have, it is not very good. I know that in the last 20 years it has been definitely proven that the sweet potato diffused to Polynesia before Columbus, although just when is up for debate.

    I’m glad to read that you were not influenced by Thor Heyerdahl. Nothing that man said can be trusted. Everything he said must be checked.

    Regarding the mutual intelligibility test: it has been my experience that those asserting mutual intelligibility have almost always been people who don’t know the languages in question. Thus we get Barry Fell asserting that certain Indian languages were in fact Semitic! A classic case came in the 19th century, when various people asserted that the language of the Mandan Indians and Welsh were mutually intelligible!

    Speaking of absurdity, have you ever read some of the far-out pseudo-speculation about the fascinating click languages of Southern Africa?

    • marknewbrook says:

      Thanks again, Pacal! I have already mentioned claims regarding Welsh in the Americas and may return to this point; I also hope to discuss ‘clicks’. Mark

