Wednesday, August 31, 2016

Missing "macrolanguages" of Africa

Screenshot from VOA's Kinyarwanda/Kirundi site
The Voice of America (VOA) recently had a job opening for "International Broadcaster (Multimedia) (Kirundi/Kinyarwanda)." Kirundi and Kinyarwanda are the mother tongues, national languages, and co-official languages in, respectively, Burundi and Rwanda. And they are mutually intelligible, with only minor differences, such that apparently a fluent speaker of either could work on a program serving speakers of both. But there is no term covering both - unless one counts the hyphenated Rwanda-Rundi - and no language coding category to cover material designed for use across the two.

This is a situation encountered with many languages in Africa, and one for which there is at least one potential solution - the neologism and language coding category "macrolanguage." There are actually some macrolanguages defined in Africa, but these are few, and as I discuss below, kind of accidental. Is it time to systematically identify (and code) macrolanguages in Africa?

What defines a language?

For most of us, the distinction between languages seems pretty straightforward. But beyond the most spoken international languages - those used officially by the United Nations or ones you are likely to see on a school curriculum - the situation is often more complex. Sometimes two or more closely related languages are so similar that their speakers can understand each other, but sometimes variations within one language can make understanding difficult. An earlier posting on this blog looked at the notion of "neighbor languages" in Scandinavia and Africa. A broader consideration of these issues by Columbia University's John McWhorter suggests that we're really all speaking dialects, some of which benefit from written forms, and one might add, status, resources, and policy support. There is some truth to the saying that "A language is a dialect with an army and a navy."

However, the issues of what to call a "language" and where to draw the boundaries between it and another "language" are still of practical importance for communication (standardization, references, ICT use) and planning (government, business, education). There are two broad approaches in linguistics to doing this, corresponding with the splitter/lumper (or joiner) approaches to categorizing: one focusing more on distinctions, and the other focusing more on commonalities.

Without going too deeply into that discussion, which gets more complicated when accounting for issues of identity, names, written forms, and national boundaries, suffice it to say that in considering African languages, there are many situations where one encounters the splitter/lumper choice.

The major reference of languages in the world, Ethnologue, takes a more splitter approach, which means that speech varieties that are closely related and interintelligible may be classified as separate languages. It is their estimate of the number of languages in Africa (over 2000) that is most commonly cited, but there are other more conservative estimates. A good academic discussion of this issue entitled "How many languages are there in Africa?" was published in 2004 by Jouni Filip Maho (his estimate is under 1500).

What is a "macrolanguage"?

To make the story brief, the term "macrolanguage" is not a term that was used in linguistic description before the inauguration of the ISO 639-3 system for encoding all languages in the late 2000s. Since that system is based on Ethnologue's "splitter" data, a new category was needed to accommodate existing codes in the earlier less comprehensive parts of ISO 639 (1&2) that in many cases were more "lumper" in approach. The term macrolanguage was in effect a "shim," to borrow someone else's term, to fit the two systems together.

There are by my count 14 macrolanguages listed for Africa (names linked to the Ethnologue macrolanguage pages): Akan; Arabic; Dinka; Fulah; Gbaya; Grebo; Kalenjin; Kanuri; Kongo; Kpelle; Malagasy; Mandingo; Oromo; and Swahili. There could be others.

That brings us back to Kinyarwanda and Kirundi. How is the relationship between them different - more distant - than any of the above established macrolanguages? One difference, as mentioned above, is that there is no common name to make it easy, and another is that they are dominant in different countries - perhaps analogous to the situation of Scandinavian languages?

Another curious situation is that of Mandingo, which includes several western Manding languages, but not Bambara and Jula (Dyula). Even if the latter two were considered too different from the other Manding tongues, they are close enough that one could localize software for the two together. Keep in mind also that the emerging literary standard N'Ko covers all Manding languages (in a different alphabet). Should the Mandingo macrolanguage be extended to include them all?

The four languages of southwestern Uganda - Kiga, Nkore, Nyoro, and Tooro - are close enough to be covered by Runyakitara, a proposed (but not encoded) standard which is being used in various ways, including at least some teaching and a localization of the Google interface. Should these four be considered a macrolanguage under perhaps that same name, thus finally providing a code for localization in Runyakitara?

And there are other examples around the continent that could be discussed.

What good would more macrolanguages do?

The first benefit of identifying more macrolanguages would be in language coding - the very environment in which the term was first used. The language of VOA's website for its Kinyarwanda/Kirundi service is coded as "rw" (Kinyarwanda) since there is no macrolanguage code covering both languages. Likewise, in many cases, the grouping of very close and mutually intelligible languages as a macrolanguage could facilitate localization of software and apps to serve larger populations - and those larger markets could make it more likely that such localization would be pursued and maintained.
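
To make the coding point concrete, here is a minimal sketch in Python of how a macrolanguage entry lets software treat closely related languages as one audience. The codes "aka", "twi", "fat", "swa", "swh", "swc", "kin" and "run" are existing ISO 639-3 codes; the code "qrr" for a Rwanda-Rundi grouping is purely hypothetical (taken from the qaa-qtz range that ISO 639-2 reserves for local use), and the function names are my own illustration rather than part of any standard or library.

```python
# Illustrative lookup table, not an official registry.
# "qrr" is a HYPOTHETICAL private-use code; no Rwanda-Rundi
# macrolanguage is actually defined in ISO 639-3.
MACROLANGUAGES = {
    "aka": {"twi", "fat"},   # Akan: Twi, Fanti (existing macrolanguage)
    "swa": {"swh", "swc"},   # Swahili: Swahili (individual), Congo Swahili
    "qrr": {"kin", "run"},   # hypothetical grouping of Kinyarwanda and Kirundi
}


def macrolanguage_of(code):
    """Return the macrolanguage code whose members include `code`, or None."""
    for macro, members in MACROLANGUAGES.items():
        if code in members:
            return macro
    return None


def can_share_content(code_a, code_b):
    """True if the two codes are identical or belong to the same macrolanguage."""
    if code_a == code_b:
        return True
    macro_a = macrolanguage_of(code_a)
    return macro_a is not None and macro_a == macrolanguage_of(code_b)
```

With a table like this, material tagged for the grouping could be offered to users whose preference is either "kin" or "run", instead of being tagged for only one of the two. Whether any such code should exist is of course the question this post is raising, not something the sketch settles.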

Another benefit would be to complement the tendency in language coding towards seeking more granularity, by recognizing natural groupings of languages (for more on this, see a message to the IETF-languages list last May). In effect, this would provide more balance between splitting and lumping/joining.

In the broader picture, identifying macrolanguages could have benefits for policymaking and program development involving languages within macrolanguage groups, by calling attention to the closely related languages. Especially where foreigners are involved, projects may overlook such relationships and the potential resources they may provide. For example, materials development for education and various communication needs might benefit from tapping efforts and resources in closely related languages.

(Minor edits and image added, 2 Sep. 2016)


Anonymous said...

This is very interesting and is along the lines of the work done by Prof. Kwesi Kwaa Prah, who's working on harmonising dialects into mutually understandable languages.

He works at CASAS, South Africa, which has published much in this field, with standardised orthographies being published in cooperation with dialects or languages on the ground.

Some argue that rather than there being 1,500 languages in Africa, the continent is actually quite homogenised, with some 85% of Africans speaking 15 'languages'. What is called a 'language' in Africa would, in Italy or Germany or Wales, be called a dialect, but with the added complication of ethnic identity and colonial borders creating dislocation. (But I'm interested in learning more about this, not being an expert in any way!)

Essentially one could argue, for instance, that rather than there being 11 languages in SA there are 4 main ones - English, Afrikaans, Nguni (Zulu, Xhosa) and Tswana (Sotho). One could argue that the whole Nguni family from Xhosa to Ndebele in Zimbabwe is one language with many dialects, which, had the different dialects not been written down at different times by different imperial missionaries or educationalists, could have created one standard written language which would now be large and strong enough to counter the threat of English or French.

As a white Welshman I obviously don't know a quarter of enough about the situation. And, I'm guessing, it's more complicated than that. But, if African languages are to survive and flourish, it makes sense in my mind that they try to encompass or coalesce around one standard written and official form while recognising that people will continue to speak their own dialect in their own communities. If this doesn't happen, then the current (lack of) policy is creating a ragged quilt of poor, unconnected and unsupported dialects/languages which will lose strength and confidence among their speakers, which is what seems to have happened with Igbo in Nigeria.

Prof Prah's papers are very interesting:

Cymru | Wales


Don said...

Thank you for this comment and I appreciate your bringing up Prof. Prah's work. I actually cited it in African Languages in a Digital Age (in chapter 4, which also mentions macrolanguages). His position is very much a lumper/joiner one, which I'd put at the extreme opposite end of the spectrum from Ethnologue's splitter approach. From personal experience, as one who is neither African nor a formally trained linguist, and from communication with others who have worked with African languages, it is my impression that while Prof. Prah's count of 15 languages is important for highlighting the different ways one can count languages in Africa, and may have applied uses for some kinds of standardization and development of language technologies, it is also a simplification of the realities in the field. There are many instances where the distances between speech varieties are significant, and when combined with lack of formal education in those languages (since education systems often focus uniquely on Europhone languages), they can make communication difficult.

An anecdote: When in Niamey one neighbor was a Fula woman originally from Guinea Bissau, but she could not communicate with the Fula (Fulfulde) speakers of Niger (her native Pular being kind of an outlier in the range of varieties of that language). I on the other hand had studied Fulfulde in Mali (west of Niger) and adapted to Pular in Guinea (just south of Guinea Bissau) and was once able to interpret. This as a second language speaker, because I had spent time figuring out the grammar in Mali before we had a book on it, and working on vocabulary in both countries, etc. The limited amount or entire lack of formal education in African languages is a huge factor. I wonder if we'd have the same "babel" image of the continent's linguistic terrain if more resources and effort had been put into the African side of what we now call MTB-MLE beginning in the 1960s.

The cases of Nguni and Sotho/Tswana in South Africa are interesting. I've mentioned elsewhere more than once that translations in that country often go from English to one of the Nguni languages, and then from the latter to the other Nguni languages, and the same with Sotho/Tswana. So at least on that practical level, those similarities are used. I don't know enough about them to suggest whether one could propose say an Nguni macrolanguage.

Anonymous said...

Don: Fascinating, as always.

Here's another comment from a white Welshman, though this one lives in Indonesia.

In her plenary presentation during the 11th Language & Development Conference in New Delhi in November last year, Birgit Brock-Utne made this statement:

"... according to Kwesi Kwaa Prah, ... 90 per cent of the total population of sub-Saharan Africa can be grouped into 23 language clusters; in fact 12-15 such languages would suffice for 75 per cent to 85 per cent of the population (Prah 2005, 2009b)."

(I am currently editing the Proceedings of that conference, which should be available on the website later this year or early in 2017.)

So Prof Prah, quoted by Prof Brock-Utne, seems to be saying something somewhat more modulated than just '15 macrolanguages in Africa'.

Hywel Coleman

Anonymous said...

Diolch Hywel! Funny that this post about African language has been colonised by Welsh people.

I'm guessing Prof Prah's contention is that these dialect/language continua include no more difference than that between Classical Arabic, used across the Arabic world, and the various vernacular forms of Arabic. It's possible to have both and so strengthen the African languages and the mass of the African population.

Of course, this is my sketchy view. I look forward to reading the proceedings and would like to hear from people on the ground in Africa about how this is developing (if it is?). Maybe the force of French and English within the state polities is too strong?

I'd also like to know more about Sheng, the new 'language' spoken in Kenya - is it a language or a version of Swahili?


Don said...

Thanks Hywel, Siôn, for your comments. My understanding of the range of speech varieties included in some of Prof. Prah's 12-23 languages would be wider than that among varieties of colloquial Arabic, but that's not based on any research. The main benefit of his system that I see - as one who has been focused mainly on localization and use of languages in development communication & education - is in offering a broader view than the balkanized linguistic maps we are otherwise given for Africa.

The linguistic situations and trends in Africa are complex - if I had the time I'd be posting a lot more about them. As for the question re Sheng, others like Chege Githiora at SOAS, have written on that.

Hywel, am also looking forward to seeing the proceedings...

Anonymous said...


Thanks, one last question on this (broad) subject. I'm slightly baffled by what people mean by 'Bantu' languages. What does it actually refer to? From my European perspective is it:

1) the 'Indo-European' to many African languages. That is very disparate languages from Nguni being a 'Germanic' language to Kongo being a 'Celtic' language and Yoruba being a 'Romance' language?
2) or is Bantu the 'Latin' to many languages? So that the Nguni dialects are 'Italian'; Kongo is 'Spanish' and Yoruba is 'French'?

Hope this makes sense.


Don said...

Sion, a quick answer. Bantu has several meanings, but in the usual linguistic classification (sometimes called Narrow Bantu) it is maybe more analogous to Romance or Germanic or Slavic - though that could be misleading. It is in the Niger-Congo family; Indo-European is also a language family. The hierarchies below each are not necessarily comparable steps, as far as I understand.

But it is a fact that Bantu languages share many structures. And lower in the taxonomy there are similarities among languages in vocabulary. Nguni languages are very close - from what I understand, perhaps roughly analogous to Scandinavian languages, or Iberian languages.

Macrolanguages of course are well down the scale; I don't recall anyone suggesting that Nguni, for instance, would be a macrolanguage. On the other hand, Manding (which is under Mande within Niger-Congo) is partly classified as a macrolanguage (per the discussion in the post).

Hope that helps...

Anonymous said...

Don - thanks, so, basically (and knowing that no two situations are identical etc.)

Bantu languages = say Romance languages:

'Nguni' languages = 'Iberian languages'
'Sotho' languages = 'Italian'
other language groupings would 'equal' say 'French' 'Romanian'

So, speaking one Bantu language/dialect makes it very easy to learn others.