Saturday, November 30, 2013

Ethnologue and the cross-border languages of Africa

Two content features seem to me to detract from the overall quality of Ethnologue - which is the indispensable reference on all the world's languages. One is that pages on cross-border languages - a prominent sociolinguistic feature in Africa - are titled as the language of only one of the countries where they are spoken. The other is that in the current online version, the country pages list "national language," which is a problematic heading, given diverse use of the term, notably in a number of African countries. In this post I will deal with the first of these two items.

A continent traversed by cross-border languages

Due to the way borders in Africa were drawn, a great many of its ethno-linguistic groups are split among two or more of the modern African countries. The languages spoken by such groups can today be called "cross-border languages," and in fact this is the term used by the African Academy of Languages for some of the work it is doing, notably the Vehicular Cross-Border Language Commissions.

Languages that are spoken in more than one country - whether as a first language, which is often the case, or as a vehicular language, which is also frequent - have been a concern of language planning in Africa since independence. The various conferences on African languages that I have cited in some previous posts reflect this concern. Cross-border languages were also highlighted by former Malian president Alpha Oumar Konaré, who compared them to "sutures" uniting African countries.*

"A language of" one country, more than one country, or a region?

When one looks up one of these cross-border languages in Ethnologue, however, it is as a rule listed as "a language of" a single country. It bears noting that when borders divide a language community, that community is rarely split in equal parts, so it appears that Ethnologue usually assigns the language to the country where there are more speakers. Other countries, regardless of their significance in the use of the language, are placed under "Also Spoken In:..."

So, for instance, Hausa is titled simply "... a language of Nigeria." It is the first language of 18.5 million Nigerians (according to Ethnologue, based on a 1991 SIL estimate), but also of about half the much smaller population of Niger, and it is used as a lingua franca across large parts of West Africa. Is Hausa any less a language of Niger, given that the historic home of most Hausas (sometimes called Hausaland) extends well into the latter country? Would "A language of Nigeria and Niger" be better? Or, given the number of other countries where it is "also spoken," maybe Hausa is really "A language of West Africa"?

Another example is the Ewe language, spoken by a population split between southeastern Ghana and southern Togo, which is listed as "... a language of Ghana." Similar to the case with Hausa in Nigeria and Niger, Ewe is spoken by more people in Ghana, but by a larger percentage of the population in Togo. So why not "A language of Ghana and Togo"? The reverse is noted in the case of Southern Sotho, which has more speakers in South Africa, but a higher percentage of population speaking it in Lesotho (it has legal status in both countries) - and is listed as "A language of Lesotho."

Examples abound. Among them, the major regional language Swahili is listed as a language of Tanzania (see also the discussion of macrolanguages, below).

Ultimately it seems (1) misleading to title pages on cross-border languages as languages of one particular country, and (2) inconsistent in how this is done. Going back to Ethnologue's "Plan of the Site" page, one finds mention of counting "each language only once as belonging to its country of origin" - but what if the area of origin of a language (to the extent one can determine that with any precision) is divided by borders?

Would it not be possible to develop a simple set of criteria by which cross-border languages would be given titles based on the extent (countries) of their major use?
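To make the idea of "a simple set of criteria" concrete, here is a minimal sketch in Python of one way such a rule might look. It is purely illustrative and is not anything Ethnologue uses: the names `CountryUse` and `page_title`, the 25% threshold, the limit of three named countries, and the placeholder countries and figures are all assumptions made for the example. The rule names the country with the most speakers, adds any country where a large share of the population speaks the language, and falls back to a regional title when the list gets long.

```python
from dataclasses import dataclass


@dataclass
class CountryUse:
    country: str
    speakers: int            # speakers of the language in this country
    population_share: float  # fraction of the country's population speaking it


def page_title(uses, region, min_share=0.25, max_named=3):
    """Build an 'A language of ...' title from per-country usage figures.

    min_share and max_named are hypothetical thresholds chosen for illustration.
    """
    if not uses:
        raise ValueError("at least one country is required")
    # Most speakers first, so the largest community leads the title.
    ranked = sorted(uses, key=lambda u: u.speakers, reverse=True)
    # Keep the top country, plus any country where the language is widely spoken.
    major = [u.country for i, u in enumerate(ranked)
             if i == 0 or u.population_share >= min_share]
    if len(major) > max_named:
        return f"A language of {region}"
    if len(major) == 1:
        return f"A language of {major[0]}"
    return "A language of " + ", ".join(major[:-1]) + " and " + major[-1]


# Placeholder figures, not real statistics.
example = [
    CountryUse("Country A", 1000, 0.10),  # most speakers, small share
    CountryUse("Country B", 400, 0.50),   # fewer speakers, large share
    CountryUse("Country C", 50, 0.01),    # marginal use
]
print(page_title(example, "Region X"))
# A language of Country A and Country B
```

The point of using two measures is the pattern noted above for Hausa, Ewe, and Southern Sotho: the country with the most speakers is not always the country where the language carries the most weight in the national population.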

Cross-border macrolanguages

The category of macrolanguage - defined as "multiple, closely related individual languages that are deemed in some usage contexts to be a single language" - takes this issue to another level. Although defined on linguistic criteria, macrolanguages in Africa are even more likely to cross borders, often many borders. There are 14 African macrolanguages by my count, with many of those being cross-border and some really looking like regional languages. Yet all of those are listed as being of one country or another:
  • Arabic, "A macrolanguage of Saudi Arabia" (spoken in many countries, including at least 9 in Northern Africa)
  • Dinka, "A macrolanguage of South Sudan" (spoken mainly in South Sudan)
  • Fulah, "A macrolanguage of Senegal" (spoken in well over a dozen countries, mainly in West Africa)
  • Gbaya, "A macrolanguage of Central African Republic" (spoken in CAR and Cameroon)
  • Grebo, "A macrolanguage of Liberia" (spoken in Liberia and Ivory Coast)
  • Kalenjin, "A macrolanguage of Kenya" (spoken mainly in Kenya, and also in Uganda and Tanzania)
  • Kanuri, "A macrolanguage of Nigeria" (spoken in 5 countries of West and Central Africa)
  • Kongo, "A macrolanguage of Democratic Republic of Congo" (spoken in DRC, Angola, and Congo)
  • Kpelle, "A macrolanguage of Liberia" (spoken in Liberia and Guinea)
  • Malagasy, "A macrolanguage of Madagascar" (spoken mainly in Madagascar)
  • Mandingo, "A macrolanguage of Guinea" (spoken in 7 countries of West Africa)
  • Oromo, "A macrolanguage of Ethiopia" (spoken in Ethiopia, Kenya, and Somalia)
  • Swahili, "A macrolanguage of Tanzania" (spoken in at least 9 countries, mainly in East Africa, some of whose governments have accorded it legal status)
  • (Akan is described as "A language of Ghana" in Ethnologue, and is also a macrolanguage in ISO 639-3. Either way it is considered to include Fanti and Twi, and is spoken mainly in Ghana.)
Here again, would it not be possible to adjust certain titles to more accurately convey the range of use? For instance, Fulah as "A macrolanguage of West Africa" and Swahili as "A macrolanguage of East Africa." The macrolanguage items may be easier to modify on a case-by-case basis, as there are fewer of them than languages, and their respective circumstances are somewhat unique. The language entries, on the other hand, might, as suggested above, need a set of criteria to avoid case-by-case discussion.

Final thoughts

These observations and suggestions are made in the spirit of helping improve the Ethnologue resource, with a mind particularly to the kind of information that people new to the study of the languages of Africa would take away from their initial encounter with it. Cross-border languages exist in all world regions, of course, but perhaps in none more than Africa, where borders were never intended to respect the integrity of ethno-linguistic groups. This category of languages seems to me to merit attention and appropriate revision in how it is presented.

* "Les langues nationales transfrontalières doivent être non pas des points limitrophes, des points de démarcation, mais des points de suture entre nos pays."

1 comment:

Don said...

Ethnologue editor Dr. Paul Lewis of SIL responded to this issue in reply to a post on H-Africa in March 2014. Full text is at

Excerpt follows, with my comment below:

"...we continue to be required to identify each language as being 'A language of...' some geopolitical entity simply as a means of organizing the data and making it searchable. We may be able to find a way to more clearly identify and highlight the other countries in which each language can be found and we certainly explore the possibilities. The obstacles are primarily database reporting issues and I have a lot of confidence that our database wizards can find a way. I want to make it clear, however, that our intent is not at all to obscure the locations of languages nor to deny the existence of widespread languages across borders and entire regions. In this next update, we are providing a significant amount of new information about widespread languages that serve as languages of wider communication and about the use of languages as second languages with expanded descriptions of second-language use (Also use... Also used by...). We want to provide as much information of this sort as we can and in a way that is maximally useful to all users of the Ethnologue in all parts of the world."

I appreciate the follow-up and explanation. However, though I'm not an expert on databases and appreciate the challenges Ethnologue faces, I still am puzzled by the idea that a database could only allow a language to be associated with one country.
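For readers wondering what associating a language with more than one country can look like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. It says nothing about how Ethnologue's own database is actually built. The table names (`language`, `country`, `language_country`) and the helper functions `link` and `countries_of` are hypothetical, chosen only to illustrate a standard many-to-many design in which a separate linking table holds one row per language-country pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE country  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    -- Linking table: one row per (language, country) pair.
    CREATE TABLE language_country (
        language_id INTEGER NOT NULL REFERENCES language(id),
        country_id  INTEGER NOT NULL REFERENCES country(id),
        PRIMARY KEY (language_id, country_id)
    );
""")


def link(language, country):
    """Record that a language is spoken in a country (hypothetical helper)."""
    conn.execute("INSERT OR IGNORE INTO language (name) VALUES (?)", (language,))
    conn.execute("INSERT OR IGNORE INTO country (name) VALUES (?)", (country,))
    conn.execute(
        """INSERT OR IGNORE INTO language_country
           SELECT l.id, c.id FROM language l, country c
           WHERE l.name = ? AND c.name = ?""",
        (language, country),
    )


def countries_of(language):
    """Return every country linked to a language, in alphabetical order."""
    rows = conn.execute(
        """SELECT c.name FROM country c
           JOIN language_country lc ON lc.country_id = c.id
           JOIN language l ON l.id = lc.language_id
           WHERE l.name = ? ORDER BY c.name""",
        (language,),
    )
    return [name for (name,) in rows]


# Pairs taken from the examples in the post.
for lang, country in [("Hausa", "Nigeria"), ("Hausa", "Niger"),
                      ("Ewe", "Ghana"), ("Ewe", "Togo")]:
    link(lang, country)

print(countries_of("Hausa"))  # ['Niger', 'Nigeria']
print(countries_of("Ewe"))    # ['Ghana', 'Togo']
```

In a design like this, a single "primary" country could still be kept as an extra column for sorting or searching, without preventing the title from naming the other countries.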