Friday, October 31, 2014

Translating Hope - Health Education in African Languages

In order to facilitate networking and resource sharing on ebola messaging in (West) African languages, a new online initiative is in the works, consisting of an email list, with a web-based platform for collaboration coming. Watch this space for more information.

Addendum (3 Nov. 2014)

The web-based platform is now up. It uses the BlackBoard system, and is hosted by the University of Alaska - Anchorage (UAA). This and the email list are intended to work in tandem.

I have the privilege of collaborating on this with Dr. Catherine Knott (UAA, Anthropology), who was a volunteer in Peace Corps/Mali at about the same time I was, and her colleagues at UAA.


The aims of Translating Hope (Health Education in African Languages) are described in the following (this list is posted on the BlackBoard site, and may be updated and changed there):

Understanding the importance of language in communication and learning, the necessity of good public education and health training for ebola in areas of West Africa affected by the ebola crisis, and the multilingual characteristics of societies in West Africa, this informal working group is established to facilitate communication across and about needs and initiatives for effective communication and learning in the languages of the region.

This includes, among other purposes:

  • networking across diverse organizations working on ebola and those able to provide assistance with African languages (specifically translation and composition)
  • facilitating access to existing information in African languages
  • facilitating use of African languages in messaging by organizations, especially international donors, technical agencies, and NGOs (including via composition and via translation)
  • facilitating review, storage, and reuse of materials on ebola in African languages, including standardized language names and use of approved orthographies
  • facilitating work with cross-border languages with goals of harmonizing messages and standardizing terminologies
  • considering best practices for use of technology to deliver content in African languages, from radio to mobile, and including links between technologies
  • applying best practices for culturally sensitive and community approved approaches to messaging in African languages
This group also:
  • recognizes the importance of the official and international languages of the region - English, French, and Portuguese - even as it focuses mainly on African languages
  • respects the prerogatives of African nations, organizations, communities, and individuals concerning the use of their languages
  • respects the role of the African Union's African Academy of Languages (ACALAN) in working on cross-border vehicular languages of the region 


If you can contribute to this effort, and/or if your work involves messaging on ebola in West Africa, you may join by subscribing to the Translating Hope email list. Subscribers can then be given access to the BlackBoard site.

Wednesday, October 29, 2014

Two issues in use of African languages in ebola messaging

A recent tweet highlights two issues about use of African languages in messaging about ebola in West Africa: nomenclature, the names and spellings of the languages; and re-use of material in one language variety (or dialect) in an area where another is spoken.
(NB- "FLOTRG" = First Lady of the Republic of Guinea.)

The reference is to two of the eleven language names for radio spots produced for the CDC (which have been mentioned previously on this blog). "Madingo" is actually a misspelling of "Mandingo," which may refer either to: (1) one of the Manding languages known to its speakers as Mandinka and spoken mainly in Senegambia; (2) the Mandingo "macrolanguage," a subgroup of Manding languages in the west of West Africa, including notably Mandinka, and Malinké (Maninkakan) of Guinea and neighboring countries; or (3) the entire range of Manding languages, per older usage (in English).

Misspellings happen - I also noted a couple on the Ebola Communication Network site (since corrected) - and they make it hard to locate information in searches. The apparent imprecision about the Manding variety in which the radio spot is recorded is another issue: these languages have a significant degree of mutual intelligibility, but material in one of them on technical or sensitive topics will not necessarily be understood as intended by speakers of other varieties.

On the latter topic, Mary Crickmore, a high school classmate who later on took a different path to learning and working in Fula in Mali than I did, noted in correspondence how Fulfulde translations of sections of Where There is No Doctor ran into issues with some anatomy terms when tested in different parts of the country. I personally noted similar issues in shifting from Fulfulde in Mali to Pular in Guinea.

This is not to say that material developed for one variety of a language (or one of a group of close and mutually intelligible languages) cannot be used in others, but that attention to differences of expression and vocabulary is essential to being understood as one intends, so it is helpful to be clear about the specific language variety(ies) involved. (The flip-side of this issue is how materials and terminology developed separately in different varieties of a language can be compared and harmonized, which requires awareness about the relationships of languages.)

So possible re-use of the CDC radio spots in other countries where the languages are spoken (as suggested in the tweet above) would require review and likely a "localized" version to re-record. This in turn would benefit from access to scripts for such radio spots or, where these were not used or are not available, transcriptions of the audio. (See 2Ds & 4Rs on this blog for further discussion.)

As for "Fullar," it is not a term I have encountered, but it clearly refers to "Fula" (or "Fulah" with that random or gratuitous "h") and "Pular" (the main endonym for the language in Guinea, Sierra Leone, and Guinea-Bissau). Fullar in any event is not a standard term, and as such, would likely not be found in searches. (The case of names used for Fula is complicated enough already, without adding another term!)

The CDC radio spot webpage also lists "Themne" for "Temne" - the former is a known alternative spelling, and seems to be another case of the "random/gratuitous h."

Non-standard terms or spellings pose the same problem for searches that misspellings do. So, as more work is done on ebola messaging in African languages by diverse governmental, intergovernmental, nongovernmental, and academic organizations, there will be a need to catch instances where corrections are needed. Recourse might be made to the codes and names in ISO 639, as an available set of standards.
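
One way to catch such variants early is a small lookup that maps spellings seen in the wild to a single ISO 639-3 code. The sketch below is illustrative only: the variant table is a hypothetical starting point built from the names discussed in this post, the function names are invented for the example, and the choice to map "Madingo" to the Mandingo macrolanguage code (rather than to a specific variety such as Mandinka) is an assumption that would need to be checked against the actual recording.

```python
# Hypothetical lookup table: variant spellings -> ISO 639-3 code.
# man = Mandingo (macrolanguage), mnk = Mandinka, ful = Fulah (macrolanguage),
# fuf = Pular, tem = Timne (Temne).
# The list of variants is an assumption for illustration, not an official registry.
VARIANTS = {
    "mandingo": "man",
    "madingo": "man",   # misspelling; the intended variety is unclear
    "mandinka": "mnk",
    "fula": "ful",
    "fulah": "ful",
    "fullar": "ful",    # non-standard; may have been meant as Pular (fuf)
    "pular": "fuf",
    "temne": "tem",
    "themne": "tem",
}


def normalize_language_name(name):
    """Return the ISO 639-3 code for a known spelling, or None if unrecognized."""
    return VARIANTS.get(name.strip().lower())


def flag_unknown(names):
    """List the names that need human review because no mapping exists."""
    return [n for n in names if normalize_language_name(n) is None]
```

A lookup like this does not settle which variety a recording is actually in; it only makes differently spelled labels findable under one code and surfaces the labels nobody has mapped yet.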

Monday, October 27, 2014

Wikipedia, ebola, and African languages

An article in the New York Times, "Wikipedia Is Emerging as Trusted Internet Source for Information on Ebola" (26 Oct. 2014), mentions translations of a main article on ebola "into other languages" - from English being understood. Those translations are coordinated through the Medical Translation Project / Translation Task Force (MTP/TTF) of the WikiProject Medicine, the WikiProjectMed Foundation, and Translators Without Borders.

This post will briefly spotlight this translation effort, and what it has meant so far for African language editions of Wikipedia. First, however, I'd like to say that it is good to see the WikiProject Medicine and its head, Dr. James Heilman, receive this attention. Their current work on expanding available information about ebola is important and useful. I was also interested to learn from the article that the project was actually started a decade ago by Dr. Jacob de Wolff - making it another lesson in the life cycle of ideas and their application.

The MTP/TTF began in 2011; I first learned of it at the 2012 WikiMania conference. According to its webpage, it has facilitated translation of over 600 articles into 100 languages. This is impressive, especially since these translations are the result of volunteer efforts.

However, progress on African languages* so far seems much less robust. The MTP/TTF monitors progress on translations under "full" articles and "short" (simplified) ones, and within each category (accessible via tabs on its homepage) there are several numbered groups (breaking the languages down into manageable numbers for display in table format).

Under progress on the "full" articles, only 10 of the 30 current African language editions of Wikipedia are included (in group 7) - Chichewa, Hausa, Igbo, Kinyarwanda, Luganda, Shona, Swahili, Xhosa, Yoruba, and Zulu - and of these Swahili has far more translations than any other language (29 of 33 articles). Seven languages have nothing at all. (The article on ebola is not included for any language in this category.)

Under progress on the "short" (simplified) articles, African language editions feature in several groups (languages in italics are editions in "incubator," not full editions; there is some redundancy among the groups as accessed on 27 Oct. 2014):
Progress for most of these languages has been limited, though it is clear that there is a concerted effort for translation of articles on ebola.

However, it is also important to note that seven African language editions of Wikipedia - Afrikaans, Bambara, Ewe, Fula, Lingala, Sango, and Wolof ** - are not represented in any of the above lists. So it is good to note that Kasper Souren, who in 2005 played a pivotal role in getting the Bambara and Fula editions started, has just proposed an "Ebola translation task force" effort to promote translations in languages of West Africa, including facilitating an article on ebola in Bambara. Hopefully this initiative can, in addition to promoting translation of ebola-related information, also reinforce and expand the scope of the MTP/TTF to include all African language editions.

Ebola articles in African languages

Here's a list of African language editions of Wikipedia with ebola articles (26, by my count, as of 28 October): Afrikaans; Akan; Amharic; Bambara; Chichewa/Nyanja; Ewe; Fula; Hausa; Igbo; Kikuyu (Gikuyu); Kinyarwanda; Kirundi; Luganda; Oromo; Sesotho; Sesotho sa Leboa (N. Sotho); Setswana; Shona; Swahili; Swazi; Tigrinya; Tsonga; Venda; Xhosa; Yoruba; Zulu


Another resource of possible use in facilitating translation (and creation) of articles on ebola and related health and social issues for African language editions of Wikipedia is the "AfrophoneWikis" list. This was founded eight years ago, following the WikiMania 2006 conference, during which both Kasper and Martin Benjamin of Kamusi delivered presentations on support for Wikipedia development in African languages.

* Although Arabic is an African language, spoken natively in North Africa for centuries, it has resources and enjoys a level of support typical of world languages. For purposes of this blog post, I am not including it in the discussion.
** There is also a Wikipedia edition in the Twi language, which is generally considered part of Akan.  
NB- This article edited on 5 November to add the Lingala edition.

Wednesday, October 22, 2014

Languages & communication in Nigeria's ebola success

In a comment on my posting about possible roles of language and miscommunication in the tragic murder of 8 ebola campaign workers in Womey, Guinea, Charles Chukwuemeka Okolie described the Nigerian experience, noting that the "Ebola message was given in more than 100 languages including the tiny minority tongue[s] both in the print and electronic media." Now that Nigeria has been declared ebola-free, some more details are being reported about ebola messaging in Nigerian languages.

For example, in an article entitled "Ebola-free: How did Nigeria and Senegal do it?" the Los Angeles Times today mentioned public awareness campaigns on ebola in Nigeria in which "Information was communicated in multiple languages via radio, television, social media, text messages and a large electronic billboard in the center of Lagos."

Another current article, "How Nigeria prevented an Ebola epidemic" in Medical News Today (MNT) mentioned language use in the context of the Nigerian authorities' quick response to the ebola threat: "House-to-house and local radio campaigns - using local dialects - explained the risks, how to take personal preventive measures and what was being done to control virus spread."

A recent article in the Globe and Mail, "Ebola: How Nigeria and Senegal stopped the disease ‘dead in its tracks’," explains further:
"In Nigeria, social mobilization teams went house-to-house to visit 26,000 families who lived within two kilometres of the Ebola patients. They explained Ebola’s warning signs and how to prevent the virus from spreading. Leaflets and billboards, in multiple languages, along with social-media messages, were used to educate the broader Nigerian population."
And an IRIN article last week, "Ebola and the media – Nigeria’s good news story," provides a different perspective, implying primarily English use on internet and mobile devices, with Nigerian languages prioritized on other media:
"Nigerians who do not have access to the Internet and mobile phones have not been left out of the Ebola campaign. Traditional mediums like radio, flyers, posters, village meetings and announcements by town criers are all being used. Priority is given to local languages.

"Comparing the traditional methods of campaigning to social media and SMS campaigns, Nwokedi Moses, better known as Big MO, a vernacular language broadcaster with Wazobia FM, said the two approaches worked well together. 'The social media Ebola campaign was massive, but it complemented the traditional media. This is due to social media’s limited reach within rural areas.'"

Not overnight

Nigeria's success in multilingual ebola messaging evidently benefited from existing capabilities, including those developed in anti-polio campaigns (according to the MNT article cited above). Also, according to Johns Hopkins Center for Communication Programs director Susan Krenn (as paraphrased by CNN in an article yesterday, "Using music to fight Ebola in Liberia"), there have been in the past various family planning and anti-malaria programs on the state and city levels in Nigeria, in its "four main languages" (these are not specified, but probably include English, Hausa, Yoruba, and Igbo).

Nigeria also has prepared specifically for messaging on ebola. For example, last April, Nigerian health minister Onyebuchi Chukwu was quoted in a Xinhua article, "Nigeria not safe from Ebola virus: health minister" as saying:
"So Nigeria is in danger but we have recently said fine that in addition to the leaflets that we are producing for lassa and other hemorrhagic fever, we will now emphasize Ebola fever. As I speak to you, we have already approved for jingles to be produced in various languages produced for Nigerian Center for Disease Control to be aired on Radio, Television and newspaper adverts,"

What now?

There are several questions:
  • Which languages were used in ebola messaging? 
  • How were health workers who went house-to-house trained for messaging in the languages they would encounter?
  • What kind of materials have been developed in these languages and how are they being stored and made available for other ebola efforts? These would include not only items published for distribution, but also scripts for broadcasting and materials for instruction.
  • Most of the above information apparently concerns official (different levels of government in Nigeria) response. To what extent did international partners also contribute to ebola messaging in Nigerian languages? (The collaboration between Translators Without Borders and the Nigerian Institute of Translators and Interpreters was mentioned in a previous post on this blog.)
  • Are there lessons from this experience for other countries in West Africa?
This brief article focuses on Nigeria's success, which is not to overlook that of Senegal. I hope to be able to post soon on how Senegal handled ebola messaging in its languages.

NB- Some additional edits were made to this article after posting, on the same day.

Tuesday, October 21, 2014

Economics of language and the “long tail” effect (part 2)

This is the second of two postings on the "long tail" and languages. The previous one took a general look at the long tail distribution in the contexts of the economics of languages and multilingual societies. This one, also reposted from a 2008 post in "Multidisciplinary Perspectives," reflects a parallel discussion in the Wikinomics blog, and looks at what the long tail distribution of languages would look like in one country: Mali. A major point here - which I think is relevant for something like strategies for public education on ebola in rural West Africa - is that the long tail distribution of languages holds regardless of the geographic scale, and the languages at the head of the distribution (on the left of the graph) on more local scales may be totally different from those on the larger scales.

On the Wikinomics blog, Dan Herman responded to my discussion of use of the long tail model for languages (the date of that article was 11 April 2008). He raised some interesting points that I’ll come to in a moment.

Part of the reason I posted on the long-tail concept is that I believe it will be useful in various ways to analyzing the situation of less widely-spoken languages (LWSLs; previously I’ve used "MINELs" which says less about the size of the speaking community). I deliberately framed it in the context of the economics of language because I see the long tail as a model useful in the broader context of that field. In any event, we’re just beginning to explore this and it would be of interest to know of other efforts.

A clarification also needs to be made between what I’m seeing as two dynamics in the long tail of languages. Dan writes (referring to a previous Wikinomics posting by his colleague Paul Artiuch that I had referenced):
"As Paul highlights in his post there are several tools and applications that, in theory, facilitate learning, or given Don’s take, not leaving, the long-tail."
It seems to me that these are really two different, although related, things. On the one hand, Paul looked more at how the potential “consumer” of language learning would perceive minority languages. On the other hand, I’m mostly interested in the view from the points of view closer to where the language is spoken, from individuals, households and communities who speak the language, to regional and national entities that serve them - govt., business, NGOs, education. The latter are all a different kind of “consumer” than potential language learners. (Parenthetically, I think this difference reflects one that I’ve noted in events related to the International Year of Languages [2008]: some people and organizations are focusing more on language learning and others more on a nexus of issues relating to language rights, endangered languages, etc.)

All of these viewpoints are valid, of course, but when considering language development and indeed survival it is useful to know whether ICT’s effect of lowering barriers for doing various things in/for less widely-spoken languages down the long tail ultimately balances or outweighs other factors that encourage speakers of less widely-spoken languages to focus uniquely on more widely-spoken languages at the head of the distribution. Which is to say, in effect, that the long-tail effect makes production and use of content and products in a language somewhere down the tail - say Soninke (a language spoken by about a million people in Mali, Senegal & Mauritania, which has a historical link to the Ghana empire) - easier and cheaper for Soninke speakers than it was previously. But how will this affect use and development of the language?

In his Wikinomics blog article, Dan is skeptical, posing the question this way:
"… in a world where the language of economics is conducted in one, perhaps two, and in the future maybe three languages, can a combination of technology, ethno-nationalism and culture trump trade and economics?"
I’m not sure we can answer either question but it might help to look at the long tail in different ways to see what’s involved. In his book, The Long Tail, Chris Anderson shows that if you zero in on a section of the long tail, you find … another long tail distribution (see p. 21). One could for instance do the same with languages based on populations of speakers, or, to consider the viewpoint from a country and its citizens, look at just the languages in that country. For example, the following graph uses figures from Ethnologue [15th ed., accessed in 2008] of first language (L1) speakers of languages of Mali:

This is another classic long-tail distribution. I’ve used color codes for very closely related tongues that are interintelligible (at least to some degree - this is a question that could be discussed at length another time). For instance, dark blue is used for the Manding tongues like Bambara, Jula, Malinke and Khassonke. The red color is for languages not in one of those groups. Soninke (snk) is one of these, with 700,000 speakers in Mali and 1 million or so overall - pretty significant in a particular region and fourth among the language categories Ethnologue lists for Mali.

Of course, in multilingual societies people generally learn other languages no matter where their mother tongue may be in the distribution. So it makes more sense in terms of usage to plot out first & second (or additional) language speakers. In the following graph I plot out the combined figures for the closely related groups - whether they be called “language,” “macrolanguage,” or language cluster - and add estimated second language (L2) speakers above those:

There is some uncertainty about L2 speakership - estimates about the percentage of Mali’s 10+ million population that speak Bambara run from 65-80%; and for the official language of French, one probably low estimate is 15%. Fulfulde has historically been a lingua franca in central Mali.
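
To give a sense of what those percentages mean in head counts, here is a minimal arithmetic sketch. It uses only the figures in the paragraph above and treats the population as exactly 10 million, which is an assumption: the text says "10+ million," so the results are lower bounds. The variable and function names are invented for the example.

```python
# Figures from the paragraph above; the population is a lower bound ("10+ million").
MALI_POPULATION = 10_000_000


def speakers_from_share(population, share):
    """Convert a share of the population (0 to 1) into a head count."""
    return round(population * share)


# Bambara: estimates of the share of the population speaking it run from 65% to 80%.
bambara_low = speakers_from_share(MALI_POPULATION, 0.65)
bambara_high = speakers_from_share(MALI_POPULATION, 0.80)

# French: one (probably low) estimate is 15%.
french_low = speakers_from_share(MALI_POPULATION, 0.15)

print(f"Bambara speakers: {bambara_low:,} to {bambara_high:,}")
print(f"French speakers (probably low): {french_low:,}")
```

The 1.5 million gap between the low and high Bambara figures is larger than the speaker count of most languages in the tail, which is one reason the L2 portion of such a graph should be read as approximate.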

And there are other ways we could graph out long tails of language as well, for instance on more local levels. Or, since there is a lot of trade and movement among countries of the West Africa region of which Mali is a part, and many of the language communities are divided by borders, one could do regional or subregional graphs.

What is the point? First, the dominant “two or three” languages when you narrow the geographical scale are not necessarily - and in fact usually are not - the same as one sees on the international level. English, Mandarin Chinese and Spanish may be the most significant worldwide, but none of them are major in Mali for instance. And languages that are relatively far down the tail in the international distribution may be at the top on a country or regional scale. Some languages specific to a country or region have some significant advantages in this context. And indeed, locally dominant languages do displace weaker languages to some degree. This may be the case with Bambara in Mali, or at least in much of the country, for instance.

Second, a language like Soninke which is pretty far down the tail in the international scale, has a higher profile nationally or subregionally (remembering it is a cross-border language).

The global distribution hides these realities. While it is true, I think, that the long-tail effect of advances in ICT generally lowers the barriers and increases the potential for various kinds of work with LWSLs way down the tail (to the point where the main problems encountered are when the languages have few resources) - including for language learners (among whom the particular category of “heritage language learners” deserves special note) - it may be that the long tail distributions on more local levels are more informative for discussions of linguistic situations and language policy.

In other words, the significance of ICT’s effect on the potential to do various work (like publishing) in LWSLs may best be seen in reference to long tail distributions on country and regional levels.

Dan suggests that
"As countries migrate through the demographic transition, and subsequently become increasingly urbanized, there’s an inherent move towards common languages in order to facilitate the trade of services and goods."
Whether this means more a “trimming” of the tail or more an evolution of the language portfolios of multilingual speakers and communities is open to discussion. None of us are suggesting that speakers of LWSLs should abandon their languages in favor of languages of wider communication (LWCs), but the question is whether a combination of application of ICTs and good language and education policies can facilitate people keeping and developing their languages, even if their numbers be few.

Monday, October 20, 2014

Economics of language and the “long tail” effect (part 1)

In April 2008, I posted on my other blog, "Multidisciplinary Perspectives," two explorations of use of the "long-tail" distribution in understanding language use in multilingual societies. In subsequent years I have used this concept as a counterpoint to the more well known application of a "constellation" model for world languages.1 Without pretense as to its value, but with the thought that it is instructive to have alternative frames2 for understanding multilingual societies such as most of those in Africa, I'm reposting the first of those two efforts below (with minor modifications and updates), and will follow later with the second.

“The economics of language has been neglected and deserves much greater attention,” wrote economist Donald Lamberton in a book he edited in 2002. That may not have been too much of a revelation at the time - only a few years earlier (1994) another economist, François Grin, wrote that this field was tolerated “as an intriguing fringe interest” by the discipline of economics. I’d like to briefly explore an intriguing idea on the fringe of that fringe: whether there are or could be “long-tail” dynamics that give some advantages to minority languages.

But first, what is “economics of language”? Grin, in the same article mentioned above defined it as covering the study of:
"…the effects of language on income (possibly revealing the presence of language-based discrimination), language learning by immigrants, patterns of language maintenance and spread in multilingual polities or between trading partners, minority language protection and promotion, the selection and design of language policies, language use in the workplace, and market equilibrium for language-specific goods and services."
Actually some of these issues are getting increased attention (another book on the topic by Barry R. Chiswick and Paul W. Miller was published in 2007 and released earlier this year in paperback, for instance), so I suspect that economics of language is becoming a little more mainstream. A good review of the subject under the title “The Economics of Multilingualism: Overview and Analytical Framework” was published by Grin and François Vaillancourt in 2009 (this apparently expands on an online version previously available on the World Bank website).

The "long tail" of languages

What does the “long tail” have to do with any of this? Well to begin with, the distribution of languages by number of speakers, if plotted out on a graph like the figure (modified from image on Wikipedia) to the right, is a long tail distribution. The question is whether this means anything with regard to the economics of languages - and in particular for minority or less-widely spoken languages (the ones I’ve liked to call MINELs3) which are in the long tail.

By way of explanation, the “long tail” refers to a distribution where a few categories have a lot of each (they would be the green-shaded area in the figure), and many categories have progressively fewer (the yellow-shaded part). It was popularized by Chris Anderson in a 2004 article, and then a 2006 book (revised and expanded in 2008), on new marketing strategies facilitated by the internet. As such, it is a kind of economic model.

How do languages fit this pattern? I plotted out a bar graph for the 50 languages with the most mother tongue speakers using figures from Wikipedia (accessed 2008; figures originally from Ethnologue) and an online utility at

It’s “quick and dirty” but gives an idea of how the actual distribution compares to the long tail model. Needless to say, there is a very long and low “tail” to the right in this distribution after the first 50 languages.4
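
For readers who want to reproduce this kind of comparison, the sketch below shows one way to rank languages by speaker count and measure how much of the total sits in the "head" of the distribution. It is not the utility used for the graph above. The function names are invented for the example, and the counts are placeholders for made-up languages A to F, not real figures; actual numbers would have to come from a source such as Ethnologue.

```python
def rank_by_speakers(speakers):
    """Sort (language, count) pairs from most to fewest speakers."""
    return sorted(speakers.items(), key=lambda item: item[1], reverse=True)


def head_share(speakers, head_size):
    """Fraction of all speakers accounted for by the top `head_size` languages."""
    ranked = rank_by_speakers(speakers)
    total = sum(count for _, count in ranked)
    head = sum(count for _, count in ranked[:head_size])
    return head / total if total else 0.0


# Placeholder counts (in millions) for made-up languages; NOT real data.
example = {"C": 200, "A": 800, "F": 20, "B": 400, "E": 30, "D": 50}

print(rank_by_speakers(example)[:2])   # the two languages at the head
print(round(head_share(example, 2), 2))  # their share of all speakers
```

Running the same two functions on a subset of the data (one country, one region) is a quick way to check whether that subset has its own head and tail, and which languages sit at its head.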

I got the idea of connecting the long tail concept with languages from Laurent Elder of IDRC. When I finally got to read up on the subject it began to make sense. At least partway…


I have been among those suggesting that information and communication technologies make a lot of things possible or less expensive for MINELs that were impossible or too costly before. Desktop publishing or using webpages reduces barriers to producing and sharing text in any language - critical for languages with few resources, and an example of a long tail effect. Cheaper communications via VOIP and expanded availability of cellphones make it easier for dispersed members of a minority language community to speak their languages with each other. Community radio (a new use of an old technology) opens new ways of using the oral language. And so on. To be sure, dominant languages can use the same technologies, but the real advantage I think is for the non-dominant languages.

On the other hand - and here the application of the long-tail concept to language runs into problems perhaps similar to other attempts to apply economic analysis to languages - people don’t move “down the tail” to niche markets with language in the way they might with music or books (two of the examples in Anderson’s writing on the subject). With language, the most prominent fact is that people "live" in the long tail, as it were, and there are some incentives to move up the tail to dominant languages. Part of the issue is how the new technologies facilitate not abandoning the linguistic home in the long tail when dominant languages are learned and used. Most people after all learn more than one language.

In any event, the long tail seems to be a useful concept in looking at the present and future of world languages. When I did a little research on this in fall 2007, I came across an article on the Wikinomics blog that looked at the distribution of languages on the internet and posed questions re language learning. In other words, is there a long tail market for language services (mainly language learning)? This is a different take than mine above but also interesting. There may yet be others and perhaps, as the field of economics of language develops, more ambitious applications of the concept.

1. Per Abram de Swaan, as elaborated in his 2001 Words of the World. I first encountered this "constellation" analogy in a 1999 work by Jean-Louis Calvet, Pour une écologie des langues du monde.
2. I take the concept of "(re)framing" from study of organizational development, and in particular a book by Lee G. Bolman and Terrence E. Deal entitled Reframing Organizations (in its 5th edition as of 2013). For complex issues, it is often useful to use more than one perspective or model in analysis.
3. "MINEL" is a proposed acronym that never caught on as a way of expressing non-global languages: Minority, Indigenous, National (in the sense used in some African countries), Endangered, Local.  
4. The top five are: Mandarin Chinese, Spanish, English, Hindi, and Arabic. The order when counting second-language speakers would be different, but still in a long tail distribution.

Friday, October 17, 2014

Balancing Act Africa media survey and languages

Yesterday evening I had the opportunity to meet and talk with a longtime virtual acquaintance, Russell Southwood of Balancing Act Africa. Russell and I have been in email contact since 2000 or so, when Balancing Act was establishing itself as an authoritative site on internet, media, and communications in Africa, and when I was developing the concept for Bisharat. And in 2001, the Balancing Act newsletter No. 69 featured an early article I wrote on support for African languages in text.

Russell and Balancing Act recently published the results of a 2013 "detailed market research study in seven Sub-Saharan countries [Ghana, Kenya, Nigeria, Senegal, and Tanzania] in the vanguard of adopting the Internet and social media," available via Balancing Act's issue No. 724 (19 September 2014). The report mentions languages in several parts, which I'll briefly touch on below. A key takeaway from the report is:
There is not a great deal of research on what language is used in relation to media in Sub-Saharan Africa: what does exist are largely academic studies from within the field of linguistics. (p. 6)

Overall, the report discusses trends in Africa that:
  • affect media and communication delivery
    • (5 areas) rise of social media; growth of feature & smartphone ownership; liberalization of media; mass media vs. niche audiences; and news
  • affect media and communication use
    • (4 "recurring patterns") urban vs. rural; education levels; income; and language
    • (5 country profiles) Ghana; Kenya; Nigeria; Senegal; and Tanzania
  • will affect communication and media over the next 5 years
    • (3 "known futures") more media & fragmentation; more devices; continuing growth of internet & social media
    • (6 "speculative futures") closing the rural media deficit; mobile media; edutainment; learning channels; public interest media; and online platforms with reliable info.
The subject of language(s) is touched on in several places. In discussion of "fragmentation" of radio and TV audiences, language specialization is mentioned as one possible characteristic of "niche" channels (another one is topical specialization).

In the context of what the report terms "the rural media deficit" (by comparison with urban areas), I read language as an implied factor. Likewise for education levels.

The importance of African languages is discussed at some length as one of four "recurring patterns" in a section entitled "Beyond Official Languages – Reaching People in Vernacular Languages." Here there are anecdotes concerning audience preference for or greater facility in use of what I'd call first languages and local lingua francas. It also mentions the role of media liberalization in facilitating formation of language specialized stations.

In its discussion of "known" future trends, the report foresees an increase in African language broadcasting. It also indicates that social media platforms in major African languages are a possibility as social media use increases.

In general it is good to see the attention given to the linguistic dimension of media and communications in Africa. Hopefully this will spur more interest in and research on the topic. A few quick comments bear mentioning:
  • The report cites the possible figure of 3000 languages in Africa. Indeed there are counts that go that high. However, many groups of separately counted "languages" in such large figures may have high mutual intelligibility, and may alternatively be considered dialects or varieties of the same language. This is important for media in that often one language variety can be understood by speakers of close varieties in the broadcast radius. For example, when I worked in Djenné, Mali in the mid 1980s, I had Songhai-speaking co-workers originally from the Gourma-Rharous area of Mali who listened to the Zarma language broadcasts from neighboring Niger.
  • Further discussion of fragmentation of the media market and language-oriented niche stations would benefit from three perspectives:
    • the previous point about mutually intelligible languages;
    • the existence of so many cross-border languages in Africa (the previous point illustrates that too), which would add another dynamic to language specific broadcasts; and
    • the fact that many stations historically have divided their broadcast day among emissions in different languages of the audience (I've noted this on national level, official and commercial stations, as well as in community radio stations) - how might this approach blunt the "fragmentation" while serving different segments of clearly multilingual audiences?
  • The report uses the term "vernacular" in discussion of African languages. On one level this is technically correct, but on another, it is problematic, implying a lesser status form of speech. Personally I avoid it, especially after hearing a prominent African expert in African language policy criticizing another African for using the term (in French) to refer to African languages.
  • One gap for attention in future research is localization of content and software/apps in African languages. This relates to the mention in the report under social media futures of localized interfaces, as well as to wider use and usability of internet content for various purposes like education and extension as we move forward.

Wednesday, October 15, 2014

Two ebola info sites - almost no African language content

Several recent posts on this blog have highlighted various efforts to provide information about ebola in diverse African languages. Here I'd like to mention two important efforts to share material for communication on ebola, which include almost no information (yet) in African languages: the Ebola Communication Network (ECN), funded by USAID and run by the Center for Communications Programs at the Johns Hopkins Bloomberg School of Public Health; and "Ebola and C4D," a page on UNICEF's Communication for Development (C4D) website.

The purpose here is not to criticize but to help show the current language gap in messaging.

The ECN site has, according to the dropdown for "languages" on the search page, materials in the following languages (with number of items): English (133); French (17); Portuguese (2); Spanish (2); Krio (1); Pidgin (1); and Symbolic (1). Not counted in the total are the CDC's radio spots in 11 African languages, which are accessible via a link. To be fair, the ECN was only launched last week, and this is a significant collection as far as it goes.
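As a rough check on the size of this gap, the dropdown counts can be tallied. The following is only a sketch in Python using the figures listed above; treating Krio and Pidgin as the African-language items is an assumption made here for illustration, and the CDC radio spots are left out because they are not counted in the dropdown.

```python
# Item counts per language as listed in the ECN search dropdown (figures from the text above).
ecn_items = {
    "English": 133,
    "French": 17,
    "Portuguese": 2,
    "Spanish": 2,
    "Krio": 1,
    "Pidgin": 1,
    "Symbolic": 1,
}

# Assumption for illustration: Krio and Pidgin are the African-language items.
african = {"Krio", "Pidgin"}

total = sum(ecn_items.values())
african_total = sum(n for lang, n in ecn_items.items() if lang in african)
african_share = 100 * african_total / total
english_share = 100 * ecn_items["English"] / total

print(f"{total} items; English {english_share:.1f}%; Krio + Pidgin {african_share:.1f}%")
# prints: 157 items; English 84.7%; Krio + Pidgin 1.3%
```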

The ECN site allows subscribed users to upload material, which would allow materials in more languages to be made available. Two questions for ECN are:
  1. How will review be handled for a wider range of languages?
  2. Will there be any proactive effort to develop the collection of materials in African languages in affected areas that might otherwise be overlooked?
The "Ebola and C4D" page, apparently launched in August, also has a significant collection. From perusal of the lists (organized under tabs for Fact Sheets, Social Mobilization, Planning Documents, and Other Tools & Resources), it appears that all linked materials are in English, French, or Portuguese, with one item in Khmer and one poster from Uganda in "Bantu" (which is a language family - may be Runyoro or Luganda - seeking to identify).

Here too, a means to submit materials is provided, so the above 2 questions may also be asked of UNICEF.

An effort should be made to upload existing material in African languages (with correct identification), as a necessary first step in helping to expand these collections.

Tuesday, October 14, 2014

Ethnologue: "National" and "Principal" languages in Africa

Since I raised the issue of Ethnologue's use of the term national language last December, that resource has undergone some revisions. Among the changes is replacing the problematic heading of "National Languages" (problematic because it is used in various distinct ways) with "Principal Languages" on the "Country" tab of the country information pages.

This is a positive step as far as it goes (I'll come back to that below), but the new heading raises new issues. I believe these are important to review since Ethnologue is a major reference on the world's languages, and as such its presentation of data will influence how people (especially those from outside the region concerned) understand or misunderstand linguistic situations, with the potential to influence approaches taken to extension, public education, training, etc. for emergencies like the ebola outbreak in West Africa.

What counts as a "Principal language"?

A reader looking up information on the languages of Niger would first come to the "Country" tab of the Niger page (a screenshot is pictured). On it, they would see under "Principal Languages," one language, French. A logical assumption the reader might make is that this language is unambiguously the "most important, consequential, or influential" (per a dictionary definition) in the country. But what of Hausa, spoken by perhaps half the population as a first language, which Ethnologue itself notes is also "the main trade language of Niger"? Or Zarma, spoken by 18-25% of the population,* which although concentrated in the west of the country, represents a number greater than the number of French speakers (5-15% of the population)? The Peace Corps program in Niger for many years (before its closure) prioritized Hausa and Zarma language training for rural development volunteers since French, however important on the governmental level, was not as useful where they worked.

So where to draw the line in what is considered "principal" is a new problem. It turns out, however, that Ethnologue has a narrower definition of "Principal Languages":
"Languages that have been identified as having a function at the nation-wide level are listed here. This includes all the languages that function at the national level as a working language or a language of identity or both, whether this is by statute or is the de facto situation. For a fuller discussion, see Official recognition."
But even stated this way, couldn't Hausa, as the main trade language and one of Niger's statutory "national languages," still be considered a "principal language" in its own right? Also, from personal observation, Hausa as well as Zarma have been used de facto in local government work (as spoken languages), even though everyone knows French is the de jure language of governance. So there are several criteria on which one might add Hausa and perhaps Zarma as "Principal Languages."
In this regard, the treatment of Senegal (another example used in the previous posting) seems even more problematic. Here too, only French is listed in the category "Principal Languages," although Wolof is the most widely spoken language in the country, as well as being statutorily a national language.

Similarly, in Mali, Bambara is most widely spoken, and by a number of people larger than those speaking French. It is also statutorily a national language. But only the official language French is listed among "Principal Languages."

Part of the reason for citing these examples is that by a common understanding of this new category "principal language," and arguably by a broader reading of Ethnologue's definition of the term, major languages other than the official one in some countries would seem to qualify. Certainly what one counts as "principal language" in many multilingual countries may depend on the criteria used, and that in turn would depend on the intended application.

Levels of official recognition

One criterion in Ethnologue's treatment of "Principal Languages" in these countries is evidently the kind of official recognition involved. So in the case of South Africa, which has eleven official languages, all eleven are listed as "Principal Languages." Same for Chad's two designated official languages - French and Arabic.

However, for the Republic of the Congo, three languages are listed as "Principal Languages": French, which is official; and Kituba and Lingala, which are statutorily national and vehicular languages (which seems similar to Hausa in Niger, Wolof in Senegal, and Bambara in Mali).

What of countries where no language is designated in the constitution or legislation as official? (This is the case for quite a number of countries, including the US.) For Kenya and Tanzania, which each have English and Swahili as de facto official languages, both are "Principal Languages" for each.

On the other hand, the page for Sierra Leone lists only English (de facto official) under "Principal Languages" even though Krio is used more widely (by at least 90% of the population). Though not formalized, Krio in practical terms could be regarded as a "principal language" of the country, since it is so widely used and arguably serves in part as a language of identity (another criterion in Ethnologue's definition). English, on the other hand, is reportedly understood well by only 13% of Sierra Leonean women - how principal is it from their perspective?

An exhaustive review is beyond the purpose of this posting, but from the various examples, it seems that a narrow application of Ethnologue's definition for "Principal languages" on that important first page of country language information gives an incomplete picture of the linguistic reality in a number of African countries.

Suggestions regarding "Principal Languages"

Changing the heading "National Languages" to "Principal Languages" on the "Country" tab of Ethnologue's country information pages was a positive step for presenting first-glance information on the linguistic situations of multilingual African countries. A next step would be to review the criteria for giving languages that categorization. It might be useful to think of this as a way to give the readers a quick sense of the linguistic reality, which in multilingual states may be complicated, involving more than one language playing important roles in different ways.

Part of the problem is using a commonly understood term like "principal" in a very limited way, requiring the reader to find the specific definition and adjust their understanding accordingly. I suspect that many readers will, like I did when first looking at the page, assume the common definition of "principal."

Maybe a key would be to make the definition of "principal language" less dependent on the EGIDS framework. That would lead to another problem, mentioned above in the case of Hausa and Zarma in Niger - where to draw the line in a more flexible application of the term. One way to address this would be a short annotation highlighting the criteria used. For example (not advocating this but giving it as an example), one could list for Niger: French (official); Hausa (main trade language). Or for Senegal: French (official); Wolof (most widely spoken). Sierra Leone: English (official); Krio (most widely spoken; identity). And so on.

"National language," cont'd

When one gets past the "Country" tab of the country information pages to the "Languages" and "Status" tabs, Ethnologue still uses "national language" in the way it previously did. This is again a question of nomenclature, important I would argue in the case of African countries that use the term in different ways (see the previous posting on this topic for a more complete discussion). Ethnologue has evidently reduced its use of the term "official language," so maybe "national language" could also be replaced by a term not already used in divergent senses or (like "principal language") carrying a generic meaning beyond that intended.

Concluding note

As in my previous posts about Ethnologue's content, I would like to stress that the purpose here is to offer constructive criticism and contribute to improving this important resource.

* There do not appear to be any published percentage estimates of speakers of Zarma (including closely related and mutually intelligible varieties of Songhai) in Niger. When I worked there in 2000-04, the common understanding was that 25% of the country's population spoke it (as a first language). Ethnologue's 2006 estimate of 2.35 million speakers would be about 18% of the 2006 total estimated population of 13.248 million.
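The arithmetic behind the 18-25% range can be reproduced as a quick check. This is a minimal sketch in Python using only the two figures quoted in the footnote; the 25% line simply converts the "common understanding" into a head count against the same 2006 population estimate, which is an assumption made here for comparison.

```python
# Figures quoted in the footnote above (2006 estimates, in millions).
zarma_speakers = 2.35
niger_population = 13.248

# Share of the population implied by the Ethnologue speaker estimate.
share = 100 * zarma_speakers / niger_population
print(f"Ethnologue estimate: {share:.1f}% of the population")  # prints 17.7%

# Head count implied by the commonly cited 25% figure, for comparison.
implied_speakers = 0.25 * niger_population
print(f"25% would be about {implied_speakers:.2f} million speakers")  # prints 3.31
```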

Thursday, October 09, 2014

Putting the "ɛ" (back) in Mende

Mende-speaking area (UCLA LMP)
How might Mende - a Mande language spoken by about 1.5 million (which many non-Sierra Leoneans may have encountered for the first time watching Amistad or Blood Diamond) - be used in written form as part of public education and health worker training for combating ebola?

The Mende language (Mɛnde yia) is a major first language and lingua franca of southern Sierra Leone, an area particularly impacted by the current ebola epidemic in southwestern West Africa. In public education efforts on ebola, I understand that Mende, as well as other languages of Sierra Leone, have been used on broadcast media, and that in some cases, songs have been used to get messages across. However in this post the focus will be the written form of the language, with some thoughts on why that's important, and ways it might be used.

Mende is written in two ways:
  • the century-old "Kikakui" script, a right-to-left syllabary, which is apparently little used today (Ethnologue's entry on Mende states "limited usage except for correspondence and record keeping, especially accounting"); and 
  • a modified version of the Latin alphabet, which is more widely known, but apparently not used consistently since most of its speakers do not study it in school. 
I'll focus on use of the Latin script for Mende, beginning with what I've learned about Mende's Latin-based orthography. The Mende alphabet (per a reference page on Mende) is as follows:

This includes three digraphs, which represent meaningfully significant sounds and count as letters, and two modified letters or "extended characters" to represent the additional vowels in Mende's 7-vowel system. Like a number of languages in West Africa (such as Bambara), Mende has, in addition to a, e, i, o, and u, the open-e, represented by ɛ, and open-o, represented by ɔ. (Mende is also a tonal language, but apparently tones are not marked.)

For an example of Mende text, we may turn to the translation of the Universal Declaration of Human Rights - "Dunya Lahi Nuvuu Lɔnyisia Va" - from which I reproduce part of the preamble below (NB - the linked translation for some reason has the ɛ's and ɔ's in upper case, which I've corrected below):
Magona Yɛpei

A jifa kiliyei na kɛ numu vuu kpɛlɛɛ ti maa hɛwulei lɔ towa kpaupau le laha va, tɔnya kɛɛ ndilɛli dunyihu.

A jifa ngawulɛɛhu kɛɛ baagbuala nuvugaa ti lɔnyisia ma ti wanga a pie hindangaa na hii i wotɛa a nɛmahugili waa nuvuu ma, dunya ninahu mia mahoingɔ muvuu i gu i yɛpɛ kia ngi longɔla, kɛɛ ngi lanayei kɛɛ ngi lima hinda.
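The case correction mentioned above (upper-case Ɛ and Ɔ appearing in the middle of words) is the kind of fix that can be automated. The following Python sketch shows one possible approach, not the method actually used for the excerpt; the function name is hypothetical, and the rule (lower-case these two letters only when they follow another letter) is an assumption that would mis-handle words deliberately written in all capitals.

```python
import re

# Upper-case open vowels Ɛ (U+0190) and Ɔ (U+0186) that follow another word
# character are treated as mis-cased. Their lower-case forms are ɛ and ɔ.
_MISCASED = re.compile(r"(?<=\w)[ƐƆ]")

def fix_open_vowel_case(text: str) -> str:
    """Lower-case Ɛ/Ɔ inside words; leave word-initial capitals alone."""
    return _MISCASED.sub(lambda m: m.group().lower(), text)

print(fix_open_vowel_case("tƆnya kƐƐ ndilƐli"))  # prints: tɔnya kɛɛ ndilɛli
```

Word-initial Ɛ or Ɔ is left untouched because it may be a legitimate capital, so any mis-cased initial letters would still need review by a reader of the language.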

Perhaps the longest Mende text in Latin alphabet is a translation of the Bible, which dates to 1959.

Ebola messaging and written Mende

Focusing here on the written language does not in any way minimize the important, indeed central, role of the spoken language in communicating about ebola in West Africa. I've chosen for the above subheading "ebola messaging AND written Mende" rather than "ebola messaging IN written Mende" because as I see it, the former phrase encompasses the latter as well as uses of text designed to:
  1. take most advantage of spoken messages (for example, transcribing what is said, which lets one do more with it than simply recording would); and 
  2. enhance the effectiveness, accuracy, and consistency of oral communication (for example through use of scripts and talking points). 
Oral and written communication are of course different (a topic I hope to come back to another time), but also complementary. Moreover, transcription of speech/audio, along with printed material destined for reading or reference - the language "reduced" (in)to writing, if you will - can be reviewed, revised, and re-used for public education and training programs. Understanding the complementarity of the written and spoken language seems to be important to expanding public education and health training efforts.

Technology and standard written forms

Language technology may have an important role to play in linking text and audio, for example with text-to-speech (TTS) applications. Already a decade ago, a working model for Swahili TTS was developed. There is also current research on Yoruba TTS. In principle, one could have TTS to make text messages "speak" to users in any language. Could TTS be developed for Mende and other languages of the countries most affected by the ebola epidemic?

One could also combine text with video productions by using "same language subtitling" (SLS) to enhance the impact of such videos (and incidentally help increase literacy in those languages).

Such applications, as well as searching text, combining text from different documents of different origins or ages, and assuring that readers get the intended meaning, require consistent use of an orthography - in the case of Mende, the one briefly discussed above. This poses a challenge where speakers of a language like Mende may never have learned to write it (apparently Mende is an elective subject in the Sierra Leone education system, so not all first-language speakers learn in or study it in school). The good news is that a spelling correction utility could be designed to convert irregularly transcribed text into a standard format (some years ago, for instance, a utility was developed in Kenya to add tilde marks in Gikuyu text where they were missing).

Take for instance the Mende translation of an ebola message from the US Embassy/Freetown website, which I reposted on this blog three weeks ago. While it is a laudable effort and hopefully useful product, it clearly does not use the same alphabet as the examples cited above. A utility designed to "correct" such Mende text to the standard orthography would allow people not schooled in the language but literate in English to transcribe spoken/audio Mende text as best they can, and then render it in a form that can be more readily reviewed and re-used by others for expanded education and training efforts.
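To make the idea of such a utility concrete, here is a minimal sketch in Python of a wordlist-based respeller. Everything in it is illustrative: the function names are hypothetical, the five-word list is taken from the UDHR excerpt quoted earlier in this post rather than from a reviewed lexicon, and the assumption that transcribers type plain "e" and "o" in place of ɛ and ɔ is only one of the irregularities a real tool would have to handle.

```python
import re

# Tiny illustrative wordlist from the UDHR excerpt quoted earlier in this post.
# A real utility would need a reviewed Mende lexicon.
STANDARD_WORDS = ["kɛɛ", "tɔnya", "kpɛlɛɛ", "dunya", "numu"]

# Assumed plain-keyboard stand-ins: ɛ typed as e, ɔ typed as o.
_FOLD = str.maketrans({"ɛ": "e", "ɔ": "o"})

def fold(word: str) -> str:
    """Reduce a word to a lower-case plain-letter key for lookup."""
    return word.lower().translate(_FOLD)

# Map each folded key to its standard spelling. Keys shared by two different
# standard words are ambiguous, so they are dropped rather than guessed.
_by_key = {}
for w in STANDARD_WORDS:
    _by_key.setdefault(fold(w), set()).add(w)
LOOKUP = {k: next(iter(v)) for k, v in _by_key.items() if len(v) == 1}

def respell(text: str) -> str:
    """Replace each known word with its standard form; leave others untouched."""
    def fix(match):
        word = match.group()
        return LOOKUP.get(fold(word), word)
    return re.sub(r"\w+", fix, text)

print(respell("tonya kee dunya"))  # prints: tɔnya kɛɛ dunya
```

Words not in the list pass through unchanged, which keeps the output reviewable: a human reader can see what was corrected and what still needs attention. Note that this sketch returns the wordlist form as-is, so a capitalized input word comes back in lower case.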

For further discussion on the utility of standardized orthographies, see on this blog: "More on standard orthographies of African languages."

Monday, October 06, 2014

On the Atlantic Council's "Combating the Ebola Outbreak"

Last Thursday, 2 October 2014, the Atlantic Council's Africa Center hosted an expert panel discussion on "Combating the Ebola Outbreak" in Washington, DC. Chaired by the Center's director, Dr. J. Peter Pham, the panel dealt with current efforts to address the crisis and longer-term considerations. (The full event, including presentations and Q&A, can be heard via a video link.)

From a review of the audio of the discussion, I'd like to highlight three areas where the issue of language was implicitly or, in one case, explicitly mentioned. Keep in mind that West Africa, including the three countries currently experiencing the epidemic - Liberia, Sierra Leone, and Guinea - is a multilingual region, so this is a very practical issue.

The first area was that of public education, which necessarily involves choice of language(s) for communication and development of materials. Panelist Donald Shriber, Deputy Director for Policy and Communication of the CDC's Center for Global Health, referred in his comments to "health promotion and health communication, which means getting culturally appropriate messages out to people through trusted messengers." Language might be included under "culturally appropriate messages," and accuracy of messaging whatever the language would be implied - and both are arguably as important as cultural appropriateness (language and culture are dual considerations in localization, for instance) and trusted messengers (who one would hope would be able to convey clear, accurate, and consistent information in all languages used). As mentioned in a previous posting on this blog, the CDC has produced radio spots about ebola in 11 African languages of the region.

The issue of public education came up again in the Q&A period in comments by former ambassador Robert Gribbin, who also served a short term recently as chargé d'affaires in the US Embassy in Sierra Leone. Amb. Gribbin mentioned that "There's been a massive education effort on the part of the [Sierra Leone] government, with its international partners to teach people about ebola, about what to do. ... And people are really quite aware I think of the immediacy of this ..." A natural question would be how the various languages of the country were used - as presumably they must have been in a "massive" public education push. It is worth remembering in this context that the US Embassy/Freetown has five Sierra Leonean language translations of an English language notice about ebola available on its website.

Second, there were references to training of health workers, notably by Anne Witkowsky, US Department of Defense's (DoD) Deputy Assistant Secretary of Defense for Stability and Humanitarian Affairs. Ms. Witkowsky mentioned the DoD's plan to train up to 500 health workers per week. Here too, language becomes a big issue, both for training health workers, who may not have high levels in English or French, and who in any event, will have to communicate complex messages to patients and communities in their first languages. Will there be any training in or about messaging in various key languages where the health workers will ultimately work, or will it be assumed that the workers fully understand and can accurately interpret/translate on their own? (The latter is often the case in agricultural extension and development work in Africa - see the discussion and links here.)

And third, the issue of communication about complex topics came up in remarks by Col. Nelson Michael (M.D., Ph.D.), Director of the US Military HIV Research Program at the Walter Reed Army Institute of Research. Referencing a UN publication on guidelines for good participatory practice,* he discussed the importance of community engagement in eventual vaccine trials, meaning that "the community is involved during the entire lifecycle of the project, from sitting on study teams to actually thinking about the designs of these studies to make sure that when volunteers are given informed consent that they have tests of understanding that their own native language is used so there's a contact and transparency between research individuals and between those volunteers." Language here is explicit in the case of informed consent, but would also be important in the other participatory aspects.

* I believe the publication is: "Guidelines on Citizens’ Engagement for Development Management and Public Governance," Development Management Branch, Division for Public Administration and Development Management, United Nations Department of Economic and Social Affairs, March 2011

NB - Various minor additions and corrections after posting. Ambassador Gribbin's name corrected, with apologies for the error and thanks to Dr. Pham (7 Oct 2014).

Saturday, October 04, 2014

Fula and the letter H

The origin and uses of the letter "h" in various European languages were the subject of a 2008 essay by Coby Lubliner, professor emeritus of Engineering Science at the University of California at Berkeley. In "The Story of H," as he titled it, Prof. Lubliner admits it may not be the "whole story, but some interesting parts of it."

Here then is a quick complement to that essay, focusing on three hopefully interesting roles of the letter "h" in the historically more recent Latin-based orthography(ies) of the Fula language (Fulfulde, Pulaar, and Pular), and in foreign nomenclature for the Fula people and their language:
  1. The role of the letter in the language: First of all, in Fula, as in many languages, "h" represents a "voiceless glottal fricative." When in initial position in nouns and verbs, however, it typically alternates with "k," for the "voiceless velar plosive" sound, between singular and plural forms. (The system of consonant alternance or mutation of which this is a part is quite regular across the varieties of Fula spoken mainly in the West African Sahel, with the exception that in Pular - spoken in parts of Guinea, Guinea-Bissau, and Sierra Leone - verbs do not feature the alternation.) 
  2. Orthography of Pular in Guinea, pre-1985: The old Latin-based orthography for languages of Guinea was established after the country's independence in order to facilitate production of materials in its various languages with existing typewriters. This orthography included various digraphs for: (1) sounds not present or not carrying a meaningful difference in European languages, for which other conventions were being developed in neighboring countries; and (2) three common sounds which written French (the language of education and government since the colonial period) represents with its own digraphs or trigraphs. In the case of Pular, the combinations, including four using "h," and equivalents in current Fula orthographies are: bh = ɓ ; dh = ɗ ; dy = j ; nh = ŋ ; ny = ñ or ɲ ; ty = c ; yh = ƴ. These conventions - particularly the ones with "h" - can sometimes be observed in use in written Pular today, even though this orthography has not been in official use since the mid-1980s.
  3. "Random H": I've used this expression (not to be confused with the "random.h" utility) to describe the here-again gone-again use of "h" in English and French terms for the people and their language. 
    • The English "Fula" sometimes appears as "Fulah," especially in older literature. However the spelling with the "h" is used in language coding (see here and here) and hence in localization (note its use last week by The Economist in reference to computer terminology in Fula). However, one has yet to observe the random H applied to the alternative term for Fula in English, "Fulani."
    • The French "Peul" sometimes appears with an "h" before or after the terminal "l": "Peuhl" or "Peulh." I haven't been able to discern any particular pattern in use of these spellings other than that either use of the random H seems to be less common now. 
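The digraph equivalences listed in item 2 above lend themselves to a simple conversion table. The following Python sketch is one hypothetical way to apply them; the function name is invented for illustration, only lower-case text is handled, and the sample spelling "yimbhe" is derived by applying the table in reverse to "yimɓe" rather than taken from an attested document. A naive replacement like this would also mis-convert any word in which the two letters happen to stand for separate sounds.

```python
import re

# Old Guinean digraphs and their current equivalents, as listed in item 2 above.
# "ny" has two current forms (ñ or ɲ); ɲ is the default here, chosen arbitrarily.
DIGRAPHS = {
    "bh": "ɓ",
    "dh": "ɗ",
    "dy": "j",
    "nh": "ŋ",
    "ny": "ɲ",
    "ty": "c",
    "yh": "ƴ",
}

_PATTERN = re.compile("|".join(DIGRAPHS))

def modernize(text: str, ny: str = "ɲ") -> str:
    """Convert lower-case old-orthography digraphs to current characters."""
    table = {**DIGRAPHS, "ny": ny}
    return _PATTERN.sub(lambda m: table[m.group()], text)

print(modernize("yimbhe"))  # prints: yimɓe
```

The `ny` parameter lets the caller pick ñ instead of ɲ, since the choice differs between current Fula orthographies.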

Addendum (8 Oct. 2014)

Since posting this, I've remembered another unusual instance of the "h" in Fula:

4. "Himɓe" for "yimɓe": In most varieties of Fula, the word for "people" (as in persons) is pronounced "yimɓe." However, in Western Niger and apparently into Burkina Faso it is pronounced, and written, as "himɓe." The word yimɓe is technically the plural of "gimɗo," a term that I've never heard used, but which shows another of the consonant mutations discussed above - this one between "y" and "g." However it is usually the word "neɗɗo" that is used to refer to the singular, "person." Neɗɗo is actually derived from the root for being raised, educated. This sets up an atypical pair of singular/plural in common usage (and in materials for learners): neɗɗo / yimɓe. I first encountered the variant himɓe when working on a Fulfulde lexicon, and then later when in Niger. Without having researched the matter, I'm wondering if the "y" to "h" shift in local pronunciation was in effect permitted by the decoupling of the plural form from the rarely used singular form, gimɗo.