Thursday, September 08, 2016

International Literacy Day: Let them write!

One of the most common objections I have heard from international development colleagues about literacy training in African languages is "What will they read?" While it is true that relatively little is published in some African languages, and next to nothing in others, such a view has problems on several levels. For example, it overlooks that it is easier to learn in one's first language, that literacy skills in one language facilitate learning other languages, and that there is a cultural cost to always and only associating formal learning with a Europhone second language. But one of the most important counterarguments, in my opinion, and one that I have offered as a primary defense of literacy in first languages of Africa, is that neo-literates* can write - maybe just a little, like a ledger, or maybe a lot, in stories that express and communicate in their own way.

So it is a pleasure to see the theme for this year's International Literacy Day (ILD; 8 September 2016): "Reading the Past, Writing the Future."

Are there examples of newly literate people in Africa writing in African languages? Yes of course. One is the Senegalese organization Associates in Research and Education for Development (ARED), which has actually published writing by its students. I have also heard of literacy students just writing with this new tool. There are certainly many more.

With the association of literacy with goals of "lifelong learning" - per the 2030 Agenda for Sustainable Development - there should be a way to support and encourage neo-literate writing in first languages on a wider and more systematic basis. Not just for fun, though hopefully at least that, but for adding many diverse voices to writing the future.

Additional notes

Two African organizations were recognized this year with the UNESCO Confucius Prize for Literacy (which, along with the King Sejong Literacy Prize, is awarded annually on ILD):
  • the South African Department of Basic Education’s ‘Kha Ri Gude Mass Literacy Campaign’
  • the Direction de l’alphabétisation et des langues nationales in Senegal for its ‘National Education Programme for Illiterate Youth and Adults through ICTs’
Both programs sound interesting. I'd like to know more about how the Senegalese program used its national languages (and which ones) in ICT.

For a very interesting discussion of ILD from Malawi, see Steve Sharra's blog, Afrika Aphukira: Literacy, Language and Power: Thoughts on International Literacy Day 2016

* "A neo literate is an individual who has completed a basic literacy training programme and has demonstrated the ability and willingness to continue to learn on his or her own using the skills and knowledge attained without the direct guidance of a literacy teacher." APPEAL - Training Materials for Continuing Education Personnel (ATLP-CE) - Volume 2: Post-Literacy Programmes (APEID - UNESCO, 1993, 112 p.)

Tuesday, September 06, 2016

VOA Hausa Digital Content Editor

The Voice of America (VOA) is hiring a Digital Content Editor for its Hausa service. Normally I do not post jobs on Beyond Niamey, but rather do so occasionally on the Facebook African languages group. In this case I am making an exception since it seems that the person hired by VOA will be in a position to possibly help the organization finally move its Hausa web content from an ASCIIfied version to the Boko orthography - a topic that has been discussed previously on this blog.

Links to the position announcement are below, but first a quick review of the issue. The Latin-based "Boko" alphabet for Hausa includes several modified letters (technically called "extended characters") that stand for sounds not represented in the alphabet as used in English, French or other European languages. Sometimes called "hooked letters" they include: ɓ ; ɗ ; ƙ ; and in Niger, ƴ - in Nigeria 'y is written for the same sound as the last one. The capital letter forms of the four hooked letters are Ɓ Ɗ Ƙ Ƴ.
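For readers who want to see what these letters look like to a computer, here is a minimal Python sketch that lists each hooked letter with its Unicode code point, its official character name, its capital form, and the number of bytes it takes in UTF-8. The list name HOOKED and the helper describe are my own illustrative names, not anything used by the broadcasters.

```python
import unicodedata

# The four hooked letters of the Boko alphabet (the last one is used in Niger).
HOOKED = ["ɓ", "ɗ", "ƙ", "ƴ"]

def describe(ch):
    """Return (code point, Unicode name, capital form, UTF-8 byte length)."""
    return (
        f"U+{ord(ch):04X}",        # code point in the usual U+XXXX notation
        unicodedata.name(ch),      # official Unicode character name
        ch.upper(),                # capital form via Unicode case mapping
        len(ch.encode("utf-8")),   # bytes needed when stored as UTF-8
    )

for ch in HOOKED:
    print(ch, *describe(ch))
```

Each of the four letters has its own code point and a proper capital form, so any system with Unicode fonts should be able to handle them like other letters.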

When VOA and other international radio services - notably BBC, CRI, and RDW - began websites for their respective Hausa services, the Unicode standard, which facilitates display of extended Latin characters and diverse writing systems on the internet, was not in widespread use (RFI added its Hausa service later). Evidently this was the reason for resorting to an ASCIIfied rendering of Hausa text (with b, d, k, and y instead of the hooked characters, which can change meanings) - older systems then in use among the audience may not have been able to handle the Unicode-encoded hooked letters.
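To make the problem with ASCIIfication concrete, here is a small Python sketch of the substitution described above. The mapping and the function name asciify are my own illustration, not the sites' actual tooling, and the sample spellings are used only to show two distinct written forms collapsing into one (no glosses are claimed).

```python
# Illustrative mapping from hooked letters to their plain ASCII stand-ins.
ASCII_MAP = str.maketrans({
    "ɓ": "b", "ɗ": "d", "ƙ": "k", "ƴ": "y",
    "Ɓ": "B", "Ɗ": "D", "Ƙ": "K", "Ƴ": "Y",
})

def asciify(text):
    """Replace each hooked letter with its unhooked ASCII counterpart."""
    return text.translate(ASCII_MAP)

# Two different spellings become indistinguishable after ASCIIfication.
with_hook = "ƙara"
without_hook = "kara"
print(asciify(with_hook) == asciify(without_hook))
```

The substitution is one-way: once the hooks are dropped, there is no way to tell from the text alone which letter was intended.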

That argument is losing credence, if it is not already meaningless. The number of systems in use old enough not to have Unicode fonts (now the norm, and the earliest of them were already in systems over a decade ago) must be very few. Moreover, all five international radio Hausa sites use UTF-8, an encoding of Unicode that can represent the hooked letters.

So what is the current state of use of the Boko orthography (with the hooked letters) on the five sites - VOA, BBC, CRI, RDW, and RFI? I used a new way of evaluating them - actually bringing back an old trick - which is to search just the letters on the sites with Google. The best way is to use Google advanced search, or just put a sequence like this in the search window of the usual Google page:

ƙ OR ɓ OR ɗ OR ƴ

This pulls up all pages on the site with at least one of these hooked letters. You can substitute the domain of the site you want to evaluate. My results were: BBC 16 pages; RDW 7 pages; VOA, CRI, and RFI all 0. Not impressive.
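For anyone who wants to repeat the check on several sites, here is a minimal Python sketch that builds the query and a search URL. It assumes Google's site: operator for restricting results to one domain; the domain example.org is a placeholder, and the function names are my own.

```python
from urllib.parse import urlencode

# The hooked letters, in the same order as the search sequence above.
HOOKED = ["ƙ", "ɓ", "ɗ", "ƴ"]

def build_query(domain, letters=HOOKED):
    """Build a query matching pages on `domain` with at least one hooked letter."""
    return f"site:{domain} " + " OR ".join(letters)

def search_url(domain):
    """Return a Google search URL for the query (percent-encoded as UTF-8)."""
    return "https://www.google.com/search?" + urlencode({"q": build_query(domain)})

# example.org is a placeholder; substitute the domain you want to evaluate.
print(build_query("example.org"))
print(search_url("example.org"))
```

Result counts from a search engine are only a rough indicator, so figures obtained this way are best read as a comparison between sites rather than an exact page count.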

What's holding them back? Inertia? Lack of a keyboard layout to easily type with the hooked letters? Lack of a spell checker for Hausa in Boko orthography?

In any event, the new Digital Content Editor for the VOA Hausa service would be in a position to make a significant contribution to that service's web content, with secondary effects on other Hausa language websites.

The position has two listings on the site: one for US citizens; and one for non-US citizens. (This sort of dual listing is normal; you see it also sometimes for internal candidates in an agency and for external candidates applying from outside the agency.) The position was announced today, 9/6/16, and closes 9/20/16.

Saturday, September 03, 2016

Facebook, ISOC, and A12n

In his recent visit to Lagos, Nigeria, Facebook founder and CEO Mark Zuckerberg indicated that Facebook will add more African language interfaces. Meanwhile, at the African Peering and Interconnection Forum (AfPIF2016) in Dar es Salaam, Tanzania, the Internet Society (ISOC) released a report entitled "Promoting Content in Africa," which highlights the importance of internet content in African languages for full access by Africans.

These two developments, concerning on the one hand localization of the software for a popular social media platform, and on the other hand the creation of content, highlight the dual aspects of Africanization (A12n) of information and communication technology in/for Africa. As these processes develop, it would be useful to find ways to integrate them as appropriate, and to foster collaboration among organizations and individuals involved in either or both. (That was the intent of the African Network for Localisation, ANLoc, albeit with a focus mainly on the software and enabling aspects.)

It is possible, as the ISOC report notes, for content to be developed or translated in a language even when the software on which it is created is not localized in it. And that certainly would be the case for the less widely spoken languages, at least in the near term. However, the availability of software interfaces - whether for social media like Facebook or for production software - in at least the major African languages, would probably help even for the less-spoken ones.

Facebook sign-up in Hausa.
Facebook currently is available in the following African languages (links are to Wikipedia articles): Afrikaans; Arabic; Hausa; Kinyarwanda; Malagasy; Somali; Swahili; and Tamazight

One of the contributors to the ISOC report, Dawit Bekele, who is ISOC's African Bureau Director, was a participant in the PanAfrican Localisation Workshop in Casablanca, June 2005, and the Pan African Research on L10N Workshop & Localization Blitz in Marrakech, February 2007.

Wednesday, August 31, 2016

Missing "macrolanguages" of Africa

Screenshot from VOA's Kinyarwanda/Kirundi site
The Voice of America (VOA) recently had a job opening for "International Broadcaster (Multimedia) (Kirundi/Kinyarwanda)." Kirundi and Kinyarwanda are the mother tongues, national languages, and co-official languages in, respectively, Burundi and Rwanda. And they are mutually intelligible, with only minor differences, such that apparently a fluent speaker of either could work on a program serving speakers of both. But there is no term covering both - unless one counts the hyphenated Rwanda-Rundi - and no language coding category to cover material designed for use across the two.

This is a situation encountered with many languages in Africa, and one for which there is at least one potential solution - the neologism and language coding category "macrolanguage." There are actually some macrolanguages defined in Africa, but these are few, and as I discuss below, kind of accidental. Is it time to systematically identify (and code) macrolanguages in Africa?

What defines a language?

For most of us, the distinction between languages seems pretty straightforward. But beyond the most spoken international languages - those used officially by the United Nations or ones you are likely to see on a school curriculum - the situation is often more complex. Sometimes two or more closely related languages are so similar that their speakers can understand each other, but sometimes variations within one language can make understanding difficult. An earlier posting on this blog looked at the notion of "neighbor languages" in Scandinavia and Africa. A broader consideration of these issues by Columbia University's John McWhorter suggests that we're really all speaking dialects, some of which benefit from written forms, and one might add, status, resources, and policy support. There is some truth to the saying that "A language is a dialect with an army and a navy."

However, the issues of what to call a "language" and where to draw the boundaries between it and another "language" are still of practical importance for communication (standardization, references, ICT use) and planning (government, business, education). There are two broad approaches in linguistics to doing this, corresponding with the splitter/lumper (or joiner) approaches to categorizing: one focusing more on distinctions, and the other focusing more on commonalities.

Without going too deeply into that discussion, which gets more complicated when accounting for issues of identity, names, written forms, and national boundaries, suffice it to say that in considering African languages, there are many situations where one encounters the splitter/lumper choice.

The major reference of languages in the world, Ethnologue, takes a more splitter approach, which means that speech varieties that are closely related and interintelligible may be classified as separate languages. It is their estimate of the number of languages in Africa (over 2000) that is most commonly cited, but there are other more conservative estimates. A good academic discussion of this issue entitled "How many languages are there in Africa?" was published in 2004 by Jouni Filip Maho (his estimate is under 1500).

What is a "macrolanguage"?

To make the story brief, the term "macrolanguage" is not a term that was used in linguistic description before the inauguration of the ISO 639-3 system for encoding all languages in the late 2000s. Since that system is based on Ethnologue's "splitter" data, a new category was needed to accommodate existing codes in the earlier less comprehensive parts of ISO 639 (1&2) that in many cases were more "lumper" in approach. The term macrolanguage was in effect a "shim," to borrow someone else's term, to fit the two systems together.

There are by my count 14 macrolanguages listed for Africa (names linked to the Ethnologue macrolanguage pages): Akan; Arabic; Dinka; Fulah; Gbaya; Grebo; Kalenjin; Kanuri; Kongo; Kpelle; Malagasy; Mandingo; Oromo; and Swahili. There could be others.

That brings us back to Kinyarwanda and Kirundi. Is the relationship between them any different - any more distant - than the relationships within the established macrolanguages above? One difference, as mentioned above, is that there is no common name to make it easy, and another is that they are dominant in different countries - perhaps analogous to the situation of Scandinavian languages?

Another curious situation is that of Mandingo, which includes several western Manding languages, but not Bambara and Jula (Dyula). Even if the latter two were considered too different from the other Manding tongues, they are close enough that one could localize software for the two together. Keep in mind also that the emerging literary standard N'Ko covers all Manding languages (in a different alphabet). Should the Mandingo macrolanguage be extended to include them all?

The four languages of southwestern Uganda - Kiga, Nkore, Nyoro, and Tooro - are close enough to be covered by Runyakitara, a proposed (but not encoded) standard which is being used in various ways, including at least some teaching and a localization of the Google interface. Should these four be considered a macrolanguage under perhaps that same name, thus finally providing a code for localization in Runyakitara?

And there are other examples around the continent that could be discussed.

What good would more macrolanguages do?

The first benefit of identifying more macrolanguages would be in language coding - the very environment in which the term was first used. The language of VOA's website for its Kinyarwanda/Kirundi service is coded as "rw" (Kinyarwanda) since there is no macrolanguage code covering both languages. Likewise, in many cases, the grouping of very close and mutually intelligible languages as a macrolanguage could facilitate localization of software and apps to serve larger populations - and those larger markets could make it more likely that such localization would be pursued and maintained.

Another benefit would be to complement the tendency in language coding towards seeking more granularity, by recognizing natural groupings of languages (for more on this, see a message to the IETF-languages list last May). In effect, this would provide more balance between splitting and lumping/joining.

In the broader picture, identifying macrolanguages could have benefits for policymaking and program development involving languages within macrolanguage groups, by calling attention to the closely related languages. Especially where foreigners are involved, projects may overlook such relationships and the potential resources they may provide. For example, materials development for education and various communication needs might benefit from tapping efforts and resources in closely related languages.

(Minor edits and image added, 2 Sep. 2016)

Sunday, June 19, 2016

TED talks in African languages?

Of all the TED and TEDx talks - a genre of knowledge sharing that began in the 1980s but went "viral" with the possibilities offered by YouTube - have any been given in any African language? The question is not so easy to answer as I'll get to below, but the process of trying to answer it gives rise to other questions such as: Could a TED talk or a TEDx event be given in one or several African languages?


TED - "Ideas Worth Spreading"

TED, an acronym for Technology, Entertainment, Design, "is a global set of conferences run by the private nonprofit organization, Sapling Foundation." The idea of the conferences is sharing of ideas "usually in the form of short, powerful talks (18 minutes or less)."

The conferences have been held mainly in North America and Europe, with a handful in Asia and Latin America. One, in 2007, was held in Arusha, Tanzania with the theme, "Africa: The Next Chapter." Many, but not all, of the talks in these events become videos featured online.

The talks, which total some "2200+" according to the website, are apparently all given in English. (The program for the 2007 conference in Arusha is not available online to check.) Quite a number of talks are subtitled in other languages, as I'll discuss further on.

TEDx - "x = independently organized event"

TEDx events, of which there are several types, are licensed by TED but organized separately. The number of TEDx events around the world is not stated anywhere I looked, but one list includes 2967 events (number from the line count in my text editor), and a nice interactive map display includes some past events that are not on that list (I randomly checked some in Africa).

The total number of talks at these independent conferences must therefore be staggering. The drop-down list in the sidebar of the TEDx languages page lists 43 languages, of which the only African one is Arabic (to that extent, my first question in the opening paragraph above would be answered in the affirmative). However, given the large number of TEDxs that have been held in many diverse locations around the world, is it possible that there have been presentations in other languages not on that list?

From a rough count of TEDx events in Africa in 2015 on the map mentioned above, there were ~80 events, with well over half in diverse locations in sub-Saharan Africa. Were presentations in places like, for example, Kano, Nigeria; Dar es Salaam, Tanzania; and Addis Ababa, Ethiopia all English-only?

Subtitling of TED talks

According to the translation page on the TED site, there has been subtitling of talks in over 100 languages (the actual count on the page is 110, thanks again to copy-paste & line-count, but that number includes some varieties of the same languages, as well as English originals). The African languages among these, with their count of how many talks, include: Afrikaans (19); Amharic (13); Arabic (2091); Arabic, Algerian (9); Hausa (1); Igbo (1); Somali (20); and Swahili (33).

The one talk (in English) with Hausa subtitles - embedded below - was given in 2003, with the subtitles evidently added in 2008. It is worth noting that the Boko orthography is used, as you can see with the hooked consonants.

The one talk with Igbo subtitles does not appear to follow the standard orthography - the lack of subdot vowels is one giveaway, but also tone marks are absent. And there are untranslated English terms - the first instance I recall seeing of code-mixing in subtitles. The other language subtitles look polished, though I'm in even less of a position to evaluate them.

TEDx talks, as noted above, come in various languages, and apparently some of them have same-language subtitling, although that term is not used (for example several dozen in French).

The translation/subtitling effort itself looks like a successful involvement of volunteer contributions for at least a number of languages.

TED or TEDx in African languages?

There are two ways to achieve more linguistic diversity relevant to Africa in TED talks. The first would be through expanding the translation program mentioned above. This might require some new approaches, as the volunteer model may not work as well as in Northern countries. The benefit would be expanding access, particularly with some more widely spoken African languages.

The second would be to organize (more?) TEDx events that either allow presentations in African languages, or that explicitly invite presentations in one or more African language(s). This would seem to be an interesting way to bring in diverse presenters, and to develop recorded content that could be shared locally, nationally, or regionally (depending on the language demographics). Even for those without internet or mobile access to such TEDx recordings, it might be possible in some contexts to distribute video for TV and audio-only for national and community radio. And such content could of course be translated into other languages for wider dissemination.

Ideas for sharing, after all, can come in many languages.

Friday, June 17, 2016


This is the fourth in a string of posts on conferences and workshops relevant to, or specifically addressing, African languages. Only one event of all of those mentioned, however, is in Africa. More on that at the end of the post, but first the three upcoming conferences for which there are active CFPs (calls for papers/participation). The subject of the first, LESEWA, is on similarities between a number of West African and East Asian languages - a theme that has long interested me as a learner of Bambara and Chinese (Mandarin). The latter two deal with a broad set of languages of generally disadvantaged status and fewer speakers, among which many African languages can be counted. The first two events are the latest of long-running conference series; the third is brand new.


The International Conference on Languages of Far East, Southeast Asia and West Africa (LESEWA) will be held in Moscow, Russia on 16-17 November 2016. This is the latest in a series of biennial conferences that began in 1990 (I am told that the idea began with Prof. Vadim Kasevich and colleagues). The CFP deadline is 1 July 2016.

LESEWA "will focus on the remarkable far-reaching parallelism in syntactic and semantic structures of the languages of the Far East, Southeast Asia and West Africa, which can be explained neither by genealogical affinity, nor by aerial factors. Both individual language investigations and typological studies are encouraged. General phonetic and general linguistics themes are especially welcome."


FEL XX Hyderabad (the 20th conference of the Foundation for Endangered Languages) will be held at the University of Hyderabad in India on 2-5 December 2016. Its theme is "Language colonization and endangerment: Long-term effects, echoes and reactions." The CFP deadline has been moved back to 1 July 2016 (the conference date was also changed).

FEL XX "aims to examine language endangerment during the colonial era, and the impact of colonization on the subsequent efforts of the independent nations and communities to revitalize their language heritage. The conference will look at continuity and change in approaches to language use." The concept of "colonialism" is broad, including not only expansion of European rule, but also historically earlier periods of domination by one people over others.


The First International Conference on Revitalization of Indigenous and Minoritized Languages will be held in Barcelona, Spain, on 19-21 April 2017. It is co-sponsored by the Universitat de Barcelona, Universitat de Vic-Universitat Central de Catalunya, and Indiana University-Bloomington. The deadline for proposals is 30 July 2016.

"The mission of the conference is to bring together instructors, practitioners, activists, Indigenous leaders, scholars and learners who speak and study these [indigenous and minoritized] languages. This international conference includes research, pedagogy and practice about the diverse languages and cultures of Indigenous and minoritized populations worldwide."

Language conferences and Africa

As noted above, of the nine events spotlighted in this and the previous three posts, all but one are outside of Africa (that one is in South Africa). To be fair, not all of them deal specifically with African languages. But in general one may fairly ask how many conferences on languages and linguistics - be they Africa focused or global in scope - take place in Africa, one of the most multilingual continents. No clear answer to offer here, but if one were to do a count, it might help to go about the task with attention to types of conferences - academic vs. policy vs. workshop-type - and to the subjects - general or focused on Africa. I have the impression that quite a number of events - conferences, expert meetings, etc. - dealing with policy and practical aspects of African language use have been held in various parts of Africa, as one would expect. On the other hand, academic conferences, whatever the topic - even African languages and linguistics - are more frequent in the Northern countries due to the number of institutions and scholars, and resources available to them for convening such events. General conferences on topics like ICT4D or endangered languages might be located anywhere, and conference series with significant African content and potential participation do seem to try to alternate regional locations to include some in Africa.

All of which is to say that my unscientific sampling of nine events may not tell us much about the choices of location of conferences on or relating to African languages. Nevertheless, it seemed worth addressing the topic given the apparent discrepancy in geographic representation.

Friday, June 10, 2016

Upcoming events: Bantu 6, Borderland Linguistics, LSSA/SAALA/SAALT, and TripleA 3

Having spotlighted ICTD 2016 last week and the upcoming TALAf 2016 workshop, here are four more conferences taking place over the next few weeks whose subjects are directly or indirectly relevant to African languages.

Bantu 6

The 6th International Conference on Bantu Languages, 20-22 June 2016, "brings together specialists in all aspects of the study of Bantu languages." It is being organized by the University of Helsinki in Finland with several partners and sponsors. The provisional program and abstracts are available on the conference site.

The series of linguistic conferences of which this event is a part considers the branch of the Niger-Congo language family known as Bantu. Bantu languages are spoken in large parts of Southern and Central Africa, as well as in East Africa.

The series, which has involved many prominent international scholars in African languages and linguistics, goes back several years with conferences in various locations in Europe (this incomplete list gleaned from several sources):
  • (First)
  • Bantu Languages: Analysis, Description and Theory, 4-7 October 2007, University of Gothenburg, Sweden
  • Bantu 3, 25-27 March 2009, Tervuren, Belgium
  • "B4ntu," 7-9 April 2011, Berlin, Germany (Bantu 4 was originally scheduled for 22-26 March 2010 at Lancaster University, UK, but had to be postponed)
  • Bantu 5, 12-15 June 2013, INALCO, Paris, France

Borderland Linguistics Conference

The Borderland Linguistics Conference will be held on 27-28 June 2016 at the University of Bristol, UK. This is not specifically related to Africa; however, the program includes three presentations on languages in Africa. Also, given the attention in this blog to "cross-border languages" in Africa, it seems especially appropriate to mention this event.

The conference theme is described this way:
The notion of border is highly complex and problematic, whether it be an officially demarcated border between two states, or a less rigorously defined meeting space of somehow differentiated social or ethnic groups. Leading theorists have proposed that a broad-reaching 'theory' of borders may in fact be infelicitous, due to the contextual specificities of each different border area that may constitute an area of study. Nevertheless, borders remain fruitful sites for scholarly inquiry, and this conference invites contributions from linguistics researchers of all levels whose work focuses on borderlands.


The LSSA / SAALA / SAALT Joint Annual Conference for 2016 will be held at the University of the Western Cape in Bellville, South Africa on 4-7 July 2016. The three organizations running the conference are: Linguistics Society of Southern Africa; Southern African Applied Linguistics Association; and South African Association for Language Teaching.

The conference theme - "Language and Linguistics in the Global South: Posing the Challenge" - is framed "within the current context of demands for radical changes to academic content and access at our universities" and encourages contributors to address "issues of decoloniality and southern theory in linguistic research and teaching." The topics of the conference include: applied linguistics; language practice; language teaching; linguistics; sign language; sociolinguistics; multilingualism; discourse analysis; and linguistic landscapes.

TripleA 3

The Semantics of African, Asian and Austronesian Languages (TripleA 3), 6-8 July 2016, Tübingen, Germany, is the third in a workshop series that "aims at providing a forum for semanticists doing fieldwork on understudied languages. Its focus is on languages from Africa, Asia, Australia and Oceania."

Semantics is a branch of linguistics concerned with the study of meaning. The TripleA 3 program includes a number of presentations on African languages.


The attentive reader will notice that three of the four events or series take place in Europe. This is partly a function of chance in the time period chosen, although it is true that Northern institutions have the resources to sponsor such meetings.

Normally it is more useful to post the calls for participation/papers (CFPs), but these are published regularly on relevant sites including Linguist List. This blog is not intended as a reliable source for such news, but will hopefully continue to carry information about interesting meetings and events relating to African languages and the information society. (That said, an upcoming post will feature two CFPs that may be of interest.)

(The section on the Borderland Linguistics Conference was updated on 14 June 2016 with information provided by its organizer, Dr. James Hawkey. Information on the 2016 LSSA / SAALA / SAALT Joint Annual Conference was added on 17 June 2016.)