Friday, December 31, 2021

On the eve of IDIL, some reflection

Just a quick signal at the end of 2021 to say that I hope to resume posting here intermittently in 2022, which also will be the first year of the International Decade of Indigenous Languages (IDIL).

The place of African languages among indigenous or autochthonous languages - whether all of the first languages of the continent are included, or only some, and by what criteria - seems to me to still be an open question. I've explored that question in four posts on this blog, in case any of them might be of interest.

Although my schedule permits much less space than I would like for research and writing, and I am balancing pursuit of other interests as well, I remain keenly interested in African languages and their use in education, development programming, and advanced technologies. The next decade may be critical for the future of African languages, as language technology quickly advances with or without them, and the dynamics of inter-generational transmission of them change or even break down.

At the same time, I have been reflecting on what are and aren't appropriate roles for non-Africans in advocacy for, and pushing our ideas about, African languages. It is not as if these issues have never occurred to me, but observing the field and the general tendency of outsiders, especially of relatively privileged background, to populate all sides of any discourse on Africa, I needed to take a step back.

This is not to backhandedly call into question all people from one culture who propose to do work relating to another culture - in fact, all cultures, and the world in general, need such diverse perspectives. However, between my part of the world and Africa, there are asymmetries of influence. So I am taking a moment to reflect on those, and how what I think I've been doing benefits from them, and what might be the inadvertent messages and unintended effects of that work.

Other than that, all is fine, Yerkoy sabu.

Best wishes for a Happy New Year 2022!

Sunday, February 21, 2021

IMLD 2021: What multilingualism means for inclusion

This year's International Mother Language Day (IMLD, 21 February 2021) has as its theme, "Fostering multilingualism for inclusion in education and society." Of the key terms in this expression, most need little if any explanation. But the meaning of one of them - inclusion - merits attention.

When we speak of - or speak - mother tongues and second (or additional) languages, as individuals, communities, and countries, that is "multilingualism." "Education" is often thought of as connected with schooling, but it also includes other modes of learning and sharing knowledge. These two concepts - multilingualism and education - along with specific attention to mother tongues, have been recurrent themes in IMLD observances over the years. And they are often discussed together - for example as "mother tongue based, multilingual education" (MTB/MLE).

"Society" seems impossibly broad, taking in pretty much everything we do. That's certainly appropriate in this context, as language and choice of languages are fundamental to our communication, interaction, and collective memory (not ignoring the tandem role of images and non-verbal forms of communication). However, I read a new and encouraging dimension to the mention of society in this year's IMLD theme: a link with sustainable development.

In an IMLD 2021 concept paper by the UNESCO Education Sector and the Organisation Internationale de la Francophonie, multilingualism is linked to the "Sustainable Development Goals’ focus on leaving no one behind" (which, by the way, actually has an acronym: LNOB). This IMLD concept paper also mentions the centrality of multilingualism for indigenous peoples' development - from which we can assume that where many languages are spoken, engaging constructively with that reality is fundamental for everyone's development.

Before coming to what I see as the key term in this year's IMLD theme - inclusion - note should be made of the action word at its beginning: "fostering." I hear this as calling out the importance of policies and planning, and the effective implementation of these. The issues of multilingualism need active attention from not only individuals, but also governments and organizations, without which we have words and no action (and in the end, no words either).

Whose inclusion?

"Inclusion" turns out to be a tricky concept. Although I accept that the intent of its use in the IMLD 2021 theme is positive - the notion that languages can facilitate education and full participation in society - this term can also carry some less positive meanings. Chief among these is the implication that there is an outside and an inside, and those inside define the terms for bringing the outsiders in. There's a potential inequality there that led one writer from a community development perspective to propose abandoning the word inclusion altogether.

In schools, inclusion is often (at least in the West) used in reference to students with special educational needs (and within that setting, one writer identified eleven definitions!). This does not seem to be an appropriate analogue for promoting multiple languages in education.

In the context of language and languages in Africa, the question of inclusion seems to me to become more complex. Upon independence, most African states opted to keep the colonial languages as official, rather than promoting use of one or more among their indigenous languages. This in effect put various ethno-linguistic groups within their borders at the same disadvantage (no one "inside group" controlling power). However, it also gave those who were fluent in the official languages an advantage, which is maintained through a dynamic described by Prof. Carol Myers-Scotton as "elite closure" (in other words, there's an inside group after all).

At the same time, as Prof. Ayo Bamgbose once observed, many states operated on the paradigm that "one language always unites and many languages always divide." So in effect inclusion or exclusion in education and society have been defined to a large degree (in linguistic terms) on the basis of use of the one official language.

It is possible to find "inclusion" on a more shallow level, in a group or structure that does not fully value what one brings to it, or that requires higher sacrifice from some than from others. That's as true regarding mother languages as it is regarding other aspects of culture.

While acknowledging the utility of a common language (lingua franca), the fostering of multilingualism seems to me to have the potential to shift the basis for inclusion from something centrally controlled or defined by a limited group, to a dynamic with more entry points.

Thursday, December 31, 2020

A sabbatical, of sorts

The year 2020 has been a kind of Pandora's box of chickens coming home to roost. Unexpected in particulars, but not altogether unpredictable in terms of the kinds of problems we've seen. The COVID pandemic in particular has caused suffering and death, and then grief in the wake of those losses.

Against that backdrop - and thankful that my family and I have personally escaped the worst (Yerkoy saabu!) - I've taken a step back from posting here to reflect on what I am doing with this blog, and what are appropriate roles for a non-African and non-linguist like me in advocacy for African languages (of which I can claim some degree of mastery of only two).

To be honest, there have also been unanticipated changes in my schedule that have disrupted whatever rhythm I had in writing. And a number of interests competing for time and attention.

As we head into 2021, with hopes for better for everyone, one of my plans is to resume posting here, at least intermittently, in an effort to contribute constructively and appropriately to discussion about the place of African languages in the global information society.

Friday, February 21, 2020

IMLD 2020: "Languages without borders"

The theme of this year's International Mother Language Day (IMLD2020; 21 February 2020) - "Languages without borders" - seems especially appropriate for Africa. So many (almost all?) African languages - or more accurately, populations having a common mother tongue - are divided by borders established during the colonial period (and prudently maintained in the interests of peace). And people on opposite sides of these borders continue to use their languages, which former Malian president Alpha Oumar Konaré once referred to as "sutures" linking neighboring African countries.

The African Academy of Languages (ACALAN) has had structures in place for some years to work on African "cross-border vehicular languages," which are larger in numbers of speakers and geographic extent of use. In her message on the occasion of IMLD2020, UNESCO Director-General Audrey Azoulay calls attention to all cross-border languages - all mother tongues that cross borders:
For IMLD 2020, UNESCO has chosen the theme of languages without borders to draw attention to the way in which all languages, including mother tongues, contribute to intercultural dialogue and peace. Indeed, throughout the world, numerous cross-border languages bring their speakers closer to one another, turning borders into bridges instead of barriers.

40% of people not educated in mother languages

In her IMLD2020 message, Mme Azoulay also calls out another fact that is particularly relevant to Africa:
Moreover, mother tongues are valuable allies in our efforts to achieve quality education for all. In fact, as UNESCO studies have shown, studying in a language which is not one’s own interferes with learning and increases inequalities. Yet according to the most recent estimates, 40% of the world’s citizens find themselves in this situation. Bilingual or multilingual education based on students’ mother tongue not only encourages learning, but also contributes to understanding and dialogue among peoples.

As with all IMLDs, we hold out the hope that this observance may inspire more progress in use and development of the diverse languages that are productive and cherished parts of our common human heritage.

Saturday, December 28, 2019

AfricaNLP2020 (Addis Ababa, 26-4-20) & related items

Quick post to call attention to an upcoming workshop on machine learning (ML) and natural language processing (NLP) in African languages, and its call for participation. Also a list of related initiatives, including the Machine Learning and Data Science in Africa (MLDS Africa) forum.

AfricaNLP2020 workshop - "Unlocking Local Languages"

The AfricaNLP2020 workshop will be held on 26 April 2020 as part of the Eighth International Conference on Learning Representations (ICLR) in Addis Ababa, Ethiopia. The workshop is described as follows:
"The rise in ML community efforts on the African continent has led to a growing interest in Natural Language Processing, particularly for African languages which are typically low resource languages. This interest is manifesting in the form of national, regional, continental and even global collaborative efforts to build corpora, as well as the application of the aggregated corpora to various NLP tasks."

The workshop aims are described as:
"• to showcase this work being done by the African NLP community and provide a platform to share this expertise with a global audience interested in NLP techniques for low resource languages;
• to provide a platform for the groups involved with the various projects to meet, interact, share and forge closer collaboration;
• to provide a platform for junior researchers to present papers, solutions, and begin interacting with the wider NLP community;
• to present an opportunity for more experienced researchers to further publicize their work and inspire younger researchers through keynotes and invited talks."

Submissions "for oral and poster presentations on a wide variety of NLP tasks for African languages ... will be evaluated and selected through a peer review process." Deadline: 1 February 2020. (They can be submitted via

Corpora-building, ML, MT, & NLP initiatives

The workshop page lists six collaborative efforts on African languages, which I'll list below, as seen on their page, along with a seventh I learned about recently:
  • Niger-Volta Project - Speech Recognition, Language Identification, Machine Translation & Natural Language Processing for West African Languages 
  • - A Focus on Machine Translation for African Languages
  • - A crowdsourced dataset builder and community for NLP in underrepresented languages (apparently translating MS-COCO captions into Afrikaans, Amharic, Bukusu, Coptic, Fanti, Luganda, Luo, Masai, Meru, and Nandi)
  • - An initiative to build a Natural Language Processing Platform for Kinyarwanda and to make it available to all developers and for all use cases
  • EthioNLP - Ethiopian Natural Language Processing Research
  • AI4D - African Language Dataset Challenge - A community effort to help uncover and create African Language Datasets for improved representation in the field of NLP (see also an update on its "Dataset Challenge" from 23 Dec. 2019)
  • PidginUNMT - Unsupervised Neural Machine Translation from West African Pidgin to English (this was written up on Techcabal on 16 Dec. 2019)
It's great to see this kind of activity related to language technology. I've often thought that multilingual Africa has the potential to lead and innovate in this area.

MLDS Africa

MLDS Africa is an online network with a Googlegroup for communication among research groups such as the above, and a webpage with info on upcoming conferences and workshops, like AfricaNLP2020.


The image above connected with the ICLR conference hosting AfricaNLP2020 came from a page on with details on papers accepted for the main conference as of 20 Dec 2019. (The workshops on the first day of ICLR, such as AfricaNLP2020, evidently have their own deadlines.)

Monday, December 16, 2019

Yahoogroups & African languages, follow-up

A quick follow-up to my previous message regarding the deletion of Yahoo Groups. In brief, the saga is still ongoing. This message will provide an update and mention other Africa - and especially African language - related groups (as maybe a missing dimension in this saga).

I've spent what time I could on the matter of saving the content of several groups of interest mentioned in the previous message. For AfrophoneWikis, Etienne Ruedin was particularly helpful in outlining possibilities and sharing downloaded backup files. And for Unicode-Afrique, Tafsir Baldé offered help and suggested a possible partnership with Idemi Africa. (See also below re Archive Team.)

In the meantime, Yahoo has extended the period during which users can request their data in downloadable form [1] - and apparently also pushed back the date for pulling down the group pages - to 31 January 2020. [2] The message content of the groups, however, is no longer accessible.

For the groups of interest, one question is what is the future of their content, and another is what future for those that are still viable. The data question is somewhat answered to the extent that individual users request their data, but Yahoo does not facilitate archiving elsewhere in a way that would support long-term access.

The larger question asked by many is why make this move in the first place. I won't explore that here, but it does highlight the problem of having such a big chunk of internet history subject to one corporation's short-term interests.

The Archive Team's efforts

What's worse is that when a volunteer group - the Archive Team - ramped up an effort to save Yahoo Groups, Verizon (Yahoo's owner) evidently worked to block such "mass-archiving" efforts. Hard to explain that.

It was only late during the scramble to do the necessary to save data (some of which had to be done manually) that I learned of the Archive Team's initiative for Yahoo Groups. And only now, digging deeper, did I realize that they had already archived a significant number of them on I was interested to see that those archives already included AfrophoneWikis, Unicode-Afrique, and AfricanLanguages. [3] But they do not cover all groups, especially not the smaller ones.

Overlooked African dimension of Yahoo Groups?

One aspect of Yahoo Groups that I haven't seen discussed, but which over the years I got the impression was important, is significant use by Africans. In other words, Yahoo Groups was never just a European and "core Anglophone" platform. Yet there's not much of an African voice I'm aware of in this discussion about the end of Yahoo Groups.

Among the most active Groups founded and participated in by Africans that I've noted are OmoOdua ("Yoruba socio-cultural discussion forum," with 1660 members and more than 160K messages since 2007) and Mwananchi ("A current affairs forum on Africa and the issues affecting the continent," with 1541 members and more than 180K messages since 2000).

However, there are many that are more modest in size, such as Internet-Niger ("Internet au Niger," with 589 members and just under 16K messages since 1999). So basically I'm suggesting that among the stories of who loses with the deletion of Yahoo Groups, one that isn't getting much if any play would be the large user community in or of Africa.

African language related Yahoo Groups

Among the groups of personal interest are the five I listed in the previous posting that deal in one way or another with African languages. Three of those - AfricanLanguages, AfrophoneWikis and Unicode-Afrique - are ongoing lists, although like most other Groups, much less active than they once were. The other two - A12n-archives and PAL-archives - were set up as back-up archives for lists whose original archives were then deleted some time later. Fortuitous to have had the back-ups, but it is ironic that they too got the axe.

So here I'll list a few other African language related Yahoo Groups that I'm aware of - most of them tiny in terms both of membership and activity. However they reflect a range of interests, even if they were not always as successful in this medium as their creators evidently had hoped.
  • Ethiopic (2001, 19 members, ~100 messages). It was "established to begin dialogue and understanding regarding the computer keyboard layout for Ethiopian Languages." Its ambitious goals were somewhat obviated by work elsewhere on Ethiopic/Ge'ez keyboards.
  • Kiswahili (2001, 993 members, ~50K messages). "A forum about the latest Swahili news." I did not have much to do with this list, but had the impression it had a lot of activity and much of it was actually in English.
  • Linux2Igbo (2004, 20 members, 174 messages). "This is a project that aims to translate Linux GUI such as KDE and GNOME and popular software such as Open Office, Mozilla Firefox and Thunderbird into the Igbo language." I was asked to serve as a moderator of L2I, and made some contributions (even though I do not speak the language).
  • Mandenkan_sebeli (aka Mandenkan sebe web ka; 2003, 6 members, ? messages). Intended to promote writing of Mandenkan (Manding languages) on the web. I don't recall that this group had any archived messages - indeed the Yahoo data I received had only links for it.
  • Mzi_kaPhalo (2001, 71 members, ~150 messages). It was "set up mainly to serve the Xhosa Translators' Community, however other Xhosa language issues can also be discussed." My impression was that it was active for a brief time then mostly quiet.
  • WowlenPular (2002, 7 members, 15 messages). Set up to promote the "survival" and use of Fula, with accent on the Pular of Guinea. I posted several of the few messages on this group, including some excerpts of texts from books in Pular.

Next steps

The data I got from Yahoo included .mbox files with all the message, link, and file content of the groups I was subscribed to, including all of those mentioned above. In theory I think one could with these files reconstruct any group if one had a reason to do so and a new host for it. So that is one possible discussion for certain groups.

On the technical side, I am not sure how the versions saved on might be used if one wanted to continue a group. This question might be useful to explore.

For most groups, the most one may want to do would be to present the data in an easily accessible, navigable, and readable format. This could include many Yahoo Groups about (or in) African languages, and that could be another discussion. 
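
As a rough illustration of what a first step toward a readable presentation might look like, here is a minimal sketch that reads an .mbox file and builds a chronological index of its messages. It uses the mailbox module from the Python standard library. It assumes the downloaded files follow the standard mbox format, which I have not verified for every group; the function names index_mbox and format_index are made up for this example, as is the file name in the usage note below.

```python
import mailbox
from email.utils import parsedate_to_datetime


def index_mbox(path):
    """Return (date, sender, subject) tuples for every message, oldest first."""
    entries = []
    # create=False: fail if the file is missing rather than create an empty one
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            raw_date = message.get("Date")
            try:
                when = parsedate_to_datetime(raw_date) if raw_date else None
            except (TypeError, ValueError):
                when = None  # keep messages whose Date header is malformed
            sender = str(message.get("From", "?"))
            subject = str(message.get("Subject", "(no subject)"))
            entries.append((when, sender, subject))
    finally:
        box.close()
    # Dated messages first, ordered by timestamp; undated messages go last
    entries.sort(key=lambda e: (e[0] is None, e[0].timestamp() if e[0] else 0.0))
    return entries


def format_index(entries):
    """Render the index as plain text, one message per line."""
    lines = []
    for when, sender, subject in entries:
        stamp = when.strftime("%Y-%m-%d") if when else "unknown"
        lines.append(f"{stamp}  {sender}  {subject}")
    return "\n".join(lines)
```

With a hypothetical file named AfricanLanguages.mbox, calling print(format_index(index_mbox("AfricanLanguages.mbox"))) would list its messages by date. Threading, attachments, and the link and file content are left out of this sketch, and would need more work for a properly navigable archive.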

1. See Barbara Krasnoff's helpful article in The Verge, "How to download your Yahoo Group data."
2. Per "Yahoo is Extending It’s Deadline To Delete The Content Of Yahoo Groups,", 13 Dec. 2019. The wording was not so clear, but one had the hope that the message content online might also endure a while longer. That hope was not borne out.
3. These archives are organized in batches of groups, and the files for each individual group are available in gzip format. These apparently were saved over the course of a few years - AfricanLanguages for instance in 2016, Unicode-Afrique in 2017, and AfrophoneWikis in 2018.

Thursday, October 31, 2019

Yahoogroups & African languages

Yahoo's decision earlier this month to cease hosting user-created content and archives of its Yahoo! Groups appears to mean a loss of another sliver of internet history, including substantive discussions about African languages. At a time when technology is being used to collect data in massive amounts, even on trivial matters, this seems hard to justify in the way it is being done, even if one understands the bottom-line calculations of Yahoo corporation.

Two immediate issues faced by the people connected with each Yahoo Group are whether and how first to preserve their group's message archives and other content, and second to continue the group as a list with another host. So I wanted to highlight a number of Yahoo Groups that deal (or dealt) with African languages in one way or another, and which will need help at least for permanent archiving before Yahoo pulls down the content in about 6 weeks (14 December 2019).

In this post I'll simply list a few of those that most interest me personally, with year of creation, number of subscribers, number of messages, and brief description. I may return to several of those and others in subsequent posts, all in the interest of preserving at least some of this history, and raising interest in archiving content in a way that makes it useful for future discussions and research. Keep in mind that the links in the list below will no longer function after 14 Dec.:
  • AfricanLanguages (1999; 134 subscribers; 1335 messages). About reading & writing African languages (in practice, includes many news items).
  • Unicode-Afrique (2002; 197 subscribers; 1599 messages). Discussion in French about Unicode, with accent on Africa.
  • AfrophoneWikis (2006; 99 subscribers; 756 messages). Wikipedia editions in African languages.
  • A12n-archives (2005; 2 subscribers; 1135 messages). Never an active list, this group archived traffic from several "A12n" lists with a total of about 200 subscribers.
  • PAL-archives (2006; 3 subscribers; 465 messages). Never an active list, this group archived traffic from 3 lists (in English, French & Portuguese, with cross translation) of the PanAfrican Localisation project.
I am aware of several possible ways of archiving Yahoo! Groups or transferring them to a new host. Any strong recommendations concerning one or another would be welcome.

(Image above by Jayson Willis, 2010, via