Saturday, October 31, 2015

Telling, writing and reading stories

mEducation Alliance logo
On Friday I had the chance to join the fifth annual mEducation Alliance Symposium and it got me thinking again about stories. Everyone has a story, as they say, and sometimes many. Technology may facilitate the telling of those stories, and it will be interesting to examine in more depth the work of organizations that are exploring this potential.

One of the things noted in some literacy efforts is how newly literate people will express themselves in the new medium - writing (literacy is not just learning how to read, a fact that is sometimes forgotten). Other efforts, even going back some years, have had interesting results putting video recorders in the hands of people who never had access to them and letting them document their livelihoods.

Now with smaller and more powerful devices in the hands of more people in more places, the issue becomes how to exploit the potential not only for delivering content from the usual centers and sources (with no disrespect...), but also for facilitating creation and collaboration by more people in more languages (including heretofore less-resourced ones), linking speech, text, and images in new ways. And then, how to bring rapidly evolving machine translation technology into the process.

Focus on the stories - from the autobiographic, to fiction, to community histories, to practical experiences...

I hope to come back to this soon with reference to at least a couple of current projects.

Tuesday, October 20, 2015

RFI's new Mandenkan service & webpage

RFI Mandenkan logo
Radio France Internationale (RFI) launched a new service in Manding - "RFI Mandenkan" - on Monday 19 October, along with a webpage for the service in French and Manding.

The bilingual webpage features what appears to be Bambara (one of the Manding tongues - see below), and uses an orthography close to the official one used in Mali. (One minor difference is the use of "ny" in place of the letter "ɲ".) It will be interesting to track how this content evolves (and whether and how VOA's "Mali Kura Bambara" service might respond).

Regarding the RFI Mandenkan broadcasts, there was apparently confusion among some (based on a message seen on an email list) as to what frequencies were to carry the first Mandenkan emissions - maybe this is typical for newly started shortwave radio services.

A comment about the broadcasts by Nafadji Sory Condé (on RFI's article announcing the service) called attention to issues with terminology in Mandenkan, and suggested that the broadcast staff consult with N'Ko specialists - this is an interesting perspective, since evidently some in the N'Ko movement are really studying the language in its diversity as part of developing and teaching a literary standard for Manding. In other words, the language itself is not lacking, but the range and depth of its vocabulary may not be known to all native speakers educated primarily in French-medium schools.

I took the chance to listen to some of the audio via the webpage (you can click on a couple of places at the head of the page to listen), and was suitably humbled (fast for my rusty L2 level), though I was able to get some of it. Overall, it sounded smooth, which I guess one should expect from this level of radio.

About Manding & Mandenkan

Manding is a term for a group of (mostly) mutually intelligible languages within the Mande language family of West Africa - notably Bambara, Jula (Dioula), Malinké, and Mandingo. It is spoken as a first or second language by significant populations in all or parts of several countries: Mali; Burkina Faso; Côte d'Ivoire; Guinea; Gambia; and Senegal.

Mandenkan is a way of saying "Manding language." Manding in English and Mandingue in French obviously originated from one or more of the Manding languages. However the term Mandenkan as a word in the language(s) to describe them together may be a recent construct (I recall hearing questions about it in the late 1980s, and a 2000 publication by Prof. Eric Charry seems to indicate that Mandenkan and another form, Mandekan, originated with linguists).

The kan suffix is interesting. While in English or especially French, "tongue" ("langue") is used to mean language or expression, in Manding, "throat" ("kan") is used to express language or voice. The endonyms for the Manding languages mentioned above are Bamanankan, Julakan, Maninkakan, and Mandinkakan.

RFI's choice of Mandenkan

RFI's choice of Manding is interesting for at least a couple of reasons. First, they deliberately chose a category that crosses many borders in West Africa (note the quote by Imogen Lamb in this piece from Mali Presses), as well as the standard linguistic demarcations (Bambara, Malinké, etc.). Compare with VOA's focus on Bambara in Mali. And second, this is the first African language service that RFI has undertaken in a primarily Francophone zone (demographically most of the potential audience for RFI Hausa is in Nigeria, not Niger). For a more complete discussion of these and related issues, see Coleman Donaldson's blog post from last year, when RFI's Manding project was still in formation, "RFI and Voice of America learn Manding."

(Of possible interest on a similar topic: "Hausa on the international radio websites.")

Sunday, October 18, 2015

International Mother Language Day 2015 in Africa

Poster for IMLD 2015
Yes, the 16th annual International Mother Language Day was observed several months ago (21 February), but it's still worth highlighting a few of the IMLD 2015 events and articles in Africa. In the absence (as far as I've seen) of any comprehensive summary of IMLD activities on the continent, perhaps this can serve as a small way of facilitating exchange of information about the different kinds of activities that governments, NGOs, and universities have organized. (It's only 4 months until the next observance, by the way.)

The list below is only a selection, and not in any way comprehensive. A note about this posting is included at the end.

Angola:

"Culture Minister Calls Angolans for Natural Coexistence of Mother Tongues" (the Culture Minister, Rosa Cruz e Silva, called attention to the plurilingual nature of Angolan society. Theme of IMLD there was "United in the linguistic and cultural diversity in Angola, let's valorize our integrity")

Benin:

"16ème journée internationale de la langue maternelle : Pour une éducation inclusive à travers et par la langue" (summary of observance in Kétou commune, which featured several government officials including the minister of culture, as well as the Nigerian ambassador)

Cameroon:

"La Journée Internationale de la Langue Maternelle - Le 21 février 2015" (Ceremony at the Ministère de l’Éducation de Base in Yaoundé, including young students who were taking part in experimental bilingual classes of a project of ELAN-Afrique)

Congo (DRC):

"Le ministre de la Culture et des Arts Banza Mulakaly s’investit dans la promotion des langues maternelles" (the minister of Culture and Arts spoke at a ceremony organized by the Observatoire des langues)

From "International Mother Language Day 2015 Celebrations" compilation (Shalom University of Bunia planned two half-day conferences with activities including presentations from eleven researchers involved with the region’s languages and speeches from academics and public officials about the value of the mother tongue.)

Ghana:

"Ghana joins the world to celebrate International Mother Language Day" (observance at University of Education, Winneba, Ajumako Campus, featuring a talk by Dr. Avea Ephraim Nosh; alternate link here)

"EDUCATION: UEW students mark Int’l Mother Language Day" [website now offline] (another story on the above event, mentioning contributions to the discussion by Dr. Paul Opoku-Mensah, Prof. Samuel Asiedu Addo, and Mr. Samuel Donkoh)

"Radio broadcast in Akan, by Peter Essien" (calls for the same amount of attention to using Akan correctly on the air as is given to using English correctly)

"How Bible saved 8 Ghanaian languages" (highlights discussion by Rev. Erasmus Odonkor that translations of the Bible have played a key role in maintaining vitality of and literacy in several Ghanaian languages)

"Government urged to revive minority languages" (very brief item on observance in Accra in which experts called for language policy to promote mother tongue education for minority languages)

Kenya:

"International Mother Language Day: How well do you know your mother tongue?" (TV feature on IMLD including visit to classroom and interviews with people in the street)

"UoN Hosts International Mother Language Day" (a 2-day event hosted by University of Nairobi and co-organized by the Dept. of Linguistics and Languages, UNESCO, and an organization called Bible Translation and Literacy)

Madagascar:

"Tenin-dreny - La langue maternelle importe pour l'éducation" (highlights role of first lady Voahangy Rajaonarimampianina, the ministers of l'Artisanat, de la Culture et du Patrimoine [Brigitte Rasamoelina] and of Communication [Vonison Andrianjato] in the observance, which included the awarding of the "Fantaro" prize; mention also of other events)

Mali:

"Langues maternelles : UN LEVIER PRIVILEGIE POUR L’INTEGRATION SOCIALE" (Observance in Ouelessebougou, organized by l'Académie malienne des langues [AMALAN] and presided over by minister of education Kénékouo "Barthélemy" Togo)

Namibia:

"Chinamibia celebrates International Mother Language day" (Children of Namibia - ChiNamibia - and the Franco-Namibian Cultural Centre held an event including dance, drama, music, and poetry that was intended to highlight the importance of mother tongues)

Nigeria:

"NICO Supports International Mother Language Day Celebration 2015" (article on the National Institute for Cultural Orientation site about planning for a later observance of IMLD)

"Nwaozor - The Significance of Mother Tongue" (newspaper op-ed on IMLD in which the author discusses the importance of first languages and makes some policy recommendations)

Seychelles:

"International Mother Language Day – Discovering poetry in Seychelles" (Seychelles Writers’ Association, known locally as "Lardwaz," held a short ceremony on IMLD, in which they revealed their proposed calendar for 2015)

Somalia:

"International Mother Language Day marked in Mogadishu" (the president of Djibouti and government representatives from Ethiopia and Kenya joined the observance in Mogadishu; article also mentions laying the cornerstone of the Somali Language Academy, about which more here)

South Africa:

"Deputy Minister Rejoice Mabudafhasi observes International Mother Language Day, 21 Feb" (scheduled meeting with "future language practitioners" at University of Limpopo, Turfloop Campus, in Mankweng, Limpopo Province)

Tanzania:

"International Mother Language Day 2015" (blog posting by a Welsh person working in Tanzania, making a connection between most children speaking at home a language other than the Swahili used in school, and their lower scores in Standard 3 exams than native Swahili speakers)

Uganda:

From "International Mother Language Day 2015 Celebrations" compilation (Literacy and Adult Basic Education [LABE], a local NGO, planned language activities including reading and storytelling competitions in 6 post-conflict districts of northern Uganda)

Zimbabwe:

"Celebrate International Mother Language Day" (about preparations for celebrating IMLD by the San people in Tsholotsho)

Notes about this post & holding reserve posts

I actually started this post just after IMLD in February, but put it aside to focus on some other items. I added to it on a couple of occasions, and ultimately decided it was time to finish and publish.

As such this is a special case in an aspect of blogging I discovered over the years - having a number of drafts in the cooker, as it were. Some of those eventually I return to and complete more or less as planned, some change, and for others, the material in the draft might get repurposed. In this case, there was an obvious time limit on material related to an annual event (yours to decide if it already passed a more functional limit of utility). It turned out as intended, except that there was a second section to the original draft of this post that ultimately was not necessary, but that I may use later.

Monday, October 12, 2015

The secret life of Bambara Arial

Quite unexpectedly, I heard yesterday the names of two old 8-bit Malian computer fonts - Bambara Arial and Bambara Times - from two different people. Matt Heberger, who is coordinating work on the forthcoming Bambara translation of Where There is No Doctor, mentioned that one of the translators provided him with copies of these fonts (as .ttf files), and Sam Samake, an old friend and former Peace Corps/Mali staff member, asked how I got "Bambara Arial" to work on the internet.

It wasn't supposed to be this way once Unicode rendered such special fonts unnecessary.

The history

In the 1990s, a joint project of the Malian Ministry of Education and the French Agence de Coopération Culturelle et Technique (ACCT; precursor to the Organisation Internationale de la Francophonie), produced two modified/hacked versions of the Arial and Times fonts, replacing characters not used in the orthographies of Malian national languages with characters not present in those original fonts or other commercially available fonts of that era. Thus for example, the "q" was replaced with "ɛ"* (type "q" get "ɛ" instead).

Table from Enguehard & Mbodj 2004.* Caractère affiché is what you get;
Caractère initial is original character & the key you tap to get the new one.
This approach for adapting fonts for writing systems not supported by the pre-Unicode standards was fairly common in the 1980s and 1990s, including in a number of African countries like Mali. Fonts like these achieved what they were primarily intended for - being able to compose and print documents with the extended alphabets. One could also share digital copies of documents, but those could be properly read only with the same fonts. Different modified fonts - and Mali had several - were mutually incompatible. That was the whole reason, of course, for Unicode, which also makes it possible to share documents in any alphabet across the internet on any browser, wordprocessor, etc.
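To make the incompatibility concrete, here is a minimal Python sketch of how text typed with such a font could be converted to Unicode. Only the "q" to "ɛ" substitution comes from the example above; the uppercase entry and the names LEGACY_TO_UNICODE and legacy_to_unicode are assumptions for illustration, and a real converter would need the full table published by Enguehard & Mbodj.

```python
# Hypothetical conversion table: the character stored in the old document
# (the key that was typed) -> the Unicode character the hacked font displayed.
LEGACY_TO_UNICODE = {
    "q": "\u025B",  # ɛ, the example given in the text
    "Q": "\u0190",  # Ɛ, ASSUMED uppercase counterpart (not confirmed by the source)
}


def legacy_to_unicode(text, mapping=LEGACY_TO_UNICODE):
    """Replace each legacy stand-in character with its Unicode equivalent."""
    return text.translate(str.maketrans(mapping))


# A document typed as "kqnq" only looked right in Bambara Arial;
# after conversion it reads correctly in any Unicode font.
print(legacy_to_unicode("kqnq"))  # kɛnɛ
```

A blanket replacement like this is only safe for text known to be entirely in the legacy font: a French passage in the same document would have its genuine "q" characters converted as well, so a practical tool would probably need to work from the font information attached to each run of text.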

But while Unicode became the international standard, evidently at least some people in Mali kept using Bambara Arial and perhaps other similar "special fonts." In 2005, USAID-funded "Community Learning and Information Centers" (CLICs) relied on these fonts for anything done in Malian national languages (apparently not that often). It may be that technicians in these telecenters did not have Unicode explained to them in their project training or prior study of computers.

The word about obsolescence of 8-bit fonts like Bambara Arial may not have gotten too far, or maybe the notion of a need for a "special font" to process text in languages like Bambara just was too ingrained. At this point I'm just wondering how after almost 2 decades, these old fonts are still in circulation and conversation. Just two years ago, there were these references to Bambara Arial online (thanks to a Google search):
  • N'oublie pas de les ecrire en vrai Bambara ''Bambara Arial''Dans Microsoft Word (Facebook, 2013-1-11)
  • I would like to arrange for volunteer translators for Bambara. How can I access fonts for the Bambara alphabet (Bambara Arial for example)? (Google code Khan Academy issue tracker, 2013-3-23)


More to it?

Adapted from an image by Denis Jacquerye.

There's a twist to the story though. It seems that in one respect, these old fonts follow Malian orthographies better than the Unicode fonts. The letter ɲ has two forms of upper case: one like the lower case letter but bigger ("n-form"), and the other like the capital N with a tail on the left leg ("N-form"), shown on the left and right sides respectively of the accompanying image. The "n-form" is most used in Mali and apparently in the hacked 8-bit "special fonts" like Bambara Arial; most Unicode fonts use the "N-form" (thanks to Matt Heberger for his observations on that). I doubt that this alone could account for the persistence of the old fonts, but it might be a factor.
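As a small illustration of why this is a font matter rather than an encoding one, the Python sketch below (using only the standard library) shows that Unicode defines a single uppercase code point for ɲ. Whether that code point is drawn in the "n-form" or the "N-form" is decided by the font's glyph, not by the text itself.

```python
import unicodedata

lower = "\u0272"       # ɲ, LATIN SMALL LETTER N WITH LEFT HOOK
upper = lower.upper()  # the one uppercase mapping Unicode defines for it

# Print the code point and official character name of the capital letter.
print(f"U+{ord(upper):04X}", unicodedata.name(upper))
# U+019D LATIN CAPITAL LETTER N WITH LEFT HOOK
```

So a re-encoded font could, in principle, keep the shape Malian readers expect while still using the standard code point.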

Maybe a new life could be given to the old fonts that people are still using by reencoding them to Unicode and releasing them under recognizable names.

* Two articles in French mention modified 8-bit fonts used in Mali, showing which different characters were changed to extended Latin characters:  
Chantal Enguehard, et Chérif Mbodj. "Correcteurs orthographiques pour les langues africaines." Bulag 29, 2004, pp. 51-68.
Chantal Enguehard, et Soumana Kané. "Langues africaines et communication électronique : développement de correcteurs orthographiques." Agence universitaire de la Francophonie. Actes des Premières Journées scientifiques communes des réseaux de chercheurs concernant la langue, 31 mai-1 juin 2004, Ouagadougou, Burkina Faso. pp. 59-75.

Saturday, October 10, 2015

Facebook page & the PanAfriL10n wiki

Profile picture for Beyond Niamey's
new Facebook page.

Two quick items:
  • There is now a Facebook page for Beyond Niamey: Please have a visit and let us know what you think.
  • Am still working on getting the PanAfriL10n wiki into more respectable shape so as to make it again available for reference and contributions.
New entries to this blog are still being posted to Twitter (@donosborn) via RSS.

Thursday, October 08, 2015

Access gap in the "Connectivity Declaration"?

History sometimes moves in spirals - after years have passed and many things change, you suddenly find yourself back in the same place, but maybe on another level. I had that sense of déjà vu in Bamako in 1999, discovering that projects were still hacking mutually incompatible fonts to be able to display characters used in Malian languages on computers, much like what Prof. David Dwyer and I had to do in 1989 in order to print the first draft of the Fulfulde lexicon. The intervening decade had of course seen some remarkable advances in information technology including the unfolding of the world wide web, but there was no perceptible change in terms of ability to use African (and many other) languages on computers.

I'm getting a similar sense of already-been-there today, reading about Facebook's plan to beam the internet to remote parts of Africa and related news about the benefits of universal access to the internet including the "Connectivity Declaration."

"Access" means more than one thing

The current focus on universal internet access has me thinking back to that same period in Bamako mentioned above, when there was a great deal of enthusiasm on international development-focused email lists about how information technology and the WWW were going to transform, among other things, rural development in Africa. But while working in Mali it was clear that even if by some miracle one could get the technology to farmers, it still wouldn't speak their languages, nor even be able to properly display text in those languages, due to the font issue. (This observation was one of several leading to the Bisharat initiative, which is another story.)

From this point of view it was clear that "access" had more than one aspect, even though most discussions treated it as one issue. It was one thing to have a computer, and another to power and connect it, and another to pay for all that, and then ... with physical access secured, for a farmer or extension agent in, say, rural Mali, to actually make full (or any) use of it. The latter can be called "soft access," a term first used in contrast to "physical access" by an organization called TeleCommons (quoted here), which covers localization of content and interfaces in languages of the anticipated users. (Dwayne Bailey's reference to "last inch limitations" is another way of thinking about lack of soft access.)

Fast forward to the present, where the Connectivity Declaration mentions "access" five times and "accessible" once, but again as one undifferentiated problem to be solved and good to be achieved. And although one might interpret the intent more broadly, the focus does seem to again be technical. The Declaration certainly does not hint at the soft access issues inherent in diverse peoples taking advantage of "the tools and knowledge of the internet."

A focus on internet access as a connectivity issue is inadequate when discussing use by marginalized populations that are diverse in terms of their languages, cultures, and education levels (addressed by soft access), and in terms of other socio-economic factors (that may affect overall access). The initiative to its credit does mention "Content isn’t available in the local language" (in response to "Why aren't more people connected") as a "barrier" it wishes to address, but this is only part of the equation - there are also technical language support and interface localization needs that fit under soft access.

Current issues and potential in soft access

So we're back at more or less the same place with regard to "access," but on another level with some changes:
  • Unicode has largely resolved the font incompatibility issues mentioned above (I'd like to say completely resolved but there are still hiccups in implementation)
  • Localization of software and apps in less-resourced languages is incomplete and sometimes uneven, but it is happening
  • Input methods (hard & virtual keyboards/pads, STT) are still problematic for many extended Latin scripts and non-Latin writing systems, but the main issues now are arguably less technical than policy-based (lack of standards)
  • Human language technology / natural language processing offers the potential to
    • bridge text ⇄ speech for speakers of less-resourced languages with new forms of soft access as well as content delivery
    • bridge languages by machine translation of content
  • Content localization and creation, which like software localization is incomplete and sometimes uneven, extends the concept to include soft access to knowledge
So we're much better situated to enable soft access today than we were 10-15 years ago, but there is still not the same focus on these issues as on the simpler (which is not to say easy) challenge of technical connectivity. Soft access means more and can be facilitated by more than was previously the case, but it will not come in the same access "box" as internet access without explicit attention. Part of the reason is that "universal internet access" is implemented on a wide scale (national, regional, continental), while soft access in the case of most less-resourced languages is particular to smaller areas and populations.

Soft access needs attention from people and organizations closer to the need - governments, NGOs, companies, and communities on the national and local levels, as well as regional and continental organizations (such as ACALAN). However I would suggest it also needs advocacy from the same people in positions of influence and power - like Mark Zuckerberg, Bill and Melinda Gates, and the other originators of the Connectivity Declaration - who are promoting universal internet access.

Whose "tools and knowledge"?

A blog post by Dr. Mark Graham of the Oxford Internet Institute in response to a tweet by Jimmy Wales (another one of the originators of the Connectivity Declaration) ...
... raises some issues about the impact of corporate-sponsored internet access that are worth considering in this context (these do not represent all of the analysis in the article). What kind of internet service or choice of content, and under what conditions (ads?) would Facebook's, or Google's Free Zone (another free concept) provide? And what might be the impact on local internet markets? Dr. Graham suggests that
In much the same way that food aid as a development strategy harmed local farmers and markets in Africa, “connectivity aid” could similarly destroy the evolution of local content, local innovation and local alternatives.
This is an interesting point, though it bears noting that approaches to food aid have evolved, and in any event, research on its impact has reached diverse conclusions. It is however a legitimate question whether "connectivity aid" would be a simple "Pareto improvement" (some gain, no one else loses) for the societies receiving it.

The impact of free internet on soft access and local(ized) content is another interesting unknown. It may cut either way, with the potential for support for languages down the long tail* if the corporations decide (perhaps in facilitating work of local initiatives?), or for an economic decision on their part to limit focus to languages at the head of the tail (i.e., the ones the most spoken). For some languages at least, this might become somewhat complicated if you take into consideration the technical and content dimensions that must be managed (note those bulleted under "Current issues and potential in soft access," above). For languages that lack a standard orthography and/or have high dialectal differences without one dialect being accepted as a standard variety, a free internet operation would probably not be well placed to engage the issues involved, so I imagine there could be a risk of corporate sponsors deciding to limit commitments in this area.

More on "access"

All that said, there's more to the topic of access. The physical/soft distinction is a useful but basic one. In African Languages in a Digital Age I discuss this matter in more depth (although with reference more to computers than mobile devices), including reference to one organization that disaggregated 12 dimensions of "real access."

A factor related to soft access is user skills (which may include basic literacy, computer literacy, experience with other technology), and there is some trade-off between the two, in that greater user skills require less attention to soft access, whereas lower user skills require greater attention to soft access by those providing physical access/internet access.

Soft access is not so much an issue for those of us speaking dominant languages, and who have a lot of computer experience. Another aspect of interaction with devices that normally wouldn't affect access - the quality of user experience - may get more attention in this case.

* The distribution of languages spoken in a society is always asymptotic. See earlier discussions on the long tail of languages.

Friday, October 02, 2015

Thoughts on linking L10n and ICT4D in Africa

It is striking that in Africa - multilingual as it is - discussions of uses of information and communication technology (ICT) for development (ICT4D) generally pay little attention to how one might optimally adapt the technology and the content to national linguistic realities. There are explanations for this, including for example:
  • An outward-looking focus on the internet as way to tap international markets and knowledge (contrasted with a relatively limited local market for internet content, although that is changing somewhat with increased access to smart mobile devices)
  • The roles of English / French / Portuguese as official languages, and higher status of these Europhone languages relative to African languages
  • Attitudes about language (the notion that African languages are not adapted to science or technology, the impression that there are too many African languages to deal with)
  • Issues relating to the written form of languages (in some cases lacking or not standardized, or standard orthographies may not be widely taught; computer systems from abroad may not have fonts for the alphabets used to write the languages)
However, explicit attention to potential uses of the first languages and local lingua francas in ICTs could address such issues and lead to various benefits: some in areas on which traditional ICT and ICT4D policies focus (such as enhancing user skills in ICTs; increasing relevant web content); and some in other areas important to development that fall outside the usual ICT discussions (language development, links with indigenous knowledge, new kinds of creativity).

Discussions of localization (L10n) and multilingual computing in the context of ICTs for national development could be built around several themes, such as:
  1. Dialogue between ICT policy and language policy processes. There is apparently little overlap between ICT policy and language policy in most countries of Africa, hence little institutional support for discussion of "localization policy" or, as it is called in some Asian countries, "local language computing policy." This structural divide can go all the way down to the level of computer technicians and linguists. Bringing together discussions of language policy and planning, and education policy as regards languages on the one hand, and ICT policy as concerns expansion of use of the technologies and their contribution to development on the other, could yield some new insights and foster collaborations among efforts to promote ICT4D and develop languages.
  2. Approaching localization as a way of adding value to the impact and potential of ICT for national development. Localization is not about replacing one language with another, of course, but rather adding diverse language capacities and content to computer systems and the way we use them. ICT is in many ways inherently additive or positive sum - adding one more option in terms of language interface or content need not take away from another, but rather adds to the benefits potential users can get out of ICT, and what they can use the technology to accomplish. In a multilingual context, promoting ways of enhancing language capacities of computer systems (localizing software, developing tools to make that possible) and increasing diverse language content (through localizing content and creating local content) are relatively inexpensive ways of giving ICT more dimensions, (soft) access points, and meaning - hence value.
  3. Linking with broadcast technologies. Broadcast media - the older forms of "ICTs" in effect - have been "localized" from the beginning, so the interface with technology that is not localized will be limited. Localization of web content about important topics in health and livelihoods, for example, could be used directly by community radio stations, without the need for locally (and potentially inaccurately) translating that information.
  4. Leveraging the benefits of internationalization of ICT and of language technology for Africa. A lot of work has been going into making computer systems and software more able to handle diverse languages and scripts: Unicode's accommodation of all writing systems; tools to develop language resources (from keyboards to corpora); open-source opportunities for localization; research on advanced applications (such as computer assisted translation), etc. Multilingual Africa stands to benefit a lot from these advances (and contribute in turn to them), but it has not, on official levels, been systematically seeking either to take advantage of the opportunities or to place itself in a position to participate more fully in the future (e.g., training of computer technicians in aspects of language technologies). There are some initiatives on NGO, academic, or project levels, but official support in most countries appears to be minimal. This is an area that ICT policy can address.

(Adapted from a message to the Afrik-IT list entitled "Re: Ghana and other open access network model and SAT3," 23 August 2007. This blog entry was originally posted on 30 September 2015 and inadvertently deleted.)