Tuesday, December 17, 2013

ICT4D and L10n programs: Shall e'er the twain meet?

Graduate programs in information and communication technologies for development (ICT4D) were the subject of recent postings on the Web2forDev list. There are also graduate programs in localization (L10n). Is there any overlap, any treatment of L10n approaches and technologies in ICT4D programs? Any discussion of work on ICT4D in L10n programs?

These are two emerging interdisciplinary fields that would seem to have a lot of potential links on the technology and communication sides, especially as concerns work in multilingual societies of Africa and much of the rest of the world.

In any event, here is a quick, selected, and certainly incomplete list of graduate programs in each:

ICT for Development
Localization
It would be interesting to know of other programs in either, and especially to know of any crossover or combined programs or offerings. Do any institutions have programs in both ICT4D and L10n, and if so, what happens there?

Addendum

If you take ICTs out of the equation, the larger, longer term issue of connections between applied linguistics and the study of languages on the one hand, and the field of development (study and practice) on the other becomes clear. This is a topic I have raised before and hope to come back to again.

The field of localization (L10n) is broad with what I'd characterize as two main areas of focus: content and communication; and software and interface. Localization is often treated together with translation, and from the outside sometimes looks like a sub-field of the latter. This is understandable as it almost always involves translation of terms and text. However localization also involves cultural considerations. Technically, in some contexts, localization might not involve what we usually consider translation at all, if it is simply a matter of adapting content and software for speakers/readers of the same language in different cultural areas.

This cultural dimension, if you will, of localization would potentially be another connecting point of localization and ICT4D, though hopefully without omitting the essential linguistic dimension. (17-12-13)

Thursday, December 12, 2013

The "eng" times for unified capital ŋ?

Perhaps the most widely used "extended Latin" character, the letter ŋ (pronounced "eng" or "engma"), has two different upper case forms that are not used interchangeably, but rather are used alternately by different groups of languages. One of these resembles an N but with a descending hook on the right leg ("N-form"), and the other resembles a larger version of the ŋ ("n-form"). The latter, in turn, has stylistic variations in which the right leg either descends below the line, or stays above it.
Forms of letter "eng"*

The current status and future of these dual capital forms, and how best to handle them technically for display, were the subject of a brief discussion last month (Nov. 2013) in the wake of a proposed change in the DejaVu fonts that was first brought up on the developers list for the latter.
In fact, this is a potential issue that has long been known, since different regions tend to use different forms, and different fonts have one or the other form. Consequently, there are many situations where one doesn't know what form the capital will take. A larger issue is whether there needs to be a new Unicode character for one of the uppercase forms - what would be called "disunification" of the existing capital letter.

Background

In linguistic terms, the letter ŋ stands for a "velar n," which is pronounced like the "ng" in the English word "king." If it were used in standard English spelling, you might come across something like "siŋiŋ a soŋ." It is used in the orthographies of a range of languages from Saami in northern Europe to a number of African languages (mainly in the west and central regions, but also Dinka in South Sudan and Karamojong in Uganda), to some Aboriginal languages in Australia. (It also figures in the International Phonetic Alphabet, which of course does not need an upper case.)

In many languages, the eng is distinguished from "ng," which is a prenasalized "g," pronounced "n-g," and in any event the eng is especially useful at the beginning of words.** In the Fula language, for instance, the difference between "ng" and "ŋ" at the beginning of a number of words is meaningful. The root ŋor- at least in Maasinankoore has to do with a riverbank, while ngor- is derived from the root for male, wor-. A hippopotamus might be referred to as ngabbu, but the root ŋabb- has to do with climbing something or mounting a horse. Ngari means "came" or "arrived"; ŋari means "beauty." There are other such examples.

Personally it was in Togo that I first encountered use of the letter ŋ, in the Ewe name for Peace Corps and when learning some of the Ewe and Kabiye languages. Then later in Mali when learning Fula and Bambara. The capital letter was always in the "n-form" in those places and in all I ever saw in African languages.

Later I found that a reason for that consistency of usage probably had to do with efforts to standardize letter forms, notably with the African Reference Alphabet proposed 35 years ago in Niamey by the Meeting of Experts on the Transcription and Harmonization of African Languages. (The glyph used in the pre-Unicode African special character standard ISO 6438 [1983] varies for some reason, with an earlier version having the n-form, and later versions from the 1990s showing the N-form.)
Rotated G

One aspect of the graphical history of the letter ŋ is worth noting before moving on: Apparently early printers would sometimes rotate a capital G to produce this character. So in effect the so-called "n-form" capital ŋ actually also looks like it could be called "turned-G form" capital ŋ. (I've produced the one at the right for comparison purposes only.***)

What's the problem then?

The problem with the two main forms (or "glyphs") of the capital ŋ - "n-form" and "N-form" - boils down to not being sure which form you are going to get, since different fonts have one or the other form, and the alternative forms are preferred or required in different places for different languages. This is because the two main forms are treated as the same character in Unicode, with the same "code point" (which computer software uses to call up the appropriate symbol from the selected or default font).
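To make the single code point concrete, here is a minimal Python sketch using only the standard library. It shows that uppercasing ŋ yields exactly one character, U+014A, whatever shape a given font draws for it. The variable names are illustrative only.

```python
import unicodedata

# Lowercase eng as used throughout this post.
lower_eng = "\u014b"

# Case conversion maps it to a single capital code point; the choice
# between "n-form" and "N-form" is made by the font, not by the text.
upper_eng = lower_eng.upper()

for ch in (upper_eng, lower_eng):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Nothing in the text itself records which of the two capital shapes the writer intended, which is why the result depends on the font in use.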

These are not new issues, but they are now getting more attention (which may actually be a good sign, to the extent that it means more is going into print in the languages concerned).

From where we are now, there appear to be two options:

  1. Continue as is, but develop means for locales or language preferences to select the appropriate form ("glyph") of upper case ŋ from fonts that have the desired glyph. However, the technical feasibility is apparently an issue.
  2. "Disunify" the capital ŋ into two characters, with one of the major forms being given a new Unicode code point. This would be disruptive, and extremely so if it also required a new code point for a paired lowercase ŋ (with the exact same appearance as the one used throughout this posting) - all kinds of existing digital texts, fonts, and software would have to adjust for the change in some significant set of languages.
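As a rough illustration of what the first option implies, the sketch below maps a language tag to a preferred capital form. Everything in it is hypothetical: the table, the function name, and the tag handling are invented for illustration, the n-form entries follow the usage described in this post, and the N-form entry for Saami is an assumption drawn from the discussion here rather than from any standard data set. It says nothing about how a font or rendering system would actually act on the preference.

```python
# Hypothetical preference table keyed by ISO 639 language code.
PREFERRED_CAPITAL_ENG_FORM = {
    "ee": "n-form",   # Ewe
    "kbp": "n-form",  # Kabiye
    "ff": "n-form",   # Fula
    "bm": "n-form",   # Bambara
    "se": "N-form",   # Northern Saami (assumed preference)
}


def preferred_form(language_tag, default=None):
    """Return the preferred capital-eng form for a tag such as 'ff-ML'.

    Only the primary language subtag is consulted; region and script
    subtags are ignored in this simplified sketch.
    """
    primary = language_tag.replace("_", "-").split("-")[0].lower()
    return PREFERRED_CAPITAL_ENG_FORM.get(primary, default)


print(preferred_form("ff-ML"))
print(preferred_form("se"))
print(preferred_form("en"))
```

A lookup like this is the easy part. The harder part, as noted in the first option, is having fonts that carry both glyphs and software that passes the language information along to them.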

Unicode in principle calls for a separate code point for each character, so one question is this: with two very different forms/glyphs being historically used and preferred (with varying degrees of intensity) in different regions, how was the decision made to treat these as variants?

I'm actually looking over some past discussions to see how the issue and alternative approaches were treated. A 20+ message exchange on A12n-collaboration on 4-6 April 2002 among Peter Constable, John Hudson, Andrew Cunningham, and me dealt mainly with forms used in Africa and to a lesser degree Australia, with mention of Saami. (I am reconstituting the 2002-2004 archive of this list to post on A12n-archive.) However that treated all forms as variants.

Ultimately however the main question is the best way forward for all concerned. It is worth noting that Sjur Moshagen's otherwise well-framed proposal to disunify (at the end of the recent email discussions cited above) would put all the burden of change on Africa and anyone working with the numerous African languages which have the ŋ in their orthographies. Disunification the other way would similarly cost those using Saami and Australian Aboriginal languages - so it's a difficult set of choices.

A Niger exception?

A quick note about Denis Jacquerye's statement in the recent email discussions that in Niger, the N-form capital ŋ is more common - this despite the n-form being established in Niger's orthographies and in the "harmonized" orthographies used across the region. It would be of interest to see any examples, but one wonders if a limited choice of fonts might have been a major factor. A larger issue in terms of planning would be the cost of introducing or establishing such a variation ("dis-harmonization"?) in a wider regional usage, and how that might impact font development, software localization (Fula is a regional cross-border language; Zarma is part of the cross-border Sonrai cluster, for which localization is being done), etc. This would be even more problematic if Unicode were to decide to "disunify" the character.


* Source of illustration: Wikimedia Commons
** In the orthographies of many East African languages, such as that for standard Swahili, an apostrophe after ng is used to indicate this difference: ng' = ŋ.
*** "Turned-g" is actually a character used to transliterate text in the Georgian language script.

Tuesday, December 10, 2013

Learning Bambara in Paris

I have just discovered two initiatives for teaching the Bambara language (Bamanankan) in Paris:

  • Donniyakadi, an association created in France in 2008, whose aim is "the development of cultural exchanges between France and Mali, particularly in the area of Malian national languages." They have been offering evening classes in Bambara for three years now. (Dɔnniya ka di in Bambara means "knowledge is good, agreeable, pleasant.")
  • INALCO, the Institut National des Langues et Civilisations Orientales, whose programs in Manding (which includes Bambara) are noted by the blog Apprendre le bambara. INALCO (Langues O') has of course existed for a long time. It teaches more than a hundred languages and awards various diplomas.
What interests me is the extent of use and teaching of African languages in France, especially where initiatives undertaken by private associations or NGOs such as Donniyakadi are concerned. Also of interest are similar initiatives for teaching African languages (Bambara / Manding, or others) that exist elsewhere in the world outside Africa. Another interesting question is how African communities in places like Paris organize and/or associate with programs for teaching and learning African languages - whether through associations or through institutions such as INALCO.

INALCO's website shows that it offers the following African languages (in addition to Manding): Amharic, Maghrebi Arabic, Berber, Comorian, Hausa, Fula (Peul), Soninke, Swahili, Tigrinya, Wolof, and Yoruba.

Note that the sites (blogs) of Apprendre le Bambara and Donniyakadi provide various links that would be useful to those who want to learn Bambara.

Sunday, December 08, 2013

Quick notes: ANLoc offline & Yahoogroups RSS problem

The African Network for Localisation (ANLoc) website is still down. Although I am not personally involved in its management, hopefully there will be some update to share soon. As a result, the site's RSS feed (and that for the subsidiary PanAfriL10n wiki), as seen in the right-hand column/sidebar, is not working.

Also still haven't sorted out how to revive the RSS feeds for Yahoogroups also featured in the sidebar. Hope to resolve or reformat soon. (Per discussions online, this apparently relates to a group setting changed when Yahoo updated the groups features.)

The presentation of sites and groups on Beyond Niamey was discussed more fully in a recent post.

I've also started building some short pages on this blog site, which are linked via tabs below the header.

Friday, December 06, 2013

Nelson Mandela and African languages

If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart. (Nelson Mandela)

As South Africa, and indeed the world, mourns the passing of Nelson Mandela, a remarkable leader whom the New York Times called the "Peaceful Liberator of a Torn South Africa," here is a quick look at his legacy as concerns languages in Africa. Actually a very quick look, as I don't personally know much on the topic and find relatively little online, other than the well-known quote above and some details below (so hopefully more information can be filled in via comments or a future post).

Mandela's first language was Xhosa, an Nguni language very close to Zulu, which he apparently also spoke. He learned English in school, and it was in school where his biography says he met his first non-Xhosa friend, a Sesotho speaker. Later he learned Afrikaans while a prisoner on Robben Island. It seems Mandela was as much a product of a multilingual society as he was of a multiracial and multiethnic one.

The process* leading to the inclusion in the new constitution of all 11 of South Africa's main languages as "official" happened during the country's transition to majority rule. While I don't find any discussion of Mandela's direct involvement in that process, a fellow former prisoner at Robben Island whom he knew well - Neville Alexander - was prominent in it. I'm not sure how South Africans would see it, but from afar, it seems that the officially multilingual policy of the country is part of Mandela's legacy.

Other thoughts...

A couple of other thoughts not related specifically to language upon reflecting about aspects of Nelson Mandela's legacy. First, the NY Times obit (referenced above) has this passage and quote in the context of discussing leadership style:
In his autobiography, Mr. Mandela recalled eavesdropping on the endless consensus-seeking deliberations of the tribal council and noticing that the chief worked "like a shepherd."
"He stays behind the flock," he continued, "letting the most nimble go out ahead, whereupon the others follow, not realizing that all along they are being directed from behind."
Reading the analogy, I am reminded on a literal level of a dismissive comment about Fulani pastoralists I heard from some American development experts while in Mali in the 1980s - to the effect that the herders spent their lives looking at the rear ends of cattle. There are a lot of problems with a statement like that of course (misunderstanding of pastoralism and herd behavior, attitudes regarding development, etc.), but taking it all back to the level of analogy and leadership, it has me thinking about how outsiders (primarily Westerners) conceive of or misunderstand community leadership and decision-making processes, not only in Africa but also in Asia.

Second, I came across an article about how, in April 1994, the then new South African president Mandela surprised outgoing president de Klerk's official staff, who did not know what to expect when he arrived. The article continues, "he drifted to one end of the room and started shaking hands with every single person present. ... Many a staffer who never had the opportunity to speak to a president was dumbfounded by the personal attention they received from the living legend." Yet this seems, from my limited experience elsewhere on the continent, to be considered good form when joining a meeting** - although not expected of a high status person. But as the article implies, this was an example of Mandela's style of leadership.

One wonders if, in the brief conversations Mandela had with the staffers that day, there were any exchanges in the diverse languages of the country...


* The following paper has a lot of this history: Beukes, Anne-Marie, "Language policy implementation in South Africa: How Kempton Park's great expectations are dashed in Tshwane," Stellenbosch Papers in Linguistics, Vol. 38, 2008, pp. 1-26.
** It's a habit I personally got comfortable with after several years in West Africa. I remember in one gathering I joined shortly after returning to the US, realizing folks were dumbfounded as to why I, a living stranger, was shaking each person's hand.

Thursday, December 05, 2013

Ethnologue and "national languages" in Africa

In this post I'll discuss the second of two aspects of Ethnologue's presentation that seem to me to detract from its overall quality. The previous one was on cross-border languages being titled simply as "a language of" a single country. This one deals with how the new, 17th edition of Ethnologue* uses the term "national language" in presenting summary information about countries.

The Country information/summary pages in the current version of Ethnologue appear as the first tab in a set of pages for each country. These Country information pages feature a table that in the third row indicates "National languages." This replaces "National or official language(s)" in the 16th edition** (there is also a significant redesign of the presentation). This seeming simplification is actually problematic in the case of many African countries, which use the term national language in a way different from that in the current Ethnologue.

For example, if one goes to Ethnologue's current page on Niger, one sees a single language - French - listed as the national language (compare the page in the previous edition) - the same as for France. However in Niger and by Nigeriens, French is not called the national language, but rather the official language. "National language" (langue nationale in French) is a legal and widely understood category for the endogenous languages, that is separate and distinguished from "official language." Same with perhaps 20 other countries in Africa. The choice of terms by these countries was (is) deliberate and meaningful, but was it taken into account when revising Ethnologue's use of terms?

Although one appreciates the challenges of finding terms that work in a reference that seeks to cover all languages and all countries, this particular choice of term on a summary page does not seem at all fortunate from the point of view of information on Africa.

Official, national, and local languages: An example

When I was on the Peace Corps staff in Niger in the early 2000s, a somewhat similar question arose. The then new regional training officer for West Africa - an American - referred to the consideration of language training approaches in terms of a choice of emphasis on "national language or local languages." However, "national language" in most of the countries we were talking about actually means what in Peace Corps is often referred to as "local language." The real distinction for our use, I suggested, was actually between "official language" and "local languages" (although those terms also have some shortcomings).

The issue was communication, not just semantics or formality of usage. If we started using the term "national language" in a way different from our host country colleagues and counterparts, it could create unnecessary confusion (aside from appearing to ignore what is effectively a common regional usage). And by the same token, since many American staff would tend to hear "national language" as "nationwide" language, it would not make sense to oblige everyone to conform to host countries' use of the term. Better simply to avoid the term "national language" in our planning in favor of alternatives that were known and clear to all.

The choice in this case was fairly straightforward, but it may be worth keeping in mind when considering the more complicated set of terminology choices facing Ethnologue.

"National language" in Ethnologue

Ethnologue defines its current use of the term "national languages" on the country information pages in this way:
"Languages which have been categorized as national languages at EGIDS [Expanded Graded Intergenerational Disruption Scale] level 1 are listed here. This includes all the languages that are actually used for education, work, mass media, and government at the nationwide level, regardless of how they are classified in legislation."
Ethnologue's use of the term national languages on the initial country information page therefore comes out of a broader system for classifying languages and helping to understand their status and condition. This is important work and the schema they have developed is of great value. The issue here however is terminology.

In that system, references to "official language" have effectively been eliminated. Editor Paul Lewis, in discussing the terminology changes in his Ethnoblog posting, "Functions of Languages in Countries" (31 October 2013), gave the example of the Turkish language, which is listed as a "statutory national language," even though in the Turkish constitution reference cited (article 3) it is actually called the official language.

However this schema also relies on various modifications of "national language" (and terms relating to nation and nationality):
  • Under EGIDS, "national" (meaning effectively "nationwide" but overlapping a lot with common and language policy use of "official language")
  • Under "Official recognition categories and definitions," which is how Ethnologue now deals with official status, several descriptors:
    • Statutory national language
    • Statutory national working language
    • Statutory language of national identity
    • De facto national language
    • De facto national working language
    • De facto language of national identity
    • Language of recognized nationality
As it stands, it does not seem that Ethnologue has a way of describing the language situation in many African countries that does not collide with established local and indeed regional usage. The official language is called "national" and the national languages seem to fall mainly under the somewhat sterile rubric "recognized."

Nation, national, nationality ...

A lot revolves around the meanings assigned to the term "nation" and its derivatives, and how they are understood. These can on the one hand refer to a country or nation-state (nationwide, or in/by all of the country), which is how Ethnologue appears to use them. On the other hand, they may also have more visceral and identity-related meanings, which is how I understand the main African use of the term. Ethnologue's categories relating to languages of national identity are also along the latter lines.

Part of the problem is that Ethnologue uses "national" across this range of meanings, even if primarily in the "nationwide" sense. And in the case of many countries of Africa it has in effect switched the potentially more identity-related term "national" with the more formal term "official" for some languages, and the more formal term "recognized" for "national" in the case of other languages.

To a certain extent, one can in academic and reference publications use a term in a particular way by defining it clearly, as Ethnologue attempts with the above. However, when one's chosen usage is at variance with an existing legal and common use of the term, and the term itself has many applications and colors of meaning, it is often worth looking for alternatives.

Distinct meanings of "national language"

The term "national language" actually turns out to have more than just the two uses discussed above (per "nationwide" and legal status as "national language"). Conrad Brann, who has taught and researched on language policy and multilingual societies for decades in Nigeria, suggests that in Africa there are actually "four quite distinctive meanings" of the term*** (which I've numbered for ease of reference):
  1. "Territorial language" (chthonolect or chtonolect) of a particular people
  2. "Regional language" (choralect)
  3. "Language-in-common or community language" (demolect) used throughout a country
  4. "Central language" (politolect) used by government and perhaps having a symbolic value.
The African usage that I highlight above probably falls mainly under #1 or, depending on the country, under nos. 2 (DRC, Ethiopia) and/or 3 (see below). In a few African countries it corresponds with #4 and with Ethnologue's usage (Lesotho, Burundi).

Ethnologue's definition of "national language" seems to overlap #4 and in some cases #3 in Brann's typology. However, in some countries more than others, the official languages Ethnologue lists as national languages may be less "languages-in-common" (#3) than some languages that those countries call national: The case of Wolof in Senegal, used widely as a first or second language, comes to mind.

Looked at from this perspective, "national language" seems to be a conglomerate of concepts, one that requires clarification of the intended meaning in each context. As such, it seems ill-suited for use in a quick reference page of any sort. Add to that the fact that many African countries use the term in a particular way, and it would seem less than fortunate for Ethnologue to choose it as a simple category for all countries.

Another look at Ethnologue's Country information pages

Another problem with Ethnologue's current approach to listing "National languages" is that for countries like Niger or Senegal, the top of the language summary gives the impression that French is the only language of importance. On the Senegal page, Wolof, for instance, does not appear anywhere. Nor do any of the other main languages of the country. On the other hand, there is a list of "immigrant languages" by name. As such the country information page - the first place a user would come in this reference to find information on languages in a country - seems very selective in the information it highlights.

A related question: Paul Lewis's Ethnoblog posting referred to above refers to continued use of the category "National or Official Languages" on the Country information tab, under which one would find all languages that they "have identified as national languages, or national working languages, or national languages of identity whether statutory or de facto." Is it possible that the issues discussed in my posting here may have arisen from a recent change within Ethnologue 17th edition's presentation of information on the Country information pages?

Alternatives

For a reference like Ethnologue that aims to cover all countries and all languages, the fact that a term means different things to different people would by itself seem to make it a problematic choice for an information heading. One simple way to resolve this would be to return to the use of the former rubric - "National or Official Languages" - or something similarly broad, which allows naming of principal languages on the Country information pages.

Considering the issues I've raised with "national language" as well as the range of uses of the term highlighted by Prof. Brann and indeed by Ethnologue's own schema, it may also be worth considering whether and how to avoid using "national language" altogether in the schema and the titling, just as Ethnologue now omits "official language."


* Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.). 2013. Ethnologue: Languages of the World, 17th ed. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com
** Lewis, M. Paul (ed.), 2009. Ethnologue: Languages of the World, 16th ed. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com/16
*** Brann, C.M.B. 1994. "The National Language Question: Concepts and Terminology." Logos [University of Namibia, Windhoek] Vol 14: 125–134.