Saturday, December 28, 2019

AfricaNLP2020 (Addis Ababa, 26-4-20) & related items

Quick post to call attention to an upcoming workshop on machine learning (ML) and natural language processing (NLP) in African languages, and its call for participation. Also a list of related initiatives, including the Machine Learning and Data Science in Africa (MLDS Africa) forum.

AfricaNLP2020 workshop - "Unlocking Local Languages"


The AfricaNLP2020 workshop will be held on 26 April 2020 as part of the Eighth International Conference on Learning Representations (ICLR) in Addis Ababa, Ethiopia. The workshop is described as follows:
"The rise in ML community efforts on the African continent has led to a growing interest in Natural Language Processing, particularly for African languages which are typically low resource languages. This interest is manifesting in the form of national, regional, continental and even global collaborative efforts to build corpora, as well as the application of the aggregated corpora to various NLP tasks."

The workshop aims are described as:
"• to showcase this work being done by the African NLP community and provide a platform to share this expertise with a global audience interested in NLP techniques for low resource languages;
• to provide a platform for the groups involved with the various projects to meet, interact, share and forge closer collaboration;
• to provide a platform for junior researchers to present papers, solutions, and begin interacting with the wider NLP community;
• to present an opportunity for more experienced researchers to further publicize their work and inspire younger researchers through keynotes and invited talks."

Submissions "for oral and poster presentations on a wide variety of NLP tasks for African languages ... will be evaluated and selected through a peer review process." Deadline: 1 February 2020. (They can be submitted via EasyChair.org.)

Corpora-building, ML, MT, & NLP initiatives


The workshop page lists six collaborative efforts on African languages, which I reproduce below as they appear on that page, along with a seventh I learned about recently:
  • Niger-Volta Project - Speech Recognition, Language Identification, Machine Translation & Natural Language Processing for West African Languages 
  • Masakhane.io - A Focus on Machine Translation for African Languages
  • Cocohub.cc - A crowdsourced dataset builder and community for NLP in underrepresented languages (apparently translating MS-COCO captions into Afrikaans, Amharic, Bukusu, Coptic, Fanti, Luganda, Luo, Masai, Meru, and Nandi)
  • Umva.ai - An initiative to build a Natural Language Processing Platform for Kinyarwanda and to make it available to all developers and for all use cases
  • EthioNLP - Ethiopian Natural Language Processing Research
  • AI4D - African Language Dataset Challenge - A community effort to help uncover and create African Language Datasets for improved representation in the field of NLP (see also an update on its "Dataset Challenge" from 23 Dec. 2019)
  • PidginUNMT - Unsupervised Neural Machine Translation from West African Pidgin to English (this was written up on Techcabal on 16 Dec. 2019)
It's great to see this kind of activity related to language technology. I've often thought that multilingual Africa has the potential to lead and innovate in this area.

MLDS Africa


MLDS Africa is an online network with a Google Group for communication among research groups such as those above, and a webpage with info on upcoming conferences and workshops, like AfricaNLP2020.

ICLR


The image above, connected with the ICLR conference hosting AfricaNLP2020, came from a page on SyncedReview.com with details on papers accepted for the main conference as of 20 Dec. 2019. (The workshops on the first day of ICLR, such as AfricaNLP2020, evidently have their own deadlines.)
