Description

The Ganda language, or Luganda, is a Bantu language spoken in the African Great Lakes region. It is one of the major languages of Uganda, spoken by more than eight million Baganda and other people, principally in central Uganda, including the capital, Kampala. It belongs to the Bantu branch of the Niger-Congo language family. Typologically, it is a highly agglutinating, tonal language with subject-verb-object word order and nominative-accusative morphosyntactic alignment [1].

Language profile: Luganda

Building a database for Luganda language in Africa

With about six million first-language speakers in the Buganda region and a million others fluent elsewhere, it is the most widely spoken Ugandan language. As a second language, it follows English and precedes Swahili [1].

The language is used in all domains: education, media, telecommunication, trade, entertainment and religious centres [2]. It has been used at lower levels of schooling as pupils begin to learn English, and until the 1960s Luganda was also the official language of instruction in primary schools in Eastern Uganda [1]. It is also among the languages that have been tabled in the East African parliament for selection as the official language of the East African Community [2].

Existing Work

As use of the Luganda language grows rapidly across sectors, from formal to informal, work has been done on building a Luganda corpus and developing NLP models. Examples include a Luganda text-to-speech machine [3], an English noun phrase to Luganda translator [4], and a smart Luganda language translator that automatically translates English source text into Luganda [5]. To broaden access to search, a Luganda interface was launched for Google web search [6]. However, some of these applications have been developed based on minimal data.

In terms of language resources, there is the Luganda Bible [7], which is an online Bible from the Word Project [13], and other religious books from the Jehovah’s Witnesses [12]. There are also some good online Luganda dictionaries, such as the Glosbe dictionary [10], Learn Luganda [9], the Luganda phrasebook [9], the Learn Luganda concise dictionary [11] and the Luganda Dictionary [8]. However, most of the available dictionaries are copyrighted and contain only a few word extracts, which gives a small representation of a language like Luganda, where new words are created and spoken every year.

Quite recently, a team of researchers from the Makerere AI Lab [15] began a drive to add Luganda to the Common Voice platform [14]. It is anticipated that this project will generate a large voice dataset for building voice recognition models for Luganda.

Example of a Sentence in Luganda

Luganda: Aboomukyalo bafuna nnyo mu by’obulimi.

English: The people in rural areas benefit a lot from agriculture.

Conclusion

With the regional integration of the East African Community in place, the use of the Luganda language has spread from Uganda into the wider East African Community, because most of its native speakers are actively participating in this cooperation. It has been used to support inter-ethnic communication [2]. Its use also extends beyond East Africa.

Therefore, there is a need to build a robust, publicly available Luganda dataset that researchers can use to build downstream applications such as machine translators, speech recognition systems, chatbots, virtual assistants and sentiment analysis models. Such applications would help ensure that information is accessible to all and address some of the local contextual problems within society.

Researcher Profile: Joyce Nakatumba-Nabende

Joyce Nakatumba-Nabende is a lecturer in the Department of Computer Science at Makerere University. She is also the head of the Makerere Artificial Intelligence and Data Science Lab in the College of Computing and Information Sciences. She obtained a PhD in Computer Science from Eindhoven University of Technology, The Netherlands. Her current research interests include Natural Language Processing, Machine Learning, Process Mining and Business Process Management. She is co-author of more than 20 papers published in peer-reviewed international journals and conferences. She has supervised several PhD and Master's students in the fields of Computer Science and Information Systems. She is a member of several international AI bodies, including the Open for Good Alliance, the Feministic AI Network and the UN Expert Group Recommendation 3C Group on Artificial Intelligence.

Researcher Profile: Andrew Katumba

Andrew Katumba is a Lecturer in the Department of Electrical and Computer Engineering as well as a senior researcher with netLabs!UG, a research Center of Excellence in Telecommunications and Networking, both in the College of Engineering, Design, Art & Technology (CEDAT), Makerere University. Andrew champions the research and applied Artificial Intelligence (AI) activities at netLabs!UG as the lead for the Marconi Society Machine Learning Lab. Andrew holds a PhD in Photonics and Machine Learning from Ghent University, Belgium. He has co-authored 50+ publications in peer-reviewed international journals and conferences and holds 2 patents in neuromorphic computing.

Researcher Profile: Jonathan Mukiibi

Jonathan Mukiibi is a computer science practitioner with a background in software engineering, linguistics, machine learning, big data and natural language processing. Over the past years he has been involved in artificial intelligence based projects such as satellite image analysis, radio mining, social media mining, ambulance tracking and traffic, which have been successfully implemented to solve real-world problems in developing communities. Currently, he is pursuing a Master's in Computer Science at Makerere University, where he is also doing research work at the AI and Data Science Research Lab. He is actively working on different NLP tasks, mainly researching end-to-end topic classification models for crop pest and disease surveillance from radio recordings.

Researcher Profile: Claire Babirye

Claire Babirye is a computer science professional with vast experience across computing: from computer networks, computer security and network monitoring to machine learning, data science, natural language processing, deep learning technologies and the use of technology for improved service delivery. She is a Research Assistant at the AI and Data Science Research Lab, Makerere University, where her role is to tap into the revolution to obtain more and better data to support development and humanitarian work; to support data analytics and visualization to generate patterns and insights from the data; and to develop machine learning models for classification. Within the domain of NLP, she has worked on sentiment analysis of social media data and on text classification to identify topics of interest from farmer agricultural data.

References

  1. https://en.wikipedia.org/wiki/Luganda
  2. https://www.open.edu/openlearn/languages/more-languages/linguistics/english-squeezing-out-local-languages-uganda
  3. Nandutu, I., & Mwebaze, E. (2020). Luganda Text-to-Speech Machine. arXiv preprint arXiv:2005.05447.
  4. https://www.researchgate.net/publication/338036914_Model_for_Translation_of_English_Language_Noun_Phrases_to_Luganda
  5. https://www.researchgate.net/publication/323682143_Smart_Luganda_Language_Translator
  6. https://africa.googleblog.com/2009/07/how-volunteer-translators-impact-local.html
  7. https://play.google.com/store/apps/details?id=com.LugandaBible&hl=en
  8. https://web.archive.org/web/20080120211744/http://www.cbold.ddl.ish-lyon.cnrs.fr/CBOLD_Lexicons/Ganda.Snoxall1967/Text/Ganda.Snoxall1967.txt
  9. https://learn-luganda.com/
  10. https://glosbe.com/lg/en/
  11. https://learnluganda.com/concise
  12. https://wol.jw.org/lg/wol/h/r138/lp-lu
  13. https://www.wordproject.org/bibles/lug/index.htm
  14. https://commonvoice.mozilla.org/lg
  15. http://www.air.ug/natural-language-processing

Partners

Partners in Cracking the Language Barrier for a Multilingual Africa

Disclaimer

The designations employed and the presentation of material on these maps do not imply the expression of any opinion whatsoever on the part of the Secretariat of the United Nations concerning the legal status of any country, territory, city or any area or of its authorities, or concerning the delimitation of its frontiers or boundaries. Final boundary between the Republic of Sudan and the Republic of South Sudan has not yet been determined. Final status of the Abyei area is not yet determined.