Using Open Source Tools to Create a Mobile Optimized, Crowdsourcing Translation Tool

By Kira Painchaud, Molly Sherman, Jared Fair, Nicole Josephson, and Dee Ann Huihui

Link to article: http://journal.code4lib.org/articles/9496

Article synopsis and core research question(s):
The central problem Oregon State University Libraries and Press (OSULP) addresses in this article is the need for a cost-effective way to publish books in local languages and dialects through crowdsourcing. OSULP intends the tool primarily for children ages two to five, though older children can benefit as well. OSULP found that children in the target regions are taught to read relatively late and that in some places, such as Busia, they are taught in Swahili until about third grade and in English from then on. With so many dialects in use, it is difficult to find a single tool that makes books available in all of them.

This article discusses how to find or create a tool that lets users, initially in Africa, vote on the correct dialect translation of children’s books so that the books can be made available. Building such a tool raises many considerations: how simple and usable the interface is, whether it can be used offline, what the screen limitations are, how quickly users can vote, whether users can edit a book’s pages to add a new translation, and how the system combines the highest-voted sections into a completely translated text. OSULP developed the tool in GitLab so that the early work would be open only to their staff, with the hope of later releasing it through GitHub for others to tailor to their own needs.
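
The last consideration above, combining the highest-voted sections into one translated text, can be sketched in a few lines. This is only an illustration under our own assumptions, not the OSULP implementation: the `Translation` record, its field names, the dialect labels, and the tie-breaking rule (the first candidate seen wins) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    page: int      # page number within the book
    dialect: str   # hypothetical dialect label
    text: str      # proposed translation of that page
    votes: int     # number of user votes received


def assemble_book(translations, dialect):
    """Return the top-voted text for each page in one dialect, in page order."""
    best = {}
    for t in translations:
        if t.dialect != dialect:
            continue
        current = best.get(t.page)
        # On a tie, keep the candidate seen first (an assumed rule).
        if current is None or t.votes > current.votes:
            best[t.page] = t
    return [best[page].text for page in sorted(best)]


# Made-up sample data for illustration only.
candidates = [
    Translation(1, "dialect-a", "Page one, version A", 3),
    Translation(1, "dialect-a", "Page one, version B", 7),
    Translation(2, "dialect-a", "Page two, version A", 5),
    Translation(2, "dialect-b", "Page two, other dialect", 9),
]
book = assemble_book(candidates, "dialect-a")
```

Here `book` holds one text per page for the requested dialect. A page with no candidate in that dialect is simply left out, which a real tool would probably want to flag.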

Methods used to answer research questions:
To create a tool that supports multilanguage translation and promotes literacy for children in Africa, the authors used the following methods and practices. First, a network of libraries called Maria’s Libraries came together with OSULP to discuss which technologies to consider for a resource-poor environment. They decided that testing would take place through a website accessed on mobile devices, built with the open source libraries Wink Toolkit and Globalize3. Translations were stored as multiple database entries in a Ruby on Rails application.
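
To make the idea of storing translations as multiple database entries more concrete, here is a minimal sketch in plain Python. It does not use or reproduce the Globalize3 API. The row layout, the `page_text` helper, the placeholder texts, and the fallback to English are all our own assumptions.

```python
# Hypothetical rows: one entry per (page, locale) pair, mirroring the idea of
# keeping each translation as its own database record.
rows = [
    {"page_id": 1, "locale": "en", "text": "English text of page 1"},
    {"page_id": 1, "locale": "sw", "text": "Swahili text of page 1"},
    {"page_id": 2, "locale": "en", "text": "English text of page 2"},
]


def page_text(rows, page_id, locale, fallback="en"):
    """Look up a page in the requested locale, else in the fallback locale."""
    by_locale = {r["locale"]: r["text"] for r in rows if r["page_id"] == page_id}
    # Falling back to English is an assumption made for this sketch.
    return by_locale.get(locale, by_locale.get(fallback))


swahili_page = page_text(rows, 1, "sw")
untranslated_page = page_text(rows, 2, "sw")
```

Because each translation is a separate entry, adding a new dialect means adding rows rather than changing the structure of the book itself.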

The goal of the crowdsourcing platform was to provide a gateway tool that enables users to translate folk tales and existing children’s books into their own languages and dialects. It was implemented with a simple interface offering a choice of languages. Username-and-password accounts, icon symbols, and a voting process were part of the simple user setup. To support the most accurate translations for individual languages and dialects, users were allowed to identify new dialects.
The following items were also included in the method design:
· Tablet computers for easy interactivity.
· A simple interface utilizing language choices.
· A display carousel of books marked for translation.
· Easy navigation between book pages.
· A user voting process for each translation.
· A user editing process with original translation comparisons.
As part of the design, a voter incentive process based on comparing translations was set as the default. Users see all of the translations that have been made at the same time and can choose the most accurate one.

Findings:
Because the article assigned to us, Using Open Source Tools to Create a Mobile Optimized, Crowdsourced Translation Tool, was not written in a standard format for academic literature and captured only the first phase of an ongoing project, its findings were less formal and more linear than might be expected.
That said, the project found three areas that presented significant problems and needed to be addressed: administration, additional crowdsourcing, and offline functionality.
The administration problems were twofold: 1) participants were unable to use the interface to upload stories that contained both images and text, and 2) administrators had no ability to override participants’ decisions, which left the tool open to “poisoned” or unreliable records.
Additionally, without further crowdsourcing, many users were unable to recommend new languages and dialects for stories or books to be translated into, which left some of the work inaccessible.
Finally, the authors suggested that without improved offline capabilities, the value of the stories and of participants’ work would be greatly diminished.

Conclusion:
This project began as a broad attempt to improve literacy in Africa. The authors discovered that their tool and research were relevant beyond this intent. Indeed, the tool held exciting potential to do much more: “We are very excited about the possibilities of the usefulness of the platform as a way of publishing books in lesser-known languages or in regions where dialectic publishing is cost prohibitive.”

Unanswered questions you have and what future research might address:
After reading this article, we are left with a few questions. There is the potential to build a database of sub-dialects with this technology. Will the text be available for export after local translation has taken place? If enough members of a community participate in the translation process and if the texts could be exported into a central database, the potential for published works in each sub-dialect increases.

Will the translations move across the geography in a wave, so each new area has the closest possible translation available from neighboring villages? Or will each linguistic geographical area begin with the same base text? The linguistic preferences of each user will be remembered and set as the default when they log in, but once a book is translated into a specific dialect, will the rest of the tablets in the same geographic area automatically set their defaults to that dialect? Can a mother and child share default settings, while maintaining their own accounts?

What are the consequences of exposing sub-dialect communities to the written text of other areas? Could this tool be used to encourage the building of a shared national language? Are there possible negative consequences of this action, such as the diminishment of local dialects? Are there cultural and ethical questions that must be asked before proceeding? Future research could address the cultural shifts that will take place as a result of introducing written texts for early readers in local dialects to areas that have not previously had them.

A thoughtful attempt to answer your own questions:
The authors’ stated purpose of fostering early literacy in order to prepare children for school success raises the question of how this tool will actually support students entering classes that are not taught in their local language. One possible solution to investigate is designing the product as a bilingual tool that displays the stories with text in multiple languages simultaneously, for example in both the learner’s local language and the official language. In addition, the anticipated use of this product in language studies programs leads to the question of how it may be used to facilitate aural learning. One enhancement to consider here is a feature that allows contributors to upload audio recordings of translations along with text. Finally, considering the product’s potential as a tool to increase literacy, aid in language learning, and enable wider access to multicultural children’s literature, additional research and development is justifiable. Few things are free, however, and one must ask how future R&D is to be funded. Using the tool itself as an example, perhaps the answer lies in crowdsourced funding.
