Samsung’s research center in Indonesia opens a series on the people and innovations behind the democratization of mobile AI
As Samsung continues to pioneer premium mobile AI experiences, we visited Samsung’s research centers around the world to find out how it’s done. Galaxy AI allows more users to maximize their potential. It now supports 16 languages, so more people can expand their language capabilities, even offline, thanks to on-device translation in features such as Live Translate, Interpreter, Note Assist, and Browsing Assist. But what does developing an AI language model involve? This series covers the challenges of working with mobile AI and how they were overcome. To start, we head to Indonesia to find out where you begin when teaching AI to speak a new language.
According to the Samsung Research and Development Institute Indonesia (SRIN) team, the first step is setting targets. “Great AI starts with good-quality, relevant data. Each language requires a different way of addressing this, so we delve deeper into understanding the language needs and unique circumstances of our country,” says Junaidullah Fazlil, head of AI at SRIN, whose team recently added support for Bahasa Indonesia (the Indonesian language) to Galaxy AI. “Local language development must be driven by data and science, so every process of adding languages to Galaxy AI starts with mapping out what information we need and can obtain legally and ethically.”
Galaxy AI features, like Live Translate, perform three main operations: automatic speech recognition (ASR), neural machine translation (NMT), and text-to-speech (TTS). Each process needs a different set of information.
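The three operations described above can be pictured as stages chained in sequence. The following is only a minimal sketch under stated assumptions, not Samsung’s implementation: the class name `SpeechTranslationPipeline` and the toy `fake_asr`, `fake_nmt`, and `fake_tts` functions are hypothetical stand-ins for real models, and the “audio” is simply encoded text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslationPipeline:
    """Hypothetical three-stage pipeline: ASR -> NMT -> TTS."""
    recognize: Callable[[bytes], str]    # ASR: audio in, source-language text out
    translate: Callable[[str], str]      # NMT: source text in, target text out
    synthesize: Callable[[str], bytes]   # TTS: target text in, audio out

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Toy stand-ins (assumptions for illustration only, not real models).
def fake_asr(audio: bytes) -> str:
    # Pretend the audio bytes are already a transcript.
    return audio.decode("utf-8")


def fake_nmt(text: str) -> str:
    # A one-entry lexicon in place of a trained translation model.
    lexicon = {"selamat pagi": "good morning"}
    return lexicon.get(text.lower(), text)


def fake_tts(text: str) -> bytes:
    # Pretend encoded text is synthesized speech.
    return text.encode("utf-8")


pipeline = SpeechTranslationPipeline(fake_asr, fake_nmt, fake_tts)
result = pipeline.run("Selamat pagi".encode("utf-8"))
```

Because each stage consumes a different kind of input (audio with transcripts, paired texts, voice recordings), each needs its own training data, which is the point the teams make below.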
For example, ASR requires extensive audio recordings made in many environments, each accompanied by an accurate text transcription. Varying levels of background noise help account for different environments. “It’s not enough to add noise to recordings,” explains Mochlesin Adi Saputra, ASR team lead. “In addition to the linguistic data we obtain from certified third-party partners, we must go out to coffee shops or work environments to record our voices. This allows us to authentically capture unique real-world sounds, such as people’s voices or keyboard noise.”
The changing nature of languages also has to be taken into account. Saputra adds: “We have to stay up to date with the latest slang and how it is used, and most of the time we find it on social media!”
Next, NMT requires translation training data. “Translating Bahasa Indonesia is a challenge,” says Mohammed Faisal, NMT team lead. “Its extensive use of contextual and implicit meanings relies on social and situational cues, so we need many translated texts that the AI can refer to for new words, foreign words, proper names, idioms, and any information that helps the AI understand the context and rules of communication.”
Text-to-speech then requires recordings that cover a variety of sounds and tones, with additional context about how parts of words are pronounced in different circumstances. “Good audio recordings can do half the work, covering all the phonemes (the units of sound in speech) that an AI model requires,” adds Haritz Abdul Rahman, TTS team lead. “If the voice actor did a great job in the previous phase, the focus shifts to refining the AI model so that it pronounces specific words clearly.”
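The idea of a recording script that covers every required phoneme can be illustrated with a simple set comparison. This is a hedged sketch, not the SRIN team’s tooling: the function `missing_phonemes`, the small phoneme set, and the word-to-phoneme mapping are all illustrative assumptions and not a complete inventory for Bahasa Indonesia.

```python
def missing_phonemes(required, script_pronunciations):
    """Return the required phonemes that no word in the script covers.

    required: iterable of phoneme labels the model needs to hear.
    script_pronunciations: dict mapping each scripted word to its phoneme list.
    """
    covered = set()
    for phonemes in script_pronunciations.values():
        covered.update(phonemes)
    return set(required) - covered


# Illustrative (incomplete) phoneme set and hand-written pronunciations.
required = {"a", "i", "u", "ng", "ny"}
script = {
    "nyanyi": ["ny", "a", "ny", "i"],
    "uang": ["u", "a", "ng"],
}

gaps = missing_phonemes(required, script)
gaps_short_script = missing_phonemes(required, {"uang": ["u", "a", "ng"]})
```

An empty result means the script covers the required set. A non-empty result lists the sounds that still need scripted words before recording.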
Stronger Together
Mapping out this much data takes a range of resources, and SRIN worked closely with linguistics experts. “This challenge requires creativity, ingenuity, and expertise in both Bahasa Indonesia and machine learning,” reflects Fazlil. “Samsung’s open collaboration philosophy played a big role in getting the job done, as did the scale of our operations and our history in AI development.”
By working with other Samsung research centers around the world, the SRIN team was able to quickly adopt best practices and overcome the complexities of setting data targets. Moreover, cooperation has contributed to advances not only in technology, but also in culture. When the SRIN team joined their counterparts in Bangalore, India, they observed local fasting customs, creating deeper connections and expanding their understanding of different cultures.
For the team, the Galaxy AI language expansion project took on a new meaning. “We are particularly proud of our achievements here, as this was our first AI project, and it will not be our last as we continue to refine our models and improve the quality of the results,” Fazlil concludes. “This expansion not only reflects our values of openness, but also respects and integrates our cultural identities through language.”
In the next episode of The Learning Curve, we’ll head to Samsung’s research and development institute in Jordan to talk with the team that led the Arabic language project for Galaxy AI. Follow along to learn the intricacies of creating and training an AI model for a language with many dialects.