Using AI to preserve the Choctaw language


Thousands of languages are on the brink of extinction. In fact, of the 7,000 languages spoken in the world today, nearly half are likely to disappear during this century, according to UNESCO. For Lina Brixey, a member of the Choctaw Nation of Oklahoma, it’s personal.

A linguistics graduate and polyglot who speaks French, Spanish and Portuguese, Brixey only started learning Choctaw after moving to Los Angeles in 2016 to pursue her doctorate in computer science at USC Viterbi. “I always came back to a question,” Brixey said. “Why don’t I speak my own language?”

She is not alone. Like hundreds of Native American languages, Choctaw is endangered, which means that without intervention it is likely to become extinct in the near future. Although the Choctaw are the third-largest tribe in the United States, recent estimates suggest that there are only 7,000 Choctaw speakers left. Crucially, as Brixey discovered, when a language dies, we lose more than words: we lose unique cultures, traditions and worldviews.

“Growing up, I had my tribal registration card, which is kind of a pedigree, but I never really felt Choctaw until I could speak the language to a degree,” Brixey said. “When I started learning the language and meeting other Choctaw, I realized the urgency of the situation.”

So she decided to do something. At USC’s Institute for Creative Technologies (ICT), Brixey created the world’s first Choctaw linguistic corpus (a collection of written and spoken texts essential to the study of languages), a bilingual chatbot, and a dialogue system for language documentation.

Survival mindset
Brixey is Choctaw on her father’s side. (Her mother is of Irish descent.) Her great-grandfather, Oklahoma pastor and farmer Noah Frazier, was the last person in her family to speak Choctaw fluently.

“From the 1800s on, there was social pressure in schools for Choctaw and Indigenous people not to speak their language,” Brixey said. “I think there was also a survival mentality. Maybe my great-grandparents thought it was more important for my grandmother to learn English so that she had access to more opportunities.”

Nonetheless, Brixey was curious about her ancestral language. When she was 12, her sister received a special gift from their grandmother: a Choctaw dictionary. Lina and her sister practiced secret conversations in Choctaw, but with no fluent speakers to teach them, their enthusiasm eventually waned.

“This is something that many of us who learn Choctaw and other Indigenous languages lack: we just don’t have access to fluent speakers,” Brixey said.

Living avatars
In the decades that followed, Brixey earned an undergraduate degree in journalism; studied abroad in Argentina, Brazil and Belgium; taught English in Spain and France; and earned a master’s degree in linguistics and computer science from the University of Texas, El Paso.

She found her place in natural language processing, a subfield of artificial intelligence that aims to enable computers to process and understand human language. After arriving at USC, Brixey put her language and computer skills to work developing a Choctaw linguistic corpus. Named Choco, it now comprises over 300,000 Choctaw words and phrases painstakingly collected by Brixey from written and spoken archival documents.

At the same time, Brixey worked on the USC Shoah Foundation’s Dimensions in Testimony, which allows visitors to have one-on-one conversations with “living avatars” of Holocaust survivors. She tinkered with the back end of the system, developed at ICT, and found that it worked a bit like a chatbot: when asked a question, the system scans a database of potential answers and selects the most appropriate one, simulating a real conversation.
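For readers curious about what “scanning a database of potential answers” can look like in practice, here is a minimal Python sketch of retrieval-based response selection. It is an illustration only, not the ICT system: the sample question-and-answer pairs, the function names, the word-overlap (cosine similarity) scoring and the 0.2 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical toy "database" of stored questions and recorded answers.
# These pairs are invented for illustration, not taken from any real system.
RESPONSES = [
    ("where were you born", "I was born in a small town."),
    ("what is your favorite story", "My favorite story is about a rabbit and a turtle."),
    ("how do you say hello", "Let me teach you a greeting."),
]

FALLBACK = "I'm sorry, I don't have an answer for that."


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_response(question, responses=RESPONSES, threshold=0.2):
    """Return the answer whose stored question best matches the input.

    If no stored question scores at least `threshold` (an assumed cutoff),
    return a fallback line instead of a poor match.
    """
    query = Counter(tokenize(question))
    best_score, best_answer = 0.0, FALLBACK
    for stored_question, answer in responses:
        score = cosine(query, Counter(tokenize(stored_question)))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else FALLBACK


if __name__ == "__main__":
    print(select_response("Where were you born?"))
```

A production system would likely use a more sophisticated statistical model to rank answers, but the basic loop is the same: score every stored answer against the question and play back the best match.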

This gave Brixey an idea: if people didn’t have access to fluent Choctaw speakers, could she just invent one?

Beautiful sky ahead
It turned out she could. Using the same back-end system, Brixey developed a chatbot called Masheli, Choctaw for “fair sky”. Working under the supervision of Professor David Traum of USC Viterbi, Brixey selected 17 stories to form the chatbot’s responses. The conversational chatbot can “speak” in English or Choctaw and read stories in both languages. Brixey hopes it will serve as a resource for school children and adults interested in learning Choctaw.

But practicing the language is only half of the preservation equation; the other half is documentation. Brixey therefore created a dialogue system that encourages speakers of endangered languages to converse and tell stories, creating audio recordings to support language research and revitalization. In 2019, she presented the system to the United Nations General Assembly for the International Year of Indigenous Languages.

Brixey is currently working on an automatic speech recognition system, much like the system used for Holocaust survivors. She also archives her corpus in museums based in Oklahoma for use by other researchers and language learners. Beyond that?

“The sky is the limit,” she said. “As this is the first and only corpus for Choctaw, I am delighted to have laid the groundwork to help other Choctaw researchers. I hope that one day we can have a living avatar system for indigenous peoples to preserve our languages and our histories. There are technical challenges to overcome, of course, but that is the goal.”

Seven generations
“It’s an Indigenous perspective to talk about seven generations: what can I do today that will positively impact people in seven generations? I guess that’s something that I embodied,” Brixey said.

She and members of the Los Angeles Choctaw Language Community Class together translated five children’s books. Her dream? To see films translated into Choctaw, and even Choctaw podcasts. Brixey, along with countless other speakers of endangered languages, is not ready to let her ancestral language fade into history, no matter how bumpy the road to renewal.

“It feels like a dismissal when someone says it’s not worth working on these languages,” Brixey said. “Yes, our languages are in danger. But the point is that our languages are not yet dead, they are very much alive. There is hope. If conservation efforts can bring wolves back from the brink, I think this is also true for languages.”

By Caitlin Dawson
