While the goal is to make the model as proficient as possible in all languages, attaining equality across the board could also be challenging.
“That is the goal, but how successful we can be with languages with scarce digital resources is the question,” Hajič said. “But that’s also why we want to have true benchmarks for these languages, and not to be swayed toward benchmarks which are perhaps not representative of the languages and the culture behind them.“