RE: LeoThread 2025-11-09 22-46

in LeoFinance, 20 days ago

Part 6/11:

Arya has been tested across numerous benchmarks, where it consistently demonstrates competitive performance. It has outperformed other open-source models such as Pixtral 12B and Llama 3.2 11B. More notably, it has held its ground against proprietary giants such as GPT-4 and Claude 3.5:

  • In document question-answering tasks, Arya scored 92.6%, surpassing many larger models.

  • It scored 66.8% on long video understanding benchmarks and 72.1% across related video tasks.

  • Its long context window of 64,000 tokens lets it analyze lengthy documents or videos in a single pass, a significant edge over models with smaller context windows.

The Road to Development