RE: LeoThread 2025-11-05 15-48

in LeoFinance, 21 days ago

Part 7/15:

Data scarcity remains a bottleneck. Only a fraction of human knowledge is available online; much of it sits in private datasets, books, proprietary content, and non-digitized media. This creates a gap: to reach true Artificial General Intelligence (AGI), models may need access to data beyond the public internet, including academic journals, books, and proprietary datasets.

Why this matters:

  • Scaling laws currently indicate that more data yields better model performance, but only if the data is relevant and high-quality.

  • OpenAI's recent projects, like the Whisper speech-to-text model, point to a strategy of converting diverse media types into text data, aiming for richer multimodal capabilities.

Modality: Text, Images, and Audio