From the Meta blog:
We just released our new Llama 3.2 models. Llama 3.2 is our first major vision model, meaning it understands both images and text. To add image support to Llama, we trained a set of adapter weights that integrate with the existing 8B and 70B parameter text-only models, creating 11B and 90B parameter models that understand images as well. And we continue to improve intelligence, especially reasoning, making these the most advanced models we've released to date.
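To make the adapter idea concrete, here is a minimal, hypothetical sketch of a Flamingo-style gated cross-attention block, the kind of layer typically used to splice image-encoder features into a frozen text decoder. The class name, dimensions, and gating scheme are illustrative assumptions, not Meta's published implementation.

```python
import torch
import torch.nn as nn

class VisionCrossAttentionAdapter(nn.Module):
    """Illustrative sketch (not Meta's actual code): text tokens attend
    to image features through a gated cross-attention residual branch,
    leaving the frozen text pathway untouched when the gate is zero."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Gate initialized to zero: at the start of adapter training the
        # model behaves exactly like the original text-only model.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, text_hidden: torch.Tensor, image_features: torch.Tensor) -> torch.Tensor:
        # Query: text hidden states; key/value: image patch features.
        attended, _ = self.cross_attn(self.norm(text_hidden), image_features, image_features)
        return text_hidden + torch.tanh(self.gate) * attended

# Toy usage: batch of 2, 16 text tokens, 64 image patch features.
adapter = VisionCrossAttentionAdapter(d_model=512, n_heads=8)
text_hidden = torch.randn(2, 16, 512)
image_features = torch.randn(2, 64, 512)
print(adapter(text_hidden, image_features).shape)  # torch.Size([2, 16, 512])
```

Because only the adapter weights are trained, the underlying text model's behavior on text-only inputs is preserved, which is what lets an 8B or 70B checkpoint gain vision without being retrained from scratch.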
We’re also releasing super-small 1B and 3B parameter models that are optimized to run on devices, like a smartphone or, eventually, a pair of glasses.
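As a sketch of what running these small models locally can look like, the snippet below loads a 1B-scale instruct model through the Hugging Face transformers text-generation pipeline. The model id is an assumption based on the release naming, not something specified in this post; substitute whichever checkpoint you have access to.

```python
from transformers import pipeline

# Minimal local-inference sketch. The model id is an assumption based on
# the release naming; access to the checkpoint may require approval.
pipe = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
)

prompt = "In one sentence, why do small language models suit phones?"
result = pipe(prompt, max_new_tokens=48, do_sample=False)
print(result[0]["generated_text"])
```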
We believe open source AI is the right path forward. It's more cost-effective, customizable, trustworthy, and arguably more performant than the alternatives. And we'll continue to drive Llama forward responsibly with continuous improvements and new capabilities. Learn more about Llama 3.2.