Smaller models can be run on consumer hardware, and with things like RAG (Retrieval-Augmented Generation) you can basically build a database of information on your computer and feed it to the models. That's basically what has been done with Rafiki in Threads, but you do it locally on your machine. That's where I am going with AI, at least: all local. I spent money on a machine to run some bigger LLMs, but I'm finding that with some training, the smaller ones can do just as well. The problem is, even with training, they don't have the reasoning skills to talk to you like one of the frontier models does; they are extremely robotic. It's not until you get to models of 30B and up that they can really start conversing with you like ChatGPT or Grok does, and even then they are weak. I use Llama 70B (70 billion parameters) on my machine and I can actually get some good conversation out of it.
But anything below 30B is going to seem like Siri, lol.
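
For anyone wondering what the RAG part looks like in practice, here is a rough sketch of the idea, not how Rafiki or any particular tool does it. It assumes your "database" is just a list of text notes, and it uses simple word-overlap (cosine similarity on word counts) to pick the most relevant note, where a real setup would more likely use an embedding model and a vector store. The names `NOTES`, `retrieve`, and `build_prompt` are made up for illustration, and the step of actually sending the prompt to a local model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word/number tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, notes, top_k=2):
    """Return up to top_k notes that best match the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(n))), n) for n in notes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop notes that share no words with the question at all.
    return [n for score, n in scored[:top_k] if score > 0]


def build_prompt(question, notes, top_k=2):
    """Put the retrieved notes in front of the question for the model."""
    context = "\n".join("- " + n for n in retrieve(question, notes, top_k))
    return (
        "Use only this context to answer.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical local "database": a few notes stored on your own machine.
NOTES = [
    "The backup drive is mounted at /mnt/backup and syncs every night.",
    "My home server runs Ubuntu and hosts the family photo library.",
    "The router admin page is reachable at 192.168.1.1.",
]

prompt = build_prompt("Where is the backup drive mounted?", NOTES, top_k=1)
print(prompt)
```

The printed prompt is what you would hand to whatever local model you run. The model never has to "know" your notes; it only has to read the bit of context that got pulled in, which is why this can help even a small model.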