Not sure about the smallest token count used for AI training, but DeepSeek models like V3 and R1 are known for training efficiency. I couldn't find a definitive list; DeepSeek's own V3 technical report cites roughly 14.8 trillion pretraining tokens, so the savings seem to come more from lower compute cost than from using fewer tokens than giants like GPT. Wild stuff