Lean Machines: Small Language Models
A common mantra about large language models (LLMs) is that bigger is better, where "bigger" refers to both the number of parameters and the number of training tokens. Parameters are the learned weights of a model, while tokens are the pieces of text used to train it. The more training tokens an LLM sees, the more fluent its answers tend to be. This assumption that bigger is better is what has driven…