Databricks Launches DBRX General-purpose LLM
- By Paul Mah
- April 03, 2024
Databricks last week announced the launch of DBRX, a general-purpose large language model (LLM) that it claims outperforms all established open-source models on standard benchmarks.
According to Databricks, DBRX democratizes the training and tuning of custom, high-performing LLMs, so enterprises no longer need to rely on a small handful of closed models. This lets organizations cost-effectively build, train, and serve their own custom LLMs.
Outperforms open LLMs
Databricks says DBRX outperforms open LLMs like Meta’s Llama 2 70B and Mistral AI’s Mixtral-8x7B on standard industry benchmarks covering language understanding, programming, math, and logic.
Behind the scenes, DBRX was developed by Databricks’ Mosaic AI team and trained on Nvidia’s DGX Cloud. Mosaic AI grew out of MosaicML, a startup that Databricks acquired in a blockbuster USD1.3 billion deal last year.
According to its announcement, Databricks optimized DBRX for efficiency with a mixture-of-experts (MoE) architecture, built on Databricks’ MegaBlocks open-source project.
MegaBlocks enables end-to-end training speedups of up to 40% over MoEs trained with the state-of-the-art Tutel library, and up to 2.4 times over dense DNNs trained with the Megatron-LM framework. Indeed, Databricks says DBRX delivers leading performance while being up to twice as compute-efficient as other leading available LLMs.
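The efficiency claim comes down to how an MoE layer works: a small router scores a set of expert networks for each token, and only the top-scoring few actually run. The toy sketch below (a minimal illustration, not DBRX's actual implementation; the dimensions and single-matrix "experts" are simplifications chosen for brevity) shows why compute scales with the number of *active* experts rather than the total:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only. DBRX reportedly uses 16 experts with 4 active
# per token; d_model here is tiny so the example runs instantly.
d_model, n_experts, top_k = 8, 16, 4

# Router: a linear layer scoring every expert for a given token.
router_w = rng.normal(size=(d_model, n_experts))

# Each "expert" is reduced to a single weight matrix for brevity;
# in a real model each would be a full feed-forward sub-network.
experts = rng.normal(size=(n_experts, d_model, d_model))

def moe_forward(x):
    """Route one token vector through its top-k experts."""
    logits = x @ router_w                 # (n_experts,) expert scores
    top = np.argsort(logits)[-top_k:]     # indices of the k best-scoring experts
    # Softmax over the selected scores gives the mixing weights.
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()
    # Only the selected experts execute, so per-token compute scales
    # with top_k, not n_experts -- the source of MoE's efficiency.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
out = moe_forward(token)
print(out.shape)  # same dimensionality as the input token
```

In this sketch only 4 of the 16 expert matrices are multiplied per token, so roughly a quarter of the layer's parameters are exercised on any forward pass, which is the intuition behind an MoE model being "fast in terms of tokens per second" relative to a dense model of the same total size.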
Unlike OpenAI’s GPT-4 or Google’s Gemini, DBRX is open-source, which means enterprises can download and fine-tune it for their own use. A recent survey from Andreessen Horowitz found that nearly 60% of AI leaders are interested in increasing open-source usage, or switching, when fine-tuned open-source models roughly match the performance of closed-source models.
"At Databricks, our vision has always been to democratize data and AI. We're doing that by delivering data intelligence to every enterprise — helping them understand and use their private data to build their own AI systems. DBRX is the result of that aim," said Ali Ghodsi, co-founder and CEO at Databricks.
“DBRX uses a mixture-of-experts architecture, making the model extremely fast in terms of tokens per second, as well as being cost-effective to serve. All in all, DBRX is setting a new standard for open source LLMs – it gives enterprises a platform to build customized reasoning capabilities based on their own data," said Ghodsi.
Image credit: iStock/hapabapa
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.