New open-source ‘Falcon’ AI language model beats Meta and Google

The artificial intelligence community has a new wing with the release of Falcon 180B, an open-source Large Language Model (LLM) boasting 180 billion parameters trained on mountains of data. This powerful newcomer has surpassed earlier open-source LLMs on several fronts.

The Hugging Face AI community announced in a blog post that Falcon 180B has been released on the Hugging Face Hub. The model's architecture builds on the previous Falcon series of open-source LLMs, leveraging innovations such as multi-query attention to scale to 180 billion parameters trained on 3.5 trillion tokens.
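The multi-query attention mentioned above lets many query heads share a single key/value head, which shrinks the key/value cache and speeds up inference at scale. Below is a minimal NumPy sketch of the idea; the dimensions and weight names are invented for illustration and are not Falcon's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multiquery_attention(x, wq, wk, wv, n_heads):
    """Multi-query attention: n_heads query heads share ONE key/value head.

    x:  (seq, d_model) token representations
    wq: (d_model, n_heads * d_head) -- separate projection per query head
    wk: (d_model, d_head)           -- single shared key projection
    wv: (d_model, d_head)           -- single shared value projection
    """
    seq, _ = x.shape
    d_head = wk.shape[1]
    q = (x @ wq).reshape(seq, n_heads, d_head)  # per-head queries
    k = x @ wk                                  # one shared key head
    v = x @ wv                                  # one shared value head
    out = np.empty((seq, n_heads, d_head))
    for h in range(n_heads):
        scores = q[:, h, :] @ k.T / np.sqrt(d_head)
        out[:, h, :] = softmax(scores) @ v
    return out.reshape(seq, n_heads * d_head)

# Tiny demo: 4 tokens, model width 8, 2 query heads of size 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
wq = rng.normal(size=(8, 2 * 4))
wk = rng.normal(size=(8, 4))
wv = rng.normal(size=(8, 4))
out = multiquery_attention(x, wq, wk, wv, n_heads=2)
```

Because `k` and `v` are computed once and reused by every head, the cached key/value tensors are a fraction of the size a standard multi-head layout would need.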

This represents the longest single-epoch pretraining run ever for an open-source model. Training and refinement consumed nearly 7 million GPU hours, with up to 4,096 GPUs running simultaneously on Amazon SageMaker.
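As a back-of-the-envelope check on those figures, 7 million GPU hours spread across 4,096 concurrent GPUs works out to roughly 71 days of wall-clock time (assuming, for simplicity, that all GPUs ran continuously for the whole job):

```python
gpu_hours = 7_000_000   # total compute reported
gpus = 4_096            # GPUs running simultaneously
hours = gpu_hours / gpus  # wall-clock hours if every GPU ran the whole time
days = hours / 24
# ~1709 hours, i.e. about 71 days of continuous training
```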

To put Falcon 180B’s size in perspective, it has 2.5 times as many parameters as Meta’s LLaMA 2 model. LLaMA 2, previously considered the most capable open-source LLM, launched earlier this year with 70 billion parameters trained on 2 trillion tokens.

Falcon 180B outperforms LLaMA 2 and other models in both scale and benchmark performance across a range of natural language processing (NLP) tasks. It scores 68.74 points on the leaderboard for open-access models and approaches parity with commercial models such as Google’s PaLM-2 on evaluations like the HellaSwag benchmark.

Image: Hugging Face

In particular, Falcon 180B matches or exceeds PaLM-2 Medium on commonly used benchmarks, including HellaSwag, LAMBADA, WebQuestions, Winogrande, and more, and is roughly on par with Google’s PaLM-2 Large. This is an extremely strong showing for an open-source model, even compared with solutions developed by industry giants.

Compared to ChatGPT, the model is more powerful than the free version but slightly less capable than the paid “Plus” service.

“Falcon 180B typically sits somewhere between GPT 3.5 and GPT4 depending on the evaluation benchmark, and further fine-tuning from the community will be very interesting now that it’s openly released,” the blog post says.

The release of Falcon 180B marks another leap in the rapid recent progress of LLMs. Beyond simply increasing parameter counts, techniques like LoRA, weight randomization, and Nvidia’s Perfusion have enabled dramatically more efficient training of large AI models.

With Falcon 180B now freely available on Hugging Face, researchers expect the model to see further gains as the community builds on it. Its demonstration of advanced natural language capabilities right out of the gate is an exciting development for open-source AI.
