Generative Data Intelligence

Grok-1 chatbot code released into the wild


As promised, Elon Musk has released the model behind xAI's Grok chatbot: Grok-1.

Released under the Apache 2.0 license, the base model weights and network architecture are now available. The model has 314 billion parameters and needs hardware with a substantial amount of GPU memory just to hold the weights. It is not fine-tuned for any particular application, such as natural language dialog; it is the raw base model checkpoint from the pre-training phase, which concluded in October 2023.
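To get a feel for what "enough GPU memory" means at this scale, here is a back-of-the-envelope sketch. The arithmetic is illustrative only, based on the published 314 billion parameter count and common weight precisions; it is not an xAI-published figure, and it ignores activations, KV cache, and framework overhead.

```python
# Rough GPU memory needed just to hold Grok-1's weights
# at common precisions. Illustrative arithmetic only.

PARAMS = 314e9  # parameter count reported by xAI

def weights_gib(bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return PARAMS * bytes_per_param / 2**30

for label, nbytes in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{label}: ~{weights_gib(nbytes):,.0f} GiB")
```

Even at 8-bit precision the weights alone run to roughly 300 GiB, which puts the model well beyond a single consumer GPU and into multi-accelerator server territory.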

Critics have pointed to relatively lackluster performance in benchmarks; Grok may be a big model, but its results have underwhelmed some engineers. “Considering how poor it is compared to other models, it really emphasises how important fine tuning is. Models with MUCH smaller parameter counts are outperforming it in many metrics,” said one poster on the Hacker News forums last night.

You might find that the smaller Mistral performs just as well as Grok-1, for instance.

To put its size in perspective, even at 314 billion parameters, it still has some catching up to do with OpenAI’s GPT-4, which reportedly weighs in at 1.76 trillion parameters at last count.

Notably, unlike existing LLMs, which are trained on data with a cutoff point in time, Grok has access to a real-time corpus of everyone’s Xeets via X.com, which should make for some interesting experiments in the days to come, although as another commenter noted: “Twitter tweet data in itself is both highly idiosyncratic and short by design, which alone is not conducive towards training a LLM.”

Grok will be familiar to users of Musk’s social media platform, X, and subscribers have been able to ask the chatbot questions and receive answers. According to xAI, Grok was modeled after The Hitchhiker’s Guide to the Galaxy. “It is intended to answer almost anything and, far harder, even suggest what questions to ask.”

If a user flicks through a dog-eared copy of The Hitchhiker’s Guide to the Galaxy radio scripts, the following definition can be found lurking in Fit the Tenth: “The Hitchhiker’s Guide to the Galaxy is an indispensable companion to all those who are keen to make sense of life in an infinitely complex and confusing universe, for though it cannot hope to be useful or informative on all matters, it does make the reassuring claim that where it is inaccurate, it is at least definitively inaccurate.

“In case of major discrepancy it is always reality that’s got it wrong.”

The release comes on the first anniversary of the launch of OpenAI’s GPT-4 model, and Musk’s legal spat with his former AI pals remains in the background. At the beginning of this month, Musk sued OpenAI, alleging there was little open about the company, despite its name. OpenAI responded by releasing a trove of emails, claiming Musk was fully aware of its plans and wanted it folded into Tesla.

Patrik Backman, general partner at OpenOcean, said of Grok-1 being released: “For once, Elon Musk is putting his principles into action. If you sue OpenAI for transforming into a profit-driven organization, you must be prepared to adhere to the same ideals.”

What hasn’t been released by xAI is also of note. The Grok-1 weights are out there, yet the data used for training is not available under the same license, prompting AI expert Gary Marcus to quip: “PartlyOpenAI.”

Open sourcing generative AI tools has proven controversial. Some developers worry that making the technology freely available risks abuse, while others point to the inherent benefits of transparency.

Meta shared – sort of – its Llama 2 models last year, and other companies have followed suit. OpenAI, on the other hand, has most definitely not.

By opening up the weights behind Grok-1, Musk is attempting to plant a flag in the opposite camp to the proprietary world of OpenAI.

As for its ultimate performance, like everything Musk touches, it could go either way. ®
