
Apple releases OpenELM, a slightly more accurate LLM


Apple, not normally known for its openness, has released a generative AI model called OpenELM which apparently outperforms a set of other language models trained on public data sets.

It’s not by much: compared to OLMo, which debuted in February, OpenELM is 2.36 percent more accurate while using half as many pretraining tokens. But it’s perhaps enough to remind people that Apple is no longer content to be the wallflower at the industry AI rave.

Apple’s claim to openness comes from its decision to release not just the model, but its training and evaluation framework.

“Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations,” explain eleven Apple researchers in the associated technical paper.

And diverging from academic practice, the authors’ email addresses are not listed. Chalk it up to Apple’s interpretation of openness, which is somewhat comparable to the not-very-open OpenAI.

The accompanying software release is not under a recognized open source license. It’s not unduly restrictive, but it does make clear that Apple reserves the right to file a patent claim if any derivative work based on OpenELM is deemed to infringe on its rights.

OpenELM uses a technique called layer-wise scaling to allocate parameters more efficiently in the transformer model. So instead of each layer having the same set of parameters, OpenELM’s transformer layers have different configurations and parameters. The result is better accuracy, shown in the percentage of correct predictions from the model in benchmark tests.
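To make the idea of per-layer configurations concrete, here is a minimal sketch of one way layer-wise scaling could work. It assumes the number of attention heads and the feed-forward width are interpolated linearly from the first layer to the last. The function name, the scaling ranges, and the dimensions are hypothetical illustrations, not values taken from Apple's release.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha_min, alpha_max, beta_min, beta_max):
    """Return a per-layer config where width grows with depth.

    Assumption (not from the article): alpha scales the attention
    heads and beta scales the feed-forward width, and both are
    interpolated linearly across the layers.
    """
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; a single layer sits at 0.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        alpha = alpha_min + (alpha_max - alpha_min) * t
        beta = beta_min + (beta_max - beta_min) * t
        configs.append({
            "layer": i,
            "num_heads": max(1, round(alpha * d_model / head_dim)),
            "ffn_dim": round(beta * d_model),
        })
    return configs


# Hypothetical toy settings, chosen only for illustration.
configs = layerwise_scaling(num_layers=4, d_model=1280, head_dim=64,
                            alpha_min=0.5, alpha_max=1.0,
                            beta_min=0.5, beta_max=4.0)
for cfg in configs:
    print(cfg)
```

With a uniform design every layer would get the same head count and feed-forward width. Here the early layers are narrower and the later ones wider, so a fixed parameter budget is spread unevenly across depth.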

We’re told that OpenELM was pre-trained using the RedPajama dataset, which draws on GitHub, a ton of books, Wikipedia, StackExchange posts, ArXiv papers, and more, and the Dolma set, which draws on Reddit, Wikibooks, Project Gutenberg, and more. The model can be used as you might expect: You give it a prompt, and it attempts to answer or auto-complete it.

One noteworthy aspect of the release is that it is accompanied by “code to convert models to MLX library for inference and fine-tuning on Apple devices.”

MLX is a framework released last year for running machine learning on Apple silicon. The ability to operate locally on Apple devices, rather than over the network, should make OpenELM more interesting to developers.

“Apple’s OpenELM release marks a significant advancement for the AI community, offering efficient, on-device AI processing ideal for mobile apps and IoT devices with limited computing power,” Shahar Chen, CEO and co-founder of AI service biz Aquant, told The Register. “This enables quick, local decision-making essential for everything from smartphones to smart home devices, expanding the potential for AI in everyday technology.”

Apple is keen to show the merits of its homegrown chip architecture for machine learning, specifically supported in hardware since Cupertino introduced its Neural Engine in 2017. Nonetheless OpenELM, while it may score higher on accuracy benchmarks, comes up short in terms of performance.

“Despite OpenELM’s higher accuracy for a similar parameter count, we observe that it is slower than OLMo,” the paper explains, citing tests run using Nvidia’s CUDA on Linux as well as the MLX version of OpenELM on Apple Silicon.

The reason for the less than victorious showing, Apple’s boffins say, is their “naive implementation of RMSNorm,” a technique for normalizing data in machine learning. In the future, they plan to explore further optimizations.
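For readers unfamiliar with RMSNorm, the sketch below shows the textbook form of the operation: each vector is divided by the root mean square of its elements, then multiplied by a learned per-element weight. It is a plain-Python illustration, not Apple's code; the function name and the `eps` default are assumptions made for the example.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Normalize x by its root mean square, then apply a per-element weight.

    A deliberately simple, unoptimized sketch for illustration only.
    eps guards against division by zero for an all-zero input.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit weights, the output has a root mean square of roughly 1.
y = rms_norm([3.0, 4.0], [1.0, 1.0])
print(y)
```

A straightforward version like this computes the reduction and the rescaling as separate steps, which is the kind of overhead that optimized, fused kernels are typically designed to avoid.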

OpenELM is available in pretrained and instruction-tuned models with 270 million, 450 million, 1.1 billion and 3 billion parameters. Those using it are warned to exercise due diligence before trying the model for anything meaningful.

“The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models,” the paper says. “Trained on publicly available datasets, these models are made available without any safety guarantees.” ®
