
Apple releases OpenELM, a slightly more accurate LLM


Apple, not normally known for its openness, has released a generative AI model called OpenELM which apparently outperforms a set of other language models trained on public data sets.

It’s not by much – compared to OLMo, which debuted in February, OpenELM is 2.36 percent more accurate while using 2x fewer pre-training tokens. But it’s perhaps enough to remind people that Apple is no longer content to be the wallflower at the industry AI rave.

Apple’s claim to openness comes from its decision to release not just the model, but its training and evaluation framework.

“Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations,” explain eleven Apple researchers in the associated technical paper.

And diverging from academic practice, the authors’ email addresses are not listed. Chalk it up to Apple’s interpretation of openness, which is somewhat comparable to the not-very-open OpenAI.

The accompanying software release is not made under a recognized open source license. It’s not unduly restrictive, but it does make clear that Apple reserves the right to file a patent claim if any derivative work based on OpenELM is deemed to infringe on its rights.

OpenELM utilizes a technique called layer-wise scaling to allocate parameters more efficiently in the transformer model. So instead of each layer having the same set of parameters, OpenELM’s transformer layers have different configurations and parameters. The result is better accuracy, as measured by the percentage of correct predictions the model makes in benchmark tests.
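To make the idea concrete, here is a simplified, illustrative sketch in Python of what layer-wise scaling can look like. The scaling ranges and parameter names below are our own assumptions, not the values from Apple's paper.

    # Illustrative sketch of layer-wise scaling: rather than giving every
    # transformer layer the same width, scale the attention head count and
    # feed-forward size with depth. The ranges below are made-up values,
    # not the ones from Apple's paper.
    def layer_wise_config(num_layers, base_heads=8, base_ffn_mult=2.0,
                          head_range=(0.5, 1.0), ffn_range=(0.5, 4.0)):
        configs = []
        for i in range(num_layers):
            t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
            head_scale = head_range[0] + t * (head_range[1] - head_range[0])
            ffn_scale = ffn_range[0] + t * (ffn_range[1] - ffn_range[0])
            configs.append({
                "layer": i,
                "num_heads": max(1, round(base_heads * head_scale)),
                "ffn_multiplier": round(base_ffn_mult * ffn_scale, 2),
            })
        return configs

    # Early layers end up narrower and later layers wider, instead of a uniform budget.
    for cfg in layer_wise_config(num_layers=4):
        print(cfg)

The point is simply that early layers get a smaller slice of the parameter budget and later layers a larger one, rather than spending the same amount everywhere.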

We’re told that OpenELM was pre-trained using the RedPajama dataset from GitHub, a ton of books, Wikipedia, StackExchange posts, ArXiv papers, and more, and the Dolma set from Reddit, Wikibooks, Project Gutenberg, and more. The model can be used as you might expect: You give it a prompt, and it attempts to answer or auto-complete it.
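In practice that is ordinary causal-LM inference. As a rough sketch, assuming the checkpoints appear on Hugging Face under an identifier such as apple/OpenELM-270M and that a separate tokenizer has to be supplied (both assumptions; check the actual model card), a prompt-completion call might look like this:

    # Rough sketch of prompting an OpenELM checkpoint with Hugging Face transformers.
    # The model ID, the borrowed tokenizer, and the trust_remote_code flag are all
    # assumptions here; verify against the real model card before copying.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "apple/OpenELM-270M"  # hypothetical identifier for the smallest model
    # The tokenizer choice is an assumption: OpenELM's release points users at an
    # external tokenizer rather than shipping its own.
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    prompt = "Once upon a time there was"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))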

One noteworthy aspect of the release is that it is accompanied by “code to convert models to MLX library for inference and fine-tuning on Apple devices.”

MLX is a framework released last year for running machine learning on Apple silicon. The ability to operate locally on Apple devices, rather than over the network, should make OpenELM more interesting to developers.
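For the local, on-device route, a converted checkpoint could be driven through the community mlx-lm helper package along the following lines. The package usage, model path, and generation arguments here are assumptions for illustration, not part of Apple's release.

    # Minimal sketch of running a converted OpenELM checkpoint locally on Apple
    # silicon via the community mlx-lm package. The model path is hypothetical;
    # use whatever Apple's conversion code produces.
    from mlx_lm import load, generate

    model, tokenizer = load("mlx-community/OpenELM-270M")  # hypothetical converted checkpoint
    text = generate(model, tokenizer,
                    prompt="Explain layer-wise scaling in one sentence.",
                    max_tokens=64)
    print(text)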

“Apple’s OpenELM release marks a significant advancement for the AI community, offering efficient, on-device AI processing ideal for mobile apps and IoT devices with limited computing power,” Shahar Chen, CEO and co-founder of AI service biz Aquant, told The Register. “This enables quick, local decision-making essential for everything from smartphones to smart home devices, expanding the potential for AI in everyday technology.”

Apple is keen to show the merits of its homegrown chip architecture for machine learning, specifically supported in hardware since Cupertino introduced its Neural Engine in 2017. Nonetheless OpenELM, while it may score higher on accuracy benchmarks, comes up short in terms of performance.

“Despite OpenELM’s higher accuracy for a similar parameter count, we observe that it is slower than OLMo,” the paper explains, citing tests run using Nvidia’s CUDA on Linux as well as the MLX version of OpenELM on Apple Silicon.

The reason for the less than victorious showing, Apple’s boffins say, is their “naive implementation of RMSNorm,” a technique for normalizing data in machine learning. In the future, they plan to explore further optimizations.
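For reference, RMSNorm rescales each activation vector by its root mean square and a learned gain. The snippet below is a plain, unfused PyTorch version written purely as an illustration, not Apple's code; running it as several small separate operations per layer is the kind of thing that can drag down inference throughput.

    import torch

    class NaiveRMSNorm(torch.nn.Module):
        """Root-mean-square norm: scale x by 1/RMS(x), then by a learned gain.
        Written as a plain, unfused module purely for illustration; this is not
        Apple's implementation."""

        def __init__(self, dim, eps=1e-6):
            super().__init__()
            self.eps = eps
            self.weight = torch.nn.Parameter(torch.ones(dim))

        def forward(self, x):
            rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
            return (x / rms) * self.weight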

OpenELM is available in pretrained and instruction tuned models with 270 million, 450 million, 1.1 billion and 3 billion parameters. Those using it are warned to exercise due diligence before trying the model for anything meaningful.

“The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models,” the paper says. “Trained on publicly available datasets, these models are made available without any safety guarantees.” ®
