Generative Data Intelligence

Tag: AWS Inferentia

Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock – Part 2 | Amazon Web Services

In Part 1 of this series, we presented a solution that used the Amazon Titan Multimodal Embeddings model to convert individual slides from a...

Top News

Generative AI roadshow in North America with AWS and Hugging Face | Amazon Web Services

In 2023, AWS announced an expanded collaboration with Hugging Face to accelerate our customers’ generative artificial intelligence (AI) journey. Hugging Face, founded in 2016,...

Gradient makes LLM benchmarking cost-effective and effortless with AWS Inferentia | Amazon Web Services

This is a guest post co-written with Michael Feil at Gradient. Evaluating the performance of large language...

Best practices to build generative AI applications on AWS | Amazon Web Services

Generative AI applications driven by foundation models (FMs) are delivering significant business value to organizations in customer experience, productivity, process optimization, and innovation. However,...

Run ML inference on unplanned and spiky traffic using Amazon SageMaker multi-model endpoints | Amazon Web Services

Amazon SageMaker multi-model endpoints (MMEs) are a fully managed capability of SageMaker inference that allows you to deploy thousands of models on a single...

Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock and Amazon SageMaker – Part 1 | Amazon Web Services

With the advent of generative AI, today’s foundation models (FMs), such as the large language models (LLMs) Claude 2 and Llama 2, can perform...

Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium | Amazon Web Services

Today, we’re excited to announce the availability of Llama 2 inference and fine-tuning support on AWS Trainium and AWS Inferentia instances in Amazon SageMaker...

Fine-tune Llama 2 using QLoRA and Deploy it on Amazon SageMaker with AWS Inferentia2 | Amazon Web Services

In this post, we showcase fine-tuning a Llama 2 model using a Parameter-Efficient Fine-Tuning (PEFT) method and deploy the fine-tuned model on AWS Inferentia2....

Welcome to a New Era of Building in the Cloud with Generative AI on AWS | Amazon Web Services

We believe generative AI has the potential over time to transform virtually every customer experience we know. The number of companies launching generative AI...

Scale foundation model inference to hundreds of models with Amazon SageMaker – Part 1 | Amazon Web Services

As democratization of foundation models (FMs) becomes more prevalent and demand for AI-augmented services increases, software as a service (SaaS) providers are looking to...

Reduce model deployment costs by 50% on average using the latest features of Amazon SageMaker | Amazon Web Services

As organizations deploy models to production, they are constantly looking for ways to optimize the performance of their foundation models (FMs) running on the...

Minimize real-time inference latency by using Amazon SageMaker routing strategies | Amazon Web Services

Amazon SageMaker makes it straightforward to deploy machine learning (ML) models for real-time inference and offers a broad selection of ML instances spanning CPUs...

How Amazon Search M5 saved 30% for LLM training cost by using AWS Trainium | Amazon Web Services

For decades, Amazon has pioneered and innovated machine learning (ML), bringing delightful experiences to its customers. From the earliest days, Amazon has used ML...
