Tag: AWS Inferentia
Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock – Part 2 | Amazon Web Services
In Part 1 of this series, we presented a solution that used the Amazon Titan Multimodal Embeddings model to convert individual slides from a...
Generative AI roadshow in North America with AWS and Hugging Face | Amazon Web Services
In 2023, AWS announced an expanded collaboration with Hugging Face to accelerate our customers’ generative artificial intelligence (AI) journey. Hugging Face, founded in 2016,...
Gradient makes LLM benchmarking cost-effective and effortless with AWS Inferentia | Amazon Web Services
This is a guest post co-written with Michael Feil at Gradient.
Evaluating the performance of large language...
Best practices to build generative AI applications on AWS | Amazon Web Services
Generative AI applications driven by foundation models (FMs) are delivering significant business value to organizations in customer experience, productivity, process optimization, and innovation. However,...
Run ML inference on unplanned and spiky traffic using Amazon SageMaker multi-model endpoints | Amazon Web Services
Amazon SageMaker multi-model endpoints (MMEs) are a fully managed capability of SageMaker inference that allows you to deploy thousands of models on a single...
Talk to your slide deck using multimodal foundation models hosted on Amazon Bedrock and Amazon SageMaker – Part 1 | Amazon Web Services
With the advent of generative AI, today’s foundation models (FMs), such as the large language models (LLMs) Claude 2 and Llama 2, can perform...
Fine-tune and deploy Llama 2 models cost-effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium | Amazon Web Services
Today, we’re excited to announce the availability of Llama 2 inference and fine-tuning support on AWS Trainium and AWS Inferentia instances in Amazon SageMaker...
Fine-tune Llama 2 using QLoRA and Deploy it on Amazon SageMaker with AWS Inferentia2 | Amazon Web Services
In this post, we showcase fine-tuning a Llama 2 model using a Parameter-Efficient Fine-Tuning (PEFT) method and deploy the fine-tuned model on AWS Inferentia2....
Welcome to a New Era of Building in the Cloud with Generative AI on AWS | Amazon Web Services
We believe generative AI has the potential over time to transform virtually every customer experience we know. The number of companies launching generative AI...
Scale foundation model inference to hundreds of models with Amazon SageMaker – Part 1 | Amazon Web Services
As democratization of foundation models (FMs) becomes more prevalent and demand for AI-augmented services increases, software as a service (SaaS) providers are looking to...
Reduce model deployment costs by 50% on average using the latest features of Amazon SageMaker | Amazon Web Services
As organizations deploy models to production, they are constantly looking for ways to optimize the performance of their foundation models (FMs) running on the...
Minimize real-time inference latency by using Amazon SageMaker routing strategies | Amazon Web Services
Amazon SageMaker makes it straightforward to deploy machine learning (ML) models for real-time inference and offers a broad selection of ML instances spanning CPUs...
How Amazon Search M5 saved 30% for LLM training cost by using AWS Trainium | Amazon Web Services
For decades, Amazon has pioneered and innovated machine learning (ML), bringing delightful experiences to its customers. From the earliest days, Amazon has used ML...