Generative Data Intelligence

Tag: AWS Inferentia

Scale your machine learning workloads on Amazon ECS powered by AWS Trainium instances | Amazon Web Services

Running machine learning (ML) workloads with containers is becoming a common practice. Containers can fully encapsulate not just your training code, but the entire...

Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 5: Hosting | Amazon Web Services

In 2021, we launched AWS Support Proactive Services as part of the AWS Enterprise Support plan. Since its introduction, we have helped hundreds of...

Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 1 | Amazon Web Services

Cost optimization is one of the pillars of the AWS Well-Architected Framework, and it’s a continual process of refinement and improvement over the span...

Build a serverless meeting summarization backend with large language models on Amazon SageMaker JumpStart | Amazon Web Services

AWS delivers services that meet customers’ artificial intelligence (AI) and machine learning (ML) needs with services ranging from custom hardware like AWS Trainium and...

Host ML models on Amazon SageMaker using Triton: Python backend | Amazon Web Services

Amazon SageMaker provides a number of options for users who are looking for a solution to host their machine learning (ML) models. Of these...

Achieve high performance with lowest cost for generative AI inference using AWS Inferentia2 and AWS Trainium on Amazon SageMaker

The world of artificial intelligence (AI) and machine learning (ML) has been witnessing a paradigm shift with the rise of generative AI models that...

How to extend the functionality of AWS Trainium with custom operators

Deep learning (DL) is a fast-evolving field, and practitioners are constantly innovating DL models and inventing ways to speed them up. Custom operators are...

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

This is a guest post co-written with Fred Wu from Sportradar. Sportradar is the world’s leading sports technology company, at the intersection between sports,...

Deploy large models at high performance using FasterTransformer on Amazon SageMaker

Sparked by the release of large AI models like AlexaTM, GPT, OpenChatKit, BLOOM, GPT-J, GPT-NeoX, FLAN-T5, OPT, Stable Diffusion, and ControlNet, the popularity of...

Announcing New Tools for Building with Generative AI on AWS

The seeds of a machine learning (ML) paradigm shift have existed for decades, but with the ready availability of scalable compute capacity, a massive...

Deploy large language models on AWS Inferentia2 using large model inference containers

You don’t have to be an expert in machine learning (ML) to appreciate the value of large language models (LLMs). Better search results, image...

Maximize performance and reduce your deep learning training cost with AWS Trainium and Amazon SageMaker

Today, tens of thousands of customers are building, training, and deploying machine learning (ML) models using Amazon SageMaker to power applications that have the...
