Generative Data Intelligence

Tag: AWS Inferentia

Scale your machine learning workloads on Amazon ECS powered by AWS Trainium instances | Amazon Web Services

Running machine learning (ML) workloads with containers is becoming a common practice. Containers can fully encapsulate not just your training code, but the entire...

Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 5: Hosting | Amazon Web Services

In 2021, we launched AWS Support Proactive Services as part of the AWS Enterprise Support plan. Since its introduction, we have helped hundreds of...

Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 1 | Amazon Web Services

Cost optimization is one of the pillars of the AWS Well-Architected Framework, and it’s a continual process of refinement and improvement over the span...

Build a serverless meeting summarization backend with large language models on Amazon SageMaker JumpStart | Amazon Web Services

AWS delivers services that meet customers’ artificial intelligence (AI) and machine learning (ML) needs with services ranging from custom hardware like AWS Trainium and...

Host ML models on Amazon SageMaker using Triton: Python backend | Amazon Web Services

Amazon SageMaker provides a number of options for users who are looking for a solution to host their machine learning (ML) models. Of these...

Achieve high performance with lowest cost for generative AI inference using AWS Inferentia2 and AWS Trainium on Amazon SageMaker

The world of artificial intelligence (AI) and machine learning (ML) has been witnessing a paradigm shift with the rise of generative AI models that...

How to extend the functionality of AWS Trainium with custom operators

Deep learning (DL) is a fast-evolving field, and practitioners are constantly innovating DL models and inventing ways to speed them up. Custom operators are...

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

This is a guest post co-written with Fred Wu from Sportradar. Sportradar is the world’s leading sports technology company, at the intersection between sports,...

Deploy large models at high performance using FasterTransformer on Amazon SageMaker

Sparked by the release of large AI models like AlexaTM, GPT, OpenChatKit, BLOOM, GPT-J, GPT-NeoX, FLAN-T5, OPT, Stable Diffusion, and ControlNet, the popularity of...

Announcing New Tools for Building with Generative AI on AWS

The seeds of a machine learning (ML) paradigm shift have existed for decades, but with the ready availability of scalable compute capacity, a massive...

Deploy large language models on AWS Inferentia2 using large model inference containers

You don’t have to be an expert in machine learning (ML) to appreciate the value of large language models (LLMs). Better search results, image...

Maximize performance and reduce your deep learning training cost with AWS Trainium and Amazon SageMaker

Today, tens of thousands of customers are building, training, and deploying machine learning (ML) models using Amazon SageMaker to power applications that have the...
