Tag: Multi-Model Endpoint

Scale foundation model inference to hundreds of models with Amazon SageMaker – Part 1 | Amazon Web Services

AI November 30, 2023

As democratization of foundation models (FMs) becomes more prevalent and demand for AI-augmented services increases, software as a service (SaaS) providers are looking to...

Build a medical imaging AI inference pipeline with MONAI Deploy on AWS | Amazon Web Services

AI November 8, 2023

How Veriff decreased deployment time by 80% using Amazon SageMaker multi-model endpoints | Amazon Web Services

AI October 16, 2023

Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs | Amazon...

AI September 6, 2023

Efficiently train, tune, and deploy custom ensembles using Amazon SageMaker | Amazon Web Services

AI July 20, 2023

Artificial intelligence (AI) has become an important and popular topic in the technology community. As AI has evolved, we have seen different types of...

How Forethought saves over 66% in costs for generative AI models using Amazon SageMaker | Amazon Web Services

AI June 13, 2023

This post is co-written with Jad Chamoun, Director of Engineering at Forethought Technologies, Inc. and Salina Wu, Senior ML Engineer at Forethought Technologies, Inc....

Host ML models on Amazon SageMaker using Triton: ONNX Models | Amazon Web Services

AI June 9, 2023

ONNX (Open Neural Network Exchange) is an open-source standard for representing deep learning models widely supported by many providers. ONNX provides tools for optimizing...

Host ML models on Amazon SageMaker using Triton: CV model with PyTorch backend | Amazon Web Services

AI May 31, 2023

PyTorch is a machine learning (ML) framework based on the Torch library, used for applications such as computer vision and natural language processing. One...

Analyze Amazon SageMaker spend and determine cost optimization opportunities based on usage, Part 5: Hosting | Amazon Web Services

AI May 30, 2023

In 2021, we launched AWS Support Proactive Services as part of the AWS Enterprise Support plan. Since its introduction, we have helped hundreds of...

Create high-quality images with Stable Diffusion models and deploy them cost-efficiently with Amazon SageMaker | Amazon Web Services

AI May 26, 2023

Text-to-image generation is a task in which a machine learning (ML) model generates an image from a textual description. The goal is to generate...

Host ML models on Amazon SageMaker using Triton: Python backend | Amazon Web Services

AI May 9, 2023

Amazon SageMaker provides a number of options for users who are looking for a solution to host their machine learning (ML) models. Of these...

Host ML models on Amazon SageMaker using Triton: TensorRT models

AI May 8, 2023

Sometimes it can be very beneficial to use tools such as compilers that can modify and compile your models for optimal inference performance. In...

Hosting ML Models on Amazon SageMaker using Triton: XGBoost, LightGBM, and Treelite Models

AI May 2, 2023

One of the most popular models available today is XGBoost. With the ability to solve various problems such as classification and regression, XGBoost has...

How Sportradar used the Deep Java Library to build production-scale ML platforms for increased performance and efficiency

AI April 19, 2023

This is a guest post co-written with Fred Wu from Sportradar. Sportradar is the world’s leading sports technology company, at the intersection between sports,...

Architect personalized generative AI SaaS applications on Amazon SageMaker

AI March 9, 2023

The AI landscape is being reshaped by the rise of generative models capable of synthesizing high-quality data, such as text, images, music, and videos....

Accelerate disaster response with computer vision for satellite imagery using Amazon SageMaker and Amazon Augmented AI

AI February 24, 2023

In recent years, advances in computer vision have enabled researchers, first responders, and governments to tackle the challenging problem of processing global satellite...

Latest Intelligence

Achieve high performance at scale for model serving using Amazon SageMaker multi-model endpoints with GPU

AI February 24, 2023

Model hosting patterns in Amazon SageMaker, Part 1: Common design patterns for building ML applications on Amazon SageMaker

AI January 9, 2023

Run and optimize multi-model inference with Amazon SageMaker multi-model endpoints

AI October 14, 2022
