
Use Amazon Titan models for image generation, editing, and searching | Amazon Web Services


Amazon Bedrock provides a broad range of high-performing foundation models from Amazon and other leading AI companies, including Anthropic, AI21, Meta, Cohere, and Stability AI, and covers a wide range of use cases, including text and image generation, searching, chat, reasoning and acting agents, and more. The new Amazon Titan Image Generator model allows content creators to quickly generate high-quality, realistic images using simple English text prompts. The advanced AI model understands complex instructions with multiple objects and returns studio-quality images suitable for advertising, ecommerce, and entertainment. Key features include the ability to refine images by iterating on prompts, automatic background editing, and generating multiple variations of the same scene. Creators can also customize the model with their own data to output on-brand images in a specific style. Importantly, Titan Image Generator has built-in safeguards, such as invisible watermarks on all AI-generated images, to encourage responsible use and mitigate the spread of disinformation. This technology makes producing custom images in large volumes for any industry more accessible and efficient.

The new Amazon Titan Multimodal Embeddings model helps build more accurate search and recommendations by understanding text, images, or both. It converts images and English text into semantic vectors, capturing meaning and relationships in your data. You can combine text and images, such as product descriptions and photos, to identify items more effectively. The vectors power fast, accurate search experiences. Titan Multimodal Embeddings is flexible in vector dimensions, enabling optimization for performance needs. An asynchronous API and an Amazon OpenSearch Service connector make it easy to integrate the model into your neural search applications.

In this post, we walk through how to use the Titan Image Generator and Titan Multimodal Embeddings models via the AWS Python SDK.

Image generation and editing

In this section, we demonstrate the basic coding patterns for using the AWS SDK to generate new images and perform AI-powered edits on existing images. Code examples are provided in Python, and JavaScript (Node.js) is also available in this GitHub repository.

Before you can write scripts that use the Amazon Bedrock API, you need to install the appropriate version of the AWS SDK in your environment. For Python scripts, you can use the AWS SDK for Python (Boto3). Python users may also want to install the Pillow module, which facilitates image operations such as loading and saving images. For setup instructions, refer to the GitHub repository.

Additionally, enable access to the Amazon Titan Image Generator and Titan Multimodal Embeddings models. For more information, refer to Model access.

Helper functions

The following function sets up the Amazon Bedrock Boto3 runtime client and generates images by taking payloads of different configurations (which we discuss later in this post):

import boto3
import json, base64, io
from random import randint
from PIL import Image

bedrock_runtime_client = boto3.client("bedrock-runtime")


def titan_image(
    payload: dict,
    num_image: int = 2,
    cfg: float = 10.0,
    seed: int = None,
    modelId: str = "amazon.titan-image-generator-v1",
) -> list:
    #   ImageGenerationConfig Options:
    #   - numberOfImages: Number of images to be generated
    #   - quality: Quality of generated images, can be standard or premium
    #   - height: Height of output image(s)
    #   - width: Width of output image(s)
    #   - cfgScale: Scale for classifier-free guidance
    #   - seed: The seed to use for reproducibility
    seed = seed if seed is not None else randint(0, 214783647)
    body = json.dumps(
        {
            **payload,
            "imageGenerationConfig": {
                "numberOfImages": num_image,  # Range: 1 to 5
                "quality": "premium",  # Options: standard/premium
                "height": 1024,  # Supported height list above
                "width": 1024,  # Supported width list above
                "cfgScale": cfg,  # Range: 1.0 (exclusive) to 10.0
                "seed": seed,  # Range: 0 to 214783647
            },
        }
    )

    response = bedrock_runtime_client.invoke_model(
        body=body,
        modelId=modelId,
        accept="application/json",
        contentType="application/json",
    )

    response_body = json.loads(response.get("body").read())
    images = [
        Image.open(io.BytesIO(base64.b64decode(base64_image)))
        for base64_image in response_body.get("images")
    ]
    return images
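
It can help to check the generation settings locally before sending a request. The following is a minimal sketch, assuming the ranges given in the comments of the helper above (1 to 5 images, cfgScale greater than 1.0 and at most 10.0, quality of standard or premium). The function build_image_generation_config is a hypothetical helper for illustration and is not part of the AWS SDK.

```python
def build_image_generation_config(
    num_image: int = 2,
    cfg: float = 10.0,
    seed: int = 0,
    quality: str = "premium",
    height: int = 1024,
    width: int = 1024,
) -> dict:
    """Return an imageGenerationConfig dict after basic range checks.

    The ranges mirror the comments in titan_image; they are assumptions
    taken from this post, not a guarantee about the service.
    """
    if not 1 <= num_image <= 5:
        raise ValueError("numberOfImages must be between 1 and 5")
    if not 1.0 < cfg <= 10.0:
        raise ValueError("cfgScale must be greater than 1.0 and at most 10.0")
    if quality not in ("standard", "premium"):
        raise ValueError("quality must be 'standard' or 'premium'")
    if seed < 0:
        raise ValueError("seed must be non-negative")
    return {
        "numberOfImages": num_image,
        "quality": quality,
        "height": height,
        "width": width,
        "cfgScale": cfg,
        "seed": seed,
    }
```

The returned dictionary has the same keys as the imageGenerationConfig block that titan_image builds inline.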
        

Generate images from text

Scripts that generate a new image from a text prompt follow this implementation pattern:

  1. Configure a text prompt and an optional negative text prompt.
  2. Use the BedrockRuntime client to invoke the Titan Image Generator model.
  3. Parse and decode the response.
  4. Save the resulting images to disk.
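
Steps 3 and 4 can be sketched without Pillow by writing the decoded bytes straight to disk. This is a minimal sketch, assuming the response body has already been parsed into a dictionary with a base64 list under "images" (as in the helper above) and that the bytes are PNG-encoded. The function name save_response_images and the .png extension are assumptions made for illustration.

```python
import base64
from pathlib import Path


def save_response_images(response_body: dict, out_dir: str, prefix: str = "image") -> list:
    """Decode each base64 string under "images" and write it into out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(response_body.get("images", [])):
        # Assumption: the service returns PNG bytes; adjust the suffix if not.
        path = target / f"{prefix}_{index}.png"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

If you already hold PIL Image objects, as returned by titan_image, calling their save method achieves the same result.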

Text-to-image

The following is a typical image generation script for the Titan Image Generator model:

# Text Variation
# textToImageParams Options:
#   text: prompt to guide the model on how to generate variations
#   negativeText: prompts to guide the model on what you don't want in image
images = titan_image(
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "two dogs walking down an urban street, facing the camera",  # Required
            "negativeText": "cars",  # Optional
        },
    }
)

This will produce images similar to the following.

Figure: Response image 1 and Response image 2 (each showing two dogs walking down a street).

Image variation

Image variation provides a way to generate subtle variants of an existing image. The following code snippet uses one of the images generated in the previous example to create variant images:

# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

# Image Variation
# ImageVariationParams Options:
#   text: prompt to guide the model on how to generate variations
#   negativeText: prompts to guide the model on what you don't want in image
#   images: base64 string representation of the input image, only 1 is supported
images = titan_image(
    {
        "taskType": "IMAGE_VARIATION",
        "imageVariationParams": {
            "text": "two dogs walking down an urban street, facing the camera",  # Required
            "images": [input_image],  # One image is required
            "negativeText": "cars",  # Optional
        },
    },
)

This will produce images similar to the following.

Figure: Original image, Response image 1, and Response image 2 (two dogs walking down a street).
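
Because the code comments state that only PNG and JPEG inputs are supported, you may want to check a file before encoding it. The following is a minimal sketch that inspects the leading signature bytes. The helpers encode_image_bytes and encode_image_file are hypothetical names introduced here, and the check only looks at the file header rather than validating the whole image.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file
JPEG_SIGNATURE = b"\xff\xd8\xff"      # start-of-image marker of a JPEG file


def encode_image_bytes(data: bytes) -> str:
    """Return a base64 string for PNG or JPEG bytes; reject anything else."""
    if not (data.startswith(PNG_SIGNATURE) or data.startswith(JPEG_SIGNATURE)):
        raise ValueError("only PNG and JPEG input images are supported")
    return base64.b64encode(data).decode("utf8")


def encode_image_file(path: str) -> str:
    """Read a file from disk and encode it with the same format check."""
    with open(path, "rb") as image_file:
        return encode_image_bytes(image_file.read())
```

The result of encode_image_file can be used wherever the snippets in this post use input_image.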

Edit an existing image

The Titan Image Generator model allows you to add, remove, or replace elements or areas within an existing image. You specify which area to affect by providing one of the following:

  • Mask image – A mask image is a binary image in which the 0-value pixels represent the area you want to affect and the 255-value pixels represent the area that should remain unchanged.
  • Mask prompt – A mask prompt is a natural language text description of the elements you want to affect, which uses an in-house text-to-segmentation model.

For more information, refer to Prompt Engineering Guidelines.
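
A mask image of the kind described above can be drawn with Pillow. The following is a minimal sketch that paints a black rectangle (value 0, the area to affect) on a white canvas (value 255, the area to keep) and returns it as a base64 string. The function name rectangle_mask_base64 is hypothetical, and the choice of an RGB PNG is an assumption; this post does not state which color mode the service expects.

```python
import base64
import io

from PIL import Image, ImageDraw


def rectangle_mask_base64(size: tuple, box: tuple) -> str:
    """Build a mask of the given (width, height): black inside box, white elsewhere.

    box is (left, top, right, bottom) in pixel coordinates.
    """
    mask = Image.new("RGB", size, (255, 255, 255))       # white: leave unchanged
    ImageDraw.Draw(mask).rectangle(box, fill=(0, 0, 0))  # black: area to affect
    buffer = io.BytesIO()
    mask.save(buffer, format="PNG")
    return base64.b64encode(buffer.getvalue()).decode("utf8")
```

The mask should have the same dimensions as the image it is applied to, so pass the input image's size when calling the function.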

Scripts that apply an edit to an image follow this implementation pattern:

  1. Load the image to be edited from disk.
  2. Convert the image to a base64-encoded string.
  3. Configure the mask through one of the following methods:
    1. Load a mask image from disk, encode it as base64, and set it as the maskImage parameter.
    2. Set the maskPrompt parameter to a text description of the elements to affect.
  4. Specify the new content to be generated using one of the following options:
    1. To add or replace an element, set the text parameter to a description of the new content.
    2. To remove an element, omit the text parameter completely.
  5. Use the BedrockRuntime client to invoke the Titan Image Generator model.
  6. Parse and decode the response.
  7. Save the resulting images to disk.
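
Steps 3 and 4 of the pattern above amount to assembling a payload with the right optional keys. The following is a minimal sketch of that assembly. build_inpainting_payload is a hypothetical helper, and rejecting a call that supplies both a mask image and a mask prompt is a choice made in this sketch, not a documented service rule.

```python
from typing import Optional


def build_inpainting_payload(
    image_b64: str,
    mask_image_b64: Optional[str] = None,
    mask_prompt: Optional[str] = None,
    text: Optional[str] = None,
    negative_text: Optional[str] = None,
) -> dict:
    """Assemble an INPAINTING payload; leave text unset to remove the masked element."""
    if (mask_image_b64 is None) == (mask_prompt is None):
        raise ValueError("provide exactly one of mask_image_b64 or mask_prompt")
    params = {"image": image_b64}
    if mask_image_b64 is not None:
        params["maskImage"] = mask_image_b64
    else:
        params["maskPrompt"] = mask_prompt
    if text is not None:
        params["text"] = text  # add or replace content in the masked area
    if negative_text is not None:
        params["negativeText"] = negative_text
    return {"taskType": "INPAINTING", "inPaintingParams": params}
```

The resulting dictionary has the same shape as the payloads passed to titan_image in the examples that follow.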

Object editing: Inpainting with a mask image

The following is a typical image editing script for the Titan Image Generator model using maskImage. We take one of the images generated earlier and provide a mask image, where 0-value pixels are rendered as black and 255-value pixels as white. We also replace one of the dogs in the image with a cat using a text prompt.

with open("<YOUR_MASK_IMAGE_FILE_PATH>", "rb") as image_file:
    mask_image = base64.b64encode(image_file.read()).decode("utf8")

# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_ORIGINAL_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

# Inpainting
# inPaintingParams Options:
#   text: prompt to guide inpainting
#   negativeText: prompts to guide the model on what you don't want in image
#   image: base64 string representation of the input image
#   maskImage: base64 string representation of the input mask image
#   maskPrompt: prompt used for auto editing to generate mask

images = titan_image(
    {
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "text": "a cat",  # Optional
            "negativeText": "bad quality, low res",  # Optional
            "image": input_image,  # Required
            "maskImage": mask_image,
        },
    },
    num_image=3,
)

This will produce images similar to the following.

Figure: Original image (two dogs walking down a street), Mask image, and Edited image (a cat and a dog walking down a street).

Object removal: Inpainting with a mask prompt

In another example, we use maskPrompt to specify an object in the image, taken from the previous steps, to edit. By omitting the text prompt, the object will be removed:

# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

images = titan_image(
    {
        "taskType": "INPAINTING",
        "inPaintingParams": {
            "negativeText": "bad quality, low res",  # Optional
            "image": input_image,  # Required
            "maskPrompt": "white dog",  # One of "maskImage" or "maskPrompt" is required
        },
    },
)

This will produce images similar to the following.

Figure: Original image (two dogs walking down a street) and Response image (one dog walking down a street).

Background editing: Outpainting

Outpainting is useful when you want to replace the background of an image. You can also extend the bounds of an image for a zoom-out effect. In the following example script, we use maskPrompt to specify which object to keep; you can also use maskImage. The parameter outPaintingMode specifies whether to allow modification of the pixels inside the mask. If set as DEFAULT, pixels inside the mask are allowed to be modified so that the reconstructed image is consistent overall. This option is recommended if the maskImage provided doesn't represent the object with pixel-level precision. If set as PRECISE, the modification of pixels inside the mask is prevented. This option is recommended if you use a maskPrompt or a maskImage that represents the object with pixel-level precision.

# Import an input image like this (only PNG/JPEG supported):
with open("<YOUR_IMAGE_FILE_PATH>", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode("utf8")

# OutPaintingParams Options:
#   text: prompt to guide outpainting
#   negativeText: prompts to guide the model on what you don't want in image
#   image: base64 string representation of the input image
#   maskImage: base64 string representation of the input mask image
#   maskPrompt: prompt used for auto editing to generate mask
#   outPaintingMode: DEFAULT | PRECISE
images = titan_image(
    {
        "taskType": "OUTPAINTING",
        "outPaintingParams": {
            "text": "forest",  # Required
            "image": input_image,  # Required
            "maskPrompt": "dogs",  # One of "maskImage" or "maskPrompt" is required
            "outPaintingMode": "PRECISE",  # One of "PRECISE" or "DEFAULT"
        },
    },
    num_image=3,
)

This will produce images similar to the following.

Figure: Original image (two dogs walking down a street) with response images for the text "beach" (one dog walking on a beach) and the text "forest".

In addition, the effects of different values for outPaintingMode, with a maskImage that doesn't outline the object with pixel-level precision, are as follows.

This section has given you an overview of the operations you can perform with the Titan Image Generator model. Specifically, these scripts demonstrate text-to-image, image variation, inpainting, and outpainting tasks. You should be able to adapt the patterns for your own applications by referencing the parameter details for those task types in the Amazon Titan Image Generator documentation.
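
As one example of adapting these patterns, the outPaintingMode recommendation from the outpainting section can be encoded in a small payload builder. This is a minimal sketch: build_outpainting_payload and its mask_image_is_precise flag are hypothetical names, and the mode selection simply follows the guidance in this post (PRECISE for a mask prompt or a pixel-accurate mask image, DEFAULT otherwise).

```python
from typing import Optional


def build_outpainting_payload(
    image_b64: str,
    text: str,
    mask_prompt: Optional[str] = None,
    mask_image_b64: Optional[str] = None,
    mask_image_is_precise: bool = False,
) -> dict:
    """Assemble an OUTPAINTING payload and pick outPaintingMode from the mask type."""
    if (mask_prompt is None) == (mask_image_b64 is None):
        raise ValueError("provide exactly one of mask_prompt or mask_image_b64")
    params = {"text": text, "image": image_b64}
    if mask_prompt is not None:
        params["maskPrompt"] = mask_prompt
        mode = "PRECISE"
    else:
        params["maskImage"] = mask_image_b64
        # A rough, hand-drawn mask benefits from letting the model adjust its edges.
        mode = "PRECISE" if mask_image_is_precise else "DEFAULT"
    params["outPaintingMode"] = mode
    return {"taskType": "OUTPAINTING", "outPaintingParams": params}
```

You can still override the mode by editing the returned dictionary before passing it to titan_image.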

Multimodal embedding and searching

You can use the Amazon Titan Multimodal Embeddings model for enterprise tasks such as image search and similarity-based recommendation, and it has built-in mitigation that helps reduce bias in search results. There are multiple embedding dimension sizes for the best latency/accuracy trade-offs for different needs, and all can be customized with a simple API to adapt to your own data while preserving data security and privacy. Amazon Titan Multimodal Embeddings is provided as simple APIs for real-time or asynchronous batch transform search and recommendation applications, and can be connected to different vector databases, including Amazon OpenSearch Service.

Helper functions

The following function converts an image, and optionally text, into multimodal embeddings:

def titan_multimodal_embedding(
    image_path: str = None,  # maximum 2048 x 2048 pixels
    description: str = None,  # English only and max input tokens 128
    dimension: int = 1024,  # 1,024 (default), 384, 256
    model_id: str = "amazon.titan-embed-image-v1",
):
    payload_body = {}
    embedding_config: dict = {"embeddingConfig": {"outputEmbeddingLength": dimension}}

    # You can specify either text or image or both
    if image_path:
        # Maximum image size supported is 2048 x 2048 pixels
        with open(image_path, "rb") as image_file:
            payload_body["inputImage"] = base64.b64encode(image_file.read()).decode(
                "utf8"
            )
    if description:
        payload_body["inputText"] = description

    assert payload_body, "please provide either an image and/or a text description"
    print("\n".join(payload_body.keys()))

    response = bedrock_runtime_client.invoke_model(
        body=json.dumps({**payload_body, **embedding_config}),
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )

    return json.loads(response.get("body").read())
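
The request body for the embedding model can also be built and checked separately from the network call. The following is a minimal sketch, assuming the three output sizes listed in the helper's comment (1,024, 384, and 256). build_embedding_body is a hypothetical function, and it raises ValueError instead of using assert so that the checks still run when Python is started with optimizations enabled.

```python
import json
from typing import Optional

SUPPORTED_DIMENSIONS = (1024, 384, 256)  # taken from the helper's comment above


def build_embedding_body(
    input_text: Optional[str] = None,
    input_image_b64: Optional[str] = None,
    dimension: int = 1024,
) -> str:
    """Return the JSON request body for a text, image, or combined embedding."""
    if dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"dimension must be one of {SUPPORTED_DIMENSIONS}")
    if not input_text and not input_image_b64:
        raise ValueError("provide an image, a text description, or both")
    body = {"embeddingConfig": {"outputEmbeddingLength": dimension}}
    if input_text:
        body["inputText"] = input_text
    if input_image_b64:
        body["inputImage"] = input_image_b64
    return json.dumps(body)
```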

The following function returns the top similar multimodal embeddings given a query multimodal embedding. Note that in practice, you can use a managed vector database, such as OpenSearch Service. The following example is for illustration purposes:

from scipy.spatial.distance import cdist
import numpy as np

def search(query_emb: np.array, indexes: np.array, top_k: int = 1):
    dist = cdist(query_emb, indexes, metric="cosine")
    # Row 0 holds the distances for the single query; keep its top_k entries.
    return dist.argsort(axis=-1)[0, :top_k], np.sort(dist, axis=-1)[0, :top_k]
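
To make the ranking step concrete, the following is a minimal sketch of the same idea using only NumPy in place of SciPy's cdist. It computes cosine distance (1 minus cosine similarity) between one query vector and every row of an index, then returns the closest rows. The function cosine_top_k and the toy vectors are illustrative assumptions, not part of the original code.

```python
import numpy as np


def cosine_top_k(query: np.ndarray, index: np.ndarray, top_k: int = 1):
    """Return (indices, distances) of the top_k rows of index closest to query."""
    query = np.asarray(query, dtype=float)
    index = np.asarray(index, dtype=float)
    similarity = (index @ query) / (
        np.linalg.norm(index, axis=1) * np.linalg.norm(query)
    )
    distance = 1.0 - similarity            # smaller means more similar
    order = np.argsort(distance)[:top_k]   # closest rows first
    return order, distance[order]


# Toy index with three 2-dimensional "embeddings".
toy_index = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
indices, distances = cosine_top_k(np.array([2.0, 0.1]), toy_index, top_k=2)
print(indices, distances)
```

Here the query points mostly along the first axis, so row 0 ranks first and the diagonal row 2 ranks second.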

Synthetic dataset

For illustration purposes, we use Anthropic's Claude 2.1 model in Amazon Bedrock to randomly generate seven different products, each with three variants, using the following prompt:

Generate a list of 7 items description for an online e-commerce shop, each comes with 3 variants of color or type. All with separate full sentence description.

The following is the list of returned outputs:

1. T-shirt
- A red cotton t-shirt with a crew neck and short sleeves.
- A blue cotton t-shirt with a v-neck and short sleeves.
- A black polyester t-shirt with a scoop neck and cap sleeves.

2. Jeans
- Classic blue relaxed fit denim jeans with a mid-rise waist.
- Black skinny fit denim jeans with a high-rise waist and ripped details at the knees.
- Stonewash straight leg denim jeans with a standard waist and front pockets.

3. Sneakers
- White leather low-top sneakers with an almond toe cap and thick rubber outsole.
- Gray mesh high-top sneakers with neon green laces and a padded ankle collar.
- Tan suede mid-top sneakers with a round toe and ivory rubber cupsole.

4. Backpack
- A purple nylon backpack with padded shoulder straps, front zipper pocket and laptop sleeve.
- A gray canvas backpack with brown leather trims, side water bottle pockets and drawstring top closure.
- A black leather backpack with multiple interior pockets, top carry handle and adjustable padded straps.

5. Smartwatch
- A silver stainless steel smartwatch with heart rate monitor, GPS tracker and sleep analysis.
- A space gray aluminum smartwatch with step counter, phone notifications and calendar syncing.
- A rose gold smartwatch with activity tracking, music controls and customizable watch faces.

6. Coffee maker
- A 12-cup programmable coffee maker in brushed steel with removable water tank and keep warm plate.
- A compact 5-cup single serve coffee maker in matt black with travel mug auto-dispensing feature.
- A retro style stovetop percolator coffee pot in speckled enamel with stay-cool handle and glass knob lid.

7. Yoga mat
- A teal 4mm thick yoga mat made of natural tree rubber with moisture-wicking microfiber top.
- A purple 6mm thick yoga mat made of eco-friendly TPE material with integrated carrying strap.
- A patterned 5mm thick yoga mat made of PVC-free material with towel cover included.

Assign the above response to the variable response_cat. Then we use the Titan Image Generator model to create product images for each item:

import re

def extract_text(input_string):
    pattern = r"- (.*?)($|\n)"
    matches = re.findall(pattern, input_string)
    extracted_texts = [match[0] for match in matches]
    return extracted_texts

product_description = extract_text(response_cat)

titles = []
for prompt in product_description:
    images = titan_image(
        {
            "taskType": "TEXT_IMAGE",
            "textToImageParams": {
                "text": prompt,  # Required
            },
        },
        num_image=1,
    )
    title = "_".join(prompt.split()[:4]).lower()
    titles.append(title)
    images[0].save(f"{title}.png", format="png")

All the generated images can be found in the appendix at the end of this post.
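
The text-processing part of this step can be tried on its own, without calling any model. The following is a minimal sketch that extracts the bulleted descriptions from a short sample and derives a file name stem from the first four words, as the loop above does. It uses an anchored, multiline variant of the pattern, and the names extract_descriptions and filename_stem are hypothetical.

```python
import re


def extract_descriptions(listing: str) -> list:
    """Return the text of every line that starts with "- "."""
    return re.findall(r"^- (.+)$", listing, flags=re.MULTILINE)


def filename_stem(description: str, words: int = 4) -> str:
    """Join the first few words with underscores, lowercased."""
    return "_".join(description.split()[:words]).lower()


sample = """1. T-shirt
- A red cotton t-shirt with a crew neck and short sleeves.
- A blue cotton t-shirt with a v-neck and short sleeves.
"""

descriptions = extract_descriptions(sample)
print(descriptions[0])                 # A red cotton t-shirt with a crew neck and short sleeves.
print(filename_stem(descriptions[0]))  # a_red_cotton_t-shirt
```

Two descriptions that share their first four words would map to the same file name, so a larger catalog may need a longer stem or an index suffix.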

Multimodal dataset indexing

Use the following code for multimodal dataset indexing:

multimodal_embeddings = []
for image_filename, description in zip(titles, product_description):
    embedding = titan_multimodal_embedding(f"{image_filename}.png", dimension=1024)["embedding"]
    multimodal_embeddings.append(embedding)

Multimodal searching

Use the following code for multimodal searching:

query_prompt = "<YOUR_QUERY_TEXT>"
query_embedding = titan_multimodal_embedding(description=query_prompt, dimension=1024)["embedding"]
# If searching via Image
# query_image_filename = "<YOUR_QUERY_IMAGE>"
# query_embedding = titan_multimodal_embedding(image_path=query_image_filename, dimension=1024)["embedding"]
idx_returned, dist = search(np.array(query_embedding)[None], np.array(multimodal_embeddings))

The following are some search results.
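
To read results like these programmatically, the returned positions have to be mapped back to the catalog entries. The following is a minimal sketch, assuming the indices and distances are flat sequences of equal length and that titles and descriptions are in the same order used for indexing. describe_results is a hypothetical helper.

```python
def describe_results(indices, distances, titles, descriptions) -> list:
    """Pair each returned index with its title, description, and distance."""
    results = []
    for position, distance in zip(indices, distances):
        position = int(position)  # accept NumPy integers as well as ints
        results.append(
            {
                "title": titles[position],
                "description": descriptions[position],
                "distance": float(distance),
            }
        )
    return results
```

A smaller distance indicates a closer match to the query.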

Conclusion

This post introduces the Amazon Titan Image Generator and Amazon Titan Multimodal Embeddings models. Titan Image Generator enables you to create custom, high-quality images from text prompts. Key features include iterating on prompts, automatic background editing, and data customization. It has safeguards such as invisible watermarks to encourage responsible use. Titan Multimodal Embeddings converts text, images, or both into semantic vectors to power accurate search and recommendations. We then provided Python code samples for using these services, and demonstrated generating images from text prompts and iterating on those images; editing existing images by adding, removing, or replacing elements specified by mask images or mask text; creating multimodal embeddings from text, images, or both; and searching for multimodal embeddings similar to a query. We also demonstrated using a synthetic e-commerce dataset indexed and searched using Titan Multimodal Embeddings. The aim of this post is to enable developers to start using these new AI services in their applications. The code patterns can serve as templates for custom implementations.

All the code is available in the GitHub repository. For more information, refer to the Amazon Bedrock User Guide.


About the Authors

Rohit Mittal is a Principal Product Manager at Amazon AI building multi-modal foundation models. He recently led the launch of the Amazon Titan Image Generator model as part of the Amazon Bedrock service. Experienced in AI/ML, NLP, and Search, he is interested in building products that solve customer pain points with innovative technology.

Dr. Ashwin Swaminathan is a Computer Vision and Machine Learning researcher, engineer, and manager with 12+ years of industry experience and 5+ years of academic research experience. Ashwin has strong fundamentals and a proven ability to quickly gain knowledge and contribute to newer and emerging areas.

Dr. Yusheng Xie is a Principal Applied Scientist at Amazon AGI. His work focuses on building multi-modal foundation models. Before joining AGI, he led various multi-modal AI developments at AWS, such as Amazon Titan Image Generator and Amazon Textract Queries.

Dr. Hao Yang is a Principal Applied Scientist at Amazon. His main research interests are object detection and learning with limited annotations. Outside of work, Hao enjoys watching films, photography, and outdoor activities.

Dr. Davide Modolo is an Applied Science Manager at Amazon AGI, working on building large multimodal foundation models. Before joining Amazon AGI, he was a manager/lead for 7 years in AWS AI Labs (Amazon Bedrock and Amazon Rekognition). Outside of work, he enjoys traveling and playing any kind of sport, especially soccer.

Dr. Baichuan Sun, currently serving as a Sr. AI/ML Solutions Architect at AWS, focuses on generative AI and applies his knowledge in data science and machine learning to provide practical, cloud-based business solutions. With experience in management consulting and AI solution architecture, he addresses a range of complex challenges, including robotics computer vision, time series forecasting, and predictive maintenance, among others. His work is grounded in a solid background of project management, software R&D, and academic pursuits. Outside of work, Dr. Sun enjoys the balance of traveling and spending time with family and friends.

Dr. Kai Zhu currently works as a Cloud Support Engineer at AWS, helping customers with issues in AI/ML-related services such as SageMaker and Bedrock. He is a SageMaker Subject Matter Expert. Experienced in data science and data engineering, he is interested in building generative AI powered projects.

Kris Schultz has spent over 25 years bringing engaging user experiences to life by combining emerging technologies with world-class design. In his role as Senior Product Manager, Kris helps design and build AWS services to power Media & Entertainment, Gaming, and Spatial Computing.


Appendix

In the following sections, we demonstrate challenging sample use cases such as text insertion, hands, and reflections to highlight the capabilities of the Titan Image Generator model. We also include the sample output images produced in earlier examples.

Text

The Titan Image Generator model excels at complex workflows such as inserting readable text into images. This example demonstrates Titan's ability to clearly render uppercase and lowercase letters in a consistent style within an image.

Figure: a corgi wearing a baseball cap with the text "genai"; a happy boy giving a thumbs up, wearing a T-shirt with the text "generative AI".

Hands

The Titan Image Generator model also has the ability to generate detailed AI images. The images show realistic hands and fingers with visible detail, going beyond more basic AI image generation that may lack such specificity. In the following examples, notice the precise depiction of the pose and anatomy.

Figure: a person's hand viewed from above; a close look at a person's hands holding a coffee mug.

Mirror

The images generated by the Titan Image Generator model spatially arrange objects and accurately reflect mirror effects, as demonstrated in the following examples.

Figure: a cute fluffy white cat stands on its hind legs, peering curiously into an ornate golden mirror, and in the reflection the cat sees itself; a beautiful sky and lake with a reflection on the water.

Synthetic product images

The following are the product images generated earlier in this post for the Titan Multimodal Embeddings model.
