AWS Machine Learning

1. Machine Learning on AWS

2. Machine Learning API Service

この部分は簡単しすぎて、APIを利用するだけ

2.1 Comperhend for NLP

2.2 Polly

2.3 Lex

2.4 Rekoginition image/video

2.5 Transcribe and Translate

3. ML Platform SageMaker

この部分は重点的にマスターしましょう

3.1 Setup

  • This notebook was tested in Amazon SageMaker Studio on a ml.t3.medium instance with Python 3 (Data Science) kernel. Let's start by specifying:
  • S3 buckets and prefixesを定義すること。Regionを全部統一することが無難
  • The IAM roleを定義すること。
%%time

import os
import boto3
import re
import sagemaker

role = sagemaker.get_execution_role()
region = boto3.Session().region_name

# S3 bucket where the training data is located.
# Feel free to specify a different bucket and prefix
data_bucket = f"jumpstart-cache-prod-{region}"
data_prefix = "1p-notebooks-datasets/abalone/libsvm"
data_bucket_path = f"s3://{data_bucket}"

# S3 bucket for saving code and model artifacts.
# Feel free to specify a different bucket and prefix
output_bucket = sagemaker.Session().default_bucket()
output_prefix = "sagemaker/DEMO-xgboost-abalone-default"
output_bucket_path = f"s3://{output_bucket}"

3.2 get data with SageMaker

データは、RDB/MongoDB/S3または、Azure Synapseに保存している。

3.3. Train model with SageMaker job

  • After setting training parameters, we kick off training, and poll for status until training is completed, which in this example, takes between 5 and 6 minutes.
container = sagemaker.image_uris.retrieve("xgboost", region, "1.2-1")
%%time
import boto3
from time import gmtime, strftime

job_name = f"DEMO-xgboost-regression-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"
print("Training job", job_name)

# Ensure that the training and validation data folders generated above are reflected in the "InputDataConfig" parameter below.

create_training_params = {
    "AlgorithmSpecification": {"TrainingImage": container, "TrainingInputMode": "File"},
    "RoleArn": role,
    "OutputDataConfig": {"S3OutputPath": f"{output_bucket_path}/{output_prefix}/single-xgboost"},
    "ResourceConfig": {"InstanceCount": 1, "InstanceType": "ml.m5.2xlarge", "VolumeSizeInGB": 5},
    "TrainingJobName": job_name,
    "HyperParameters": {
        "max_depth": "5",
        "eta": "0.2",
        "gamma": "4",
        "min_child_weight": "6",
        "subsample": "0.7",
        "objective": "reg:linear",
        "num_round": "50",
        "verbosity": "2",
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": f"{data_bucket_path}/{data_prefix}/train",
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
            "ContentType": "libsvm",
            "CompressionType": "None",
        },
        {
            "ChannelName": "validation",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": f"{data_bucket_path}/{data_prefix}/validation",
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
            "ContentType": "libsvm",
            "CompressionType": "None",
        },
    ],
}


client = boto3.client("sagemaker", region_name=region)
client.create_training_job(**create_training_params)

import time

status = client.describe_training_job(TrainingJobName=job_name)["TrainingJobStatus"]
print(status)
while status != "Completed" and status != "Failed":
    time.sleep(60)
    status = client.describe_training_job(TrainingJobName=job_name)["TrainingJobStatus"]
    print(status)

3.4. Deploy and host model with SageMaker model

%%time
import boto3
from time import gmtime, strftime

model_name = f"{job_name}-model"
print(model_name)

info = client.describe_training_job(TrainingJobName=job_name)
model_data = info["ModelArtifacts"]["S3ModelArtifacts"]
print(model_data)

primary_container = {"Image": container, "ModelDataUrl": model_data}

create_model_response = client.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)

print(create_model_response["ModelArn"])

3.5. Use model from SageMaker endpoint

from time import gmtime, strftime

endpoint_config_name = f"DEMO-XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"
print(endpoint_config_name)
create_endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m5.xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print(f"Endpoint Config Arn: {create_endpoint_config_response['EndpointConfigArn']}")

3.6 Create endpoint

Lastly, the customer creates the endpoint that serves up the model, through specifying the name and configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. This takes 9-11 minutes to complete.

%%time
import time

endpoint_name = f'DEMO-XGBoostEndpoint-{strftime("%Y-%m-%d-%H-%M-%S", gmtime())}'
print(endpoint_name)
create_endpoint_response = client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print(create_endpoint_response["EndpointArn"])

resp = client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
while status == "Creating":
    print(f"Status: {status}")
    time.sleep(60)
    resp = client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]

print(f"Arn: {resp['EndpointArn']}")
print(f"Status: {status}")

4. Machine Learning Virtual servers

4.1 ML Virtual Servers

種類 説明
SaaS Databricks Sparkを利用するとき、一番早く開発環境を整える方法
PaaS EMR AWSのHadoop、Spark環境のサービス
IaaS EC2 ML AMI 一番自由のやり方、別料金が発生しない

4.2 Deep Learing

4.3 Gluon for MXNet in SageMaker

4.4 MXNet in SageMaker

set your machine

mlp_model = mx.mod.Module(
    symbol = build_graph(),
    context = get_train_context(num_cpus, num_gpus)
)

4.5 Databricks

4.6 Deep Learning AMI

5. ML Architectures

  1. ML API for conversational apps
  2. scikit-learn.org/stable
  3. kdunggets.com
  4. kaggle.com/competitions