Wednesday, January 22, 2020

Know your AWS SageMaker by Amazon Web Services

How AWS Describes SageMaker:

"Amazon SageMaker provides a fully managed service for data science and machine learning workflows. One of the most important capabilities of Amazon SageMaker is its ability to run fully managed training jobs to train machine learning models." Source 1

The Estimator Object

S3 Storage

AWS SageMaker instance types

Note: AWS SageMaker instances are now separate from EC2 instances, and availability can differ by region. SageMaker has accelerated computing options, more commonly known as GPU instances, such as ml.p2.xlarge.

See the full list of AWS SageMaker instances here: Source 2

That page has a comprehensive table listing instance type, vCPU count, GPU, Mem (GiB), GPU Mem (GiB), and a simple description of Network Performance.
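Instance type names such as ml.p2.xlarge appear to follow an ml.<family>.<size> pattern. Here is a minimal sketch for splitting a name and guessing whether it is a GPU instance. The helper names and the GPU_FAMILIES set are my own assumptions for illustration, and the set is not an official or complete list, so check the table in Source 2 for real values.

```python
# Assumption: SageMaker instance type names look like "ml.<family>.<size>".
# GPU_FAMILIES is an illustrative, incomplete set, not an official list.
GPU_FAMILIES = {"p2", "p3", "g4dn"}


def parse_instance_type(name):
    """Split a name such as 'ml.p2.xlarge' into (family, size)."""
    parts = name.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError(f"not a SageMaker instance type: {name!r}")
    return parts[1], parts[2]


def has_gpu(name):
    """Return True if the instance family is in the assumed GPU set."""
    family, _size = parse_instance_type(name)
    return family in GPU_FAMILIES
```

For example, has_gpu("ml.p2.xlarge") is True under these assumptions, while a plain EC2 name like "p2.xlarge" is rejected because it lacks the ml. prefix.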

Optimization: Bring your data to AWS
Previously, all training files had to be stored in S3 and downloaded to the training instances before training started. Now you can also use Amazon's distributed file systems.
"Training machine learning models requires providing the training datasets to the training job. Until now, when using Amazon S3 as the training datasource in File input mode, all training data had to be downloaded from Amazon S3 to the EBS volumes attached to the training instances at the start of the training job. A distributed file system such as Amazon FSx for Lustre or EFS can speed up machine learning training by eliminating the need for this download step." 
The two options are Amazon FSx for Lustre and Amazon Elastic File System (EFS). Source 1
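To make the file system option a bit more concrete, here is a rough sketch of what one input channel might look like when a training job reads from FSx for Lustre or EFS instead of S3. The dictionary keys follow my understanding of the FileSystemDataSource structure in the CreateTrainingJob API. The function name, the file system ID, and the directory path are hypothetical placeholders, not values from the AWS post.

```python
def file_system_channel(channel_name, file_system_id, file_system_type,
                        directory_path, access_mode="ro"):
    """Build one InputDataConfig entry that points at a file system.

    Hypothetical helper: it only builds a dict and calls no AWS API.
    """
    if file_system_type not in ("EFS", "FSxLustre"):
        raise ValueError("file_system_type must be 'EFS' or 'FSxLustre'")
    if access_mode not in ("ro", "rw"):
        raise ValueError("access_mode must be 'ro' or 'rw'")
    return {
        "ChannelName": channel_name,
        "DataSource": {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,
                "FileSystemAccessMode": access_mode,
                "FileSystemType": file_system_type,
                "DirectoryPath": directory_path,
            }
        },
    }


# Placeholder values for illustration only.
train_channel = file_system_channel(
    "train", "fs-0123456789abcdef0", "FSxLustre", "/fsx/train"
)
```

A list of such channels would then go in the InputDataConfig field of the training job request. Read-only access ("ro") is probably the safer default for training data.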


OpenCV cheat sheet

import cv2
cv2.imread()
cv2.resize()
.transpose() on arrays
.reshape() on arrays