Amazon Elastic Inference

What is Amazon Elastic Inference?

Amazon Elastic Inference lets users attach GPU-powered inference acceleration, sized to the workload, to any Amazon EC2 instance, Amazon SageMaker instance, or Amazon ECS task. Users pay only for the accelerator hours they use. It is designed for use with AWS’s enhanced versions of TensorFlow Serving, Apache MXNet, and PyTorch, which automatically detect attached inference accelerators and distribute model operations between the accelerator’s GPU and the instance’s CPU.
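
As a rough illustration of attaching an accelerator to an EC2 instance, the sketch below builds the request parameters for the EC2 RunInstances call, which accepts an ElasticInferenceAccelerators list. The helper name build_run_instances_params, the AMI ID placeholder, and the c5.large / eia2.medium pairing are assumptions chosen for illustration, not values from this page. A real launch would likely also need networking and IAM setup that is not shown here.

```python
from typing import Any, Dict


def build_run_instances_params(
    image_id: str,
    instance_type: str,
    accelerator_type: str,
    accelerator_count: int = 1,
) -> Dict[str, Any]:
    """Build keyword arguments for EC2 RunInstances with an accelerator attached.

    Hypothetical helper for illustration; it only assembles a dict and
    makes no AWS calls.
    """
    if accelerator_count < 1:
        raise ValueError("accelerator_count must be at least 1")
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # The accelerator is requested alongside the instance itself.
        "ElasticInferenceAccelerators": [
            {"Type": accelerator_type, "Count": accelerator_count}
        ],
    }


if __name__ == "__main__":
    # "ami-EXAMPLE" is a placeholder, not a real AMI ID.
    params = build_run_instances_params("ami-EXAMPLE", "c5.large", "eia2.medium")
    print(params)
    # With AWS credentials configured, the dict could be passed to boto3:
    #   import boto3
    #   boto3.client("ec2").run_instances(**params)
```

Keeping the parameter construction separate from the API call lets the request be inspected or logged before anything is launched.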

Categories & Use Cases

Machine Learning