Distributed Machine Learning Predictions Using PyTorch
Serve ML predictions to your users faster by running a distributed model server at the edge. This tutorial uses CloudFlow to deploy PyTorch's TorchServe with an example pretrained model.
The TorchServe container image we will use is available on Docker Hub. This tutorial was inspired by a GCP example.
note
Before starting, create a new CloudFlow Project and then delete the default Deployment and ingress-upstream
Service to prepare the project for your new deployment.
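If you prefer the command line, the cleanup might look like the sketch below. The default Deployment's name varies, so list the resources first and substitute what you actually see:
# List what the new Project contains, then remove the defaults.
kubectl get deployments,services
kubectl delete service ingress-upstream
kubectl delete deployment YOUR_DEFAULT_DEPLOYMENT   # substitute the name listed above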
Prerequisites
- You need an account on Docker Hub.
- You need Docker installed so that you can build a Docker image; a quick check follows this list.
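You can confirm both prerequisites from a terminal; the login is needed later to push your image:
docker --version   # confirms Docker is installed
docker login       # authenticates against your Docker Hub account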
Pull Down the Pretrained Model
Pull down the PyTorch example models from GitHub so that we can build one of them into the container image. We'll use the MNIST example, which recognizes handwritten digits in PNG images.
mkdir my-pytorch-example
cd my-pytorch-example
git clone https://github.com/pytorch/serve.git \
  --branch=v0.3.0 \
  --depth=1
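To confirm the clone landed where the Dockerfile expects, list the MNIST example files; the three files copied in the next step should all be present:
ls serve/examples/image_classifier/mnist/
# expect mnist.py, mnist_cnn.pt, and mnist_handler.py among the output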
Create a Dockerfile for Your Container Image
The container image you'll build is based on the official TorchServe image on Docker Hub. We'll use the MNIST model, which classifies handwritten digits from PNG images. Save the following as Dockerfile inside the my-pytorch-example directory.
FROM pytorch/torchserve:0.3.0-cpu

# Copy the MNIST model definition, pretrained weights, and request handler
# into the model server's home directory.
COPY serve/examples/image_classifier/mnist/mnist.py \
     serve/examples/image_classifier/mnist/mnist_cnn.pt \
     serve/examples/image_classifier/mnist/mnist_handler.py \
     /home/model-server/

# Switch to root so we can append to the server config; service_envelope=json
# makes TorchServe accept the {"instances": [...]} JSON envelope used later
# in this tutorial. Then drop back to the unprivileged model-server user.
USER root
RUN printf "\nservice_envelope=json" >> /home/model-server/config.properties
USER model-server

# Bundle the model definition, weights, and handler into a model archive
# (mnist.mar) in the server's model store.
RUN torch-model-archiver \
    --model-name=mnist \
    --version=1.0 \
    --model-file=/home/model-server/mnist.py \
    --serialized-file=/home/model-server/mnist_cnn.pt \
    --handler=/home/model-server/mnist_handler.py \
    --export-path=/home/model-server/model-store

# Start TorchServe and serve the archived MNIST model.
CMD ["torchserve", \
     "--start", \
     "--ts-config=/home/model-server/config.properties", \
     "--models", \
     "mnist=mnist.mar"]
Build and Publish the Image
Build the Docker image and push it to Docker Hub, substituting YOUR_DOCKERHUB_ACCOUNT accordingly.
docker build -t my-pytorch-image .
docker tag my-pytorch-image YOUR_DOCKERHUB_ACCOUNT/pytorch:latest
docker push YOUR_DOCKERHUB_ACCOUNT/pytorch:latest
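Before deploying, you can optionally smoke-test the image locally. This sketch assumes ports 8080 (inference) and 8081 (management) are free on your machine:
docker run --rm -d -p 8080:8080 -p 8081:8081 --name pytorch-test my-pytorch-image
curl http://localhost:8080/ping     # should report a healthy status
curl http://localhost:8081/models   # should list the mnist model
docker stop pytorch-test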
Create a Kubernetes Deployment for PyTorch
Next, create the deployment for PyTorch as pytorch-deployment.yaml, substituting YOUR_DOCKERHUB_ACCOUNT accordingly. This directs CloudFlow to distribute the container image you've pushed to Docker Hub.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pytorch
  name: pytorch
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pytorch
  template:
    metadata:
      labels:
        app: pytorch
    spec:
      containers:
        - image: YOUR_DOCKERHUB_ACCOUNT/pytorch:latest
          imagePullPolicy: Always
          name: pytorch
          resources:
            requests:
              memory: "0.5Gi"
              cpu: "500m"
            limits:
              memory: "0.5Gi"
              cpu: "500m"
Apply this deployment resource to your Project with either the Kubernetes dashboard or kubectl apply -f pytorch-deployment.yaml.
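You can watch the rollout finish and confirm the pod reaches Ready status:
kubectl rollout status deployment/pytorch
kubectl get pods -l app=pytorch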
Expose the Service on the Internet
We want to expose the PyTorch service on the Internet. Create ingress-upstream.yaml as defined below.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ingress-upstream
  name: ingress-upstream
spec:
  ports:
    - name: 80-80
      port: 80
      protocol: TCP
      targetPort: 8080
  selector:
    app: pytorch
  sessionAffinity: None
  type: ClusterIP
Apply this service resource to your Project with either the Kubernetes dashboard or kubectl apply -f ingress-upstream.yaml.
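Optionally, and assuming your CloudFlow kubeconfig permits port-forwarding, you can sanity-check the Service before testing through the public hostname. A sketch (run curl in a second terminal):
kubectl port-forward service/ingress-upstream 8080:80
# in another terminal:
curl http://localhost:8080/ping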
See the pods running on CloudFlow's network using kubectl get pods -o wide.
The -o wide switch shows where your container is currently running, as chosen by the default AEE location optimization strategy: CloudFlow deploys your container where it best serves incoming traffic.
Create a File with an Image
You'll send an image of a handwritten '3' to the prediction endpoint to see whether the model can classify it. Place the following JSON into a file called png-image-of-a-3.json.
{
  "instances": [
    {
      "data": {
        "b64": "iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAAAAABXZoBIAAAAv0lEQVR4nGNgGKSA03faPyDwxibHu/7vvwfnzz/5tsgRU3LW33uukgwMCi1PdmBKOr7dAmEsuiiIKSssDpX8q4fbYYv/4ZZk3YTNWCg48HcGTrnOf39dcUgpzPv97+/b56LY5PKBIfTi+bt//7ptMSV7Py6NYWCQirn17zymJK8R1PRVd4RxuoqhG6erCEmevoBbbsqvUkxBXWMQabzk+wksOhZ9vHDh4oWPf1d6YZFUuff377+/9zp5cNtIHQAAtP5OgKw1m4AAAAAASUVORK5CYII="
      }
    }
  ]
}
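To test with your own digit, you can build the same envelope from any small grayscale PNG. A minimal sketch, assuming your image is saved as digit.png (a hypothetical filename); note that macOS's base64 takes -i digit.png instead of -w0:
# Base64-encode the PNG and wrap it in the JSON envelope the server expects.
B64=$(base64 -w0 digit.png)
printf '{ "instances": [ { "data": { "b64": "%s" } } ] }\n' "$B64" > my-digit.json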
The decoded image is a 28×28 grayscale PNG of a handwritten 3.
Start Making Predictions at the Edge
Exercise the ML model server, substituting YOUR_ENVIRONMENT_HOSTNAME accordingly.
curl -X POST \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @png-image-of-a-3.json \
  YOUR_ENVIRONMENT_HOSTNAME/predictions/mnist
The result you'll get:
{ "predictions": [3] }