Let's say you've developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you expect a heavy load of requests, R running on a single server isn't going to cut it: you'll need some kind of distributed architecture with enough servers to handle the volume of requests in real time.
This reference architecture for real-time scoring with R, published in Microsoft Docs, describes a Kubernetes-based system to distribute the load to R sessions running in containers. This diagram from the article provides a good overview:
You can find detailed instructions for deploying this architecture on GitHub. The architecture uses Azure-specific components, but you could also use their open-source equivalents if you wanted to host them yourself:
- Microsoft Machine Learning Server (specifically, the mrsdeploy package) is used to expose an R prediction function as an HTTP endpoint. You can use any method you like to turn R into a REST service: for example, standard R with the plumber package.
- Azure Container Registry is where the container images with R and your prediction code are stored. You could also use your own private Docker registry, or Docker Hub, here.
- Azure Kubernetes Service provides a managed Kubernetes cluster to host the containers and orchestrate them on the underlying server cluster (which can be scaled up and down dynamically according to load). If you have your own Kubernetes cluster, you can substitute it here (but note the security considerations and authentication required).
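As a rough sketch of the open-source route mentioned above, a prediction function can be exposed as an HTTP endpoint with plumber. This is only an illustration, not the code from the reference architecture: the model file name (`model.rds`) and the JSON input format are assumptions, and you would adapt them to your own model.

```r
# api.R -- a minimal plumber API exposing a saved model as a scoring endpoint.
# Assumes a model object has been saved with saveRDS(model, "model.rds");
# that file name is a placeholder for this sketch.
library(plumber)

model <- readRDS("model.rds")

#* Score the inputs posted as a JSON object of column/value pairs
#* @post /predict
function(req) {
  inputs <- jsonlite::fromJSON(req$postBody)
  predict(model, newdata = as.data.frame(inputs))
}
```

You would then serve this file with something like `plumber::plumb("api.R")$run(port = 8000)`, and the containerized version of this process is what the Kubernetes cluster would scale out across nodes.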
For more details on this architecture, take a look at the Microsoft Docs article linked below.
Microsoft Docs: Real-time scoring of R machine learning models