--- title: "Deploying TensorFlow Models" output: rmarkdown::html_vignette: default vignette: > %\VignetteIndexEntry{Deploying TensorFlow Models} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} type: docs repo: https://github.com/rstudio/tfdeploy menu: main: name: "Deploying Models" identifier: "tools-tfdeploy-introduction" parent: "tfdeploy-top" weight: 10 aliases: - /tools/tfdeploy/ --- ```{r setup, include = FALSE} knitr::opts_chunk$set(eval = FALSE) ``` ## Overview While TensorFlow models are typically defined and trained using R or Python code, it is possible to deploy TensorFlow models in a wide variety of environments without any runtime dependency on R or Python: - [TensorFlow Serving](https://www.tensorflow.org/serving/) is an open-source software library for serving TensorFlow models using a [gRPC](https://grpc.io/) interface. - [CloudML](https://tensorflow.rstudio.com/tools/cloudml/) is a managed cloud service that serves TensorFlow models using a [REST](https://cloud.google.com/ml-engine/reference/rest/v1/projects/predict) interface. - [RStudio Connect](https://www.rstudio.com/products/connect/) provides support for serving models using the same REST API as CloudML, but on a server within your own organization. TensorFlow models can also be deployed to [mobile](https://www.tensorflow.org/mobile/tflite/) and [embedded](https://aws.amazon.com/blogs/machine-learning/how-to-deploy-deep-learning-models-with-aws-lambda-and-tensorflow/) devices including iOS and Android mobile phones and Raspberry Pi computers. The R interface to TensorFlow includes a variety of tools designed to make exporting and serving TensorFlow models straightforward. The basic process for deploying TensorFlow models from R is as follows: - Train a model using the [keras](https://tensorflow.rstudio.com/keras/), [tfestimators](https://tensorflow.rstudio.com/tfestimators/), or [tensorflow](https://tensorflow.rstudio.com/tensorflow/) R packages. - Call the `export_savedmodel()` function on your trained model to write it to disk as a TensorFlow SavedModel. - Use the `serve_savedmodel()` function from the [tfdeploy](https://tensorflow.rstudio.com/tools/tfdeploy/) package to run a local test server that supports the same REST API as CloudML and RStudio Connect. - Deploy your model using TensorFlow Serving, CloudML, or RStudio Connect. ## Getting Started Begin by installing the **tfdeploy** package from CRAN as follows: ```{r} install.packages(tfdeploy) ``` To demonstrate the basics, we'll walk through an end-to-end example that trains a Keras model with the MNIST dataset, exports the saved model, and then serves the exported model locally for predictions with a REST API. After that we'll describe in more depth the specific requirements and various options associated with exporting models. Finally, we'll cover the various deployment options and provide links to additional documentation. ### MNIST Model We'll use a Keras model that recognizes handwritten digits from the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset as an example. MNIST consists of 28 x 28 grayscale images of handwritten digits like these: The dataset also includes labels for each image. For example, the labels for the above images are 5, 0, 4, and 1. 
Here's the complete source code for the model:

```{r}
library(keras)

# load data
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()

# reshape and rescale
x_train <- array_reshape(x_train, dim = c(nrow(x_train), 784)) / 255
x_test <- array_reshape(x_test, dim = c(nrow(x_test), 784)) / 255

# one-hot encode response
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)

# define and compile model
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784),
              name = "image") %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax', name = "prediction") %>%
  compile(
    loss = 'categorical_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy')
  )

# train model
history <- model %>% fit(
  x_train, y_train,
  epochs = 35,
  batch_size = 128,
  validation_split = 0.2
)
```

In R, it is easy to make predictions using the trained model and R's `predict` function:

```{r}
preds <- predict(model, x_test[1:4, ])
```

```
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    0    0    0    0    0    0    0    1    0     0
[2,]    0    0    1    0    0    0    0    0    0     0
[3,]    0    1    0    0    0    0    0    0    0     0
[4,]    1    0    0    0    0    0    0    0    0     0
```

Each row represents an image, each column represents a digit from 0-9, and the values represent the model's prediction. For example, the first image is predicted to be a 7.

What if we want to deploy the model in an environment where R isn't available? The following sections cover exporting and deploying the model with the **tfdeploy** package.

### Exporting the Model

After training, the next step is to export the model as a TensorFlow SavedModel using the `export_savedmodel()` function:

```{r}
library(tfdeploy)
export_savedmodel(model, "savedmodel")
```

This will create a "savedmodel" directory that contains a saved version of your MNIST model. You can view the graph of your model using TensorBoard with the `view_savedmodel()` function:

```{r}
view_savedmodel("savedmodel")
```

### Using the Exported Model

To test the exported model locally, use the `serve_savedmodel()` function:

```{r}
library(tfdeploy)
serve_savedmodel('savedmodel', browse = TRUE)
```

![](images/swagger.png){width=80% .illustration}

The REST API for the model is served on localhost at port 8089. Because we specified `browse = TRUE`, a web page that describes the REST interface to the model is also displayed.

The REST interface is based on the [CloudML predict request API](https://cloud.google.com/ml-engine/docs/v1/predict-request). The model can be used for prediction by making HTTP POST requests. The body of the request should contain the instances of data to generate predictions for; the HTTP response will provide the model's predictions.

**The data in the request body should be pre-processed and formatted in the same way as the original training data** (e.g. feature scaling and normalization, pixel transformations for images, etc.). For MNIST, the request body could be a JSON file containing one or more pre-processed images:

**new_image.json**

```text
{
  "instances": [
    {
      "image_input": [0.12,0,0.79,...,0,0]
    }
  ]
}
```

The HTTP POST request would be:

```{bash}
curl -X POST -H "Content-Type: application/json" -d @new_image.json http://localhost:8089/serving_default/predict
```

Similar to R's `predict` function, the response includes an array representing the digits 0-9. The image in `new_image.json` is predicted to be a 7 (since that's the column which has a `1`, whereas the other columns have values approximating zero).
``` { "predictions": [ { "prediction": [ 1.3306e-24, 4.9968e-26, 1.8917e-23, 1.7047e-21, 0, 8.963e-33, 0, 1, 2.3306e-32, 2.0314e-22 ] } ] } ``` ### Deploying the Model Once you are satisifed with local testing, the next step is to deploy the model so others can use it. There are a number of available options for this including [TensorFlow Serving], [CloudML], and [RStudio Connect]. For example, to deploy the saved model to CloudML we could use the cloudml package: ```{r} library(cloudml) cloudml_deploy("savedmodel", name = "keras_mnist", version = "keras_mnist_1") ``` The same HTTP POST request we used to test the model locally can be used to generate predictions on CloudML, provided the proper access to the CloudML API. Now that we've deployed a simple end-to-end example, we'll describe the process of [Model Export] and [Model Deployment] in more detail. ## Model Export TensorFlow SavedModel defines a language-neutral format to save machine-learned models that is recoverable and hermetic. It enables higher-level systems and tools to produce, consume and transform TensorFlow models. The `export_savedmodel()` function creates a SavedModel from a model trained using the keras, tfestimators, or tensorflow R packages. There are subtle differences in how this works in practice depending on the package you are using. ### keras The [Keras Example](#mnist-model) above includes complete example code for creating and using SavedModel instances from Keras so we won't repeat all of those details here. To export a TensorFlow SavedModel from a Keras model, simply call the `export_savedmodel()` function on any Keras model: ```{r} export_savedmodel(model, "savedmodel") ```
```
Keras learning phase set to 0 for export (restart R session before doing additional training)
```
Note the message that is printed: exporting a Keras model requires setting the Keras "learning phase" to 0. In practice, this means that after calling `export_savedmodel()` **you cannot continue to train models in the same R session**.
It is important to assign reasonable names to the first and last layers. For example, in the model code above we named the first layer "image" and the last layer "prediction":
```r
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784),
              name = "image") %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax',
              name = "prediction")
```
The layer names are reflected in the structure of REST requests and responses to and from the deployed model.
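For example, a request can be made from R itself with the httr and jsonlite packages (a sketch only; both packages are assumptions of this example rather than requirements of tfdeploy, and the `serve_savedmodel()` server is assumed to be running on its default port 8089):

```{r}
library(httr)
library(jsonlite)

# one pre-processed test image as an "instances" payload; the field name
# "image_input" derives from the first layer's name ("image")
body <- list(instances = list(list(image_input = as.numeric(x_test[1, ]))))

resp <- POST(
  "http://localhost:8089/serving_default/predict",
  body = toJSON(body, auto_unbox = TRUE),
  content_type_json()
)

# the "prediction" field in the response derives from the last layer's name
str(content(resp)$predictions[[1]]$prediction)
```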
### tfestimators
Exporting a TensorFlow SavedModel from a tfestimators model works the same way as exporting a Keras model: simply call `export_savedmodel()` on the estimator. Here is a complete example:
```{r}
library(tfestimators)

mtcars_input_fn <- function(data, num_epochs = 1) {
  input_fn(data,
           features = c("disp", "cyl"),
           response = "mpg",
           batch_size = 32,
           num_epochs = num_epochs)
}

cols <- feature_columns(column_numeric("disp"), column_numeric("cyl"))
model <- linear_regressor(feature_columns = cols)

indices <- sample(1:nrow(mtcars), size = 0.80 * nrow(mtcars))
train <- mtcars[indices, ]
test <- mtcars[-indices, ]

model %>% train(mtcars_input_fn(train, num_epochs = 10))

export_savedmodel(model, "savedmodel")
```
Generating predictions is done in the same way as with exported Keras models. First, use `serve_savedmodel()` to host the model locally. Once running, an HTTP POST request can be made:
```
curl -X POST "http://127.0.0.1:8089/predict/predict/" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d "{ \"instances\": [ { \"disp\": [ 160 ], \"cyl\": [ 4 ] } ]}"
```
Each instance of new data is an element of the `instances` JSON array, and each element is an object whose fields are named after the feature columns. This structure is similar to a named list in R.
The response is the predicted MPG:
```
{
  "predictions": [
    {
      "predictions": [
        8.4974
      ]
    }
  ]
}
```
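Since each instance mirrors a named list in R, the request body above can be generated with jsonlite (a sketch; jsonlite is an assumption of this example and is not required by tfdeploy):

```{r}
library(jsonlite)

# one instance of new data as a named list, mirroring the feature columns;
# wrapping each value in list() keeps it a JSON array rather than a scalar
instance <- list(disp = list(160), cyl = list(4))

# produces: {"instances":[{"disp":[160],"cyl":[4]}]}
toJSON(list(instances = list(instance)), auto_unbox = TRUE)
```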
### tensorflow
The [tensorflow](https://tensorflow.rstudio.com/tensorflow) package provides a lower-level interface to the TensorFlow API. You can also use the `export_savedmodel()` function to export models created with this API; however, you need to provide some additional parameters indicating which tensors represent the inputs and outputs of your model.
For example, here's an MNIST model using the core TensorFlow API along with the requisite call to `export_savedmodel()`:
```{r}
library(tensorflow)

sess <- tf$Session()

datasets <- tf$contrib$learn$datasets
mnist <- datasets$mnist$read_data_sets("MNIST-data", one_hot = TRUE)

# Note that we define x as the input tensor and y as the output tensor
# that will contain the scores. These are referenced in export_savedmodel()
x <- tf$placeholder(tf$float32, shape(NULL, 784L))
W <- tf$Variable(tf$zeros(shape(784L, 10L)))
b <- tf$Variable(tf$zeros(shape(10L)))
y <- tf$nn$softmax(tf$matmul(x, W) + b)

y_ <- tf$placeholder(tf$float32, shape(NULL, 10L))
cross_entropy <- tf$reduce_mean(
  -tf$reduce_sum(y_ * tf$log(y), reduction_indices = 1L)
)

optimizer <- tf$train$GradientDescentOptimizer(0.5)
train_step <- optimizer$minimize(cross_entropy)

init <- tf$global_variables_initializer()
sess$run(init)

for (i in 1:1000) {
  batches <- mnist$train$next_batch(100L)
  batch_xs <- batches[[1]]
  batch_ys <- batches[[2]]
  sess$run(train_step,
           feed_dict = dict(x = batch_xs, y_ = batch_ys))
}

export_savedmodel(
  sess,
  "savedmodel",
  inputs = list(image_input = x),
  outputs = list(scores = y))
```
Once the model is exported, `serve_savedmodel()` can be used in the same way, and the same HTTP requests demonstrated in the Keras example will work against the tensorflow model.
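One detail to keep in mind: because the outputs were exported as `scores` in the call to `export_savedmodel()` above, we'd expect the response field to use that name rather than `prediction` (a hypothetical, abbreviated response):

```
{
  "predictions": [
    {
      "scores": [ 1.33e-24, ..., 1, ..., 2.03e-22 ]
    }
  ]
}
```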
## Model Deployment
There are a variety of ways to deploy a TensorFlow SavedModel, each of which is described below. Of the three methods described, two (CloudML and RStudio Connect) share the same REST interface that we have been using with `serve_savedmodel()` to test locally. The REST interface is described in detail here: