Use platform_config_file to configure the TensorFlow session in TensorFlow Serving

Jun 29, 2021

The typical way of running TensorFlow Serving is to use the TensorFlow Serving Docker container or the tensorflow_model_server binary. Both accept a list of arguments that configure how the server runs. For example, you might start your TensorFlow Serving server like this:

tensorflow_model_server --port=8500 --rest_api_port=8501 \
  --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}

To see a full list of arguments you can pass, check out the source code.

However, the list of arguments is missing something. The common options you would typically use in TensorFlow itself to configure a session, such as allow_soft_placement or allow_growth, are nowhere to be found. This limits our ability to configure the TensorFlow session when using TensorFlow Serving. In this post, we will look at what those needs are and how to use platform_config_file to address them.

The missing session configurations

The arguments that TensorFlow Serving currently provides give you a good amount of flexibility when starting your TF Serving server. However, some TensorFlow session configurations are not available, including:

tf.config.set_soft_device_placement or allow_soft_placement - for configuring the TensorFlow session to automatically choose an existing, supported device to run an operation when the specified one doesn't exist
tf.config.experimental.set_memory_growth or allow_growth - for configuring the TensorFlow session to not allocate all memory on the device up front but to grow its allocation as needed
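tf.config.experimental.set_virtual_device_configuration or per_process_gpu_memory_fraction - for configuring the TensorFlow session to set a limit on memory usage on the device (tensorflow_model_server does also accept a --per_process_gpu_memory_fraction flag for this one)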

Why would this matter? Suppose you have an ML model that contains many operations, some of which can only run on a CPU device. If you want to serve that model with TensorFlow Serving on a GPU device without soft placement enabled (the equivalent of tf.config.set_soft_device_placement(True)), the TF Serving server will fail to start because it cannot place those CPU-only operations correctly. The same applies when you want to customize GPU memory usage. If you don't want the server to allocate all memory on the default GPU device at startup, you need the equivalent of tf.config.experimental.set_memory_growth so that TensorFlow grows its memory usage only as needed.

The solution? Use the platform_config_file argument.

What is platform_config_file?

platform_config_file is one of the arguments that the TensorFlow Serving server accepts. You can find it in the source code, which describes it as follows:

If non-empty, read an ascii PlatformConfigMap protobuf from the supplied file name, and use that platform config instead of the Tensorflow platform.

Essentially, with platform_config_file we stop relying on the convenience arguments that would otherwise build the TensorFlow platform configuration for us, and supply that configuration directly. This gives us the ability to configure the TensorFlow session directly as well.

Let's take a look at an example of platform_config_file:

platform_configs {
  key: "tensorflow"
  value {
    source_adapter_config {
      [type.googleapis.com/tensorflow.serving.SavedModelBundleSourceAdapterConfig] {
        legacy_config {
          session_config {
            gpu_options {
              per_process_gpu_memory_fraction: 0.4
              allow_growth: true
            }
            allow_soft_placement: true
          }
          enable_model_warmup: true
        }
      }
    }
  }
}

As you can see, the session_config object accepts the typical TensorFlow session configurations, just as when using standalone TensorFlow. With platform_config_file, those settings are passed along to configure the TensorFlow runtime session inside TensorFlow Serving.
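The per_process_gpu_memory_fraction value is a fraction of the GPU's total memory, so an absolute memory budget has to be converted before it goes into the file. The following is a minimal sketch of that arithmetic. The function name and the 16 GiB card are assumptions made for illustration and are not part of TensorFlow Serving.

```python
def gpu_memory_fraction(limit_mib: float, total_mib: float) -> float:
    """Convert an absolute memory budget (MiB) into a fraction of total GPU memory."""
    if total_mib <= 0:
        raise ValueError("total_mib must be positive")
    if not 0 < limit_mib <= total_mib:
        raise ValueError("limit_mib must be greater than 0 and at most total_mib")
    return limit_mib / total_mib


# Hypothetical 16 GiB GPU, used only as an example.
total_mib = 16 * 1024

# Budget in MiB implied by the 0.4 used in the example config.
print(round(0.4 * total_mib))

# Fraction to write into the config for a 4 GiB budget.
print(gpu_memory_fraction(4096, total_mib))
```

On such a card, the 0.4 in the example limits TensorFlow to roughly 6.4 GiB on that device, and allow_growth means the memory is claimed gradually rather than all at once.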

To use platform_config_file in TensorFlow Serving, pass the argument and point it to your config file when starting the TensorFlow Serving server:

tensorflow_model_server --port=8500 --rest_api_port=8501 \
  --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} \
  --platform_config_file=[PATH_TO_PLATFORM_CONFIG_FILE]
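If the server is launched from a script rather than typed by hand, the same command can be assembled in Python. This is only a sketch: the helper name, the model name, and the paths are hypothetical, and the flags are limited to the ones shown in the command above.

```python
import shlex


def build_server_command(model_name, model_base_path, platform_config_file,
                         port=8500, rest_api_port=8501):
    """Return the argv list for tensorflow_model_server with a platform config file."""
    return [
        "tensorflow_model_server",
        f"--port={port}",
        f"--rest_api_port={rest_api_port}",
        f"--model_name={model_name}",
        # Mirrors ${MODEL_BASE_PATH}/${MODEL_NAME} from the shell command.
        f"--model_base_path={model_base_path}/{model_name}",
        f"--platform_config_file={platform_config_file}",
    ]


# Hypothetical model name and paths, for illustration only.
cmd = build_server_command("my_model", "/models", "/config/platform.conf")
print(shlex.join(cmd))

# To actually start the server, pass the list to subprocess.run(cmd, check=True).
```

Building an argument list instead of a single string avoids shell quoting problems if a path contains spaces.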

How to create a platform_config_file?

Creating a platform_config_file is straightforward as well. There are two ways, depending on your preference.

The manual way

As the previous example of platform_config_file shows, the file is simple, and you can easily add new fields and parameters. For all the parameters you can specify in platform_config_file, check out the source code file. A quick tip: the source code uses a data format called Protocol Buffers, and the config file itself is written in the Protocol Buffers text format. It looks like JSON, but it's not. Protocol Buffers (also called Protobuf) is a free and open-source cross-platform library for serializing structured data, developed by Google. It's not surprising that the TensorFlow code base uses Protobuf, since TensorFlow was also developed by Google.

In short, to create your own platform_config_file, you can copy the example from the section above and add or delete fields as you need. This is the manual way.
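If you copy the example often, a small template script can fill in the values for you. The sketch below uses plain string templating, not the protobuf library, and it covers only the four fields from the example in this post. The function name and its defaults are assumptions made for illustration.

```python
TEMPLATE = """\
platform_configs {{
  key: "tensorflow"
  value {{
    source_adapter_config {{
      [type.googleapis.com/tensorflow.serving.SavedModelBundleSourceAdapterConfig] {{
        legacy_config {{
          session_config {{
            gpu_options {{
{gpu_options}
            }}
            allow_soft_placement: {allow_soft_placement}
          }}
          enable_model_warmup: {enable_model_warmup}
        }}
      }}
    }}
  }}
}}
"""


def render_platform_config(allow_soft_placement=True, allow_growth=True,
                           per_process_gpu_memory_fraction=None,
                           enable_model_warmup=True):
    """Render the example platform config as protobuf text-format output."""
    def flag(value):
        # Protobuf text format spells booleans in lowercase.
        return "true" if value else "false"

    gpu_lines = []
    if per_process_gpu_memory_fraction is not None:
        if not 0 < per_process_gpu_memory_fraction <= 1:
            raise ValueError("fraction must be greater than 0 and at most 1")
        gpu_lines.append(
            f"per_process_gpu_memory_fraction: {per_process_gpu_memory_fraction}")
    gpu_lines.append(f"allow_growth: {flag(allow_growth)}")

    indent = " " * 14  # matches the nesting depth inside gpu_options
    return TEMPLATE.format(
        gpu_options="\n".join(indent + line for line in gpu_lines),
        allow_soft_placement=flag(allow_soft_placement),
        enable_model_warmup=flag(enable_model_warmup),
    )


text = render_platform_config(per_process_gpu_memory_fraction=0.4)
print(text)
```

Saving the printed text to a file gives you something you can pass to --platform_config_file. Because the script does not validate field names against the Protobuf definitions, any field you add to the template still needs to be checked against the source code.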

The programmatic way

Now, if dealing with all the parameters in the source code seems overwhelming, there is a programmatic way to create your platform_config_file. TensorFlow lets you define all your session configurations in code and then convert them into the platform_config_file format. If that's something you want to explore, I recommend checking out this code snippet for generating platform_config_file.

Final Words

Not being able to configure the TensorFlow runtime session when using TensorFlow Serving can be limiting, particularly when you are switching from standalone TensorFlow to TensorFlow Serving and need to tune the runtime session or GPU options. In this article, we looked at how to use platform_config_file to address this limitation and configure the TensorFlow runtime session in TensorFlow Serving. TensorFlow Serving is still evolving. Hopefully, newer releases will provide an easier way to pass runtime session configurations.