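Note: this post was written in April 2016. It no longer reflects TensorFlow and Keras best practices. Keras has since been integrated into TensorFlow. Please see the keras.io documentation for details.

A complete guide to using Keras as part of a TensorFlow workflow

If TensorFlow is your primary framework, and you are looking for a simple, high-level model definition interface to make your life easier, this tutorial is for you.

Keras layers and models are fully compatible with pure-TensorFlow tensors. As a result, Keras makes a great model definition add-on for TensorFlow, and can even be used alongside other TensorFlow libraries. Let's see how.

**Note that this tutorial assumes that you have configured Keras to use the TensorFlow backend (instead of Theano)**. Here are instructions on how to do this.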

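We will cover the following points:

I: Calling Keras layers on TensorFlow tensors

II: Using Keras models with TensorFlow

III: Multi-GPU and distributed training

IV: Exporting a model with TensorFlow-serving

I: Calling Keras layers on TensorFlow tensors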

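Let's start with a simple example: MNIST digit classification. We will build a TensorFlow digit classifier using a stack of Keras Dense layers (fully-connected layers).

We should start by creating a TensorFlow session and registering it with Keras. This means that Keras will use the session we registered to initialize all variables that it creates internally.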

import tensorflow as tf
sess = tf.Session()

from keras import backend as K
K.set_session(sess)

Now let's get started with our MNIST model. We can start building the classifier exactly as we would in TensorFlow:

# this placeholder will contain our input digits, as flat vectors
img = tf.placeholder(tf.float32, shape=(None, 784))

We can then use Keras layers to speed up the model definition process:

from keras.layers import Dense

# Keras layers can be called on TensorFlow tensors:
x = Dense(128, activation='relu')(img)  # fully-connected layer with 128 units and ReLU activation
x = Dense(128, activation='relu')(x)
preds = Dense(10, activation='softmax')(x)  # output layer with 10 units and a softmax activation
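To make concrete what each Dense call above adds to the graph, here is a minimal pure-Python sketch of the arithmetic a fully-connected layer stands for: `activation(dot(input, kernel) + bias)`. The helper names (`dense`, `relu`, `softmax`) and the tiny weight values are made up for illustration; this is not how Keras implements the layer.

```python
import math


def relu(values):
    # Element-wise max(0, v).
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the max before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, kernel, bias, activation):
    # kernel[i][j] is the weight from input i to output unit j.
    pre_activation = [
        sum(inputs[i] * kernel[i][j] for i in range(len(inputs))) + bias[j]
        for j in range(len(bias))
    ]
    return activation(pre_activation)


# Hypothetical weights for a 2 -> 2 -> 2 network.
features = [1.0, 2.0]
hidden = dense(features, [[1.0, -1.0], [0.5, 0.0]], [0.0, 0.5], relu)
probs = dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], softmax)
```

The softmax output is a probability distribution over classes, which is what the loss in the next step expects.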

We define the placeholder for the labels, and the loss function we will use:

labels = tf.placeholder(tf.float32, shape=(None, 10))

from keras.objectives import categorical_crossentropy
loss = tf.reduce_mean(categorical_crossentropy(labels, preds))
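For reference, the quantity that the loss above averages over the batch can be sketched in plain Python. This is a rough illustration, assuming one-hot labels and predictions that already sum to 1. The clipping constant `eps` is an assumption added here for numerical safety, and both function names are stand-ins rather than the Keras implementation.

```python
import math


def crossentropy_per_sample(y_true, y_pred, eps=1e-7):
    # -sum_k y_true[k] * log(y_pred[k]), with predictions clipped away from 0 and 1.
    return -sum(
        t * math.log(min(max(p, eps), 1.0 - eps))
        for t, p in zip(y_true, y_pred)
    )


def mean_crossentropy(labels, preds):
    # Average the per-sample losses over the batch (the role of tf.reduce_mean).
    per_sample = [crossentropy_per_sample(t, p) for t, p in zip(labels, preds)]
    return sum(per_sample) / len(per_sample)


demo_labels = [[1.0, 0.0], [0.0, 1.0]]
demo_preds = [[0.5, 0.5], [0.25, 0.75]]
demo_loss = mean_crossentropy(demo_labels, demo_preds)
```

With one-hot labels, only the predicted probability of the true class contributes, so the loss for each sample is simply the negative log of that probability.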

Let's train the model with a TensorFlow optimizer:

from tensorflow.examples.tutorials.mnist import input_data
mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

# Initialize all variables
init_op = tf.global_variables_initializer()
sess.run(init_op)

# Run training loop
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1]})
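Conceptually, each call to `train_step` in the loop above applies one gradient descent update to the model's variables. As a rough sketch of that update rule on a toy one-parameter loss (the quadratic loss, the learning rate of 0.1 and the helper name `gradient_descent` are all made up for illustration):

```python
def gradient_descent(grad_fn, w, learning_rate, steps):
    # Repeatedly move the parameter against the gradient of the loss.
    for _ in range(steps):
        w = w - learning_rate * grad_fn(w)
    return w


# Toy loss(w) = (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0),
                           w=0.0, learning_rate=0.1, steps=100)
```

In the real training loop the gradient is computed by TensorFlow for every trainable variable at once, on a fresh mini-batch at each step.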

We can now evaluate the model:

from keras.metrics import categorical_accuracy as accuracy

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels}))
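As a sanity check on what this metric measures, here is a small plain-Python sketch of categorical accuracy: a prediction counts as correct when its largest entry is at the same index as the 1 in the one-hot label. Depending on the Keras version, the metric may yield per-sample matches or their mean, so the sketch computes both. The helper names and sample values are hypothetical.

```python
def argmax(values):
    # Index of the largest entry (first one wins on ties).
    return max(range(len(values)), key=lambda i: values[i])


def per_sample_matches(y_true, y_pred):
    # 1.0 where the predicted class equals the labeled class, else 0.0.
    return [1.0 if argmax(t) == argmax(p) else 0.0
            for t, p in zip(y_true, y_pred)]


demo_labels = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
demo_preds = [[0.1, 0.2, 0.7], [0.3, 0.6, 0.1], [0.2, 0.5, 0.3]]

matches = per_sample_matches(demo_labels, demo_preds)
mean_accuracy = sum(matches) / len(matches)
```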

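In this case, we use Keras only as a syntactic shortcut to generate an op that maps some input tensors to some output tensors, and that's it. The optimization is done via a native TensorFlow optimizer rather than a Keras optimizer. We don't even use a Keras Model at all!

A note on the relative performance of native TensorFlow optimizers and Keras optimizers: there are slight speed differences between optimizing a model "the Keras way" and using a TensorFlow optimizer. Somewhat counter-intuitively, Keras seems faster most of the time, by 5-10%. However, these differences are small enough that, at the end of the day, it doesn't really matter whether you optimize your models via Keras optimizers or native TF optimizers.

Different behaviors during training and testing

Some Keras layers (e.g. Dropout, BatchNormalization) behave differently at training time and testing time. You can tell whether a layer uses the "learning phase" (train/test) by printing layer.uses_learning_phase, a boolean: True if the layer behaves differently in training mode and test mode, False otherwise.

If your model includes such layers, then you need to specify the value of the learning phase as part of feed_dict, so that your model knows whether to apply dropout, etc.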

The Keras learning phase (a scalar TensorFlow tensor) is accessible via the Keras backend:

from keras import backend as K
print(K.learning_phase())

To make use of the learning phase, simply pass the value "1" (training mode) or "0" (test mode) to feed_dict:

# train mode
train_step.run(feed_dict={img: batch[0], labels: batch[1], K.learning_phase(): 1})

For instance, here's how to add Dropout layers to our previous MNIST example:

from keras.layers import Dropout
from keras import backend as K

img = tf.placeholder(tf.float32, shape=(None, 784))
labels = tf.placeholder(tf.float32, shape=(None, 10))

x = Dense(128, activation='relu')(img)
x = Dropout(0.5)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.5)(x)
preds = Dense(10, activation='softmax')(x)

loss = tf.reduce_mean(categorical_crossentropy(labels, preds))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
with sess.as_default():
    for i in range(100):
        batch = mnist_data.train.next_batch(50)
        train_step.run(feed_dict={img: batch[0],
                                  labels: batch[1],
                                  K.learning_phase(): 1})

acc_value = accuracy(labels, preds)
with sess.as_default():
    print(acc_value.eval(feed_dict={img: mnist_data.test.images,
                                    labels: mnist_data.test.labels,
                                    K.learning_phase(): 0}))
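To see why the learning phase matters here, consider a plain-Python sketch of how a dropout op typically behaves. It assumes the common "inverted dropout" scheme, where kept activations are scaled by 1 / (1 - rate) during training so that nothing needs rescaling at test time. The function name, the explicit `learning_phase` argument and the seeded random generator are illustrative choices, not Keras API.

```python
import random


def dropout(values, rate, learning_phase, rng):
    # Test mode (learning_phase == 0): dropout is the identity.
    if learning_phase == 0:
        return list(values)
    # Training mode: keep each unit with probability (1 - rate) and rescale it.
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]

train_out = dropout(activations, rate=0.5, learning_phase=1, rng=rng)
test_out = dropout(activations, rate=0.5, learning_phase=0, rng=rng)
```

Feeding the wrong phase at evaluation time would randomly zero out activations and make the reported accuracy noisy and pessimistic.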

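Compatibility with name scopes, device scopes

Keras layers and models are fully compatible with TensorFlow name scopes. For instance, consider the following code snippet:

from keras.layers import LSTM

x = tf.placeholder(tf.float32, shape=(None, 20, 64))
with tf.name_scope('block1'):
    y = LSTM(32, name='mylstm')(x)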

The weights of our LSTM layer will then be named block1/mylstm_W_i, block1/mylstm_U_i, etc.

Similarly, device scopes work as you would expect:

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer will live on GPU:0

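Compatibility with graph scopes

Any Keras layer or model that you define inside a TensorFlow graph scope will have all of its variables and operations created as part of the specified graph. For instance, the following works as you would expect: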

from keras.layers import LSTM
import tensorflow as tf

my_graph = tf.Graph()
with my_graph.as_default():
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops / variables in the LSTM layer are created as part of our graph

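Compatibility with variable scopes

Variable sharing should be done by calling the same Keras layer (or model) instance multiple times, NOT via TensorFlow variable scopes. A TensorFlow variable scope has no effect on a Keras layer or model. For more information about weight sharing with Keras, please see the "weight sharing" section in the functional API guide.

To summarize quickly how weight sharing works in Keras: by reusing the same layer instance or model instance, you are sharing its weights. Here's a simple example: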

# instantiate a Keras layer
lstm = LSTM(32)

# instantiate two TF placeholders
x = tf.placeholder(tf.float32, shape=(None, 20, 64))
y = tf.placeholder(tf.float32, shape=(None, 20, 64))

# encode the two tensors with the *same* LSTM weights
x_encoded = lstm(x)
y_encoded = lstm(y)
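The same idea can be illustrated without TensorFlow at all. In the sketch below, `TinyDense` is a hypothetical stand-in for a Keras layer: its weights are created once and live on the instance, so every call to that instance reads the same weights, and a change to them is visible through every call site.

```python
class TinyDense(object):
    """Hypothetical stand-in for a layer that owns its weights."""

    def __init__(self, kernel, bias):
        self.kernel = kernel  # one weight vector per output unit
        self.bias = bias

    def __call__(self, inputs):
        # Every call reads the weights stored on this instance.
        return [sum(i * w for i, w in zip(inputs, weights)) + b
                for weights, b in zip(self.kernel, self.bias)]


shared = TinyDense(kernel=[[1.0, 2.0]], bias=[0.5])

# Two different inputs, encoded with the *same* weights.
x_encoded = shared([1.0, 1.0])
y_encoded = shared([0.0, 2.0])

# Updating the shared weights affects both call sites.
shared.bias[0] = 0.0
x_after = shared([1.0, 1.0])
y_after = shared([0.0, 2.0])
```

Creating a second `TinyDense` instance would instead give an independent set of weights, which is the analog of instantiating a second Keras layer.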

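Collecting trainable weights and state updates

Some Keras layers (stateful RNNs and BatchNormalization layers) have internal updates that need to be run as part of each training step. These are stored as a list of tensor tuples, layer.updates. You should generate assign ops for them, to be run at each training step. Here's an example:

from keras.layers import BatchNormalization

layer = BatchNormalization()  # keep a handle on the layer instance itself
y = layer(x)  # calling the layer returns its output tensor

update_ops = []
for old_value, new_value in layer.updates:
    update_ops.append(tf.assign(old_value, new_value))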
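To give a feel for what such an update pair represents, here is a plain-Python sketch of a typical batch-normalization state update: an exponential moving average of the batch mean, stored as (name, new value) pairs and applied after each training step. The momentum of 0.9, the `state` dictionary and the function names are assumptions made for illustration; the exact formula used by a given Keras version may differ.

```python
def moving_average(old_value, batch_value, momentum):
    # Blend the stored statistic with the statistic of the current batch.
    return momentum * old_value + (1.0 - momentum) * batch_value


state = {'moving_mean': 0.0}
history = []


def training_step(batch, momentum=0.9):
    batch_mean = sum(batch) / len(batch)
    # Analog of layer.updates: a list of (variable, new value) pairs.
    updates = [('moving_mean',
                moving_average(state['moving_mean'], batch_mean, momentum))]
    # Analog of running the assign ops as part of the training step.
    for name, new_value in updates:
        state[name] = new_value
    history.append(state['moving_mean'])


for _ in range(200):
    training_step([1.0, 3.0])
```

If the updates are never run, the stored statistics stay at their initial values, and test-time normalization uses the wrong mean and variance.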

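Note that if you are using a Keras model (Model instance or Sequential instance), model.updates behaves in the same way (and collects the updates of all underlying layers in the model).

In addition, in case you need to explicitly collect a layer's trainable weights, you can do so via layer.trainable_weights (or model.trainable_weights), a list of TensorFlow Variable instances:

from keras.layers import Dense

layer = Dense(32)  # instantiate a layer
y = layer(x)  # call it on a tensor, which creates its weights
print(layer.trainable_weights)  # list of TensorFlow Variables

Knowing this allows you to implement your own training routine based on a TensorFlow optimizer.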

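II: Using Keras models with TensorFlow

Converting a Keras `Sequential` model for use in a TensorFlow workflow

You have found a Keras Sequential model that you want to reuse in your TensorFlow project (consider, for instance, this VGG16 image classifier with pre-trained weights). How to proceed?

First of all, note that if your pre-trained weights include convolutions (layers Convolution2D or Convolution1D) that were trained with Theano, you need to flip the convolution kernels when loading the weights. This is because Theano and TensorFlow implement convolution in different ways (TensorFlow actually implements correlation, much like Caffe). Here's a short guide on what you need to do in this case.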
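The relationship between the two operations is easy to check on a 1-D toy case: a true convolution gives the same result as a correlation applied with the kernel reversed. The sketch below uses plain Python lists and "valid" windows only; the function names and values are made up for illustration. For 2-D kernels the same reasoning applies with the kernel flipped along both spatial axes.

```python
def correlate1d(signal, kernel):
    # Slide the kernel over the signal without reversing it.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def convolve1d(signal, kernel):
    # True convolution: the kernel is reversed before sliding.
    k = len(kernel)
    return [sum(signal[i + j] * kernel[k - 1 - j] for j in range(k))
            for i in range(len(signal) - k + 1)]


signal = [1.0, 2.0, 4.0, 8.0]
kernel = [1.0, 0.0, -1.0]

conv_out = convolve1d(signal, kernel)
corr_out = correlate1d(signal, kernel)
corr_flipped_out = correlate1d(signal, kernel[::-1])
```

Here `conv_out` and `corr_flipped_out` agree, while `corr_out` differs, which is why weights learned under one convention must be flipped before being used under the other.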

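Let's say that you are starting from the following Keras model, and that you want to modify it so that it takes as input a specific TensorFlow tensor, custom_input_tensor. This input tensor could be a data feeder op, for instance, or the output of a previous TensorFlow model.

from keras.models import Sequential
from keras.layers import Dense

# this is our initial Keras model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))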

You just need to use keras.layers.InputLayer to start building your Sequential model on top of a custom TensorFlow placeholder, then build the rest of the model on top of it:

from keras.layers import InputLayer

# this is our modified Keras model
model = Sequential()
model.add(InputLayer(input_tensor=custom_input_tensor,
                     input_shape=(None, 784)))

# build the rest of the model as before
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))

At this stage, you can call model.load_weights(weights_file) to load your pre-trained weights.

Then you will probably want to collect the Sequential model's output tensor:

output_tensor = model.output

You can now add new TensorFlow ops on top of output_tensor, etc.

Calling a Keras model on a TensorFlow tensor

A Keras model acts the same as a layer, and thus can be called on TensorFlow tensors:

from keras.models import Sequential

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

# this works! 
x = tf.placeholder(tf.float32, shape=(None, 784))
y = model(x)

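Note: by calling a Keras model, you are reusing both its architecture and its weights. When you call a model on a tensor, you are creating new TF ops on top of the input tensor, and these ops reuse the TF Variable instances already present in the model.

III: Multi-GPU and distributed training

Assigning part of a Keras model to different GPUs

TensorFlow device scopes are fully compatible with Keras layers and models, hence you can use them to assign specific parts of a graph to different GPUs. Here's a simple example: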

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/gpu:1'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:1

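Note that the variables created by the LSTM layers will not live on GPU: all TensorFlow variables always live on CPU, independently of the device scope where they were created. TensorFlow handles device-to-device variable transfer behind the scenes.

If you want to train multiple replicas of the same model on different GPUs, while sharing the same weights across the replicas, you should first instantiate your model (or layers) under one device scope, then call the same model instance multiple times in different GPU device scopes, such as: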

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 784))

    # shared model living on CPU:0
    # it won't actually be run during training; it acts as an op template
    # and as a repository for shared variables
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=784))
    model.add(Dense(10, activation='softmax'))

# replica 0
with tf.device('/gpu:0'):
    output_0 = model(x)  # all ops in the replica will live on GPU:0

# replica 1
with tf.device('/gpu:1'):
    output_1 = model(x)  # all ops in the replica will live on GPU:1

# merge outputs on CPU
with tf.device('/cpu:0'):
    preds = 0.5 * (output_0 + output_1)

# we only run the `preds` tensor, so that only the two
# replicas on GPU get run (plus the merge op on CPU)
output_value = sess.run([preds], feed_dict={x: data})

Distributed training

You can trivially make use of TensorFlow distributed training by registering with Keras a TF session linked to a cluster:

server = tf.train.Server.create_local_server()
sess = tf.Session(server.target)

from keras import backend as K
K.set_session(sess)

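For more information about using TensorFlow in a distributed setting, see this tutorial.

IV: Exporting a model with TensorFlow-serving

TensorFlow Serving is a library for serving TensorFlow models in a production setting, developed by Google.

Any Keras model can be exported with TensorFlow-serving (as long as it only has one input and one output, which is a limitation of TF-serving), whether or not it was trained as part of a TensorFlow workflow. In fact, you could even train your Keras model with Theano, then switch to the TensorFlow Keras backend and export your model.

Here's how it works.

If your graph makes use of the Keras learning phase (different behavior at training time and test time), the very first thing to do before exporting your model is to hard-code the value of the learning phase (presumably 0, i.e. test mode) into your graph. This is done by 1) registering a constant learning phase with the Keras backend, and 2) re-building your model afterwards.

Here are these two simple steps in action: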

from keras import backend as K

K.set_learning_phase(0)  # all new operations will be in test mode from now on

# serialize the model and get its weights, for quick re-building
config = previous_model.get_config()
weights = previous_model.get_weights()

# re-build a model where the learning phase is now hard-coded to 0
from keras.models import model_from_config
new_model = model_from_config(config)
new_model.set_weights(weights)

We can now use TensorFlow-serving to export the model, following the instructions found in the official tutorial:

from tensorflow_serving.session_bundle import exporter

export_path = ... # where to save the exported graph
export_version = ... # version number (integer)

saver = tf.train.Saver(sharded=True)
model_exporter = exporter.Exporter(saver)
# export the re-built model, whose learning phase is hard-coded to 0
signature = exporter.classification_signature(input_tensor=new_model.input,
                                              scores_tensor=new_model.output)
model_exporter.init(sess.graph.as_graph_def(),
                    default_graph_signature=signature)
model_exporter.export(export_path, tf.constant(export_version), sess)

Want to see a new topic covered in this guide? Reach out on Twitter.
