TensorFlow 2.0 Tutorial: Custom Layers

TensorFlow 2.0 recommends tf.keras as the high-level API for building neural networks. That said, most TensorFlow APIs can also be used in eager execution mode.


The complete code for this TensorFlow 2.0 tutorial series is available at github.com/czy36mengfei (stars are welcome).

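This tutorial is compiled mainly from my personal study notes on reproducing the official TensorFlow 2.0 tutorials. It was originally written with Chinese explanations for readers who prefer Chinese-language material. Official tutorials: https://www.tensorflow.org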

1. Common Layer Operations

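A machine learning model can usually be expressed as a stack or combination of simple layers, and TensorFlow provides the common layers, which makes writing neural network programs much easier. TensorFlow 2 recommends building layers with tf.keras. tf.keras derives from the original Keras, and networks built with it are more readable and easier to work with.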

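For example, to build a simple fully connected layer, we only need to specify the number of neurons (output units):

import tensorflow as tf

layer = tf.keras.layers.Dense(100)
# An input shape can also be specified up front
layer = tf.keras.layers.Dense(100, input_shape=(None, 20))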

The full list of built-in layers can be found in the documentation. It includes Dense, Conv2D, LSTM, BatchNormalization, Dropout, and many others.

Every layer can be used like a function: call it with the input data as its argument.

layer(tf.ones([6, 6]))
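As a small illustrative sketch (not part of the original tutorial), the snippet below shows what calling a layer does: a Dense layer creates its variables lazily on the first call, once it has seen the input shape. The variable names weights_before and weights_after are introduced here only for illustration.

```python
import tensorflow as tf

layer = tf.keras.layers.Dense(100)

# Before the first call the layer has not been built and holds no variables.
built_before = layer.built
weights_before = len(layer.weights)

# Calling the layer builds it from the input shape, then runs the forward pass.
x = tf.ones([6, 6])
y = layer(x)

built_after = layer.built
weights_after = len(layer.weights)   # kernel and bias

print(built_before, weights_before)
print(built_after, weights_after)
print(y.shape)
```

The output has one row per input row and one column per unit, which is why a [6, 6] input produces a [6, 100] result.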

We can also inspect the layer's variables, such as its weight matrix and bias.

print(layer.variables)  # contains both the weights and the bias
[<tf.Variable dense_1/kernel:0 shape=(6, 100) dtype=float32, numpy=
array([[-0.18237606, 0.16415142, 0.20687856, 0.23396944, 0.09779547,
-0.14794639, -0.10231382, -0.22263053, -0.0950674 , 0.18697281,
...
-0.10450925, -0.12111329, -0.2259491 , 0.12304659, -0.04047236]],
dtype=float32)>, <tf.Variable dense_1/bias:0 shape=(100,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
dtype=float32)>]
print(layer.kernel, layer.bias)  # the weights and the bias can also be accessed separately
<tf.Variable dense_1/kernel:0 shape=(6, 100) dtype=float32, numpy=
array([[-0.18237606, 0.16415142, 0.20687856, 0.23396944, 0.09779547,
-0.14794639, -0.10231382, -0.22263053, -0.0950674 , 0.18697281,
...
-0.10348159, -0.07493602, -0.1722112 , -0.23290877, 0.18784209],
[ 0.13477843, 0.11936818, -0.21257897, 0.21244659, -0.18786472,
-0.06494723, -0.07063387, -0.07994832, -0.11256738, -0.22335076,
-0.02153319, -0.20943552, -0.21425952, -0.12278055, -0.00619341,
-0.09176037, -0.1766775 , -0.21622379, -0.04250833, 0.23764552,
0.21168886, 0.09459655, -0.07919639, -0.21559525, 0.20465617,
-0.20613717, 0.13103445, 0.21384992, 0.04693423, 0.20122723,
0.12190209, 0.22194327, -0.05410977, -0.1792583 , -0.03342254,
0.09272121, 0.06039228, 0.09666802, -0.22759588, -0.14688678,
0.12520896, 0.15474696, -0.23104139, 0.18017791, -0.02388267,
-0.01371126, 0.2352383 , -0.10501392, 0.01626216, -0.14222105,
0.13740788, 0.18499441, 0.03618436, -0.01862051, -0.1401035 ,
-0.01304157, -0.04905747, -0.07051091, 0.10759439, -0.08964662,
-0.01344521, -0.17841959, -0.17568308, -0.12892699, 0.11976974,
0.02280475, 0.16669382, 0.21027894, 0.21428709, -0.04820213,
-0.22136293, -0.13934767, 0.142024 , -0.07064074, 0.1470062 ,
0.00042979, -0.2371952 , -0.06649312, 0.10123204, -0.20473264,
-0.09161748, 0.20804678, -0.22195774, -0.09219673, 0.02311908,
0.13456099, 0.14470674, -0.05369592, 0.02126037, 0.0682667 ,
0.08384518, 0.17998771, -0.1927835 , -0.11473013, -0.01386146,
-0.10450925, -0.12111329, -0.2259491 , 0.12304659, -0.04047236]],
dtype=float32)> <tf.Variable dense_1/bias:0 shape=(100,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
dtype=float32)>
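To make the role of these variables concrete, here is a minimal sketch that recomputes a Dense layer's output by hand from its kernel and bias. It assumes the default Dense configuration (no activation, bias enabled). The small layer size and the names manual and matches are chosen here for illustration.

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(4)
x = tf.random.normal([3, 5])
y = layer(x)

# Recompute the forward pass from the layer's own variables: y = x @ kernel + bias.
kernel = layer.kernel.numpy()   # shape (5, 4): input features x units
bias = layer.bias.numpy()       # shape (4,): one bias per unit
manual = x.numpy() @ kernel + bias

# Loose tolerance to allow for float32 rounding differences.
matches = np.allclose(y.numpy(), manual, atol=1e-3)
print(kernel.shape, bias.shape, matches)
```

The kernel's first dimension is taken from the last dimension of the input, which is why the kernel printed above has shape (6, 100) after the layer was called on a [6, 6] tensor.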

2. Implementing Custom Layers

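The best way to implement your own layer is to extend the tf.keras.layers.Layer class and implement:

  • __init__(), where you can do all input-independent initialization
  • build(), where you know the shape of the input tensors and can do the rest of the initialization
  • call(), where you define the forward computation

Strictly speaking, you do not have to wait until build() is called to create your variables; you can also create them in __init__(). The advantage of creating them in build() is that variable creation is deferred until the layer knows the shape of the inputs it will operate on. Creating variables in __init__(), by contrast, means the shapes needed to create them must be specified explicitly.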

class MyDense(tf.keras.layers.Layer):
    def __init__(self, n_outputs):
        super(MyDense, self).__init__()
        self.n_outputs = n_outputs

    def build(self, input_shape):
        # Create the kernel once the size of the last input dimension is known.
        # add_weight replaces the deprecated add_variable alias.
        self.kernel = self.add_weight(name="kernel",
                                      shape=[int(input_shape[-1]),
                                             self.n_outputs])

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

layer = MyDense(10)
print(layer(tf.ones([6, 5])))
print(layer.trainable_variables)
tf.Tensor(
[[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]
[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]
[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]
[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]
[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]
[ 1.0200843 -0.42590106 -0.92992705 0.46160045 0.7518406 0.32543844
0.34020287 0.08215448 0.22044104 -0.5337319 ]], shape=(6, 10), dtype=float32)
[<tf.Variable my_dense/kernel:0 shape=(5, 10) dtype=float32, numpy=
array([[ 0.54810244, 0.042225 , 0.25634396, 0.1677258 , -0.0361526 ,
0.32831818, 0.17709464, 0.46625894, 0.29662275, -0.32920587],
[ 0.30925363, -0.426274 , -0.49862564, 0.3068235 , 0.29526353,
0.50076336, 0.17321467, 0.21151704, -0.26317668, -0.2006711 ],
[ 0.10354012, -0.3258371 , -0.12274069, -0.33250242, 0.46343058,
-0.45535576, 0.5332853 , -0.37351888, -0.00410944, 0.16418225],
[-0.4515978 , 0.04706419, -0.42583126, -0.19347438, 0.54246336,
0.57910997, 0.01877069, 0.01255274, -0.14176458, -0.6309683 ],
[ 0.5107859 , 0.23692083, -0.13907343, 0.51302797, -0.5131643 ,
-0.6273973 , -0.56216246, -0.23465535, 0.332869 , 0.4629311 ]],
dtype=float32)>]
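The trade-off between creating variables in __init__() and in build() can be sketched as follows. FixedDense and DeferredDense are hypothetical names introduced only for this comparison. The first has to be told the input size up front, while the second infers it on the first call.

```python
import tensorflow as tf

class FixedDense(tf.keras.layers.Layer):
    """Hypothetical variant: the kernel is created in __init__,
    so the input size must be passed explicitly."""
    def __init__(self, n_inputs, n_outputs):
        super().__init__()
        self.kernel = self.add_weight(name="kernel",
                                      shape=(n_inputs, n_outputs))

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

class DeferredDense(tf.keras.layers.Layer):
    """Hypothetical variant: the kernel is created in build(),
    once the input shape is known."""
    def __init__(self, n_outputs):
        super().__init__()
        self.n_outputs = n_outputs

    def build(self, input_shape):
        self.kernel = self.add_weight(name="kernel",
                                      shape=(int(input_shape[-1]), self.n_outputs))

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

fixed = FixedDense(5, 10)
deferred = DeferredDense(10)

# The fixed layer already owns its kernel; the deferred one has nothing yet.
fixed_count_before = len(fixed.weights)
deferred_count_before = len(deferred.weights)

out_fixed = fixed(tf.ones([6, 5]))
out_deferred = deferred(tf.ones([6, 5]))   # build() runs here

print(fixed_count_before, deferred_count_before)
print(out_fixed.shape, out_deferred.shape, deferred.kernel.shape)
```

The deferred version can be reused for any input width without changing its constructor arguments, which is the main reason build() is usually preferred.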

3. Composing Layers

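Many machine learning models are built by stacking different kinds of layers. Each residual block in ResNet, for example, is a combination of convolution, batch normalization, and a residual (shortcut) connection.

In TensorFlow 2, a structure that contains multiple layers is usually created by subclassing tf.keras.Model.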

# Residual block
class ResnetBlock(tf.keras.Model):
    def __init__(self, kernel_size, filters):
        super(ResnetBlock, self).__init__(name="resnet_block")

        # Number of filters in each sub-layer
        filter1, filter2, filter3 = filters

        # Three sub-layers, each one convolution followed by batch normalization
        # First sub-layer: 1x1 convolution
        self.conv1 = tf.keras.layers.Conv2D(filter1, (1, 1))
        self.bn1 = tf.keras.layers.BatchNormalization()
        # Second sub-layer: uses the given kernel_size
        self.conv2 = tf.keras.layers.Conv2D(filter2, kernel_size, padding="same")
        self.bn2 = tf.keras.layers.BatchNormalization()
        # Third sub-layer: 1x1 convolution
        self.conv3 = tf.keras.layers.Conv2D(filter3, (1, 1))
        self.bn3 = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=False):

        # Stack the sub-layers
        x = self.conv1(inputs)
        x = self.bn1(x, training=training)

        x = self.conv2(x)
        x = self.bn2(x, training=training)

        x = self.conv3(x)
        x = self.bn3(x, training=training)

        # Residual connection (filter3 must equal the number of input channels)
        x += inputs
        outputs = tf.nn.relu(x)

        return outputs

resnetBlock = ResnetBlock(2, [6, 4, 9])
# Test with sample data
print(resnetBlock(tf.ones([1, 3, 9, 9])))
# Inspect the names of the variables in the block
print([x.name for x in resnetBlock.trainable_variables])
tf.Tensor(
[[[[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.83203167 0.9436392 1.0989372 1.2588525 0.8683256 1.1279813
0.7571581 0.47963202 0.88908756]]

[[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.79764616 1.0550306 0.9386751 1.1079601 0.9402881 0.99479383
0.9072118 0.5618475 0.9134829 ]
[0.83203167 0.9436392 1.0989372 1.2588525 0.8683256 1.1279813
0.7571581 0.47963202 0.88908756]]

[[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.72680396 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.7268039 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[1.0775117 1.1620466 0.7268039 1.0019443 1.2767658 1.1365149
1.1792164 1.0868194 1.0623009 ]
[0.87889266 0.9541194 0.8929231 0.96703756 1.0905087 1.0646607
0.9235744 0.9829142 1.1302696 ]]]], shape=(1, 3, 9, 9), dtype=float32)
[resnet_block/conv2d_12/kernel:0, resnet_block/conv2d_12/bias:0, resnet_block/batch_normalization_v2_12/gamma:0, resnet_block/batch_normalization_v2_12/beta:0, resnet_block/conv2d_13/kernel:0, resnet_block/conv2d_13/bias:0, resnet_block/batch_normalization_v2_13/gamma:0, resnet_block/batch_normalization_v2_13/beta:0, resnet_block/conv2d_14/kernel:0, resnet_block/conv2d_14/bias:0, resnet_block/batch_normalization_v2_14/gamma:0, resnet_block/batch_normalization_v2_14/beta:0]
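The training argument passed to each BatchNormalization layer in the block above changes which statistics are used. The sketch below illustrates this on a single standalone layer. It assumes the default BatchNormalization settings (moving mean initialized to 0, moving variance to 1, gamma 1, beta 0), and the input values are arbitrary.

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = tf.constant([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])

# training=False: normalize with the stored moving statistics.
# With the initial values (mean 0, variance 1) this is just x / sqrt(1 + epsilon).
y_infer = bn(x, training=False).numpy()
expected_infer = x.numpy() / np.sqrt(1.0 + bn.epsilon)

# training=True: normalize with the mean and variance of the current batch,
# so every feature column ends up with a mean close to zero.
y_train = bn(x, training=True).numpy()

print(np.allclose(y_infer, expected_infer, rtol=1e-4, atol=1e-4))
print(np.abs(y_train.mean(axis=0)).max())
```

Forwarding training through call(), as ResnetBlock does, keeps every inner BatchNormalization layer in the same mode as the model that contains it.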

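If the model is a plain linear stack of layers, where each layer feeds directly into the next, it can be built directly with tf.keras.Sequential.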

seq_model = tf.keras.Sequential(
    [
        tf.keras.layers.Conv2D(1, 1, input_shape=(None, None, 3)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(2, 1, padding="same"),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(3, 1),
        tf.keras.layers.BatchNormalization(),
    ])
seq_model(tf.ones([1, 2, 3, 3]))
<tf.Tensor: id=1354, shape=(1, 2, 3, 3), dtype=float32, numpy=
array([[[[-0.36850607, -0.60731524, 1.2792252 ],
[-0.36850607, -0.60731524, 1.2792252 ],
[-0.36850607, -0.60731524, 1.2792252 ]],

[[-0.36850607, -0.60731524, 1.2792252 ],
[-0.36850607, -0.60731524, 1.2792252 ],
[-0.36850607, -0.60731524, 1.2792252 ]]]], dtype=float32)>
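As a rough sketch of what Sequential does, the example below checks that calling a Sequential model gives the same result as calling its layers one after another. The two Dense layers and their sizes are arbitrary choices made here for illustration.

```python
import numpy as np
import tensorflow as tf

dense1 = tf.keras.layers.Dense(8, activation="relu")
dense2 = tf.keras.layers.Dense(2)

# Sequential simply feeds the output of each layer into the next one.
model = tf.keras.Sequential([dense1, dense2])

x = tf.ones([4, 3])
y_model = model(x).numpy()

# Chaining the same layer objects by hand reuses the same weights.
y_manual = dense2(dense1(x)).numpy()

same = np.allclose(y_model, y_manual, atol=1e-5)
print(len(model.layers), y_model.shape, same)
```

Models with branches or shortcut connections, such as the residual block above, cannot be expressed this way and need a subclassed tf.keras.Model instead.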
