[Study Notes] Dropout in cs231n assignment2

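Preface:

These are only my own study notes; if they infringe on anyone's rights, please let me know.
The lecture material follows the Chinese translation of cs231n by Du Ke (杜客) and others.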

The main regularization methods for neural networks are L2 regularization, L1 regularization, max norm constraints, and dropout.

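First, a brief introduction to the other regularization methods:

(1) L2 regularization: probably the most commonly used form of regularization. For every weight $W_{i}$, the term $\frac{1}{2}\lambda W_i^{2}$ is added to the objective, so its gradient with respect to $W_{i}$ is $\lambda W_{i}$.

(2) L1 regularization: for every weight $W_{i}$, the term $\lambda \left| W_{i} \right|$ is added to the objective. L1 and L2 regularization can be combined.

In general, L2 regularization tends to work better than L1 regularization.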
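The two penalty terms above, and their gradients, can be sketched in a few lines of NumPy. This is only an illustration: the function names `l2_penalty` and `l1_penalty`, the weight matrix, and the value of `lam` are assumptions made for the example and are not part of the assignment code. For L1 the function returns a subgradient, since $\left| W_{i} \right|$ is not differentiable at 0.

```python
import numpy as np

def l2_penalty(W, lam):
    """Return the L2 penalty 0.5 * lam * sum(W**2) and its gradient lam * W."""
    return 0.5 * lam * np.sum(W ** 2), lam * W

def l1_penalty(W, lam):
    """Return the L1 penalty lam * sum(|W|) and a subgradient lam * sign(W)."""
    return lam * np.sum(np.abs(W)), lam * np.sign(W)

W = np.array([[1.0, -2.0], [0.5, 0.0]])  # hypothetical weights
lam = 0.1                                # illustrative regularization strength

l2_value, l2_grad = l2_penalty(W, lam)
l1_value, l1_grad = l1_penalty(W, lam)

# Using both penalties together simply adds the two terms and their gradients.
total_penalty = l1_value + l2_value
total_grad = l1_grad + l2_grad
```

Note that the L2 gradient shrinks every weight in proportion to its size, while the L1 subgradient pushes every nonzero weight toward zero by the same amount, which is why L1 tends to produce sparse weights.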

(3) Max norm constraints
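The notes only name this method. As commonly described, a max norm constraint puts an upper bound `c` on the L2 norm of each neuron's weight vector and enforces it by rescaling the weights after a parameter update. The sketch below assumes that each column of `W` holds the incoming weights of one neuron; the function name `apply_max_norm` and the value of `c` are hypothetical choices for illustration.

```python
import numpy as np

def apply_max_norm(W, c):
    """Rescale each column of W so that its L2 norm is at most c.

    Assumes each column of W holds the incoming weights of one neuron.
    """
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    # Columns already within the bound get scale 1; larger ones are shrunk to norm c.
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return W * scale

W = np.array([[3.0, 0.3],
              [4.0, 0.4]])           # column norms are 5.0 and 0.5
W_clipped = apply_max_norm(W, c=2.0)  # first column is shrunk, second is untouched
```

In a training loop this projection would typically be applied right after each weight update.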

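(4) Dropout: simple and extremely effective.

It complements L1 regularization, L2 regularization, and max norm constraints. During training, dropout is implemented by keeping each neuron active with a probability given by a hyperparameter, and setting it to 0 otherwise.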

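Overfitting is a common problem in deep neural networks (DNNs): the model only learns to classify the training set and fails to generalize to unseen data.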

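The idea behind dropout is to train an ensemble of DNNs and average the results of the whole ensemble, rather than training a single DNN. Each neuron is dropped with probability $p$ and kept with probability $q = 1 - p$; the output of a dropped neuron is set to 0. (For details, see the article "Dropout, the Killer Tool of Deep Learning Networks: An In-Depth Analysis of Dropout".)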

Dropout works well in practice because it prevents neurons from co-adapting during the training phase.

In practice, inverted dropout is generally used: the surviving activations are scaled by $1/q$ during training, and the test phase is left unchanged.
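A quick way to see why scaling during training lets the test phase stay untouched is to check that the scaled mask has a mean close to 1, so the expected activation is the same in both phases. The sketch below assumes `p` is the drop probability and `q = 1 - p` the keep probability, as above. The values of `p`, the seed, and the array size are arbitrary choices for illustration, and the example uses NumPy's `default_rng` rather than the assignment's `np.random.seed`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
p = 0.3                         # probability of dropping a unit
q = 1.0 - p                     # probability of keeping a unit

x = np.ones(100_000)            # constant activations make the mean easy to read

# Kept units are scaled by 1/q, dropped units become 0.
mask = (rng.random(x.shape) < q) / q
train_out = x * mask            # training phase: drop and rescale
test_out = x                    # test phase: identity, no scaling needed

kept_fraction = np.mean(mask > 0)
print("mean activation in training:", train_out.mean())
print("mean activation at test time:", test_out.mean())
print("fraction of units kept:", kept_fraction)
```

The training-time mean should come out close to the test-time mean of 1, and the kept fraction close to `q`.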

The full code is as follows:

Forward pass:

    import numpy as np

    def dropout_forward(x, dropout_param):
        """
        Performs the forward pass for (inverted) dropout.

        Inputs:
        - x: Input data, of any shape
        - dropout_param: A dictionary with the following keys:
          - p: Dropout parameter. We drop each neuron output with probability p.
          - mode: 'test' or 'train'. If the mode is train, then perform dropout;
            if the mode is test, then just return the input.
          - seed: Seed for the random number generator. Passing seed makes this
            function deterministic, which is needed for gradient checking but not
            in real networks.

        Outputs:
        - out: Array of the same shape as x.
        - cache: tuple (dropout_param, mask). In training mode, mask is the dropout
          mask that was used to multiply the input; in test mode, mask is None.
        """
        p, mode = dropout_param['p'], dropout_param['mode']
        if 'seed' in dropout_param:
            np.random.seed(dropout_param['seed'])

        mask = None
        out = None

        if mode == 'train':
            #######################################################################
            # TODO: Implement training phase forward pass for inverted dropout.   #
            # Store the dropout mask in the mask variable.                        #
            #######################################################################
            # p is the drop probability, so each unit is kept with probability
            # 1 - p and the kept units are scaled by 1 / (1 - p).
            mask = (np.random.rand(*x.shape) >= p) / (1 - p)
            out = x * mask
            #######################################################################
            #                           END OF YOUR CODE                          #
            #######################################################################
        elif mode == 'test':
            #######################################################################
            # TODO: Implement the test phase forward pass for inverted dropout.   #
            #######################################################################
            # Inverted dropout leaves the input unchanged at test time.
            out = x
            #######################################################################
            #                           END OF YOUR CODE                          #
            #######################################################################
        cache = (dropout_param, mask)
        out = out.astype(x.dtype, copy=False)
        return out, cache

Backward pass:

    def dropout_backward(dout, cache):
        """
        Perform the backward pass for (inverted) dropout.

        Inputs:
        - dout: Upstream derivatives, of any shape
        - cache: (dropout_param, mask) from dropout_forward.
        """
        dropout_param, mask = cache
        mode = dropout_param['mode']

        dx = None
        if mode == 'train':
            #######################################################################
            # TODO: Implement training phase backward pass for inverted dropout   #
            #######################################################################
            # The gradient passes only through the kept units, with the same
            # scaling that was applied in the forward pass.
            dx = dout * mask
            #######################################################################
            #                           END OF YOUR CODE                          #
            #######################################################################
        elif mode == 'test':
            dx = dout
        return dx
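The `seed` entry exists so that the backward pass can be gradient-checked: with the mask held fixed, the analytic gradient `dout * mask` should match a finite-difference estimate. The sketch below is a self-contained version of that check, not the assignment's own checker. The helpers `drop_forward`, `drop_backward`, and `numerical_gradient` are hypothetical compact stand-ins written for this example, they assume `p` is the drop probability, and they use NumPy's `default_rng` for seeding.

```python
import numpy as np

def drop_forward(x, p, seed):
    """Compact inverted-dropout forward pass; drops each unit with probability p."""
    rng = np.random.default_rng(seed)  # same seed gives the same mask on every call
    mask = (rng.random(x.shape) >= p) / (1.0 - p)
    return x * mask, mask

def drop_backward(dout, mask):
    """Backward pass: the gradient flows only through kept units, with the same scale."""
    return dout * mask

def numerical_gradient(f, x, h=1e-5):
    """Centered finite differences of a scalar function f with respect to x."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        old = x[idx]
        x[idx] = old + h
        f_plus = f(x)
        x[idx] = old - h
        f_minus = f(x)
        x[idx] = old  # restore the original entry
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 5))
dout = rng.standard_normal((4, 5))
p, seed = 0.25, 123  # illustrative values

_, mask = drop_forward(x, p, seed)
dx = drop_backward(dout, mask)

# Scalar loss sum(out * dout), whose gradient with respect to x is dout * mask.
dx_num = numerical_gradient(lambda v: np.sum(drop_forward(v, p, seed)[0] * dout), x)
max_abs_error = np.max(np.abs(dx - dx_num))
print("max abs difference:", max_abs_error)
```

Because the forward pass is linear in `x` once the mask is fixed, the two gradients should agree up to floating-point rounding.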
