
TCN stands for Temporal Convolutional Network, a recent architecture for sequence modeling and time-series prediction. This update walks through the final part of the code: training.

Paper:

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Authors: Shaojie Bai, J. Zico Kolter, Vladlen Koltun

Code Walkthrough and Results

Below is the code walkthrough together with the results.

Part 1

Part 2

Part 3: Training

The main block and results

The code

model = TCN(emsize, n_words, num_chans, dropout=dropout,
            emb_dropout=emb_dropout, kernel_size=k_size, tied_weights=tied)

if cuda:
    model.cuda()

# May use adaptive softmax to speed up training
criterion = nn.CrossEntropyLoss()  # cross-entropy loss

optimizer = getattr(optim, optimization)(model.parameters(), lr=lr)
# getattr() returns the value of a named attribute of an object:
#     getattr(object, name[, default])
#   object  -- the object to look up on
#   name    -- a string, the attribute name
#   default -- returned when the attribute is missing; if omitted,
#              a missing attribute raises AttributeError.
# model.parameters() supplies the parameters for the optimizer.
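To make that getattr call concrete: getattr(optim, "SGD") returns the class optim.SGD itself, so the line above is just a dynamic version of a direct constructor call, which lets a command-line flag pick the optimizer. A minimal self-contained sketch (the nn.Linear stand-in and the "SGD"/lr=0.5 values are illustrative, not this script's actual settings):

import torch.nn as nn
import torch.optim as optim

toy_model = nn.Linear(4, 2)   # stand-in model, for illustration only
name = "SGD"                  # could come from an --optim flag

# Equivalent to: optim.SGD(toy_model.parameters(), lr=0.5)
optimizer = getattr(optim, name)(toy_model.parameters(), lr=0.5)
print(type(optimizer))        # <class 'torch.optim.sgd.SGD'>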

def evaluate(data_source):
    model.eval()
    total_loss = 0
    processed_data_size = 0
    for i in range(0, data_source.size(1) - 1, validseqlen):
        if i + seq_len - validseqlen >= data_source.size(1) - 1:
            continue
        data, targets = get_batch(data_source, i, seq_len, evaluation=True)  # fetch a batch
        output = model(data)  # forward pass

        # Discard the effective history, just like in training:
        # eff_history is the total sequence length minus the scored length.
        eff_history = seq_len - validseqlen
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        final_target = targets[:, eff_history:].contiguous().view(-1)

        loss = criterion(final_output, final_target)  # cross-entropy loss

        # Note that we don't add TAR loss here
        total_loss += (data.size(1) - eff_history) * loss.data  # loss weighted by tokens scored
        processed_data_size += data.size(1) - eff_history       # tokens actually scored
    return total_loss[0] / processed_data_size
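The loop slides a window of seq_len tokens across the data in steps of validseqlen, but only the last validseqlen positions of each window contribute to the loss; the first eff_history positions are warm-up that gives the TCN its full receptive field on every scored token. A small sketch of the resulting indexing (the lengths 80/40/200 are made up for illustration):

seq_len, validseqlen, data_len = 80, 40, 200
eff_history = seq_len - validseqlen  # 40 warm-up positions per window

for i in range(0, data_len - 1, validseqlen):
    if i + seq_len - validseqlen >= data_len - 1:
        continue
    # Window covers tokens [i, i + seq_len); only the last validseqlen
    # positions are scored, so the scored spans tile the sequence.
    print(f"window [{i:3d}, {i + seq_len:3d})  scored [{i + eff_history:3d}, {i + seq_len:3d})")

Note that under this scheme the first eff_history tokens of the corpus are never scored at all.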

def train():
    # Turn on training mode, which enables dropout.
    global train_data
    model.train()
    total_loss = 0
    start_time = time.time()
    for batch_idx, i in enumerate(range(0, train_data.size(1) - 1, validseqlen)):
        if i + seq_len - validseqlen >= train_data.size(1) - 1:
            continue
        data, targets = get_batch(train_data, i, seq_len)
        optimizer.zero_grad()  # reset accumulated gradients
        output = model(data)

        # Discard the effective history part
        eff_history = seq_len - validseqlen
        if eff_history < 0:
            raise ValueError("Valid sequence length must be smaller than sequence length!")
        # contiguous() is needed because view() requires the tensor's
        # memory to be a single contiguous block.
        final_target = targets[:, eff_history:].contiguous().view(-1)
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        loss = criterion(final_output, final_target)  # cross-entropy loss

        loss.backward()  # backpropagation: computes gradients for all parameters
        if clip > 0:
            # Gradient clipping: (parameters, max norm, norm type);
            # the norm type defaults to the L2 norm.
            torch.nn.utils.clip_grad_norm(model.parameters(), clip)
        optimizer.step()  # update all parameters

        total_loss += loss.data
        if batch_idx % log_interval == 0 and batch_idx > 0:
            cur_loss = total_loss[0] / log_interval
            elapsed = time.time() - start_time
            # Print training progress.
            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:02.5f} | ms/batch {:5.5f} | '
                  'loss {:5.2f} | ppl {:8.2f}'.format(
                      epoch, batch_idx, train_data.size(1) // validseqlen, lr,
                      elapsed * 1000 / log_interval, cur_loss, math.exp(cur_loss)))
            total_loss = 0
            start_time = time.time()
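A portability note: this walkthrough targets an older PyTorch (0.3-era). On PyTorch 0.4 and later, torch.nn.utils.clip_grad_norm was renamed clip_grad_norm_ (following the in-place underscore convention), and scalar losses are read with loss.item() rather than by indexing loss.data. A minimal sketch of the same clipping step under those assumptions (the nn.Linear stand-in, random batch, and max norm 0.35 are illustrative only):

import torch
import torch.nn as nn

toy_model = nn.Linear(4, 2)                   # stand-in model, for illustration
loss = toy_model(torch.randn(8, 4)).pow(2).mean()
loss.backward()                               # populate .grad on the parameters

# Rescale all gradients so their combined L2 norm is at most 0.35;
# returns the total norm measured before clipping.
total_norm = torch.nn.utils.clip_grad_norm_(toy_model.parameters(), 0.35)
print(float(total_norm), loss.item())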

if __name__ == "__main__":
    best_vloss = 1e8

    # At any point you can hit Ctrl + C to break out of training early.
    try:
        all_vloss = []
        for epoch in range(1, epochs + 1):
            epoch_start_time = time.time()
            train()
            val_loss = evaluate(val_data)
            test_loss = evaluate(test_data)

            print('-' * 89)
            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '
                  'valid ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                             val_loss, math.exp(val_loss)))
            print('| end of epoch {:3d} | time: {:5.2f}s | test loss {:5.2f} | '
                  'test ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                            test_loss, math.exp(test_loss)))
            print('-' * 89)

            # Save the model if the validation loss is the best we've seen so far.
            if val_loss < best_vloss:
                with open("model.pt", 'wb') as f:
                    print('Save model!\n')
                    torch.save(model, f)
                best_vloss = val_loss

            # Anneal the learning rate if the validation loss plateaus
            if epoch > 5 and val_loss >= max(all_vloss[-5:]):
                lr = lr / 2.
                for param_group in optimizer.param_groups:
                    param_group['lr'] = lr
            all_vloss.append(val_loss)

    except KeyboardInterrupt:
        print('-' * 89)
        print('Exiting from training early')

    # Load the best saved model.
    with open("model.pt", 'rb') as f:
        model = torch.load(f)

    # Run on test data.
    test_loss = evaluate(test_data)
    print('=' * 89)
    print('| End of training | test loss {:5.2f} | test ppl {:8.2f}'.format(
        test_loss, math.exp(test_loss)))
    print('=' * 89)
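Two closing remarks on the main block. First, the annealing rule halves the learning rate whenever the current validation loss is no better than the worst of the previous five epochs, writing the new value into every entry of optimizer.param_groups, which is the standard way to change the learning rate mid-training. Second, torch.save(model, f) pickles the whole module, which ties the checkpoint to the exact code layout; saving model.state_dict() is the more portable convention. A minimal sketch of the param_groups update (toy model and values for illustration):

import torch.nn as nn
import torch.optim as optim

toy_model = nn.Linear(4, 2)                   # stand-in model
optimizer = optim.SGD(toy_model.parameters(), lr=4.0)

# Halve the learning rate of every parameter group, exactly as the
# plateau branch above does.
for param_group in optimizer.param_groups:
    param_group['lr'] /= 2.0
print(optimizer.param_groups[0]['lr'])        # 2.0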
