This account is run jointly by graduate students from Columbia, Tsinghua, Fudan, and USTC. We mainly publish quick reads of the latest AI conference papers, PyTorch programming, and financial big data / quantitative investing. If you like it, please follow us — learning resources are given away. Thanks!
TCN stands for Temporal Convolutional Network, a recent architecture that can be used for time-series prediction. This update walks through the final part of the code: training.
Paper:
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Authors: Shaojie Bai, J. Zico Kolter, Vladlen Koltun
Code Walkthrough and Results
Below is the code walkthrough and the results.
Part 1
Part 2
Part 3: Training
The main block and results
Code
model = TCN(emsize, n_words, num_chans, dropout=dropout,
            emb_dropout=emb_dropout, kernel_size=k_size, tied_weights=tied)
if cuda:
    model.cuda()

# May use adaptive softmax to speed up training
criterion = nn.CrossEntropyLoss()  # cross-entropy loss

optimizer = getattr(optim, optimization)(model.parameters(), lr=lr)
# getattr(object, name[, default]) returns the value of the named attribute:
#   object  -- the object to look up on
#   name    -- a string naming the attribute
#   default -- value returned when the attribute is missing; without it, a
#              missing attribute raises AttributeError.
# Here it looks up the optimizer class (e.g. optim.SGD) by name, and
# model.parameters() supplies the parameters for optim to update.
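The optimizer is selected by name: getattr(optim, "SGD") returns the class optim.SGD, which is then called with the model's parameters. A minimal sketch of the same lookup pattern, using only the standard library (math stands in for torch.optim here; the names are illustrative):

```python
import math

optimization = "sqrt"             # like passing "SGD" or "Adam" as a string
fn = getattr(math, optimization)  # look the attribute up by its name
print(fn(16.0))                   # 4.0
```

This is why the training script can take the optimizer as a command-line string without a chain of if/elif branches.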
def evaluate(data_source):
    model.eval()
    total_loss = 0
    processed_data_size = 0
    for i in range(0, data_source.size(1) - 1, validseqlen):
        if i + seq_len - validseqlen >= data_source.size(1) - 1:
            continue
        data, targets = get_batch(data_source, i, seq_len, evaluation=True)  # fetch one window
        output = model(data)  # forward pass
        # Discard the effective history, just like in training
        eff_history = seq_len - validseqlen  # context length = total length - scored length
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        final_target = targets[:, eff_history:].contiguous().view(-1)
        loss = criterion(final_output, final_target)
        # Note that we don't add TAR loss here
        total_loss += (data.size(1) - eff_history) * loss.data  # weight by number of scored positions
        processed_data_size += data.size(1) - eff_history
    return total_loss[0] / processed_data_size
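A toy illustration (with made-up sizes, not the paper's settings) of the effective-history slicing in evaluate(): each window of length seq_len is fed to the model, but only the last validseqlen positions are scored; the first eff_history positions serve as warm-up context and are sliced off before the loss.

```python
seq_len = 8
validseqlen = 5
eff_history = seq_len - validseqlen   # 3 context positions

window = list(range(seq_len))         # stand-in for one window of model outputs
scored = window[eff_history:]         # only these positions contribute to the loss
print(scored)                         # [3, 4, 5, 6, 7]
```

Because the loop also strides by validseqlen, every position in the corpus is scored exactly once, each with a full seq_len of left context.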
def train():
    # Turn on training mode which enables dropout.
    global train_data
    model.train()
    total_loss = 0
    start_time = time.time()
    for batch_idx, i in enumerate(range(0, train_data.size(1) - 1, validseqlen)):
        if i + seq_len - validseqlen >= train_data.size(1) - 1:
            continue
        data, targets = get_batch(train_data, i, seq_len)
        optimizer.zero_grad()  # clear accumulated gradients
        output = model(data)
        # Discard the effective history part
        eff_history = seq_len - validseqlen
        if eff_history < 0:
            raise ValueError("Valid sequence length must be smaller than sequence length!")
        # contiguous() is needed because view() requires the tensor's memory to be one block
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        final_target = targets[:, eff_history:].contiguous().view(-1)
        loss = criterion(final_output, final_target)  # cross-entropy loss
        loss.backward()  # no arguments needed: this backpropagates to compute gradients
        if clip > 0:
            # Gradient clipping: (model parameters, max gradient norm, norm type=2); L2 by default
            torch.nn.utils.clip_grad_norm(model.parameters(), clip)
        optimizer.step()  # step() updates all parameters
        total_loss += loss.data
        if batch_idx % log_interval == 0 and batch_idx > 0:
            cur_loss = total_loss[0] / log_interval
            elapsed = time.time() - start_time
            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:02.5f} | ms/batch {:5.5f} | '
                  'loss {:5.2f} | ppl {:8.2f}'.format(
                epoch, batch_idx, train_data.size(1) // validseqlen, lr,
                elapsed * 1000 / log_interval, cur_loss, math.exp(cur_loss)))
            total_loss = 0
            start_time = time.time()
            # The logged output is shown in the images that follow
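clip_grad_norm rescales all gradients jointly so that their global L2 norm is at most clip, which keeps a single bad batch from blowing up the update. A pure-Python sketch of that rescaling (torch's version also adds a tiny epsilon to the denominator; the function name here is made up):

```python
import math

def clip_by_global_norm(grads, clip):
    # Rescale so the joint L2 norm of all gradients is at most `clip`
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > clip:
        scale = clip / total_norm
        grads = [g * scale for g in grads]
    return grads

# Gradients [3, 4] have norm 5; clipping to 1 scales them both by 1/5
print(clip_by_global_norm([3.0, 4.0], clip=1.0))
```

Note that the direction of the gradient is preserved; only its magnitude is shrunk.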
if __name__ == "__main__":
    best_vloss = 1e8
    # At any point you can hit Ctrl + C to break out of training early.
    try:
        all_vloss = []
        for epoch in range(1, epochs + 1):
            epoch_start_time = time.time()
            train()
            val_loss = evaluate(val_data)
            test_loss = evaluate(test_data)
            print('-' * 89)
            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '
                  'valid ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                             val_loss, math.exp(val_loss)))
            print('| end of epoch {:3d} | time: {:5.2f}s | test loss {:5.2f} | '
                  'test ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                            test_loss, math.exp(test_loss)))
            # Save the model if the validation loss is the best we've seen so far.
            if val_loss < best_vloss:
                with open("model.pt", 'wb') as f:
                    print('Save model!\n')
                    torch.save(model, f)
                best_vloss = val_loss
            # Anneal the learning rate if the validation loss plateaus
            if epoch > 5 and val_loss >= max(all_vloss[-5:]):
                lr = lr / 2.
                for param_group in optimizer.param_groups:
                    param_group['lr'] = lr
            all_vloss.append(val_loss)
    except KeyboardInterrupt:
        print('Exiting from training early')

    # Load the best saved model.
    with open("model.pt", 'rb') as f:
        model = torch.load(f)

    # Run on test data.
    test_loss = evaluate(test_data)
    print('=' * 89)
    print('| End of training | test loss {:5.2f} | test ppl {:8.2f}'.format(
        test_loss, math.exp(test_loss)))
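The annealing rule above can be isolated and traced with made-up losses: after epoch 5, if the current validation loss is no lower than the worst of the previous five epochs, the learning rate is halved. With a completely flat plateau, it halves at every epoch from 6 on:

```python
lr = 4.0
all_vloss = []
for epoch, val_loss in enumerate([4.6] * 7, start=1):  # hypothetical flat losses
    # Anneal the learning rate if the validation loss plateaus
    if epoch > 5 and val_loss >= max(all_vloss[-5:]):
        lr = lr / 2.
    all_vloss.append(val_loss)
print(lr)  # halved at epochs 6 and 7: 4.0 -> 2.0 -> 1.0
```

Comparing against the worst of the last five epochs (rather than the best) makes the schedule conservative: the rate drops only when there has been no progress at all over that window.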