This public account is run jointly by graduate students from Columbia, Tsinghua, Fudan, and USTC. We mainly post quick reads of the latest top AI-conference papers, PyTorch programming, and financial big data and quantitative investing. If you like it, please follow us; we also give away learning resources. Thanks!


To join the discussion group, add the WeChat ID uft-uft and mention "Zhihu" plus your school or company.


TCN stands for Temporal Convolutional Network, a relatively new convolutional architecture for sequence modelling and time-series prediction. This post walks through the final part of the code: the training loop.

Paper:

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Authors: Shaojie Bai, J. Zico Kolter, Vladlen Koltun
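
For context before diving into the training code: the core building block of a TCN is a dilated causal 1-D convolution, so that the output at time step t only depends on inputs up to time t. The following is a minimal sketch of that idea in PyTorch (it is not the authors' full residual block, and the layer sizes are illustrative only):

import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """Dilated causal convolution: pad on the left so that the output at
    time step t depends only on inputs at time steps <= t."""
    def __init__(self, in_channels, out_channels, kernel_size, dilation=1):
        super(CausalConv1d, self).__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):                       # x: (batch, channels, time)
        out = self.conv(x)
        return out[:, :, :-self.pad] if self.pad > 0 else out  # trim the extra right-hand outputs

# Toy usage: 8 input channels, 16 filters, kernel size 3, dilation 2
layer = CausalConv1d(8, 16, kernel_size=3, dilation=2)
y = layer(torch.randn(4, 8, 50))                # -> shape (4, 16, 50)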

Code walkthrough and results

Below is the code walkthrough together with the results.

Part 1

Part 2

Part 3: Training

The main section and the results

The code

model = TCN(emsize, n_words, num_chans, dropout=dropout,
            emb_dropout=emb_dropout, kernel_size=k_size, tied_weights=tied)

if cuda:
    model.cuda()

# May use adaptive softmax to speed up training
criterion = nn.CrossEntropyLoss()  # cross-entropy loss
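
A quick note on the loss: nn.CrossEntropyLoss combines LogSoftmax and NLLLoss, which implies the model's final linear layer should output unnormalised scores (logits) rather than probabilities. A small sanity check with toy sizes (illustration only, and it assumes a recent PyTorch version):

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(4, 10)              # 4 positions, a vocabulary of 10 words (toy sizes)
target = torch.tensor([1, 0, 3, 9])
ce = nn.CrossEntropyLoss()(logits, target)
nll = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(ce, nll))           # True: CrossEntropyLoss = LogSoftmax + NLLLoss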

optimizer = getattr(optim, optimization)(model.parameters(), lr=lr)
# getattr() returns the value of a named attribute of an object:
#     getattr(object, name[, default])
#   object  -- the object
#   name    -- a string, the attribute name
#   default -- value returned if the attribute is missing; without it, a
#              missing attribute raises AttributeError.
# model.parameters() supplies the parameters for the optimizer to update.
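
In other words, getattr picks the optimizer class out of torch.optim by its string name. For example, assuming optimization holds the string 'SGD' (an illustrative value; the actual default comes from the script's arguments), the line above is equivalent to:

optimizer = optim.SGD(model.parameters(), lr=lr)
# and with optimization = 'Adam' it would build optim.Adam instead.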

def evaluate(data_source):
    model.eval()
    total_loss = 0
    processed_data_size = 0
    for i in range(0, data_source.size(1) - 1, validseqlen):
        if i + seq_len - validseqlen >= data_source.size(1) - 1:
            continue
        data, targets = get_batch(data_source, i, seq_len, evaluation=True)  # fetch one batch
        output = model(data)  # forward pass

        # Discard the effective history, just like in training
        eff_history = seq_len - validseqlen  # effective history = total length - valid length
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        final_target = targets[:, eff_history:].contiguous().view(-1)
        loss = criterion(final_output, final_target)  # cross-entropy loss

        # Note that we don't add TAR loss here
        total_loss += (data.size(1) - eff_history) * loss.data  # accumulate the total loss
        processed_data_size += data.size(1) - eff_history       # number of scored positions
    return total_loss[0] / processed_data_size
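
In short, the evaluation loop slides a window of length seq_len over the data in steps of validseqlen, but only scores the last validseqlen positions of each window; the first eff_history = seq_len - validseqlen positions are kept only as context. Every position is therefore scored exactly once, with at least eff_history tokens of history behind it, and dividing the weighted loss sum by processed_data_size gives the average per-token cross-entropy. (total_loss[0] assumes an older PyTorch where loss.data is a 1-element tensor; on PyTorch >= 0.4 one would accumulate loss.item() instead.) A small illustration with made-up lengths, not the values used in the paper or the repo:

seq_len, validseqlen, data_len = 80, 40, 201   # hypothetical sizes, for illustration only
eff_history = seq_len - validseqlen            # 40 tokens of context per window
for i in range(0, data_len - 1, validseqlen):
    if i + seq_len - validseqlen >= data_len - 1:
        continue
    print("window [{}, {}) feeds the model, positions [{}, {}) are scored".format(
        i, i + seq_len, i + eff_history, i + seq_len))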

def train():
    # Turn on training mode which enables dropout.
    global train_data
    model.train()
    total_loss = 0
    start_time = time.time()
    for batch_idx, i in enumerate(range(0, train_data.size(1) - 1, validseqlen)):
        if i + seq_len - validseqlen >= train_data.size(1) - 1:
            continue
        data, targets = get_batch(train_data, i, seq_len)
        optimizer.zero_grad()  # clear the accumulated gradients
        output = model(data)

        # Discard the effective history part
        eff_history = seq_len - validseqlen
        if eff_history < 0:
            raise ValueError("Valid sequence length must be smaller than sequence length!")
        final_target = targets[:, eff_history:].contiguous().view(-1)
        # contiguous() is needed because view() requires the tensor's memory to be one contiguous block
        final_output = output[:, eff_history:].contiguous().view(-1, n_words)
        loss = criterion(final_output, final_target)  # cross-entropy loss

        loss.backward()  # backward pass: no arguments here, but this computes all the gradients
        if clip > 0:
            # Gradient clipping: (parameters, max norm, norm type=2); the default is the L2 norm
            torch.nn.utils.clip_grad_norm(model.parameters(), clip)
        optimizer.step()  # update all parameters

        total_loss += loss.data

        if batch_idx % log_interval == 0 and batch_idx > 0:
            cur_loss = total_loss[0] / log_interval
            elapsed = time.time() - start_time
            # Log progress; the concrete numbers can be seen in the screenshots later in the post
            print('| epoch {:3d} | {:5d}/{:5d} batches | lr {:02.5f} | ms/batch {:5.5f} | '
                  'loss {:5.2f} | ppl {:8.2f}'.format(
                epoch, batch_idx, train_data.size(1) // validseqlen, lr,
                elapsed * 1000 / log_interval, cur_loss, math.exp(cur_loss)))
            total_loss = 0
            start_time = time.time()
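
Two small notes on the training loop. First, torch.nn.utils.clip_grad_norm was renamed to clip_grad_norm_ (with a trailing underscore) in later PyTorch releases, so on a recent install the in-place variant is the one to call. Second, the ppl column in the log is the perplexity, which is simply the exponential of the average cross-entropy loss. For example:

import math
cur_loss = 4.6             # hypothetical average loss over log_interval batches
print(math.exp(cur_loss))  # ~99.5: roughly as uncertain as a uniform choice over ~100 words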

if __name__ == "__main__":
    best_vloss = 1e8

    # At any point you can hit Ctrl + C to break out of training early.
    try:
        all_vloss = []
        for epoch in range(1, epochs + 1):
            epoch_start_time = time.time()
            train()
            val_loss = evaluate(val_data)
            test_loss = evaluate(test_data)

            print('-' * 89)
            print('| end of epoch {:3d} | time: {:5.2f}s | valid loss {:5.2f} | '
                  'valid ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                             val_loss, math.exp(val_loss)))
            print('| end of epoch {:3d} | time: {:5.2f}s | test loss {:5.2f} | '
                  'test ppl {:8.2f}'.format(epoch, (time.time() - epoch_start_time),
                                            test_loss, math.exp(test_loss)))
            print('-' * 89)

            # Save the model if the validation loss is the best we've seen so far.
            if val_loss < best_vloss:
                with open("model.pt", 'wb') as f:
                    print('Save model!\n')
                    torch.save(model, f)
                    best_vloss = val_loss

            # Anneal the learning rate if the validation loss plateaus
            if epoch > 5 and val_loss >= max(all_vloss[-5:]):
                lr = lr / 2.
                for param_group in optimizer.param_groups:
                    param_group['lr'] = lr
            all_vloss.append(val_loss)

    except KeyboardInterrupt:
        print('-' * 89)
        print('Exiting from training early')

    # Load the best saved model.
    with open("model.pt", 'rb') as f:
        model = torch.load(f)

    # Run on test data.
    test_loss = evaluate(test_data)
    print('=' * 89)
    print('| End of training | test loss {:5.2f} | test ppl {:8.2f}'.format(
        test_loss, math.exp(test_loss)))
    print('=' * 89)
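
The manual annealing above halves lr whenever the current validation loss is no better than the worst of the previous five epochs. PyTorch's built-in ReduceLROnPlateau scheduler implements a similar (though not identical) policy; a rough sketch of that alternative, reusing the names defined above rather than the original script's logic, would be:

from torch.optim.lr_scheduler import ReduceLROnPlateau

scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.5, patience=5)

for epoch in range(1, epochs + 1):
    train()
    val_loss = evaluate(val_data)
    scheduler.step(val_loss)   # halves every param group's lr when val_loss stops improving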

Selected past articles

Top AI-conference paper, quick read | Customizing a dialogue model with CVaR from finance as the target policy

Top AI-conference paper, quick read | Context contrasted features and gated multi-scale aggregation for scene segmentation

Top AI-conference paper, detailed read | Adversarial training for open-ended generative dialogue

Top AI-conference paper, detailed read | Deep reinforcement learning: learning to converse through dialogue interaction

Top AI-conference paper, quick read | A chit-chat dialogue system with controllable specificity

Get familiar with RNNs and LSTMs quickly through polished slides, with TensorFlow tutorial code (reply "PPT" for the resources)

Three minutes of PyTorch a day - 1: Why choose you?

Three minutes of PyTorch a day - 2: Let's stack some building blocks

Three minutes of PyTorch a day - 3: Let's train some models

Three minutes of PyTorch a day - 4: Build a translation system (1)

Three minutes of PyTorch a day - 4: Build a translation system (2)

Three minutes of PyTorch a day - 4: Build a translation system (3)

Follow this public account


