TFLearn Model Prediction, Saving, and Loading
Model Prediction
- predict_label(X)
Predicts the class labels for X. The input is an array or a list of arrays, and the return value is likewise an array or a list of arrays.
- A new prediction function?
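For intuition about what predict_label returns: per sample, it yields the class indices ranked by descending predicted probability. A minimal NumPy sketch of the equivalent computation (predict_label_equiv is a hypothetical stand-in written here for illustration, not TFLearn's API):

```python
import numpy as np

def predict_label_equiv(scores):
    # For each row of class scores, return the class indices
    # ranked from most probable to least probable.
    return np.argsort(-np.asarray(scores), axis=1)

scores = [[0.1, 0.9],   # sample 0: class 1 most likely
          [0.7, 0.3]]   # sample 1: class 0 most likely
print(predict_label_equiv(scores))  # rows: [1 0] and [0 1]
```

So for a two-class XSS model, the first index of each row is the predicted class.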
An RNN is a deep-learning model well suited to deciding whether input is an XSS attack. Checking samples through predict_label, an empty sample takes about 5.5 µs on average, but a single valid sample takes as long as 1.56 seconds. That may depend on the specific sample, but the numbers observed so far are very poor: samples must be processed in batches, or performance suffers badly. For a single HTTP message, the request URL and request-body fields should be collected into one sample array to speed up model inference. When time permits, it would be worth benchmarking non-attack samples as well, to compare the prediction speed on attack samples against normal ones.
Validation starting at sample offset 1000:
- 1 sample: 1.56 s total (1.56 s per sample)
- 10 samples: 0.089 s total (8.9 ms per sample)
- 100 samples: 0.247 s total (2.5 ms per sample)
- 1000 samples: 2.08 s total (2.1 ms per sample)

Validation starting at sample offset 8000:
- 1 sample: 1.57 s total (1.57 s per sample)
- 10 samples: 0.0897 s total (9.0 ms per sample)
- 100 samples: 0.249 s total (2.5 ms per sample)
- 1000 samples: 2.053 s total (2.1 ms per sample)
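These numbers are roughly consistent with a simple fixed-overhead cost model: total time ≈ per-call overhead + n × per-sample compute. Fitting the 10- and 100-sample batches above gives about 71 ms of overhead and 1.8 ms per sample (constants derived here for illustration, not measured; the 1.56 s first call additionally includes one-off graph warm-up, which this model ignores):

```python
def batch_time(n, overhead=0.071, per_sample=0.00176):
    # Rough cost model fitted to the timings above:
    # total seconds = fixed per-call overhead + n * marginal cost per sample.
    return overhead + n * per_sample

for n in (1, 10, 100, 1000):
    total = batch_time(n)
    print(f"{n:4d} samples: {total:6.3f} s total, {total / n * 1000:7.2f} ms each")
```

The model explains why per-sample cost falls from ~9 ms at batch size 10 toward ~2 ms at batch size 1000: the fixed overhead is amortized across the batch.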
Model Saving and Loading
Training on 150,000 samples genuinely takes time, and retraining from scratch for every new batch of samples wastes a lot of it. What to do? My old approach was to train, get the model back, and then sit in a while(1) loop: update the sample, and the program prints its class. That is effectively a "hot" model; as soon as the program exits, everything has to be retrained. If instead the model could be persisted and restored on a cold start, that would be ideal. How? TFLearn actually ships with APIs for this, save and load, and they are very simple to use.
Usage:

Step 1: train the model, then save it:

with tf.Graph().as_default():
    # train the model
    model.save("./gut.tfl")

Step 2: rebuild the same network structure from step 1, then load the saved model:

with tf.Graph().as_default():
    # define the same network structure
    model.load("./gut.tfl")
Complete code:
import sys
from urllib.parse import unquote

import tensorflow as tf
import tflearn
from tflearn.data_utils import to_categorical, pad_sequences
from sklearn.model_selection import train_test_split


def elt(line):
    # Encode each character of the line as the ordinal of its lowercase form.
    return [ord(c.lower()) for c in line]


def load_file(filename, label, ms, ns):
    with open(filename) as f:
        for line in f:
            line = unquote(line.strip())
            if len(line) <= 100:
                ms.append(elt(line))
                ns.append(1 if label else 0)


def load_files(file1, file2):
    xs, ys = [], []
    load_file(file1, 1, xs, ys)
    load_file(file2, 0, xs, ys)
    return xs, ys


def train(x, y):
    graph1 = tf.Graph()
    with graph1.as_default():
        x_train, x_test, y_train, y_test = train_test_split(
            x, y, test_size=0.4, random_state=0)
        x_train = pad_sequences(x_train, maxlen=100, value=0.)
        x_test = pad_sequences(x_test, maxlen=100, value=0.)
        y_train = to_categorical(y_train, nb_classes=2)
        y_test = to_categorical(y_test, nb_classes=2)
        net = tflearn.input_data([None, 100])
        net = tflearn.embedding(net, input_dim=256, output_dim=128)
        net = tflearn.lstm(net, 128, dropout=0.8)
        net = tflearn.fully_connected(net, 2, activation='softmax')
        net = tflearn.regression(net, optimizer='adam', learning_rate=0.1,
                                 loss='categorical_crossentropy')
        model = tflearn.DNN(net, tensorboard_verbose=3)
        model.fit(x_train, y_train, n_epoch=1,
                  validation_set=(x_test, y_test), show_metric=True,
                  batch_size=200, run_id="gut")
        print("---------before-------")
        print(model.predict_label(x_test[8:9]))
        model.save("./gut.tfl")
        return x_test


def gut(x_test):
    # Rebuild the identical network structure in a fresh graph,
    # then restore the trained weights from disk.
    graph2 = tf.Graph()
    with graph2.as_default():
        net = tflearn.input_data([None, 100])
        net = tflearn.embedding(net, input_dim=256, output_dim=128)
        net = tflearn.lstm(net, 128, dropout=0.8)
        net = tflearn.fully_connected(net, 2, activation='softmax')
        net = tflearn.regression(net, optimizer='adam', learning_rate=0.1,
                                 loss='categorical_crossentropy')
        model = tflearn.DNN(net, tensorboard_verbose=3)
        print("...........after.....")
        model.load("./gut.tfl")
        print(model.predict_label(x_test[8:9]))


if __name__ == "__main__":
    xs, ys = load_files(sys.argv[1], sys.argv[2])
    x_test = train(xs, ys)
    gut(x_test)
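As argued earlier, a single HTTP message should be turned into one batch rather than many single-sample calls. A minimal sketch of that preprocessing, mirroring elt and the padding used in train() (pad_to_100 is a hypothetical stand-in for tflearn's pad_sequences, so this runs without TFLearn installed):

```python
from urllib.parse import unquote

def elt(line):
    # Same encoding as the training script: one ordinal per lowercased char.
    return [ord(c.lower()) for c in line]

def pad_to_100(seqs, maxlen=100, value=0):
    # Hypothetical stand-in for tflearn.data_utils.pad_sequences:
    # right-pad (or truncate) every sequence to a fixed length.
    return [(s + [value] * maxlen)[:maxlen] for s in seqs]

# Collect the URL and body fields of one request into a single batch, so the
# model is invoked once per message instead of once per field.
fields = ["/index.php?id=1", "<script>alert(1)</script>"]
batch = pad_to_100([elt(unquote(f)) for f in fields if len(f) <= 100])
print(len(batch), len(batch[0]))  # 2 100
```

The resulting batch can then be passed to model.predict_label in one call, amortizing the per-call overhead measured above across all fields of the message.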