P3D models were proposed in the paper Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks.

1, Download the P3D code

git clone https://github.com/ZhaofanQiu/pseudo-3d-residual-networks.git

2, Download the C3D code

Training P3D depends on the input data layer that C3D provides.

git clone https://github.com/facebook/C3D

3, Merge the P3D code into the C3D repository

The files that P3D adds are listed below:

gemfield@T630:/bigdata/gemfield/c3d/P3D/caffe$ find ../pseudo-3d-residual-networks/ -path ../pseudo-3d-residual-networks/.git -prune -o -type f
../pseudo-3d-residual-networks/prototxt/deploy_p3d_resnet_kinetics.prototxt
../pseudo-3d-residual-networks/prototxt/deploy_p3d_resnet_sports1m.prototxt
../pseudo-3d-residual-networks/prototxt/deploy_p3d_resnet_kinetics_flow.prototxt
../pseudo-3d-residual-networks/LICENSE
../pseudo-3d-residual-networks/.git
../pseudo-3d-residual-networks/README.md
../pseudo-3d-residual-networks/caffe_add_layers/include/bn_layer.hpp
../pseudo-3d-residual-networks/caffe_add_layers/include/pooling3d_layer.hpp
../pseudo-3d-residual-networks/caffe_add_layers/include/video_common.hpp
../pseudo-3d-residual-networks/caffe_add_layers/src/pooling3d_layer.cpp
../pseudo-3d-residual-networks/caffe_add_layers/src/video_common.cpp
../pseudo-3d-residual-networks/caffe_add_layers/src/bn_layer.cu
../pseudo-3d-residual-networks/caffe_add_layers/src/pooling3d_layer.cu
../pseudo-3d-residual-networks/caffe_add_layers/src/bn_layer.cpp
../pseudo-3d-residual-networks/caffe_add_layers/caffe.proto

Merge the proto file

gemfield@gemfield:/bigdata/gemfield/c3d/P3D/C3D/C3D-v1.1# cat ../../pseudo-3d-residual-networks/caffe_add_layers/caffe.proto >> ./src/caffe/proto/caffe.proto

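Then delete the earlier definition of message Pooling3DParameter (the one C3D originally shipped) from that proto file, so that only the newly appended definition remains.

Next, inside the message LayerParameter block of the same proto file, add: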

optional BNParameter bn_param = 7030;
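
The two manual proto edits above can also be scripted. The following is a minimal sketch, not the author's method: the helper names `remove_first_message` and `add_layer_field` are hypothetical, the brace matching is naive (braces inside comments or string literals would confuse it), and the demo string is a toy stand-in for the real caffe.proto.

```python
import re


def remove_first_message(proto_text, name):
    """Remove the first top-level `message <name> { ... }` block."""
    pattern = r"^[ \t]*message\s+" + re.escape(name) + r"\s*\{"
    match = re.search(pattern, proto_text, re.MULTILINE)
    if match is None:
        raise ValueError("message %s not found" % name)
    depth = 0
    end = None
    # Walk forward from the opening brace until it is balanced again.
    for i in range(match.end() - 1, len(proto_text)):
        if proto_text[i] == "{":
            depth += 1
        elif proto_text[i] == "}":
            depth -= 1
            if depth == 0:
                end = i + 1
                break
    if end is None:
        raise ValueError("unbalanced braces in message %s" % name)
    if proto_text[end:end + 1] == "\n":
        end += 1  # also drop the newline that followed the block
    return proto_text[:match.start()] + proto_text[end:]


def add_layer_field(proto_text, field_line):
    """Insert one field right after the `message LayerParameter {` line."""
    match = re.search(r"^message\s+LayerParameter\s*\{[^\n]*\n",
                      proto_text, re.MULTILINE)
    if match is None:
        raise ValueError("message LayerParameter not found")
    return (proto_text[:match.end()] + "  " + field_line + "\n"
            + proto_text[match.end():])


# Toy input: the old Pooling3DParameter comes first, the appended one last.
demo = (
    "message LayerParameter {\n  optional string name = 1;\n}\n"
    "message Pooling3DParameter {\n  optional uint32 kernel_size = 1;\n}\n"
    "message BNParameter {\n}\n"
    "message Pooling3DParameter {\n  optional uint32 kernel_depth = 2;\n}\n"
)
patched = remove_first_message(demo, "Pooling3DParameter")
patched = add_layer_field(patched, "optional BNParameter bn_param = 7030;")
print(patched)
```

Because the P3D definitions are appended at the end of the file, removing the first match is what drops the original C3D definition. It is worth diffing the result against a backup before building.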

Merge the header files

gemfield@gemfield:/bigdata/gemfield/c3d/P3D/C3D/C3D-v1.1/include/caffe/layers# cp ../../../../../pseudo-3d-residual-networks/caffe_add_layers/include/* .

Merge the source files

gemfield@gemfield:/bigdata/gemfield/c3d/P3D/C3D/C3D-v1.1/src/caffe/layers# cp ../../../../../pseudo-3d-residual-networks/caffe_add_layers/src/* .

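Delete the files that C3D originally shipped for this layer: src/caffe/layers/pool3d_layer.cpp, pool3d_layer.cu, and pool3d_layer.hpp (their parameter message has been removed from the proto file).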
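
The copy and delete steps above could be collected into one small script. This is only a sketch under assumptions: the function name `merge_layers` is hypothetical, it looks for `pool3d_layer.*` in both the include and src layer directories because the exact location of the header may differ in your checkout, and the demo runs against a throwaway temporary tree rather than real repositories.

```python
import shutil
import tempfile
from pathlib import Path


def merge_layers(p3d_add, c3d_root):
    """Copy P3D's extra layer files into a C3D tree, then remove pool3d_layer.*"""
    p3d_add, c3d_root = Path(p3d_add), Path(c3d_root)
    include_dst = c3d_root / "include" / "caffe" / "layers"
    src_dst = c3d_root / "src" / "caffe" / "layers"
    for f in (p3d_add / "include").iterdir():
        shutil.copy(f, include_dst)
    for f in (p3d_add / "src").iterdir():
        shutil.copy(f, src_dst)
    removed = []
    # "pool3d_layer.*" does not match the new "pooling3d_layer.*" files.
    for directory in (include_dst, src_dst):
        for old in directory.glob("pool3d_layer.*"):
            old.unlink()
            removed.append(old.name)
    return sorted(removed)


# Demo on a temporary directory layout (stand-in for the two checkouts).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    add = tmp / "caffe_add_layers"
    c3d = tmp / "C3D-v1.1"
    for d in (add / "include", add / "src",
              c3d / "include/caffe/layers", c3d / "src/caffe/layers"):
        d.mkdir(parents=True)
    (add / "include" / "bn_layer.hpp").touch()
    (add / "src" / "bn_layer.cpp").touch()
    (c3d / "src/caffe/layers" / "pool3d_layer.cpp").touch()
    (c3d / "include/caffe/layers" / "pool3d_layer.hpp").touch()
    removed = merge_layers(add, c3d)
    copied_ok = ((c3d / "include/caffe/layers/bn_layer.hpp").exists()
                 and (c3d / "src/caffe/layers/bn_layer.cpp").exists())
    print(removed, copied_ok)
```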

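4, Build P3D

P3D adds extra layers on top of Caffe, but its build dependencies and build steps are exactly the same as Caffe's. Edit Makefile.config first. Note that with cuDNN enabled, fine-tuning C3D runs into a loss that does not decrease, so Gemfield currently builds with cuDNN disabled; this has not yet been tested on P3D. Then run make: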

gemfield@T630:/bigdata/gemfield/c3d/P3D/caffe# make -j 4
PROTOC src/caffe/proto/caffe.proto
......
NVCC src/caffe/layers/pooling3d_layer.cu
NVCC src/caffe/layers/bn_layer.cu
......
CXX src/caffe/layers/video_common.cpp
CXX src/caffe/layers/pooling3d_layer.cpp
CXX src/caffe/layers/bn_layer.cpp
......
AR -o .build_release/lib/libcaffe.a
LD -o .build_release/lib/libcaffe.so.1.0.0-rc5
CXX/LD -o .build_release/tools/extract_image_features.bin
CXX/LD -o .build_release/tools/extract_learned_filters.bin
CXX/LD -o .build_release/tools/net_speed_benchmark.bin
CXX/LD -o .build_release/tools/upgrade_net_proto_text.bin
CXX/LD -o .build_release/tools/finetune_net.bin
CXX/LD -o .build_release/tools/extract_all_learned_filters.bin
CXX/LD -o .build_release/tools/caffe.bin
CXX/LD -o .build_release/tools/upgrade_net_proto_binary.bin
CXX/LD -o .build_release/tools/test_net.bin
CXX/LD -o .build_release/tools/compute_image_mean.bin
CXX/LD -o .build_release/tools/convert_imageset.bin
CXX/LD -o .build_release/tools/extract_features.bin
CXX/LD -o .build_release/tools/upgrade_solver_proto_text.bin
CXX/LD -o .build_release/tools/device_query.bin
CXX/LD -o .build_release/tools/train_net.bin
CXX/LD -o .build_release/examples/siamese/convert_mnist_siamese_data.bin
CXX/LD -o .build_release/examples/cpp_classification/classification.bin
CXX/LD -o .build_release/examples/mnist/convert_mnist_data.bin
CXX/LD -o .build_release/examples/cifar10/convert_cifar_data.bin

Build pycaffe

gemfield@gemfield:/bigdata/gemfield/c3d/P3D/C3D/C3D-v1.1# make pycaffe
CXX/LD -o python/caffe/_caffe.so python/caffe/_caffe.cpp
touch python/caffe/proto/__init__.py
PROTOC (python) src/caffe/proto/caffe.proto

5, Download the pretrained model

Gemfield downloaded the P3D model pretrained on the Kinetics dataset.

root@gemfield:/bigdata/gemfield/c3d/P3D/prototxt# ls -l p3d_resnet_kinetics_iter_190000.caffemodel
-rw-rw-r-- 1 1000 1000 266761048 Oct 16 07:45 p3d_resnet_kinetics_iter_190000.caffemodel

6, Training

1, Prepare the dataset to fine-tune on, the train prototxt, and the solver prototxt, the same way as for C3D.

2, Start training:

caffe train --solver=gemfield_solver.prototxt --weights=p3d_resnet_kinetics_iter_190000.caffemodel --gpu 0

Below is part of the P3D training log:

I0406 16:25:44.225684 6955 caffe.cpp:155] Finetuning from p3d_resnet_kinetics_iter_190000.caffemodel
I0406 16:25:44.523165 6955 net.cpp:744] Ignoring source layer data_raw
I0406 16:25:44.523198 6955 net.cpp:744] Ignoring source layer data_reshape
I0406 16:25:44.594054 6955 net.cpp:744] Ignoring source layer fc_kinetics
I0406 16:25:44.730566 6955 net.cpp:744] Ignoring source layer data_raw
I0406 16:25:44.730595 6955 net.cpp:744] Ignoring source layer data_reshape
I0406 16:25:44.830770 6955 net.cpp:744] Ignoring source layer fc_kinetics
I0406 16:25:44.833505 6955 caffe.cpp:248] Starting Optimization
I0406 16:25:44.833528 6955 solver.cpp:273] Solving TRAIN_P3D_RESNET_KINETICS
I0406 16:25:44.833537 6955 solver.cpp:274] Learning Rate Policy: step
I0406 16:25:44.929944 6955 solver.cpp:331] Iteration 0, Testing net (#0)
I0406 16:25:45.810014 6955 blocking_queue.cpp:49] Waiting for data
I0406 16:26:38.654268 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.01
I0406 16:26:38.654544 6955 solver.cpp:398] Test net output #1: accuracy/top-5 = 0.044
I0406 16:26:38.654563 6955 solver.cpp:398] Test net output #2: loss = 4.61709 (* 1 = 4.61709 loss)
I0406 16:26:39.735512 6955 solver.cpp:219] Iteration 0 (0 iter/s, 54.9019s/20 iters), loss = 4.60751
I0406 16:26:39.735591 6955 solver.cpp:238] Train net output #0: loss = 4.60751 (* 1 = 4.60751 loss)
I0406 16:26:39.735621 6955 sgd_solver.cpp:105] Iteration 0, lr = 0.001
I0406 16:26:47.323081 6955 solver.cpp:219] Iteration 20 (2.63591 iter/s, 7.58751s/20 iters), loss = 3.64746
I0406 16:26:47.323135 6955 solver.cpp:238] Train net output #0: loss = 3.64746 (* 1 = 3.64746 loss)
I0406 16:26:47.323144 6955 sgd_solver.cpp:105] Iteration 20, lr = 0.001
I0406 16:26:57.480957 6955 solver.cpp:219] Iteration 40 (1.96893 iter/s, 10.1578s/20 iters), loss = 3.55093
I0406 16:26:57.482547 6955 solver.cpp:238] Train net output #0: loss = 3.55093 (* 1 = 3.55093 loss)
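
One quick sanity check on the first test pass in the log above: a softmax classifier that predicts uniformly over N classes has a cross-entropy loss of ln N. The sketch below inverts that relation for the logged iteration-0 test loss of 4.61709. The class count of the fine-tuning dataset is not stated in this post, so the result is only an inference, and the function names are hypothetical.

```python
import math


def uniform_loss(num_classes):
    """Cross-entropy loss of a uniform prediction over num_classes classes."""
    return math.log(num_classes)


def implied_classes(loss):
    """Class count for which a uniform prediction would give this loss."""
    return math.exp(loss)


initial_test_loss = 4.61709  # "Test net output #2" at iteration 0 in the log
estimate = implied_classes(initial_test_loss)
print("ln(100) = %.5f" % uniform_loss(100))
print("implied class count = %.1f" % estimate)
```

The implied count comes out close to 100, which is consistent with the iteration-0 top-1 accuracy of 0.01 and with the log line saying the fc_kinetics source layer was ignored, meaning the final classification layer starts from fresh weights.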

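Below is how top-1 accuracy on the test net changes during training. It rises from 0.01 at iteration 0 to about 0.87 within the first hour, then fluctuates between roughly 0.86 and 0.91: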

gemfield@T630:/bigdata/gemfield/c3d/P3D/prototxt/LOG_TRAIN$ grep "accuracy/top-1" -n caffe.INFO
16294:I0406 16:25:44.160794 6955 net.cpp:380] accuracy -> accuracy/top-1
17096:I0406 16:25:44.166739 6955 net.cpp:242] This network produces output accuracy/top-1
17113:I0406 16:26:38.654268 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.01
17268:I0406 16:34:35.025169 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.541
17423:I0406 16:42:09.896800 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.652
17578:I0406 16:49:44.389753 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.676
17732:I0406 16:57:23.390195 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.745
17887:I0406 17:05:08.705271 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.77
18044:I0406 17:13:17.783442 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.833
18199:I0406 17:21:04.829033 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.87
18354:I0406 17:28:53.590893 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.864
18509:I0406 17:36:38.413074 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.86
18664:I0406 17:44:10.448470 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.885
18818:I0406 17:51:47.879482 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.882
18975:I0406 17:59:58.552734 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.874
19130:I0406 18:07:54.042325 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.904
19285:I0406 18:15:40.305274 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.891
19440:I0406 18:23:19.788849 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.883
19595:I0406 18:30:13.204354 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.888
19750:I0406 18:37:06.393033 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.887
19906:I0406 18:44:31.083456 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.876
20061:I0406 18:51:16.493669 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.887
20216:I0406 18:58:10.843623 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.888
20371:I0406 19:05:04.862282 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.874
20526:I0406 19:11:52.326022 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.883
20680:I0406 19:18:38.888053 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.883
20837:I0406 19:25:54.717254 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.884
20992:I0406 19:32:54.592254 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.894
21147:I0406 19:39:55.762933 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.894
21302:I0406 19:46:46.837859 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.897
21456:I0406 19:53:34.188211 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.878
21611:I0406 20:00:22.611583 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.887
21768:I0406 20:07:35.929399 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.905
21923:I0406 20:14:22.014807 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.875
22077:I0406 20:21:13.615586 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.897
22232:I0406 20:28:05.496280 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.893
22387:I0406 20:34:58.353957 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.888
22542:I0406 20:41:50.670209 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.883
22699:I0406 20:49:06.156272 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.866
22853:I0406 20:55:48.707999 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.88
23008:I0406 21:02:33.457456 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.911
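
The same values can be pulled out of the log programmatically instead of with grep. This is a minimal sketch, assuming the line format shown above; the function name `top1_values` is hypothetical, and the sample lines are copied from the grep output rather than read from caffe.INFO.

```python
import re

# Matches only the solver's test-output lines, not the net-setup lines
# that merely mention the blob name "accuracy/top-1".
TOP1 = re.compile(r"Test net output #\d+: accuracy/top-1 = ([0-9.]+)")


def top1_values(lines):
    """Return the top-1 accuracies found in the given log lines, in order."""
    values = []
    for line in lines:
        match = TOP1.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


sample = [
    "I0406 16:25:44.160794 6955 net.cpp:380] accuracy -> accuracy/top-1",
    "I0406 16:26:38.654268 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.01",
    "I0406 16:34:35.025169 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.541",
    "I0406 21:02:33.457456 6955 solver.cpp:398] Test net output #0: accuracy/top-1 = 0.911",
]
values = top1_values(sample)
best = max(values)
print(values, best)
```

To run it on a real log, pass an open file object, for example `top1_values(open("caffe.INFO"))`, since iterating a file yields its lines.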

7, end2end

