Let's define a few simple conventions up front, to keep things consistent and to highlight the key points:

# A key code statement is marked with a trailing comment of the form "//++++++++++XXXX+++++++++".

# A key code block is marked with a comment of the form "//++++++++++XXXX+++++++++" above it and a comment of the form "//+++++++++++++++++++" below it.

# All other comments are taken verbatim from the Caffe source code.

1. Straight to the point: start from the main() function in caffe.cpp:

int main(int argc, char** argv) {
  // ... omitted ...
  caffe::GlobalInit(&argc, &argv);
  if (argc == 2) {
#ifdef WITH_PYTHON_LAYER
    try {
#endif
      return GetBrewFunction(caffe::string(argv[1]))();  //++++++++++ KEY ++++++++++
#ifdef WITH_PYTHON_LAYER
    } catch (bp::error_already_set) {
      PyErr_Print();
      return 1;
    }
#endif
  } else {
    gflags::ShowUsageWithFlagsRestrict(argv[0], "tools/caffe");
  }
}

The "KEY" line: GetBrewFunction is what actually dispatches to one of the four commands train(), test(), time() and device_query(), using a function-pointer type (typedef int (*fun)()) looked up by name, depending on which sub-command was given on the command line. In this post we run ${CAFFE_PATH}/tools/caffe train ..., so the "KEY" line ends up calling the train() function. Next, let's look at how train() is implemented.
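
To make the dispatch mechanism concrete, here is a minimal, self-contained sketch of the same function-pointer registry pattern that caffe.cpp uses (the names g_brew_map and RegisterBrewFunction mirror the upstream tool, but this is an illustrative simplification, not the exact source):

#include <iostream>
#include <map>
#include <string>

typedef int (*BrewFunction)();                         // every sub-command has this signature
typedef std::map<std::string, BrewFunction> BrewMap;
BrewMap g_brew_map;                                    // command name -> function pointer

// Registering a function defines a static object whose constructor
// inserts the function into the map before main() runs.
#define RegisterBrewFunction(func)                          \
  namespace {                                               \
  struct __Registerer_##func {                              \
    __Registerer_##func() { g_brew_map[#func] = &func; }    \
  };                                                        \
  __Registerer_##func g_registerer_##func;                  \
  }

static BrewFunction GetBrewFunction(const std::string& name) {
  return g_brew_map.count(name) ? g_brew_map[name] : NULL;
}

int train() { std::cout << "train() called\n"; return 0; }
RegisterBrewFunction(train);

int main(int argc, char** argv) {
  if (argc == 2 && GetBrewFunction(argv[1])) {
    return GetBrewFunction(argv[1])();   // e.g. "./demo train" dispatches to train()
  }
  std::cout << "usage: " << argv[0] << " train\n";
  return 1;
}

RegisterBrewFunction(train) at the end of the real caffe.cpp works the same way: it stores a pointer to train() under the key "train", which is why GetBrewFunction(caffe::string(argv[1]))() can call it by name.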

2. The train() function

// Train / Finetune a model.
int train() {
  CHECK_GT(FLAGS_solver.size(), 0) << "Need a solver definition to train.";
  CHECK(!FLAGS_snapshot.size() || !FLAGS_weights.size())
      << "Give a snapshot to resume training or weights to finetune "
      "but not both.";
  vector<string> stages = get_stages_from_flags();

  caffe::SolverParameter solver_param;
  caffe::ReadSolverParamsFromTextFileOrDie(FLAGS_solver, &solver_param);  //++++++++++ read in the solver file ++++++++++

  // ... omitted ...

  if (gpus.size() > 1) {
    caffe::P2PSync<float> sync(solver, NULL, solver->param());
    sync.Run(gpus);
  } else {
    LOG(INFO) << "Starting Optimization";
    solver->Solve();  //++++++++++ the actual work ++++++++++
  }
  LOG(INFO) << "Optimization Done.";
  return 0;
}
RegisterBrewFunction(train);  //++++++++ register the function once it has been defined ++++++++

The main job of this function is to read the hyperparameters from the solver file, create the corresponding solver from them, and set CPU or GPU mode; the remaining configuration (stages, levels and so on) is worth a look if you need it. Calling the solver's Solve() function is where train() does its real work, so next we step into virtual void Solve(const char* resume_file = NULL) to keep following the main path.
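
For reference, the portion elided above ("// ... omitted ...") roughly does the following in the upstream caffe.cpp; treat this as a sketch, since the exact code differs between Caffe versions:

  // Create the solver that matches solver_param.type() ("SGD", "Adam", ...).
  shared_ptr<caffe::Solver<float> >
      solver(caffe::SolverRegistry<float>::CreateSolver(solver_param));

  if (FLAGS_snapshot.size()) {
    // Resume a previous run from a .solverstate file.
    LOG(INFO) << "Resuming from " << FLAGS_snapshot;
    solver->Restore(FLAGS_snapshot.c_str());
  } else if (FLAGS_weights.size()) {
    // Finetune: copy layer weights from an existing .caffemodel.
    CopyLayers(solver.get(), FLAGS_weights);
  }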

3. The Solve() function

void Solver<Dtype>::Solve(const char* resume_file) {
  // ... omitted ...

  // For a network that is trained by the solver, no bottom or top vecs
  // should be given, and we will just provide dummy vecs.
  int start_iter = iter_;
  Step(param_.max_iter() - iter_);  //++++++++++ the iteration loop ++++++++++
  // ... omitted ...
  LOG(INFO) << "Optimization Done.";
}

The key call is Step(), whose argument is the number of iterations to run. Caffe describes model training as an optimization of the net's weights; once the optimization loop has finished, Solve() runs one more forward pass over the optimized net to compute a final loss value and display it. We keep following the main path into void Solver<Dtype>::Step(int iters).
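
That final forward pass lives in the part of Solve() elided above; roughly (a sketch based on solver.cpp, not an exact copy), it looks like this:

  // After Step() returns: run one extra forward pass to report a final loss.
  if (param_.display() && iter_ % param_.display() == 0) {
    int average_loss = this->param_.average_loss();
    Dtype loss;
    net_->Forward(&loss);                              // forward only, no backward
    UpdateSmoothedLoss(loss, start_iter, average_loss);
    LOG(INFO) << "Iteration " << iter_ << ", loss = " << smoothed_loss_;
  }
  if (param_.test_interval() && iter_ % param_.test_interval() == 0) {
    TestAll();                                         // one last test pass as well
  }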

4. The Step(int iters) function

template <typename Dtype>
void Solver<Dtype>::Step(int iters) {
  const int start_iter = iter_;
  const int stop_iter = iter_ + iters;
  int average_loss = this->param_.average_loss();
  losses_.clear();
  smoothed_loss_ = 0;

  //++++++++++ the training iteration loop ++++++++++
  while (iter_ < stop_iter) {
    // zero-init the params
    net_->ClearParamDiffs();
    if (param_.test_interval() && iter_ % param_.test_interval() == 0
        && (iter_ > 0 || param_.test_initialization())
        && Caffe::root_solver()) {
      TestAll();
      if (requested_early_exit_) {
        // Break out of the while loop because stop was requested while testing.
        break;
      }
    }

    for (int i = 0; i < callbacks_.size(); ++i) {
      callbacks_[i]->on_start();
    }
    const bool display = param_.display() && iter_ % param_.display() == 0;
    net_->set_debug_info(display && param_.debug_info());
    // accumulate the loss and gradient
    Dtype loss = 0;
    for (int i = 0; i < param_.iter_size(); ++i) {
      loss += net_->ForwardBackward();  //+++++ one forward and one backward pass over the net +++++
    }
    loss /= param_.iter_size();
    // average the loss across iterations for smoothed reporting
    UpdateSmoothedLoss(loss, start_iter, average_loss);
    if (display) {
      LOG_IF(INFO, Caffe::root_solver()) << "Iteration " << iter_
          << ", loss = " << smoothed_loss_;
      const vector<Blob<Dtype>*>& result = net_->output_blobs();
      int score_index = 0;
      for (int j = 0; j < result.size(); ++j) {
        const Dtype* result_vec = result[j]->cpu_data();
        const string& output_name =
            net_->blob_names()[net_->output_blob_indices()[j]];
        const Dtype loss_weight =
            net_->blob_loss_weights()[net_->output_blob_indices()[j]];
        for (int k = 0; k < result[j]->count(); ++k) {
          ostringstream loss_msg_stream;
          if (loss_weight) {
            loss_msg_stream << " (* " << loss_weight
                << " = " << loss_weight * result_vec[k] << " loss)";
          }
          LOG_IF(INFO, Caffe::root_solver()) << " Train net output #"
              << score_index++ << ": " << output_name << " = "
              << result_vec[k] << loss_msg_stream.str();
        }
      }
    }
    for (int i = 0; i < callbacks_.size(); ++i) {
      callbacks_[i]->on_gradients_ready();
    }
    ApplyUpdate();

    // Increment the internal iter_ counter -- its value should always indicate
    // the number of times the weights have been updated.
    ++iter_;

    SolverAction::Enum request = GetRequestedAction();

    // Save a snapshot if needed.
    if ((param_.snapshot()
         && iter_ % param_.snapshot() == 0
         && Caffe::root_solver()) ||
        (request == SolverAction::SNAPSHOT)) {
      Snapshot();
    }
    if (SolverAction::STOP == request) {
      requested_early_exit_ = true;
      // Break out of training loop.
      break;
    }
  }
  //+++++++++++++++++++++++++++++++++++++++++++++++++
}
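
Two details of the loop above are easy to miss. First, the inner loop over param_.iter_size() accumulates gradients: with, say, a train-net batch_size of 32 and iter_size: 4 in the solver (numbers chosen purely for illustration), each iteration runs four forward/backward passes before the single ApplyUpdate(), so the effective batch size is 32 × 4 = 128 and the reported loss is the average over those four passes. Second, Snapshot() writes both the learned weights and the solver state; with snapshot_prefix: "mynet" it produces files along the lines of mynet_iter_N.caffemodel and mynet_iter_N.solverstate, the latter being what the --snapshot flag consumes to resume training.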

Starting from the call to ForwardBackward(), we formally enter the caffe::Net class; at the end of each iteration Step() also takes care of generating snapshots. From here on we are inside the Net / Layer / Blob levels where training is actually carried out, so finally let's go into the Net class and look at Dtype ForwardBackward().

5. The ForwardBackward() function

Dtype ForwardBackward() {
  Dtype loss;
  Forward(&loss);
  Backward();
  return loss;
}

This one is short and clear: a local variable loss records the loss returned by each pass; const vector<Blob<Dtype>*>& Forward(Dtype* loss = NULL) computes the loss on the forward pass, and then void Backward() runs the backward pass to compute the gradients that ApplyUpdate() later uses to adjust the weights of each node in the net.
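
Under the hood, Forward() and Backward() both just walk the net's layer list. The following is a trimmed sketch of the corresponding loops in net.cpp (debug output, callbacks and range checks removed; treat it as an approximation rather than the exact source):

template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  Dtype loss = 0;
  for (int i = start; i <= end; ++i) {
    // Each layer reads its bottom blobs and writes its top blobs;
    // loss layers return a non-zero loss contribution.
    loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
  }
  return loss;
}

template <typename Dtype>
void Net<Dtype>::BackwardFromTo(int start, int end) {
  for (int i = start; i >= end; --i) {
    if (layer_need_backward_[i]) {
      // Propagate gradients from top blobs back into bottom blobs and
      // into the layer's own parameter diffs.
      layers_[i]->Backward(top_vecs_[i], bottom_need_backward_[i], bottom_vecs_[i]);
    }
  }
}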
