PyTorch Tensors (Part 1)

Background

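What is a neural network mainly made of? Roughly three things: a variety of ops, the way those ops are combined, and a loss. Mapped onto an organization, these correspond to the people in it, its organizational structure, and its system of rewards and penalties. Seen this way, everything from a single person to an entire country has the same shape, much like a self-similar fractal. In a neural network, each op carries more or fewer parameters, and the Tensor is the foundation used to store and compute with those parameters.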

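Going a step further, PyTorch describes itself as a numpy with GPU support that can also differentiate automatically. What does that actually mean? In this series, gemfield discusses Tensor from the following angles:

1. How is a tensor created? What happens when one is created?

2. What is the difference between a CUDA tensor and a CPU tensor? How do you convert between the two, and what happens during the conversion?

3. When a method is called on a tensor, where is the real execution logic defined? How does execution differ between a CPU tensor and a CUDA tensor?

4. More importantly, what does a tensor's requires_grad flag mean? And what does a tensor's grad_fn attribute represent?

5. Finally, the question that cannot be avoided: what happens when backward() is called on a tensor? And what is the meaning of the grad attribute on each tensor?

Welcome, then, to the PyTorch Tensor series, which aims to be about the lowest-level explanation of tensors in PyTorch. This is the first article in the series and is devoted to how a Tensor is created.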

The PyTorch Tensor inheritance hierarchy in Python

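In the article "Gemfield：詳解Pytorch中的網路構造" (a detailed look at network construction in PyTorch), gemfield mentioned that all learnable parameters (such as weights and bias) are of type Parameter. The parent class of Parameter is torch.Tensor (Parameter differs from torch.Tensor in only four ways: it reimplements serialization, printing, and deep copy, and its requires_grad defaults to True), and the parent class of torch.Tensor is in turn torch._C._TensorBase. The hierarchy looks like this: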

# The Parameter class is defined in Python
class Parameter(torch.Tensor)

# The torch.Tensor class is defined in Python
class Tensor(torch._C._TensorBase)

# The Variable class is defined in C++
struct TORCH_API Variable : public at::Tensor

//PyObject* Py_InitModule(char *name, PyMethodDef *methods)
// creates torch._C
Py_InitModule("torch._C", methods.data());

// creates torch._C._TensorBase
PyModule_AddObject(module, "_TensorBase", (PyObject *)&THPVariableType);
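The Python side of this hierarchy can be mimicked with three plain classes. This is only a sketch: _TensorBase below is an ordinary Python class standing in for the C++-backed torch._C._TensorBase, and the constructor arguments and the repr text are hypothetical. The one behavior taken from the text is that Parameter defaults requires_grad to True.

```python
# Toy stand-ins for the real classes; nothing here imports torch.
class _TensorBase:
    """Stands in for torch._C._TensorBase, which really lives in C++."""

    def __init__(self, data, requires_grad=False):
        self.data = list(data)
        self.requires_grad = requires_grad


class Tensor(_TensorBase):
    """Stands in for torch.Tensor, defined in Python on top of _TensorBase."""


class Parameter(Tensor):
    """Like Tensor, except that requires_grad defaults to True."""

    def __init__(self, data, requires_grad=True):
        super().__init__(data, requires_grad)

    def __repr__(self):
        # Parameter customizes printing; this exact text is an assumption.
        return "Parameter containing: " + repr(self.data)


weights = Parameter([0.1, 0.2])
plain = Tensor([0.1, 0.2])
mro_names = [cls.__name__ for cls in Parameter.__mro__]
```

Inspecting mro_names shows the same chain as the listing above: Parameter, then Tensor, then _TensorBase.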

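To understand PyTorch's Tensor, we need to understand this hierarchy and how each parent differs from its child. Gemfield starts from the very root, torch._C. When you import torch, Python's import rules cause the logic in torch/__init__.py to run: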

from torch._C import *
......
__all__ += [name for name in dir(_C) if name[0] != '_' and not name.endswith('Base')]

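The key here is torch._C, because the Tensor class inherits from torch._C._TensorBase. Described in order of birth (initialization order): first comes torch._C, then torch._C._TensorBase, and then torch.Tensor, which inherits from torch._C._TensorBase. But torch._C is C++ code, so for Python to be able to import torch._C the symbol has to be exported according to Python's extension conventions, and that is what PyTorch does. On Python 2 it uses the Py_InitModule API; on Python 3 it uses the PyModule_Create API. Either way, the API creates the Python object torch._C: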

//name is "torch._C"
PyObject* Py_InitModule(char *name, PyMethodDef *methods)

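It also registers a list of functions on torch._C, and that list is very long. Each function is represented by a PyMethodDef, and these are stored across several very long lists. The names of these symbols all show up in dir(torch) (except those with a "_" prefix or a "Base" suffix). With this long methods list in hand, the CPython API can create the new module object, and torch._C is born.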
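The export filter mentioned above can be tried on a small stand-in module. In this sketch, fake_C and its four attributes are hypothetical; only the filter expression is taken from torch/__init__.py.

```python
import types

# A hypothetical stand-in for the C extension module torch._C.
fake_C = types.ModuleType("fake_C")
fake_C.add = lambda a, b: a + b      # public function: exported
fake_C._TensorBase = object          # leading underscore: hidden
fake_C.FloatStorageBase = object     # "Base" suffix: hidden
fake_C.has_cuda = False              # public attribute: exported

# Same filter expression as in torch/__init__.py.
exported = [name for name in dir(fake_C)
            if name[0] != '_' and not name.endswith('Base')]
```

Only add and has_cuda survive the filter; the dunder attributes that every module carries are dropped by the same leading-underscore test.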

The next job is to inject some (in fact many) members into the torch._C object. One of them is torch._C._TensorBase, which is initialized by the following call:

//from civilnet's torch/csrc/autograd/python_variable.cpp
bool THPVariable_initModule(PyObject *module)
{
  static std::vector<PyMethodDef> methods;
  THPUtils_addPyMethodDefs(methods, torch::autograd::variable_methods);
  THPUtils_addPyMethodDefs(methods, extra_methods);
  THPVariableType.tp_methods = methods.data();
  if (PyType_Ready(&THPVariableType) < 0)
    return false;
  Py_INCREF(&THPVariableType);
  PyModule_AddObject(module, "_TensorBase", (PyObject *)&THPVariableType);
  torch::autograd::initTorchFunctions(module);
  return true;
}

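When THPVariable_initModule runs, it uses

PyModule_AddObject(module, "_TensorBase", (PyObject *)&THPVariableType);

to register THPVariableType as torch._C._TensorBase. So now you know that torch._C._TensorBase is THPVariableType in C++ (its type is PyTypeObject, the most important struct in Python's object system). Having seen how the Python class torch._C._TensorBase is registered, the next thing to look at is the functions registered on it: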

THPUtils_addPyMethodDefs(methods, torch::autograd::variable_methods);
THPUtils_addPyMethodDefs(methods, extra_methods);
......
torch::autograd::initTorchFunctions(module);

Here, torch::autograd::variable_methods contains the following 358 methods:

//from syszux's torch/csrc/autograd/generated/python_variable_methods.cpp
//torch::autograd::variable_methods

"__add__", (PyCFunction)THPVariable_add
......

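In addition, right after torch._C._TensorBase is initialized, torch._C._VariableFunctions (which contains a large number of methods) is initialized as well; torch._C._VariableFunctions mainly exposes symbols for torch/functional.py to use. For torch._C._TensorBase, initialization ends here. One big question is still unanswered, though: where are the 359 methods on torch._C._TensorBase implemented? Shouldn't that be the most important part? And wait: since the start of this article we have already met three groups of functions (the methods of torch._C, the methods of torch._C._TensorBase, and the methods of torch._C._VariableFunctions). In this section we focus on the methods of torch._C._TensorBase. They all come from torch::autograd::variable_methods. Does that ring a bell? As described in the article "Gemfield：PyTorch Autograd代碼的動態生成" (on the dynamic generation of PyTorch Autograd code), the implementations of these methods are generated, and the inline dispatch functions defined in the generated python_variable_methods_dispatch.h forward the logic of these variable_methods to the corresponding methods of the Tensor class, such as the native and cuda implementations.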
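The two ideas in this section, a class assembled from a table of methods and thin bindings that dispatch to a backend, can be sketched in plain Python. Everything below is a toy model under stated assumptions: KERNELS, dispatch, variable_add, and the list-based payload are hypothetical names for illustration, not PyTorch's generated code.

```python
# Hypothetical backend kernels; the real ones live in ATen (native, cuda, ...).
def add_cpu(a, b):
    return [x + y for x, y in zip(a, b)]


def add_cuda(a, b):
    raise RuntimeError("no CUDA device in this sketch")


KERNELS = {("add", "cpu"): add_cpu, ("add", "cuda"): add_cuda}


def dispatch(op, self, other):
    """Plays the role of a generated inline dispatch function."""
    kernel = KERNELS[(op, self.device)]
    return type(self)(kernel(self.values, other.values), self.device)


def variable_add(self, other):
    """Plays the role of THPVariable_add: a thin binding that dispatches."""
    return dispatch("add", self, other)


def init(self, values, device="cpu"):
    self.values = list(values)
    self.device = device


# The "method table": name -> function, loosely like an array of PyMethodDef.
variable_methods = {"__init__": init, "__add__": variable_add}

# Build the class from the table, loosely like setting tp_methods and then
# calling PyType_Ready.
TensorBase = type("_TensorBase", (object,), variable_methods)

result = TensorBase([1, 2]) + TensorBase([10, 20])
```

The binding itself contains no arithmetic; it only selects a kernel by device, which is the division of labor the paragraph above describes.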

The PyTorch Tensor inheritance hierarchy in C++

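The important attributes of a tensor are mainly its dimension information, its values, its grad, its type, its backend, and so on. More importantly, a tensor needs careful memory management. In C++, a tensor is built from these low-level classes: DataPtr, StorageImpl, Storage, TensorImpl, Tensor, Variable::Impl, Variable, and AutogradMeta. The hierarchy looks like this: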

# vertical arrows mean inheritance, horizontal arrows mean "is contained in"

DataPtr -> StorageImpl -> Storage -> TensorImpl     -> Tensor
                                         |                |
                                         v                v
                                     Variable::Impl -> Variable

Gemfield goes through them one by one, from the bottom up:

1. DataPtr

This class sits at the very bottom and directly manages the memory a tensor needs.

class UniqueVoidPtr {
  void* data_;
  std::unique_ptr<void, DeleterFnPtr> ctx_;
};

class DataPtr {
  c10::detail::UniqueVoidPtr ptr_;
  Device device_;
};

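The device_ member specifies whether the tensor's storage lives on a cpu or a cuda device. That said, Gemfield feels there is no real need for DataPtr to add a layer around UniqueVoidPtr here, and wonders whether it should be changed. A DataPtr instance is built with the constructor DataPtr(void* data, void* ctx, DeleterFnPtr ctx_deleter, Device device); in practice this construction is carried out by an Allocator. Taking a tensor on the CPU as an example, it is done by the allocate function of the DefaultCPUAllocator class: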

at::DataPtr allocate(size_t nbytes) const override {
  ......
  //DataPtr(void* data, void* ctx, DeleterFnPtr ctx_deleter, Device device)
  return {data, data, &Delete, at::Device(at::DeviceType::CPU)};
}

Assuming this call happens on x86 Linux, the data in the final return comes from a posix_memalign call (a libc function that allocates aligned memory), Delete wraps a call to free(), and device_ is at::Device(at::DeviceType::CPU).
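The ownership idea behind DataPtr (a buffer, a context passed to a deleter, and a device tag) can be illustrated with a toy model. This is a sketch under assumptions: ToyCPUAllocator, release, and the freed list are hypothetical, and a bytearray stands in for memory that the real allocator obtains from posix_memalign.

```python
class DataPtr:
    """Toy DataPtr: a buffer, a context handed to the deleter, and a device."""

    def __init__(self, data, ctx, ctx_deleter, device):
        self.data = data
        self._ctx = ctx
        self._ctx_deleter = ctx_deleter
        self.device = device

    def release(self):
        """Run the deleter at most once, as a unique_ptr would on destruction."""
        if self._ctx is not None:
            self._ctx_deleter(self._ctx)
            self._ctx = None
            self.data = None


class ToyCPUAllocator:
    """Hypothetical stand-in for DefaultCPUAllocator."""

    def __init__(self):
        self.freed = []  # records the size of every buffer handed back

    def _delete(self, buf):
        self.freed.append(len(buf))

    def allocate(self, nbytes):
        data = bytearray(nbytes)  # stands in for posix_memalign
        # Same argument order as DataPtr(data, ctx, ctx_deleter, device).
        return DataPtr(data, data, self._delete, "cpu")


allocator = ToyCPUAllocator()
ptr = allocator.allocate(7 * 19 * 4)  # room for a 7x19 tensor of 4-byte floats
size_before = len(ptr.data)
ptr.release()
ptr.release()  # the second call is a no-op
```

As in the C++ listing, data and ctx are the same object for the CPU case, so the deleter receives exactly the buffer that was allocated.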

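2. StorageImpl and Storage

StorageImpl inherits from intrusive_ptr_target in order to reuse the reference counter implemented in that parent class. Combined with the smart pointer c10::intrusive_ptr (which manages the memory but does not itself hold the count), this gives an "intrusive" reference-counted pointer. The declarations are as follows: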

struct StorageImpl final : public c10::intrusive_ptr_target {
  ......
  caffe2::TypeMeta data_type_;
  DataPtr data_ptr_;
  int64_t numel_;
  bool resizable_;
  Allocator* allocator_;
  ......
};

struct Storage{
......
protected:
c10::intrusive_ptr<StorageImpl> storage_impl_;
};

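As you can see, StorageImpl uses the DataPtr described above. Another member worth noting is allocator_. In PyTorch, two allocators are defined through the REGISTER_ALLOCATOR macro, one for cpu and one for cuda:

REGISTER_ALLOCATOR(CPU, &g_cpu_alloc);
REGISTER_ALLOCATOR(CUDA, &g_cuda_alloc);

Memory for tensors on the cpu is allocated with g_cpu_alloc, whose type is DefaultCPUAllocator.

Storage and StorageImpl are related through the bridge design pattern, mainly to keep the ABI compatible.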
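The intrusive reference counting described in this subsection can be modeled in a few lines. In this sketch the class and method names (IntrusivePtrTarget, IntrusivePtr, clone, reset, ToyStorageImpl) are hypothetical Python stand-ins; the point is only that the count is stored inside the pointee while the handle drives it.

```python
class IntrusivePtrTarget:
    """The pointee stores its own reference count (hence 'intrusive')."""

    def __init__(self):
        self.refcount = 0
        self.destroyed = False

    def release_resources(self):
        self.destroyed = True


class IntrusivePtr:
    """A handle that adjusts the count living inside the target."""

    def __init__(self, target):
        self.target = target
        target.refcount += 1

    def clone(self):
        return IntrusivePtr(self.target)

    def reset(self):
        if self.target is None:
            return
        self.target.refcount -= 1
        if self.target.refcount == 0:
            # The last handle is gone, so the target cleans up.
            self.target.release_resources()
        self.target = None


class ToyStorageImpl(IntrusivePtrTarget):
    def __init__(self, numel):
        super().__init__()
        self.numel = numel


impl = ToyStorageImpl(numel=133)
first = IntrusivePtr(impl)
second = first.clone()
count_with_two_handles = impl.refcount
first.reset()
alive_after_first_reset = not impl.destroyed
second.reset()
```

Because the counter lives in the object itself, any handle created from a raw reference to impl sees the same count, which is the property an external control block would not give.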

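3. TensorImpl and Tensor

These two are also related through the bridge pattern. Note that the tensor here is the at (ATen) tensor. Because PyTorch merged caffe2, there are currently c10 tensors, caffe2 tensors, and aten tensors, which is rather messy. They are expected to be unified eventually, but for now the various dependencies make a merge difficult.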

struct TensorImpl : public c10::intrusive_ptr_target {
  ......
  Storage storage_;
  std::unique_ptr<c10::AutogradMetaInterface> autograd_meta_ = nullptr;
  SmallVector<int64_t,5> sizes_;
  SmallVector<int64_t,5> strides_;
  int64_t storage_offset_ = 0;
  int64_t numel_ = 1;
  caffe2::TypeMeta data_type_;
  TensorTypeId type_id_;
  bool is_contiguous_ = true;
  bool is_variable_ = false;
  bool is_wrapped_number_ = false;
  bool allow_tensor_metadata_change_ = true;
  bool reserved_ = false;
  ......
};

class Tensor {
......
protected:
c10::intrusive_ptr<TensorImpl, UndefinedTensorImpl> impl_;
};

namespace detail {
template <typename T, typename... Args>
Tensor make_tensor(Args&&... args) {
return Tensor(c10::make_intrusive<T>(std::forward<Args>(args)...));
}
}

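TensorImpl has a member named autograd_meta_, which serves automatic differentiation. If a tensor is a Variable, this member is initialized (for example by the make_variable function) and plays its role later on; if it is just a tensor, the value is nullptr.

make_tensor is simply a helper function. It uses perfect forwarding in a C++ template to make it easy to create a tensor whose impl_ is an "intrusive" reference-counted smart pointer.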
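The sizes_, strides_, storage_offset_, and numel_ members together describe how a multi-dimensional tensor maps onto flat storage. The following sketch shows the usual row-major arithmetic for a contiguous tensor; the function names are hypothetical and the code is an illustration of the idea rather than ATen's implementation.

```python
def contiguous_strides(sizes):
    """Row-major strides, in elements, for a contiguous tensor."""
    strides = [1] * len(sizes)
    for dim in range(len(sizes) - 2, -1, -1):
        strides[dim] = strides[dim + 1] * sizes[dim + 1]
    return strides


def numel(sizes):
    """Total number of elements; 1 for a zero-dimensional tensor."""
    total = 1
    for size in sizes:
        total *= size
    return total


def storage_index(index, strides, storage_offset=0):
    """Position in the flat storage of the element at a multi-dim index."""
    return storage_offset + sum(i * s for i, s in zip(index, strides))


sizes = [7, 19]
strides = contiguous_strides(sizes)
last = storage_index((6, 18), strides)
```

For the 7x19 example used later in the article, the last element lands at position numel - 1 of the storage, as expected for a contiguous layout.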

4. Variable and Variable::Impl

Variable uses the PIMPL (Pointer to IMPLementation) pattern to hide its implementation details (that is, Variable::Impl):

struct Variable : public at::Tensor {
  ......
  struct AutogradMeta;
  struct Impl;
  struct DifferentiableViewImpl;
  struct DifferentiableViewMeta;
};

struct Variable::Impl : public at::TensorImpl {
......
at::Tensor data_;
};

//Variable's constructor, by gemfield
inline Variable::Variable(c10::intrusive_ptr<Variable::Impl> self)
: at::Tensor(std::move(self)) {}

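After construction, the impl_ of a Variable instance points to a Variable::Impl instance. Variable::Impl inherits from at::TensorImpl and also contains an at::Tensor... doesn't that mean there are two copies of what TensorImpl holds (Storage, for example)? Going a step further: in make_variable, the at::Tensor contained in the Variable::Impl instance is assigned the tensor passed in from outside (which has a storage instance), while the Variable::Impl instance also has a Storage instance obtained through inheritance. Aren't there two of them?

Variable is a subclass of Tensor. Compared with Tensor (which is essentially still a multi-dimensional array), it adds the automatic differentiation system, which is implemented by the AutogradMeta class. There are two kinds of Variable: a tensor defined directly by the user (a weight parameter, for example) is a leaf in the dynamic graph, while an intermediate value computed from user-defined variables is an interior variable. Each Variable also contains another variable (grad_ inside autograd_meta_, a member provided by the parent class TensorImpl), called grad (gradient). If the variable is a leaf, its gradient is accumulated as the dynamic graph runs.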
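The leaf versus interior distinction, and the accumulation of gradients on leaves, can be illustrated with a scalar toy model. Everything in this sketch is hypothetical (ToyVariable, accumulate_grad, and mul are illustrative names, and the backward closure is a drastic simplification of a real grad_fn); it only demonstrates the rule stated above.

```python
class ToyVariable:
    """Hypothetical scalar Variable; only the leaf/interior split is modeled."""

    def __init__(self, value, grad_fn=None, requires_grad=False):
        self.value = value
        self.grad_fn = grad_fn      # None for leaves
        self.requires_grad = requires_grad
        self.grad = None            # plays the role of grad_

    @property
    def is_leaf(self):
        return self.grad_fn is None

    def accumulate_grad(self, incoming):
        """Leaves sum every gradient that reaches them."""
        if self.is_leaf and self.requires_grad:
            self.grad = incoming if self.grad is None else self.grad + incoming


def mul(a, b):
    """Returns an interior variable that remembers how it was computed."""
    def backward(grad_output):
        a.accumulate_grad(grad_output * b.value)
        b.accumulate_grad(grad_output * a.value)
    return ToyVariable(a.value * b.value, grad_fn=backward,
                       requires_grad=a.requires_grad or b.requires_grad)


w = ToyVariable(3.0, requires_grad=True)   # leaf, e.g. a weight
y = mul(w, w)                              # interior: y = w * w
y.grad_fn(1.0)                             # propagate dy/dy = 1 backwards
```

Because w is used twice in y = w * w, two contributions of 3.0 reach it and are summed, giving the expected derivative 2 * w.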

5. Variable::AutogradMeta

Everything related to autograd is defined in this class, and an instance of it is held by TensorImpl/Variable::Impl:

struct Variable::AutogradMeta : public c10::AutogradMetaInterface {
  std::string name;
  Variable grad_;
  std::shared_ptr<Function> grad_fn_;
  std::weak_ptr<Function> grad_accumulator_;
  VariableVersion version_counter_;
  std::vector<std::shared_ptr<FunctionPreHook>> hooks_;
  bool requires_grad_;
  bool is_view_;
  uint32_t output_nr_;
  PyObject* pyobj_ = nullptr;
  std::mutex mutex_;
};

struct Variable::DifferentiableViewMeta : public Variable::AutogradMeta {
Variable base_;
uint32_t attr_version;
};

struct Variable::DifferentiableViewImpl : public Variable::Impl {};

//helper for constructing a variable
inline Variable make_variable(at::Tensor data,bool requires_grad = false,bool allow_tensor_metadata_change = true) {
auto data_impl_copy = data.getIntrusivePtr()->shallow_copy_and_detach();
data_impl_copy->set_allow_tensor_metadata_change(allow_tensor_metadata_change);
auto data_copy = at::Tensor(data_impl_copy);
auto autograd_meta = c10::guts::make_unique<Variable::AutogradMeta>();
return Variable(c10::make_intrusive<Variable::Impl>(data_copy, std::move(autograd_meta), requires_grad));
}

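Each Variable has its own AutogradMeta struct (exclusive ownership is guaranteed by the std::unique_ptr smart pointer), which handles all the work related to automatic differentiation. A Variable is usually instantiated through the make_variable helper function (in the torch::autograd namespace), which produces a Variable instance from the input Tensor and requires_grad: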

Variable(c10::make_intrusive<Variable::Impl>(data_copy, std::move(autograd_meta), requires_grad))
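What make_variable does (copy the tensor's metadata shallowly, keep sharing the storage, attach fresh autograd metadata) can be sketched as follows. This is a simplified model under assumptions: ToyTensorImpl and ToyAutogradMeta are hypothetical, and the sketch attaches the metadata to the copied impl directly rather than wrapping it in a separate Variable::Impl as the C++ code does.

```python
import copy


class ToyTensorImpl:
    def __init__(self, storage, sizes):
        self.storage = storage            # buffer shared between copies
        self.sizes = list(sizes)          # per-tensor metadata
        self.allow_tensor_metadata_change = True
        self.autograd_meta = None         # stays None for a plain tensor


class ToyAutogradMeta:
    def __init__(self, requires_grad):
        self.requires_grad = requires_grad
        self.grad = None
        self.grad_fn = None


def make_variable(data_impl, requires_grad=False,
                  allow_tensor_metadata_change=True):
    # copy.copy duplicates the impl object but not the storage it refers to,
    # which is the spirit of shallow_copy_and_detach().
    impl_copy = copy.copy(data_impl)
    impl_copy.sizes = list(data_impl.sizes)
    impl_copy.allow_tensor_metadata_change = allow_tensor_metadata_change
    # Every variable gets its own, freshly created autograd metadata.
    impl_copy.autograd_meta = ToyAutogradMeta(requires_grad)
    return impl_copy


tensor_impl = ToyTensorImpl(storage=bytearray(8), sizes=[2])
variable_impl = make_variable(tensor_impl, requires_grad=True)
```

The input tensor is left untouched: it still has no autograd metadata, while the variable refers to the same underlying buffer.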

How to create a PyTorch Tensor

There are two main ways to create a tensor directly in PyTorch: call one of torch's functions (empty, rand, zeros, tensor...), or instantiate the torch.Tensor class.

# First way
gemfield = torch.empty(7, 19)
gemfield = torch.rand(7, 19)
gemfield = torch.zeros(7, 19, dtype=torch.long)
gemfield = torch.tensor([7.0, 19])

# Second way
gemfield = torch.Tensor([1,2])

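Creating a Tensor with torch functions

For example, torch.empty(7,19) creates a 7x19 two-dimensional matrix. We use empty as the example here; the other functions work similarly. So how does the empty symbol end up on the torch module? torch/__init__.py contains the following code: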

for name in dir(_C._VariableFunctions):
    if name.startswith('__'):
        continue
    globals()[name] = getattr(_C._VariableFunctions, name)

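In other words, every symbol in _C._VariableFunctions is injected into the torch module (except those with a __ prefix). _C._VariableFunctions itself is defined in torch/csrc/autograd/generated/python_torch_functions.cpp (generated from a template):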
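The injection loop can be tried against a small stand-in. In this sketch the _VariableFunctions class and its two static methods are hypothetical, and a plain dict named namespace plays the role of globals() in torch/__init__.py so the example does not pollute its own module.

```python
class _VariableFunctions:
    """Hypothetical stand-in for torch._C._VariableFunctions."""

    @staticmethod
    def empty(*size):
        return ("empty", size)

    @staticmethod
    def zeros(*size):
        return ("zeros", size)


namespace = {}  # plays the role of globals() in torch/__init__.py

for name in dir(_VariableFunctions):
    if name.startswith('__'):
        continue
    namespace[name] = getattr(_VariableFunctions, name)

injected = sorted(namespace)
call_result = namespace["empty"](7, 19)
```

After the loop, the functions are callable straight from the namespace, which is why user code can write torch.empty rather than torch._C._VariableFunctions.empty.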

static PyMethodDef torch_functions[] = {
  {"from_numpy", (PyCFunction)THPVariable_from_numpy, METH_STATIC | METH_O, NULL},
  {"add", (PyCFunction)THPVariable_add, METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  {"empty", (PyCFunction)THPVariable_empty, METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  {"rand", (PyCFunction)THPVariable_rand, METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  {"tensor", (PyCFunction)THPVariable_tensor, METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  {"zeros", (PyCFunction)THPVariable_zeros, METH_VARARGS | METH_KEYWORDS | METH_STATIC, NULL},
  ......
};

static PyTypeObject THPVariableFunctions = {
PyVarObject_HEAD_INIT(NULL, 0)
"torch._C._VariableFunctions", /* tp_name */
......
torch_functions, /* tp_methods */
......
}

So when we use torch.empty/rand/zeros to initialize a tensor, we are actually calling the C++ functions THPVariable_empty/THPVariable_rand/THPVariable_zeros. Taking THPVariable_empty as the example again:

const auto options = TensorOptions()
    .dtype(dtype)
    .device(device)
    .layout(r.layout(3).layout)
    .requires_grad(r.toBool(5));
// size is (7,19) here, by gemfield
return wrap(dispatch_empty(size, options));

First, a word about TensorOptions. A tensor has these attributes: requires_grad, is_variable, device (cpu, CUDA...), dtype (int, float...), and layout (strided, sparse). All of them are encapsulated by TensorOptions.
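The chained-setter style seen in the TensorOptions snippet can be mirrored with an immutable dataclass. This is a sketch only: ToyTensorOptions, its with_* method names, and its default values are assumptions made for illustration and are not the C++ API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyTensorOptions:
    """Hypothetical mirror of TensorOptions; the defaults are assumptions."""
    dtype: str = "float32"
    device: str = "cpu"
    layout: str = "strided"
    requires_grad: bool = False
    is_variable: bool = False

    # Each setter returns a modified copy, so calls can be chained.
    def with_dtype(self, dtype):
        return replace(self, dtype=dtype)

    def with_device(self, device):
        return replace(self, device=device)

    def with_requires_grad(self, flag):
        return replace(self, requires_grad=flag)


defaults = ToyTensorOptions()
options = (defaults.with_dtype("float64")
                   .with_device("cuda")
                   .with_requires_grad(True))
```

Bundling the attributes into one value object keeps factory signatures short: a function such as empty only needs a size and an options object.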

Next, the dispatch_empty function. As its name suggests, its job is to dispatch: it forwards torch's empty call to the C++ call torch::empty:

inline Tensor dispatch_empty(IntList size, const TensorOptions & options) {
  return torch::empty(size, options);
}

torch::empty is implemented in torch/csrc/autograd/generated/variable_factories.h, as shown below:

inline at::Tensor empty(at::IntList size, const at::TensorOptions & options = {}) {
  at::Tensor tensor = at::empty(size, at::TensorOptions(options).is_variable(false));
  at::Tensor result = autograd::make_variable(tensor, options.requires_grad());
  return result;
}

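It calls at::empty and then autograd::make_variable. The call stack for building a tensor is now fully unfolded, and it consists of three main steps: 1. at::empty; 2. autograd::make_variable; 3. wrap.

1. at::empty

at::empty is implemented in the ATen module. At this point we cannot help recalling the Type hierarchy described in the gemfield column article "Gemfield：PyTorch ATen代碼的動態生成" (on the dynamic generation of PyTorch ATen code). Yes, that is where we are now. The getType call, which returns a TypeExtendedInterface, consults the type_registry table to find the Type to use. The lookup keys are the backend (device) and the scalar type (dtype), and the result is one of those 30 types.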

//ATen/Functions.h, generated from a template, by civilnet
static inline Tensor empty(IntArrayRef size, const TensorOptions & options) {
  return at::getType(options).empty(size, options);
}

//ATen/context.cpp
TypeExtendedInterface& getType(TensorOptions options) {
return globalContext().getType(
options.backend(), typeMetaToScalarType(options.dtype()), options.is_variable());
}
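The registry lookup described above, keyed by backend and scalar type, can be sketched as a dictionary of Type objects. This is a toy model: ToyType, the BACKENDS and SCALAR_TYPES lists, and the string returned by empty are hypothetical, and the lists are illustrative rather than a reproduction of the actual set of types.

```python
class ToyType:
    """Hypothetical stand-in for one concrete Type such as CPUFloatType."""

    def __init__(self, backend, scalar_type):
        self.backend = backend
        self.scalar_type = scalar_type

    def empty(self, size):
        return "%s%sType.empty(%s)" % (self.backend, self.scalar_type, list(size))


# Illustrative lists only; not the real enumeration.
BACKENDS = ["CPU", "CUDA"]
SCALAR_TYPES = ["Byte", "Char", "Double", "Float", "Int", "Long", "Short", "Half"]

# The registry: one Type object per (backend, scalar type) pair.
type_registry = {(b, s): ToyType(b, s) for b in BACKENDS for s in SCALAR_TYPES}


def get_type(backend, scalar_type):
    try:
        return type_registry[(backend, scalar_type)]
    except KeyError:
        raise RuntimeError("no Type registered for %s/%s" % (backend, scalar_type))


def empty(size, backend="CPU", scalar_type="Float"):
    """Mirrors the shape of at::empty: look up the Type, then forward to it."""
    return get_type(backend, scalar_type).empty(size)


description = empty((7, 19))
```

The factory function itself stays generic; all backend-specific and dtype-specific behavior sits behind the object returned from the registry.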

Here we assume the one used is CPUFloatType:

//from syszux's ATen CPUFloatType.cpp
Tensor CPUFloatType::empty(IntArrayRef size, const TensorOptions & options) const {
  const DeviceGuard device_guard(options.device());
  return at::native::empty_cpu(/* actuals */ size, options);
}

//ATen/native/TensorFactories.cpp
Tensor empty_cpu(IntList size, const TensorOptions& options) {
auto* allocator = at::getCPUAllocator();
int64_t nelements = prod_intlist(size);
auto dtype = options.dtype();
auto storage_impl = c10::make_intrusive<StorageImpl>(
dtype,nelements,
allocator->allocate(nelements * dtype.itemsize()),
allocator,true);

auto tensor = detail::make_tensor<TensorImpl>(storage_impl, at::CPUTensorId(), false);
// Default TensorImpl has size [0]
if (size.size() != 1 || size[0] != 0) {
tensor.unsafeGetTensorImpl()->set_sizes_contiguous(size);
}
return tensor;
}

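We can now see the whole process of building a float tensor on the cpu. Memory for cpu tensors is allocated by DefaultCPUAllocator, which on Linux uses posix_memalign(&data, gCaffe2Alignment, nbytes) underneath, a memory allocation function implemented in libc; cuda tensors are handled by code written for cuda instead. After all this, a tensor has been produced: its impl_ member has been initialized to a TensorImpl instance, and within that TensorImpl instance the type, the number of elements, the allocator, and the allocated memory (DataPtr) have all been initialized.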
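The control flow of empty_cpu (multiply out the sizes, allocate that many elements, then set the sizes unless they are the default [0]) can be followed in a toy version. This sketch is built on assumptions: ToyStorage and ToyTensor are hypothetical, Python's array module stands in for the allocator, and the buffer is zero-filled for simplicity, whereas a real empty tensor has uninitialized contents.

```python
import array


def prod_intlist(sizes):
    total = 1
    for size in sizes:
        total *= size
    return total


class ToyStorage:
    def __init__(self, typecode, nelements):
        # Zero-filled for simplicity; real empty() memory is uninitialized.
        self.buffer = array.array(typecode, [0]) * nelements


class ToyTensor:
    def __init__(self, storage):
        self.storage = storage
        self.sizes = [0]  # a default TensorImpl has size [0]

    def set_sizes_contiguous(self, sizes):
        self.sizes = list(sizes)


def empty_cpu(sizes, typecode="f"):
    nelements = prod_intlist(sizes)
    storage = ToyStorage(typecode, nelements)
    tensor = ToyTensor(storage)
    # Same guard as in the C++ listing: skip the call for size [0].
    if len(sizes) != 1 or sizes[0] != 0:
        tensor.set_sizes_contiguous(sizes)
    return tensor


t = empty_cpu([7, 19])
nbytes = len(t.storage.buffer) * t.storage.buffer.itemsize
```

The nbytes computation is the same product, number of elements times item size, that the C++ code passes to allocator->allocate.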

2. autograd::make_variable

autograd::make_variable is implemented as follows:

inline Variable make_variable(at::Tensor data, bool requires_grad = false, bool allow_tensor_metadata_change = true) {
  if (data.defined()) {
    auto data_impl_copy = data.getIntrusivePtr()->shallow_copy_and_detach();
    data_impl_copy->set_allow_tensor_metadata_change(allow_tensor_metadata_change);
    auto data_copy = at::Tensor(data_impl_copy);
    auto autograd_meta = c10::guts::make_unique<Variable::AutogradMeta>();
    return Variable(c10::make_intrusive<Variable::Impl>(data_copy, std::move(autograd_meta), requires_grad));
  }
  return Variable();
}

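This was covered earlier. Its main job is to instantiate a Variable whose data_ member (of type at::Tensor) is initialized to the tensor built above.

3. wrap

Finally, the return value of dispatch_empty is wrapped by the wrap function, which is defined as follows: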

inline PyObject* wrap(at::Tensor tensor) {
  return THPVariable_Wrap(Variable(std::move(tensor)));
}

PyObject * THPVariable_Wrap(Variable var)
{
return THPVariable_NewWithVar((PyTypeObject *)THPVariableClass, std::move(var));
}

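As you can see, wrap calls THPVariable_Wrap, and THPVariable_Wrap, after a few simple checks (mainly whether a Python object has already been created for the same tensor), calls THPVariable_NewWithVar to create the tensor in Python. Let us pause here and remember the name THPVariable_NewWithVar. Remember: when a tensor is created with torch's empty/zeros/rand... functions, the call stack ends up at THPVariable_NewWithVar.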
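The "has this tensor already been wrapped" check can be sketched with a cached back-pointer. In this toy model, ToyVariable, ToyPyTensor, and new_with_var are hypothetical stand-ins for the C++ Variable, the Python-side object, and THPVariable_NewWithVar; the pyobj attribute plays the role of the pyobj_ field seen earlier in AutogradMeta.

```python
class ToyVariable:
    """Hypothetical C++-side Variable; pyobj caches its Python wrapper."""

    def __init__(self, values):
        self.values = values
        self.pyobj = None


class ToyPyTensor:
    """Hypothetical Python-side object whose payload is the C++ Variable."""

    def __init__(self, cdata):
        self.cdata = cdata


def new_with_var(var):
    obj = ToyPyTensor(var)
    var.pyobj = obj        # analogous to v->cdata.set_pyobj(obj)
    return obj


def wrap(var):
    # Reuse the wrapper if this Variable already has one.
    if var.pyobj is not None:
        return var.pyobj
    return new_with_var(var)


var = ToyVariable([1.0, 2.0])
first = wrap(var)
second = wrap(var)
```

Wrapping the same variable twice yields the same Python object, so identity checks and attributes set on the Python side stay consistent.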

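Instantiating a tensor with the torch.Tensor class

When a tensor is built in the following way, we are actually instantiating the Python class torch.Tensor.

gemfield = torch.Tensor([1,2])

The parent class of torch.Tensor is torch._C._TensorBase, which is defined following CPython's conventions: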

PyTypeObject THPVariableType = {
  PyVarObject_HEAD_INIT(nullptr, 0)
  "torch._C._TensorBase", /* tp_name */
  sizeof(THPVariable), /* tp_basicsize */
  0, /* tp_itemsize */
  (destructor)THPVariable_dealloc, /* tp_dealloc */
  &THPVariable_as_mapping, /* tp_as_mapping */
  Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */
  (traverseproc)THPVariable_traverse, /* tp_traverse */
  (inquiry)THPVariable_clear, /* tp_clear */
  0, /* tp_weaklistoffset */
  THPVariable_properties, /* tp_getset */
  THPVariable_pynew /* tp_new */
  ......
};

The tp_new slot points to the THPVariable_pynew function, so when a torch.Tensor is created, THPVariable_pynew is called:

static PyObject *THPVariable_pynew(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
  auto& default_type = torch::tensors::get_default_tensor_type();
  auto tensor = torch::utils::legacy_tensor_ctor(default_type, args, kwargs);
  return THPVariable_NewWithVar(type, std::move(tensor));
}

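After THPVariable_pynew creates the C++ tensor, the call stack again arrives at THPVariable_NewWithVar. Remember that name? All roads to creating a Python tensor lead to the same place: THPVariable_NewWithVar.

From C++ tensor to Python tensor

THPVariable_NewWithVar is the road a PyTorch Tensor takes from C++ to Python. It is defined as follows: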

static PyObject* THPVariable_NewWithVar(PyTypeObject* type, Variable var)
{
  PyObject* obj = type->tp_alloc(type, 0);
  if (obj) {
    auto v = (THPVariable*) obj;
    new (&v->cdata) Variable(std::move(var));
    v->cdata.set_pyobj(obj);
    if (auto fn = dynamic_cast<PyFunction*>(v->cdata.grad_fn_unsafe())) {
      // Create a new reference to the THPFunction. This ensures that ref count
      // of the THPFunction is at least the number of referring THPVariables.
      const auto output_nr = v->cdata.output_nr();
      auto grad_fn = THPFunction_asFunction((THPFunction*)fn->obj);
      v->cdata.set_gradient_edge({std::move(grad_fn), output_nr});
    }
  }
  return obj;
}

Put simply, the tensor in Python and the tensor in C++ are not the same thing: the C++ tensor is the payload of the Python tensor, namely cdata (_cdata in Python).

struct THPVariable {
  torch::autograd::Variable cdata;
  PyObject* backward_hooks = nullptr;
};

Apart from assigning the Variable instance initialized in C++ to the payload of the Python tensor, the work done in THPVariable_NewWithVar is the following:

//v->cdata is the Variable instance initialized in C++, by gemfield
if (auto fn = dynamic_cast<PyFunction*>(v->cdata.grad_fn_unsafe())) {
  const auto output_nr = v->cdata.output_nr();
  auto grad_fn = THPFunction_asFunction((THPFunction*)fn->obj);
  v->cdata.set_gradient_edge({std::move(grad_fn), output_nr});
}

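set_gradient_edge is used to set the gradient function. For a leaf, grad_fn is null, and in that case, if requires_grad is True, the grad_fn is interpreted as the gradient accumulator. For an interior Variable, grad_fn is the corresponding gradient function, of type std::shared_ptr<PyFunction>.

Summary

Let us review once more how a PyTorch tensor is created.

The first step is creating the tensor. Depending on whether it is sparse or strided, its backend, its scalar type, its size, and so on, the call is dispatched to one of the types in the Type hierarchy; the appropriate memory allocator is then chosen, memory is allocated, a StorageImpl is created, and finally the Tensor is created.

The second step is making the variable, through a function that is also named make_variable. The most important thing it initializes is the automatic differentiation system, which is encapsulated by the AutogradMeta class.

The third step goes from the C++ tensor to the Python tensor. Once the C++ tensor has become the payload of the Python tensor, set_gradient_edge is additionally used to set the Variable's gradient function.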
