This chapter mainly follows the first two sections of the PyTorch 0.4.0 English documentation; the order may differ slightly:
- torch
- torch.Tensor
## Tensors
Data type | CPU tensor | GPU tensor | dtype |
---|---|---|---|
32-bit floating point | torch.FloatTensor | torch.cuda.FloatTensor | torch.float32 |
64-bit floating point | torch.DoubleTensor | torch.cuda.DoubleTensor | torch.float64 |
16-bit floating point | torch.HalfTensor | torch.cuda.HalfTensor | torch.float16 |
8-bit integer (unsigned) | torch.ByteTensor | torch.cuda.ByteTensor | torch.uint8 |
8-bit integer (signed) | torch.CharTensor | torch.cuda.CharTensor | torch.int8 |
16-bit integer (signed) | torch.ShortTensor | torch.cuda.ShortTensor | torch.int16 |
32-bit integer (signed) | torch.IntTensor | torch.cuda.IntTensor | torch.int32 |
64-bit integer (signed) | torch.LongTensor | torch.cuda.LongTensor | torch.int64 |
- torch.is_tensor / torch.is_storage
- torch.set_default_tensor_type() — useful: if most of your work builds tensors on the GPU, you can set the default type to a cuda tensor
```python
if torch.cuda.is_available():
    if args.cuda:
        torch.set_default_tensor_type('torch.cuda.FloatTensor')
    if not args.cuda:
        print("WARNING: It looks like you have a CUDA device, but aren't " +
              "using CUDA.\nRun with --cuda for optimal training speed.")
        torch.set_default_tensor_type('torch.FloatTensor')
else:
    torch.set_default_tensor_type('torch.FloatTensor')
```
- torch.numel(input) → int / numel() / nelement() — number of elements in the tensor
- torch.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, profile=None) — printing options
- torch.set_flush_denormal(mode) → bool — disables denormal (subnormal) floating-point numbers on the CPU
```
>>> torch.set_flush_denormal(True)
True
>>> torch.tensor([1e-323], dtype=torch.float64)
tensor([ 0.], dtype=torch.float64)
>>> torch.set_flush_denormal(False)
True
>>> torch.tensor([1e-323], dtype=torch.float64)
tensor(9.88131e-324 *
       [ 1.0000], dtype=torch.float64)
```
## 建立操做 Creation Ops
torch methods with the _like suffix create a tensor with every setting of the reference tensor (shape, dtype, device) except its values. These include: zeros, ones, empty, full, rand, randint, randn.
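A minimal sketch of the _like behaviour, assuming nothing beyond the functions listed above (dtype and device follow the reference tensor unless overridden):

```python
import torch

ref = torch.ones(2, 3, dtype=torch.float64)      # reference tensor
z = torch.zeros_like(ref)                        # same shape, dtype and device as ref
r = torch.randn_like(ref, dtype=torch.float32)   # same shape, dtype overridden explicitly
print(z.dtype, r.dtype)                          # torch.float64 torch.float32
```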
- torch.tensor(data, dtype=None, device=None, requires_grad=False) → Tensor
- torch.from_numpy(ndarray) → Tensor
- torch.eye(n, m=None, out=None)
- torch.linspace(start, end, steps=100, out=None) → Tensor
- torch.logspace(start, end, steps=100, out=None) → Tensor
- torch.ones(*sizes, out=None) → Tensor
- torch.empty(*sizes, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
- torch.reshape(input, shape) → Tensor — careful, this one is a pitfall (it may return either a view or a copy; see the sketch after this list)
- torch.rand(*sizes, out=None) → Tensor (uniform distribution on [0, 1))
- torch.randn(*sizes, out=None) → Tensor (standard normal distribution)
- torch.randperm(n, out=None) → LongTensor (a random permutation of the integers 0, ..., n-1)
- torch.randint(low=0, high, size, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
- torch.arange(start, end, step=1, out=None) → Tensor / torch.range(start, end, step=1, out=None) → Tensor — one excludes the endpoint (arange), the other includes it (range, deprecated); see the sketch after this list
- torch.zeros(*sizes, out=None) → Tensor
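A minimal sketch of the two caveats above: arange stops before the endpoint while range includes it, and reshape returns a view when the input is contiguous but silently falls back to a copy otherwise (so writes may or may not propagate):

```python
import torch

print(torch.arange(0, 5))  # 0, 1, 2, 3, 4    -- endpoint excluded
print(torch.range(0, 5))   # 0, 1, 2, 3, 4, 5 -- endpoint included (range is deprecated)

x = torch.arange(6).reshape(2, 3)
y = torch.reshape(x, (3, 2))  # contiguous input, so this is a view sharing storage with x
y[0, 0] = 100
print(x[0, 0])                # also 100 here; with a non-contiguous input y would be a copy
```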
## Indexing, Slicing, Joining, Mutating Ops
這些操做絕大多數tensor自己也包含 tensor方法的通用後綴 _ inplace操做,多線程
- torch.cat(seq, dim=0, out=None) → Tensor / torch.stack(sequence, dim=0) — both are very common; cat concatenates along an existing dimension, while stack stacks along a brand-new dimension (a new axis is inserted and the existing ones naturally shift back)
- torch.split(tensor, split_size, dim=0) / torch.chunk(tensor, chunks, dim=0) / split() / chunk() — these two are closely related: split cuts along an axis into pieces of a given size (if the size doesn't divide evenly, the last piece is smaller), whereas chunk returns a fixed number of pieces (and, as with split, the last piece may be smaller)
```python
a = torch.Tensor([1, 2, 3, 4, 5])
b = a.split(2)   # pieces of size 2: (tensor of 2, tensor of 2, tensor of 1)
c = a.chunk(3)   # 3 pieces; the last one is the smaller leftover
```
- torch.gather(input, dim, index, out=None) → Tensor / gather(dim, index) — this function is genuinely confusing; it already took a long time to work out when learning the TensorFlow counterpart ╮(╯﹏╰)╭. Note that the index must always be a torch.LongTensor.

```python
t = torch.Tensor([[1, 2], [3, 4]])
torch.gather(t, 1, torch.LongTensor([[0, 0], [1, 0]]))
# tensor([[ 1.,  1.],
#         [ 4.,  3.]])
```
- torch.index_select(input, dim, index, out=None) → Tensor / index_select(dim, index) — a very important function
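A quick sketch of index_select pulling out whole rows or columns by index (the index must be a LongTensor):

```python
x = torch.Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
idx = torch.LongTensor([0, 2])
x.index_select(0, idx)   # rows 0 and 2    -> shape (2, 3)
x.index_select(1, idx)   # columns 0 and 2 -> shape (3, 2)
```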
- torch.masked_select(input, mask, out=None) → Tensor / masked_select(mask) — note that the mask must always be a torch.ByteTensor. The mask's shape (and element count) need not match the tensor's, but it must be broadcastable against the tensor; elements are selected where the broadcast mask is nonzero and returned as a flat 1-D tensor.
```python
a = torch.Tensor([[1, 2, 3], [4, 5, 6]])
mask = torch.Tensor([[1, 0], [0, 0], [1, 0]]).type(torch.ByteTensor)
mask_1 = torch.Tensor([[1], [0]]).type(torch.ByteTensor)
mask_2 = torch.Tensor([0, 1, 1]).type(torch.ByteTensor)
b = a.masked_select(mask)    # error: shape (3, 2) is not broadcastable against (2, 3)
c = a.masked_select(mask_1)  # (2, 1) broadcasts across columns -> selects row 0
d = a.masked_select(mask_2)  # (3,) broadcasts across rows -> selects columns 1 and 2
```
- torch.nonzero(input, out=None) → LongTensor — note that it returns the N-dimensional indices of the nonzero elements (one row per nonzero element)
```
>>> torch.nonzero(torch.Tensor([[0.6, 0.0, 0.0, 0.0],
...                             [0.0, 0.4, 0.0, 0.0],
...                             [0.0, 0.0, 1.2, 0.0],
...                             [0.0, 0.0, 0.0, -0.4]]))
 0  0
 1  1
 2  2
 3  3
[torch.LongTensor of size 4x2]
```
- torch.squeeze(input, dim=None, out=None) / squeeze(dim=None) — an extremely important function: removes dimensions of size 1
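A minimal sketch of squeeze (its inverse, unsqueeze, is listed further down):

```python
x = torch.zeros(2, 1, 3, 1)
x.squeeze().size()     # torch.Size([2, 3])        -- all size-1 dims removed
x.squeeze(1).size()    # torch.Size([2, 3, 1])     -- only dim 1 removed
x.unsqueeze(0).size()  # torch.Size([1, 2, 1, 3, 1]) -- new dim inserted at position 0
```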
- torch.stack(sequence, dim=0)
- torch.t(input) / t() — transpose of a 2-D tensor
- torch.transpose(input, dim0, dim1, out=None) → Tensor / transpose() — swaps any two dimensions
- torch.take(input, indices) → Tensor — indexes into the flattened view of the tensor
- tensor.permute(dims) — very important: reorders all dimensions at once
```python
x = torch.randn(2, 3, 5)
x.permute(2, 0, 1).size()  # torch.Size([5, 2, 3])
```
- torch.unbind(tensor, dim=0) — removes the given dimension and returns a tuple of all the slices along that dimension
- torch.unsqueeze(input, dim, out=None) — inserts a dimension of size 1 at the given position
- torch.where(condition, x, y) → Tensor — note that condition must be a ByteTensor
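A small sketch of where, taking elements from x where the condition is nonzero and from y elsewhere:

```python
x = torch.randn(3, 2)
y = torch.ones(3, 2)
torch.where(x > 0, x, y)  # x > 0 is a ByteTensor mask; negative entries are replaced by 1
```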
## Random sampling
- torch.manual_seed(seed)
- torch.initial_seed() — note: for controlled comparison experiments, when loading data with multiple worker threads the seed of every worker must be set explicitly
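A minimal reproducibility sketch, assuming the usual extra dependencies (numpy, random); set_seed and worker_init_fn are hypothetical helper names, the latter meant to be passed to DataLoader(..., worker_init_fn=worker_init_fn):

```python
import random
import numpy as np
import torch

def set_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)                 # seeds the CPU RNG
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)    # seeds all GPU RNGs

def worker_init_fn(worker_id):
    # hypothetical per-worker hook so each data-loading worker gets its own seed
    set_seed(1234 + worker_id)
```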
- torch.get_rng_state() → ByteTensor
- torch.set_rng_state
- torch.default_generator
- torch.bernoulli(input, out=None) → Tensor — Bernoulli coin flips; often used for sample mining (hard examples)
- torch.multinomial(input, num_samples, replacement=False, out=None) → LongTensor — draws samples from a multinomial distribution
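A small sketch of both sampling ops (the probabilities/weights below are made up for illustration):

```python
probs = torch.Tensor([0.1, 0.5, 0.9])
torch.bernoulli(probs)                            # a 0/1 draw per probability, e.g. tensor([ 0.,  1.,  1.])
weights = torch.Tensor([0.1, 0.1, 0.4, 0.4])
torch.multinomial(weights, 2)                     # indices of 2 draws without replacement
torch.multinomial(weights, 6, replacement=True)   # with replacement, more draws than categories is fine
```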
- torch.normal(means, std, out=None) — draws random numbers from separate normal distributions whose means and standard deviations are given
```
>>> torch.normal(means=torch.arange(1, 11), std=torch.arange(1, 0, -0.1))
  1.5104
  1.6955
  2.4895
  4.9185
  4.9895
  6.9155
  7.3683
  8.1836
  8.7164
  9.8916
[torch.FloatTensor of size 10]

>>> torch.normal(mean=0.5, std=torch.arange(1, 6))
  0.5723
  0.0871
 -0.3783
 -2.5689
 10.7893
[torch.FloatTensor of size 5]

>>> torch.normal(means=torch.arange(1, 6))
 1.1681
 2.8884
 3.7718
 2.5616
 4.2500
[torch.FloatTensor of size 5]
```
## Serialization
- torch.save
- torch.load
## Parallelism
- torch.get_num_threads
- torch.set_num_threads(int)
數學操做Math operations
Tensors have methods for all of the corresponding math functions. A few commonly used ones:
- ceil/floor/frac
- round
- torch.clamp(input, min, max, out=None) → Tensor — equivalent to TensorFlow's tf.clip_by_value
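A quick sketch of clamp; min or max may also be given on its own:

```python
a = torch.Tensor([-2.0, -0.5, 0.3, 1.7])
torch.clamp(a, min=-1, max=1)   # tensor([-1.0000, -0.5000,  0.3000,  1.0000])
torch.clamp(a, min=0)           # clips only from below
```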
- torch.argmax(input, dim=None, keepdim=False)/torch.argmin(input, dim=None, keepdim=False)
- torch.cumprod(input, dim, out=None) → Tensor $$y_i = x_1 \times x_2\times x_3\times \dots \times x_i$$/torch.prod(input, dim, keepdim=False, out=None) → Tensor
- torch.cumsum(input, dim, out=None) → Tensor — the cumulative-sum counterpart of the above
- torch.dist(input, other, p=2) → Tensor — the p-norm of (input - other) / torch.norm(input, p, dim, keepdim=False, out=None) → Tensor
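A small sketch relating dist and norm:

```python
x = torch.Tensor([1.0, 2.0, 3.0])
y = torch.Tensor([1.0, 0.0, 0.0])
torch.dist(x, y, 2)     # L2 distance between x and y
torch.norm(x - y, 2)    # the same value, via the norm of the difference
torch.norm(x, 1)        # L1 norm of x: 6.0
```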
- torch.mean(input, dim, keepdim=False, out=None) → Tensor — note that keepdim controls whether the reduced dimension is kept
```
>>> a = torch.randn(4, 4)
>>> a
tensor([[-0.3841,  0.6320,  0.4254, -0.7384],
        [-0.9644,  1.0131, -0.6549, -1.4279],
        [-0.2951, -1.3350, -0.7694,  0.5600],
        [ 1.0842, -0.9580,  0.3623,  0.2343]])
>>> torch.mean(a, 1)
tensor([-0.0163, -0.5085, -0.4599,  0.1807])
>>> torch.mean(a, 1, True)
tensor([[-0.0163],
        [-0.5085],
        [-0.4599],
        [ 0.1807]])
```
- torch.median() — returns the median value
```
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 1.5219, -1.5212,  0.2202]])
>>> torch.median(a)
tensor(0.2202)
```
- torch.median(input, dim=-1, keepdim=False, values=None, indices=None) -> (Tensor, LongTensor)
```
>>> a = torch.randn(4, 5)
>>> a
tensor([[ 0.2505, -0.3982, -0.9948,  0.3518, -1.3131],
        [ 0.3180, -0.6993,  1.0436,  0.0438,  0.2270],
        [-0.2751,  0.7303,  0.2192,  0.3321,  0.2488],
        [ 1.0778, -1.9510,  0.7048,  0.4742, -0.7125]])
>>> torch.median(a, 1)
(tensor([-0.3982,  0.2270,  0.2488,  0.4742]), tensor([ 1,  4,  4,  3]))
```
- torch.std(input, dim, keepdim=False, unbiased=True, out=None) → Tensor / torch.var(input, dim, keepdim=False, unbiased=True, out=None) → Tensor — standard deviation and variance
- torch.sum(input, dim, keepdim=False, out=None) → Tensor
- torch.unique(input, sorted=False, return_inverse=False) — removes duplicate elements (1-D tensors only)
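A quick sketch of unique together with the inverse-index mapping:

```python
x = torch.LongTensor([1, 3, 2, 3])
values, inverse = torch.unique(x, sorted=True, return_inverse=True)
# values:  tensor([ 1,  2,  3])
# inverse: tensor([ 0,  2,  1,  2])  -- position of each original element inside `values`
```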
## Comparison Ops
- torch.eq(input, other, out=None) → Tensor — other must be broadcastable against input; returns a mask (ByteTensor)
- torch.equal(tensor1, tensor2) → bool
- torch.ge / gt / le / lt / ne — ≥ / > / ≤ / < / ≠
- torch.topk(input, k, dim=None, largest=True, sorted=True, out=None) -> (Tensor, LongTensor)/torch.kthvalue(input, k, dim=None, keepdim=False, out=None) -> (Tensor, LongTensor)
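A small sketch of topk and kthvalue (both return a (values, indices) pair):

```python
x = torch.arange(1., 6.)   # 1, 2, 3, 4, 5
torch.topk(x, 3)           # (tensor([ 5.,  4.,  3.]), tensor([ 4,  3,  2]))
torch.kthvalue(x, 4)       # (tensor(4.), tensor(3)) -- the 4th smallest value and its index
```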
- torch.max(input) → Tensor
- torch.max(input, dim, keepdim=False, out=None) -> (Tensor, LongTensor)
- torch.max(input, other, out=None) → Tensor
```
>>> a = torch.randn(4)
>>> a
tensor([ 0.2942, -0.7416,  0.2653, -0.1584])
>>> b = torch.randn(4)
>>> b
tensor([ 0.8722, -1.7421, -0.4141, -0.5055])
>>> torch.max(a, b)
tensor([ 0.8722, -0.7416,  0.2653, -0.1584])
```
- torch.min() also comes in the same three forms; usage is identical to max
- torch.sort(input, dim=None, descending=False, out=None) -> (Tensor, LongTensor)
## BLAS and LAPACK Operations
各類矩陣的基礎運算 ##Spectral Ops 終於算是加上了。。dom
## Ops specific to torch.Tensor
Tensor methods with the new_ prefix create a new tensor that keeps the original tensor's dtype and device while taking the given values; they exist for ones, zeros, full and tensor (a pitfall, see below).
```
>>> tensor = torch.ones((2,), dtype=torch.int8)
>>> data = [[0, 1], [2, 3]]
>>> tensor.new_tensor(data)
tensor([[ 0,  1],
        [ 2,  3]], dtype=torch.int8)
```
- torch.Tensor.item() — pitfall: only works for tensors holding a single value; well suited for returning loss, accuracy, and similar scalars
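A tiny sketch of item():

```python
loss = torch.Tensor([0.5])
loss.item()                  # 0.5 as a plain Python number, handy for logging loss/acc
torch.Tensor([1, 2]).item()  # error: only single-element tensors can be converted to a Python scalar
```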
- apply_(callable) → Tensor (similar to map; a Python-level CPU function, so it is slow)
- cauchy_(median=0, sigma=1, *, generator=None) → Tensor
- char() / byte() / double() / int()
- clone() / copy_() — the first is a complete clone; the second copies values in place and the source may be broadcast
- contiguous() → Tensor — some ops assume contiguous memory for efficient computation; call this when you need to guarantee the tensor is stored contiguously
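A small sketch of when contiguous() matters, e.g. calling view() after a transpose (assuming the usual 0.4-era rule that view requires contiguous memory):

```python
x = torch.randn(3, 4)
y = x.t()                 # transpose returns a non-contiguous view
y.is_contiguous()         # False
# y.view(12)              # would raise an error here because y is not contiguous
y.contiguous().view(12)   # copy into contiguous memory first, then view works
```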
- is_contiguous() → bool / is_pinned() / is_cuda / is_signed()
- cpu()/cuda()
- dim()
- device
- element_size() → int — returns the size in bytes of a single element of this tensor's type
- expand(*sizes) → Tensor — important: expands axes whose size is 1
```
>>> x = torch.tensor([[1], [2], [3]])
>>> x.size()
torch.Size([3, 1])
>>> x.expand(3, 4)
tensor([[ 1,  1,  1,  1],
        [ 2,  2,  2,  2],
        [ 3,  3,  3,  3]])
>>> x.expand(-1, 4)   # -1 means not changing the size of that dimension
tensor([[ 1,  1,  1,  1],
        [ 2,  2,  2,  2],
        [ 3,  3,  3,  3]])
```
- index_copy_(dim, index, tensor) → Tensor — copies elements into self at the given indices
```
>>> x = torch.zeros(5, 3)
>>> t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float)
>>> index = torch.tensor([0, 4, 2])
>>> x.index_copy_(0, index, t)
tensor([[ 1.,  2.,  3.],
        [ 0.,  0.,  0.],
        [ 7.,  8.,  9.],
        [ 0.,  0.,  0.],
        [ 4.,  5.,  6.]])
```
- index_fill_(dim, index, val) → Tensor
- map_(tensor, callable) — applies callable to each pair of elements from the self tensor and the given tensor and stores the results in the self tensor; the two tensors must be broadcastable. (to be continued)
## A few pitfalls
- new_tensor creates a brand-new tensor (the data is copied and detached from the graph); use torch.Tensor.requires_grad_() or torch.Tensor.detach() when you need explicit control
- masked_select and index_select also create new tensors (copies, not views)
- reshape / resize_ / view (to be continued)