2、张量数据结构

T4mako大约 5 分钟

2、张量数据结构

2.1、张量数据类型

张量的数据类型和 numpy.array 基本一一对应，但是不支持 str 类型

在 python 中有数据类型 int、float......
在 PyTorch 中对应 IntTensor，FloatTensor......，同时该数据类型有维度，如 int 标量 3 的维度是 0，矩阵 的维度是 2

torch.float64(torch.double)
torch.float32(torch.float)
torch.float16
torch.int64(torch.long)
torch.int32(torch.int)
torch.int16
torch.int8
torch.uint8
torch.bool

数据类型	dtype	CPU Tensor	GPU Tensor
32bit float	`torch.float` or `torch.float32`	torch.FloatTensor	torch.cuda.FloatTensor
64bit float	`torch.double` or `torch.float64`	torch.DoubleTensor	torch.cuda.DoubleTensor
8bit integer	`torch.uint8`	torch.ByteTensor	torch.cuda.ByteTensor
32bit integer	`torch.int32` or `torch.int`	torch.IntTensor	torch.cuda.IntTensor
64bit integer	`torch.int64` or `torch.long`	torch.LongTensor	torch.cuda.LongTensor

一般神经网络建模使用的都是 torch.float32 类型。

import numpy as np
import torch 

# 自动推断数据类型
i = torch.tensor(1);print(i,i.dtype)
x = torch.tensor(2.0);print(x,x.dtype)
b = torch.tensor(True);print(b,b.dtype)

# 指定数据类型
i = torch.tensor(1,dtype = torch.int32);print(i,i.dtype)
x = torch.tensor(2.0,dtype = torch.double);print(x,x.dtype)

# 使用特定类型构造函数
i = torch.IntTensor(1);print(i,i.dtype)
x = torch.Tensor(np.array(2.0));print(x,x.dtype) #等价于torch.FloatTensor
b = torch.BoolTensor(np.array([1,0,2,0])); print(b,b.dtype)

# 不同类型进行转换
i = torch.tensor(1); print(i,i.dtype)
x = i.float(); print(x,x.dtype) #调用 float 方法转换成浮点类型
y = i.type(torch.float); print(y,y.dtype) #使用 type 函数转换成浮点类型
z = i.type_as(x);print(z,z.dtype) #使用 type_as 方法转换成某个 Tensor 相同类型

api 总结

方法	含义
`torch.tensor(value)`	创建 tensor，参数中接收的是数据的内容
`tensor.dtype()`	获取 tensor 的类型
`torch.FloatTensor(nums/shape/value)` `torch.IntTensor(nums/shape/value)` `torch.Tensor(nums/shape/value)`...	参数接收个数， shape 或 value 创建 tensor `torch.FloatTensor(2),torch.FloatTensor(2,3),torch.FloatTensor([2.,8.3])`
`tensor.float()`	将 tensor 转换为浮点类型
`tensor.type(torch.float)`	使用 type 函数转换成浮点类型
`tensor.type_as(x)`	使用 type_as 方法转换成某个 Tensor 相同类型

2.2、张量的维度

不同类型的数据可以用不同维度 (dimension) 的张量来表示。

标量为 0 维张量
向量为 1 维张量
矩阵为 2 维张量
彩色图像有 rgb 三个通道，可以表示为3维张量
视频还有时间维，可以表示为 4 维张量

scalar = torch.tensor(True)
print(scalar)
print(scalar.dim())  # 标量，0维张量
vector = torch.tensor([1.0,2.0,3.0,4.0]) #向量，1维张量
print(vector)
print(vector.dim())
matrix = torch.tensor([[1.0,2.0],[3.0,4.0]]) #矩阵, 2维张量
print(matrix)
print(matrix.dim())
tensor3 = torch.tensor([[[1.0,2.0],[3.0,4.0]],[[5.0,6.0],[7.0,8.0]]])  # 3维张量
print(tensor3)
print(tensor3.dim())
tensor4 = torch.tensor([[[[1.0,1.0],[2.0,2.0]],[[3.0,3.0],[4.0,4.0]]],
                        [[[5.0,5.0],[6.0,6.0]],[[7.0,7.0],[8.0,8.0]]]])  # 4维张量
print(tensor4)
print(tensor4.dim())

api 总结

方法	含义
`a.dim()`	返回 tensor 的维度（size 的长度）

2.3、张量的尺寸

可以使用 shape 属性或者 size() 方法查看张量在每一维的长度.

可以使用 view 方法改变张量的尺寸。

如果 view 方法改变尺寸失败，可以使用 reshape 方法

vector = torch.tensor([1.0,2.0,3.0,4.0])
print(vector.size())
print(vector.shape)

# 使用view可以改变张量尺寸
vector = torch.arange(0,12)
print(vector)
print(vector.shape)

matrix34 = vector.view(3,4)
print(matrix34)
print(matrix34.shape)

matrix43 = vector.view(4,-1) #-1表示该位置长度由程序自动推断
print(matrix43)
print(matrix43.shape)

# 有些操作会让张量存储结构扭曲，直接使用view会失败，可以用reshape方法

matrix26 = torch.arange(0,12).view(2,6)
print(matrix26)
print(matrix26.shape)

# 转置操作让张量存储结构扭曲
matrix62 = matrix26.t()
print(matrix62.is_contiguous())


# 直接使用view方法会失败，可以使用reshape方法
#matrix34 = matrix62.view(3,4) #error!
matrix34 = matrix62.reshape(3,4) #等价于matrix34 = matrix62.contiguous().view(3,4)
print(matrix34)

api 总结

方法	含义
`tensor.shape`	返回 tensor 的 size（不同维度长度的数组）
`torch.arange(min,max,[step])`	生成值为等差数列的 tensor
`tensor.size([i])`	返回 size 数组索引为 i 的值，没有 i 与 shape 属性相同
`tensor.shape[i]`	返回 size 数组索引为 i 的值，也可以直接使用 `a.shape` 获取 size 对象，可以直接使用 `list(a.shape)` 将 size 转换为 list
`tensor.view(shape)`	对 tensor 进行变换
`tensor.reshape(shape)`	对 tensor 进行变换

2.4、张量和 numpy 数组

可以用 numpy 方法从 Tensor 得到 numpy 数组，也可以用 torch.from_numpy 从 numpy 数组得到 Tensor。

这两种方法关联的 Tensor 和 numpy 数组是共享数据内存的。如果改变其中一个，另外一个的值也会发生改变。

如果有需要，可以用张量的 clone 方法拷贝张量，中断这种关联。

此外，还可以使用 item 方法从标量张量得到对应的 Python 数值。使用 tolist 方法从张量得到对应的 Python 数值列表。

import numpy as np
import torch 

#torch.from_numpy 函数从 numpy 数组得到 Tensor
arr = np.zeros(3)
tensor = torch.from_numpy(arr)
print(tensor) # tensor([0., 0., 0.], dtype=torch.float64)
np.add(arr,1, out = arr) # 给 arr 增加1，tensor 也随之改变
print(tensor) # tensor([1., 1., 1.], dtype=torch.float64)

# numpy 方法从 Tensor 得到 numpy 数组
tensor = torch.zeros(3) # tensor([0., 0., 0.])
arr = tensor.numpy() # [0. 0. 0.]

# 使用带下划线的方法表示计算结果会返回给调用张量
tensor.add_(1) #给 tensor 增加 1，arr 也随之改变 
# 或： torch.add(tensor,1,out = tensor)
print(tensor) # tensor([1., 1., 1.])
print(arr) # [1. 1. 1.]

# 可以用 clone() 方法拷贝张量，中断这种关联
tensor = torch.zeros(3)
#使用 clone 方法拷贝张量, 拷贝后的张量和原始张量内存独立
arr = tensor.clone().numpy() # 也可以使用 tensor.data.numpy()
#使用带下划线的方法表示计算结果会返回给调用张量
tensor.add_(1) #给 tensor增加1，arr不再随之改变
print(tensor) # tensor([1., 1., 1.])
print(arr) # [0. 0. 0.]

# item 方法和 tolist 方法可以将张量转换成 Python 数值和数值列表
scalar = torch.tensor(1.0)
s = scalar.item() # 1.0
print(type(s)) # <class 'float'>
tensor = torch.rand(2,2)
t = tensor.tolist()
print(t) # [[0.4581589698791504, 0.46063995361328125], [0.5779597759246826, 0.40021681785583496]]
print(type(t)) # <class 'list'>

api 总结

方法	含义
`torch.from_numpy(arr)`	使用 numpy 创建 tensor
`torch.zeros(3)`	生成全是 0 的 tensor
`torch.ones(shape)`	生成全是 1 的 tensor
`tensor.numpy()`	将 tensor 变为 numpy 数组
`tensor.add_(num)`	带下划线的方法计算结果会返回给调用张量
`torch.add(tensor,num,out = tensor)`	与上一个方法类似
`tensor.clone()`	拷贝张量
`tensor.item()`	将张量转换成 Python 数值
`tensor.tolist()`	将张量转换成 Python 数值列表

昵称

邮箱

网址

按正序
按倒序
按热度

2、张量数据结构

# 2、张量数据结构

# 2.1、张量数据类型

# 2.2、张量的维度

# 2.3、张量的尺寸

# 2.4、张量和 numpy 数组

预览:

2、张量数据结构

2.1、张量数据类型

2.2、张量的维度

2.3、张量的尺寸

2.4、张量和 numpy 数组