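When the dataset is instantiated, an optional parameter can be used to transform the data. Most neural networks require input images of a fixed size, so the original image has to go through a Rescale or Crop operation, and the returned data then has to be converted to a Tensor. For example: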
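from torchvision import transforms

# FaceLandmarksDataset is the custom Dataset class; it must already be
# defined or imported. Rescale, RandomCrop and ToTensor are the transform
# classes implemented further down.
face_dataset = FaceLandmarksDataset(
    csv_file='data/faces/face_landmarks.csv',
    root_dir='data/faces/',
    transform=transforms.Compose([
        Rescale(256),
        RandomCrop(224),
        ToTensor(),
    ]),
)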
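The data transform (Transform) is applied inside the dataset's __getitem__ method. In the code above, transforms.Compose(transform_list) combines several transforms into one ("compose" means "combine"); its argument is a list of transform operations, here [Rescale(256), RandomCrop(224), ToTensor()]. These three transform classes are implemented below. We write them as callable classes rather than simple functions, so that a transform's parameters do not have to be passed every time it is called. For that we only need to implement the __call__ method and, if needed, the __init__ method. A transform can then be used like this: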
# Create an instance of a callable transform class
tsfm = Transform(params)
# Apply the transform instance to a sample
transformed_sample = tsfm(sample)
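To make the callable-class pattern and the chaining idea concrete, here is a minimal sketch in plain Python. SimpleCompose, AddOffset and Scale are hypothetical names invented for this illustration; SimpleCompose only imitates the chaining behaviour described above and is not torchvision's actual implementation.

```python
class SimpleCompose:
    """Hypothetical stand-in for transforms.Compose: applies callables in order."""

    def __init__(self, transform_list):
        self.transform_list = transform_list

    def __call__(self, sample):
        # Feed the output of each transform into the next one.
        for t in self.transform_list:
            sample = t(sample)
        return sample


class AddOffset:
    """Hypothetical transform: the parameter is stored once in __init__."""

    def __init__(self, offset):
        self.offset = offset

    def __call__(self, sample):
        # Return a new dict instead of mutating the input sample.
        return {**sample, 'value': sample['value'] + self.offset}


class Scale:
    """Hypothetical transform that multiplies the stored value."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, sample):
        return {**sample, 'value': sample['value'] * self.factor}


pipeline = SimpleCompose([AddOffset(2), Scale(10)])
result = pipeline({'value': 1})
print(result)  # {'value': 30}
```

Because each transform keeps its parameters as instance attributes, the pipeline only has to pass the sample along, which is what lets a list of differently configured transforms share one calling convention.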
Below we look at how these transforms are applied to the image and to the landmarks. (Note: each operation corresponds to one class.)
import numpy as np
import torch
from skimage import transform


class Rescale(object):
    """Rescale the image in a sample to a given size.

    Args:
        output_size (tuple or int): Desired output size. If tuple, output is
            matched to output_size. If int, smaller of image edges is matched
            to output_size keeping aspect ratio the same.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        if isinstance(self.output_size, int):
            # Match the shorter edge to output_size, keep the aspect ratio
            if h > w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        else:
            new_h, new_w = self.output_size

        new_h, new_w = int(new_h), int(new_w)

        img = transform.resize(image, (new_h, new_w))

        # h and w are swapped for landmarks because for images,
        # x and y axes are axis 1 and 0 respectively
        landmarks = landmarks * [new_w / w, new_h / h]

        return {'image': img, 'landmarks': landmarks}


class RandomCrop(object):
    """Crop randomly the image in a sample.

    Args:
        output_size (tuple or int): Desired output size. If int, square crop
            is made.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        h, w = image.shape[:2]
        new_h, new_w = self.output_size

        # randint's upper bound is exclusive; the +1 also avoids an error
        # when the image is exactly as large as the crop
        top = np.random.randint(0, h - new_h + 1)
        left = np.random.randint(0, w - new_w + 1)

        image = image[top: top + new_h,
                      left: left + new_w]

        # Shift the landmarks into the coordinate frame of the crop
        landmarks = landmarks - [left, top]

        return {'image': image, 'landmarks': landmarks}


class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']

        # swap color axis because
        # numpy image: H x W x C
        # torch image: C X H X W
        image = image.transpose((2, 0, 1))
        return {'image': torch.from_numpy(image),
                'landmarks': torch.from_numpy(landmarks)}
The following shows how to use the transforms.
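# Fetch one sample (assumes the dataset returns untransformed ndarray samples)
sample = face_dataset[index]

# Apply the transforms one at a time
scale = Rescale(256)
crop = RandomCrop(224)
scaled_sample = scale(sample)
cropped_sample = crop(sample)

# Chain the transforms with Compose
composed = transforms.Compose([Rescale(256), RandomCrop(224)])
composed_sample = composed(sample)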
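After the transforms above, the data is still a NumPy ndarray rather than a tensor. If a tensor has to be returned, ToTensor must be added as the last element of the list passed to Compose.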
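The axis swap that ToTensor performs before the conversion can be seen with NumPy alone. This is a minimal sketch; the all-zero 224 x 224 RGB array and the marked pixel are made-up example data, and the final torch.from_numpy step is left out so the snippet needs only NumPy.

```python
import numpy as np

# Hypothetical 224 x 224 RGB image in NumPy layout: H x W x C.
hwc = np.zeros((224, 224, 3), dtype=np.float32)
hwc[10, 20, 2] = 1.0  # mark one pixel in the last channel

# Same axis swap as ToTensor.__call__: move the channel axis to the front.
chw = hwc.transpose((2, 0, 1))

print(hwc.shape)  # (224, 224, 3)
print(chw.shape)  # (3, 224, 224)
print(chw[2, 10, 20])  # 1.0
```

The marked pixel moves from index [row, column, channel] to [channel, row, column], which is the C x H x W layout noted in the ToTensor comments.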