After defining the transforms for the training and test phases, I want to prepare the training and test datasets like this:
train_dataset = Caltech(DATA_DIR, split='train', transform=train_transform)
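Here DATA_DIR contains one subfolder per category, and each subfolder contains the JPG images to be transformed. I have to implement a Caltech class that builds the training and test datasets from the lists stored in separate txt files, and that applies the supplied transform to each image: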
from torchvision.datasets import VisionDataset


class Caltech(VisionDataset):
    def __init__(self, root, split='train', transform=None, target_transform=None):
        super(Caltech, self).__init__(root, transform=transform, target_transform=target_transform)
        self.split = split  # This defines the split you are going to use
                            # (split files are called 'train.txt' and 'test.txt')
        '''
        - Here you should implement the logic for reading the splits files and accessing elements
        - PyTorch Dataset classes use indexes to read elements
        - You should provide a way for the __getitem__ method to access the image-label pair
          through the index
        - Labels should start from 0, so for Caltech you will have labels 0...100 (excluding the background class)
        '''

    def __getitem__(self, index):
        '''
        __getitem__ should access an element through its index
        Args:
            index (int): Index
        Returns:
            tuple: (sample, target) where target is class_index of the target class.
        '''
        # image and label for `index` still have to be looked up here (not implemented yet)
        # Applies preprocessing when accessing the image
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def __len__(self):
        '''
        The __len__ method returns the length of the dataset
        It is mandatory, as this is used by several other components
        '''
        length = ...  # still to be implemented
        return length
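One possible way to fill in the missing indexing logic is sketched below. It is not a definitive implementation, and it rests on several assumptions that the post does not confirm: each non-empty line of a split file is a path relative to the data folder of the form `<class_name>/<image_name>.jpg`, the background folder name starts with `BACKGROUND`, and labels are assigned by sorting the class folder names alphabetically. The helper name `load_split` and its parameters are hypothetical.

```python
import os


def load_split(root, split_file, skip_prefix="BACKGROUND"):
    """Return (samples, class_to_idx) for one split file.

    Assumption: every non-empty line of `split_file` looks like
    '<class_name>/<image_name>.jpg', relative to `root`.
    """
    # Build the label mapping from the class folders under `root`, so that
    # the train and test splits share the same class -> index assignment.
    classes = sorted(
        d for d in os.listdir(root)
        if os.path.isdir(os.path.join(root, d)) and not d.startswith(skip_prefix)
    )
    class_to_idx = {name: i for i, name in enumerate(classes)}

    samples = []
    with open(split_file) as f:
        for line in f:
            rel_path = line.strip()
            if not rel_path:
                continue  # skip blank lines
            class_name = rel_path.split("/")[0]
            # Entries of the background class (or any unknown folder) are dropped.
            if class_name in class_to_idx:
                samples.append((os.path.join(root, rel_path), class_to_idx[class_name]))
    return samples, class_to_idx


# Possible wiring inside the Caltech class (a sketch, not tested here):
#   __init__:    self.samples, self.class_to_idx = load_split(root, path_to_split_txt)
#   __getitem__: path, label = self.samples[index]
#                image = PIL.Image.open(path).convert('RGB')
#   __len__:     return len(self.samples)
```

With a list of `(path, label)` pairs stored on the instance, `__getitem__` only needs to index into that list and open the image, and `__len__` reduces to the length of the list. Where the `train.txt` and `test.txt` files actually live is not stated in the post, so the path passed as `split_file` has to be adapted.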