Caltech 101: splitting the dataset into train and test sets

After defining the transforms for the training and test phases, I want to prepare the training and test datasets like this:

train_dataset = Caltech(DATA_DIR, split='train', transform=train_transform)

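Here DATA_DIR contains one subfolder per category, and each subfolder contains the JPG images to be transformed. I have a Caltech class to implement so that the training and test datasets are built from the lists contained in separate txt files, with the provided transforms applied to the images: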

from torchvision.datasets import VisionDataset

class Caltech(VisionDataset):
    def __init__(self, root, split='train', transform=None, target_transform=None):
        super(Caltech, self).__init__(root, transform=transform, target_transform=target_transform)

        self.split = split # This defines the split you are going to use
                           # (split files are called 'train.txt' and 'test.txt')

        '''
        - Here you should implement the logic for reading the splits files and accessing elements
        - PyTorch Dataset classes use indexes to read elements
        - You should provide a way for the __getitem__ method to access the image-label pair
          through the index
        - Labels should start from 0, so for Caltech you will have labels 0...100 (excluding the background class)
        '''

    def __getitem__(self, index):
        '''
        __getitem__ should access an element through its index
        Args:
            index (int): Index

        Returns:
            tuple: (sample, target) where target is class_index of the target class.
        '''

        # Applies preprocessing when accessing the image
        if self.transform is not None:
            image = self.transform(image)

        return image, label

    def __len__(self):
        '''
        The __len__ method returns the length of the dataset
        It is mandatory, as this is used by several other components
        '''
        length = ...  # to be filled in: the number of samples in the selected split
        return length
