PyTorch Training
Getting Started
aiaccel's training application is a wrapper around PyTorch Lightning and can be run as follows:
python -m aiaccel.torch.apps.train config.yaml
The config file config.yaml typically consists of three top-level sections, trainer, datamodule, and task, as follows:
config.yaml
_base_: ${base_config_path}/train_base.yaml

trainer:
  max_epochs: 10

  callbacks:
    - _target_: lightning.pytorch.callbacks.ModelCheckpoint
      filename: "{epoch:04d}"
      save_last: True
      save_top_k: -1

datamodule:
  _target_: aiaccel.torch.lightning.datamodules.SingleDataModule

  train_dataset_fn:
    _partial_: True
    _target_: torchvision.datasets.MNIST

    root: "./dataset"
    train: True
    download: True

    transform:
      _target_: torchvision.transforms.Compose
      transforms:
        - _target_: torchvision.transforms.Resize
          size: [[256, 256]]
        - _target_: torchvision.transforms.Grayscale
          num_output_channels: 3
        - _target_: torchvision.transforms.ToTensor
        - _target_: torchvision.transforms.Normalize
          mean: [0.5]
          std: [0.5]

  val_dataset_fn:
    _partial_: True
    _inherit_: ${datamodule.train_dataset_fn}

    train: False

  batch_size: 128
  wrap_scatter_dataset: False

task:
  _target_: my_task.MyTask
  num_classes: 10

  model:
    _target_: torchvision.models.resnet50
    weights:
      _target_: hydra.utils.get_object
      path: torchvision.models.ResNet50_Weights.DEFAULT

  optimizer_config:
    _target_: aiaccel.torch.lightning.OptimizerConfig
    optimizer_generator:
      _partial_: True
      _target_: torch.optim.Adam
      lr: 1.e-4
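The special keys in this config (`_target_`, `_partial_`, `_inherit_`) can be easier to read with a toy model of how they might be resolved. The following is a minimal sketch, not the actual Hydra or aiaccel implementation. It assumes that `_target_` names a callable by dotted path, that `_partial_: True` yields a `functools.partial` to be called later instead of an object, and that `_inherit_` merges the referenced node with the local keys, with local keys winning (which is what the `train: False` override in `val_dataset_fn` suggests). The helper names `locate`, `resolve_inherit`, and `instantiate` are hypothetical, and standard-library classes stand in for the torchvision targets so the example runs without extra dependencies.

```python
import importlib
from functools import partial
from typing import Any

# Keys treated as directives rather than keyword arguments (assumption of this sketch).
RESERVED_KEYS = {"_target_", "_partial_", "_inherit_"}


def locate(path: str) -> Any:
    """Import the object named by a dotted path, e.g. 'datetime.timedelta'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def resolve_inherit(node: dict) -> dict:
    """Merge the node referenced by `_inherit_` with the local keys (local keys win)."""
    if "_inherit_" not in node:
        return node
    base = resolve_inherit(node["_inherit_"])
    overrides = {k: v for k, v in node.items() if k != "_inherit_"}
    return {**base, **overrides}


def instantiate(node: Any) -> Any:
    """Recursively turn `_target_` dicts into objects, or into partials if requested."""
    if isinstance(node, list):
        return [instantiate(item) for item in node]
    if not isinstance(node, dict):
        return node
    node = resolve_inherit(node)
    kwargs = {k: instantiate(v) for k, v in node.items() if k not in RESERVED_KEYS}
    if "_target_" not in node:
        return kwargs  # plain mapping: keep it as a dict of built values
    target = locate(node["_target_"])
    if node.get("_partial_", False):
        return partial(target, **kwargs)  # deferred: the caller invokes it later
    return target(**kwargs)


# Stand-in config: stdlib classes replace the torchvision targets.
train_fn_cfg = {"_partial_": True, "_target_": "datetime.timedelta", "hours": 1, "minutes": 30}
config = {
    "train_fn": train_fn_cfg,
    # Plays the role of `${datamodule.train_dataset_fn}` after interpolation.
    "val_fn": {"_partial_": True, "_inherit_": train_fn_cfg, "minutes": 0},
    "ratio": {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4},
}

built = instantiate(config)
train_delta = built["train_fn"]()  # timedelta(hours=1, minutes=30)
val_delta = built["val_fn"]()      # same target, with minutes overridden to 0
```

In this reading, `train_dataset_fn` and `val_dataset_fn` are factories rather than datasets, so the datamodule can decide when to construct them, and `val_dataset_fn` only has to state what differs from the training dataset.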
Distributed Training
WIP…
Other Utilities
Other utilities are listed in the API Reference.