Learn the Basics || Quickstart || Tensors || Datasets & DataLoaders || Transforms || Build Model || Autograd || Optimization || Save & Load Model

Save and Load the Model

In this section we will look at how to persist model state with saving, loading and running model predictions.

import torch
import torchvision.models as models

Saving and Loading Model Weights

PyTorch models store the learned parameters in an internal state dictionary, called state_dict. These can be persisted via the torch.save method:

model = models.vgg16(weights='IMAGENET1K_V1')
torch.save(model.state_dict(), 'model_weights.pth')
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /workspace/.cache/torch/hub/checkpoints/vgg16-397923af.pth

100%|##########| 528M/528M [00:08<00:00, 68.6MB/s]
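
Because the state_dict is an ordinary Python mapping from parameter names to tensors, it can be inspected like any dictionary before saving or after loading. A minimal sketch (the names and shapes in the comments are what VGG16 happens to produce):

state_dict = model.state_dict()
for name, tensor in list(state_dict.items())[:3]:   # peek at the first few entries
    print(name, tuple(tensor.shape))
# features.0.weight (64, 3, 3, 3)
# features.0.bias (64,)
# features.2.weight (64, 64, 3, 3)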

To load model weights, you need to create an instance of the same model first, and then load the parameters using the load_state_dict() method.

model = models.vgg16() # we do not specify ``weights``, i.e. create an untrained model
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()
VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)

NOTE

Be sure to call the model.eval() method before inferencing to set the dropout and batch normalization layers to evaluation mode. Failing to do this will yield inconsistent inference results.
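
To see the difference the mode makes, here is a minimal sketch using a standalone dropout layer (nothing here beyond core torch and torch.nn):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()      # training mode: elements are randomly zeroed, the rest scaled by 1/(1-p)
print(drop(x))    # output varies from call to call

drop.eval()       # evaluation mode: dropout becomes a no-op
print(drop(x))    # deterministic, all ones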

Saving and Loading Models with Shapes

When loading model weights, we needed to instantiate the model class first, because the class defines the structure of the network. If we want to save the structure of this class together with the model, we can pass model (and not model.state_dict()) to the saving function:

torch.save(model, 'model.pth')

We can then load the model like this:

model = torch.load('model.pth')
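
The loaded model can then be used for inference as usual. A minimal sketch with a dummy input (VGG16 expects 3x224x224 images):

model.eval()                          # set dropout and batch norm to evaluation mode
x = torch.rand(1, 3, 224, 224)        # a random stand-in for an RGB image batch
with torch.no_grad():                 # disable gradient tracking for inference
    logits = model(x)
print(logits.shape)                   # torch.Size([1, 1000]), one score per ImageNet class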

NOTE

This approach uses the Python pickle module when serializing the model, thus it relies on the actual class definition to be available when loading the model.
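
For example, a model saved this way can only be loaded where its class is importable. A minimal sketch with a hypothetical TinyNet class (the weights_only=False argument reflects the stricter default torch.load adopted in recent PyTorch releases; older versions accept the call without it):

import torch
import torch.nn as nn

class TinyNet(nn.Module):             # hypothetical custom model, not part of this tutorial
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

torch.save(TinyNet(), 'tiny.pth')

# Loading 'tiny.pth' in another script works only if TinyNet is defined
# (or importable) there too; otherwise unpickling fails.
model = torch.load('tiny.pth', weights_only=False)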

