Implementing Data Parallelism in Python with torch.nn
Published: 2023-12-11 05:52:15
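Data parallelism in PyTorch is available through two classes: torch.nn.DataParallel and torch.nn.parallel.DistributedDataParallel. Both replicate the model on several GPUs and split each input batch across them, so the replicas process different slices of the batch at the same time and the total batch can be larger than what fits on a single GPU.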
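To make the batch splitting concrete, here is a minimal pure-Python sketch with no PyTorch dependency. It assumes the batch is divided the way torch.chunk divides a tensor: equal chunks of size ceil(batch_size / num_replicas), with the last one possibly smaller. The helper name split_batch is hypothetical and is not part of PyTorch.

```python
import math

def split_batch(batch_size, num_replicas):
    """Return the per-replica chunk sizes for one batch (hypothetical helper)."""
    if batch_size <= 0 or num_replicas <= 0:
        raise ValueError("batch_size and num_replicas must be positive")
    # Assumption: chunks are sized like torch.chunk, i.e. ceil(batch / replicas)
    chunk = math.ceil(batch_size / num_replicas)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes

# Show how a batch of 64 samples would be shared by 1 to 4 replicas
for gpus in (1, 2, 3, 4):
    print(gpus, split_batch(64, gpus))
```

With a batch that does not divide evenly, the last replica simply receives fewer samples, which is one reason to pick a batch size that is a multiple of the number of GPUs.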
The following example uses torch.nn.DataParallel.
First, import the required libraries and modules:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.nn.utils as utils
import torchvision.datasets as datasets
import torchvision.transforms as transforms
Next, define a simple convolutional neural network:
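class ConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3)
        self.conv2 = nn.Conv2d(64, 128, kernel_size=3)
        # 32x32 input: conv1 -> 30x30, pool -> 15x15, conv2 -> 13x13, pool -> 6x6
        self.fc1 = nn.Linear(128 * 6 * 6, 1024)
        self.fc2 = nn.Linear(1024, 10)

    def forward(self, x):
        # 2x2 max pooling after each convolution halves the spatial size
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # flatten to (batch, features) while keeping the batch dimension
        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x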
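The number of input features of the first fully connected layer has to equal the flattened output of the last convolutional stage, so the arithmetic is worth checking. A minimal sketch, assuming 32x32 CIFAR-10 images and 3x3 convolutions with stride 1 and no padding; the second case additionally assumes a 2x2 max pooling after each convolution. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_features(size, channels, pool):
    """Features per image after two 3x3 convolutions,
    each optionally followed by 2x2 max pooling."""
    for _ in range(2):
        size = conv_out(size, kernel=3)
        if pool:
            size = conv_out(size, kernel=2, stride=2)
    return channels * size * size

print(flattened_features(32, 128, pool=False))  # 128 * 28 * 28
print(flattened_features(32, 128, pool=True))   # 128 * 6 * 6
```

Whatever value this arithmetic gives for the chosen layers is the value the first nn.Linear must use; a mismatch surfaces as a shape error in the first forward pass.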
Then define the datasets and data loaders, using ToTensor as the transform:
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True,
                                 transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True,
                                transform=transforms.ToTensor())
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=64, shuffle=False)
Next, define the training function:
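def train(model, optimizer, criterion, data_loader):
    model.train()
    # use the device the model parameters live on (the first GPU under DataParallel)
    device = next(model.parameters()).device
    for inputs, targets in data_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)  # DataParallel splits the batch across the GPUs
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()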
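For a model without batch-dependent layers such as batch normalization, splitting the batch does not change the result of a training step: when the loss is the mean over the whole batch, the full-batch gradient equals the sum of the contributions computed on each slice. The pure-Python sketch below checks this for a hypothetical one-parameter model y = w * x with squared error. It illustrates the idea only and is not how PyTorch implements it.

```python
def grad_sum(w, xs, ys):
    """Sum over samples of d/dw (w * x - y) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

w = 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
n = len(xs)

# Gradient of the mean loss computed on the full batch
full_grad = grad_sum(w, xs, ys) / n

# The same batch split into two slices, as two replicas would see it
shards = [(xs[:3], ys[:3]), (xs[3:], ys[3:])]
sharded_grad = sum(grad_sum(w, sx, sy) for sx, sy in shards) / n

print(full_grad, sharded_grad)
```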
Then instantiate the model and define the optimizer and the loss function:
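model = ConvNet()
model = nn.DataParallel(model)  # wrap the model for data parallelism with torch.nn.DataParallel
# DataParallel expects the parameters on the first GPU; fall back to the CPU if there is none
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()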
Finally, run the training and evaluation loop:
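for epoch in range(10):
    train(model, optimizer, criterion, train_loader)
    model.eval()
    device = next(model.parameters()).device
    test_loss = 0.0
    correct = 0
    with torch.no_grad():
        for inputs, targets in test_loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            # criterion returns the mean loss of the batch
            test_loss += criterion(outputs, targets).item()
            _, predicted = outputs.max(1)
            correct += predicted.eq(targets).sum().item()
    # average the per-batch mean losses over the number of batches
    test_loss /= len(test_loader)
    accuracy = correct / len(test_loader.dataset)
    print(f'Epoch {epoch}: Test Loss: {test_loss:.4f}, Accuracy: {accuracy:.4f}')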
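Because nn.CrossEntropyLoss averages over the batch by default, the accumulated per-batch values should be averaged over the number of batches, or weighted by batch size when an exact per-sample mean is wanted. A small pure-Python sketch of both options, using made-up loss values (the numbers are illustrative, not measured):

```python
# (mean_loss, batch_size) pairs; the values are made up for illustration
batches = [(2.0, 64), (1.0, 64), (4.0, 16)]

# Average of the per-batch means: every batch counts equally
mean_of_means = sum(loss for loss, _ in batches) / len(batches)

# Exact per-sample mean: weight each batch mean by its batch size
total_samples = sum(size for _, size in batches)
per_sample_mean = sum(loss * size for loss, size in batches) / total_samples

print(mean_of_means, per_sample_mean)
```

The two values differ only when batches have different sizes. With the 10,000 CIFAR-10 test images and a batch size of 64, the last batch holds 16 images, so the difference is small but not zero.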
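The example above shows how to use torch.nn.DataParallel to train and evaluate a model with data parallelism across multiple GPUs. Adjust the model architecture and the hyperparameters to suit your own needs.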
