Image Recognition in Python with the CIFARNet Model

Published: 2024-01-06 15:54:39

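CIFARNet is a deep learning model for image classification that is particularly well suited to the CIFAR-10 dataset. CIFAR-10 contains 60,000 color images of 32x32 pixels in 10 classes, with 6,000 images per class. The standard split is 50,000 training images and 10,000 test images.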

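The following example shows how to use a CIFARNet model for image recognition in Python.

First, install the required libraries by running the following command on the command line: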

pip install torch numpy matplotlib torchvision

Next, import the required modules and functions, most of them from PyTorch and torchvision:

import torch
import torch.nn as nn
import torch.nn.functional as F  # provides F.relu, used in CIFARNet.forward
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt

Define a function load_cifar10 to load the CIFAR-10 dataset:

def load_cifar10():
    # Convert images to tensors in [0, 1], then normalize each channel to [-1, 1].
    transform = transforms.Compose(
        [transforms.ToTensor(),
         transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

    # Training split; downloaded to ./data on first use.
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = DataLoader(trainset, batch_size=4,
                             shuffle=True, num_workers=2)

    # Test split; left unshuffled so the evaluation order is stable.
    testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                           download=True, transform=transform)
    testloader = DataLoader(testset, batch_size=4,
                            shuffle=False, num_workers=2)

    # Class names, indexed by integer label.
    classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

    return trainloader, testloader, classes
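
To make the effect of transforms.Normalize concrete: ToTensor scales each channel value to [0, 1], and Normalize then computes (x - mean) / std per channel, which with mean 0.5 and std 0.5 maps values into [-1, 1]. The plain-Python sketch below reproduces that arithmetic on single values. The helper names normalize and denormalize are illustrative assumptions and are not part of torchvision.

```python
def normalize(value, mean=0.5, std=0.5):
    """Same per-channel formula that transforms.Normalize applies: (x - mean) / std."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Invert normalize, for example before displaying an image with matplotlib."""
    return value * std + mean


if __name__ == "__main__":
    # Show where the darkest, middle, and brightest channel values end up.
    for pixel in (0.0, 0.5, 1.0):
        print(pixel, "->", normalize(pixel))
```

Undoing the normalization this way is what you would typically do before plotting a batch, since matplotlib expects values in [0, 1].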

Next, define the CIFARNet model. Its structure resembles the classic LeNet-5, adapted to CIFAR-10 (for example, the first convolution takes three-channel RGB input):

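class CIFARNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)        # 3 input channels (RGB), 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)         # 2x2 max pooling with stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)       # 6 input channels, 16 output channels, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)  # 16 feature maps of 5x5 after the second pooling
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)           # one output per CIFAR-10 class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))   # (N, 3, 32, 32) -> (N, 6, 14, 14)
        x = self.pool(F.relu(self.conv2(x)))   # -> (N, 16, 5, 5)
        x = x.view(-1, 16 * 5 * 5)             # flatten to (N, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                        # raw class scores (logits)
        return x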
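
The input size of fc1, 16 * 5 * 5, follows from how each layer shrinks a 32x32 image. The sketch below traces that arithmetic in plain Python using the standard output-size formula for unpadded, undilated convolution and pooling windows. The function names conv_out and cifarnet_feature_size are illustrative assumptions, not PyTorch APIs.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def cifarnet_feature_size(side=32):
    """Number of features reaching fc1 for a side x side input image."""
    side = conv_out(side, kernel=5)            # conv1: 32 -> 28
    side = conv_out(side, kernel=2, stride=2)  # pool:  28 -> 14
    side = conv_out(side, kernel=5)            # conv2: 14 -> 10
    side = conv_out(side, kernel=2, stride=2)  # pool:  10 -> 5
    return 16 * side * side                    # conv2 produces 16 channels


if __name__ == "__main__":
    print(cifarnet_feature_size())
```

If you change the kernel sizes or the input resolution, rerunning this calculation gives the value that the first nn.Linear layer and the x.view call must both use.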

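Next, define the training procedure. It runs 2 epochs of SGD with cross-entropy loss and prints the average loss every 2,000 mini-batches: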

def train_net(net, trainloader):
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

    for epoch in range(2):  # two passes over the training set
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data

            # Clear gradients left over from the previous step.
            optimizer.zero_grad()

            # Forward pass, loss, backward pass, parameter update.
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # Report the average loss over each block of 2000 mini-batches.
            running_loss += loss.item()
            if i % 2000 == 1999:
                print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0
    print('Finished Training')
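
The logging cadence in train_net can be worked out in advance. Assuming the standard 50,000-image training split and the batch size of 4 used above, each epoch has 12,500 mini-batches, so the condition i % 2000 == 1999 fires six times per epoch. The plain-Python sketch below reproduces that count; logged_steps is a hypothetical helper written for this illustration.

```python
import math


def logged_steps(num_samples=50_000, batch_size=4, log_every=2000):
    """Return (batches per epoch, 1-based batch numbers at which a loss line is printed)."""
    # DataLoader keeps the last partial batch by default, hence the ceiling.
    batches = math.ceil(num_samples / batch_size)
    steps = [i + 1 for i in range(batches) if i % log_every == log_every - 1]
    return batches, steps


if __name__ == "__main__":
    batches, steps = logged_steps()
    print(batches, steps)
```

The last 500 mini-batches of each epoch never reach a logging step, so their loss is not reported before running_loss is reset at the start of the next epoch.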

Define a function to measure the accuracy of the trained model on the test set:

def test_net(net, testloader):
    correct = 0
    total = 0
    # Gradients are not needed for evaluation.
    with torch.no_grad():
        for data in testloader:
            images, labels = data
            outputs = net(images)
            # The predicted class is the index of the highest score.
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the 10000 test images: %.2f %%' % (
        100 * correct / total))
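
Overall accuracy hides which classes the model confuses, and the classes tuple returned by load_cifar10 is otherwise unused. As a minimal sketch, the helper below computes accuracy per class from plain lists of integer labels. The name per_class_accuracy is hypothetical, and feeding it from test_net (for instance by extending two lists with predicted.tolist() and labels.tolist() in each batch) is an assumption about how you might wire it in.

```python
from collections import Counter


def per_class_accuracy(predicted, labels, classes):
    """Return {class name: accuracy in percent} from integer class indices."""
    correct = Counter()
    total = Counter()
    for pred, label in zip(predicted, labels):
        total[label] += 1
        if pred == label:
            correct[label] += 1
    return {
        name: 100.0 * correct[idx] / total[idx]
        for idx, name in enumerate(classes)
        if total[idx] > 0  # skip classes that never appear in labels
    }


if __name__ == "__main__":
    # Tiny made-up example, not real model output.
    demo = per_class_accuracy([0, 1, 1, 2], [0, 1, 2, 2], ('plane', 'car', 'bird'))
    print(demo)
```

Working on plain lists keeps the helper independent of PyTorch, at the cost of copying predictions off the tensor in each batch.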

Now define the main function and call the functions above to train and test the model:

def main():
    trainloader, testloader, classes = load_cifar10()
    net = CIFARNet()
    train_net(net, trainloader)
    test_net(net, testloader)

if __name__ == "__main__":
    main()

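Finally, run the script and observe the training loss and the test accuracy. A sample run printed the following; exact numbers vary between runs because the weights are initialized randomly and the training data is shuffled: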

[1,  2000] loss: 2.234
[1,  4000] loss: 1.865
[1,  6000] loss: 1.663
[1,  8000] loss: 1.578
[1, 10000] loss: 1.531
[1, 12000] loss: 1.516
[2,  2000] loss: 1.445
[2,  4000] loss: 1.439
[2,  6000] loss: 1.440
[2,  8000] loss: 1.408
[2, 10000] loss: 1.387
[2, 12000] loss: 1.383
Finished Training
Accuracy of the network on the 10000 test images: 50.73 %
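
For context when reading these numbers, it helps to compare them with chance. A classifier that assigns equal probability to each of the 10 classes has a cross-entropy of ln(10), about 2.303, and an expected accuracy of 10% on a balanced test set. The first logged loss, 2.234, sits just below that baseline, and 50.73% is roughly five times chance. The short sketch below computes the baseline; chance_baseline is a hypothetical helper name.

```python
import math


def chance_baseline(num_classes=10):
    """Cross-entropy (natural log) and accuracy in percent of a uniform guess."""
    loss = math.log(num_classes)      # -ln(1 / num_classes)
    accuracy = 100.0 / num_classes    # assumes classes are equally frequent
    return loss, accuracy


if __name__ == "__main__":
    loss, accuracy = chance_baseline()
    print('chance loss: %.3f, chance accuracy: %.1f %%' % (loss, accuracy))
```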

As this example shows, a CIFARNet model can classify the images in the CIFAR-10 dataset. You can tune and refine the model further as needed to improve its accuracy.