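Computing precision and recall for multi-class classification with MXNet
Published: 2024-01-07 19:58:47
MXNet is a deep learning framework that provides a rich set of tools for multi-class classification tasks. In this post we use MXNet to compute precision and recall for a multi-class classifier.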
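Understanding precision and recall:
- Precision is the fraction of samples classified as a given class that truly belong to that class.
- Recall is the fraction of samples that truly belong to a given class that are correctly classified as that class.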
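The two definitions above can be made concrete with a small worked example. The sketch below is plain Python rather than MXNet, and the helper name `per_class_precision_recall` and the toy labels are hypothetical, chosen only for illustration. It treats each class in turn as the positive class (one-vs-rest) and counts true positives, false positives, and false negatives.

```python
def per_class_precision_recall(y_true, y_pred, num_classes):
    """Return {class: (precision, recall)} using one-vs-rest counts."""
    results = {}
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if p != c and t == c)
        # Guard against division by zero when a class is never predicted or never present
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        results[c] = (precision, recall)
    return results


# Hypothetical toy labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = per_class_precision_recall(y_true, y_pred, num_classes=3)
for c, (p, r) in scores.items():
    print(f"class {c}: precision={p:.2f}, recall={r:.2f}")
```

In this toy data, class 1 is predicted three times but only two of those are correct, so its precision is 2/3, while both true class-1 samples are found, so its recall is 1.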
First, we load the dataset and preprocess it. We use MNIST as the example dataset; it contains images of handwritten digits together with their labels.
import mxnet as mx
from mxnet import gluon, nd
from mxnet.gluon.data.vision import transforms
from mxnet.gluon.data import DataLoader
from mxnet.gluon import nn
# Load the dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(0.13, 0.31)
])
train_dataset = gluon.data.vision.datasets.MNIST(train=True).transform_first(transform)
test_dataset = gluon.data.vision.datasets.MNIST(train=False).transform_first(transform)
batch_size = 128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
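To see what the preprocessing does to a single pixel, here is a minimal plain-Python sketch. It assumes the usual behaviour of the two transforms: `ToTensor` scales 8-bit pixel values into the range 0 to 1, and `Normalize(0.13, 0.31)` then subtracts the mean and divides by the standard deviation. The helper name `to_tensor_and_normalize` is hypothetical.

```python
import math


def to_tensor_and_normalize(pixel, mean=0.13, std=0.31):
    """Mimic ToTensor (scale a uint8 value to [0, 1]) followed by Normalize."""
    scaled = pixel / 255.0
    return (scaled - mean) / std


for pixel in (0, 128, 255):
    print(pixel, round(to_tensor_and_normalize(pixel), 3))

# Number of batches per epoch with batch_size = 128
# (MNIST has 60000 training and 10000 test images)
train_batches = math.ceil(60000 / 128)
test_batches = math.ceil(10000 / 128)
print(train_batches, test_batches)
```

Because neither dataset size is a multiple of 128, the last batch of each loader is smaller than the others.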
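Next, we define a simple convolutional neural network. The model consists of two convolutional layers followed by two fully connected layers.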
class Net(nn.HybridBlock):
    def __init__(self, num_classes, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(channels=16, kernel_size=(3, 3), activation='relu')
        self.conv2 = nn.Conv2D(channels=32, kernel_size=(3, 3), activation='relu')
        # Dense flattens its input by default, so no explicit Flatten layer is needed
        self.fc1 = nn.Dense(512, activation='relu')
        self.fc2 = nn.Dense(num_classes)

    def hybrid_forward(self, F, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.fc1(x)
        x = self.fc2(x)
        return x
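num_classes = 10
net = Net(num_classes)
# Parameters must be initialized before the Trainer is created or the network is called
ctx = mx.gpu() if mx.context.num_gpus() else mx.cpu()
net.initialize(mx.init.Xavier(), ctx=ctx)
net.hybridize()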
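It can help to check how large the tensor entering the first fully connected layer is. The sketch below is plain Python and assumes the `Conv2D` defaults of stride 1 and no padding, with 28x28 MNIST inputs; the helper `conv_output_size` is a hypothetical name introduced for illustration.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


height = width = 28  # MNIST image size
for kernel in (3, 3):  # conv1 and conv2 both use 3x3 kernels
    height = conv_output_size(height, kernel)
    width = conv_output_size(width, kernel)

channels = 32  # output channels of conv2
flattened = channels * height * width
print(height, width, flattened)
```

Under these assumptions each image shrinks from 28x28 to 24x24 after the two convolutions, so the first dense layer receives 32 * 24 * 24 = 18432 features per sample. Adding pooling layers would reduce this considerably.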
Define the loss function and the optimizer:
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
optimizer = gluon.Trainer(net.collect_params(), 'adam')
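For intuition about what the loss measures, here is a minimal plain-Python sketch of softmax cross-entropy for one sample with an integer class label. It illustrates the formula only, not MXNet's implementation, and the function name `softmax_cross_entropy` is hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


logits = [2.0, 1.0, 0.1]
loss_correct = softmax_cross_entropy(logits, 0)  # label matches the largest logit
loss_wrong = softmax_cross_entropy(logits, 2)    # label matches the smallest logit
loss_uniform = softmax_cross_entropy([0.0, 0.0, 0.0], 1)
print(loss_correct, loss_wrong, loss_uniform)
```

The loss is small when the logit of the true class dominates and grows as the model assigns that class less probability. With uniform logits over three classes it equals log(3).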
Train the model:
def train(net, train_loader, loss_fn, optimizer, ctx):
    cumulative_loss = 0.0
    cumulative_accuracy = 0.0
    for i, (data, label) in enumerate(train_loader):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        # Record the forward pass so that gradients can be computed
        with mx.autograd.record():
            output = net(data)
            loss = loss_fn(output, label)
        loss.backward()
        # Normalize gradients by the actual batch size (the last batch may be smaller)
        optimizer.step(data.shape[0])
        cumulative_loss += nd.mean(loss).asscalar()
        predictions = nd.argmax(output, axis=1)
        # argmax returns float32, so cast the integer labels before comparing
        cumulative_accuracy += nd.mean(predictions == label.astype('float32')).asscalar()
    # Average the per-batch values over the epoch
    epoch_loss = cumulative_loss / len(train_loader)
    epoch_accuracy = cumulative_accuracy / len(train_loader)
    return epoch_loss, epoch_accuracy
ctx = mx.gpu() if mx.context.num_gpus() else mx.cpu()
num_epochs = 10
for epoch in range(num_epochs):
    train_loss, train_accuracy = train(net, train_loader, loss_fn, optimizer, ctx)
    print(f"Epoch {epoch+1}: Loss={train_loss:.4f}, Accuracy={train_accuracy:.4f}")
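The training function reports the mean of per-batch accuracies. When the last batch is smaller than the rest (60000 is not a multiple of 128), this differs slightly from the accuracy over all samples. The plain-Python sketch below illustrates the difference with hypothetical, deliberately exaggerated batch results.

```python
def mean_of_batch_means(batches):
    """Average the per-batch accuracies, giving every batch equal weight."""
    return sum(correct / size for correct, size in batches) / len(batches)


def sample_weighted_mean(batches):
    """Accuracy over all samples, giving every sample equal weight."""
    total_correct = sum(correct for correct, _ in batches)
    total_size = sum(size for _, size in batches)
    return total_correct / total_size


# Hypothetical (correct predictions, batch size) pairs; the last batch is smaller
batches = [(120, 128), (120, 128), (48, 96)]
print(f"mean of batch means:  {mean_of_batch_means(batches):.4f}")
print(f"sample-weighted mean: {sample_weighted_mean(batches):.4f}")
```

For MNIST the effect is small because only one of several hundred batches is short, but accumulating raw counts and dividing once at the end avoids the discrepancy entirely.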
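Note: the code above only trains the model; it does not compute precision or recall. To compute them, we run the model on the test set and compare its predictions with the true labels.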
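def evaluate(net, data_loader, ctx, num_classes=10):
    # Per-class counts, accumulated over the whole dataset
    true_positives = nd.zeros(num_classes, ctx=ctx)
    false_positives = nd.zeros(num_classes, ctx=ctx)
    false_negatives = nd.zeros(num_classes, ctx=ctx)
    for data, label in data_loader:
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx).astype('float32')
        output = net(data)
        predictions = nd.argmax(output, axis=1)
        # One-hot encode so that each column is a one-vs-rest binary problem;
        # multiplying raw class indices would only be valid for 0/1 labels
        pred_onehot = nd.one_hot(predictions, num_classes)
        label_onehot = nd.one_hot(label, num_classes)
        # Compute TP, FP, FN for every class
        true_positives += nd.sum(pred_onehot * label_onehot, axis=0)
        false_positives += nd.sum(pred_onehot * (1 - label_onehot), axis=0)
        false_negatives += nd.sum((1 - pred_onehot) * label_onehot, axis=0)
    # Compute per-class precision and recall; the epsilon avoids division by zero
    precision = true_positives / (true_positives + false_positives + 1e-7)
    recall = true_positives / (true_positives + false_negatives + 1e-7)
    # Macro average: unweighted mean over classes
    avg_precision = nd.mean(precision).asscalar()
    avg_recall = nd.mean(recall).asscalar()
    return avg_precision, avg_recall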
test_precision, test_recall = evaluate(net, test_loader, ctx)
print(f"Test Precision: {test_precision:.4f}, Test Recall: {test_recall:.4f}")
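The code above computes precision and recall on the test set. The counts of true positives, false positives, and false negatives are obtained with elementwise multiplication and summation, and precision and recall are then derived from those counts.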
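Once per-class counts are available, there is more than one way to reduce them to a single number. The plain-Python sketch below builds a confusion matrix from hypothetical toy labels and compares macro-averaged precision (the mean of the per-class values) with micro-averaged precision (pooling all counts first). The function names are illustrative and are not part of MXNet.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def macro_micro_precision(matrix):
    n = len(matrix)
    per_class = []
    total_tp = total_pred = 0
    for c in range(n):
        tp = matrix[c][c]
        predicted = sum(matrix[r][c] for r in range(n))  # column sum = TP + FP
        per_class.append(tp / predicted if predicted else 0.0)
        total_tp += tp
        total_pred += predicted
    macro = sum(per_class) / n
    micro = total_tp / total_pred if total_pred else 0.0
    return macro, micro


# Hypothetical imbalanced toy data: class 0 is much more frequent
y_true = [0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 2]
matrix = confusion_matrix(y_true, y_pred, num_classes=3)
macro, micro = macro_micro_precision(matrix)
print(matrix)
print(f"macro precision={macro:.4f}, micro precision={micro:.4f}")
```

Macro averaging weights every class equally, so a rare class with poor precision pulls the score down more than it does under micro averaging. For single-label multi-class problems, micro-averaged precision equals overall accuracy.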
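That completes the example of computing precision and recall for a multi-class classification task with MXNet. You can modify and extend it to suit your own tasks.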
