Performance Comparison of Common Optimizer Algorithms in Python
Published: 2024-01-02 01:54:32
In Python, commonly used optimizer algorithms include gradient descent (Gradient Descent), stochastic gradient descent (Stochastic Gradient Descent), the momentum optimizer (Momentum Optimizer), and the Adam optimizer (Adam Optimizer). This article compares the performance of these optimizers using a simple linear regression model.
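Before building the full example, the following self-contained sketch (with purely illustrative values, separate from the model defined below) shows a single parameter update under each rule, where w is the parameter vector, g a toy gradient, and lr the learning rate:
import numpy as np

lr, gamma, beta1, beta2, eps = 0.01, 0.9, 0.9, 0.999, 1e-8
w = np.zeros(3)                      # parameters being optimized
g = np.array([0.5, -1.0, 2.0])       # a toy gradient of the loss w.r.t. w

# Gradient descent / stochastic gradient descent: step against the gradient
w_gd = w - lr * g

# Momentum: accumulate an exponentially decaying velocity and step along it
v = np.zeros(3)
v = gamma * v + lr * g
w_momentum = w - v

# Adam: bias-corrected estimates of the gradient's first and second moments
m, s, t = np.zeros(3), np.zeros(3), 1
m = beta1 * m + (1 - beta1) * g
s = beta2 * s + (1 - beta2) * g ** 2
m_hat, s_hat = m / (1 - beta1 ** t), s / (1 - beta2 ** t)
w_adam = w - lr * m_hat / (np.sqrt(s_hat) + eps)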
First, we import the necessary libraries and generate the dataset:
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate a linear regression dataset
X, y = make_regression(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Next, we define a simple linear regression model:
class LinearRegression:
    def __init__(self, lr=0.001):
        self.lr = lr
        self.weights = None
        self.bias = None

    def fit(self, X, y, epochs=100):
        # Full-batch gradient descent on the mean squared error loss
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        for _ in range(epochs):
            y_pred = self.predict(X)
            dw = (1 / n_samples) * np.dot(X.T, (y_pred - y))
            db = (1 / n_samples) * np.sum(y_pred - y)
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias
Next, we use the linear regression model defined above and compare the performance of the different optimizer algorithms. For plain gradient descent we simply call fit; for the other optimizers we initialize the parameters ourselves and update them one sample at a time:
from sklearn.metrics import mean_squared_error
lr = 0.001
epochs = 100
optimizers = ['Gradient Descent', 'Stochastic Gradient Descent', 'Momentum Optimizer', 'Adam Optimizer']
mse_results = []
# Gradient descent
model = LinearRegression(lr=lr)
model.fit(X_train, y_train, epochs=epochs)
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mse_results.append(mse)
# Stochastic gradient descent
model = LinearRegression(lr=lr)
model.weights = np.zeros(X_train.shape[1])
model.bias = 0
for _ in range(epochs):
    for _ in range(len(X_train)):
        # Draw one random sample and take a single gradient step on it
        idx = np.random.randint(len(X_train))
        xi, yi = X_train[idx], y_train[idx]
        y_pred = np.dot(xi, model.weights) + model.bias
        dw = xi * (y_pred - yi)
        db = y_pred - yi
        model.weights -= lr * dw
        model.bias -= lr * db
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mse_results.append(mse)
# Momentum optimizer
model = LinearRegression(lr=lr)
model.weights = np.zeros(X_train.shape[1])
model.bias = 0
gamma = 0.9
v_w = np.zeros(X_train.shape[1])
v_b = 0
for _ in range(epochs):
    for i in range(len(X_train)):
        xi, yi = X_train[i], y_train[i]
        y_pred = np.dot(xi, model.weights) + model.bias
        dw = xi * (y_pred - yi)
        db = y_pred - yi
        # Accumulate an exponentially decaying velocity and step along it
        v_w = gamma * v_w + lr * dw
        v_b = gamma * v_b + lr * db
        model.weights -= v_w
        model.bias -= v_b
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mse_results.append(mse)
# Adam optimizer
model = LinearRegression(lr=lr)
model.weights = np.zeros(X_train.shape[1])
model.bias = 0
beta1 = 0.9
beta2 = 0.999
epsilon = 1e-8
m_w = np.zeros(X_train.shape[1])
v_w = np.zeros(X_train.shape[1])
m_b, v_b = 0, 0
t = 0
for _ in range(epochs):
    for i in range(len(X_train)):
        xi, yi = X_train[i], y_train[i]
        y_pred = np.dot(xi, model.weights) + model.bias
        dw = xi * (y_pred - yi)
        db = y_pred - yi
        t += 1
        # Exponentially decaying estimates of the first and second moments
        m_w = beta1 * m_w + (1 - beta1) * dw
        v_w = beta2 * v_w + (1 - beta2) * (dw ** 2)
        m_b = beta1 * m_b + (1 - beta1) * db
        v_b = beta2 * v_b + (1 - beta2) * (db ** 2)
        # Bias correction uses the step counter t
        m_w_hat = m_w / (1 - beta1 ** t)
        v_w_hat = v_w / (1 - beta2 ** t)
        m_b_hat = m_b / (1 - beta1 ** t)
        v_b_hat = v_b / (1 - beta2 ** t)
        model.weights -= lr * m_w_hat / (np.sqrt(v_w_hat) + epsilon)
        model.bias -= lr * m_b_hat / (np.sqrt(v_b_hat) + epsilon)
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
mse_results.append(mse)
# Print the results
for optimizer, mse in zip(optimizers, mse_results):
print(f"{optimizer}: Mean Squared Error = {mse}")
In the code above, we train the linear regression model with each optimizer in turn, compute its mean squared error (MSE) on the test set, and then compare the performance of the different optimizers.
Running this code produces a performance comparison of the optimizer algorithms. Keep in mind that the results may differ between runs, because stochastic gradient descent draws a random training sample at every update step.
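If reproducible comparisons are needed, one simple option (assuming, as in the code above, that only NumPy's global random state is involved) is to seed the generator before the stochastic loop:
import numpy as np

np.random.seed(0)  # fix the seed so the random sample indices repeat across runs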
The example here is only a simple linear regression model; in practice, different optimizer algorithms can behave quite differently on complex neural network models. Choosing a suitable optimizer therefore matters a great deal for a model's final performance and convergence speed.
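As a rough illustration only (this sketch uses PyTorch, which the article above does not depend on; the model, data, and hyperparameters are placeholders), switching optimizers in a deep learning framework is usually a one-line change:
import torch
import torch.nn as nn

model = nn.Linear(10, 1)       # a toy one-layer model
criterion = nn.MSELoss()
X = torch.randn(64, 10)        # random placeholder data
y = torch.randn(64, 1)

# Swap the optimizer here to compare plain SGD, SGD with momentum, or Adam
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()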
