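LightGBM Regression Prediction in Practice
[Abstract] Using the ensemble model LightGBM to solve a simple regression prediction task
🔋Core code🔋
1️⃣Baseline model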
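import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Load the datasets
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
# Build the features and labels (they are split into training and validation folds below)
X = train.drop(columns=['Id', 'SalePrice']).values  # Id is not a feature and SalePrice is the label, so both are dropped
y = train['SalePrice'].values  # label: SalePrice
# K-fold cross-validation
kf = KFold(n_splits=10)
rmse_scores = []
for train_indices, test_indices in kf.split(X):
    X_train, X_test = X[train_indices], X[test_indices]
    y_train, y_test = y[train_indices], y[test_indices]
    # Initialize the model
    LGBR = lgb.LGBMRegressor()  # baseline model with default parameters
    # Fit on the training fold
    LGBR.fit(X_train, y_train)
    # Predict on the held-out fold
    y_pred = LGBR.predict(X_test)
    # Evaluate: mean_squared_error returns MSE, so take the square root to get RMSE
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    # Collect the per-fold result
    rmse_scores.append(rmse)
print("rmse scores : ", rmse_scores)
print(f'average rmse scores : {np.mean(rmse_scores)}')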
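The cross-validation loop above can also be wrapped into a small reusable helper. The following is only a sketch under a few assumptions: the names `cv_rmse` and `make_model` are made up for illustration, the folds are shuffled (the loop above does not shuffle), and `LinearRegression` on synthetic data stands in for the real model and dataset so the snippet runs on its own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold


def cv_rmse(make_model, X, y, n_splits=10, seed=0):
    """Return the RMSE of a freshly built model on each held-out fold."""
    # shuffle=True is an assumption added here; the loop above uses unshuffled folds
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, valid_idx in kf.split(X):
        model = make_model()  # new model per fold, so folds do not share state
        model.fit(X[train_idx], y[train_idx])
        y_pred = model.predict(X[valid_idx])
        scores.append(float(np.sqrt(mean_squared_error(y[valid_idx], y_pred))))
    return scores


# Synthetic stand-in data (hypothetical, not the train.csv used above)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=200)

# LinearRegression is only a stand-in so the sketch runs without LightGBM installed
demo_scores = cv_rmse(LinearRegression, X_demo, y_demo, n_splits=5)
print(demo_scores, np.mean(demo_scores))
```

Because the helper only relies on `fit` and `predict`, passing `lgb.LGBMRegressor` as `make_model` together with the real `X` and `y` should work the same way.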
2️⃣Tuned model
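# X_train / X_test here are the arrays left over from the last cross-validation fold above
train_data = lgb.Dataset(X_train, label=y_train)  # training set
test_data = lgb.Dataset(X_test, label=y_test, reference=train_data)  # validation set
# Parameters
params = {
    'objective': 'regression',  # task objective
    'metric': 'rmse',  # evaluation metric
    'learning_rate': 0.1,  # learning rate
    'max_depth': 15,  # maximum tree depth
    'num_leaves': 20,  # maximum number of leaves per tree
}
# Train the model
# Early stopping is passed as a callback: the early_stopping_rounds keyword of lgb.train was removed in LightGBM 4.0
model = lgb.train(params=params,
                  train_set=train_data,
                  num_boost_round=300,
                  valid_names=['test'],
                  valid_sets=[test_data],
                  callbacks=[lgb.early_stopping(stopping_rounds=30)])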
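To make the early stopping setting above more concrete: training is allowed up to 300 rounds, but it stops once the validation metric has not improved for 30 consecutive rounds, and the best round is kept. The sketch below is a simplified illustration of that rule in plain Python, not LightGBM's internal implementation; the function name `best_round_with_patience` and the score history are hypothetical.

```python
def best_round_with_patience(valid_scores, patience):
    """Return (best_round, best_score, stopped_early) for a lower-is-better metric.

    Rounds are numbered from 1.
    """
    best_round, best_score = 0, float("inf")
    for round_no, score in enumerate(valid_scores, start=1):
        if score < best_score:
            # New best validation score: remember where it happened
            best_round, best_score = round_no, score
        elif round_no - best_round >= patience:
            # No improvement for `patience` rounds in a row: stop here
            return best_round, best_score, True
    return best_round, best_score, False


# Hypothetical validation RMSE per boosting round
history = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.60, 0.62]
result = best_round_with_patience(history, patience=3)
print(result)
```

In this toy history the best score appears at round 4, and the three following rounds bring no improvement, so the loop stops at round 7 and reports round 4 as the best one.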
3️⃣Prediction
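# RMSE on the validation set ('test') recorded during training
score = model.best_score['test']['rmse']
score
# Predict on the test set (Id is not a feature, so it is dropped)
test_pred = model.predict(test.drop('Id',axis=1).values)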
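The code above stops at `test_pred`. One possible next step, shown here only as a sketch, is to pair each prediction with its `Id` in a two-column table. The demo frame, the `feature_1` column, the predicted values, and the file name `submission.csv` are all hypothetical stand-ins, not part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for `test` and `test_pred` from the code above
test_demo = pd.DataFrame({"Id": [1, 2, 3], "feature_1": [0.5, 1.5, 2.5]})
test_pred_demo = np.array([120000.0, 155000.0, 180000.0])

# One row per test Id, with the prediction stored under the label name
submission = pd.DataFrame({"Id": test_demo["Id"], "SalePrice": test_pred_demo})

# to_csv without a path returns the CSV text; pass a path such as
# "submission.csv" (hypothetical name) to write a file instead
csv_text = submission.to_csv(index=False)
print(csv_text)
```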
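[Copyright notice] This article is original content by a Huawei Cloud community user. Reposts must credit the source (Huawei Cloud community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find suspected plagiarism in this community, you are welcome to report it by email with supporting evidence; once verified, the community will immediately remove the infringing content. Report email:
cloudbbs@huaweicloud.com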