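[Artificial Intelligence] Machine Learning and Intelligent Data Processing: The PCA Dimensionality Reduction Algorithm Applied to Handwritten Character Recognition, Including a [Custom Dataset]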

南蓬幽, posted 2022/05/15 21:16:09


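The PCA dimensionality reduction algorithm and its application

Use the PCA algorithm to implement handwritten character recognition. Requirements:

1. Reduce the dimensionality of the handwritten digits dataset;

2. Compare the accuracy of the two models (64-dimensional and 10-dimensional);

3. Run 10 rounds of 10-fold cross-validation on each model and plot the score comparison curves.

Experiment steps

1. Import the dataset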

from sklearn.datasets import load_digits
digits = load_digits()
train = digits.data
target = digits.target
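
Before going further, it can help to check what `load_digits` returns. The short sketch below only inspects the dataset; the variable name `first_image` is introduced here for illustration and is not part of the original code.

```python
from sklearn.datasets import load_digits

digits = load_digits()

# Each sample is an 8x8 grayscale image flattened into 64 features
print(digits.data.shape)    # (1797, 64)
print(digits.images.shape)  # (1797, 8, 8)

# Pixel intensities are small integers stored as floats
print(digits.data.min(), digits.data.max())

# Reshape one flattened row back into its 8x8 image
first_image = digits.data[0].reshape(8, 8)
print(first_image.shape, digits.target[0])
```

This is where the "64-dimensional" model in the requirements comes from: 8 x 8 = 64 pixel features per digit.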

2. Reduce the dimensionality of the handwritten digits dataset

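from sklearn.decomposition import PCA
# Keep the 10 leading principal components; whiten=True rescales each one to unit variance
pca = PCA(n_components=10,whiten=True)
# PCA is unsupervised, so it is fitted on the training features only (labels are not used)
pca.fit(x_train)
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)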
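
To see how much information the 10 components keep, one option is to look at `explained_variance_ratio_`. The following is a minimal, self-contained sketch that repeats the split and standardization used in the full listing; the names `scaler` and `cumulative` are assumptions introduced for this example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=33
)

# Standardize with statistics learned from the training split only
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# Project the 64 standardized features onto 10 principal components
pca = PCA(n_components=10, whiten=True)
x_train_pca = pca.fit_transform(x_train)
x_test_pca = pca.transform(x_test)
print(x_train_pca.shape, x_test_pca.shape)  # (1437, 10) (360, 10)

# Cumulative share of the training variance captured by the first k components
cumulative = np.cumsum(pca.explained_variance_ratio_)
for k, share in enumerate(cumulative, start=1):
    print(f"{k:2d} components: {share:.3f}")
```

If the cumulative share is low, raising `n_components` is the usual first thing to try before comparing accuracies.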

3. Compare the accuracy of the two models (64-dimensional and 10-dimensional)

64 dimensions

svc = SVC(kernel = 'rbf')
svc.fit(x_train,y_train)
y_predict = svc.predict(x_test)
print('The Accuracy of SVC is', svc.score(x_test, y_test))
print("classification report of SVC\n",classification_report(y_test, y_predict,
target_names=digits.target_names.astype(str)))

10 dimensions

svc = SVC(kernel = 'rbf')
svc.fit(x_train_pca,y_train)
y_pre_svc = svc.predict(x_test_pca)
print("The Accuracy of PCA_SVC is ", svc.score(x_test_pca,y_test))
print("classification report of PCA_SVC\n", classification_report(y_test, y_pre_svc,
target_names=digits.target_names.astype(str)))

4. Run 10 rounds of 10-fold cross-validation on each model and plot the score comparison curves.

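# Display the first 100 test images, each titled with its predicted label
# (samples and y_pre are defined in the full listing)
for i in range(100):
    # Create the subplot
    plt.subplot(10,10,i+1)
    # Show the grayscale image
    plt.imshow(samples[i].reshape(8,8),cmap='gray')
    title = str(y_pre[i])
    plt.title(title,color='red')
    # Turn off the axes
    plt.axis('off')
plt.show()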
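
The loop above draws predictions rather than cross-validation scores. One possible way to carry out the 10 rounds of 10-fold cross-validation named in this step is sketched below. It is an illustration, not the author's code: the pipelines, the dictionary names, the per-round `random_state`, and the output file name `cv_scores.png` are all assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Pipelines refit the scaler (and PCA) inside every fold, so no test data leaks in
models = {
    "SVC (64-d)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "PCA + SVC (10-d)": make_pipeline(
        StandardScaler(), PCA(n_components=10, whiten=True), SVC(kernel="rbf")
    ),
}

n_rounds = 10
mean_scores = {name: [] for name in models}
for round_idx in range(n_rounds):
    # A different shuffle per round gives 10 distinct 10-fold partitions
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=round_idx)
    for name, model in models.items():
        fold_scores = cross_val_score(model, X, y, cv=cv)
        mean_scores[name].append(fold_scores.mean())

# One curve per model: mean 10-fold accuracy for each round
rounds = np.arange(1, n_rounds + 1)
fig, ax = plt.subplots(figsize=(8, 4))
for name, scores in mean_scores.items():
    ax.plot(rounds, scores, marker="o", label=name)
ax.set_xlabel("Round")
ax.set_ylabel("Mean 10-fold accuracy")
ax.set_title("10 rounds of 10-fold cross-validation")
ax.legend()
fig.savefig("cv_scores.png")
```

Both models see the same folds within a round, so the gap between the two curves reflects the effect of the PCA step rather than a difference in how the data was split.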

Full code

import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits
# Load the 8x8 digit images (64 features per sample) and hold out 20% for testing
digits = load_digits()
train = digits.data
target = digits.target
x_train,x_test,y_train,y_test = train_test_split(train,target,test_size=0.2,random_state=33)
# Standardize using statistics from the training split only
ss = StandardScaler()
x_train = ss.fit_transform(x_train)
x_test = ss.transform(x_test)
# 64-dimensional model: RBF-kernel SVC on the standardized pixels
svc = SVC(kernel = 'rbf')
svc.fit(x_train,y_train)
y_predict = svc.predict(x_test)
print('The Accuracy of SVC is', svc.score(x_test, y_test))
print("classification report of SVC\n",classification_report(y_test, y_predict,
target_names=digits.target_names.astype(str)))
# Reduce the handwritten digits dataset to 10 dimensions
pca = PCA(n_components=10,whiten=True)
# PCA is unsupervised, so only the features are passed to fit
pca.fit(x_train)
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)
# 10-dimensional model: the same SVC trained on the PCA features
svc = SVC(kernel = 'rbf')
svc.fit(x_train_pca,y_train)
# Compare the accuracy of the two models (64-dimensional and 10-dimensional)
y_pre_svc = svc.predict(x_test_pca)
print("The Accuracy of PCA_SVC is ", svc.score(x_test_pca,y_test))
print("classification report of PCA_SVC\n", classification_report(y_test, y_pre_svc,
target_names=digits.target_names.astype(str)))
samples = x_test[:100]
y_pre = y_pre_svc[:100]
plt.figure(figsize=(12,38))
# Show the first 100 (standardized) test images, each titled with the PCA_SVC prediction
for i in range(100):
    plt.subplot(10,10,i+1)
    plt.imshow(samples[i].reshape(8,8),cmap='gray')
    title = str(y_pre[i])
    plt.title(title)
    plt.axis('off')
plt.show()

Results:

SVC

PCA

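The PCA dimensionality reduction algorithm applied to handwriting recognition [custom dataset]

Use the PCA algorithm to implement handwritten character recognition. Requirements:

1. Reduce the dimensionality of the handwritten digits dataset;

2. Compare the accuracy of the two models (784-dimensional and 10-dimensional, since each MNIST image is 28 x 28);

3. Run 10 rounds of 10-fold cross-validation on each model and plot the score comparison curves.

Experiment steps

1. Import the custom dataset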

You can download the dataset in advance, or let the script download it over the network.
Download address:

http://deeplearning.net/data/mnist/

Save it as follows:

import gzip
import pickle
import requests
from pathlib import Path
DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"
PATH.mkdir(parents=True, exist_ok=True)
URL = "http://deeplearning.net/data/mnist/"
FILENAME = "mnist.pkl.gz"
# Download the archive only if it is not already in the data directory
if not (PATH / FILENAME).exists():
    content = requests.get(URL + FILENAME).content
    (PATH / FILENAME).write_bytes(content)
# Read the dataset
with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_test, y_test), _) = pickle.load(f, encoding="latin-1")
# Keep a subset to shorten training: 5000 training and 360 test samples
x_train = x_train[:5000,:]
y_train = y_train[:5000,]
x_test = x_test[:360,:]
y_test = y_test[:360,]
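
The read-and-subsample step can also be wrapped in a small function, which makes it easy to check without downloading anything. The sketch below is an illustration under stated assumptions: `load_mnist_subset` is a hypothetical helper, and the archive it reads here is a tiny synthetic file with random values laid out like `mnist.pkl.gz` (three `(images, labels)` pairs), not real MNIST data.

```python
import gzip
import pickle
import tempfile
from pathlib import Path

import numpy as np


def load_mnist_subset(path, n_train=5000, n_test=360):
    """Read an mnist.pkl.gz-style archive and keep the first rows of two splits."""
    with gzip.open(Path(path), "rb") as f:
        (x_train, y_train), (x_test, y_test), _ = pickle.load(f, encoding="latin-1")
    return x_train[:n_train], y_train[:n_train], x_test[:n_test], y_test[:n_test]


# Build a tiny synthetic archive with the same layout (random pixels and labels)
rng = np.random.default_rng(0)
fake_splits = tuple(
    (rng.random((n, 784), dtype=np.float32), rng.integers(0, 10, size=n))
    for n in (20, 8, 8)
)

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "mnist.pkl.gz"
    with gzip.open(archive, "wb") as f:
        pickle.dump(fake_splits, f)
    x_train, y_train, x_test, y_test = load_mnist_subset(archive, n_train=10, n_test=5)

print(x_train.shape, x_test.shape)  # (10, 784) (5, 784)

# Each flattened row of length 784 reshapes to a 28x28 image
print(x_train[0].reshape(28, 28).shape)  # (28, 28)
```

With the real file in place, calling `load_mnist_subset(PATH / FILENAME)` with the default arguments would give the same 5000 and 360 row subsets as the code above.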

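The remaining steps are the same as in the previous part: [Artificial Intelligence: Handwritten Character Recognition] Machine Learning and Intelligent Data Processing: The PCA Dimensionality Reduction Algorithm Applied to Handwritten Character Recognition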

Full code

import matplotlib.pyplot as plt
from pathlib import Path
from sklearn.svm import SVC
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import requests
import pickle
import gzip
DATA_PATH = Path("data")
PATH = DATA_PATH / "mnist"
PATH.mkdir(parents=True, exist_ok=True)
URL = "http://deeplearning.net/data/mnist/"
FILENAME = "mnist.pkl.gz"
# Download the archive only if it is not already in the data directory
if not (PATH / FILENAME).exists():
    content = requests.get(URL + FILENAME).content
    (PATH / FILENAME).write_bytes(content)
# Read the dataset
with gzip.open((PATH / FILENAME).as_posix(), "rb") as f:
    ((x_train, y_train), (x_test, y_test), _) = pickle.load(f, encoding="latin-1")
# Keep a subset to shorten training: 5000 training and 360 test samples
x_train = x_train[:5000,:]
y_train = y_train[:5000,]
x_test = x_test[:360,:]
y_test = y_test[:360,]
#################################################################
# Each image is 28 x 28 and is stored as a flattened row of length
# 784 (=28x28), so it must be reshaped to 2d before it is displayed.
# Standardize using statistics from the training split only
ss = StandardScaler()
x_train = ss.fit_transform(x_train)
x_test = ss.transform(x_test)
# 784-dimensional model: RBF-kernel SVC on the standardized pixels
svc = SVC(kernel = 'rbf')
svc.fit(x_train,y_train)
y_predict = svc.predict(x_test)
print('The Accuracy of SVC is', svc.score(x_test, y_test))
print("classification report of SVC\n",classification_report(y_test, y_predict))
samples = x_test[:100]
y_pre = y_predict[:100]
plt.figure(figsize=(12,38))
# Show the first 100 (standardized) test images, each titled with the SVC prediction
for i in range(100):
    # Create the subplot
    plt.subplot(10,10,i+1)
    # Show the grayscale image
    plt.imshow(samples[i].reshape(28,28),cmap='gray')
    title = str(y_pre[i])
    plt.title(title,color='red')
    # Turn off the axes
    plt.axis('off')
plt.show()
# Reduce the handwritten digits dataset to 10 dimensions
pca = PCA(n_components=10,whiten=True)
# PCA is unsupervised, so only the features are passed to fit
pca.fit(x_train)
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)
# 10-dimensional model: the same SVC trained on the PCA features
svc = SVC(kernel = 'rbf')
svc.fit(x_train_pca,y_train)
# Compare the accuracy of the two models (784-dimensional and 10-dimensional)
y_pre_svc = svc.predict(x_test_pca)
print("The Accuracy of PCA_SVC is ", svc.score(x_test_pca,y_test))
print("classification report of PCA_SVC\n", classification_report(y_test, y_pre_svc))
samples = x_test[:100]
y_pre = y_pre_svc[:100]
plt.figure(figsize=(12,38))
# Show the same 100 test images, each titled with the PCA_SVC prediction
for i in range(100):
    plt.subplot(10,10,i+1)
    plt.imshow(samples[i].reshape(28,28),cmap='gray')
    title = str(y_pre[i])
    plt.title(title)
    plt.axis('off')
plt.show()

Results:

SVC

PCA
