[Handwritten Digit Recognition] Handwritten digit recognition and classification with a CNN in MATLAB [MATLAB source code included, issue 1286]

Posted by 海神之光 on 2022/05/29 00:13:06


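I. Introduction to CNNs

1 How machines recognize images
Let's start with a riddle: how many colored pens do you need to draw a giant panda on white paper? As you probably know, a single black pen is enough. Fill in the black parts of the panda and the picture of a panda appears.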

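The way we draw the panda is much like cross-stitch: within a given grid, each cell is stitched with some color, and the result is a particular "picture". Machine image recognition works in the opposite direction. Given a picture, the machine reads the color of every cell (pixel), stores each color as a number, and obtains a large numeric matrix. The information in the picture is stored in this matrix.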

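In the figure above, each cell represents one pixel, and the number in the cell is its color code, in the range [0, 255]. (Every color is composed of red, green, and blue components, each of which is a number from 0 to 255.)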
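To make the "picture as a numeric matrix" idea concrete, here is a minimal sketch in MATLAB. The pixel values are made up for illustration and do not come from the article's figure.

```matlab
% A tiny 4x4 grayscale "image": one number per pixel, 0 = black, 255 = white.
% The values are illustrative only.
gray = uint8([  0   0 255 255;
                0   0 255 255;
              255 255   0   0;
              255 255   0   0]);

% A color image stacks three such matrices (red, green, blue) along the
% third dimension. Here: a 2x2 image in which every pixel is pure red.
rgb = zeros(2, 2, 3, 'uint8');
rgb(:, :, 1) = 255;                 % red channel at full intensity

graySize = size(gray);              % rows x columns
rgbSize  = size(rgb);               % rows x columns x channels
pixel    = squeeze(rgb(1, 1, :))';  % [R G B] of the top-left pixel

disp(graySize)
disp(rgbSize)
disp(pixel)
```

Each entry of `gray` is one pixel; a color pixel is described by three numbers, one per channel.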

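The convolutional neural network does its recognition work on this large numeric matrix.
How a machine recognizes an image: it does not recognize a complex picture in one pass. Instead, it splits the full picture into many small parts, extracts the features in each part (that is, recognizes each small part), and then combines the features of all the parts to complete the recognition.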

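2 How convolutional neural networks work
Recognizing an image with a CNN generally takes the following steps:
(1) Convolutional layers extract preliminary features
(2) Pooling layers extract the main features
(3) Fully connected layers combine the features of all parts
(4) A classifier is produced and used for prediction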

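2.1 How the convolutional layer works
The role of the convolutional layer is to extract the features in each small part of the picture.
Suppose we have a 6×6 image in which every pixel stores image information. We then define a convolution kernel (equivalent to a set of weights) that extracts a certain feature from the image. The kernel is multiplied element-wise with the corresponding entries of the numeric matrix and the products are summed, which gives the output of the convolutional layer.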

(429 = 18×1 + 54×0 + 51×1 + 55×0 + 121×1 + 75×0 + 35×1 + 24×0 + 204×1)
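The multiply-and-sum step can be checked directly. This sketch uses the nine pixel values (18, 54, 51, 55, 121, 75, 35, 24, 204) and the nine kernel weights (1, 0, 1, 0, 1, 0, 1, 0, 1) from the sum above; arranging them row by row into 3×3 matrices is an assumption, since the original figure is not available.

```matlab
% 3x3 image patch, filled row by row with the pixel values from the sum
patch = [ 18  54  51;
          55 121  75;
          35  24 204];

% 3x3 kernel, filled row by row with the weights from the sum
kernel = [1 0 1;
          0 1 0;
          1 0 1];

% Element-wise multiply, then add up all nine products
response = sum(sum(patch .* kernel));
disp(response)
```

Only the pixels under a kernel weight of 1 contribute: 18 + 51 + 121 + 35 + 204 = 429.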
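Without prior learning experience, the kernel values can be generated randomly by a function and then adjusted step by step through training.
Once every pixel has been covered at least once, the output of the convolutional layer is complete (the stride in the figure below is 1).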
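Sliding the kernel over the whole image with stride 1 can be sketched as follows. The 6×6 input (`magic(6)`) and the kernel are stand-in values chosen for illustration; the cross-check uses the built-in `conv2`, which flips the kernel, so the kernel is rotated by 180 degrees first.

```matlab
img    = magic(6);                 % stand-in 6x6 "image" (illustrative values)
kernel = [1 0 1; 0 1 0; 1 0 1];    % illustrative 3x3 kernel
stride = 1;

[h, w] = size(img);
k      = size(kernel, 1);
outH   = floor((h - k) / stride) + 1;   % output height: (6-3)/1 + 1 = 4
outW   = floor((w - k) / stride) + 1;   % output width
out    = zeros(outH, outW);

for r = 1:outH
    for c = 1:outW
        r0 = (r - 1) * stride + 1;      % top-left corner of the window
        c0 = (c - 1) * stride + 1;
        window = img(r0:r0+k-1, c0:c0+k-1);
        out(r, c) = sum(sum(window .* kernel));   % multiply and sum
    end
end

% Cross-check against conv2 (true convolution flips the kernel, so undo it)
ref = conv2(img, rot90(kernel, 2), 'valid');
disp(size(out))
```

A 6×6 input and a 3×3 kernel with stride 1 and no padding give a 4×4 output.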

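At first the machine does not know which features the part to be recognized has. It applies different convolution kernels, compares the resulting outputs, and judges which kernel best represents the features of the picture. For example, to recognize a certain feature in the image (say, a curve), the kernel should give a high output for that curve and a low output for other shapes (say, a triangle). The higher the output of the convolutional layer, the better the match and the better the kernel represents the feature of the picture.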

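The convolutional layer at work, step by step:
Suppose the kernel we designed is shown below on the left, and the curve we want to recognize is shown below on the right:

Now we use this kernel to recognize a simplified picture: a cartoon mouse.

When the machine reaches the mouse's rump, applying the kernel to the numeric matrix of that region gives a large output: 6600.

When the same kernel is applied to the mouse's ear, the output is very small: 0.

We can therefore say that this kernel stores the feature of a curve, and that it has matched the mouse's rump as curved. Kernels for other features are needed to match the other parts of the mouse. In effect, the convolutional layer keeps changing its kernels to find which ones give a useful preliminary representation of the picture's features, and then outputs the matrices obtained by applying those kernels.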
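The "high output on a match, zero otherwise" behavior can be reproduced with a toy example. The kernel and the two regions below are hypothetical 4×4 matrices, not the ones from the article's mouse figures, so the numbers differ from 6600.

```matlab
% Hypothetical kernel that responds to a stroke running from the
% bottom-left corner to the top-right corner
kernel = 30 * fliplr(eye(4));

% Region that contains the same stroke (pixel value 50 along the stroke)
regionMatch = 50 * fliplr(eye(4));

% Region with a stroke in the other direction (top-left to bottom-right)
regionOther = 50 * eye(4);

% Multiply element-wise and sum, as in the convolutional layer
respMatch = sum(sum(regionMatch .* kernel));
respOther = sum(sum(regionOther .* kernel));

disp([respMatch respOther])
```

The matching region overlaps the kernel at four positions (4 × 30 × 50), while the other region never overlaps it, so its response is 0.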

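2.2 How the pooling layer works
The input to the pooling layer is the output matrix of the convolutional layer, that is, the result of applying the kernels to the original data.
Purposes of the pooling layer:
Reduce the number of training parameters by lowering the dimensionality of the feature vectors output by the convolutional layer
Reduce overfitting by keeping only the most useful image information and limiting the propagation of noise
The two most common forms of pooling:
Max pooling (max-pooling): the largest value in the specified region represents the whole region
Mean pooling (mean-pooling): the average of the values in the specified region represents the whole region
An example of both pooling methods (the pooling stride is 2, so a region that has been selected is not selected again):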

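In a 4×4 numeric matrix, regions are selected 2×2 at a time with a stride of 2. For example, in the upper-left figure the region [1, 2, 3, 4] is pooled to its maximum, 4; in the upper-right figure the region [1, 2, 3, 4] is pooled to its mean, 5/2.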
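Both pooling methods can be sketched on a 4×4 matrix. Only the top-left block [1, 2, 3, 4] comes from the text; the other twelve values are made up to fill the matrix.

```matlab
% 4x4 input; the top-left 2x2 block holds 1, 2, 3, 4 as in the text,
% the remaining values are illustrative
A = [ 1  2  5  6;
      3  4  7  8;
      9 10 13 14;
     11 12 15 16];

poolSize = 2;                       % 2x2 window, stride 2 (no overlap)
n = size(A, 1) / poolSize;
maxOut  = zeros(n);
meanOut = zeros(n);

for r = 1:n
    for c = 1:n
        rows  = (r - 1) * poolSize + (1:poolSize);
        cols  = (c - 1) * poolSize + (1:poolSize);
        block = A(rows, cols);
        maxOut(r, c)  = max(block(:));    % max pooling
        meanOut(r, c) = mean(block(:));   % mean pooling
    end
end

disp(maxOut)
disp(meanOut)
```

The top-left block gives 4 under max pooling and 5/2 under mean pooling, and the 4×4 input shrinks to 2×2 in both cases.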

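2.3 How the fully connected layer works
The convolutional and pooling layers extract features and reduce the number of parameters carried over from the original image. To produce the final output, however, we need fully connected layers to build a classifier whose number of outputs equals the number of classes.
The fully connected layer works much like an ordinary neural network: the tensor output by the pooling layer is reshaped into vectors, multiplied by a weight matrix, and a bias is added. A ReLU activation is then applied, and the parameters are optimized with gradient descent.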
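The forward pass of one fully connected layer (flatten, multiply by weights, add bias, apply ReLU) can be sketched as follows. The pooled tensor, the weights `W`, and the bias `b` are hypothetical fixed values; in a real network `W` and `b` would be learned by gradient descent.

```matlab
% Stand-in pooling output: a 2x2x2 tensor (two 2x2 feature maps)
pooled = cat(3, [1 2; 3 4], [0 -1; 2 1]);

% Flatten the tensor into a column vector (8 elements)
x = pooled(:);

% Hypothetical weights and bias for a layer with 3 outputs
W = [ 0.10 * ones(1, 8);
     -0.10 * ones(1, 8);
      0.05 * ones(1, 8)];
b = [0; 0; 0.5];

z = W * x + b;      % weighted sum plus bias
a = max(z, 0);      % ReLU: negative values become 0

disp(a')
```

The elements of `x` sum to 12, so `z` is approximately [1.2; -1.2; 1.1], and ReLU clips the negative entry to 0.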

II. Partial source code

%% Prepare the workspace
clc
clear all
close all
%% Load the data
digitDatasetPath = fullfile('.', 'HandWrittenDataset');
imds = imageDatastore(digitDatasetPath, ...
    'IncludeSubfolders',true,'LabelSource','foldernames'); % use the folder names as labels
%,'ReadFcn',@mineRF

% Number of images per label in the dataset
countEachLabel(imds)

numTrainFiles = 17; % each digit has 22 samples; 17 of them are used for training
[imdsTrain,imdsValidation] = splitEachLabel(imds,numTrainFiles,'randomize');
% Check the image size
img=readimage(imds,1);
size(img)

%% Define the structure of the convolutional neural network
layers = [
% Input layer
imageInputLayer([28 28 1])
% Convolutional layers
convolution2dLayer(5,6,'Padding',2)
batchNormalizationLayer
reluLayer

maxPooling2dLayer(2,'stride',2)

convolution2dLayer(5, 16)
batchNormalizationLayer
reluLayer

maxPooling2dLayer(2,'stride',2)

convolution2dLayer(5, 120)
batchNormalizationLayer
reluLayer
% Final layers
fullyConnectedLayer(10)
softmaxLayer
classificationLayer];

%% Train the network
% Set the training options
options = trainingOptions('sgdm',...
    'maxEpochs', 50, ...
    'ValidationData', imdsValidation, ...
    'ValidationFrequency',5,...
    'Verbose',false,...
    'Plots','training-progress'); % show the training progress

% Train the network and save it
net = trainNetwork(imdsTrain, layers ,options);
save('CSNet.mat', 'net')

%% Label the data (by file name; construct this yourself)
cifar10Data = tempdir;
 
url = 'https://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz';
 
helperCIFAR10Data.download(url,cifar10Data);

% Load the data from the download folder stored in cifar10Data
[trainingImages,trainingLabels,testImages,testLabels] = helperCIFAR10Data.load(cifar10Data);
size(trainingImages)
numImageCategories = 10;
categories(trainingLabels)
% Create the image input layer for 32x32x3 CIFAR-10 images
[height, width, numChannels, ~] = size(trainingImages);
imageSize = [height width numChannels];
inputLayer = imageInputLayer(imageSize);
% Convolutional layer parameters filter size
filterSize = [5 5];
numFilters = 32;
middleLayers = [   
% The first convolutional layer has a bank of 32 5x5x3 filters. A
% symmetric padding of 2 pixels is added to ensure that image borders
% are included in the processing. This is important to avoid
% information at the borders being washed away too early in the
% network.
convolution2dLayer(filterSize, numFilters, 'Padding', 2)  %(n+2p-f)/s+1
 
% Note that the third dimension of the filter can be omitted because it
% is automatically deduced based on the connectivity of the network. In
% this case because this layer follows the image layer, the third
% dimension must be 3 to match the number of channels in the input
% image.
 
% Next add the ReLU layer:
reluLayer()
 
% Follow it with a max pooling layer that has a 3x3 spatial pooling area
% and a stride of 2 pixels. This down-samples the data dimensions from
% 32x32 to 15x15.
maxPooling2dLayer(3, 'Stride', 2)
 
% Repeat the 3 core layers to complete the middle of the network.
convolution2dLayer(filterSize, numFilters, 'Padding', 2)
reluLayer()
maxPooling2dLayer(3, 'Stride',2)
 
convolution2dLayer(filterSize, 2 * numFilters, 'Padding', 2)
reluLayer()
maxPooling2dLayer(3, 'Stride',2)

];
finalLayers = [
% Add a fully connected layer with 64 output neurons. The output size of
% this layer will be an array with a length of 64.
fullyConnectedLayer(64)
 
% Add an ReLU non-linearity.
reluLayer
 
% Add the last fully connected layer. At this point, the network must
% produce 10 signals that can be used to measure whether the input image
% belongs to one category or another. This measurement is made using the
% subsequent loss layers.
fullyConnectedLayer(numImageCategories)
 
% Add the softmax loss layer and classification layer. The final layers use
% the output of the fully connected layer to compute the categorical
% probability distribution over the image classes. During the training
% process, all the network weights are tuned to minimize the loss over this
% categorical distribution.
softmaxLayer
classificationLayer
];
layers = [
    inputLayer
    middleLayers
    finalLayers
    ];

layers(2).Weights = 0.0001 * randn([filterSize numChannels numFilters]);

% Set the network training options
opts = trainingOptions('sgdm', ...
    'Momentum', 0.9, ...
    'InitialLearnRate', 0.001, ...
    'LearnRateSchedule', 'piecewise');
% The remaining training options are cut off in the source listing.

% A trained network is loaded from disk to save time when running the
% example. Set this flag to true to train the network.
doTraining = false;
 
if doTraining    
    % Train a network.
    cifar10Net = trainNetwork(trainingImages, trainingLabels, layers, opts);
else
    % Load pre-trained detector for the example.
    load('rcnnStopSigns.mat','cifar10Net')       
end

% Extract the first convolutional layer weights
w = cifar10Net.Layers(2).Weights;
 
% rescale the weights to the range [0, 1] for better visualization
w = rescale(w);
 
figure
montage(w)

% Run the network on the test set.
YTest = classify(cifar10Net, testImages);
 
% Calculate the accuracy.
accuracy = sum(YTest == testLabels)/numel(testLabels)
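%% Classify with the network and compute the accuracy
% Handwritten data: mineSet is not defined in this partial listing. It is
% assumed here to be a labeled imageDatastore of your own handwritten digits.
YPred = classify(net,mineSet);
YValidation = mineSet.Labels;
% Compute the accuracy
accuracy = sum(YPred == YValidation)/numel(YValidation);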

% Resize + invert colors
% function data =mineRF(filename)
% img= imread(filename);
% data=uint8(255-rgb2gray(imresize(img,[28 28])));
% 
% end

% Binarize
% function data =mineRF(filename)
% img= imread(filename);
% data=imbinarize(img);
% 
% end
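The feature-map sizes of the digit network in the listing can be checked with the output-size formula (n + 2p - f)/s + 1 quoted in its comments, where n is the input size, f the filter size, p the padding, and s the stride. This is a small sketch; the helper name `outSize` is made up for illustration.

```matlab
% Output size of a convolution or pooling layer along one dimension
outSize = @(n, f, p, s) floor((n + 2*p - f) / s) + 1;

% Digit network: 28x28 input
sizes = 28;
sizes(end+1) = outSize(sizes(end), 5, 2, 1);   % convolution2dLayer(5,6,'Padding',2)
sizes(end+1) = outSize(sizes(end), 2, 0, 2);   % maxPooling2dLayer(2,'stride',2)
sizes(end+1) = outSize(sizes(end), 5, 0, 1);   % convolution2dLayer(5,16)
sizes(end+1) = outSize(sizes(end), 2, 0, 2);   % maxPooling2dLayer(2,'stride',2)
sizes(end+1) = outSize(sizes(end), 5, 0, 1);   % convolution2dLayer(5,120)
disp(sizes)

% CIFAR-10 part: 32x32 input, 5x5 convolution with padding 2,
% then 3x3 max pooling with stride 2
cifarConv = outSize(32, 5, 2, 1);
cifarPool = outSize(cifarConv, 3, 0, 2);
disp([cifarConv cifarPool])
```

The last convolution reduces each feature map to 1×1, so the 120 filters yield a 120-element vector for `fullyConnectedLayer(10)`. The CIFAR-10 result agrees with the listing's comment that pooling down-samples 32×32 to 15×15.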
  
 

III. Results

IV. MATLAB version and references

1 MATLAB version
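2014a (as stated in the original post; functions used in the listing, such as trainNetwork and batchNormalizationLayer, were introduced in later releases, so a newer MATLAB with Deep Learning Toolbox is likely needed to run it)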

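2 References
[1] 蔡利梅. MATLAB图像处理: 理论、算法与实例分析 [M]. 清华大学出版社 (Tsinghua University Press), 2020.
[2] 杨丹, 赵海滨, 龙哲. MATLAB图像处理实例详解 [M]. 清华大学出版社 (Tsinghua University Press), 2013.
[3] 周品. MATLAB图像处理与图形用户界面设计 [M]. 清华大学出版社 (Tsinghua University Press), 2013.
[4] 刘成龙. 精通MATLAB图像处理 [M]. 清华大学出版社 (Tsinghua University Press), 2015.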

Source: qq912100926.blog.csdn.net. Author: 海神之光. Copyright belongs to the original author; please contact the author before reposting.

Original link: qq912100926.blog.csdn.net/article/details/120113523
