ML: KMeans - automatically classifying several coordinate points (four user-defined points, at most 10 iterations) with a custom kmeans function

Posted by 一个处女座的程序猿 on 2021/03/28 01:46:48

 

 

Contents

Output

Core code


 

 

Output
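As a rough illustration of what a run on the four sample points produces, here is a minimal sketch. It assumes the starting centroids are fixed to the first two points (the deterministic alternative that the core code leaves commented out) rather than drawn at random, so the result is reproducible. `kmeans_fixed_start` and its parameters are hypothetical names for this example, not the article's function, and the distance step is vectorized instead of looped.

```python
import numpy as np


def kmeans_fixed_start(points, init_centroids, max_iter=10):
    """Lloyd-style k-means with caller-supplied starting centroids.

    Hypothetical helper for illustration only. Returns (labels, centroids),
    with labels numbered from 1 as in the article's code.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float).copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iter):
        # Distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1) + 1
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = points[labels == j + 1]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.array_equal(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return labels, centroids


test_x = np.array([[1, 1], [2, 1], [4, 3], [5, 4]])
labels, centroids = kmeans_fixed_start(test_x, test_x[:2])
print(labels)     # [1 1 2 2]
print(centroids)  # rows (1.5, 1.0) and (4.5, 3.5)
```

With this fixed start the centroids settle after three passes: the first two points form cluster 1 and the last two form cluster 2. A random start, as in the core code, may number the clusters the other way round.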

 

 

 

Core code


  
    #!/usr/bin/python
    # -*- coding:utf-8 -*-
    import numpy as np


    # ML - KMeans: automatically classify several coordinate points (four
    # user-defined points, at most 10 iterations) with a custom kmeans function.
    def kmeans(X, k, maxIt):
        numPoints, numDim = X.shape
        # Append one extra column that will hold each point's cluster label.
        dataSet = np.zeros((numPoints, numDim + 1))
        dataSet[:, :-1] = X
        # Pick k rows at random as the initial centroids. randint samples with
        # replacement, so the same point can be picked more than once.
        centroids = dataSet[np.random.randint(numPoints, size=k), :]
        # centroids = dataSet[0:2, :]
        # Give the initial centroids the labels 1..k.
        centroids[:, -1] = range(1, k + 1)
        iterations = 0
        oldCentroids = None
        # Run the main k-means algorithm.
        while not shouldStop(oldCentroids, centroids, iterations, maxIt):
            print("iteration: \n", iterations)
            print("dataSet: \n", dataSet)
            print("centroids: \n", centroids)
            # Save the old centroids for the convergence test.
            oldCentroids = np.copy(centroids)
            iterations += 1
            # Assign a label to each data point based on the centroids.
            updateLabels(dataSet, centroids)
            # Recompute the centroids from the data point labels.
            centroids = getCentroids(dataSet, k)
        # The last column of dataSet holds the label of each point.
        return dataSet


    # Function: Should Stop
    # -------------
    # Returns True if k-means is done, False otherwise. K-means terminates
    # either because it has run the maximum number of iterations OR because
    # the centroids have stopped changing.
    def shouldStop(oldCentroids, centroids, iterations, maxIt):
        # Stop once maxIt passes have run.
        if iterations >= maxIt:
            return True
        # Before the first pass there is nothing to compare against.
        if oldCentroids is None:
            return False
        return np.array_equal(oldCentroids, centroids)


    # Function: Update Labels
    # -------------
    # Update the label of each piece of data in the dataset (in place).
    def updateLabels(dataSet, centroids):
        # For each element in the dataset, choose the closest centroid.
        # Make that centroid's label the element's label.
        numPoints, numDim = dataSet.shape
        for i in range(0, numPoints):
            dataSet[i, -1] = getLabelFromClosestCentroid(dataSet[i, :-1], centroids)


    def getLabelFromClosestCentroid(dataSetRow, centroids):
        # Start with the first centroid, then look for a closer one
        # (Euclidean distance; the last column is the label, not a coordinate).
        label = centroids[0, -1]
        minDist = np.linalg.norm(dataSetRow - centroids[0, :-1])
        for i in range(1, centroids.shape[0]):
            dist = np.linalg.norm(dataSetRow - centroids[i, :-1])
            if dist < minDist:
                minDist = dist
                label = centroids[i, -1]
        print("minDist:", minDist)
        return label


    # Function: Get Centroids
    # -------------
    # Returns the k recomputed centroids, each with its label in the last column.
    def getCentroids(dataSet, k):
        # Each centroid is the arithmetic mean of the points that have that
        # centroid's label. If a cluster is empty (no points have that
        # label), its centroid is re-initialized to a randomly chosen point.
        result = np.zeros((k, dataSet.shape[1]))
        for i in range(1, k + 1):
            oneCluster = dataSet[dataSet[:, -1] == i, :-1]
            if oneCluster.shape[0] == 0:
                result[i - 1, :-1] = dataSet[np.random.randint(dataSet.shape[0]), :-1]
            else:
                result[i - 1, :-1] = np.mean(oneCluster, axis=0)
            result[i - 1, -1] = i
        return result


    # Four user-defined sample points, clustered into k = 2 groups.
    x1 = np.array([1, 1])
    x2 = np.array([2, 1])
    x3 = np.array([4, 3])
    x4 = np.array([5, 4])
    testX = np.vstack((x1, x2, x3, x4))
    result = kmeans(testX, 2, 10)
    print("final result:")
    print(result)
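The core code draws its initial centroids with `np.random.randint`, which samples with replacement, so two centroids can start on the same point and one cluster can end up with no members. One possible way around this, shown here only as a sketch, is to draw the row indices without replacement. `pick_initial_centroids` and its `seed` parameter are hypothetical names introduced for this example and are not part of the article's code.

```python
import numpy as np


def pick_initial_centroids(X, k, seed=None):
    """Return k rows of X, taken at k different row indices.

    Hypothetical helper for illustration. If X itself contains duplicate
    points, two of the returned rows can still be equal.
    """
    X = np.asarray(X, dtype=float)
    if k > X.shape[0]:
        raise ValueError("k cannot exceed the number of points")
    rng = np.random.default_rng(seed)
    # replace=False guarantees k different row indices.
    idx = rng.choice(X.shape[0], size=k, replace=False)
    return X[idx]


test_x = np.array([[1, 1], [2, 1], [4, 3], [5, 4]])
start = pick_initial_centroids(test_x, 2, seed=0)
print(start)
```

Passing a fixed `seed` makes the draw repeatable, which helps when comparing runs; leaving it as `None` gives a different start each time, as in the original listing.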

 

 



 

Source: yunyaniu.blog.csdn.net. Author: 一个处女座的程序猿. Copyright belongs to the original author; please contact the author before reposting.

Original link: yunyaniu.blog.csdn.net/article/details/80020220
