HarmonyOS App Development: Image Segmentation and Semantic Understanding

Posted by Jack20 on 2026/06/21 13:56:52

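Key points: This article walks through on-device image segmentation on HarmonyOS. It covers how semantic, instance, and panoptic segmentation differ in principle, then focuses on deploying a DeepLabV3+ semantic segmentation model on the device, rendering the segmentation mask, and deriving a semantic description of the scene.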

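1. Background and Motivation

You take a landscape photo on your phone and want to recolor the sky, turning a blue sky into a sunset glow. The traditional approach is to cut the sky out by hand, tracing the skyline bit by bit, which is slow and tedious. If the phone could work out on its own that "this is sky, this is a building, these are trees", one-tap background replacement becomes possible.

That is the appeal of image segmentation: it knows not only what is in the image but also which class every single pixel belongs to. Unlike object detection, which draws boxes, segmentation is fine-grained per-pixel classification. Its output is not a set of boxes but a mask the same size as the input image, with a class label for every pixel.

Segmentation is increasingly common on mobile devices: portrait-mode background blur, automatic cutouts for ID photos, AR scene understanding, road-surface recognition for autonomous driving, and so on. All of these need real-time segmentation on the device. Cloud-side segmentation adds network latency and raises privacy concerns, which makes on-device deployment the natural direction.

The catch is the size of the output. For a 640×640 image, a classifier outputs one probability vector and a detector outputs a few hundred boxes, while a segmentation model has to output class labels for 640×640 = 409,600 pixels. That strains both compute and memory on the device. How do you balance accuracy against speed, and how do you render the mask efficiently? Those are the questions this article addresses.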

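2. Core Principles

2.1 Three Paradigms of Image Segmentation

Image segmentation is not a single technique but a family of techniques. Depending on the granularity of the result, it falls into three paradigms:

Semantic segmentation: per-pixel classification; objects of the same class are not told apart
Instance segmentation: per-pixel classification; individual objects of the same class are told apart
Panoptic segmentation: a fusion of semantic and instance segmentation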
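One way to see the difference between the paradigms is in the shape of the result each one produces. The sketch below is purely illustrative: the type and function names (`SemanticResult`, `InstanceResult`, `PanopticResult`, `instancesToSemantic`, `countInstances`) are hypothetical and not part of any HarmonyOS API.

```typescript
// Semantic: one class ID per pixel; two people share the same ID.
interface SemanticResult {
  width: number;
  height: number;
  classIds: Uint8Array; // length = width * height
}

// Instance: one binary mask per detected object, each with its own class.
interface InstanceResult {
  width: number;
  height: number;
  instances: { classId: number; score: number; mask: Uint8Array }[];
}

// Panoptic: every pixel has a class ID and an instance ID
// (instance ID 0 for uncountable regions such as sky or road).
interface PanopticResult {
  width: number;
  height: number;
  classIds: Uint8Array;
  instanceIds: Uint16Array;
}

// Collapse an instance result into a semantic one: later instances overwrite
// earlier ones, and pixels covered by no instance get backgroundClass.
function instancesToSemantic(r: InstanceResult, backgroundClass: number): SemanticResult {
  const classIds = new Uint8Array(r.width * r.height).fill(backgroundClass);
  for (const inst of r.instances) {
    for (let i = 0; i < classIds.length; i++) {
      if (inst.mask[i] !== 0) classIds[i] = inst.classId;
    }
  }
  return { width: r.width, height: r.height, classIds };
}

// Count the distinct objects of one class in a panoptic result.
function countInstances(p: PanopticResult, classId: number): number {
  const ids = new Set<number>();
  for (let i = 0; i < p.classIds.length; i++) {
    if (p.classIds[i] === classId && p.instanceIds[i] !== 0) ids.add(p.instanceIds[i]);
  }
  return ids.size;
}

// A 2x2 image with two separate "person" objects (class 12) on a sky background (class 2).
const twoPeople: InstanceResult = {
  width: 2,
  height: 2,
  instances: [
    { classId: 12, score: 0.9, mask: Uint8Array.from([1, 0, 0, 0]) },
    { classId: 12, score: 0.8, mask: Uint8Array.from([0, 0, 0, 1]) },
  ],
};
const semanticView = instancesToSemantic(twoPeople, 2);
console.log(Array.from(semanticView.classIds)); // [12, 2, 2, 12]

const panopticView: PanopticResult = {
  width: 2,
  height: 2,
  classIds: Uint8Array.from([12, 2, 2, 12]),
  instanceIds: Uint16Array.from([1, 0, 0, 2]),
};
console.log(countInstances(panopticView, 12)); // 2
```

The semantic view loses the fact that there are two people; the panoptic view keeps both the class map and the object count.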

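2.2 DeepLabV3+ Model Architecture

DeepLabV3+ is a classic semantic segmentation model. It keeps the **Atrous Spatial Pyramid Pooling (ASPP)** module of the earlier DeepLab models, which captures multi-scale context through atrous (dilated) convolutions with different dilation rates, and adds a decoder that sharpens object boundaries:

Backbone: MobileNetV2 (lightweight) or ResNet-101 (higher accuracy)
ASPP module: a 1×1 convolution, three 3×3 atrous convolutions (rate=6/12/18), and global average pooling, applied in parallel
Decoder: upsamples step by step to recover resolution and fuses low-level features to refine boundaries

For on-device use, the **MobileNetV2 + DeepLabV3+** combination is recommended: the model is small and fast, and its accuracy is adequate for typical mobile scenarios.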
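To make the idea behind ASPP concrete, here is a minimal 1-D sketch of an atrous convolution. It is not taken from DeepLabV3+ itself (which works on 2-D feature maps); the function names are hypothetical. It only shows how the dilation rate widens what a 3-tap kernel can see without adding weights.

```typescript
// 1-D atrous convolution with zero padding (output has the input's length).
// rate = 1 is an ordinary convolution; a larger rate spreads the same taps apart.
function atrousConv1d(input: Float32Array, kernel: Float32Array, rate: number): Float32Array {
  const out = new Float32Array(input.length);
  const center = (kernel.length - 1) / 2; // assumes an odd kernel length
  for (let i = 0; i < input.length; i++) {
    let sum = 0;
    for (let k = 0; k < kernel.length; k++) {
      const src = i + (k - center) * rate;
      if (src >= 0 && src < input.length) sum += kernel[k] * input[src];
    }
    out[i] = sum;
  }
  return out;
}

// Span covered by a single k-tap layer at the given rate: (k - 1) * rate + 1.
function receptiveField(kernelSize: number, rate: number): number {
  return (kernelSize - 1) * rate + 1;
}

const impulse = Float32Array.from([0, 0, 0, 0, 1, 0, 0, 0, 0]); // single 1 at index 4
const boxKernel = Float32Array.from([1, 1, 1]);

console.log(Array.from(atrousConv1d(impulse, boxKernel, 1))); // [0,0,0,1,1,1,0,0,0]
console.log(Array.from(atrousConv1d(impulse, boxKernel, 3))); // [0,1,0,0,1,0,0,1,0]
console.log([6, 12, 18].map((r) => receptiveField(3, r)));    // [13, 25, 37]
```

With rates 6, 12, and 18, three parallel 3-tap branches cover spans of 13, 25, and 37 positions, which is the multi-scale effect ASPP relies on.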

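2.3 Segmentation Output Format

A semantic segmentation model outputs a score tensor of shape [1, numClasses, H, W]. Depending on how the model was exported, the scores are softmax probabilities or raw logits; the predicted class is the same either way. For each pixel position (h, w), the class with the highest score is taken as the prediction for that pixel:

argmax_c(output[0, c, h, w]) → class of pixel (h, w)

This argmax is the key step that turns the score map into a mask.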
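The argmax step can be checked by hand on a tiny tensor. The sketch below assumes the batch dimension has been dropped and the scores are laid out as [numClasses, H, W]; the function name `argmaxMask` is hypothetical.

```typescript
// scores is laid out as [numClasses, H, W]: all of class 0, then all of class 1, ...
function argmaxMask(scores: Float32Array, numClasses: number, height: number, width: number): Uint8Array {
  const plane = height * width;
  const mask = new Uint8Array(plane);
  for (let p = 0; p < plane; p++) {
    let best = 0;
    for (let c = 1; c < numClasses; c++) {
      if (scores[c * plane + p] > scores[best * plane + p]) best = c;
    }
    mask[p] = best;
  }
  return mask;
}

// 3 classes on a 2x2 image.
const tinyScores = Float32Array.from([
  0.7, 0.1, 0.2, 0.3, // class 0
  0.2, 0.8, 0.3, 0.3, // class 1
  0.1, 0.1, 0.5, 0.4, // class 2
]);
console.log(Array.from(argmaxMask(tinyScores, 3, 2, 2))); // [0, 1, 2, 2]
```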

3. Code Walkthrough

3.1 Segmentation Data Structures

// SegmentationTypes.ets - segmentation data structures

/**
 * Semantic segmentation configuration
 */
export interface SegmentationConfig {
  modelName: string;            // model name
  modelPath: string;            // model file path
  inputWidth: number;           // input width
  inputHeight: number;          // input height
  numClasses: number;           // number of classes
  labelPath: string;            // label file path
  outputWidth: number;          // output width (may differ from the input)
  outputHeight: number;         // output height
}

/**
 * Pixel-level segmentation mask
 * Stores one class ID per pixel
 */
export class SegmentationMask {
  width: number;
  height: number;
  data: Uint8Array;  // class ID of each pixel (0-255)

  constructor(width: number, height: number) {
    this.width = width;
    this.height = height;
    this.data = new Uint8Array(width * height);
  }

  /**
   * Set the class at the given position
   */
  setClass(x: number, y: number, classId: number): void {
    this.data[y * this.width + x] = classId;
  }

  /**
   * Get the class at the given position
   */
  getClass(x: number, y: number): number {
    return this.data[y * this.width + x];
  }

  /**
   * Count the pixels that belong to the given class
   */
  getClassPixelCount(classId: number): number {
    let count = 0;
    for (let i = 0; i < this.data.length; i++) {
      if (this.data[i] === classId) count++;
    }
    return count;
  }

  /**
   * Get every class that occurs, with its share of the pixels (0 to 1)
   */
  getClassDistribution(): Map<number, number> {
    const distribution = new Map<number, number>();
    const total = this.data.length;

    for (let i = 0; i < this.data.length; i++) {
      const cls = this.data[i];
      distribution.set(cls, (distribution.get(cls) ?? 0) + 1);
    }

    // Convert pixel counts to fractions of the whole mask
    const result = new Map<number, number>();
    distribution.forEach((count, cls) => {
      result.set(cls, count / total);
    });

    return result;
  }
}

/**
 * Common ADE20K class labels (the first 42 of the 150 classes)
 */
export const ADE20K_LABELS: Record<number, string> = {
  0: 'wall', 1: 'building', 2: 'sky', 3: 'floor', 4: 'tree',
  5: 'ceiling', 6: 'road', 7: 'bed', 8: 'windowpane', 9: 'grass',
  10: 'cabinet', 11: 'sidewalk', 12: 'person', 13: 'earth',
  14: 'door', 15: 'table', 16: 'mountain', 17: 'plant',
  18: 'curtain', 19: 'chair', 20: 'car', 21: 'water',
  22: 'painting', 23: 'sofa', 24: 'shelf', 25: 'house',
  26: 'sea', 27: 'mirror', 28: 'rug', 29: 'field',
  30: 'armchair', 31: 'seat', 32: 'fence', 33: 'desk',
  34: 'rock', 35: 'wardrobe', 36: 'lamp', 37: 'bathtub',
  38: 'railing', 39: 'cushion', 40: 'base', 41: 'box'
};

/**
 * Color scheme for the segmentation classes (RGBA)
 * Each class maps to one easily distinguishable color
 */
export const SEGMENTATION_COLORS: Record<number, number[]> = {
  0:  [128, 128, 128, 180],  // wall - gray
  1:  [255, 0, 0, 180],      // building - red
  2:  [0, 0, 255, 180],      // sky - blue
  3:  [128, 64, 0, 180],     // floor - brown
  4:  [0, 128, 0, 180],      // tree - green
  5:  [192, 192, 0, 180],    // ceiling - olive
  6:  [64, 64, 64, 180],     // road - dark gray
  7:  [255, 128, 0, 180],    // bed - orange
  8:  [0, 255, 255, 180],    // windowpane - cyan
  9:  [0, 255, 0, 180],      // grass - bright green
  10: [128, 0, 128, 180],    // cabinet - purple
  11: [96, 96, 96, 180],     // sidewalk - gray
  12: [255, 192, 203, 180],  // person - pink
  13: [139, 90, 43, 180],    // earth - earth brown
  14: [255, 255, 0, 180],    // door - yellow
  15: [0, 128, 128, 180],    // table - teal
  16: [0, 0, 139, 180],      // mountain - dark blue
  17: [34, 139, 34, 180],    // plant - forest green
  18: [218, 112, 214, 180],  // curtain - orchid
  19: [210, 105, 30, 180],   // chair - chocolate
  20: [255, 69, 0, 180],     // car - orange-red
  21: [0, 191, 255, 180],    // water - deep sky blue
};
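The color table above covers class IDs 0 to 21, while the label table goes up to 41, so the remaining classes would all fall back to the same gray. One possible way to fill the gap is to generate a color from the class ID. The sketch below uses the bit-interleaving colormap popularized by the PASCAL VOC tooling; this is a swapped-in technique rather than anything from the article's code, and `classColor` and `colorFor` are hypothetical names.

```typescript
// Spread the bits of the class ID across the R, G and B channels, highest bit first.
// Every ID maps to a different color, and neighboring IDs differ strongly.
function classColor(classId: number, alpha: number = 180): number[] {
  let r = 0;
  let g = 0;
  let b = 0;
  let c = classId;
  for (let j = 0; j < 8; j++) {
    r |= (c & 1) << (7 - j);
    g |= ((c >> 1) & 1) << (7 - j);
    b |= ((c >> 2) & 1) << (7 - j);
    c >>= 3;
  }
  return [r, g, b, alpha];
}

// Prefer a hand-picked color when the table has one, otherwise generate one.
function colorFor(classId: number, table: Record<number, number[]>): number[] {
  return table[classId] ?? classColor(classId);
}

const handPicked: Record<number, number[]> = { 2: [0, 0, 255, 180] }; // sky only, for the demo
console.log(colorFor(2, handPicked));  // [0, 0, 255, 180]   (from the table)
console.log(colorFor(22, handPicked)); // [0, 192, 128, 180] (generated)
```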

3.2 Semantic Segmentation Inference Engine

// SemanticSegmenter.ets - semantic segmentation inference engine
import { mindSporeLite } from '@kit.MindSporeLiteKit';
import { image } from '@kit.ImageKit';
import { common } from '@kit.AbilityKit';
import { fileIo as fs } from '@kit.CoreFileKit';
import { SegmentationConfig, SegmentationMask } from './SegmentationTypes';

/**
 * Semantic segmenter
 * Wraps the complete DeepLabV3+ inference flow
 */
export class SemanticSegmenter {
  private context: common.Context;
  private config: SegmentationConfig;
  private model: mindSporeLite.Model | null = null;
  private isInitialized: boolean = false;

  constructor(context: common.Context, config: SegmentationConfig) {
    this.context = context;
    this.config = config;
  }

  /**
   * Initialize the inference engine
   */
  async initialize(): Promise<boolean> {
    try {
      const modelPath = await this.copyModelToSandbox();

      // Inference context: try the NPU (through NNRt) first, with the CPU as fallback
      const cpuDevice: mindSporeLite.CpuDevice = {
        threadNum: 4,
        precisionMode: 'preferred_fp16'
      };
      const msContext: mindSporeLite.Context = {
        target: ['nnrt', 'cpu'],
        cpu: cpuDevice,
        nnrt: {}
      };

      // Load the model
      try {
        this.model = await mindSporeLite.loadModelFromFile(modelPath, msContext);
      } catch (npuError) {
        // The NPU path is unavailable: retry on the CPU only
        console.warn(`[SemanticSegmenter] NPU load failed, falling back to CPU: ${npuError}`);
        const cpuContext: mindSporeLite.Context = { target: ['cpu'], cpu: cpuDevice };
        this.model = await mindSporeLite.loadModelFromFile(modelPath, cpuContext);
      }

      this.isInitialized = true;
      console.info('[SemanticSegmenter] Initialized');
      return true;
    } catch (error) {
      console.error(`[SemanticSegmenter] Initialization failed: ${error}`);
      return false;
    }
  }

  /**
   * Copy the model file from rawfile into the application sandbox
   */
  private async copyModelToSandbox(): Promise<string> {
    const sandboxPath = `${this.context.filesDir}/${this.config.modelName}.ms`;
    if (fs.accessSync(sandboxPath)) {
      return sandboxPath;
    }
    const srcPath = `models/${this.config.modelName}.ms`;
    const content = this.context.resourceManager.getRawFileContentSync(srcPath);
    const file = fs.openSync(sandboxPath, fs.OpenMode.CREATE | fs.OpenMode.WRITE_ONLY);
    try {
      // content may be a view on a larger buffer, so write exactly its byte range
      fs.writeSync(file.fd, content.buffer.slice(content.byteOffset, content.byteOffset + content.byteLength));
    } finally {
      fs.closeSync(file);
    }
    return sandboxPath;
  }

  /**
   * Image preprocessing
   * Similar to preprocessing for a classifier, but no letterboxing is needed
   */
  private preprocess(pixelMap: image.PixelMap): Float32Array {
    const inputWidth = this.config.inputWidth;
    const inputHeight = this.config.inputHeight;
    const imageInfo = pixelMap.getImageInfoSync();
    const srcWidth = imageInfo.size.width;
    const srcHeight = imageInfo.size.height;

    // Read the pixel data; this assumes an RGBA_8888 PixelMap with tightly packed rows
    const pixelBytes = new Uint8Array(srcWidth * srcHeight * 4);
    pixelMap.readPixelsToBufferSync(pixelBytes.buffer);

    const planeSize = inputWidth * inputHeight;
    const inputData = new Float32Array(3 * planeSize);

    // ImageNet normalization parameters
    const means = [0.485, 0.456, 0.406];
    const stds = [0.229, 0.224, 0.225];

    // Visit every target pixel: resize + normalize + HWC→CHW
    for (let h = 0; h < inputHeight; h++) {
      for (let w = 0; w < inputWidth; w++) {
        // Nearest-neighbor mapping back into the source image
        const srcH = Math.min(Math.floor(h * srcHeight / inputHeight), srcHeight - 1);
        const srcW = Math.min(Math.floor(w * srcWidth / inputWidth), srcWidth - 1);
        const srcIdx = (srcH * srcWidth + srcW) * 4;

        const r = pixelBytes[srcIdx] / 255.0;
        const g = pixelBytes[srcIdx + 1] / 255.0;
        const b = pixelBytes[srcIdx + 2] / 255.0;

        // NCHW layout; switch to NHWC if the converted model expects channel-last input
        const dstIdx = h * inputWidth + w;
        inputData[0 * planeSize + dstIdx] = (r - means[0]) / stds[0];
        inputData[1 * planeSize + dstIdx] = (g - means[1]) / stds[1];
        inputData[2 * planeSize + dstIdx] = (b - means[2]) / stds[2];
      }
    }

    return inputData;
  }

  /**
   * Post-processing: convert the model output into a segmentation mask
   * The core operation is an argmax at every pixel position
   */
  private postprocess(outputData: Float32Array): SegmentationMask {
    const outputWidth = this.config.outputWidth;
    const outputHeight = this.config.outputHeight;
    const numClasses = this.config.numClasses;
    const planeSize = outputWidth * outputHeight;
    if (outputData.length < numClasses * planeSize) {
      throw new Error(`output has ${outputData.length} values, expected ${numClasses * planeSize}`);
    }
    const mask = new SegmentationMask(outputWidth, outputHeight);

    // outputData shape: numClasses × outputHeight × outputWidth
    for (let h = 0; h < outputHeight; h++) {
      for (let w = 0; w < outputWidth; w++) {
        let maxProb = -Infinity;
        let maxClass = 0;

        // argmax over the class channels
        for (let c = 0; c < numClasses; c++) {
          const idx = c * planeSize + h * outputWidth + w;
          if (outputData[idx] > maxProb) {
            maxProb = outputData[idx];
            maxClass = c;
          }
        }

        mask.setClass(w, h, maxClass);
      }
    }

    return mask;
  }

  /**
   * Run semantic segmentation
   * @returns the segmentation mask and the elapsed time
   */
  async segment(pixelMap: image.PixelMap): Promise<{ mask: SegmentationMask; timeMs: number } | null> {
    if (!this.isInitialized || this.model === null) {
      return null;
    }

    try {
      const startTime = Date.now();

      // Preprocess
      const inputData = this.preprocess(pixelMap);

      // Fill the input tensor
      const inputs = this.model.getInputs();
      inputs[0].setData(inputData.buffer);

      // Run inference and read the first output tensor
      const outputs = await this.model.predict(inputs);
      const outputData = new Float32Array(outputs[0].getData());

      // Post-process
      const mask = this.postprocess(outputData);
      const timeMs = Date.now() - startTime;

      console.info(`[SemanticSegmenter] Segmentation finished in ${timeMs}ms`);
      return { mask, timeMs };
    } catch (error) {
      console.error(`[SemanticSegmenter] Segmentation failed: ${error}`);
      return null;
    }
  }

  /**
   * Release resources
   */
  release(): void {
    // Drop the reference so the model can be garbage-collected
    this.model = null;
    this.isInitialized = false;
  }
}
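The post-processing above indexes the output as channel-first ([numClasses, H, W]). Some model converters emit channel-last tensors ([H, W, numClasses]) instead, in which case the same loop silently produces a scrambled mask. A minimal sketch of a layout-aware argmax follows; the `Layout` type and both function names are hypothetical, and which layout a given model uses has to be checked against the converted model.

```typescript
type Layout = 'NCHW' | 'NHWC';

// Flat index of the score for class c at pixel (h, w) in either layout.
function scoreIndex(layout: Layout, c: number, h: number, w: number,
                    numClasses: number, height: number, width: number): number {
  return layout === 'NCHW'
    ? (c * height + h) * width + w
    : (h * width + w) * numClasses + c;
}

function argmaxWithLayout(scores: Float32Array, layout: Layout,
                          numClasses: number, height: number, width: number): Uint8Array {
  const mask = new Uint8Array(height * width);
  for (let h = 0; h < height; h++) {
    for (let w = 0; w < width; w++) {
      let best = 0;
      let bestScore = scores[scoreIndex(layout, 0, h, w, numClasses, height, width)];
      for (let c = 1; c < numClasses; c++) {
        const s = scores[scoreIndex(layout, c, h, w, numClasses, height, width)];
        if (s > bestScore) {
          bestScore = s;
          best = c;
        }
      }
      mask[h * width + w] = best;
    }
  }
  return mask;
}

// The same 3-class, 2x2 scores stored both ways.
const channelFirst = Float32Array.from([
  0.7, 0.1, 0.2, 0.3, // class 0
  0.2, 0.8, 0.3, 0.3, // class 1
  0.1, 0.1, 0.5, 0.4, // class 2
]);
const channelLast = Float32Array.from([
  0.7, 0.2, 0.1, // pixel (0,0)
  0.1, 0.8, 0.1, // pixel (0,1)
  0.2, 0.3, 0.5, // pixel (1,0)
  0.3, 0.3, 0.4, // pixel (1,1)
]);
console.log(Array.from(argmaxWithLayout(channelFirst, 'NCHW', 3, 2, 2))); // [0, 1, 2, 2]
console.log(Array.from(argmaxWithLayout(channelLast, 'NHWC', 3, 2, 2)));  // [0, 1, 2, 2]
```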

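3.3 Segmentation Mask Visualization Renderer

Rendering the mask as a colored overlay is the most visible part of a segmentation app:

// SegmentationVisualizer.ets - segmentation mask visualization renderer
import { image } from '@kit.ImageKit';
import { SegmentationMask, SEGMENTATION_COLORS, ADE20K_LABELS } from './SegmentationTypes';

/**
 * Segmentation visualizer
 * Renders a segmentation mask as a semi-transparent colored overlay
 */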
export class SegmentationVisualizer {

  /**
   * Render the segmentation mask to a PixelMap
   * @param mask segmentation mask
   * @param width output image width
   * @param height output image height
   * @param alpha overlay opacity (0-255)
   * @returns PixelMap holding the colored mask
   */
  static renderMaskToPixelMap(
    mask: SegmentationMask,
    width: number,
    height: number,
    alpha: number = 128
  ): image.PixelMap | null {
    try {
      // Create an editable PixelMap; the alpha channel is used, so it must not be OPAQUE
      const options: image.InitializationOptions = {
        size: { width: width, height: height },
        pixelFormat: image.PixelFormat.RGBA_8888,
        alphaType: image.AlphaType.UNPREMUL,
        editable: true
      };
      const pixelMap = image.createPixelMapSync(new ArrayBuffer(width * height * 4), options);

      // Build the RGBA pixel data
      const pixelData = new Uint8Array(width * height * 4);

      for (let h = 0; h < height; h++) {
        for (let w = 0; w < width; w++) {
          // Look up the class in the mask (scaled to the mask resolution)
          const maskX = Math.floor(w * mask.width / width);
          const maskY = Math.floor(h * mask.height / height);
          const classId = mask.getClass(
            Math.min(maskX, mask.width - 1),
            Math.min(maskY, mask.height - 1)
          );

          // Color of the class, gray for classes without an entry
          const color = SEGMENTATION_COLORS[classId] || [128, 128, 128, alpha];

          const pixelIdx = (h * width + w) * 4;
          pixelData[pixelIdx] = color[0];       // R
          pixelData[pixelIdx + 1] = color[1];   // G
          pixelData[pixelIdx + 2] = color[2];   // B
          pixelData[pixelIdx + 3] = alpha;       // A (semi-transparent)
        }
      }

      // Write into the PixelMap
      pixelMap.writeBufferToPixelsSync(pixelData.buffer);
      return pixelMap;
    } catch (error) {
      console.error(`[SegmentationVisualizer] Rendering failed: ${error}`);
      return null;
    }
  }

  /**
   * Overlay the segmentation mask on the original image
   * @param originalImage PixelMap of the original image (assumed RGBA_8888)
   * @param mask segmentation mask
   * @param blendRatio blend ratio (0 = original only, 1 = full mask color)
   * @returns the blended PixelMap
   */
  static blendWithOriginal(
    originalImage: image.PixelMap,
    mask: SegmentationMask,
    blendRatio: number = 0.5
  ): image.PixelMap | null {
    try {
      const imageInfo = originalImage.getImageInfoSync();
      const width = imageInfo.size.width;
      const height = imageInfo.size.height;

      // Read the pixels of the original image
      const srcPixels = new Uint8Array(width * height * 4);
      originalImage.readPixelsToBufferSync(srcPixels.buffer);

      // Create the output PixelMap
      const options: image.InitializationOptions = {
        size: { width: width, height: height },
        pixelFormat: image.PixelFormat.RGBA_8888,
        alphaType: image.AlphaType.OPAQUE,
        editable: true
      };
      const outputPixelMap = image.createPixelMapSync(new ArrayBuffer(width * height * 4), options);
      const outputPixels = new Uint8Array(width * height * 4);

      // Blend pixel by pixel
      for (let h = 0; h < height; h++) {
        for (let w = 0; w < width; w++) {
          const pixelIdx = (h * width + w) * 4;

          // Pixel of the original image
          const srcR = srcPixels[pixelIdx];
          const srcG = srcPixels[pixelIdx + 1];
          const srcB = srcPixels[pixelIdx + 2];

          // Color of the mask
          const maskX = Math.floor(w * mask.width / width);
          const maskY = Math.floor(h * mask.height / height);
          const classId = mask.getClass(
            Math.min(maskX, mask.width - 1),
            Math.min(maskY, mask.height - 1)
          );
          const color = SEGMENTATION_COLORS[classId] || [128, 128, 128, 180];

          // Alpha blending
          const maskAlpha = (color[3] / 255.0) * blendRatio;
          const invAlpha = 1.0 - maskAlpha;

          outputPixels[pixelIdx] = Math.round(srcR * invAlpha + color[0] * maskAlpha);
          outputPixels[pixelIdx + 1] = Math.round(srcG * invAlpha + color[1] * maskAlpha);
          outputPixels[pixelIdx + 2] = Math.round(srcB * invAlpha + color[2] * maskAlpha);
          outputPixels[pixelIdx + 3] = 255;
        }
      }

      outputPixelMap.writeBufferToPixelsSync(outputPixels.buffer);
      return outputPixelMap;
    } catch (error) {
      console.error(`[SegmentationVisualizer] Blending failed: ${error}`);
      return null;
    }
  }

  /**
   * Generate a text description of the scene
   * Builds a short summary from the segmentation result
   */
  static generateSceneDescription(mask: SegmentationMask): string {
    const distribution = mask.getClassDistribution();
    const descriptions: string[] = [];

    // Sort by share of pixels, largest first
    const sorted = Array.from(distribution.entries())
      .sort((a, b) => b[1] - a[1]);

    for (const entry of sorted) {
      const classId = entry[0];
      const ratio = entry[1];
      if (ratio < 0.01) continue;  // skip classes that cover less than 1%
      const label = ADE20K_LABELS[classId] || `class ${classId}`;
      const percentage = (ratio * 100).toFixed(1);
      descriptions.push(`${label} (${percentage}%)`);
    }

    if (descriptions.length === 0) {
      return 'Unable to recognize the scene content';
    }

    return `Scene contains: ${descriptions.join(', ')}`;
  }

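  /**
   * Extract a binary mask for the given class (255 = target class, 0 = everything else)
   * Used for follow-up operations such as cutouts
   */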
  static extractBinaryMask(mask: SegmentationMask, targetClass: number): Uint8Array {
    const binary = new Uint8Array(mask.width * mask.height);
    for (let i = 0; i < mask.data.length; i++) {
      binary[i] = mask.data[i] === targetClass ? 255 : 0;
    }
    return binary;
  }
}
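The binary mask from `extractBinaryMask` is what makes the sky-replacement example from the introduction possible. The following sketch works on raw RGBA byte buffers rather than PixelMaps so that it stays self-contained; `replaceRegion` is a hypothetical helper, and it assumes the mask has already been scaled to the image size.

```typescript
// Copy pixels from `replacement` wherever the binary mask is 255, and from `src` elsewhere.
// src and replacement are tightly packed RGBA buffers of the same size;
// binaryMask holds one byte per pixel.
function replaceRegion(src: Uint8Array, replacement: Uint8Array, binaryMask: Uint8Array): Uint8Array {
  if (src.length !== replacement.length || src.length !== binaryMask.length * 4) {
    throw new Error('buffer sizes do not match');
  }
  const out = new Uint8Array(src.length);
  for (let p = 0; p < binaryMask.length; p++) {
    const from = binaryMask[p] === 255 ? replacement : src;
    const i = p * 4;
    out[i] = from[i];
    out[i + 1] = from[i + 1];
    out[i + 2] = from[i + 2];
    out[i + 3] = from[i + 3];
  }
  return out;
}

// Two pixels: blue sky on the left, a gray building on the right.
const photo = Uint8Array.from([0, 0, 255, 255, 90, 90, 90, 255]);
// Replacement image: sunset orange everywhere.
const sunset = Uint8Array.from([255, 128, 0, 255, 255, 128, 0, 255]);
// Binary mask for the sky class: only the left pixel is sky.
const skyMask = Uint8Array.from([255, 0]);

console.log(Array.from(replaceRegion(photo, sunset, skyMask)));
// [255, 128, 0, 255, 90, 90, 90, 255]
```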

3.4 The Complete Semantic Segmentation Page

// SegmentationPage.ets - complete semantic segmentation page
import { image } from '@kit.ImageKit';
import { picker, fileIo } from '@kit.CoreFileKit';
import { common } from '@kit.AbilityKit';
import { SegmentationConfig, SegmentationMask, ADE20K_LABELS, SEGMENTATION_COLORS } from './SegmentationTypes';
import { SemanticSegmenter } from './SemanticSegmenter';
import { SegmentationVisualizer } from './SegmentationVisualizer';

@Entry
@Component
struct SegmentationPage {
  @State originalImage: PixelMap | null = null;
  @State blendedImage: PixelMap | null = null;
  @State sceneDescription: string = '';
  @State isLoading: boolean = false;
  @State isEngineReady: boolean = false;
  @State blendRatio: number = 0.5;
  @State classDistribution: Array<{ label: string; ratio: number; color: string }> = [];

  private segmenter: SemanticSegmenter | null = null;
  private currentMask: SegmentationMask | null = null;

  aboutToAppear() {
    this.initSegmenter();
  }

  aboutToDisappear() {
    this.segmenter?.release();
  }

  async initSegmenter() {
    const context = getContext(this) as common.Context;

    const config: SegmentationConfig = {
      modelName: 'deeplabv3_mnv2',
      modelPath: 'deeplabv3_mnv2.ms',
      inputWidth: 513,
      inputHeight: 513,
      numClasses: 41,  // reduced ADE20K label set; must equal the model's output channel count
      labelPath: 'ade20k_labels.txt',
      outputWidth: 513,
      outputHeight: 513
    };

    this.segmenter = new SemanticSegmenter(context, config);
    this.isEngineReady = await this.segmenter.initialize();
  }

  async pickAndSegment() {
    if (!this.isEngineReady || this.segmenter === null) return;

    try {
      this.isLoading = true;

      const photoSelectOptions = new picker.PhotoSelectOptions();
      photoSelectOptions.MIMEType = picker.PhotoViewMIMETypes.IMAGE_TYPE;
      photoSelectOptions.maxSelectNumber = 1;

      const photoViewPicker = new picker.PhotoViewPicker();
      const result = await photoViewPicker.select(photoSelectOptions);

      if (result.photoUris.length === 0) {
        return;
      }

      // Decode the image: the picker URI is opened through a file descriptor
      const file = fileIo.openSync(result.photoUris[0], fileIo.OpenMode.READ_ONLY);
      let pixelMap: image.PixelMap;
      try {
        const imageSource = image.createImageSource(file.fd);
        // Request RGBA_8888 so the byte order matches the preprocessing and blending code
        pixelMap = await imageSource.createPixelMap({
          desiredPixelFormat: image.PixelFormat.RGBA_8888
        });
        await imageSource.release();
      } finally {
        fileIo.closeSync(file);
      }
      this.originalImage = pixelMap;

      // Run segmentation
      const segResult = await this.segmenter.segment(pixelMap);
      if (segResult === null) {
        return;
      }

      this.currentMask = segResult.mask;

      // Build the overlay image
      this.updateBlendedImage();

      // Build the scene description
      this.sceneDescription = SegmentationVisualizer.generateSceneDescription(segResult.mask);

      // Update the class distribution
      this.updateClassDistribution(segResult.mask);
    } catch (error) {
      console.error(`[Page] Segmentation failed: ${error}`);
    } finally {
      this.isLoading = false;
    }
  }

  /**
   * Rebuild the overlay image (called when the blend ratio changes)
   */
  updateBlendedImage() {
    if (this.originalImage && this.currentMask) {
      this.blendedImage = SegmentationVisualizer.blendWithOriginal(
        this.originalImage,
        this.currentMask,
        this.blendRatio
      );
    }
  }

  /**
   * Update the class distribution data
   */
  updateClassDistribution(mask: SegmentationMask) {
    const distribution = mask.getClassDistribution();
    const items: Array<{ label: string; ratio: number; color: string }> = [];

    const sorted = Array.from(distribution.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, 8);  // show only the 8 largest classes

    for (const entry of sorted) {
      const classId = entry[0];
      const ratio = entry[1];
      const label = ADE20K_LABELS[classId] || `class ${classId}`;
      const colorArr = SEGMENTATION_COLORS[classId] || [128, 128, 128, 180];
      const color = `rgb(${colorArr[0]}, ${colorArr[1]}, ${colorArr[2]})`;
      items.push({ label, ratio, color });
    }

    this.classDistribution = items;
  }

  build() {
    Scroll() {
      Column() {
        // Title bar
        Row() {
          Text('Semantic Segmentation')
            .fontSize(24)
            .fontWeight(FontWeight.Bold)
            .fontColor('#FFFFFF')
        }
        .width('100%')
        .height(56)
        .justifyContent(FlexAlign.Center)
        .backgroundColor('#1A1A2E')

        // Image area (shows the overlay once available, otherwise the original)
        Stack() {
          if (this.blendedImage !== null) {
            Image(this.blendedImage)
              .width(340)
              .height(340)
              .objectFit(ImageFit.Contain)
              .borderRadius(12)
          } else if (this.originalImage !== null) {
            Image(this.originalImage)
              .width(340)
              .height(340)
              .objectFit(ImageFit.Contain)
              .borderRadius(12)
          } else {
            Column() {
              Text('🖼️')
                .fontSize(48)
              Text('Pick an image to start segmentation')
                .fontSize(14)
                .fontColor('#AAAAAA')
                .margin({ top: 12 })
            }
            .width(340)
            .height(340)
            .justifyContent(FlexAlign.Center)
            .borderRadius(12)
            .border({ width: 2, color: '#333355', style: BorderStyle.Dashed })
          }
        }
        .margin({ top: 16 })

        // Blend ratio slider (0% = original only, 100% = full mask color)
        if (this.currentMask !== null) {
          Row() {
            Text('Opacity')
              .fontSize(13)
              .fontColor('#AAAAAA')
              .width(50)
            Slider({
              value: this.blendRatio * 100,
              min: 0,
              max: 100,
              step: 5,
              style: SliderStyle.OutSet
            })
              .width('55%')
              .trackColor('#2A2A4A')
              .selectedColor('#4FC3F7')
              .onChange((value: number) => {
                this.blendRatio = value / 100;
                this.updateBlendedImage();
              })
            Text(`${Math.round(this.blendRatio * 100)}%`)
              .fontSize(13)
              .fontColor('#4FC3F7')
              .width(40)
          }
          .width('92%')
          .margin({ top: 12 })
        }

        // Action button
        Button('Pick Image')
          .width(200)
          .height(48)
          .fontSize(16)
          .backgroundColor('#4FC3F7')
          .fontColor('#1A1A2E')
          .borderRadius(24)
          .enabled(this.isEngineReady && !this.isLoading)
          .onClick(() => this.pickAndSegment())
          .margin({ top: 16 })

        // Scene description
        if (this.sceneDescription) {
          Column() {
            Text('Scene Understanding')
              .fontSize(16)
              .fontWeight(FontWeight.Bold)
              .fontColor('#FFFFFF')
              .margin({ bottom: 8 })
            Text(this.sceneDescription)
              .fontSize(14)
              .fontColor('#CCCCCC')
              .lineHeight(22)
          }
          .width('92%')
          .padding(16)
          .borderRadius(12)
          .backgroundColor('#16213E')
          .margin({ top: 16 })
          .alignItems(HorizontalAlign.Start)
        }

        // Class distribution chart
        if (this.classDistribution.length > 0) {
          Column() {
            Text('Class Distribution')
              .fontSize(16)
              .fontWeight(FontWeight.Bold)
              .fontColor('#FFFFFF')
              .margin({ bottom: 12 })

            ForEach(this.classDistribution, (item: { label: string; ratio: number; color: string }) => {
              Row() {
                Row()
                  .width(12)
                  .height(12)
                  .borderRadius(3)
                  .backgroundColor(item.color)

                Text(item.label)
                  .fontSize(13)
                  .fontColor('#FFFFFF')
                  .margin({ left: 8 })
                  .width(70)

                Row() {
                  Row()
                    .width(`${item.ratio * 100}%`)
                    .height('100%')
                    .borderRadius(3)
                    .backgroundColor(item.color)
                }
                .layoutWeight(1)
                .height(8)
                .borderRadius(3)
                .backgroundColor('#2A2A4A')
                .margin({ left: 8 })

                Text(`${(item.ratio * 100).toFixed(1)}%`)
                  .fontSize(12)
                  .fontColor('#4FC3F7')
                  .width(48)
                  .textAlign(TextAlign.End)
                  .margin({ left: 8 })
              }
              .width('100%')
              .height(28)
              .alignItems(VerticalAlign.Center)
            })
          }
          .width('92%')
          .padding(16)
          .borderRadius(12)
          .backgroundColor('#16213E')
          .margin({ top: 12 })
        }

        if (this.isLoading) {
          LoadingProgress()
            .width(48)
            .height(48)
            .color('#4FC3F7')
            .margin({ top: 20 })
        }
      }
      .width('100%')
    }
    .width('100%')
    .height('100%')
    .backgroundColor('#0F0F23')
  }
}

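4. Pitfalls and Caveats

4.1 Output Resolution Mismatch

Pitfall: the segmentation mask has a different resolution from the original image, so the overlay ends up misaligned.

Cause: the DeepLabV3+ encoder works at 1/8 or 1/16 of the input resolution, and what the exported model outputs depends on whether the final upsampling is included. If the model outputs 65×65 for a 513×513 input, the result needs 8× upsampling.

Fix:

Add an upsampling layer during model conversion so that the output has the same size as the input
Or upsample manually with bilinear interpolation in post-processing
Option 1 is recommended: the model then handles the upsampling itself, interpolating the class scores before argmax, which gives better accuracy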
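If the upsampling has to happen in post-processing, the order matters: interpolate the score channels first, then take the argmax. Interpolating a finished class-ID mask would blend unrelated IDs. The sketch below shows that order on a tiny example; the function names are hypothetical, and the corner-aligned coordinate mapping is an assumption chosen because (513 - 1) / (65 - 1) = 8 exactly.

```typescript
// Bilinearly resize one score channel with corner-aligned sampling.
function resizeBilinear(src: Float32Array, sw: number, sh: number, dw: number, dh: number): Float32Array {
  const dst = new Float32Array(dw * dh);
  const stepX = dw > 1 ? (sw - 1) / (dw - 1) : 0;
  const stepY = dh > 1 ? (sh - 1) / (dh - 1) : 0;
  for (let y = 0; y < dh; y++) {
    const fy = y * stepY;
    const y0 = Math.floor(fy);
    const y1 = Math.min(y0 + 1, sh - 1);
    const wy = fy - y0;
    for (let x = 0; x < dw; x++) {
      const fx = x * stepX;
      const x0 = Math.floor(fx);
      const x1 = Math.min(x0 + 1, sw - 1);
      const wx = fx - x0;
      const top = src[y0 * sw + x0] * (1 - wx) + src[y0 * sw + x1] * wx;
      const bottom = src[y1 * sw + x0] * (1 - wx) + src[y1 * sw + x1] * wx;
      dst[y * dw + x] = top * (1 - wy) + bottom * wy;
    }
  }
  return dst;
}

// Upsample every class channel of a [numClasses, sh, sw] tensor, then take the argmax.
function upsampleThenArgmax(scores: Float32Array, numClasses: number,
                            sw: number, sh: number, dw: number, dh: number): Uint8Array {
  const mask = new Uint8Array(dw * dh);
  const best = new Float32Array(dw * dh).fill(-Infinity);
  for (let c = 0; c < numClasses; c++) {
    const channel = scores.subarray(c * sw * sh, (c + 1) * sw * sh);
    const up = resizeBilinear(channel, sw, sh, dw, dh);
    for (let p = 0; p < up.length; p++) {
      if (up[p] > best[p]) {
        best[p] = up[p];
        mask[p] = c;
      }
    }
  }
  return mask;
}

// Two classes on a 1x2 score map, upsampled to 1x5.
const coarse = Float32Array.from([
  1, 0, // class 0
  0, 1, // class 1
]);
console.log(Array.from(resizeBilinear(coarse.subarray(0, 2), 2, 1, 5, 1))); // [1, 0.75, 0.5, 0.25, 0]
console.log(Array.from(upsampleThenArgmax(coarse, 2, 2, 1, 5, 1)));         // [0, 0, 0, 1, 1]
```

The middle pixel is an exact tie (0.5 against 0.5) and stays with the lower class ID, matching the strict comparison used in the article's post-processing.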

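4.2 argmax Performance Bottleneck

Pitfall: the argmax in post-processing takes too long (>200 ms) and becomes the performance bottleneck.

Cause: the 513×513×41 output tensor (about 10.8 million values) is traversed pixel by pixel, and plain ArkTS is not efficient at this.

Fix:

Reduce the number of classes: if only foreground and background need to be separated, a 2-class model is enough
Use a downsampled output and accept a lower-resolution segmentation result
Optimize the loop with TypedArray bulk operations
On HarmonyOS 6, use the NPU's built-in argmax operator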
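One way to restructure the loop is to walk the tensor in the order it is stored. With a [numClasses, H, W] layout, a pixel-by-pixel argmax jumps a whole plane between reads, while a channel-by-channel pass reads memory sequentially and keeps a running maximum per pixel. The sketch below shows both variants; the function names are hypothetical, and whether the second one is actually faster on a given device has to be measured.

```typescript
// Baseline: for each pixel, visit every class (strided reads).
function argmaxPixelMajor(scores: Float32Array, numClasses: number, plane: number): Uint8Array {
  const mask = new Uint8Array(plane);
  for (let p = 0; p < plane; p++) {
    let best = 0;
    let bestScore = scores[p];
    for (let c = 1; c < numClasses; c++) {
      const v = scores[c * plane + p];
      if (v > bestScore) {
        bestScore = v;
        best = c;
      }
    }
    mask[p] = best;
  }
  return mask;
}

// Variant: for each class, sweep the whole plane once (sequential reads).
function argmaxChannelMajor(scores: Float32Array, numClasses: number, plane: number): Uint8Array {
  const mask = new Uint8Array(plane);            // starts as all zeros: class 0 wins initially
  const best = new Float32Array(plane);
  best.set(scores.subarray(0, plane));           // bulk copy of the class-0 scores
  for (let c = 1; c < numClasses; c++) {
    const base = c * plane;
    for (let p = 0; p < plane; p++) {
      const v = scores[base + p];
      if (v > best[p]) {
        best[p] = v;
        mask[p] = c;
      }
    }
  }
  return mask;
}

// Deterministic pseudo-random scores for a quick equivalence check.
let lcgState = 12345;
function nextScore(): number {
  lcgState = (lcgState * 48271) % 2147483647;
  return lcgState / 2147483647;
}
const demoClasses = 5;
const demoPlane = 16 * 16;
const demoScores = new Float32Array(demoClasses * demoPlane);
for (let i = 0; i < demoScores.length; i++) demoScores[i] = nextScore();

const maskA = argmaxPixelMajor(demoScores, demoClasses, demoPlane);
const maskB = argmaxChannelMajor(demoScores, demoClasses, demoPlane);
console.log(maskA.every((v, i) => v === maskB[i])); // true
```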

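4.3 PixelMap Memory Pressure

Pitfall: holding three PixelMaps at once (the original, the mask image, and the overlay) uses too much memory.

Cause: a 1080p RGBA image takes about 8 MB (1920×1080×4 bytes), so three of them take about 25 MB. Add the intermediate data of model inference and a memory warning is easy to trigger.

Fix:

Release PixelMaps as soon as they are no longer needed (call release())
Use the synchronous image.createPixelMapSync to avoid asynchronous overhead
Release the mask PixelMap right after the overlay has been generated
Consider running segmentation on a downsampled image and mapping the result back to the original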
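The memory figures are easy to estimate up front. The sketch below computes the footprint of RGBA images and of the float32 score tensor, and picks a downscaled size for segmentation; all function names are hypothetical, and the numbers count raw buffers only, without any allocator or framework overhead.

```typescript
const BYTES_PER_MB = 1_000_000;

// Raw size of a tightly packed RGBA_8888 image.
function rgbaBytes(width: number, height: number): number {
  return width * height * 4;
}

// Raw size of a float32 score tensor of shape [numClasses, height, width].
function scoreTensorBytes(numClasses: number, height: number, width: number): number {
  return numClasses * height * width * 4;
}

// Largest size that fits within maxSide while keeping the aspect ratio (never upscales).
function fitWithin(width: number, height: number, maxSide: number): { width: number; height: number } {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

console.log((rgbaBytes(1920, 1080) / BYTES_PER_MB).toFixed(1));            // "8.3"  one 1080p image
console.log(((3 * rgbaBytes(1920, 1080)) / BYTES_PER_MB).toFixed(1));      // "24.9" original + mask + overlay
console.log((scoreTensorBytes(41, 513, 513) / BYTES_PER_MB).toFixed(1));   // "43.2" 41-class score tensor

const small = fitWithin(1920, 1080, 513);
console.log(small.width, small.height);                                     // 513 289
console.log((rgbaBytes(small.width, small.height) / BYTES_PER_MB).toFixed(1)); // "0.6"
```

On these numbers the float32 score tensor is larger than the three 1080p images combined, which is another reason to move the argmax into the model or to reduce the class count.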

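4.4 Jagged Segmentation Boundaries

Pitfall: the boundaries of the segmentation mask look jagged, which hurts the visual result.

Cause: when a low-resolution output is upsampled to the size of the original image, nearest-neighbor interpolation produces jagged edges.

Fix:

Smooth the boundaries with bilinear interpolation or CRF (conditional random field) post-processing
The decoder built into DeepLabV3+ already provides some boundary refinement
Use a more transparent overlay (blendRatio < 0.5) so that the boundaries of the original image show through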
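For cutouts, a cheap alternative to CRF is to feather the binary mask: blur the 0/255 mask slightly and use the result as a per-pixel alpha, so the edge fades over a few pixels instead of stepping. The sketch below uses a plain box blur, which is a simpler technique swapped in here rather than one of the methods listed above; `featherMask` is a hypothetical name.

```typescript
// Box-blur a 0/255 binary mask into a soft alpha mask.
// Each output value is the mean of the (2 * radius + 1)^2 window,
// counting only the positions that fall inside the image.
function featherMask(binary: Uint8Array, width: number, height: number, radius: number): Uint8Array {
  const out = new Uint8Array(binary.length);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let sum = 0;
      let count = 0;
      for (let dy = -radius; dy <= radius; dy++) {
        const yy = y + dy;
        if (yy < 0 || yy >= height) continue;
        for (let dx = -radius; dx <= radius; dx++) {
          const xx = x + dx;
          if (xx < 0 || xx >= width) continue;
          sum += binary[yy * width + xx];
          count++;
        }
      }
      out[y * width + x] = Math.round(sum / count);
    }
  }
  return out;
}

// One row with a hard edge between the second and third pixel.
const hardEdge = Uint8Array.from([0, 0, 255, 255, 255]);
console.log(Array.from(featherMask(hardEdge, 5, 1, 1))); // [0, 85, 170, 255, 255]
```

The feathered value can then replace the fixed alpha when blending, for example `alpha = soft[p] / 255 * blendRatio`.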

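5. Adapting to HarmonyOS 6

5.1 New Features

| Feature | Description |
| --- | --- |
| NPU built-in argmax | Segmentation post-processing can run on the NPU; no CPU argmax is needed |
| High-resolution output | Output at the same size as the input is supported; no upsampling is needed |
| Instance segmentation API | Adds `InstanceSegmentationSession` |
| Video segmentation | Supports real-time video stream segmentation with automatic temporal smoothing |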

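5.2 Migration Guide

NPU built-in argmax: add an argmax layer when exporting the model:

# Add argmax when exporting from PyTorch
import torch.nn as nn

class DeepLabV3WithArgmax(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        # Assumes self.model returns a [N, numClasses, H, W] score tensor
        output = self.model(x)
        return output.argmax(dim=1)  # argmax runs inside the model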
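Once the argmax runs inside the model, the output is a class-index map of H×W values instead of a numClasses×H×W score tensor, and the on-device post-processing shrinks to a type conversion. The sketch below assumes the converted model emits 32-bit integer indices; PyTorch's `argmax` produces int64, so the actual element type depends on the export and conversion steps and should be checked. `indexMapToMask` is a hypothetical helper.

```typescript
// Convert an int32 class-index map into the one-byte-per-pixel mask format.
function indexMapToMask(output: ArrayBuffer, width: number, height: number): Uint8Array {
  const ids = new Int32Array(output);
  if (ids.length !== width * height) {
    throw new Error(`expected ${width * height} indices, got ${ids.length}`);
  }
  const mask = new Uint8Array(ids.length);
  for (let i = 0; i < ids.length; i++) {
    const id = ids[i];
    if (id < 0 || id > 255) {
      throw new Error(`class index ${id} does not fit into a byte`);
    }
    mask[i] = id;
  }
  return mask;
}

// A 2x2 index map as it might come back from the output tensor.
const rawOutput = Int32Array.from([2, 2, 4, 12]).buffer;
console.log(Array.from(indexMapToMask(rawOutput, 2, 2))); // [2, 2, 4, 12]
```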

Instance segmentation API (new in HarmonyOS 6):

// HarmonyOS 6 instance segmentation interface
const instanceSession = model.createInstanceSegmentationSession(context);
instanceSession.setConfidenceThreshold(0.5);
instanceSession.setMaskThreshold(0.5);
const instances = instanceSession.segment(pixelMap);
// instances: Array<InstanceMask>, each instance has its own mask

Video stream segmentation:

// HarmonyOS 6 video segmentation
const videoSegmenter = model.createVideoSegmenter(context);
videoSegmenter.on('segment', (mask: SegmentationMask) => {
  // Per-frame segmentation result, temporally smoothed automatically
  this.updateMask(mask);
});
videoSegmenter.start(cameraStream);

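6. Summary

This article covered a complete approach to on-device image segmentation on HarmonyOS. The key points in review:

On-device image segmentation
├── Segmentation paradigms
│   ├── Semantic segmentation: per-pixel classification, objects of a class not told apart
│   ├── Instance segmentation: per-pixel classification, objects of a class told apart
│   └── Panoptic segmentation: semantic + instance combined
├── Model choice
│   ├── DeepLabV3+ + MobileNetV2: first choice on the device
│   ├── ASPP module: multi-scale atrous convolutions
│   └── Output shape: numClasses × H × W
├── Core pipeline
│   ├── Preprocessing: Resize + Normalize + HWC→CHW
│   ├── Inference: run the model on the input tensor
│   └── Post-processing: per-pixel argmax → segmentation mask
├── Visualization
│   ├── Mask → colored PixelMap
│   ├── Alpha blending onto the original image
│   └── Scene description text
├── Performance optimization
│   ├── Fewer classes
│   ├── NPU built-in argmax (HarmonyOS 6)
│   ├── Releasing PixelMaps promptly
│   └── Segmenting a downsampled image + mapping back to the original
└── Pitfalls
    ├── Output resolution differs from the original image
    ├── argmax performance bottleneck
    ├── PixelMap memory pressure
    └── Jagged segmentation boundaries

In one sentence: the essence of image segmentation is understanding at the pixel level. The argmax turns the score map into a mask, alpha blending turns the mask into a colored overlay, and pixel statistics turn it into a scene description; each step converts raw numbers into something the user can perceive. On the device, the keys are controlling the output resolution, optimizing the argmax, and managing PixelMap memory, and none of the three can be skipped.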

