HarmonyOS APP开发:端云协同AI推理架构
HarmonyOS APP开发:端云协同AI推理架构
核心要点:端云协同推理让AI在设备端和云端之间智能调度,低延迟场景走端侧、复杂任务走云端,实现"鱼和熊掌兼得"的推理体验。本文深入讲解端云协同的决策引擎、动态分流策略以及在HarmonyOS APP中的完整实现。
一、背景与动机
你有没有遇到过这种情况?用手机上的AI助手识别一张图片,等了3秒结果才出来——因为模型跑在云端,网络一卡就完蛋。又或者,用端侧模型做人脸解锁,虽然快,但光线暗一点就识别不了——因为端侧模型太小,精度不够。
这就是端云协同要解决的问题。
简单说,端云协同推理就是:简单的任务在设备上跑,复杂的任务交给云端,根据实时条件智能决策。就像公司里,简单的审批主管当场拍板,复杂的项目才需要上报CEO。这样既保证了响应速度,又不牺牲模型精度。
HarmonyOS的分布式能力让端云协同更加自然。手机、手表、平板、智慧屏可以组成一个"推理集群"——手表做人脸检测(轻量级),手机做人脸识别(中等),云端做人脸比对(高精度)。设备之间通过分布式软总线高速通信,延迟比传统网络低一个数量级。
二、核心原理
2.1 端云协同推理架构
端云协同推理架构由三个核心层组成:
- 决策引擎层:根据任务复杂度、网络状况、设备算力、电量等因素,决定推理在端侧还是云端执行
- 推理执行层:端侧推理引擎(MindSpore Lite / NNAPI)和云端推理服务(ModelArts / 自建服务)
- 结果融合层:对端云结果进行融合、校验和缓存
flowchart TB
classDef primary fill:#4F46E5,stroke:#3730A3,color:#FFFFFF
classDef warning fill:#F59E0B,stroke:#D97706,color:#FFFFFF
classDef error fill:#EF4444,stroke:#DC2626,color:#FFFFFF
classDef info fill:#06B6D4,stroke:#0891B2,color:#FFFFFF
classDef purple fill:#8B5CF6,stroke:#7C3AED,color:#FFFFFF
A[AI推理请求]:::primary --> B{决策引擎}:::warning
B -->|简单任务 / 网络差| C[端侧推理]:::info
B -->|复杂任务 / 网络好| D[云端推理]:::purple
B -->|中等任务| E[端云并行]:::error
C --> F[端侧结果]:::info
D --> G[云端结果]:::purple
E --> F
E --> G
F --> H{结果融合}:::warning
G --> H
H --> I[最终输出]:::primary
J[网络监测]:::info -.-> B
K[设备状态]:::info -.-> B
L[任务分析]:::purple -.-> B
M[缓存查询]:::warning -.-> B
2.2 智能分流决策模型
决策引擎是端云协同的"大脑",它需要综合考虑多个因素:
| 决策因子 | 端侧倾向 | 云端倾向 |
|---|---|---|
| 任务复杂度 | 低(分类、检测) | 高(生成、分割) |
| 网络延迟 | >200ms | <50ms |
| 设备算力 | NPU可用 | NPU不可用 |
| 电量 | >50% | <20% |
| 隐私要求 | 高(人脸、健康) | 低(通用识别) |
| 请求频率 | 高频实时 | 低频批量 |
决策函数可以简化为:
当 时走端侧,否则走云端。权重 可以通过历史数据在线学习调整。
2.3 端云并行推理
对于中等复杂度的任务,最聪明的做法是端云并行:同时发起端侧和云端推理,谁先返回用谁的结果。如果云端先返回,就用高精度的云端结果;如果端侧先返回且置信度够高,就直接用端侧结果,不等云端了。
这种"双保险"策略在弱网环境下特别有效——网络差的时候端侧兜底,网络好的时候云端提供更高精度。
2.4 结果缓存与预取
缓存是端云协同的"加速器"。对于重复性的推理请求(如同一张图片多次识别),直接返回缓存结果,省去推理开销。预取则更进一步——根据用户行为预测下一步可能需要的推理结果,提前在后台发起。
三、代码实战
3.1 端云协同决策引擎
这是整个架构的核心,负责根据实时条件决定推理路径。
// EdgeCloudDecisionEngine.ets
// 端云协同决策引擎 - 智能推理路径选择
import { connection } from '@kit.NetworkKit';
import { batteryInfo } from '@kit.BasicServicesKit';
// 推理路径枚举
export enum InferencePath {
EDGE_ONLY = 'edge_only', // 仅端侧
CLOUD_ONLY = 'cloud_only', // 仅云端
EDGE_CLOUD_PARALLEL = 'parallel', // 端云并行
EDGE_FIRST = 'edge_first', // 端侧优先,失败回退云端
CLOUD_FIRST = 'cloud_first' // 云端优先,失败回退端侧
}
// 任务描述接口
export interface TaskDescriptor {
taskType: string; // 任务类型:classification/detection/generation/segmentation
inputSize: number; // 输入数据大小(字节)
modelComplexity: number; // 模型复杂度 0-1
privacyLevel: number; // 隐私等级 0-1(1=最高)
latencyRequirement: number; // 延迟要求(毫秒)
accuracyRequirement: number; // 精度要求 0-1
}
// 设备状态接口
interface DeviceStatus {
cpuUsage: number; // CPU使用率 0-1
memoryAvailable: number; // 可用内存(MB)
batteryLevel: number; // 电量 0-100
isCharging: boolean; // 是否充电中
isNPUAvailable: boolean; // NPU是否可用
thermalStatus: number; // 温度状态 0-3
}
// 网络状态接口
interface NetworkStatus {
type: string; // wifi/cellular/none
latency: number; // 网络延迟(毫秒)
bandwidth: number; // 带宽(Mbps)
isStable: boolean; // 网络是否稳定
}
// 决策结果接口
export interface DecisionResult {
path: InferencePath;
confidence: number; // 决策置信度 0-1
reason: string; // 决策理由
estimatedLatency: number; // 预估延迟(毫秒)
fallbackPath: InferencePath; // 回退路径
}
export class EdgeCloudDecisionEngine {
// 决策权重配置
private weights = {
taskComplexity: 0.25,
networkQuality: 0.20,
deviceCapability: 0.20,
privacyRequirement: 0.15,
latencyRequirement: 0.10,
batteryStatus: 0.10
};
// 端侧阈值:得分高于此值倾向端侧
private edgeThreshold: number = 0.55;
// 缓存最近一次的设备/网络状态
private lastDeviceStatus: DeviceStatus | null = null;
private lastNetworkStatus: NetworkStatus | null = null;
// 历史决策记录(用于在线学习)
private decisionHistory: Array<{
task: TaskDescriptor;
decision: DecisionResult;
actualLatency: number;
success: boolean;
}> = [];
// 执行推理路径决策
decide(task: TaskDescriptor): DecisionResult {
// 获取当前设备状态
const deviceStatus = this.getDeviceStatus();
const networkStatus = this.getNetworkStatus();
// 计算各因子得分
const complexityScore = this.calcComplexityScore(task);
const networkScore = this.calcNetworkScore(networkStatus);
const deviceScore = this.calcDeviceScore(deviceStatus);
const privacyScore = this.calcPrivacyScore(task);
const latencyScore = this.calcLatencyScore(task, networkStatus);
const batteryScore = this.calcBatteryScore(deviceStatus);
// 加权求和
const edgeScore =
this.weights.taskComplexity * (1 - complexityScore) + // 复杂度低→端侧
this.weights.networkQuality * (1 - networkScore) + // 网络差→端侧
this.weights.deviceCapability * deviceScore + // 设备强→端侧
this.weights.privacyRequirement * privacyScore + // 隐私高→端侧
this.weights.latencyRequirement * (1 - latencyScore) + // 延迟要求高→端侧
this.weights.batteryStatus * batteryScore; // 电量足→端侧
// 根据综合得分选择推理路径
let path: InferencePath;
let reason: string;
let estimatedLatency: number;
let fallbackPath: InferencePath;
if (edgeScore > this.edgeThreshold + 0.15) {
// 明确倾向端侧
path = InferencePath.EDGE_ONLY;
reason = `端侧优势明显(得分${edgeScore.toFixed(2)})`;
estimatedLatency = this.estimateEdgeLatency(task, deviceStatus);
fallbackPath = InferencePath.CLOUD_ONLY;
} else if (edgeScore < this.edgeThreshold - 0.15) {
// 明确倾向云端
path = InferencePath.CLOUD_ONLY;
reason = `云端优势明显(得分${edgeScore.toFixed(2)})`;
estimatedLatency = this.estimateCloudLatency(task, networkStatus);
fallbackPath = InferencePath.EDGE_ONLY;
} else {
// 得分接近,走端云并行
path = InferencePath.EDGE_CLOUD_PARALLEL;
reason = `端云势均力敌(得分${edgeScore.toFixed(2)}),并行取优`;
estimatedLatency = Math.min(
this.estimateEdgeLatency(task, deviceStatus),
this.estimateCloudLatency(task, networkStatus)
);
fallbackPath = InferencePath.EDGE_ONLY;
}
// 特殊情况覆盖
if (networkStatus.type === 'none') {
path = InferencePath.EDGE_ONLY;
reason = '无网络连接,强制端侧推理';
fallbackPath = InferencePath.EDGE_ONLY;
} else if (task.privacyLevel > 0.8 && path === InferencePath.CLOUD_ONLY) {
path = InferencePath.EDGE_FIRST;
reason = '高隐私任务,端侧优先';
fallbackPath = InferencePath.EDGE_ONLY;
} else if (task.latencyRequirement < 50 && networkStatus.latency > 100) {
path = InferencePath.EDGE_ONLY;
reason = '低延迟要求+高网络延迟,强制端侧';
fallbackPath = InferencePath.EDGE_ONLY;
}
const confidence = Math.abs(edgeScore - this.edgeThreshold) / 0.5;
const result: DecisionResult = {
path,
confidence: Math.min(1, confidence),
reason,
estimatedLatency,
fallbackPath
};
console.info(`[DecisionEngine] 决策: ${path}, 理由: ${reason}, 置信度: ${confidence.toFixed(2)}`);
return result;
}
// 计算任务复杂度得分(越高越复杂→越倾向云端)
private calcComplexityScore(task: TaskDescriptor): number {
const complexityMap: Record<string, number> = {
'classification': 0.2,
'detection': 0.4,
'segmentation': 0.7,
'generation': 0.9
};
return complexityMap[task.taskType] || task.modelComplexity;
}
// 计算网络质量得分(越高→网络越好→倾向云端)
private calcNetworkScore(network: NetworkStatus): number {
if (network.type === 'none') return 0;
let score = 0;
// 延迟评分
if (network.latency < 30) score += 0.4;
else if (network.latency < 100) score += 0.25;
else if (network.latency < 300) score += 0.1;
// 带宽评分
if (network.bandwidth > 50) score += 0.3;
else if (network.bandwidth > 10) score += 0.2;
else score += 0.05;
// 稳定性评分
score += network.isStable ? 0.3 : 0.1;
return Math.min(1, score);
}
// 计算设备能力得分(越高→设备越强→倾向端侧)
private calcDeviceScore(device: DeviceStatus): number {
let score = 0;
score += device.isNPUAvailable ? 0.4 : 0;
score += device.cpuUsage < 0.5 ? 0.2 : 0.05;
score += device.memoryAvailable > 2048 ? 0.2 : 0.05;
score += device.thermalStatus < 2 ? 0.2 : 0;
return Math.min(1, score);
}
// 计算隐私得分(越高→隐私要求越高→倾向端侧)
private calcPrivacyScore(task: TaskDescriptor): number {
return task.privacyLevel;
}
// 计算延迟得分(越高→延迟要求越宽松→倾向云端)
private calcLatencyScore(task: TaskDescriptor, network: NetworkStatus): number {
if (task.latencyRequirement < 50) return 0; // 极低延迟→端侧
if (task.latencyRequirement < 200) return 0.3; // 低延迟→倾向端侧
if (task.latencyRequirement < 1000) return 0.6; // 中等延迟→均可
return 1.0; // 无延迟要求→倾向云端
}
// 计算电量得分(越高→电量越充足→倾向端侧)
private calcBatteryScore(device: DeviceStatus): number {
if (device.isCharging) return 1.0;
if (device.batteryLevel > 60) return 0.8;
if (device.batteryLevel > 30) return 0.4;
return 0.1; // 低电量→倾向云端(省电)
}
// 预估端侧推理延迟
private estimateEdgeLatency(task: TaskDescriptor, device: DeviceStatus): number {
const baseLatency = task.modelComplexity * 200; // 基础延迟
const npuFactor = device.isNPUAvailable ? 0.3 : 1.0; // NPU加速3倍
const thermalFactor = 1 + device.thermalStatus * 0.2; // 降频惩罚
return Math.round(baseLatency * npuFactor * thermalFactor);
}
// 预估云端推理延迟
private estimateCloudLatency(task: TaskDescriptor, network: NetworkStatus): number {
const computeLatency = task.modelComplexity * 50; // 云端计算快
const transferLatency = (task.inputSize / 1024 / 1024) / network.bandwidth * 1000; // 传输延迟
const networkLatency = network.latency * 2; // 往返延迟
return Math.round(computeLatency + transferLatency + networkLatency);
}
// 获取设备状态
private getDeviceStatus(): DeviceStatus {
try {
const battery = batteryInfo.getBatteryInfo();
return {
cpuUsage: 0.3, // 简化:实际应通过系统API获取
memoryAvailable: 4096,
batteryLevel: battery.level,
isCharging: battery.chargingStatus === batteryInfo.BatteryChargeState.ENABLE,
isNPUAvailable: true, // 简化:实际应检测NPU
thermalStatus: 0
};
} catch (error) {
return {
cpuUsage: 0.5,
memoryAvailable: 2048,
batteryLevel: 50,
isCharging: false,
isNPUAvailable: false,
thermalStatus: 0
};
}
}
// 获取网络状态
private getNetworkStatus(): NetworkStatus {
try {
const netConnection = connection.createNetConnection();
// 简化:实际应异步获取网络状态
return {
type: 'wifi',
latency: 30,
bandwidth: 50,
isStable: true
};
} catch (error) {
return {
type: 'none',
latency: 999,
bandwidth: 0,
isStable: false
};
}
}
// 记录决策结果(用于在线学习)
recordDecision(task: TaskDescriptor, decision: DecisionResult,
actualLatency: number, success: boolean): void {
this.decisionHistory.push({ task, decision, actualLatency, success });
// 保留最近100条记录
if (this.decisionHistory.length > 100) {
this.decisionHistory.shift();
}
// 简单的在线学习:调整阈值
this.adaptThreshold();
}
// 自适应阈值调整
private adaptThreshold(): void {
if (this.decisionHistory.length < 10) return;
const recentDecisions = this.decisionHistory.slice(-20);
const edgeSuccessRate = recentDecisions
.filter(d => d.decision.path === InferencePath.EDGE_ONLY)
.filter(d => d.success)
.length / Math.max(1, recentDecisions
.filter(d => d.decision.path === InferencePath.EDGE_ONLY).length);
// 如果端侧成功率低,提高阈值(更倾向云端)
if (edgeSuccessRate < 0.7) {
this.edgeThreshold = Math.min(0.7, this.edgeThreshold + 0.02);
} else if (edgeSuccessRate > 0.9) {
// 如果端侧成功率高,降低阈值(更倾向端侧)
this.edgeThreshold = Math.max(0.3, this.edgeThreshold - 0.01);
}
}
}
3.2 端云协同推理执行器
负责实际的推理执行,支持端侧、云端和并行三种模式。
// EdgeCloudInferenceExecutor.ets
// 端云协同推理执行器 - 统一的推理执行接口
import { http } from '@kit.NetworkKit';
import {
EdgeCloudDecisionEngine,
InferencePath,
TaskDescriptor,
DecisionResult
} from './EdgeCloudDecisionEngine';
// 推理结果接口
export interface InferenceResult {
data: Record<string, Object>; // 推理输出
confidence: number; // 置信度
latency: number; // 实际延迟(毫秒)
path: InferencePath; // 实际推理路径
fromCache: boolean; // 是否来自缓存
timestamp: number; // 时间戳
}
// 端侧推理引擎接口
interface EdgeInferenceEngine {
predict(input: Float32Array): Promise<Record<string, Object>>;
isReady(): boolean;
getModelInfo(): { name: string; version: string; inputSize: number };
}
// 云端推理服务接口
interface CloudInferenceConfig {
endpoint: string;
apiKey: string;
timeout: number;
retryCount: number;
}
export class EdgeCloudInferenceExecutor {
private decisionEngine: EdgeCloudDecisionEngine;
private edgeEngine: EdgeInferenceEngine | null = null;
private cloudConfig: CloudInferenceConfig;
private resultCache: Map<string, InferenceResult> = new Map();
private maxCacheSize: number = 100;
constructor(cloudConfig: CloudInferenceConfig) {
this.decisionEngine = new EdgeCloudDecisionEngine();
this.cloudConfig = cloudConfig;
}
// 注册端侧推理引擎
registerEdgeEngine(engine: EdgeInferenceEngine): void {
this.edgeEngine = engine;
console.info(`[InferenceExecutor] 端侧引擎已注册: ${engine.getModelInfo().name}`);
}
// 统一推理入口
async infer(input: Float32Array, task: TaskDescriptor): Promise<InferenceResult> {
const startTime = Date.now();
// 第一步:检查缓存
const cacheKey = this.generateCacheKey(input, task);
const cachedResult = this.resultCache.get(cacheKey);
if (cachedResult && Date.now() - cachedResult.timestamp < 300000) { // 5分钟缓存有效期
console.info('[InferenceExecutor] 命中缓存');
return { ...cachedResult, fromCache: true };
}
// 第二步:决策引擎选择推理路径
const decision = this.decisionEngine.decide(task);
console.info(`[InferenceExecutor] 推理路径: ${decision.path}, 理由: ${decision.reason}`);
// 第三步:执行推理
let result: InferenceResult;
try {
switch (decision.path) {
case InferencePath.EDGE_ONLY:
result = await this.inferOnEdge(input, task, decision);
break;
case InferencePath.CLOUD_ONLY:
result = await this.inferOnCloud(input, task, decision);
break;
case InferencePath.EDGE_CLOUD_PARALLEL:
result = await this.inferParallel(input, task, decision);
break;
case InferencePath.EDGE_FIRST:
result = await this.inferEdgeFirst(input, task, decision);
break;
case InferencePath.CLOUD_FIRST:
result = await this.inferCloudFirst(input, task, decision);
break;
default:
result = await this.inferOnEdge(input, task, decision);
}
} catch (error) {
// 推理失败,尝试回退路径
console.error(`[InferenceExecutor] 推理失败: ${error}, 尝试回退路径: ${decision.fallbackPath}`);
result = await this.fallbackInfer(input, task, decision.fallbackPath);
}
// 第四步:记录决策结果(用于在线学习)
const actualLatency = Date.now() - startTime;
this.decisionEngine.recordDecision(task, decision, actualLatency, true);
// 第五步:缓存结果
this.cacheResult(cacheKey, result);
return result;
}
// 端侧推理
private async inferOnEdge(input: Float32Array, task: TaskDescriptor,
decision: DecisionResult): Promise<InferenceResult> {
if (!this.edgeEngine || !this.edgeEngine.isReady()) {
throw new Error('端侧推理引擎不可用');
}
const startTime = Date.now();
const output = await this.edgeEngine.predict(input);
const latency = Date.now() - startTime;
return {
data: output,
confidence: 0.85, // 端侧模型通常精度略低
latency,
path: InferencePath.EDGE_ONLY,
fromCache: false,
timestamp: Date.now()
};
}
// 云端推理
private async inferOnCloud(input: Float32Array, task: TaskDescriptor,
decision: DecisionResult): Promise<InferenceResult> {
const startTime = Date.now();
// 将输入数据编码为Base64
const inputBase64 = this.float32ToBase64(input);
// 发送推理请求
const request = http.createHttp();
const response = await request.request(this.cloudConfig.endpoint, {
method: http.RequestMethod.POST,
header: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${this.cloudConfig.apiKey}`
},
extraData: JSON.stringify({
task_type: task.taskType,
input_data: inputBase64,
model_version: 'latest'
}),
expectDataType: http.HttpDataType.STRING,
connectTimeout: this.cloudConfig.timeout,
readTimeout: this.cloudConfig.timeout
});
const latency = Date.now() - startTime;
if (response.responseCode !== 200) {
throw new Error(`云端推理失败,状态码: ${response.responseCode}`);
}
const result = JSON.parse(response.result as string);
return {
data: result.output,
confidence: result.confidence || 0.95,
latency,
path: InferencePath.CLOUD_ONLY,
fromCache: false,
timestamp: Date.now()
};
}
// 端云并行推理 - 谁先返回用谁
private async inferParallel(input: Float32Array, task: TaskDescriptor,
decision: DecisionResult): Promise<InferenceResult> {
const startTime = Date.now();
// 同时发起端侧和云端推理
const edgePromise = this.inferOnEdge(input, task, decision)
.catch(_ => null as InferenceResult | null);
const cloudPromise = this.inferOnCloud(input, task, decision)
.catch(_ => null as InferenceResult | null);
// 使用Promise.race获取最快结果
// 但我们不用race,因为想等两边都完成来选更好的结果
const edgeResult = await edgePromise;
const cloudResult = await cloudPromise;
// 选择策略:如果端侧结果置信度够高且延迟低,优先用端侧
// 否则用云端结果(精度更高)
if (edgeResult && cloudResult) {
if (edgeResult.confidence > 0.9 && edgeResult.latency < cloudResult.latency) {
console.info('[InferenceExecutor] 并行推理:选择端侧结果(高置信度+低延迟)');
return { ...edgeResult, path: InferencePath.EDGE_CLOUD_PARALLEL };
} else {
console.info('[InferenceExecutor] 并行推理:选择云端结果(更高精度)');
return { ...cloudResult, path: InferencePath.EDGE_CLOUD_PARALLEL };
}
}
// 只有一边成功
const finalResult = edgeResult || cloudResult;
if (!finalResult) {
throw new Error('端云并行推理均失败');
}
return { ...finalResult, path: InferencePath.EDGE_CLOUD_PARALLEL };
}
// 端侧优先推理
private async inferEdgeFirst(input: Float32Array, task: TaskDescriptor,
decision: DecisionResult): Promise<InferenceResult> {
try {
const edgeResult = await this.inferOnEdge(input, task, decision);
if (edgeResult.confidence > 0.7) {
return edgeResult;
}
// 端侧置信度不够,回退云端
console.info('[InferenceExecutor] 端侧置信度不足,回退云端');
return await this.inferOnCloud(input, task, decision);
} catch (error) {
return await this.inferOnCloud(input, task, decision);
}
}
// 云端优先推理
private async inferCloudFirst(input: Float32Array, task: TaskDescriptor,
decision: DecisionResult): Promise<InferenceResult> {
try {
return await this.inferOnCloud(input, task, decision);
} catch (error) {
console.info('[InferenceExecutor] 云端推理失败,回退端侧');
return await this.inferOnEdge(input, task, decision);
}
}
// 回退推理
private async fallbackInfer(input: Float32Array, task: TaskDescriptor,
fallbackPath: InferencePath): Promise<InferenceResult> {
const fallbackDecision: DecisionResult = {
path: fallbackPath,
confidence: 0.5,
reason: '回退推理',
estimatedLatency: 0,
fallbackPath: InferencePath.EDGE_ONLY
};
if (fallbackPath === InferencePath.EDGE_ONLY) {
return await this.inferOnEdge(input, task, fallbackDecision);
} else {
return await this.inferOnCloud(input, task, fallbackDecision);
}
}
// 生成缓存键
private generateCacheKey(input: Float32Array, task: TaskDescriptor): string {
// 简化:使用输入数据的哈希和任务类型作为缓存键
let hash = 0;
for (let i = 0; i < Math.min(input.length, 100); i++) {
hash = ((hash << 5) - hash + Math.round(input[i] * 1000)) | 0;
}
return `${task.taskType}_${hash}`;
}
// 缓存推理结果
private cacheResult(key: string, result: InferenceResult): void {
if (this.resultCache.size >= this.maxCacheSize) {
// 删除最早的缓存
const firstKey = this.resultCache.keys().next().value;
if (firstKey) this.resultCache.delete(firstKey);
}
this.resultCache.set(key, result);
}
// Float32Array转Base64(简化实现)
private float32ToBase64(array: Float32Array): string {
const bytes = new Uint8Array(array.buffer);
let binary = '';
for (let i = 0; i < bytes.length; i++) {
binary += String.fromCharCode(bytes[i]);
}
return btoa(binary);
}
// 清除缓存
clearCache(): void {
this.resultCache.clear();
console.info('[InferenceExecutor] 缓存已清除');
}
}
3.3 端云协同推理完整UI
将决策引擎和推理执行器集成到HarmonyOS APP中,提供直观的推理路径可视化和状态监控。
// EdgeCloudInferencePage.ets
// 端云协同推理页面 - 可视化推理路径与状态监控
import {
EdgeCloudInferenceExecutor,
InferenceResult,
CloudInferenceConfig
} from './EdgeCloudInferenceExecutor';
import {
EdgeCloudDecisionEngine,
InferencePath,
TaskDescriptor
} from './EdgeCloudDecisionEngine';
@Entry
@Component
struct EdgeCloudInferencePage {
@State currentPath: string = '待决策';
@State inferenceStatus: string = '空闲';
@State lastLatency: number = 0;
@State lastConfidence: number = 0;
@State edgeCallCount: number = 0;
@State cloudCallCount: number = 0;
@State parallelCallCount: number = 0;
@State cacheHitCount: number = 0;
@State avgLatency: number = 0;
@State networkType: string = 'WiFi';
@State batteryLevel: number = 85;
@State isNPUAvailable: boolean = true;
@State selectedTaskType: number = 0;
// 推理历史
@State inferenceHistory: Array<{
time: string;
path: string;
latency: number;
confidence: number;
}> = [];
private taskTypes: string[] = ['图像分类', '目标检测', '语义分割', '文本生成'];
private executor: EdgeCloudInferenceExecutor | null = null;
private latencySum: number = 0;
private totalCalls: number = 0;
aboutToAppear() {
const cloudConfig: CloudInferenceConfig = {
endpoint: 'https://ai-api.example.com/v1/inference',
apiKey: 'your-api-key',
timeout: 5000,
retryCount: 2
};
this.executor = new EdgeCloudInferenceExecutor(cloudConfig);
}
build() {
Navigation() {
Scroll() {
Column({ space: 16 }) {
// 顶部状态栏
this.StatusBar()
// 推理路径可视化
this.PathVisualization()
// 任务选择
this.TaskSelector()
// 操作按钮
this.ActionButtons()
// 性能统计
this.PerformanceStats()
// 推理历史
this.InferenceHistory()
}
.width('100%')
.padding(16)
}
.width('100%')
.height('100%')
}
.title('端云协同推理')
.titleMode(NavigationTitleMode.Mini)
}
// 顶部状态栏
@Builder StatusBar() {
Row({ space: 12 }) {
// 网络状态
Column({ space: 2 }) {
Text(this.networkType === 'WiFi' ? '📶' : '📡')
.fontSize(20)
Text(this.networkType)
.fontSize(10)
.fontColor('#64748B')
}
.padding(8)
.borderRadius(8)
.backgroundColor('#F0F9FF')
// 电量
Column({ space: 2 }) {
Text('🔋')
.fontSize(20)
Text(`${this.batteryLevel}%`)
.fontSize(10)
.fontColor('#64748B')
}
.padding(8)
.borderRadius(8)
.backgroundColor('#F0FDF4')
// NPU
Column({ space: 2 }) {
Text(this.isNPUAvailable ? '⚡' : '💤')
.fontSize(20)
Text(this.isNPUAvailable ? 'NPU就绪' : 'NPU休眠')
.fontSize(10)
.fontColor(this.isNPUAvailable ? '#10B981' : '#94A3B8')
}
.padding(8)
.borderRadius(8)
.backgroundColor(this.isNPUAvailable ? '#ECFDF5' : '#F8FAFC')
// 当前路径
Column({ space: 2 }) {
Text('🧭')
.fontSize(20)
Text(this.currentPath)
.fontSize(10)
.fontColor('#4F46E5')
.fontWeight(FontWeight.Bold)
}
.padding(8)
.borderRadius(8)
.backgroundColor('#EEF2FF')
}
.width('100%')
.justifyContent(FlexAlign.SpaceAround)
}
// 推理路径可视化
@Builder PathVisualization() {
Column({ space: 12 }) {
Text('🔀 推理路径')
.fontSize(16)
.fontWeight(FontWeight.Bold)
.fontColor('#1E293B')
.width('100%')
// 端云协同流程图
Row({ space: 8 }) {
// 端侧
Column({ space: 4 }) {
Text('📱')
.fontSize(28)
Text('端侧推理')
.fontSize(12)
.fontColor('#475569')
Text('MindSpore Lite')
.fontSize(10)
.fontColor('#94A3B8')
}
.padding(12)
.borderRadius(12)
.backgroundColor(this.currentPath.includes('端侧') ? '#EEF2FF' : '#F8FAFC')
.border({
width: this.currentPath.includes('端侧') ? 2 : 1,
color: this.currentPath.includes('端侧') ? '#4F46E5' : '#E2E8F0'
})
.layoutWeight(1)
.alignItems(HorizontalAlign.Center)
// 箭头
Column() {
Text('⇌')
.fontSize(24)
.fontColor('#94A3B8')
}
.justifyContent(FlexAlign.Center)
// 云端
Column({ space: 4 }) {
Text('☁️')
.fontSize(28)
Text('云端推理')
.fontSize(12)
.fontColor('#475569')
Text('ModelArts')
.fontSize(10)
.fontColor('#94A3B8')
}
.padding(12)
.borderRadius(12)
.backgroundColor(this.currentPath.includes('云端') ? '#F5F3FF' : '#F8FAFC')
.border({
width: this.currentPath.includes('云端') ? 2 : 1,
color: this.currentPath.includes('云端') ? '#8B5CF6' : '#E2E8F0'
})
.layoutWeight(1)
.alignItems(HorizontalAlign.Center)
}
.width('100%')
// 当前推理状态
Row() {
Text('状态:')
.fontSize(13)
.fontColor('#64748B')
Text(this.inferenceStatus)
.fontSize(13)
.fontWeight(FontWeight.Bold)
.fontColor(this.inferenceStatus === '推理完成' ? '#10B981' : '#F59E0B')
}
.width('100%')
}
.width('100%')
.padding(16)
.borderRadius(16)
.backgroundColor('#FFFFFF')
.shadow({ radius: 8, color: 'rgba(0,0,0,0.06)', offsetX: 0, offsetY: 2 })
}
// 任务选择器
@Builder TaskSelector() {
Column({ space: 12 }) {
Text('🎯 任务类型')
.fontSize(16)
.fontWeight(FontWeight.Bold)
.fontColor('#1E293B')
.width('100%')
Row({ space: 8 }) {
ForEach(this.taskTypes, (type: string, index: number) => {
Button(type)
.fontSize(13)
.fontColor(this.selectedTaskType === index ? '#FFFFFF' : '#475569')
.backgroundColor(this.selectedTaskType === index ? '#4F46E5' : '#F1F5F9')
.borderRadius(8)
.layoutWeight(1)
.height(36)
.onClick(() => {
this.selectedTaskType = index;
})
}, (type: string, index: number) => `${index}`)
}
.width('100%')
}
.width('100%')
.padding(16)
.borderRadius(16)
.backgroundColor('#FFFFFF')
.shadow({ radius: 8, color: 'rgba(0,0,0,0.06)', offsetX: 0, offsetY: 2 })
}
// 操作按钮
@Builder ActionButtons() {
Row({ space: 12 }) {
Button('🚀 执行推理')
.fontSize(16)
.fontWeight(FontWeight.Bold)
.fontColor('#FFFFFF')
.backgroundColor('#4F46E5')
.borderRadius(12)
.layoutWeight(1)
.height(48)
.onClick(() => this.runInference())
Button('🗑️ 清除缓存')
.fontSize(14)
.fontColor('#64748B')
.backgroundColor('#F1F5F9')
.borderRadius(12)
.width(100)
.height(48)
.onClick(() => {
this.executor?.clearCache();
this.cacheHitCount = 0;
})
}
.width('100%')
}
// 性能统计
@Builder PerformanceStats() {
Column({ space: 12 }) {
Text('📊 性能统计')
.fontSize(16)
.fontWeight(FontWeight.Bold)
.fontColor('#1E293B')
.width('100%')
Row({ space: 8 }) {
this.StatCard('端侧调用', `${this.edgeCallCount}`, '#4F46E5')
this.StatCard('云端调用', `${this.cloudCallCount}`, '#8B5CF6')
this.StatCard('并行调用', `${this.parallelCallCount}`, '#06B6D4')
}
.width('100%')
Row({ space: 8 }) {
this.StatCard('缓存命中', `${this.cacheHitCount}`, '#10B981')
this.StatCard('平均延迟', `${this.avgLatency}ms`, '#F59E0B')
this.StatCard('最近置信度', `${(this.lastConfidence * 100).toFixed(0)}%`, '#EF4444')
}
.width('100%')
}
.width('100%')
.padding(16)
.borderRadius(16)
.backgroundColor('#FFFFFF')
.shadow({ radius: 8, color: 'rgba(0,0,0,0.06)', offsetX: 0, offsetY: 2 })
}
// 统计卡片
@Builder StatCard(label: string, value: string, color: string) {
Column({ space: 4 }) {
Text(value)
.fontSize(18)
.fontWeight(FontWeight.Bold)
.fontColor(color)
Text(label)
.fontSize(11)
.fontColor('#94A3B8')
}
.padding(8)
.borderRadius(8)
.backgroundColor('#F8FAFC')
.layoutWeight(1)
.alignItems(HorizontalAlign.Center)
}
// 推理历史
@Builder InferenceHistory() {
Column({ space: 8 }) {
Text('📋 推理历史')
.fontSize(16)
.fontWeight(FontWeight.Bold)
.fontColor('#1E293B')
.width('100%')
if (this.inferenceHistory.length === 0) {
Text('暂无推理记录')
.fontSize(13)
.fontColor('#94A3B8')
.width('100%')
.textAlign(TextAlign.Center)
.padding(20)
} else {
List({ space: 4 }) {
ForEach(this.inferenceHistory, (item: {
time: string; path: string; latency: number; confidence: number;
}, index: number) => {
ListItem() {
Row() {
Text(item.time)
.fontSize(11)
.fontColor('#94A3B8')
.width(60)
Text(item.path)
.fontSize(12)
.fontColor('#475569')
.layoutWeight(1)
Text(`${item.latency}ms`)
.fontSize(11)
.fontColor('#F59E0B')
.width(50)
.textAlign(TextAlign.End)
Text(`${(item.confidence * 100).toFixed(0)}%`)
.fontSize(11)
.fontColor('#10B981')
.width(35)
.textAlign(TextAlign.End)
}
.width('100%')
.padding({ left: 8, right: 8, top: 4, bottom: 4 })
}
}, (item: Object, index: number) => `${index}`)
}
.width('100%')
.height(160)
.borderRadius(8)
.backgroundColor('#F8FAFC')
.padding(4)
}
}
.width('100%')
.padding(16)
.borderRadius(16)
.backgroundColor('#FFFFFF')
.shadow({ radius: 8, color: 'rgba(0,0,0,0.06)', offsetX: 0, offsetY: 2 })
}
// 执行推理
private async runInference() {
if (!this.executor) return;
this.inferenceStatus = '推理中...';
const taskTypeMap: Record<string, string> = {
'图像分类': 'classification',
'目标检测': 'detection',
'语义分割': 'segmentation',
'文本生成': 'generation'
};
const complexityMap: Record<string, number> = {
'classification': 0.2,
'detection': 0.4,
'segmentation': 0.7,
'generation': 0.9
};
const taskType = taskTypeMap[this.taskTypes[this.selectedTaskType]] || 'classification';
const task: TaskDescriptor = {
taskType,
inputSize: 1024 * 100,
modelComplexity: complexityMap[taskType] || 0.5,
privacyLevel: 0.3,
latencyRequirement: 200,
accuracyRequirement: 0.8
};
// 模拟输入数据
const input = new Float32Array(100);
for (let i = 0; i < 100; i++) {
input[i] = Math.random();
}
try {
const result = await this.executor.infer(input, task);
// 更新UI状态
this.currentPath = this.getPathDisplayName(result.path);
this.lastLatency = result.latency;
this.lastConfidence = result.confidence;
this.inferenceStatus = '推理完成';
// 更新统计
if (result.path === InferencePath.EDGE_ONLY) this.edgeCallCount++;
else if (result.path === InferencePath.CLOUD_ONLY) this.cloudCallCount++;
else this.parallelCallCount++;
if (result.fromCache) this.cacheHitCount++;
this.totalCalls++;
this.latencySum += result.latency;
this.avgLatency = Math.round(this.latencySum / this.totalCalls);
// 添加历史记录
this.inferenceHistory.unshift({
time: new Date().toLocaleTimeString(),
path: this.getPathDisplayName(result.path),
latency: result.latency,
confidence: result.confidence
});
if (this.inferenceHistory.length > 20) {
this.inferenceHistory.pop();
}
} catch (error) {
this.inferenceStatus = '推理失败';
console.error(`推理失败: ${error}`);
}
}
// 获取路径显示名称
private getPathDisplayName(path: InferencePath): string {
const nameMap: Record<string, string> = {
'edge_only': '端侧推理',
'cloud_only': '云端推理',
'parallel': '端云并行',
'edge_first': '端侧优先',
'cloud_first': '云端优先'
};
return nameMap[path] || path;
}
}
四、踩坑与注意事项
坑1:云端推理超时导致UI卡死
问题:云端推理是网络请求,如果直接在主线程调用,一旦网络超时(默认60秒),UI就冻住了。用户等不了60秒,3秒没反应就会以为APP挂了。
解决方案:所有推理调用必须异步执行,并设置合理的超时时间。同时提供"取消推理"功能,让用户可以主动中断。
// 带超时和取消的推理调用
async inferWithTimeout(input: Float32Array, task: TaskDescriptor,
timeoutMs: number = 5000): Promise<InferenceResult> {
// 创建AbortController用于取消
const abortController = new AbortController();
const timeoutPromise = new Promise<never>((_, reject) => {
setTimeout(() => {
abortController.abort();
reject(new Error('推理超时'));
}, timeoutMs);
});
const inferPromise = this.executor!.infer(input, task);
// 超时和推理竞争
return Promise.race([inferPromise, timeoutPromise]);
}
坑2:端云模型版本不一致
问题:端侧模型是v1.2,云端模型已经更新到v2.0,两边输出的特征维度或语义不同,融合结果完全错误。
解决方案:在推理请求中携带模型版本号,云端根据版本号选择对应模型。或者约定版本兼容策略——v2.0必须向后兼容v1.2的输出格式。
坑3:缓存导致隐私风险
问题:推理结果缓存在设备上,如果缓存了人脸识别结果,手机被root后缓存文件可能泄露用户身份信息。
解决方案:敏感推理结果不缓存,或使用设备安全存储(HarmonyOS的HUKS密钥管理服务)加密缓存。
坑4:网络切换导致推理中断
问题:WiFi切蜂窝时,正在进行的云端推理TCP连接断开,请求失败。
解决方案:监听网络变化事件,网络切换时自动重试推理请求,或临时切换到端侧推理。
坑5:端云并行时的资源竞争
问题:端云并行推理时,端侧推理占用NPU和内存,同时云端推理占用网络带宽和CPU(序列化数据),可能导致设备发热和卡顿。
解决方案:限制并行推理的并发度,设置资源使用上限。在低电量或高温时自动降级为串行推理。
五、HarmonyOS 6适配
5.1 API差异
| 功能 | HarmonyOS 5.0 | HarmonyOS 6 Beta |
|---|---|---|
| AI推理 | @ohos.ai.mindSpore |
@ohos.ai.inferenceEngine(重构) |
| 网络监测 | @ohos.net.connection |
新增网络质量实时评估API |
| 设备状态 | 分散在各模块 | 统一DeviceStatusManager |
| 安全存储 | @ohos.security.huks |
新增AI推理结果专用安全缓存 |
| 分布式通信 | @ohos.data.distributedData |
新增推理数据专用高速通道 |
5.2 迁移指南
// HarmonyOS 5.0 - MindSpore推理
import mindSpore from '@ohos.ai.mindSpore';
const model = await mindSpore.createModel(modelPath);
// HarmonyOS 6 - 统一推理引擎
import { inferenceEngine } from '@kit.AiKit';
const model = await inferenceEngine.loadModel({
uri: modelPath,
accelerator: inferenceEngine.Accelerator.NPU, // 指定NPU加速
precision: inferenceEngine.Precision.FP16 // FP16精度
});
// 新增:网络质量实时评估
import { networkQuality } from '@kit.NetworkKit';
const quality = await networkQuality.assess();
// quality.score: 0-100 网络质量评分
// quality.recommendPath: 'edge' | 'cloud' | 'auto' 推荐推理路径
5.3 HarmonyOS 6新增特性
- 智能推理调度器:系统级端云协同调度,自动选择最优推理路径
- 推理结果安全缓存:基于TEE的加密缓存,防泄露
- 分布式推理集群:多设备组成推理集群,手机NPU+平板NPU协同计算
- 网络质量预测:基于历史数据预测未来30秒网络质量,提前决策推理路径
- 自适应模型切换:根据设备状态动态切换不同大小的端侧模型
六、总结
| 知识点 | 核心内容 |
|---|---|
| 端云协同架构 | 决策引擎+推理执行+结果融合三层架构 |
| 智能分流决策 | 多因子加权评分,综合考虑复杂度/网络/算力/隐私/延迟/电量 |
| 端云并行推理 | 同时发起端侧和云端推理,取更优结果 |
| 端侧优先/云端优先 | 带回退的推理策略,兼顾速度和精度 |
| 结果缓存 | LRU缓存+TTL过期,减少重复推理开销 |
| 在线学习 | 根据历史决策结果自适应调整分流阈值 |
| 隐私保护 | 敏感数据不缓存,安全存储加密 |
| HarmonyOS 6 | 系统级推理调度器、分布式推理集群、网络质量预测 |
端云协同推理的本质是一种"智能路由"——不是简单地二选一,而是根据实时条件动态决策。这就像导航软件不是永远推荐同一条路线,而是根据实时路况帮你选最快的路。好的端云协同架构,应该让用户完全感知不到"推理在哪里发生",只感受到"结果又快又准"。
在HarmonyOS 6中,系统级推理调度器的引入将大大简化开发者的工作。但理解底层决策逻辑依然重要——因为只有理解了"为什么选择这条路",才能在出问题时快速定位"哪里走错了"。端云协同不是银弹,但它是当前AI落地最务实的架构选择。
- 点赞
- 收藏
- 关注作者
评论(0)