How much do you know about TensorFlow 3D?

大道无形, posted 2021/03/11 15:56:47
[Abstract] TF 3D overview

TF 3D: background and significance

The rise of AR

Apple and Google launched the augmented reality platforms ARKit and ARCore.

Samsung brought back the time-of-flight (ToF) sensor for its Galaxy Note 10 and Galaxy S10 5G.

Google also made a brief foray into radar with Project Soli in its Pixel 4.

After its breakthrough with the TrueDepth front camera, Apple added LiDAR sensors to its latest flagship lines, the iPhone 12 Pro and iPad Pro.

Machine learning is essential for creating advanced AR experiences.

Given its focus on AI research, Google plays as important a role in the future of AR as Apple, Facebook, Snap, and Microsoft.

TF 3D overview

Provides 3D operations and tools

TF 3D provides a set of popular operations, loss functions, data processing tools, models and metrics that enable the broader research community to develop, train and deploy state-of-the-art 3D scene understanding models.

Supports three major classes of 3D algorithms

TF 3D contains training and evaluation pipelines for state-of-the-art 3D semantic segmentation, 3D object detection and 3D instance segmentation, with support for distributed training.

Provides dataset management, with usage examples for three classic datasets

It offers a unified dataset specification and configuration for training and evaluation of the standard 3D scene understanding datasets. It currently supports the Waymo Open, ScanNet, and Rio datasets.

3D sparse convolutional networks

Background on sparse convolution

3D sensor data often consists of a scene that contains a set of objects of interest (e.g. cars, pedestrians, etc.) surrounded mostly by open space, which is of limited (or no) interest. As such, 3D data is inherently sparse.
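
To make the sparsity claim concrete, here is a minimal NumPy sketch that voxelizes a point cloud and measures how much of the dense bounding grid is actually occupied. The synthetic two-cluster scene, the function name voxel_occupancy and the 0.25 voxel size are assumptions made for illustration; none of them come from TF 3D.

```python
import numpy as np

def voxel_occupancy(points, voxel_size):
    """Return occupied voxel indices and the fraction of the dense grid they fill."""
    # Map each point to an integer voxel coordinate relative to the scene minimum.
    indices = np.floor((points - points.min(axis=0)) / voxel_size).astype(np.int64)
    occupied = np.unique(indices, axis=0)
    # Shape of the dense grid that would be needed to cover the same extent.
    grid_shape = indices.max(axis=0) + 1
    fraction = occupied.shape[0] / float(np.prod(grid_shape))
    return occupied, fraction

# Synthetic scene (assumption): two small objects separated by open space.
rng = np.random.default_rng(0)
car = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.5, size=(500, 3))
pedestrian = rng.normal(loc=[20.0, 10.0, 0.0], scale=0.2, size=(200, 3))
points = np.concatenate([car, pedestrian], axis=0)

occupied, fraction = voxel_occupancy(points, voxel_size=0.25)
print(f"{occupied.shape[0]} occupied voxels, {fraction:.4%} of the dense grid")
```

A dense 3D convolution would visit every cell of that grid, while a sparse convolution only needs to process the occupied voxels, which is the motivation for the sparse networks described in this section.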


3D dataset support

Frame 

frame-level data such as color and depth camera images, point cloud, camera intrinsics, and ground-truth semantic and instance segmentation annotations

Scene 

point-cloud/mesh data of a whole scene, plus lightweight information about all frames in the scene
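
The Frame/Scene split can be pictured with two small containers. This is only a sketch of the idea: the class and field names below are hypothetical and do not reproduce the actual TF 3D dataset specification.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np

@dataclass
class Frame:
    """Hypothetical per-frame record (field names are illustrative)."""
    frame_id: str
    color_image: np.ndarray        # [H, W, 3]
    depth_image: np.ndarray        # [H, W]
    points: np.ndarray             # [N, 3] point cloud
    camera_intrinsics: np.ndarray  # [3, 3]
    semantic_labels: np.ndarray    # [N] ground-truth class per point
    instance_ids: np.ndarray       # [N] ground-truth instance per point

@dataclass
class Scene:
    """Hypothetical per-scene record: whole-scene geometry plus frame references."""
    scene_id: str
    points: np.ndarray                                   # [M, 3] whole-scene point cloud
    frame_ids: List[str] = field(default_factory=list)   # lightweight links to frames

frame = Frame(
    frame_id="f0",
    color_image=np.zeros((4, 4, 3), dtype=np.uint8),
    depth_image=np.zeros((4, 4), dtype=np.float32),
    points=np.zeros((10, 3), dtype=np.float32),
    camera_intrinsics=np.eye(3, dtype=np.float32),
    semantic_labels=np.zeros(10, dtype=np.int32),
    instance_ids=np.zeros(10, dtype=np.int32),
)
scene = Scene(scene_id="s0", points=frame.points.copy(), frame_ids=[frame.frame_id])
print(scene.scene_id, scene.frame_ids)
```

Keeping only frame identifiers in the scene record keeps it lightweight: heavy per-frame tensors are loaded only when a frame is actually needed.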

Dataset example: Waymo Open Dataset

Scene 

each 20-second snippet of a car's journey

Frame

data collected within a small time range (approximately a single timestamp)

vehicle frame poses are provided w.r.t. a global world frame for a particular scene
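
Because each frame's pose is given relative to a global world frame, points observed in the vehicle frame can be moved into a common scene frame with a rigid transform. The sketch below assumes the pose is a 4x4 homogeneous vehicle-to-world matrix; the specific pose values and the function name vehicle_to_world are made up for illustration.

```python
import numpy as np

def vehicle_to_world(points_vehicle, pose):
    """Apply a 4x4 vehicle-to-world pose to an [N, 3] array of points."""
    rotation, translation = pose[:3, :3], pose[:3, 3]
    # Row-vector form of R @ p + t for every point.
    return points_vehicle @ rotation.T + translation

# Hypothetical pose: vehicle rotated 90 degrees about z, located at (10, 5, 0).
theta = np.pi / 2
pose = np.array([
    [np.cos(theta), -np.sin(theta), 0.0, 10.0],
    [np.sin(theta),  np.cos(theta), 0.0, 5.0],
    [0.0,            0.0,           1.0, 0.0],
    [0.0,            0.0,           0.0, 1.0],
])

points_vehicle = np.array([[1.0, 0.0, 0.0]])  # one unit ahead of the vehicle
points_world = vehicle_to_world(points_vehicle, pose)
print(points_world)
```

Applying each frame's own pose this way lets point clouds from different timestamps of the same scene be accumulated in one coordinate system.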


Dataset example: RIO: 3D Object Instance Re-Localization

multiple 3D snapshots of naturally changing indoor environments, designed for benchmarking emerging tasks such as long-term SLAM, scene change detection and object instance re-localization

 


TF3D Models (1)

Model definition

Input: a call function that receives a dictionary of input tensors

Execution: runs a deep network, potentially with some postprocessing, and applies the loss functions to the outputs

Output: a dictionary of the computed output tensors
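
The dictionary-in, dictionary-out contract can be sketched without any deep learning framework. ToyModel below is a hypothetical stand-in written in NumPy, not a TF 3D class; the key names ("voxel_features", "labels", "logits") and the single linear layer are assumptions chosen to keep the example short.

```python
import numpy as np

class ToyModel:
    """Hypothetical stand-in for a dictionary-in, dictionary-out model."""

    def __init__(self, num_features, num_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.normal(size=(num_features, num_classes))

    def __call__(self, inputs, training=False):
        # 1. Run the network (here a single linear layer) on the input tensors.
        logits = inputs["voxel_features"] @ self.weights
        outputs = {"logits": logits}
        # 2. Postprocess: pick the most likely class per voxel.
        outputs["predicted_class"] = logits.argmax(axis=1)
        # 3. Apply the loss when training and labels are available.
        if training and "labels" in inputs:
            shifted = logits - logits.max(axis=1, keepdims=True)
            log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
            rows = np.arange(logits.shape[0])
            outputs["loss"] = -log_probs[rows, inputs["labels"]].mean()
        return outputs

inputs = {"voxel_features": np.ones((4, 8)), "labels": np.array([0, 1, 2, 0])}
outputs = ToyModel(num_features=8, num_classes=3)(inputs, training=True)
print(sorted(outputs))
```

Passing dictionaries in both directions means new inputs or outputs can be added by key without changing the call signature.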

Deep learning network

Input: a sparse set of voxel indices

Output: the length, height, width, rotation matrix, center, and logits for each voxel

Postprocessing: predictions are aggregated and a set of objects is proposed
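
To see how the per-voxel outputs describe an object, the sketch below turns one predicted length, height, width, rotation matrix and center into the eight corners of a 3D box. The axis convention (length along local x, width along local y, height along local z) is an assumption made for this example, and box_corners is a hypothetical helper rather than a TF 3D function.

```python
import numpy as np

def box_corners(length, height, width, rotation, center):
    """Return the [8, 3] corners of an oriented 3D box."""
    # Assumed convention: length -> local x, width -> local y, height -> local z.
    half_extent = np.array([length, width, height]) / 2.0
    signs = np.array([[sx, sy, sz]
                      for sx in (-1.0, 1.0)
                      for sy in (-1.0, 1.0)
                      for sz in (-1.0, 1.0)])
    # Scale the unit corners, rotate them, then move them to the box center.
    return (signs * half_extent) @ rotation.T + center

corners = box_corners(
    length=4.0, height=2.0, width=2.0,
    rotation=np.eye(3),
    center=np.array([1.0, 2.0, 3.0]),
)
print(corners.min(axis=0), corners.max(axis=0))
```

In a full detector, many voxels belonging to the same object predict similar boxes, so the postprocessing step has to merge or suppress these duplicates before proposing objects.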

TF3D Models (2)

UNet network

An encoder that downsamples the input sparse voxels.

A bottleneck.

A decoder (with skip connections) that upsamples the sparse voxel features back to the original resolution.
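
The encoder, bottleneck and decoder structure can be illustrated with a deliberately simplified toy. The sketch below works on a dense 1D feature array with average pooling and nearest-neighbour upsampling; it is not a sparse convolution and not the TF 3D implementation, and it only shows how skip connections carry encoder features to the decoder at matching resolutions.

```python
import numpy as np

def downsample(x):
    # Average neighbouring pairs of rows: [N, C] -> [N // 2, C].
    return x.reshape(-1, 2, x.shape[1]).mean(axis=1)

def upsample(x):
    # Nearest-neighbour upsampling: [N, C] -> [2 * N, C].
    return np.repeat(x, 2, axis=0)

def toy_unet(x, num_levels=2):
    skips = []
    for _ in range(num_levels):        # encoder: remember features, then downsample
        skips.append(x)
        x = downsample(x)
    x = np.tanh(x)                     # bottleneck: any transform at lowest resolution
    for skip in reversed(skips):       # decoder: upsample and add the matching skip
        x = upsample(x) + skip
    return x

features = np.arange(16, dtype=np.float64).reshape(8, 2)
out = toy_unet(features)
print(out.shape)
```

The output has the same resolution as the input, which is what allows a per-voxel prediction for every input voxel.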

Tuning

The computation scale can be adjusted by changing SparseConvUNet's conv_filter_size and encoder/bottleneck/decoder_dimensions parameters.

The HourGlass network (tf3d/layers/sparse_voxel_hourglass.py) consists of one or more stacked UNet networks.
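
A rough way to reason about these knobs is to count convolution weights: a 3D convolution with filter size k, c_in input channels and c_out output channels has k^3 * c_in * c_out weights (ignoring biases). The helper names below are hypothetical, and treating the dimensions as a flat list of channel widths is a simplifying assumption that may not match how SparseConvUNet actually structures its arguments.

```python
def conv3d_weights(filter_size, in_channels, out_channels):
    """Weight count of one 3D convolution, ignoring biases."""
    return filter_size ** 3 * in_channels * out_channels

def stack_weights(filter_size, in_channels, dimensions):
    """Total weights of consecutive 3D convolutions with the given output widths."""
    total = 0
    for out_channels in dimensions:
        total += conv3d_weights(filter_size, in_channels, out_channels)
        in_channels = out_channels
    return total

small = stack_weights(filter_size=3, in_channels=16, dimensions=[32, 64])
large = stack_weights(filter_size=5, in_channels=16, dimensions=[32, 64])
print(small, large)
```

Under these assumptions, growing the filter size from 3 to 5 multiplies the weight count by 125/27 (about 4.6), while doubling a layer's width doubles the cost of that layer and of the layer that consumes its output.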

Output head

The output head takes (sparse) voxel features as input and produces per-voxel predictions with a specific dimension.

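[Copyright notice] This article is original content by a Huawei Cloud community user. When reposting, you must cite the source (Huawei Cloud community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find content in this community that appears to be plagiarized, please report it by email with supporting evidence; once verified, the community will promptly remove the infringing content. Report email: cloudbbs@huaweicloud.com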