Depth Anything
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model dealing with any images under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to enforce the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos. It demonstrates impressive generalization ability. Further, through fine-tuning it with metric depth information from NYUv2 and KITTI, new SOTAs are set. Our better depth model also results in a much better depth-conditioned ControlNet. All models have been released.
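The auxiliary supervision mentioned above can be illustrated with a small sketch. The text does not state the exact loss, so the cosine-similarity form below is an assumption chosen for illustration, and the names `cosine_similarity` and `feature_alignment_loss` are hypothetical. It treats each feature map as a flat list of per-pixel vectors and penalizes the depth model's features for pointing away from those of a frozen pre-trained encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_alignment_loss(student_feats, frozen_feats):
    """Mean of (1 - cosine similarity) over paired per-pixel feature vectors.

    Assumed form, not taken from the source: 0.0 when every student vector
    points the same way as its frozen counterpart, up to 2.0 when they all
    point in opposite directions.
    """
    if not student_feats or len(student_feats) != len(frozen_feats):
        raise ValueError("feature lists must be non-empty and the same length")
    total = sum(
        1.0 - cosine_similarity(s, f)
        for s, f in zip(student_feats, frozen_feats)
    )
    return total / len(student_feats)


if __name__ == "__main__":
    student = [[1.0, 0.0], [0.0, 2.0]]   # toy features from the depth model
    frozen = [[1.0, 0.0], [0.0, 5.0]]    # toy features from a frozen encoder
    print(feature_alignment_loss(student, frozen))
```

In a real training loop a term like this would be added to the depth loss with some weight and computed on tensors rather than Python lists; both the weighting and the tensor framework are left out here because the text does not specify them.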
We thank the MagicEdit team for providing some video examples for video depth estimation, and Tiancheng Shen for evaluating the depth maps with MagicEdit. The middle video is generated by MiDaS-based ControlNet, while the last video is generated by Depth Anything-based ControlNet.
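Special statement about Depth Anything
The Depth Anything information provided by this site, 鸟瑞导航, comes entirely from the internet. The site does not guarantee the accuracy or completeness of external links, and the destinations of those links are not under its actual control. When this page was indexed on September 10, 2025 at 7:03 PM, its content was legal and compliant. If the content later becomes non-compliant, please report it to the site administrator and it will be removed; 鸟瑞导航 accepts no liability.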
