
Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to capture and follow. Multimodal large language models (MLLMs) show promising capabilities in cross-modal understanding and visual-aware response generation via LMs. We investigate how MLLMs facilitate edit instructions and present MLLM-Guided Image Editing (MGIE). MGIE learns to derive expressive instructions and provides explicit guidance. The editing model jointly captures this visual imagination and performs manipulation through end-to-end training. We evaluate various aspects of Photoshop-style modification, global photo optimization, and local editing. Extensive experimental results demonstrate that expressive instructions are crucial to instruction-based image editing, and our MGIE can lead to a notable improvement in automatic metrics and human evaluation while maintaining competitive inference efficiency.
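The flow described above (a brief instruction is first expanded into an expressive one, which then guides the edit) can be sketched in a few lines of Python. This is a toy illustration only, not the MGIE implementation: `guided_edit`, `toy_derive`, `toy_edit`, and `EditResult` are hypothetical names, the guidance is modeled as a plain string, and the end-to-end training mentioned in the abstract is not shown. In a real system the two stubs would be replaced by an MLLM and a learned editing model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins: none of these names come from the MGIE codebase.
Image = List[List[float]]  # toy grayscale image, pixel values in [0, 1]


@dataclass
class EditResult:
    expressive_instruction: str
    image: Image


def guided_edit(
    image: Image,
    instruction: str,
    derive_instruction: Callable[[Image, str], str],
    apply_edit: Callable[[Image, str], Image],
) -> EditResult:
    """Expand the brief instruction first, then edit using the expanded one."""
    expressive = derive_instruction(image, instruction)
    edited = apply_edit(image, expressive)
    return EditResult(expressive, edited)


# Toy stubs so the sketch runs end to end (assumption: a real pipeline would
# call an MLLM and a learned editing model here).
def toy_derive(image: Image, instruction: str) -> str:
    # Look at the image so the expanded instruction is visually grounded.
    mean = sum(map(sum, image)) / sum(len(row) for row in image)
    tone = "dark" if mean < 0.5 else "bright"
    return f"{instruction}: the image is {tone}, so raise brightness by 0.2"


def toy_edit(image: Image, expressive: str) -> Image:
    # Apply a fixed brightness shift when the expanded instruction asks for it.
    delta = 0.2 if "raise brightness" in expressive else 0.0
    return [[min(1.0, px + delta) for px in row] for row in image]


result = guided_edit(
    [[0.1, 0.3], [0.2, 0.4]], "make it brighter", toy_derive, toy_edit
)
```

The point of the structure is that the editor never sees the brief instruction directly; it only receives the expanded guidance, which is where the visual context enters.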
