DALL-E 2: Hierarchical Text-Conditional Image Generation with CLIP Latents


OpenAI Text2Image based on CLIP and Diffusion Model

[Figure: DALL-E 2 overview. Above the dashed line: CLIP training. Below the dashed line: text-to-image generation.]

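The part above the dashed line shows how CLIP is trained. The pretrained CLIP model stays frozen while images are generated.

The part below the dashed line shows how an image is generated with CLIP's text encoder. The text embedding of the input caption is fed to a prior (autoregressive or diffusion) to obtain an image embedding, and that image embedding is passed to a diffusion model (the decoder, a modified GLIDE) to generate the image.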

Training the prior: given an image-caption pair $(x, y)$ and an already trained CLIP model (text encoder and image encoder), the caption $y$ is fed to the text encoder to obtain the text embedding $z_t$, and the image $x$ is fed to the image encoder to obtain the image embedding $z_i$. Let $\hat{z}_i$ denote the image embedding that the prior predicts from $z_t$. The prior is updated so that $\hat{z}_i$ and $z_i$ are as close as possible. Once trained, the prior is chained after CLIP's text encoder, so that an input caption $y$ yields the corresponding image embedding $z_i$.
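The training objective described above can be illustrated with a minimal sketch. It assumes a plain linear map as a stand-in for the prior and random vectors as stand-ins for the CLIP embeddings; the actual DALL-E 2 prior is an autoregressive or diffusion model, so the function name `train_toy_prior` and all hyperparameters here are hypothetical and only show the "make the predicted embedding close to the true one" idea.

```python
import numpy as np

def train_toy_prior(z_t, z_i, lr=0.5, steps=200):
    """Fit a linear map W so that z_t @ W approximates z_i under an MSE loss.

    Assumption: a linear map stands in for the prior network.
    """
    n, d_text = z_t.shape
    d_img = z_i.shape[1]
    W = np.zeros((d_text, d_img))
    losses = []
    for _ in range(steps):
        pred = z_t @ W                    # predicted image embeddings
        err = pred - z_i                  # gap to the CLIP image embeddings
        losses.append(float(np.mean(err ** 2)))
        grad = 2.0 * z_t.T @ err / (n * d_img)  # gradient of the mean squared error w.r.t. W
        W -= lr * grad                    # gradient descent update of the toy prior
    return W, losses

# Synthetic stand-ins for frozen CLIP outputs (not real CLIP embeddings).
rng = np.random.default_rng(0)
z_t = rng.normal(size=(256, 8))           # "text embeddings"
true_map = rng.normal(size=(8, 8))
z_i = z_t @ true_map                      # "image embeddings"

W, losses = train_toy_prior(z_t, z_i)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.6f}")
```

Only `W` is updated in the loop; the embeddings are treated as fixed inputs, which mirrors the frozen CLIP encoders.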

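In DALL-E 2, the authors tried two kinds of prior: an autoregressive (AR) prior and a diffusion prior. In their experiments the two performed similarly, and because the diffusion prior is computationally more efficient, it was chosen as the final prior module.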

$$P(x \mid y) = P(x, z_i \mid y) = P(x \mid z_i, y)\, P(z_i \mid y)$$

$P(z_i \mid y)$ denotes the prior, which takes the caption $y$ and produces the image embedding $z_i$.

$P(x \mid z_i, y)$ denotes the decoder, which takes the image embedding $z_i$ and generates the image $x$, with generation also conditioned on the caption $y$.
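The two-stage generation process can be sketched as a composition of three functions: encode the caption, sample an image embedding from the prior, then sample an image from the decoder. Everything below is a toy stand-in under stated assumptions: `toy_text_encoder`, `toy_prior`, and `toy_decoder` are hypothetical placeholders that return random-but-reproducible arrays, not CLIP, the diffusion prior, or the GLIDE-based decoder, and the caption is passed to the decoder through its text embedding.

```python
import numpy as np

def generate(caption, text_encoder, prior, decoder, rng):
    """Two-stage sampling: caption -> z_t -> z_i -> image."""
    z_t = text_encoder(caption)       # frozen text encoder (CLIP in the real system)
    z_i = prior(z_t, rng)             # stage 1: sample an image embedding given the caption
    return decoder(z_i, z_t, rng)     # stage 2: sample an image given z_i and the caption

def toy_text_encoder(caption, dim=8):
    # Hypothetical stand-in: a deterministic pseudo-embedding derived from character codes.
    seed = sum(ord(ch) for ch in caption)
    return np.random.default_rng(seed).normal(size=dim)

def toy_prior(z_t, rng):
    # Hypothetical stand-in: the text embedding plus a small amount of noise.
    return z_t + 0.1 * rng.normal(size=z_t.shape)

def toy_decoder(z_i, z_t, rng, size=4):
    # Hypothetical stand-in: a tiny "image" whose channel means depend on both embeddings.
    mean = z_i[:3] + z_t[:3]
    return np.tanh(mean + 0.1 * rng.normal(size=(size, size, 3)))

image = generate("a corgi playing a trumpet", toy_text_encoder, toy_prior,
                 toy_decoder, np.random.default_rng(0))
print(image.shape)
```

Passing the random generator explicitly keeps both sampling stages reproducible, and each stage can be swapped out independently.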
