CS Colloquium: “Image Synthesis using Diffusion Models” (Rupayan Mallick, Georgetown University)
Abstract
Deep generative models have dominated the literature in the vision and language domains. This has enabled researchers to develop large text-to-image generative models capable of producing photorealistic, high-fidelity images. Many image editing tasks have leveraged this capability to achieve state-of-the-art results, and textual descriptions used as instructions have been a key factor in the success of these image editing models. Generating an image from a textual description is comparatively easier than image editing: editing requires preserving most of the features of a reference (original) image while making only subtle changes based on the instruction.
In this talk, I will present work both from the literature and from our lab on (1) image editing tasks, namely inpainting and relighting, and (2) text inversion: given an image, can we find a natural-language prompt that, when given to a fixed (static) model, generates a similar image?
The event will be followed by lunch.
Zoom Link: https://georgetown.zoom.us/j/3658354070?omn=91477324327
Speaker’s Biography
Rupayan (he/him) is a Postdoctoral Fellow working with MDI and the Department of Computer Science. His research interests include Generative AI, Explainable AI, and Multimodal Learning. He is currently exploring deep generative models, such as diffusion models, for different data modalities.