A Review Analysis on Realistic Image Generation of Faces From Text Descriptions Using Multi-modal GAN Inversion
Volume: 13 - Issue: 03 - Date: 01-03-2024
Approved ISSN: 2278-1412
Published Id: IJAECESTU375 | Page No.: 6-10
Author: Ganesh Chandra
Co-Author: Dr. Anita Soni
Abstract:- Text-to-face generation, a sub-domain of text-to-image synthesis, holds significant promise for many research areas and applications, particularly public safety. Progress in this field, however, has been hindered by a scarcity of datasets, which has limited research efforts. Most existing approaches to text-to-face generation rely on partially trained generative adversarial networks (GANs), in which a pre-trained text encoder extracts semantic features from the input sentence that are then used to train the image decoder. In this study, we propose a fully trained GAN framework for generating realistic, natural images: our approach trains the text encoder and the image decoder simultaneously to improve both the accuracy and the efficiency of image generation. In addition, we contribute a novel dataset built by fusing existing datasets such as LFW and CelebA with locally curated data, labeled according to defined classes. Extensive experiments show that our fully trained GAN model generates high-quality images from input sentences, and the visual results further confirm that the approach accurately produces facial images corresponding to the given queries.
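The joint training of the text encoder and the image decoder described above can be illustrated with a minimal conditional-GAN sketch. The PyTorch code below is an assumption-laden illustration, not the authors' implementation: the module sizes, the GRU text encoder, the fully connected generator and discriminator, and the random toy data are placeholders chosen only to show how the adversarial loss back-propagates into the text encoder when it is optimized end-to-end with the generator.

```python
# Minimal sketch of jointly training a text encoder and an image decoder
# (generator) in a conditional GAN. All architectures, sizes, and data here
# are illustrative assumptions, not the paper's actual model or datasets.
import torch
import torch.nn as nn

EMB_DIM, NOISE_DIM, IMG_SIZE, VOCAB = 128, 100, 64, 5000

class TextEncoder(nn.Module):
    """Embeds a token sequence into a single sentence vector."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB_DIM)
        self.rnn = nn.GRU(EMB_DIM, EMB_DIM, batch_first=True)
    def forward(self, tokens):
        _, h = self.rnn(self.embed(tokens))
        return h.squeeze(0)                          # (B, EMB_DIM)

class Generator(nn.Module):
    """Decodes noise + sentence vector into a 3x64x64 image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + EMB_DIM, 512), nn.ReLU(),
            nn.Linear(512, 3 * IMG_SIZE * IMG_SIZE), nn.Tanh())
    def forward(self, z, txt):
        out = self.net(torch.cat([z, txt], dim=1))
        return out.view(-1, 3, IMG_SIZE, IMG_SIZE)

class Discriminator(nn.Module):
    """Scores whether an image matches its text description."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * IMG_SIZE * IMG_SIZE + EMB_DIM, 512),
            nn.LeakyReLU(0.2), nn.Linear(512, 1))
    def forward(self, img, txt):
        return self.net(torch.cat([img.flatten(1), txt], dim=1))

enc, gen, disc = TextEncoder(), Generator(), Discriminator()
# "Fully trained" here means the encoder's parameters are updated together
# with the generator's, rather than being frozen from pre-training.
opt_g = torch.optim.Adam(list(enc.parameters()) + list(gen.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

# One toy training step on random data standing in for (caption, face) pairs.
tokens = torch.randint(0, VOCAB, (8, 16))            # 8 captions, 16 tokens each
real = torch.rand(8, 3, IMG_SIZE, IMG_SIZE) * 2 - 1  # fake "real" images in [-1, 1)

# Discriminator step: matching real pairs -> 1, generated pairs -> 0.
txt = enc(tokens)
fake = gen(torch.randn(8, NOISE_DIM), txt)
d_loss = bce(disc(real, txt.detach()), torch.ones(8, 1)) + \
         bce(disc(fake.detach(), txt.detach()), torch.zeros(8, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator + encoder step: fool the discriminator; gradients reach enc too.
txt_g = enc(tokens)
g_loss = bce(disc(gen(torch.randn(8, NOISE_DIM), txt_g), txt_g), torch.ones(8, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The point of contrast with a partially trained pipeline is the single line where opt_g is built over both enc.parameters() and gen.parameters(): the sentence embedding is shaped by the image-generation objective instead of being fixed by a separately pre-trained encoder.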
Key Words:- Realistic Image Generation, Faces, Text Descriptions, Multi-modal GAN Inversion
Area:- Engineering