I work in computer vision, with a particular focus on multimodal AI. My current research lies at the intersection of language and vision, which has strong potential for tackling intricate real-world problems such as detecting fake images, text, and speech. I am also enthusiastic about applying computer vision, multimodal AI, and reinforcement learning to the medical domain. My research interests extend to using these technologies to build robots that assist people in their daily lives.
I recently completed my Master's degree in AI at Boston University, where I had the privilege of being advised by Dr. Bryan Plummer. During my time there, I collaborated with Dr. Plummer on a project focused on image manipulation detection using text. Additionally, I contributed to the research efforts of SemaFor, led by Dr. Kate Saenko and Dr. Bryan Plummer, where I concentrated on synthetic image detection. The goal of this work was to use deep learning to combat the spread of fake news on social media platforms.
Prior to my graduate studies, I earned a Bachelor's degree in Engineering Science from the Indian Institute of Technology (IIT) Hyderabad in India. I currently work as a Software Developer at Barclays, where I contribute to innovative financial applications. Previously, as a Software Development Engineer II at Oravel Stays Pvt Ltd (OYO), I worked on microservice development and architectural improvements.
Publications
MIMIC: Multimodal Image Manipulation Dataset (Submitted to CVPR’24)
D. Appapogu, K. Nichols, G. Biamby, A. Rohrbach, B. Plummer, “MIMIC: Multimodal Image Manipulation with Rich Context,” submitted to Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
(Currently Under Review)
D. Spoorthy, S. R. Manne, V. Dhyani, et al., “Automatic identification of mixed retinal cells in time-lapse fluorescent microscopy images using high-dimensional DBSCAN,” in 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019, pp. 4783–4786. doi: 10.1109/EMBC.2019.8857375.
Academic Research
Synthetic Media Attribution
This project addresses the misuse of synthetic media by developing techniques that attribute generated images to the specific generator model that created them. We used an end-to-end classifier with adversarial strategies to improve model robustness, aiming to deepen our understanding of generative models and mitigate the risks of synthetic media proliferation. The goal is to combat misinformation and preserve the integrity of digital information.
Manipulating Stochastic Gradient Descent with Data Ordering Attacks
Performed additional experiments building on existing research into data ordering attacks, which exploit the stochastic nature of SGD and the influence of data order on the training process. The attacks included batch reordering, batch reshuffling, and backdoor methods, which involve training surrogate models simultaneously with source models to reorder data points based on losses or gradients.
Multilingual Emoji Prediction
Implemented and fine-tuned various discriminative and generative Large Language Models (LLMs) on English and Spanish tweets. The dataset comprised 500K English tweets and 100K Spanish tweets; the models were tested on 50K English tweets and 10K Spanish tweets and surpassed the SemEval competition baseline in overall accuracy.
COVID-19 Instagram posts emotion detection using Sentiment140 dataset
Built machine learning models for emotion detection on COVID-19 Instagram posts, using Naive Bayes, logistic regression, and BERT-base. Our analysis uncovered a correlation between the emotions of fear and anger and the presence of an East Asian person in the images.
Work Experience
Barclays Software Developer (2023 - current)
OYO Senior Software Developer (2019 - 2021)