
Talk: Advancing Multimodal Retrieval & Generation, 4 pm 12/4


Advancing Multimodal Retrieval and Generation: From General to Biomedical Domains


Dr. Man Luo, Postdoctoral Research Fellow, Mayo Clinic
Monday, Dec. 4, 2023, 4:00 pm ET, via Webex and in ENGR 231

Abstract: This talk explores advances in multimodal retrieval and generation across general and biomedical domains. The first work introduces a multimodal retriever and reader pipeline for vision-based question answering, which uses image-text queries to retrieve and interpret relevant textual knowledge. The second work simplifies this approach with an efficient end-to-end retrieval model that removes the dependency on intermediate models such as object detectors. The final part presents a biomedical-focused multimodal generation model capable of classifying and explaining labels in images with text prompts. Together, these works demonstrate significant progress in integrating visual and textual data processing in diverse applications.

Bio: Dr. Man Luo is a Postdoctoral Research Fellow at Mayo Clinic, working with Dr. Imon Banerjee and Dr. Bhavik Patel. Her research lies at the intersection of information retrieval and reading comprehension within natural language processing (NLP) and multimodal domains, with the aim of retrieving and using external knowledge efficiently and in ways that generalize. She is currently interested in knowledge retrieval, multimodal understanding, and the use of LLMs and VLMs in biomedical and healthcare settings. She earned her Ph.D. in 2023 from Arizona State University, where she was advised by Dr. Chitta Baral, and has collaborated with industrial research labs at Salesforce, Meta, and Google.

Posted: December 4, 2023, 11:12 AM