About Me
Hi! I am Weimin Lyu, a final-year Ph.D. student in Computer Science at Stony Brook University, advised by Prof. Chao Chen. I am also very fortunate to collaborate with Professors Haibin Ling, Fusheng Wang, and Tengfei Ma. I am currently an Applied Scientist Intern at Amazon, working on continual pre-training and fine-tuning of foundation models (LLaMA, Mistral), with a strong emphasis on numerical and text features.
Research Interests
My research spans text-based problems (BERT variants, LLMs), image classification (CNNs, Vision Transformers, CLIP), and multimodal image-to-text generation with Vision-Language Models (BLIP-2, MiniGPT-4, LLaVA, InstructBLIP). I also have expertise in explainability for clinical language models using Electronic Health Records.
News
- 2024-07: TrojVLM is accepted by ECCV 2024! We investigate the vulnerabilities in the generative capabilities of Vision-Language Models, with a focus on image captioning and visual question answering (VQA) tasks.
- 2024-06: BadCLM is nominated for the Best Student Paper award at AMIA 2024! We investigate the vulnerabilities of clinical language models.
- 2024-03: One paper is accepted by NAACL 2024! We introduce a task-agnostic method for detecting textual backdoors, targeting a range of language models and traditional NLP tasks.
- 2023-10: TAL is accepted by EMNLP 2023!
- 2023-03: Two papers are accepted by the ICLR 2023 Workshop on BANDS!
- 2022-10: Paper “An Integrated LSTM-HeteroRGNN Model for Interpretable Opioid Overdose Risk Prediction” is accepted by Artificial Intelligence in Medicine!
- 2022-06: One paper is nominated for the Best Student Paper award at AMIA 2022! We propose a multimodal transformer to fuse clinical notes and traditional EHR data for interpretable mortality prediction. AMIA is the world’s premier meeting for research and practice in biomedical and health informatics.
- 2022-04: Paper “A Study of the Attention Abnormality in Trojaned BERTs” is accepted by NAACL 2022!
- 2020-09: Started my Ph.D. in Computer Science at Stony Brook University!
Industry Experience
Applied Scientist Intern, Full Time
- Focused on foundation model training, with a strong emphasis on numerical and text features.
- Developed the entire pre-training and fine-tuning pipeline, supporting both small-scale and large-scale model training.
- Developed strategies to address real-world, multi-task Amazon use cases.
- Delivered a model for business review, aimed at production launch.
Selected Publications
The full publication list is available on Google Scholar.
Preprint
Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen
Tech Report
Conference/Workshop/Journal
Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025)
Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
The 18th European Conference on Computer Vision (ECCV 2024)
[ECCV]
Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2024)
[arXiv]
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
Findings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
[NAACL]
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) (A short version was accepted as an oral presentation at the ICLR 2023 Workshop on BANDS)
[EMNLP][Code]
Xinyu Dong, Rachel Wong, Weimin Lyu, Kayley Abell-Hart, Janos G Hajagos, Richard N Rosenthal, Chao Chen, Fusheng Wang
Artificial Intelligence in Medicine (AIIM 2022)
[AIIM]
Weimin Lyu, Xinyu Dong, Rachel Wong, Songzhu Zheng, Kayley Abell-Hart, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2022) (Student Paper Finalist, equivalent to a Best Student Paper nomination)
[AMIA][Code]