About Me
Hi! I am currently an Applied Scientist at Amazon Search, where I focus on leveraging Large Language Models (LLMs) to enhance the shopping experience. I completed my PhD Study in Computer Science at Stony Brook University, advised by Prof. Chao Chen. I have also been fortunate to collaborate with Professors Haibin Ling, Fusheng Wang, and Tengfei Ma.
Research Interests
My research centers on building safe, efficient, and interpretable AI systems across language, vision, and multimodal domains:
- AI Security & Trustworthiness: Backdoor attack and detection techniques for Language Models (e.g., BERT variations), Vision Models (e.g., ViT) and Vision-Language Models (e.g., CLIP, BLIP-2, LLaVA, etc).
- Efficient Multimodal Learning: Token compression and resource-efficient model design for whole slide pathology image analysis using vision-language models like LLaVA.
- Model Explainability in Clinical AI: Interpretable modeling of electronic health records using transformers and multimodal fusion.
- LLM-based Personalization: Designing and deploying end-to-end infrastructure for online personalized navigation models, with a focus on LLMs training, real-time signal compression, and customer understanding within Amazon shopping experience.
News
- 2025-08: Successfully defended my PhD thesis!
- 2025-06: I’m honored to receive the prestigious Catacosinos Fellowship, awarded annually to outstanding PhD students at Stony Brook University - only three winners this year!
- 2025-05: Two papers accepted to ACL 2025! Congrats to all collaborators! CalD, led by Chenlu and Prof. Ritwik Banerjee, proposes an efficient framework for detecting deviant or nuanced language using smaller models. RPA Evaluation, led by Chaoran, Bingsheng and Prof. Dakuo Wang, presents a comprehensive guideline for evaluating LLM-based role-playing agents.
- 2025-01: Three papers are accepted by ICLR 2025, including one first-authored paper: VLOOD!
- 2024-07: My first-authored paper, TrojVLM, is accepted by ECCV 2024! We investigate the vulnerabilities in the generative capabilities of Vision-Language Models, with a focus on image captioning and visual question answering (VQA) tasks.
- 2024-07: One paper is accepted by WACV 2025!
- 2024-06: My first-authored paper, BadCLM, is nominated as the Best Student Paper by AMIA 2024! We investigate the clinical language model’s vulnerabilities.
- 2024-03: One first-authored paper is accepted by NAACL 2024! We introduce a task-agnostic method for detecting textual backdoors, targeting a range of language models and traditional NLP tasks.
- 2023-10: My first-authored TAL is accepted by EMNLP 2023!
- 2023-03: Two papers are accepted by ICLR 2023 Workshop on BANDS!
- 2022-10: Paper “An Integrated LSTM-HeteroRGNN Model for Interpretable Opioid Overdose Risk Prediction” is accepted by Artificial Intelligence in Medicine!
- 2022-06: One first-authored paper is nominated as the Best Student Paper by AMIA 2022! We propose a multimodal transformer to fuse clinical notes and traditional EHR data for interpretable mortality prediction. AMIA is the world’s premier meeting for research and practice of biomedical and health informatics.
- 2022-04: One first-authored paper “A Study of the Attention Abnormality in Trojaned BERTs” is accepted by NAACL 2022!
- 2020-09: Start my Computer Science Ph.D. journey at Stony Brook University!
Industry Experience

Amazon, Seattle, USA (May 2025 - Present)
Applied Scientist
- Currently working on LLM-based personalization solutions to enhance customer understanding and improve navigation relevance within Amazon's shopping experience.
- Focus areas include building end-to-end modeling and training infrastructure for personalized navigation models in an online setting, and data-driven personalization signals compression.
Applied Scientist
- Currently working on LLM-based personalization solutions to enhance customer understanding and improve navigation relevance within Amazon's shopping experience.
- Focus areas include building end-to-end modeling and training infrastructure for personalized navigation models in an online setting, and data-driven personalization signals compression.

Amazon, Seattle, USA (May 2024 - May 2025)
Applied Scientist Intern
- Led the development of the first-generation seller foundation model, a transformer decoder-based LLM designed for seller-related data, with a strong emphasis on tabular inputs and numerical formats.
- Proposed and implemented an end-to-end two-stage training pipeline, including continuous pre-training (LM loss + attribution-level contrastive loss) and downstream fine-tuning (classification, multi-task, regression).
- Built the full modeling and training infrastructure.
- Delivered production-ready model now running in shadow mode, pre-staged for full launch.
Applied Scientist Intern
- Led the development of the first-generation seller foundation model, a transformer decoder-based LLM designed for seller-related data, with a strong emphasis on tabular inputs and numerical formats.
- Proposed and implemented an end-to-end two-stage training pipeline, including continuous pre-training (LM loss + attribution-level contrastive loss) and downstream fine-tuning (classification, multi-task, regression).
- Built the full modeling and training infrastructure.
- Delivered production-ready model now running in shadow mode, pre-staged for full launch.
Selected Publications
Full publications can be found in Google Scholar.
Conference/Workshop/Journal

Efficient Whole Slide Pathology VQA via Token Compression
Weimin Lyu, Qingqiao Hu, Kehan Qi, Zhan Shi, Wentao Huang, Saumya Gupta, Chao Chen
In Submission
[ArXiv]
Weimin Lyu, Qingqiao Hu, Kehan Qi, Zhan Shi, Wentao Huang, Saumya Gupta, Chao Chen
In Submission
[ArXiv]

Class Distillation with Mahalanobis Contrast: An Efficient Training Paradigm for Pragmatic Language Understanding Tasks
Chenlu Wang, Weimin Lyu, Ritwik Banerjee
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)
[ACL]
Chenlu Wang, Weimin Lyu, Ritwik Banerjee
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)
[ACL]

Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents
Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang
Findings of the Association for Computational Linguistics: ACL 2025 (ACL 2025)
[ACL]
Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang
Findings of the Association for Computational Linguistics: ACL 2025 (ACL 2025)
[ACL]

Geometry of Long-Tailed Representation Learning: Rebalancing Features for Skewed Distributions
Lingjie Yi, Jiachen Yao, Weimin Lyu, Haibin Ling, Raphael Douady, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]
Lingjie Yi, Jiachen Yao, Weimin Lyu, Haibin Ling, Raphael Douady, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]

ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language
Yuxin Wang, Xiaomeng Zhu*, Weimin Lyu*, Saeed Hassanpour, Soroush Vosoughi
The Thirteenth International Conference on Learning Representations (ICLR 2025)(Spotlight)
[ICLR]
Yuxin Wang, Xiaomeng Zhu*, Weimin Lyu*, Saeed Hassanpour, Soroush Vosoughi
The Thirteenth International Conference on Learning Representations (ICLR 2025)(Spotlight)
[ICLR]

Backdooring Vision-Language Models with Out-Of-Distribution Data
Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]
Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]

PivotAlign: Improve Semi-Supervised Learning by Learning Intra-Class Heterogeneity and Aligning with Pivots
Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025)
[WACV]
Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025)
[WACV]

TrojVLM: Backdoor Attack Against Vision Language Models
Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
The 18th European Conference on Computer Vision (ECCV 2024)
[ECCV]
Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
The 18th European Conference on Computer Vision (ECCV 2024)
[ECCV]

BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2024) (Best Student Paper Finalist)
[arXiv]
Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2024) (Best Student Paper Finalist)
[arXiv]

Task-Agnostic Detector for Insertion-Based Backdoor Attacks
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
The Findings of 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
[NAACL]
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
The Findings of 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
[NAACL]

Attention-Enhancing Backdoor Attacks Against BERT-based Models
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
The Findings of 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) (A short version is accepted as Oral at ICLR 2023 Workshop on BANDS)
[EMNLP][Code]
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
The Findings of 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) (A short version is accepted as Oral at ICLR 2023 Workshop on BANDS)
[EMNLP][Code]

An Integrated LSTM-HeteroRGNN Model for Interpretable Opioid Overdose Risk Prediction
Xinyu Dong, Rachel Wong, Weimin Lyu, Kayley Abell-Hart, Janos G Hajagos, Richard N Rosenthal, Chao Chen, Fusheng Wang
Artificial Intelligence in Medicine (AIIM 2022)
[AIIM]
Xinyu Dong, Rachel Wong, Weimin Lyu, Kayley Abell-Hart, Janos G Hajagos, Richard N Rosenthal, Chao Chen, Fusheng Wang
Artificial Intelligence in Medicine (AIIM 2022)
[AIIM]

A Multimodal Transformer: Fusing Clinical Notes With Structured EHR Data for Interpretable In-Hospital Mortality Prediction
Weimin Lyu, Xinyu Dong, Rachel Wong, Songzhu Zheng , Kayley Abell-Hart, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2022) (Best Student Paper Finalist)
[AMIA][Code]
Weimin Lyu, Xinyu Dong, Rachel Wong, Songzhu Zheng , Kayley Abell-Hart, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2022) (Best Student Paper Finalist)
[AMIA][Code]