About Me
I am an Applied Scientist at Amazon Search, working on search and navigation, where I design LLM- and agent-based approaches that operate at scale. My current work focuses on personalized LLM systems, including post-training, personalization effectiveness, agentic evaluation, self-evolving agents, and latency-aware model serving. I bridge research and customer-facing production by developing LLM-powered systems that improve how customers express intent and how queries are understood and reformulated.
I received my Ph.D. in Computer Science from Stony Brook University, advised by Prof. Chao Chen. I received my M.S. degree from the Chinese Academy of Sciences and my B.S. degree from Shandong University. My doctoral research studied how training objectives shape model behavior and reliability, with a focus on backdoor attacks and defenses, attention behavior, concept manipulation, and vulnerability analysis for language and vision-language models. I also work on efficient multimodal training and explainability, including clinical AI and whole-slide pathology vision-language modeling.
Across my Ph.D. and industry work, my research path centers on model behavior: how it fails, how it can be controlled, and how it should be evaluated.
News
- 2026-05: My co-first-authored paper Towards Representation Backdoor on CLIP via Concept Confusion is accepted by TMLR! We propose a backdoor attack method for CLIP based on concept manipulation.
- 2026-05: Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations is accepted by CVPR Workshop. We investigate long-tailed backdoor attacks with data augmentation.
- 2026-04: OPERA is accepted by ACL Main 2026! We propose a benchmark dataset for online shopping action simulation.
- 2026-02: CVPR 2026 accepted: Act Like a Pathologist: Tissue-Aware Whole Slide Image Reasoning, focusing on whole slide image efficiency and reasoning.
- 2025-08: Successfully defended my PhD thesis!
- 2025-06: I’m honored to receive the prestigious Catacosinos Fellowship, awarded annually to outstanding PhD students at Stony Brook University, with only three winners this year!
- 2025-05: Two papers accepted to ACL 2025! CalD, led by Chenlu and Prof. Ritwik Banerjee, proposes an efficient framework for detecting deviant or nuanced language using smaller models. RPA Evaluation, led by Chaoran, Bingsheng and Prof. Dakuo Wang, presents a comprehensive guideline for evaluating LLM-based role-playing agents.
- 2025-01: Three papers are accepted by ICLR 2025, including one first-authored paper: VLOOD! VLOOD proposes a backdoor attack training method that works on Out-of-Distribution data.
- 2024-07: My first-authored paper, TrojVLM, is accepted by ECCV 2024! We investigate the vulnerabilities in the generative capabilities of Vision-Language Models, and propose a backdoor training algorithm, with a focus on image captioning and visual question answering (VQA) tasks.
- 2024-07: One paper is accepted by WACV 2025!
- 2024-06: My first-authored paper, BadCLM, is nominated for Best Student Paper at AMIA 2024! We investigate the vulnerabilities of clinical language models.
- 2024-03: One first-authored paper is accepted by NAACL 2024! We introduce a task-agnostic method for detecting textual backdoors, targeting a range of language models and traditional NLP tasks.
- 2023-10: My first-authored paper, TAL, is accepted by EMNLP 2023! We enhance backdoor attacks by manipulating the attention mechanism.
- 2023-03: Two papers are accepted by ICLR 2023 Workshop on BANDS!
- 2022-10: Paper “An Integrated LSTM-HeteroRGNN Model for Interpretable Opioid Overdose Risk Prediction” is accepted by Artificial Intelligence in Medicine!
- 2022-06: One first-authored paper is nominated for Best Student Paper at AMIA 2022! We propose a multimodal transformer to fuse clinical notes and traditional EHR data for interpretable mortality prediction.
- 2022-04: One first-authored paper “A Study of the Attention Abnormality in Trojaned BERTs” is accepted by NAACL 2022!
- 2020-09: Started my Computer Science Ph.D. journey at Stony Brook University!
Industry Experience
Applied Scientist @ Amazon Search
- Design LLM systems for personalized shopping search and navigation, focusing on post-training, personalization effectiveness, and agentic evaluation.
Applied Scientist Intern
- Built the first-gen LLM Seller Foundation Model for seller-risk automation, exploring how LLMs can jointly model textual and numerical signals.
- Designed continuous pre-training and multi-task post-training to align representations and improve downstream prediction performance.
- Delivered a production-oriented foundation-model prototype.
Selected Publications
This publication list is outdated. The full and latest list of publications can be found on Google Scholar.
Conference/Workshop/Journal
Weimin Lyu, Qingqiao Hu, Kehan Qi, Zhan Shi, Wentao Huang, Saumya Gupta, Chao Chen
In Submission
[ArXiv]
Chenlu Wang, Weimin Lyu, Ritwik Banerjee
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)
[ACL]
Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang
Findings of the Association for Computational Linguistics: ACL 2025 (ACL 2025)
[ACL]
Lingjie Yi, Jiachen Yao, Weimin Lyu, Haibin Ling, Raphael Douady, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]
Yuxin Wang, Xiaomeng Zhu*, Weimin Lyu*, Saeed Hassanpour, Soroush Vosoughi
The Thirteenth International Conference on Learning Representations (ICLR 2025) (Spotlight)
[ICLR]
Weimin Lyu, Jiachen Yao, Saumya Gupta, Lu Pang, Tao Sun, Lingjie Yi, Lijie Hu, Haibin Ling, Chao Chen
The Thirteenth International Conference on Learning Representations (ICLR 2025)
[ICLR]
Lingjie Yi, Tao Sun, Yikai Zhang, Songzhu Zheng, Weimin Lyu, Haibin Ling, Chao Chen
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025)
[WACV]
Weimin Lyu, Lu Pang, Tengfei Ma, Haibin Ling, Chao Chen
The 18th European Conference on Computer Vision (ECCV 2024)
[ECCV]
Weimin Lyu, Zexin Bi, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2024) (Best Student Paper Finalist)
[arXiv]
Weimin Lyu, Xiao Lin, Songzhu Zheng, Lu Pang, Haibin Ling, Susmit Jha, Chao Chen
The Findings of 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
[NAACL]
Weimin Lyu, Songzhu Zheng, Lu Pang, Haibin Ling, Chao Chen
The Findings of 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) (A short version is accepted as Oral at ICLR 2023 Workshop on BANDS)
[EMNLP][Code]
Xinyu Dong, Rachel Wong, Weimin Lyu, Kayley Abell-Hart, Janos G Hajagos, Richard N Rosenthal, Chao Chen, Fusheng Wang
Artificial Intelligence in Medicine (AIIM 2022)
[AIIM]
Weimin Lyu, Xinyu Dong, Rachel Wong, Songzhu Zheng, Kayley Abell-Hart, Fusheng Wang, Chao Chen
American Medical Informatics Association Annual Symposium (AMIA 2022) (Best Student Paper Finalist)
[AMIA][Code]