Sen Fang

Ph.D. Candidate in Computer Science — NC State University · Seeking Research Internship
(+1) 984-379-1827
sfang9@ncsu.edu
tomasandersonfang.github.io · Google Scholar
Research Interests

My research lies at the intersection of large language models and software engineering, with a focus on three directions: (1) LLM-based agents for vulnerability detection — designing multi-agent systems that combine static analysis (CodeQL, Joern, CPGs) with LLM reasoning to automatically find and verify security flaws; (2) LLM robustness and reliability — evaluating and improving the consistency of code LLMs under quantization and diverse inputs; and (3) automated program repair and code optimization — leveraging fine-tuned LLMs (LoRA/QLoRA) and retrieval-augmented pipelines to fix bugs and improve code performance.

Research Highlights

ACM SIGPLAN OOPSLA 2026 Under Review
A multi-agent framework that decouples vulnerability detection into clue discovery and vulnerability reasoning. Uses Joern CPG queries to augment suspicious code lines with repository-level context, precisely controlling context length while preserving accuracy for scalable detection.
ACM TOSEM Major Revision
A self-consistency-centered framework for evaluating LLM robustness in code generation. Benchmarks 20+ LLMs across quantization levels and prompt perturbations, revealing systematic reliability gaps in code LLMs with practical implications for model deployment.

Education

NC State University Aug 2024 – Present
Ph.D. in Computer Science · Advisor: Prof. Bowen Xu (SoftMax Lab)
Focus: LLM-driven vulnerability detection, AI agents for software security, LLM robustness
Central China Normal University Sep 2018 – Jun 2020
M.Sc. in Electronics & Communication Engineering · Advisor: Prof. Shaocheng Qu
Wuhan Polytechnic University Sep 2014 – Jun 2018
B.Sc. in Electronic Information Engineering · Outstanding Student 2018

Research & Industry Experience

Hedra May 2025 – Aug 2025
Machine Learning Engineer Intern · Supervised by Hongwei Yi
Built a petabyte-scale video data processing pipeline; curated millions of high-quality samples for post-training generative models (text-to-video, image-to-video). Collaborated with research and infrastructure teams on data quality metrics and filtering strategies.
KTH Royal Institute of Technology Mar 2023 – Mar 2024
Research Engineer · Advisor: Prof. Martin Monperrus
Co-developed RepairLLaMA (LoRA fine-tuning for program repair) and Supersonic (LLM-driven C/C++ optimization); both published in IEEE TSE. Designed code representation strategies and training pipelines. Contributed to generative-AI-based test data generation.
Macau University of Science and Technology Sep 2020 – Nov 2022
Research Assistant · Advisor: Prof. Tao Zhang
Conducted research on deep-learning-based bug report understanding (ICSE '23), code clone detection via intermediate-code graphs (IEEE TR), and pull request description generation (JSS). Published 5 first-/co-first-author papers.

Selected Publications

Vulnerability Detection & Software Security
[1] S. Fang, W. Ding, B. Xu. "AEGIS: Multi-Agent CPG-Augmented Framework for Automated Vulnerability Detection." Submitted to ACM SIGPLAN OOPSLA, 2026. Under Review
[2] Y. Huang, S. Fang, J. Li, J. Tao, B. Hu, T. Zhang. "Deep Smart Contract Intent Detection." IEEE SANER 2025. A
[3] Y. Li, S. Fang, et al. "Enhancing Android Malware Detection: The Influence of ChatGPT on Decision-centric Tasks." ACM TOSEM, 2025. A*
LLM Robustness
[4] S. Fang, W. Ding, A. Mastropaolo, B. Xu. "Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation." IEEE TSE. Major Revision. A*
[5] S. Fang, W. Ding, B. Xu. "EVALOOOP: A Self-Consistency-Centered Framework for Assessing Large Language Model Robustness in Programming." ACM TOSEM. Major Revision. A*
[6] Y. Li, T. Zhang, X. Luo, H. Cai, S. Fang, D. Yuan. "Do Pre-trained Language Models Indeed Understand Software Engineering Tasks?" IEEE TSE. A*
Automated Program Repair & Code Optimization
[7] S. Fang*, A. Silva*, M. Monperrus. "RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair." IEEE TSE, 2025. A* (*equal contribution)
[8] Z. Chen, S. Fang, M. Monperrus. "Supersonic: Learning to Generate Source Code Optimizations in C/C++." IEEE TSE, 2024. A*
[9] B. Baudry, K. Etemadi, S. Fang, et al. "Generative AI to Generate Test Data Generators." IEEE Software.
Mining Software Engineering & Code Intelligence
[10] S. Fang, T. Zhang, Y. Tan, H. Jiang, X. Xia, X. Sun. "RepresentThemAll: A Universal Learning Representation of Bug Reports." ICSE 2023. A*
[11] D. Yuan*, S. Fang*, T. Zhang, Z. Xu, X. Luo. "Java Code Clone Detection by Exploiting Semantic and Syntax Information from Intermediate Code-Based Graph." IEEE TR, 2022. A (*equal contribution)
[12] S. Fang, T. Zhang, Y. Tan, Z. Xu, Z. Yuan, L. Meng. "PRHAN: Automated Pull Request Description Generation Based on Hybrid Attention Network." JSS, 2022. A
[13] S. Fang*, Y. Tan*, T. Zhang, Z. Xu, H. Liu. "Effective Prediction of Bug-Fixing Priority via Weighted Graph Convolutional Networks." IEEE TR, 2021. A (*equal contribution)
[14] S. Fang, Y. Tan, T. Zhang, Y. Liu. "Self-Attention Networks for Code Search." IST, 2021. A
[15] Y. Tan, J. Chen, W. Shang, T. Zhang, S. Fang, X. Luo, Z. Chen, S. Qi. "STRE: An Automated Approach to Suggesting App Developers When to Stop Reading Reviews." IEEE TSE. A*

Professional Service

Reviewer: IEEE TSC, ACM TOSEM, IST, EAAI, JSS, ASE, AIR, JCC, and others

Honors & Awards

Qualcomm Innovation Fellowship 2026 — Finalist
North America, Final Round

Technical Expertise

Vulnerability & Program Analysis
Joern, CodeQL, SpotBugs, FindSecBugs, Code Property Graphs, AST / CFG / DFG analysis
LLM Training & Inference
PyTorch, Transformers, LoRA / QLoRA / PEFT, vLLM, DeepSpeed, SLURM, JAX
Agent & Retrieval Systems
Multi-agent orchestration, RAG pipelines, tool-augmented LLM agents
Languages & Platforms
Python, C/C++, Java, LaTeX · Linux, Jupyter, Cursor, Claude Code