Research Interests

We focus on developing AI methods for computational biology and healthcare applications, with particular attention to recent advances in agentic AI and their potential for scientific discovery. We believe that one of the most important requirements in data-driven biomedical applications is that the learned knowledge is not specific to particular datasets or experimental protocols.

I am always happy to talk with motivated students interested in the topics below, whether about admission or virtual research collaborations. Feel free to reach out.

For my publication record, please check my [google scholar page] or the often belatedly updated [publication page].

Agentic AI and Its Security

We study core ML topics such as agentic AI and its security, with a focus on building novel agentic AI architectures that enable LLMs to automate tasks and function in complex data scenarios. We also study the security of these systems, protecting AI from malicious use.

Computational Biology

We study computational biology because every advance we make has the potential to free millions from suffering.

We are devoted to developing methods that help us understand the genetic basis of complex human traits. Our previous studies have mostly focused on Alzheimer's disease and cancer.

Toolkit Development

We develop software for two purposes: 1) we always seek to deliver our innovations to domain experts free of any technical barriers; 2) we believe scientific discovery is best served when actual users are using the tools and generating hypotheses relevant to the real world. We are now recruiting talent for MyDataPilot.

Recent News

Updates
Mar. 2025	I gave a short talk at NIH CFDE meeting, titled "Spatially Varying Cell-specific Gene Regulation Network Inference"
Feb. 2025	I gave a short talk at NAIRR, titled "GenoAgent: LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians"
July 2024	I gave an invited talk at NIH/NLM, titled "Towards LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians" [Slides]
April 2024	I gave a talk at Trustworthiness of Large Language Models, CS@Virginia Tech, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
April 2024	I gave a talk at Recent developments in large language models for biology, EMBL-EBI Workshop, titled "A Team of AI-made Scientists from LLMs for Scientific Discovery from Transcriptomics" [Slides]
April 2024	I gave a talk at AI Talks, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
March 2024	I gave a talk at Large Language Model Reading Group at UIUC, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
March 2024	I gave a talk at Computer Science Department at UIUC, titled "Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research" [Slides]
March 2024	I gave a talk at Biostatistics & Medical Informatics Seminar at University of Wisconsin-Madison, titled "Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research" [Slides]
Feb. 2024	I gave a talk at Spatial Genomics Seminar at UIUC, titled "Deep Learning Methods to Navigate Heterogeneous Data Landscapes for Genetic Insights" [Slides]
Jan. 2024	I gave a talk at VALSE: Vision and Learning Seminar, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
Dec. 2023	I gave a talk at Stanford AI+Health Seminar, titled "Understanding Structural Patterns for Early-Diagnosis of Alzheimer’s Disease" [Slides]
Oct. 2023	I gave a talk at Champaign-Urbana Data Science User Group, titled "Role Playing to Improve the Trustworthiness of Large Language Models" [Slides]
Aug. 2023	We gave a Tutorial on Trustworthy Machine Learning at KDD 2023 [overview]
June 2023	I gave a Tutorial on Trustworthy ML on Biomedical Computing at IEEE International Conference on Healthcare Informatics [Slides]
March 2023	We received a "Best Paper Honorable Mention" at WSDM 2023 for our work Efficiently leveraging multi-level user intent for session-based recommendation via atten-mixer network
Dec. 2022	I gave a talk at “Xia Peisu Young Scholars Forum” at Chinese Academy of Sciences Institute of Computing Technology, titled "Toward a Principled Understanding of Trustworthy Methods Machine Learning" [Slides]
Oct. 2022	I gave a keynote talk at TrustLog Workshop at CIKM 2022, titled "Causality and Multiple Aspects of Trustworthy Machine Learning" [Slides]
Aug. 2022	I started my appointment as an assistant professor at iSchool at UIUC
Aug. 2022	We presented our work on learning robust and invariant representations with data augmentation at KDD 2022. [Slides][Poster]
Aug. 2022	We presented our work on a unified theme of robust machine learning titled toward learning human-aligned robust models at UAI 2022. [Poster]
July 2022	We released the initial version of our software, Robustar, a GUI tool that helps users identify spurious features [Video][Code]
July 2022	I gave an invited talk on Trustworthy AI-diagnosis of Alzheimer's Disease from MRI at the Stanford University CNS lab [Slides]
July 2022	I gave an invited talk on A Principled Understanding of Robust Machine Learning Methods at the RIKEN Center for Advanced Intelligence Project [Video][Slides]
June 2022	We presented our work on The Two Dimensions of Worst-case Training and the Integrated Effect for OOD Generalization at CVPR 2022. [Poster]
May 2022	We presented our work on Gene Set Prioritization Guided by Regulatory Networks with p-values through KMM at RECOMB 2022. [Slides][Software]
April 2022	I was recognized as one of the top 50 AI+X rising young scholars by Baidu, Inc.
Dec. 2021	I gave an invited talk on Toward Trustworthy Machine Learning to Understand the Personalized Genetic Basis of Alzheimer's Disease at the Department of Bioinformatics at University of Pittsburgh [Slides]
Dec. 2021	I defended my thesis on Toward Robust Machine Learning by Countering Superficial Features at LTI CMU [Thesis][Slides]

With #MLCB happening, I hope to share a line of work we’ve been developing over the past couple of years:

🧬 Agentic AI systems for genomic discovery.

A series of algorithms, pipelines, frameworks, and tools. pic.twitter.com/lsC7jfM0Yf
— Haohan Wang (@HaohanWang) September 10, 2025

Recent Talks
Mar. 2025	Spatially Varying Cell-specific Gene Regulation Network Inference at NIH Common Fund Data Ecosystem (CFDE) Meeting, Bethesda [Slides]
Mar. 2025	Toward Agentic AI Scientist for Biomedical Discovery at “Generative AI and the Future of Research” CIRSS Speaker Series, UIUC [Slides]
Feb. 2025	GenoAgent: LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians at NAIRR Pilot, Washington, D.C. [Slides]
Jan. 2025	Toward Agentic AI Scientist for Biomedical Discovery at Reading Group, Australian National University (Virtual) [Slides]
Dec. 2024	AI Powered Transformations in Finance: Toward Underwriting Automation at Southwestern Ohio Winter Meeting, Cincinnati [Slides]
July 2024	Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense at Computer Science Seminar, University of Illinois, Chicago (Virtual) [Slides]
July 2024	Towards LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians at NIH/NLM (Virtual) [Slides]
Jun. 2024	Understanding Variations in Regulatory Networks Across Cell Types through Transformer Model with Knowledge on Regulatory Interactions at Genetics and Epigenetics Cross-Cutting Research Team Meeting, NIDA, NIH, Bethesda [Slides]
May 2024	Technical Advances of Jailbreaks at Analytics Lunch & Learn, Protiviti (Virtual) [Slides]
Apr. 2024	Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense at Trustworthiness of Large Language Models, CS@Virginia Tech (Virtual) [Slides]
Apr. 2024	A Team of AI-made Scientists from LLMs for Scientific Discovery from Transcriptomics at EMBL-EBI Workshop, Boston [Slides]
Apr. 2024	Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense at AI Talks (Virtual) [Slides]
Mar. 2024	Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense at Large Language Model Reading Group, UIUC [Slides]
Mar. 2024	Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research at Computer Science Department, UIUC [Slides]
Feb. 2024	Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research at Biostatistics & Medical Informatics Seminar, University of Wisconsin-Madison [Slides]
Feb. 2024	Deep Learning Methods to Navigate Heterogeneous Data Landscapes for Genetic Insights at Spatial Genomics Seminar, Carl R. Woese Institute for Genomic Biology, UIUC [Slides]
Jan. 2024	Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense at VALSE: Vision and Learning Seminar (Virtual) [Slides]
Dec. 2023	Understanding Structural Patterns for Early-Diagnosis of Alzheimer’s Disease at Stanford AI+Health Seminar (Virtual) [Slides]
Oct. 2023	Role Playing to Improve the Trustworthiness of Large Language Models at Champaign-Urbana Data Science User Group, Champaign, IL [Slides]
July 2023	Tutorial in Trustworthy Machine Learning at 2023 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining [Slides]
July 2023	Robust Machine Learning Techniques Helps Understand Alzheimer’s Disease Subtypes at HealthX Club of Shanghai Jiao Tong University Seminar [Slides]
June 2023	Tutorial in Trustworthy Machine Learning for Biomedical Challenges at 2023 IEEE International Conference on Healthcare Informatics [Slides]
Oct. 2022	Toward a Principled Understanding of Trustworthy Methods Machine Learning at “Xia Peisu Young Scholars Forum” at Chinese Academy of Sciences Institute of Computing Technology [Slides]
Oct. 2022	Causality and Multiple Aspects of Trustworthy Machine Learning at Keynote, TrustLog Workshop at CIKM [Slides]
Sept. 2022	Toward the Alignment Between Machine Learning Models’ and Users’ Perceptions of Data at Champaign-Urbana Data Science User Group, UIUC [Slides]
Sept. 2022	Toward a Principled Understanding of Robust ML and Its Connection to Multiple Aspects at Doctoral ProSeminar at iSchool, UIUC [Slides]
Aug. 2022	Toward a Principled Understanding of Robust Machine Learning Methods and Its Connection to Multiple Aspects at Seminar of Machine Learning and NLP (MLNLP seminar series) [Slides]
July 2022	Trustworthy AI-diagnosis of Alzheimer's Disease from MRI at Stanford University CNS lab [Slides]
July 2022	A Principled Understanding of Robust Machine Learning Methods and Its Connections to Multiple Methods at RIKEN Center for Advanced Intelligence Project [Video][Slides]
Dec. 2021	Toward Trustworthy Machine Learning to Understand the Personalized Genetic Basis of Alzheimer's Disease at Department of Bioinformatics at University of Pittsburgh [Slides]
Mar. 2021	Robust Machine Learning with Emphasis on Countering Spurious Features at Data Science Initiative at Brown University
Jan. 2021	High-frequency Component Helps Explain the Generalization of CNN at Aggregate Intellect [Video][Slides]
Nov. 2020	Towards Trustworthy Machine Learning Inspired by High-frequency Data at Robotics Institute at Carnegie Mellon University [Slides]
July 2020	Toward Trustworthy Machine Learning for Scientific Discovery at Doctoral Symposium at ACM Conference on Health, Inference, and Learning
April 2020	A Brief Overview of Trustworthy Machine Learning at Probabilistic Graphic Models lecture at Carnegie Mellon University [Slides]
Feb. 2020	Learning Deconfounded Representations through Neural Networks, with Applications in Genetic Data at Center of Excellence for Computational Drug Abuse Research [Slides]
Sept. 2019	Dealing with Confounding Factors in Deep Learning at Next Generation in Biomedicine Symposium at the Broad Institute [Slides]
Sept. 2019	Deep Learning over Heterogeneous Data at Department of Bioinformatics at University of Pittsburgh [Slides]

About Me

Haohan Wang is an assistant professor in the School of Information Sciences at the University of Illinois Urbana-Champaign. His research focuses on the development of trustworthy machine learning methods for computational biology and healthcare applications, such as decoding the genomic language of Alzheimer's disease. In his work, he uses statistical analysis and deep learning methods, with an emphasis on data analysis using methods least influenced by spurious signals. Wang earned his PhD in computer science through the Language Technologies Institute of Carnegie Mellon University, where he worked with Professor Eric Xing. In 2019, Wang was recognized as part of the Next Generation in Biomedicine by the Broad Institute of MIT and Harvard for his contributions to dealing with confounding factors in deep learning.

The Chinese spelling of the name is 汪浩瀚

Haohan Wang

Assistant Professor

School of Information Sciences

Siebel School of Computing and Data Science

Carl R. Woese Institute for Genomic Biology

National Center for Supercomputing Applications

University of Illinois Urbana-Champaign

Email: haohanw at illinois.edu

Tel: (217) 333-3280

Office: RM 4123, 614 E. Daniel St.

Recent Open Collaborations

We are exploring a new form of research collaboration that brings together scholars from the trustworthy ML and computational biology communities worldwide for discussion and joint work.

Reach out if you are interested.

Talks and Publications

Recent Talks

Mar. 2025