Trustworthy Machine Learning

We study the core topics of trustworthy machine learning, with a focus on the statistical foundations underlying this umbrella area, and we extend that central understanding to problems in robustness, causality, and interpretability, with vision as our main application domain.

Computational Biology

We study computational biology because every advance we make has the potential to free millions from suffering.

We are devoted to developing methods that help us understand the genetic basis of human complex traits. Our previous studies have mostly focused on Alzheimer's disease and cancer.

Toolkit Development

We develop software for two purposes: 1) we always seek to deliver our innovations so that domain experts can use them free of any technical barriers; 2) we believe that the trustworthiness of ML, by definition, involves users, so incorporating users in the loop strengthens the trustworthiness of our methods. We are currently recruiting talent for Robustar.

Recent News

Updates

April 2024
I gave a talk at Trustworthiness of Large Language Models, CS@Virginia Tech, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
April 2024
I gave a talk at Recent Developments in Large Language Models for Biology, an EMBL-EBI Workshop, titled "A Team of AI-made Scientists from LLMs for Scientific Discovery from Transcriptomics" [Slides]
April 2024
I gave a talk at AI Talks, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
March 2024
I gave a talk at the Large Language Model Reading Group at UIUC, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
March 2024
I gave a talk at the Computer Science Department at UIUC, titled "Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research" [Slides]
March 2024
I gave a talk at the Biostatistics & Medical Informatics Seminar at the University of Wisconsin-Madison, titled "Advancing Precision Medicine: Tailored Genomic Insights and AI-Driven Automation in Complex Disease Research" [Slides]
Feb. 2024
I gave a talk at the Spatial Genomics Seminar at UIUC, titled "Deep Learning Methods to Navigate Heterogeneous Data Landscapes for Genetic Insights" [Slides]
Jan. 2024
I gave a talk at VALSE: Vision and Learning Seminar, titled "Guardian of Trust in Language Models: Automatic Jailbreak and Systematic Defense" [Slides]
Dec. 2023
I gave a talk at the Stanford AI+Health Seminar, titled "Understanding Structural Patterns for Early-Diagnosis of Alzheimer’s Disease" [Slides]
Oct. 2023
I gave a talk at the Champaign-Urbana Data Science User Group, titled "Role Playing to Improve the Trustworthiness of Large Language Models" [Slides]
Aug. 2023
We gave a Tutorial on Trustworthy Machine Learning at KDD 2023 [overview]
June 2023
I gave a Tutorial on Trustworthy ML for Biomedical Computing at the IEEE International Conference on Healthcare Informatics [Slides]
March 2023
We received the Best Paper Honorable Mention at WSDM 2023 for our work "Efficiently Leveraging Multi-level User Intent for Session-based Recommendation via Atten-Mixer Network"
Dec. 2022
I gave a talk at the “Xia Peisu Young Scholars Forum” at the Chinese Academy of Sciences Institute of Computing Technology, titled "Toward a Principled Understanding of Trustworthy Machine Learning Methods" [Slides]
Oct. 2022
I gave a keynote talk at the TrustLog Workshop at CIKM 2022, titled "Causality and Multiple Aspects of Trustworthy Machine Learning" [Slides]
Aug. 2022
I started my appointment as an assistant professor at the iSchool at UIUC
Aug. 2022
We presented our work on learning robust and invariant representations with data augmentation at KDD 2022. [Slides][Poster]
Aug. 2022
We presented our work on a unified theme of robust machine learning, titled "Toward Learning Human-Aligned Robust Models", at UAI 2022. [Poster]
July 2022
We released the initial version of our software, Robustar, a GUI tool that helps users identify spurious features [Video][Code]
July 2022
I gave an invited talk on Trustworthy AI-diagnosis of Alzheimer's Disease from MRI at the Stanford University CNS lab [Slides]
July 2022
I gave an invited talk on A Principled Understanding of Robust Machine Learning Methods at the RIKEN Center for Advanced Intelligence Project [Video][Slides]
June 2022
We presented our work on The Two Dimensions of Worst-case Training and the Integrated Effect for OOD Generalization at CVPR 2022. [Poster]
May 2022
We presented our work on Gene Set Prioritization Guided by Regulatory Networks with p-values through KMM at RECOMB 2022. [Slides][Software]
April 2022
I was recognized as one of the top 50 AI+X rising young scholars by Baidu, Inc.
Dec. 2021
I gave an invited talk on Toward Trustworthy Machine Learning to Understand the Personalized Genetic Basis of Alzheimer's Disease at the Department of Bioinformatics at the University of Pittsburgh [Slides]
Dec. 2021
I defended my thesis, titled "Toward Robust Machine Learning by Countering Superficial Features", at LTI, CMU [Thesis][Slides]

Recent Talks

July 2023
Tutorial on Trustworthy Machine Learning at the 2023 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining [Slides]
July 2023
Robust Machine Learning Techniques Help Understand Alzheimer’s Disease Subtypes at the HealthX Club Seminar of Shanghai Jiao Tong University [Slides]
June 2023
Tutorial on Trustworthy Machine Learning for Biomedical Challenges at the 2023 IEEE International Conference on Healthcare Informatics [Slides]
Oct. 2022
Toward a Principled Understanding of Trustworthy Machine Learning Methods at the “Xia Peisu Young Scholars Forum” at the Chinese Academy of Sciences Institute of Computing Technology [Slides]
Oct. 2022
Causality and Multiple Aspects of Trustworthy Machine Learning, keynote at the TrustLog Workshop at CIKM [Slides]
Sept. 2022
Toward the Alignment Between Machine Learning Models’ and Users’ Perceptions of Data at Champaign-Urbana Data Science User Group, UIUC [Slides]
Sept. 2022
Toward a Principled Understanding of Robust ML and Its Connection to Multiple Aspects at Doctoral ProSeminar at iSchool, UIUC [Slides]
Aug. 2022
Toward a Principled Understanding of Robust Machine Learning Methods and Its Connection to Multiple Aspects at Seminar of Machine Learning and NLP (MLNLP seminar series) [Slides]
July 2022
Trustworthy AI-diagnosis of Alzheimer's Disease from MRI at the Stanford University CNS lab [Slides]
July 2022
A Principled Understanding of Robust Machine Learning Methods and Its Connections to Multiple Methods at the RIKEN Center for Advanced Intelligence Project [Video][Slides]
Dec. 2021
Toward Trustworthy Machine Learning to Understand the Personalized Genetic Basis of Alzheimer's Disease at the Department of Bioinformatics at the University of Pittsburgh [Slides]
Mar. 2021
Robust Machine Learning with Emphasis on Countering Spurious Features at the Data Science Initiative at Brown University
Jan. 2021
High-frequency Component Helps Explain the Generalization of CNN at Aggregate Intellect [Video][Slides]
Nov. 2020
Towards Trustworthy Machine Learning Inspired by High-frequency Data at the Robotics Institute at Carnegie Mellon University [Slides]
July 2020
Toward Trustworthy Machine Learning for Scientific Discovery at the Doctoral Symposium at the ACM Conference on Health, Inference, and Learning
April 2020
A Brief Overview of Trustworthy Machine Learning at the Probabilistic Graphical Models lecture at Carnegie Mellon University [Slides]
Feb. 2020
Learning Deconfounded Representations through Neural Networks, with Applications in Genetic Data at the Center of Excellence for Computational Drug Abuse Research [Slides]
Sept. 2019
Dealing with Confounding Factors in Deep Learning at the Next Generation in Biomedicine Symposium at the Broad Institute [Slides]
Sept. 2019
Deep Learning over Heterogeneous Data at the Department of Bioinformatics at the University of Pittsburgh [Slides]

About Me

Haohan Wang is an assistant professor in the School of Information Sciences at the University of Illinois Urbana-Champaign. His research focuses on the development of trustworthy machine learning methods for computational biology and healthcare applications, such as decoding the genomic language of Alzheimer's disease. In his work, he uses statistical analysis and deep learning methods, with an emphasis on data analysis using methods that are least influenced by spurious signals. Wang earned his PhD in computer science through the Language Technologies Institute of Carnegie Mellon University, where he worked with Professor Eric Xing. In 2019, Wang was recognized as the Next Generation in Biomedicine by the Broad Institute of MIT and Harvard for his contributions to handling confounding factors with deep learning.

The Chinese spelling of the name is 汪浩瀚