CS Peer Talks

Artificial Intelligence for Disease Diagnostics and Drug Discovery

  • Sheng Wang, University of Washington
  • Time: 2020-06-22 10:30-12:00
  • Host: PKU Turing Class Research Committee
  • Venue: Online Talk


The goal of modern medical science is essentially to address a machine learning problem: how to convert large-scale, complicated medical datasets into knowledge. Despite the success of machine learning in other areas (e.g., image and text), many medical problems remain unsolved, such as understanding disease mechanisms (e.g., COVID-19), identifying cancer early, and developing new drugs. These problems seem to be independent of each other and have so far been tackled by different scientists. In this talk, I will argue that behind these different medical problems is the same machine learning challenge: how to understand and predict in never-before-seen situations. In addition to powerful predictive models, what is really needed are tools that generalize well to new drugs, new diseases, and new cohorts. I will first introduce how we classify samples into never-before-seen classes by embedding noisy, large-scale directed acyclic graphs, resulting in new discoveries in protein functions, cell types, and rare diseases. Next, I will introduce our approach to understanding and characterizing a never-before-seen cohort. Instead of finding which features are important, we answer the question of why these features are important using a novel multiscale medical knowledge graph. I will conclude with a vision of future directions for medical science, which requires collaboration between scientists from different computer science areas including robotics, security, human-computer interaction, computational design, and ubiquitous computing.


Sheng Wang will be starting as an assistant professor at the University of Washington Paul G. Allen School of Computer Science & Engineering in Jan 2021. He is currently a postdoc in the School of Medicine at Stanford University. He is also a Chan Zuckerberg Biohub scholar. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2018 and his B.S. in Computer Science from Peking University in 2013. He is interested in using machine learning and natural language processing to advance medical sciences. The current focus of his research is new methods for predicting and interpreting never-before-seen situations in biomedicine. His research has had real-world impact in disease modeling and drug discovery, and is used by major biomedical institutions, including Chan Zuckerberg Biohub, NIH National Center for Advancing Translational Sciences, UCSD School of Medicine, China Academy of Medical Sciences, Stanford School of Medicine, and Mayo Clinic. His research has also been recognized through several paper awards and fellowships, including AMIA 2020 Year-In-Review, ISMB best student paper candidate, C. L. and Jane W.-S. Liu Award, and 3M Foundation Fellowship.