CFCS Youth Forum

Efficient Algorithms for Large-scale Transcriptomics

  • Mingfu Shao, Carnegie Mellon University
  • Time: 2018-04-02 16:35
  • Host: CFCS
  • Venue: Room 101, Courtyard No.5, Jingyuan


I will present models and algorithm designs for challenging problems in transcriptomics and show that efficient computational methods enable significant advances in our understanding of cell machinery. The central problem is the assembly of full-length transcripts -- the collection of expressed gene products in cells -- from noisy and highly fragmented data obtained through RNA sequencing. I first formulate this as a graph decomposition problem and then design an efficient algorithm for it that is guaranteed to preserve all long-range information. Integrating and assembling 7000 RNA-seq samples with this algorithm yields a more complete human transcriptome and reveals many potentially novel transcripts.


Mingfu Shao is currently a Lane Fellow in the Computational Biology Department, School of Computer Science, Carnegie Mellon University. He obtained his Ph.D. from EPFL (Swiss Federal Institute of Technology in Lausanne), Switzerland. His research interests include the development of efficient algorithms for combinatorial optimization and machine-learning problems, and their applications to computational biology and precision medicine. At CMU, he works on large-scale transcriptomics and has developed a new transcript assembler called Scallop. His Ph.D. research focused on comparative genomics. He received the Dimitris N. Chorafas Foundation Award for his contributions to the design of innovative algorithms for problems in genome evolution.