Project 1: NETZEN
Two major difficulties in computational biology are: (i) how to set a threshold cutoff that maximizes sensitivity while minimizing the false discovery rate, and (ii) how to integrate ranking parameters, each known individually to influence network hierarchy, so as to maximize predictive accuracy.
To resolve these difficulties, we have developed NETZEN, a suite of tandem platforms. NETZEN v.1 includes three main engines: GeneRep, nSCORE, and TeraView. Together they integrate precise and comprehensive global gene network generation, an automated node importance scoring framework that accepts an unlimited set of parameters, and a comprehensive, fully annotated visualization platform. The suite is therefore applicable to any type of network and node statistics input.
NETZEN v.3
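To make the two difficulties concrete, here is a minimal Python sketch of two generic, textbook techniques: the Benjamini-Hochberg procedure for choosing a cutoff at a target false discovery rate, and mean-rank aggregation for combining several node-level ranking parameters. Neither is taken from NETZEN; the function names, metric names, and gene labels are hypothetical and used only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Mark which p-values pass the Benjamini-Hochberg step-up procedure.

    Returns a list of booleans aligned with the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every p-value ranked at or below the cutoff is kept (step-up rule).
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            keep[idx] = True
    return keep


def mean_rank_score(node_metrics):
    """Combine several node metrics into one score by averaging ranks.

    node_metrics maps node -> {metric name -> value}, where a higher value
    means a more important node. Returns node -> mean rank (1 is best).
    Ties are broken by input order, which is a simplification.
    """
    nodes = list(node_metrics)
    metrics = sorted({name for vals in node_metrics.values() for name in vals})
    totals = {node: 0.0 for node in nodes}
    for metric in metrics:
        ranked = sorted(nodes, key=lambda n: node_metrics[n][metric], reverse=True)
        for rank, node in enumerate(ranked, start=1):
            totals[node] += rank
    return {node: totals[node] / len(metrics) for node in nodes}


# Hypothetical toy inputs, for illustration only.
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
toy_metrics = {
    "geneA": {"degree": 5, "fold_change": 2.0},
    "geneB": {"degree": 3, "fold_change": 3.5},
    "geneC": {"degree": 1, "fold_change": 0.5},
}

if __name__ == "__main__":
    print(benjamini_hochberg(toy_p_values, alpha=0.05))
    print(mean_rank_score(toy_metrics))
```

Averaging ranks treats every parameter as equally informative. A framework that weights or learns the contribution of each parameter would behave differently, so this is a baseline for comparison rather than a description of any particular scoring engine.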
Project 2: Deep Learning in Biology
Recent advances in artificial intelligence (AI) have led to major breakthroughs in many areas of science. In many cases, AI systems surpass human capability in complex pattern recognition tasks, especially on medical imaging and diagnostic data such as radiographic images, EKGs, and histologic slides.
We are developing advanced AI systems, particularly deep learning algorithms, to solve complex problems in science and medicine, in close integration with our network analysis platform. Our four major areas of interest for AI in translational medical research are:
- Precision medicine: We are interested in developing a novel system to identify the drugs most likely to produce a response in individual cancer patients using integrated data: next-generation DNA sequencing, RNA sequencing, imaging, histology, and electronic health records. The uniqueness of our approach lies in combining our innovative network analysis with deep learning algorithms.
- Target identification: We are creating the CELL deep learning model, which will integrate omics data (RNA-seq, microarray expression profiles) with the scientific literature using natural language processing algorithms. This model would allow us to interrogate cells computationally (e.g. in silico knockdown) to identify the causal gene(s) of a particular disease.
- Acceleration of drug development: We are studying virtual drug screening using deep learning networks to identify new drugs or to re-purpose existing drugs for new indications.
- Discovery applications: We are developing AI engines to address fundamental questions in biology, such as in silico ChIP-seq, prediction of the binding sites of new transcription factors from DNA sequence, and prediction of tertiary protein structure from primary amino acid sequence.
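Sequence-based predictors such as those mentioned in the last item typically need DNA converted into a numeric form first. As a minimal sketch, the following Python function one-hot encodes a DNA string, a common input representation for deep learning models. It is a generic illustration, not the lab's pipeline; the function name and the choice to encode unknown bases as all zeros are assumptions.

```python
# Channel order A, C, G, T is a convention chosen here for illustration.
BASE_TO_CHANNEL = {"A": 0, "C": 1, "G": 2, "T": 3}


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element one-hot vectors.

    Bases outside A/C/G/T (for example N) become all-zero vectors,
    which is one common convention among several.
    """
    encoded = []
    for base in sequence.upper():
        vector = [0, 0, 0, 0]
        channel = BASE_TO_CHANNEL.get(base)
        if channel is not None:
            vector[channel] = 1
        encoded.append(vector)
    return encoded


if __name__ == "__main__":
    # Hypothetical example sequence.
    for row in one_hot_encode("ACGTN"):
        print(row)
```

The resulting matrix has one row per base and four columns, which is the shape that a one-dimensional convolutional layer would usually take as input.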