Gordon Bruce C. – BASIS Scottsdale

BASIS Scottsdale Senior Gordon Bruce C.

Hypergraph Knowledge Representation of Genomic Anomaly Detection

Project/blog link: Hypergraph Knowledge Representation of Genomic Anomaly Detection
BASIS Advisor: Mr. Paul McClernon
Internship location: Systems Imagination, Inc.
Onsite Mentor: Mr. David Schneider, IT Director

Project Abstract

The National Cancer Institute’s Cancer Genome Atlas project recently cataloged clinical and genomic data for over ten thousand patients across 33 cancer types. The genomic analysis includes data on every class of biological molecule involved in gene expression, from DNA to RNA to the proteins that are ultimately produced. While potentially extremely useful, the sheer volume of data (over 450 terabytes) makes it difficult for people to draw meaningful conclusions about which genetic factors, or combinations of factors, appear to cause these cancers. The bioinformatics company Systems Imagination proposes one solution to this problem: hypergraph-based data modeling. A hypergraph consists of vertices and “hyperedges,” each of which connects two or more vertices at once, unlike an ordinary graph edge, which joins exactly two. This makes hypergraphs well suited to representing complex, layered data such as the genomic data used in this project. In this project, the data system is developed and applied to bladder cancer data with the goal of representing and visualizing cancer-causing variables, thus allowing for improved cancer analysis and, ultimately, prevention.
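
To make the hypergraph idea in the abstract concrete, here is a minimal Python sketch of the data structure. It is only an illustration and not the system built at Systems Imagination: the class name `Hypergraph`, its methods, and the placeholder labels such as `gene_A` and `patient_1` are all assumptions invented for this example, and no real genomic data is used.

```python
from collections import defaultdict


class Hypergraph:
    """Illustrative hypergraph: named hyperedges, each joining two or more vertices."""

    def __init__(self):
        self.edges = {}                    # hyperedge name -> frozenset of vertices
        self.incidence = defaultdict(set)  # vertex -> names of hyperedges containing it

    def add_edge(self, name, vertices):
        """Register a hyperedge that links all of the given vertices at once."""
        members = frozenset(vertices)
        if len(members) < 2:
            raise ValueError("a hyperedge must connect at least two vertices")
        if name in self.edges:
            raise ValueError(f"hyperedge {name!r} already exists")
        self.edges[name] = members
        for vertex in members:
            self.incidence[vertex].add(name)

    def edges_of(self, vertex):
        """Return the names of every hyperedge that contains the vertex."""
        return set(self.incidence.get(vertex, ()))

    def neighbors(self, vertex):
        """Return every other vertex that shares at least one hyperedge with the vertex."""
        linked = set()
        for name in self.incidence.get(vertex, ()):
            linked |= self.edges[name]
        linked.discard(vertex)
        return linked


# Hypothetical placeholder labels, not real genes or patients.
hg = Hypergraph()
hg.add_edge("expression_A", ["gene_A", "rna_A", "protein_A"])  # DNA -> RNA -> protein layer
hg.add_edge("sample_1", ["patient_1", "gene_A", "gene_B"])     # one patient's observed genes

print(sorted(hg.edges_of("gene_A")))
print(sorted(hg.neighbors("gene_A")))
```

Because a single hyperedge can hold a gene, its RNA, and its protein together, one lookup on `gene_A` reaches both its expression products and the patient sample it appears in, which an ordinary two-vertex edge could only express through several separate links.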
