Scientific Computing Platform
The Scientific Computing Platform applies modern data analysis techniques to extract scientific insights from large datasets. It does this through computer programming, which allows statistical and mathematical methods to be applied systematically and in a principled way to the source data. Often, the work consists of identifying simple structures that are latent within complex, high-dimensional data. Once identified, these structures offer useful ways to understand the data, such as clustering related observations or finding an axis of variation that separates healthy from diseased tissue. To scale these analyses to very large datasets, the platform implements them as fault-tolerant workflows deployed on cloud computing infrastructure.
A recent example is the creation of an atlas of human retinal cell types. The underlying data were billions of short RNA sequences derived from a complex tissue. From these sequences, the platform identified more than a quarter of a million individual cells and determined which genes each cell was expressing. The resulting ~7 billion observations were then analyzed further, revealing 65 types of cells, each expressing a distinct set of genes. By mapping known disease-associated genes to these cell types, the platform could associate human eye diseases with specific types of cells.