Bioinformatics Shared Resource

Jean-Pierre A. Kocher, Ph.D.
Director

The Bioinformatics Shared Resource was created in mid-2006 to meet the increasing demand for bioinformatics support from investigators in the Cancer Center and across Mayo Clinic. The growing adoption of genomics tools to probe the cellular machinery at the molecular level, together with the rapid development of 'omics' technology platforms, has created a new need for data management, data analysis and data interpretation.

The mission of the Shared Resource is to develop and maintain state-of-the-art services that help investigators take advantage of new 'omics' technologies and enable discoveries that will impact patient care. To this end, the Shared Resource has assembled a team of informatics experts specialized in the analysis of high-throughput genomics, proteomics and metabolomics data. This team includes computer scientists and bioinformatics specialists with strong backgrounds in molecular biology, genetics, biophysics, chemistry and statistics. The Shared Resource also works collaboratively with the Division of Biomedical Statistics and Informatics and the Experimental Cores to ensure that experimental and analytical work is coordinated through the different phases of a project.

The Shared Resource provides services for Mayo Clinic Cancer Center investigators in the following areas:

Investigator Support (Area of Activity: Services Offered)
Preliminary Data for Grant
  • Sequence analysis & comparative genomics
  • Gene selection based on function
  • SNP selection and related annotations
Data Manipulation and Data Annotation
  • Data parsing and data formatting
  • Gene ID conversion
  • Tissue specificity of genes
  • Protein secretory pathway
Data Interpretation
  • Biological pathway and biological networks analysis
  • Regulatory networks analysis
Data Preprocessing Infrastructure
  • Quality control workflow for genotyping data
  • Preprocessing workflow for gene expression profiles
Bioinformatics Databases, Tools and Education Programs
  • Access to public databases and maintenance of local databases
  • Access to bioinformatics applications and associated education programs
  • Shared Resource-sponsored bioinformatics seminars, courses and conferences

The Shared Resource also contributes to the development of the IT infrastructure needed to manage and analyze data. Some of these developments are done in collaboration with other IT teams at Mayo Clinic. Current activities include:

Bioinformatics IT Infrastructure Development (Area of Activity: Services)
Data Preprocessing Infrastructure
  • Quality control workflow for genotyping data
  • Preprocessing workflow for gene expression profiles
Experimental Techniques and Analytical Methods Validation
  • Copy number variant (CNV) project
  • High throughput sequencing validation
Cancer Genomics
  • Repository System for Data Integration and Data Mining (BORA)
  • Integration of BORA with analytical workflow
Other Activities
  • Method development
  • Analytical IT infrastructure to enhance investigator support services
  • Development and deployment of quality control procedures
  • Rochester Laboratory Information Management System (RLIMS)
  • Management of Extramural Requests for Experimental Core Services
  • Contribution to the development of Mayo's Enterprise Data Trust

Facilities and Equipment

The Bioinformatics Shared Resource is part of the Division of Biomedical Statistics and Informatics. It operates in a state-of-the-art facility that includes an Access Grid® enabled media room for multi-site, multimedia conferencing. Additional conferencing technologies support standard videoconferencing, webcasting and video on demand. Multiple OC-3 connections to Mayo Clinic group practices across all three Mayo Clinic campuses (Scottsdale, Ariz.; Jacksonville, Fla.; and Rochester, Minn.) and DS3 connections to Internet2 Consortium members ensure fast data transfer between remote sites.

The Shared Resource takes advantage of the multiple compute resources available at Mayo, including blade-server computer clusters comprising roughly 40 IBM Intel nodes running Linux. These clusters are maintained by Mayo's IT server support group specifically for moderately compute-intensive tasks. The Shared Resource also has access to a Beowulf-style Linux cluster with production research storage that is administered and maintained by Mayo's Research Computer Facility. This cluster is designed specifically for distributed computing. It includes 40 Athlon nodes, each with 1-2 gigabytes (GB) of memory and 50 GB of individual storage, along with 1.5 terabytes of shared storage. A number of smaller Xeon-based 'compute engines' are also available for ad hoc processor-intensive tasks. The bioinformatics software commonly used by the Shared Resource is installed on this system.

Recent Publications

The following is a selection of recent publications involving staff of the Bioinformatics Shared Resource: