The Role of Ontology in Big Cancer Data
Date: May 12-13, 2015
Venue
PLEASE NOTE CHANGE OF VENUE ON DAY 1
- Day 1 (10am-4:30pm) will be in FISHERS LANE CONFERENCE CENTER, 5635 Fishers Lane, Rockville, MD 20852
- Day 2 (9am-3pm) will be a public session in Balcony A, Natcher Conference Center, NIH Building 45, Bethesda, MD 20892
Participation
Space is limited on both days of this meeting. Access via WebEx will be available. Interested persons should write to Barry Smith.
Session 1: Addressing cancer big data challenges through imaging ontologies
Tuesday, May 12 in NCI Shady Grove building (9609 Medical Center Dr, Rockville, MD, 20850), Room 2W-32/34
10:00-12:45
Barry Smith (Buffalo): The cancer research ontology space: An introduction
- The goal of the meeting: to better understand the challenges involved in using big data for cancer research, and to explore the utility of ontologies in addressing these challenges.
- An introduction to existing ontology resources in the cancer domain, including NCI Thesaurus and the OBO Foundry; opportunities for, and reasons for scepticism about, the use of ontologies in addressing cancer big data; topics will include the ontology of tissues, of tissue banks, and of image banks
- What are the desired outcomes from this meeting?
Ilya Goldberg (NIA): The role of imaging ontologies in cancer big data
- As image data accumulates descriptive metadata drawn from controlled vocabularies (ontologies), two applications emerge that together form a key "Big Data" technology with impact on cancer research and medicine as a whole:
- 1. semantic search, which allows image retrieval independently of image content, for example, retrieval of images spanning different imaging modalities and scales.
- 2. automated annotation, which addresses the scaling problem in associating metadata with ever larger image collections by using ontological terms as categories and the images carrying them as sample data to train machine classifiers.
Metin Gurcan (Ohio State) and John Tomaszewski (University at Buffalo): How ontologies can help in addressing the big data challenges of pathology imaging
- Widespread availability of whole slide scanners has had a transformative effect on pathology, resulting in data that is "big" not only in terms of size but also in richness of information. Computational algorithms have been developed to tap into this data both to assist pathologists and to improve diagnosis, prognosis and treatment. Now, however, there is an urgent need to provide a set of terms and formal definitions necessary to characterize both the histopathological images and the algorithms that operate on them. In this light, we will present our ongoing work to create an ontology for histopathological imaging.
- Goals:
- 1. Describe the transformative role of digital pathology
- 2. Explain the big data challenges in clinical research and computational algorithms
- 3. Outline how ontologies can provide solutions to address the big data challenges of pathology imaging
13:00 Lunch
Session 2: Addressing cancer big data challenges with the Ontology for Biomedical Investigations (OBI)
13:45-16:30
Chris Stoeckert (Penn): Integration and alignment of ontologies for cancer metadata collection based on OBI
- tasks: address the challenge that cancer research is multidisciplinary and requires standard terminology from multiple domains.
- briefly describe OBI and show how it has been used for collecting clinical and -omic metadata highlighting relevance to cancer data and integration of other ontologies for that purpose.
Gully Burns (UCSD): Applying OBI to cancer pathways via Knowledge Engineering from Experimental Design (KEfED)
- addresses the challenge that much of what is known about cancer is only available in publications and must be extracted by text mining, together with the experimental basis for that knowledge.
- application of OBI as semantic base for text mining and knowledge engineering.
Christopher R. Kinsinger (NCI): Ontology Considerations for Cancer Proteomics
- NCI’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) conducts proteomic and genomic analysis on clinical tissue samples. While CPTAC has utilized controlled vocabularies for biospecimen resources, development and adoption of ontology tools for rapidly advancing technologies such as proteomics and genomics remain a challenge.
Jingshan Huang (University of South Alabama / ITCR): The OMIT ontology
Session 3: Cancer big data and the Ontology of Disease: Addressing Cancer Big Data Challenges
Wednesday, May 13 in Balcony A, Natcher, 9:00-12:00
9:00 Roles of Ontologies in Cancer Big Data
Larry Wright (NCI Enterprise Vocabulary Services): NCI Thesaurus and Enterprise Vocabulary Services: Some Paths and Lessons in Coding for Cancer Research
Lynn Schriml (University of Maryland Baltimore): The Human Disease Ontology: DO_cancer_slim - a unified representation of cancer disease terms
Ada Hamosh (Hopkins): OMIM
Olivier Bodenreider (NLM): Oncology in SNOMED CT
10:00 Break
10:15 The Role of Ontology in Cancer Big Data Use Cases
Lindsay Cowell (UT Southwestern): HPV and cervical cancer data in Electronic Health Records -- A Big Data challenge
- what types of data do we need to represent? HPV and cervical cancer and the Infectious Disease Ontology; challenges involved in keeping and using large collections of samples and of sample data
Raja Mazumder (George Washington University): The need for cancer disease ontology for pan-cancer data integration and analysis.
Susan Mockus (The Jackson Laboratory, Genomic Medicine): Using the Disease Ontology to Translate Pathology Reports to NGS Clinical Reports.
Peter Elkin (University at Buffalo): Ontology-based cancer biomarker discovery
11:15 Discussion and Session Wrap Up
- Ontology/vocabulary challenges, Use case challenges
- Challenges, Needs, Action Item Solutions, short term steps, long term goals
12:00 Lunch
Public Session: Cancer Big Data to Knowledge
13:00-15:30
Barry Smith (Chair)
Philip E. Bourne (NIH / ADDS): The NIH Big Data Strategy
Cathy Wu (University of Delaware / PRO): Ontology and the Precision Medicine Initiative: The Role of OBO Foundry Ontologies in Protein-Centric Cancer Knowledge Network Discovery
Mark Musen (Stanford / NCBO): CEDAR: Making it Easier to Use Ontologies to Author Clinical Metadata
Judith Blake (GO / Jackson Lab): The Impact of Ontologies on Comparative Genomics for Cancer: The Human-Mouse Connection
Warren Kibbe (NIH / NCI): TBD
Sponsors
- National Cancer Institute Center for Biomedical Informatics and Information Technology (CBIIT)
- National Center for Biomedical Ontology (NCBO)
- National Center for Ontological Research (NCOR)
- Center for Expanded Data Annotation and Retrieval (CEDAR)
Participants
will include:
- Carol Bean (NCBO / Stanford)
- Nancy Beck (Reagan-Udall Foundation)
- Judith Blake (GO / PRO / The Jackson Laboratory)
- Evan Bolton (NIH / NLM / NCBI)
- Jonathan Bona (PRO / Buffalo)
- Philip E. Bourne (NIH / ADDS)
- Gully Burns (Information Sciences Institute, University of Southern California)
- Kisha Coa (NIH/NCI)
- Sherri de Coronado (National Cancer Institute)
- Lindsay Cowell (UT Southwestern Medical Center)
- Rina Das (NIH/NIMHD)
- Valentina di Francesco (NIH/NHGRI)
- Rao L. Divi (NCI Division of Cancer Control and Population Sciences)
- Mary E. Dolan (The Jackson Laboratory)
- Peter Elkin (Department of Biomedical Informatics, University at Buffalo)
- Jianwen Fang (NIH/NCI)
- Gilberto Fragoso (National Cancer Institute)
- Gang Fu (NIH / NLM / NCBI)
- Ilya Goldberg (Image Informatics and Computational Biology Unit, National Institute on Aging)
- Sharmistha Ghosh-Janjigian (NIH/NCI)
- Metin Gurcan (College of Medicine, Ohio State University)
- Jingshan Huang (University of South Alabama / ITCR)
- Rebecca Jacobson (University of Pittsburgh / ITCR)
- Sonia B Jakowlew (BPRB/NCI)
- Guoqian Jiang (Mayo Clinic)
- Warren Kibbe (NIH / NCI / Disease Ontology)
- Christopher Kinsinger (NIH/NCI)
- Prasad Konka (NIH/NCI)
- Jerry Li (NIH / NCI)
- Zhengwu Lu (NIH/NCI)
- Diana Ma (Hippocampus Analytics)
- Hala R. Makhlouf (Cancer Diagnosis Program / NCI)
- Cheryl Marks (NIH/NCI)
- Anna Maria Masci (Duke)
- Raja Mazumder (Georgetown University / Protein Information Resource)
- Elvira Mitraka (University of Maryland, Baltimore)
- Susan Mockus (Jackson Laboratory for Genomic Medicine, Farmington, CT)
- Mark Musen (Stanford / National Center for Biomedical Ontology and Center for Expanded Data Annotation and Retrieval)
- Darren Natale (Georgetown University / Protein Ontology Consortium)
- Natsuko Miura (NIH/NCI)
- Lauren Neal (Booz Allen Hamilton)
- Miguel Ossandon (NCI/NIH)
- Lisa Paradis (NIH/NCI)
- Praveen Arany (NIH/NIDCR)
- Thomas Prince
- Thomas C. Radman (National Institute on Drug Abuse/NIH)
- Lynn Schriml (University of Maryland, Baltimore / Disease Ontology)
- Mukul Sherekar (NIH/NCI)
- Barry Smith (Buffalo / Open Biomedical Ontologies Foundry)
- Sriram Sridhar (Booz Allen Hamilton)
- Feng Tao (NIH/CSR)
- John Tomaszewski (Pathology and Anatomical Sciences, Buffalo)
- Eric Weitz (NIH/NCBI)
- Larry Wright (NIH/NCI)
- Tsung-Jung Wu (NIH/NCI)
- Cathy Wu (Delaware / Protein Ontology)
- Hong Yu (University of Massachusetts, Worcester)
- Wenjin J. Zheng (Center for Computational Biomedicine, University of Texas Health Science Center at Houston)