MCBO: Mammalian Cell Bioprocessing Ontology

MCBO is a hub-and-spoke, IOF-anchored application ontology for mammalian cell bioprocessing and RNA-seq data curation. It builds on IOF process patterns and BFO foundations, with domain-specific extensions that reference OBO ontology classes for measurement, sequencing, and biological entities.

The ontology is designed to support:

  • RNA-seq analysis and gene expression integration

  • Culture condition optimization (temperature, pH, dissolved oxygen)

  • Product development for CHO and HEK293 cell bioprocessing

  • Integration of curated bioprocessing samples from published studies

Repository: https://github.com/lewiscelllabs/mcbo

New Term Request: To request a new term, open the repository's GitHub Issues page and select “MCBO Term Request”.

Citation

Please cite:

Robasky, K., Morrissey, J., Riedl, M., Dräger, A., Borth, N., Betenbaugh, M. J., & Lewis, N. E. (2025). MCBO: Mammalian Cell Bioprocessing Ontology, A Hub-and-Spoke, IOF-Anchored Application Ontology. ICBO-EAST 2025.

Authors:

  • Kimberly Robasky 1,* - University of Georgia

  • James Morrissey 2 - Johns Hopkins University

  • Markus Riedl 3 - BOKU University, Vienna

  • Andreas Dräger 4 - Martin Luther University Halle-Wittenberg

  • Nicole Borth 3 - BOKU University, Vienna

  • Michael J. Betenbaugh 2 - Johns Hopkins University

  • Nathan E. Lewis 1 - University of Georgia

* Corresponding author: kimberly.robasky@uga.edu

Evaluation Summary

MCBO has been evaluated with real curated bioprocessing data:

  • 724 cell culture process instances curated from published studies

  • 326 unique bioprocess samples across culture runs

  • Process breakdown: Batch (518), Fed-batch (135), Perfusion (49), Unknown (22)

  • 100% competency question coverage, with sub-second query times. Synthetic sample data is provided for demonstration, and curation of real data is still in progress.
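As a quick consistency check on the figures above, the per-mode counts can be summed and converted to shares of the 724 process instances. This is a minimal sketch: the counts are copied from the breakdown, while the names `PROCESS_COUNTS` and `process_shares` are illustrative and not part of the MCBO repository.

```python
# Counts copied from the process breakdown in the evaluation summary.
PROCESS_COUNTS = {"Batch": 518, "Fed-batch": 135, "Perfusion": 49, "Unknown": 22}


def process_shares(counts):
    """Return each process mode's share of all instances, in percent (1 decimal)."""
    total = sum(counts.values())
    return {mode: round(100 * n / total, 1) for mode, n in counts.items()}


total = sum(PROCESS_COUNTS.values())
shares = process_shares(PROCESS_COUNTS)

print(f"total instances: {total}")
for mode, pct in shares.items():
    print(f"{mode:<10} {PROCESS_COUNTS[mode]:>4}  {pct:5.1f}%")
```

The four counts add up to the reported 724 instances, with Batch runs making up roughly 71.5% of them.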

Competency Questions

MCBO is evaluated against 8 competency questions (CQs):

  1. CQ1: Under what culture conditions (pH, dissolved oxygen, temperature) do the cells reach peak recombinant protein productivity?

  2. CQ2: Which cell lines have been engineered to overexpress gene Y?

  3. CQ3: Which nutrient concentrations in cell line K are most associated with viable cell density above Z at day 6 of culture?

  4. CQ4: How does the expression of gene X vary between clone A and clone B?

  5. CQ5: What pathways are differentially expressed under Fed-batch vs Perfusion in cell line K?

  6. CQ6: Which are the top genes correlated with recombinant protein productivity in the stationary phase of all experiments?

  7. CQ7: Which genes have the highest fold change between cells with high viability (>90%) and those with low viability (<50%)?

  8. CQ8: Which cell lines or subclones are best suited for glycosylation profiles required for therapeutic protein X?

All 8 CQs have SPARQL query implementations in eval/queries/.
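To give a feel for how a competency question maps onto SPARQL, the sketch below phrases a CQ2-style query ("Which cell lines have been engineered to overexpress gene Y?") and runs it with rdflib against a toy in-memory graph. It is not the query shipped in eval/queries/. Every IRI under `http://example.org/mcbo-sketch#`, the property `ex:overexpressesGene`, and the sample cell lines and gene are hypothetical placeholders rather than MCBO terms or curated data.

```python
# Hypothetical vocabulary: the ex: namespace and its terms are placeholders,
# NOT actual MCBO classes or properties.
CQ2_QUERY = """
PREFIX ex: <http://example.org/mcbo-sketch#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cellLineLabel WHERE {
    ?cellLine a ex:CellLine ;
              rdfs:label ?cellLineLabel ;
              ex:overexpressesGene ?gene .
    ?gene rdfs:label ?geneSymbol .
}
ORDER BY ?cellLineLabel
"""


def run_cq2(graph, gene_symbol):
    """Return labels of cell lines that overexpress the gene with this label."""
    from rdflib import Literal  # imported lazily; requires `pip install rdflib`

    # Bind ?geneSymbol up front instead of splicing text into the query string.
    rows = graph.query(CQ2_QUERY, initBindings={"geneSymbol": Literal(gene_symbol)})
    return [str(row.cellLineLabel) for row in rows]


def demo():
    """Build a toy graph (made-up data) and answer the CQ2-style question."""
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, RDFS

    ex = Namespace("http://example.org/mcbo-sketch#")
    g = Graph()

    # One gene, one engineered line that overexpresses it, one line that does not.
    g.add((ex.geneGS, RDFS.label, Literal("GS")))
    g.add((ex.lineA, RDF.type, ex.CellLine))
    g.add((ex.lineA, RDFS.label, Literal("CHO-K1 clone A")))
    g.add((ex.lineA, ex.overexpressesGene, ex.geneGS))
    g.add((ex.lineB, RDF.type, ex.CellLine))
    g.add((ex.lineB, RDFS.label, Literal("HEK293 parental")))

    return run_cq2(g, "GS")
```

With rdflib installed, calling `demo()` returns only the label of the line linked to the gene. Passing the gene symbol through `initBindings` keeps the query text fixed, so the same query string can be reused for any "gene Y".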

License

MCBO is released under the MIT License. See the LICENSE file for details.
