API Reference
This section documents the MCBO Python package API.
Installation
pip install -e python/
After installation, import the package:
from mcbo import (
    # Namespaces
    MCBO, OBO, RDF, RDFS, XSD,
    RO_HAS_PARTICIPANT, RO_HAS_QUALITY, BFO_HAS_PART,
    # Graph utilities
    create_graph, load_graph, load_graphs,
    iri_safe, safe_numeric,
    ensure_dir, ensure_parent_dir,
    # CSV conversion
    convert_csv_to_rdf,
    load_expression_matrix,
    add_expression_data,
)
mcbo.namespaces
RDF namespace definitions used throughout MCBO.
- MCBO: rdflib.Namespace
The MCBO ontology namespace:
http://example.org/mcbo#
- OBO: rdflib.Namespace
The OBO Foundry namespace:
http://purl.obolibrary.org/obo/
- RO_HAS_PARTICIPANT: rdflib.URIRef
Relation Ontology “has participant” (RO_0000057)
- RO_HAS_QUALITY: rdflib.URIRef
Relation Ontology “has quality” (RO_0000086)
- BFO_HAS_PART: rdflib.URIRef
BFO “has part” relation (BFO_0000051)
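As a rough illustration of how the relation constants relate to the OBO base IRI, the sketch below rebuilds them with plain string concatenation. It assumes each constant is the OBO namespace followed by the term ID listed above; the package itself exposes rdflib objects, and the obo_term helper is hypothetical.

```python
# Sketch only: plain strings stand in for rdflib.Namespace / rdflib.URIRef.
OBO_BASE = "http://purl.obolibrary.org/obo/"  # OBO namespace from the list above


def obo_term(term_id: str) -> str:
    """Hypothetical helper: expand an OBO term ID to a full IRI string."""
    return OBO_BASE + term_id


# Assumed expansions of the three relation constants documented above.
RELATIONS = {
    "RO_HAS_PARTICIPANT": obo_term("RO_0000057"),
    "RO_HAS_QUALITY": obo_term("RO_0000086"),
    "BFO_HAS_PART": obo_term("BFO_0000051"),
}

for name, iri in RELATIONS.items():
    print(f"{name} -> {iri}")
```

With rdflib, indexing a Namespace (for example OBO["RO_0000057"]) performs the same concatenation and returns a URIRef.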
mcbo.graph_utils
Utilities for creating and loading RDF graphs.
- create_graph() -> rdflib.Graph
Create a new RDF graph with MCBO namespaces pre-bound.
- Returns:
An empty rdflib Graph with standard namespace bindings
- load_graph(path: Path) -> rdflib.Graph
Load an RDF graph from a Turtle file.
- Parameters:
path – Path to the TTL file
- Returns:
The loaded rdflib Graph
- load_graphs(*paths: Path) -> rdflib.Graph
Load and merge multiple RDF graphs.
- Parameters:
paths – Paths to TTL files to merge
- Returns:
A single merged rdflib Graph
- iri_safe(text: str) -> str
Convert text to an IRI-safe identifier.
Replaces spaces and special characters with underscores.
- Parameters:
text – Input text
- Returns:
IRI-safe string
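The replacement rule described for iri_safe can be approximated as below. This is a minimal sketch, not the package's implementation: it assumes that ASCII letters, digits, underscores and hyphens are kept and that every other character becomes a single underscore. The real function may handle hyphens, Unicode, or runs of special characters differently.

```python
import re


def iri_safe_sketch(text: str) -> str:
    """Hypothetical stand-in for iri_safe: keep ASCII letters, digits,
    underscores and hyphens; replace every other character with "_"."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", text.strip())


print(iri_safe_sketch("CHO K1 (clone 3)"))  # CHO_K1__clone_3_
print(iri_safe_sketch("fed-batch"))         # fed-batch
```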
- safe_numeric(value, default=None)
Safely convert a value to a number.
- Parameters:
value – Value to convert (string, int, float, or None)
default – Default value if conversion fails
- Returns:
Converted number or default
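A conversion with a fallback, as described for safe_numeric, might look like the sketch below. It assumes the result is always a float and that None, blank strings, and unparseable text all yield the default; the package may return ints or treat edge cases differently.

```python
def safe_numeric_sketch(value, default=None):
    """Hypothetical stand-in for safe_numeric: return a float, or `default`
    when the value is None, blank, or not parseable as a number."""
    if value is None:
        return default
    try:
        return float(str(value).strip())
    except ValueError:
        return default


print(safe_numeric_sketch("37.5"))            # parsed as a float
print(safe_numeric_sketch("n/a"))             # falls back to None
print(safe_numeric_sketch(None, default=0))   # falls back to 0
```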
- ensure_dir(path: Path) -> Path
Ensure a directory exists, creating it if necessary.
- Parameters:
path – Directory path
- Returns:
The path (for chaining)
- ensure_parent_dir(path: Path) -> Path
Ensure the parent directory of a file path exists.
- Parameters:
path – File path
- Returns:
The path (for chaining)
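Returning the path lets both directory helpers be used inline where an output location is built. The sketch below shows that pattern with pathlib; the two functions are hypothetical stand-ins that assume the helpers wrap Path.mkdir with parents=True and exist_ok=True.

```python
import tempfile
from pathlib import Path


def ensure_dir_sketch(path: Path) -> Path:
    """Hypothetical stand-in for ensure_dir: create the directory if missing."""
    path.mkdir(parents=True, exist_ok=True)
    return path


def ensure_parent_dir_sketch(path: Path) -> Path:
    """Hypothetical stand-in for ensure_parent_dir: create the file's parent."""
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Chaining: the returned path is used directly as the output file.
    out_file = ensure_parent_dir_sketch(Path(tmp) / "reports" / "run1" / "stats.ttl")
    out_file.write_text("# placeholder\n")
    parent_created = out_file.parent.is_dir()
    file_written = out_file.is_file()

print(parent_created, file_written)
```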
mcbo.csv_to_rdf
CSV to RDF conversion for bioprocessing metadata.
- convert_csv_to_rdf(csv_file: str, output_file: str, expression_matrix: str = None, expression_dir: str = None) -> rdflib.Graph
Convert a CSV metadata file to RDF instances.
- Parameters:
csv_file – Path to input CSV file
output_file – Path for output TTL file
expression_matrix – Optional path to expression matrix CSV
expression_dir – Optional directory with per-study expression CSVs
- Returns:
The generated rdflib Graph
- load_expression_matrix(path: Path) -> pandas.DataFrame
Load a gene expression matrix from CSV.
The CSV should have SampleAccession as the first column and gene symbols as remaining columns.
- Parameters:
path – Path to expression matrix CSV
- Returns:
DataFrame with expression data
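The expected matrix layout can be illustrated with a small in-memory CSV. The sketch below uses the standard csv module rather than pandas, so it shows only the file shape and not what load_expression_matrix returns. The sample accessions and values are invented, and skipping blank cells is an assumption, since the documentation does not say how missing values are handled.

```python
import csv
import io

# Invented data in the documented layout: SampleAccession first, then genes.
MATRIX_CSV = """SampleAccession,GAPDH,ACTB
SAMPLE_A,120.5,98.0
SAMPLE_B,87.25,
"""


def read_matrix_sketch(handle):
    """Return {sample accession: {gene symbol: value}}, skipping blank cells."""
    reader = csv.DictReader(handle)
    matrix = {}
    for row in reader:
        accession = row.pop("SampleAccession")
        matrix[accession] = {
            gene: float(cell) for gene, cell in row.items() if cell and cell.strip()
        }
    return matrix


matrix = read_matrix_sketch(io.StringIO(MATRIX_CSV))
# Each (sample, gene) pair corresponds to one expression measurement.
for accession, genes in matrix.items():
    for gene, value in genes.items():
        print(accession, gene, value)
```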
- add_expression_data(graph: rdflib.Graph, sample_uri: rdflib.URIRef, expression_df: pandas.DataFrame, sample_accession: str)
Add gene expression measurements to a sample in the graph.
Creates GeneExpressionMeasurement instances for each gene-sample pair.
- Parameters:
graph – The RDF graph to add to
sample_uri – URI of the sample
expression_df – DataFrame with expression data
sample_accession – Sample accession ID to look up in the DataFrame
Usage Examples
Creating a Graph from Scratch
from rdflib import Literal
from mcbo import create_graph, MCBO, RDF, RDFS
# Create a new graph with MCBO namespaces
g = create_graph()
# Add a cell line instance
cell_line = MCBO["CHO-K1"]
g.add((cell_line, RDF.type, MCBO.CellLine))
g.add((cell_line, RDFS.label, Literal("CHO-K1")))
# Serialize to file
g.serialize("my_instances.ttl", format="turtle")
Loading and Querying Graphs
from mcbo import load_graph
from pathlib import Path
# Load evaluation graph
g = load_graph(Path("data.sample/graph.ttl"))
# Run a SPARQL query
query = """
PREFIX mcbo: <http://example.org/mcbo#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?process ?type WHERE {
    ?process a ?type .
    ?type rdfs:subClassOf* mcbo:CellCultureProcess .
}
"""
results = g.query(query)
for row in results:
    print(f"{row.process} is a {row.type}")
Converting CSV to RDF
from mcbo import convert_csv_to_rdf
# Convert metadata with expression data
g = convert_csv_to_rdf(
    csv_file=".data/sample_metadata.csv",
    output_file=".data/mcbo-instances.ttl",
    expression_dir=".data/expression/",
)
print(f"Generated {len(g)} triples")
Module Reference
For complete API documentation, see the source code in python/mcbo/:
- namespaces.py - RDF namespace definitions
- graph_utils.py - Graph loading/creation utilities
- csv_to_rdf.py - CSV-to-RDF conversion logic
- build_graph.py - Graph building CLI
- run_eval.py - SPARQL evaluation
- stats_eval_graph.py - Statistics generation