Inputs¶
Ontobio is designed to work with either local files or with remote information accessed via Services.
Access is generally mediated using a factory object. The client requests an ontology via a handle to the factory, and the factory will return with the relevant implementation instantiated.
Local JSON ontology files¶
You can load an ontology from disk (or a URL) that conforms to the obographs JSON standard.
Command line example:
ogr.py -r path/to/my/file.json
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("/path/to/my/file.json")
Local OWL and OBO-Format files¶
Requirement: OWLTools
Command line example:
ogr.py -r path/to/my/file.owl
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("/path/to/my/file.owl")
Local SKOS RDF Files¶
SKOS is an RDF data model for representing thesauri and terminologies.
See the SKOS primer for more details.
Command line example:
ogr.py -r path/to/my/skosfile.ttl
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("skos:/path/to/my/skosfile.ttl")
Remote SPARQL ontology access¶
The default SPARQL service used is the OntoBee one, which provides access to all OBO library ontologies
Warning
May change in future
Command line example:
ogr.py -r cl
Note that the official OBO library prefix must be used, e.g. cl
,
go
, hp
. See http://obofoundry.org/
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("cl")
Remote SciGraph ontology access¶
Warning
Experimental
Command line example:
ogr.py -r scigraph:ontology
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("scigraph:ontology")
Warning
Since SciGraph contains multiple graphs interwoven together, care must be taken on queries that don’t use relationship types, as ancestor/descendant lists may be large
Local GAF or GPAD association files¶
The ontobio.AssociationSet
class provides a lightweight way
of storing sets of associations.
Code example: parse all associations from a GAF, and filter according to provider:
p = GafParser()
assocs = p.parse(open(POMBASE,"r"))
pombase_assocs = [a for a in assocs if a['provided_by'] == 'UniProt']
Code example, creating AssociationSet objects, using an AssociationSetFactory
afactory = AssociationSetFactory()
aset = afactory.create_from_file(file=args.assocfile,ontology=ont)
Remote association access via GOlr¶
GOlr is the name given to the Solr instance used by the Gene Ontology and Planteome projects. This has been generalized for use with the Monarch Initiative project.
GOlr provides fast access and faceted search on top of Associations (see the Basic Concepts section for more on the concept of associations). Ontobio provides both a transparent facade over GOlr, and also direct access to advanced queries.
By default an eager loading strategy is used: given a set of query criteria (minimally, subject and object categories plus a taxon, but optionally including evidence etc), all asserted pairwise associations are loaded into an association set. E.g.
aset = afactory.create(ontology=ont,
subject_category='gene',
object_category='function',
taxon=MOUSE)
Additionally, this is cached so future calls will not invoke the service overhead.
For performing advanced analytic queries over the complete GOlr
database, see the GolrAssociationQuery
class. TODO
provide examples.
Remote association access via wikidata¶
TODO
Use of caching¶
When using remote services to access ontology or association set objects, caching is automatically used to avoid repeated access. Currently an eager strategy is used, in which large blocks are fetched in advance, though in future lazy strategies are optionally employed.
To be implemented¶
- Remote access to SciGraph/Neo4J
- Remote access to Chado databases
- Remote access to Knowledge Beacons