Dairy farming is being intensively computerized, with the goal of using the recorded
data to optimize production processes. This requires extensive analytics, which in turn
requires a good understanding of the data. The datasets must also be federated to
obtain an integrated view. Although conventional database tools are helpful in that
process, linked data and ontologies are believed to provide seamless integration
of different sources while adding a semantic layer that allows deeper introspection of
the data. The objective was to build an ontology that provides such a semantic layer for
dairy herd improvement (DHI) data.
A large dataset of milk production data was provided by Lactanet, the Canadian Network
for Dairy Excellence. Such data is typically heterogeneous, i.e., it covers, partially or
fully, health, nutrition, yield, and genetics. It also has a complex structure,
with a large variety of data for a single animal dispersed across many records and multiple
tables. A dedicated domain ontology, referred to as the Dairy Cattle Performance
Ontology (DCPO), was built from a semantic analysis of the datasets. The initial core
set of entities was determined using the definitions and minimal attribute sets for traits
provided by ICAR guidelines and CDN documents. This core was gradually enriched
with lower-level entities and aligned to more abstract concepts from the Basic Formal
Ontology (BFO) to provide a foundational theory. The process was validated by domain
experts. DCPO provides a rich and extensible data schema and a vocabulary based on
international standards to support stakeholder collaboration. It federates external data
sources and provides a semantic interface for querying the resulting integrated linked
data. Finally, DCPO underlies a knowledge base supporting analytics and decision
making. Preliminary evaluation followed a query-based approach: to assess the practical
usability of DCPO, SPARQL queries were designed to reflect typical questions that
experts might ask.
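The query-based evaluation can be pictured with a minimal sketch. It does not use the actual DCPO vocabulary or data: every `ex:` name, the sample triples, and the 30 kg threshold are assumptions made for illustration. A small pure-Python matcher stands in for a SPARQL engine and evaluates a basic graph pattern against in-memory triples.

```python
# Toy linked-data store of (subject, predicate, object) triples.
# All ex: names and values are hypothetical, not actual DCPO terms or Lactanet data.
TRIPLES = [
    ("ex:cow1", "rdf:type", "ex:Cow"),
    ("ex:cow2", "rdf:type", "ex:Cow"),
    ("ex:test1", "ex:recordedFor", "ex:cow1"),
    ("ex:test1", "ex:milkYieldKg", 31.5),
    ("ex:test2", "ex:recordedFor", "ex:cow2"),
    ("ex:test2", "ex:milkYieldKg", 24.0),
]

# The question "which cows had a test-day yield above 30 kg?" would read
# roughly as follows in SPARQL (prefix declarations omitted):
#   SELECT ?cow WHERE {
#     ?test ex:recordedFor ?cow ; ex:milkYieldKg ?yield .
#     ?cow a ex:Cow .
#     FILTER(?yield > 30)
#   }
QUERY = [
    ("?test", "ex:recordedFor", "?cow"),
    ("?test", "ex:milkYieldKg", "?yield"),
    ("?cow", "rdf:type", "ex:Cow"),
]


def is_var(term):
    """A pattern term is a variable when it is a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, v) != v:
                return None  # variable already bound to a different value
        elif p != v:
            return None  # constant term does not match
    return new


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = unify(patterns[0], triple, binding)
        if extended is not None:
            yield from match(patterns[1:], triples, extended)


# Apply the FILTER step on the bindings produced by the graph pattern.
high_yield = sorted(b["?cow"] for b in match(QUERY, TRIPLES) if b["?yield"] > 30)
print(high_yield)
```

In practice such a question would be sent to a SPARQL endpoint over the integrated linked data; the matcher above only shows how a triple pattern binds variables across records that were originally spread over several tables.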
Mining structural regularities, or patterns, in data may lead an expert to discover
unknown phenomena or to confirm an already formulated hypothesis. The benefit of
using DCPO as the vocabulary for patterns is that seemingly unrelated yet isomorphic
sub-graphs in the data, with diverging vertex and edge labels, become identical
once their labels are generalized to DCPO classes and properties. A further benefit
is that the patterns are described in the language of domain experts, which increases
their interpretability. Next, we plan to use the ontology to support the deep learning-based
inference of predictive models for milk production.
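The label generalization described above can be sketched in a few lines. This is an illustration under stated assumptions, not the mining method used with DCPO: the source labels, the ontology-level names in `TO_ONTOLOGY`, and the two sub-graphs are all hypothetical, and each vertex is represented only by its label, so comparing edge sets stands in for a full isomorphism check.

```python
# Hypothetical mapping from source-specific labels to ontology-level terms.
# None of these names are taken from DCPO itself.
TO_ONTOLOGY = {
    "holstein_cow": "Cow",
    "jersey_cow": "Cow",
    "am_milking": "MilkingEvent",
    "pm_milking": "MilkingEvent",
    "had_am_test": "participatesIn",
    "had_pm_test": "participatesIn",
    "am_fat_pct": "hasFatPercentage",
    "pm_fat_pct": "hasFatPercentage",
    "am_fat_value": "Measurement",
    "pm_fat_value": "Measurement",
}

# Two sub-graphs with the same shape but diverging vertex and edge labels.
# Each edge is (source vertex label, edge label, target vertex label).
graph_a = {
    ("holstein_cow", "had_am_test", "am_milking"),
    ("am_milking", "am_fat_pct", "am_fat_value"),
}
graph_b = {
    ("jersey_cow", "had_pm_test", "pm_milking"),
    ("pm_milking", "pm_fat_pct", "pm_fat_value"),
}


def generalize(graph, mapping):
    """Replace every label by its ontology-level term (unknown labels are kept)."""
    return {tuple(mapping.get(label, label) for label in edge) for edge in graph}


print(graph_a == graph_b)  # the raw labels differ
print(generalize(graph_a, TO_ONTOLOGY) == generalize(graph_b, TO_ONTOLOGY))
```

After generalization both sub-graphs reduce to the same two edges, so a pattern miner would count them as two occurrences of one pattern, expressed in terms a domain expert can read directly.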
Description
DCPO is an ontology for representing dairy farming processes, the cattle performance indicators used to improve them and the associated recording procedures.
Initially created on
November 27, 2024.
For additional information, contact
Victor Fuentes (fuentes.victor_eduardo@courrier.uqam.ca).
DCPO uses a three-tiered semantic architecture, with BFO as the foundational or upper-level ontology, CCO as the core or mid-level ontologies, and DCPO as the domain-level ontology, which derives its semantics from the upper levels.
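A minimal sketch of how a domain class might inherit from the upper tiers is given below. The class names are placeholders chosen for illustration: `dcpo:MilkRecording` and its placement under `cco:Act` are assumptions, not taken from the DCPO files, and only the BFO chain from process to occurrent to entity reflects the published upper-level hierarchy.

```python
# Direct superclass of each class, written as prefix:name.
# The dcpo: and cco: entries are hypothetical placeholders.
SUBCLASS_OF = {
    "dcpo:MilkRecording": "cco:Act",   # assumed domain-level class
    "cco:Act": "bfo:process",          # assumed mid-level anchor
    "bfo:process": "bfo:occurrent",    # upper-level (BFO)
    "bfo:occurrent": "bfo:entity",
}


def ancestors(cls):
    """Return the chain of superclasses from cls up to the root."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def tier(cls):
    """Return the ontology prefix, used here as a stand-in for the tier."""
    return cls.split(":", 1)[0]


chain = ancestors("dcpo:MilkRecording")
print(chain)
print([tier(c) for c in chain])
```

Walking the chain shows the layering: a domain-level class reaches a mid-level class first and then the upper-level classes, which is the sense in which the domain ontology derives its semantics from the levels above it.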