Knowledge graph
A governed network of identified entities, relationships, claims, semantics, provenance, and time designed to answer domain questions.
K · Knowledge Graphs deep dive
Use competency questions to model the relationships that tables and prompts cannot safely reconstruct on demand.
Core concepts
Knowledge graph: A governed network of identified entities, relationships, claims, semantics, provenance, and time designed to answer domain questions.
Ontology: A formal domain model of classes/concepts, properties, relationships, constraints, and potentially inference rules.
Property graph: Nodes and relationships with properties, commonly queried with Cypher or Gremlin; practical for traversal and application graphs.
RDF graph: Subject-predicate-object triples identified by IRIs, queried with SPARQL and supported by W3C semantic-web standards.
Competency question: A precise business question the graph must answer, used to justify the schema and test its usefulness.
Entity resolution: Determining which records refer to the same real-world entity, with explainable rules, confidence, and human correction.
Provenance: Who asserted or derived a claim, from what evidence, when, and with which method/version and confidence.
Inference: A new claim derived from asserted facts and governed rules; it must remain distinguishable from asserted claims and traceable to them.
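The provenance and inference concepts above can be made concrete with a small sketch. It is illustrative only: the `Claim` fields, the `infer_sourcing` rule, and predicate names such as `equivalentTo`, `stockedAt`, and `canBeSourcedFrom` are assumptions invented for this example, not part of any standard or product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    obj: str
    source: str                       # who asserted or derived the claim
    method: str                       # how it was produced, including a version
    confidence: float
    valid_from: date
    valid_to: Optional[date] = None   # None means "still valid"
    inferred: bool = False            # keeps derived claims distinguishable
    derived_from: tuple = ()          # the claims an inference rests on


def infer_sourcing(equivalence: Claim, stock: Claim) -> Optional[Claim]:
    """Hypothetical rule: if P equivalentTo Q and Q stockedAt W,
    derive P canBeSourcedFrom W. Returns None when the rule does not apply."""
    if equivalence.predicate != "equivalentTo" or stock.predicate != "stockedAt":
        return None
    if equivalence.obj != stock.subject:
        return None
    ends = [c.valid_to for c in (equivalence, stock) if c.valid_to is not None]
    return Claim(
        subject=equivalence.subject,
        predicate="canBeSourcedFrom",
        obj=stock.obj,
        source="rule-engine",
        method="substitution-rule/v1",
        # The derived claim is no stronger than its weakest input.
        confidence=min(equivalence.confidence, stock.confidence),
        # It is valid only while both inputs are valid.
        valid_from=max(equivalence.valid_from, stock.valid_from),
        valid_to=min(ends) if ends else None,
        inferred=True,
        derived_from=(equivalence, stock),
    )
```

Carrying `derived_from` on the inferred claim is one way to keep it traceable: a reviewer can walk back from the conclusion to the asserted facts and the rule version that produced it.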
Logistics example
| Competency question | Required relationships | Expected proof |
|---|---|---|
| Which closure threatens the highest-value commitment? | closure–segment–route–truck–shipment–customer–SLA | Answer traces every edge to source and valid time. |
| Can inventory be substituted? | product–equivalent–inventory–warehouse–contract–customer | Inference cites equivalence and contract policy. |
| Who may approve a hazardous-cargo reroute? | cargo–risk–policy–jurisdiction–role–person | Authorization reflects current role and policy version. |
The governed data foundation supplies canonical definitions, reference codes, semantic contracts, source facts, data quality, ownership, lineage, and permitted use.
The knowledge graph adds cross-domain identity, relationships, temporal claims, ontology constraints, graph authorization, competency questions, and inference.
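As a rough sketch of the first competency question in the table, the code below follows the closure-to-SLA chain and returns the edges that form the proof, keeping only edges valid on the query date. All identifiers, relationship names, source names, and dates are invented for illustration; a real graph store would express this as a query rather than a Python loop.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Edge:
    src: str
    rel: str
    dst: str
    source: str                       # system that asserted the edge
    valid_from: date
    valid_to: Optional[date] = None   # None means "still valid"

    def valid_on(self, day: date) -> bool:
        return self.valid_from <= day and (self.valid_to is None or day <= self.valid_to)


def trace(edges, start, rels, day):
    """Follow the relationship types in `rels`, in order, from `start`.
    Returns one list of edges (the proof) per complete path valid on `day`."""
    paths = [(start, [])]
    for rel in rels:
        paths = [
            (e.dst, proof + [e])
            for node, proof in paths
            for e in edges
            if e.src == node and e.rel == rel and e.valid_on(day)
        ]
    return [proof for _, proof in paths]


# Invented sample data: one closure, with an expired truck assignment (T11).
EDGES = [
    Edge("closure:C1", "blocks", "segment:S7", "roadworks-feed", date(2024, 5, 28), date(2024, 6, 5)),
    Edge("segment:S7", "partOf", "route:R3", "route-planner", date(2024, 1, 1)),
    Edge("route:R3", "assignedTo", "truck:T11", "dispatch", date(2024, 1, 1), date(2024, 5, 1)),
    Edge("route:R3", "assignedTo", "truck:T12", "dispatch", date(2024, 5, 2)),
    Edge("truck:T12", "carries", "shipment:SH9", "tms", date(2024, 5, 30)),
    Edge("shipment:SH9", "owedTo", "customer:ACME", "orders", date(2024, 5, 20)),
    Edge("customer:ACME", "coveredBy", "sla:GOLD", "contracts", date(2024, 1, 1)),
]
CHAIN = ["blocks", "partOf", "assignedTo", "carries", "owedTo", "coveredBy"]

for proof in trace(EDGES, "closure:C1", CHAIN, date(2024, 6, 1)):
    for e in proof:
        print(f"{e.src} -{e.rel}-> {e.dst}  [source={e.source}, from={e.valid_from}]")
```

Because each returned path is a list of edges rather than just an endpoint, the answer carries the source and valid time of every hop, which is the "expected proof" the table asks for. Ranking the affected commitments by value would need an additional attribute on the SLA or shipment, which this sketch omits.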
Technology map
| Need | Open / specialized | Databricks | Snowflake | Microsoft Fabric / Azure |
|---|---|---|---|---|
| Property graph | Neo4j, Amazon Neptune, JanusGraph, Memgraph | GraphFrames on Spark for graph processing; integrate a specialized graph database for transactional traversal | Model edges in tables or integrate a graph database/service; relational recursive queries are not a full graph platform | Azure Cosmos DB for Apache Gremlin or a specialized graph database; Fabric data as the governed source |
| RDF/ontology | Apache Jena, RDF4J, GraphDB, Stardog; RDF/OWL/SHACL/SPARQL | Store/process triples in lakehouse or integrate RDF store; Unity Catalog governs source assets | Store triples relationally or integrate semantic store; Semantic Views are not formal OWL ontologies | Fabric IQ Ontology (Preview); use external RDF tooling for standards/interoperability needs |
| Analytical semantics | dbt Semantic Layer, Cube, metric stores | Unity Catalog metric views | Semantic Views | Power BI semantic models |
| Validation | SHACL/ShEx, graph unit tests, constraint engines | Pipeline expectations plus graph-specific validation code | Data metric functions (DMFs) and constraints plus graph-specific validation | Ontology validation capabilities plus external SHACL where standards are required |
| Agent retrieval | GraphRAG patterns, hybrid graph/vector/search | Mosaic AI retrieval with governed graph-derived context | Cortex Search/Agents with governed graph-derived context | Fabric data agents/Foundry with ontology or graph-derived context |
Choose by question: semantic models standardize metrics; ontologies formalize domain meaning; knowledge graphs operationalize connected claims. They can complement one another but are not interchangeable labels.
Implementation
Write 5–10 competency questions and prove why joins, search, or the semantic layer alone are insufficient.
Define stable entity identifiers, source-of-truth rules, aliases, matching confidence, and correction workflow.
Create the smallest ontology/schema that answers the questions; define time, provenance, cardinality, constraints, and authority.
Map governed data to claims; preserve source, transformation, version, time, confidence, and asserted/inferred status.
Authorize nodes, relationships, attributes, queries, exports, and sensitive inferences; prevent cross-boundary leakage.
Validate constraints, golden queries, conflicts, stale facts, inference, migrations, performance, and unauthorized paths.
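The identity step above (stable identifiers, aliases, matching confidence, and a correction workflow) can be sketched as a small rule-based resolver. This is a minimal illustration under stated assumptions: the rule labels, point weights, threshold, and record fields are hypothetical, and production entity resolution usually needs blocking, richer similarity measures, and review queues.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Match:
    record_id: str
    entity_id: Optional[str]   # None means "unresolved, send to review"
    confidence: float
    reasons: tuple             # labels of the rules that fired


def norm(name: str) -> str:
    """Lowercase, drop dots and commas, collapse whitespace."""
    return " ".join(name.lower().replace(".", "").replace(",", "").split())


def same(field):
    """Build a test that fires when both sides share a non-empty field value."""
    return lambda r, e: bool(r.get(field)) and r.get(field) == e.get(field)


# Hypothetical rules: (label, points out of 100, test). Integer points avoid
# floating-point surprises when comparing a summed score with the threshold.
RULES = [
    ("same tax id", 70, same("tax_id")),
    ("same normalized name", 20, lambda r, e: norm(r["name"]) in {norm(a) for a in e["aliases"]}),
    ("same postal code", 10, same("postal_code")),
]


def resolve(record, entities, corrections, threshold=80):
    """Return the best match for `record`; a human correction always wins."""
    if record["id"] in corrections:
        return Match(record["id"], corrections[record["id"]], 1.0, ("human correction",))
    best = Match(record["id"], None, 0.0, ())
    for entity in entities:
        fired = [(label, pts) for label, pts, test in RULES if test(record, entity)]
        score = sum(pts for _, pts in fired)
        if score >= threshold and score / 100 > best.confidence:
            best = Match(record["id"], entity["id"], score / 100,
                         tuple(label for label, _ in fired))
    return best
```

Returning the list of fired rules alongside the confidence keeps each match explainable, and routing corrections through an explicit override table means a human decision is recorded rather than silently re-scored on the next run.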