True Data Fabric vs. Data Lakehouse
This document provides a technically detailed comparison between LegalFab’s Knowledge Fabric and Microsoft Fabric, including Microsoft’s November 2025 Fabric IQ announcement. The core argument is architectural. LegalFab implements a true data fabric pattern: it keeps data in place, federates queries across sources, and builds intelligence through a persistent knowledge graph with built-in entity resolution. Microsoft Fabric implements a data lakehouse pattern that centralizes data into OneLake. Fabric IQ, Microsoft’s recent semantic intelligence layer, adds ontology and graph capabilities on top of this lakehouse but does not change the underlying architecture: data still moves to OneLake, and entity resolution still requires third-party tools.
Key Distinction: Architecture Defines Capability
Microsoft does not primarily market Fabric as a “data fabric.” Their own documentation positions it as a “unified analytics platform” built on a lakehouse architecture. The name “Fabric” is a product brand, not an architectural descriptor. The November 2025 Fabric IQ announcement adds a semantic layer for AI agent grounding, but the foundational pattern remains: data is ingested into OneLake rather than federated in place. LegalFab, by contrast, implements the data fabric pattern as defined by Gartner: automated metadata management, knowledge graph–driven intelligence, data virtualization, and federated governance across distributed sources.
Microsoft Fabric is a SaaS analytics platform built on a data lakehouse architecture. Its foundational storage layer, OneLake, is logically unified but physically backed by Azure Data Lake Storage Gen2 (ADLS Gen2). Tabular data in OneLake is stored in Delta Lake format (Parquet files with ACID transaction logs). Multiple compute engines, namely Spark, T-SQL (Synapse), and Analysis Services (Power BI), read from the same physical OneLake copy, avoiding inter-engine duplication. This is engine virtualization, not source data virtualization.
The primary data integration pattern is ETL/ELT via Data Factory (the Azure Data Factory engine rebadged inside Fabric) and Dataflows Gen2. Data is extracted from source systems, transformed, and loaded into OneLake in Delta Parquet format. This is traditional data movement — data leaves its source system and is copied into a centralized repository.
LegalFab’s Knowledge Fabric is a metadata-driven integration layer that implements the data fabric pattern. Source data remains in its original systems. The Knowledge Fabric stores only derived metadata: entity resolution results, cross-system relationships, annotations, insights, and temporal snapshots. Actual records (documents, transaction records, communications, files) stay in their source systems and are accessed at runtime.
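To make the "metadata only" idea concrete, here is a minimal sketch under stated assumptions. The names SourceRef, ResolvedEntity, hydrate, and the in-memory FAKE_SOURCES are hypothetical and are not LegalFab's API. The point is only that the fabric side holds identifiers and pointers, while record bodies are fetched from the source system at query time.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SourceRef:
    """Pointer to a record that stays in its source system (hypothetical shape)."""
    system: str      # e.g. "crm", "dms"
    record_id: str   # the source system's own identifier


@dataclass
class ResolvedEntity:
    """What the fabric side persists: identity and pointers, not record bodies."""
    entity_id: str
    entity_type: str
    sources: List[SourceRef] = field(default_factory=list)


def hydrate(entity: ResolvedEntity,
            fetch_record: Callable[[SourceRef], dict]) -> List[dict]:
    """Fetch the live records at query time; nothing is copied into the fabric."""
    return [fetch_record(ref) for ref in entity.sources]


# Stand-in source systems, for illustration only.
FAKE_SOURCES: Dict[str, Dict[str, dict]] = {
    "crm": {"c-17": {"name": "John Smith", "city": "Leeds"}},
    "dms": {"d-90": {"name": "J. Smith", "matter": "M-2041"}},
}


def fetch_from_fake_source(ref: SourceRef) -> dict:
    """Look up a record in the stand-in source systems."""
    return FAKE_SOURCES[ref.system][ref.record_id]


person = ResolvedEntity(
    entity_id="person-001",
    entity_type="Person",
    sources=[SourceRef("crm", "c-17"), SourceRef("dms", "d-90")],
)
records = hydrate(person, fetch_from_fake_source)
```

In this sketch the ResolvedEntity carries no name, address, or document content of its own. Deleting a record at the source would make hydrate fail for that pointer, which is the expected trade-off of leaving data in place.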
At the center of the Knowledge Fabric is a persistent knowledge graph that provides corporate memory with schema-bounded extraction and source provenance tracking. The graph models entities (Persons, Organizations, Matters, Documents, Addresses, Identifiers) and relationships (OWNS, CONTROLS, RELATED_TO, EMPLOYS, REPRESENTS, LOCATED_AT, HAS_IDENTIFIER). This is not a metadata catalog — it is an active intelligence layer that resolves entities across sources, tracks lineage, and enables graph traversal for investigative queries.
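As an illustration of the kind of graph traversal described above, the following sketch stores relationships as (source, relationship, target) triples and runs a breadth-first search restricted to chosen relationship types. The entity names and the find_path helper are made up for the example. Only the relationship labels (OWNS, CONTROLS, and so on) come from the text, and a production graph store would look quite different.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# (source, relationship, target) triples; all entity names are invented.
EDGES: List[Triple] = [
    ("Person:Alice", "CONTROLS", "Org:HoldCo"),
    ("Org:HoldCo", "OWNS", "Org:OpCo"),
    ("Org:OpCo", "LOCATED_AT", "Address:1 High St"),
    ("Person:Bob", "REPRESENTS", "Org:OpCo"),
]


def build_index(edges: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Index outgoing edges by source node."""
    index: Dict[str, List[Tuple[str, str]]] = {}
    for src, rel, dst in edges:
        index.setdefault(src, []).append((rel, dst))
    return index


def find_path(index: Dict[str, List[Tuple[str, str]]],
              start: str, goal: str,
              allowed: Set[str]) -> Optional[List[Triple]]:
    """Breadth-first search that follows only the allowed relationship types.

    Returns the list of hops from start to goal, or None if no path exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, dst in index.get(node, []):
            if rel in allowed and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(node, rel, dst)]))
    return None


index = build_index(EDGES)
ownership_path = find_path(index, "Person:Alice", "Org:OpCo", {"OWNS", "CONTROLS"})
```

Here ownership_path contains two hops, Alice CONTROLS HoldCo and HoldCo OWNS OpCo, which is the shape of an indirect-ownership question in a due diligence investigation.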
Microsoft Fabric has no equivalent. Its metadata management relies on Purview — a separate Azure service, not built into Fabric’s core — that provides catalog-based governance: tagging, classification, sensitivity labels, and lineage tracking. Purview is a metadata catalog with governance workflows. It does not perform entity resolution, does not build graph-based relationships across data sources, and does not provide the semantic intelligence layer that a knowledge graph delivers.
LegalFab integrates with 200+ data sources via Model Context Protocol (MCP) connectors supporting federated queries across databases (PostgreSQL, MySQL), SaaS platforms (Salesforce, HubSpot, Slack, Gmail), cloud storage (AWS S3, Azure Blob), and legal-specific systems (Aderant, Elite, Clio).
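The federated query pattern can be sketched roughly as a fan-out: the same search is sent to each connected source, and results come back tagged with their origin instead of being copied into a central store first. This is an assumption-laden illustration. The connector functions, source names, and sample rows below are stand-ins and do not represent the Model Context Protocol or any LegalFab interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Connector = Callable[[str], List[dict]]


def make_connector(rows: List[dict]) -> Connector:
    """Build a stand-in connector that searches an in-memory list by name."""
    def search(term: str) -> List[dict]:
        return [row for row in rows if term.lower() in row["name"].lower()]
    return search


# Hypothetical sources with invented sample data.
CONNECTORS: Dict[str, Connector] = {
    "postgres": make_connector([{"name": "Acme Ltd", "balance": 1200}]),
    "salesforce": make_connector([
        {"name": "ACME Limited", "owner": "jdoe"},
        {"name": "Globex"},
    ]),
}


def federated_search(term: str, connectors: Dict[str, Connector]) -> List[dict]:
    """Query every source concurrently and tag each hit with its provenance."""
    def run(item):
        source, search = item
        return [{"source": source, **row} for row in search(term)]

    ordered = sorted(connectors.items(), key=lambda kv: kv[0])
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(run, ordered))
    return [row for batch in batches for row in batch]


hits = federated_search("acme", CONNECTORS)
```

Each hit keeps a source field, so a later entity resolution step can decide whether "Acme Ltd" and "ACME Limited" are the same organization without either record leaving its system.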
LegalFab’s Knowledge Fabric integrates natively with open-source intelligence (OSINT) sources to support its AML compliance and due diligence capabilities. This includes: corporate registries (Companies House, SEC EDGAR), beneficial ownership databases (PSC registers, UBO databases), sanctions lists (OFAC, OFSI, UN, EU), PEP (Politically Exposed Persons) databases, adverse media monitoring, court records and legal databases, identity verification providers, and credit bureaus.
These external sources are integrated into the knowledge graph through the entity resolution engine, which performs blocking, matching, and clustering. A query about a person or organization doesn’t just search internal data — it traverses the graph across all connected OSINT sources, resolving identities and surfacing cross-source relationships that would be invisible in a centralized data store.
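The three stages named above (blocking, matching, clustering) can be sketched in a few lines. This is a toy version under simple assumptions, not LegalFab's engine: blocking uses the postcode as the key, matching uses a string-similarity threshold from Python's standard difflib, and clustering uses union-find. The records, the 0.85 threshold, and the function names are all illustrative.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List

# Invented records from different sources.
RECORDS = [
    {"id": "A1", "name": "John Smith", "postcode": "LS1 4AP"},
    {"id": "B7", "name": "Jon Smith", "postcode": "LS1 4AP"},
    {"id": "C3", "name": "Jane Doe", "postcode": "LS1 4AP"},
    {"id": "D9", "name": "John Smith", "postcode": "M1 1AE"},
]


def block_key(record: dict) -> str:
    """Blocking: only records that share this key are ever compared."""
    return record["postcode"]


def is_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Matching: case-insensitive name similarity against a threshold."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def resolve(records: List[dict]) -> List[List[str]]:
    """Run blocking, pairwise matching, then union-find clustering."""
    blocks: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)

    parent = {record["id"]: record["id"] for record in records}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for block in blocks.values():
        for a, b in combinations(block, 2):
            if is_match(a, b):
                parent[find(a["id"])] = find(b["id"])

    clusters = defaultdict(set)
    for record in records:
        clusters[find(record["id"])].add(record["id"])
    return sorted(sorted(cluster) for cluster in clusters.values())


resolved = resolve(RECORDS)
```

With this data, A1 and B7 end up in one cluster while D9 stays separate even though its name is identical to A1, because it falls in a different block. That illustrates the usual blocking trade-off: fewer comparisons at the cost of missed matches when the blocking key disagrees.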
Microsoft Fabric lacks native OSINT capabilities. It is a general-purpose analytics platform — it does not integrate with regulatory registries, sanctions lists, or beneficial ownership databases. Any such integration would require custom ETL pipelines to ingest this data into OneLake, which means copying sensitive regulatory data into yet another location and losing the real-time, federated query capability that OSINT investigations require.
Status: Public Preview — No GA Date Announced
Fabric IQ was announced at Microsoft Ignite in November 2025 and entered public preview on November 19, 2025. As of February 2026, it has no announced general availability date. Organizations evaluating this capability should expect breaking changes and evolving feature scope. Production deployments carry the inherent risk of preview-stage software.
Fabric IQ is a semantic intelligence layer added on top of the existing Fabric lakehouse. It introduces two components: an Ontology (entity type definitions, relationship types, business rules, and constraints) and a Fabric Graph (a graph-compute layer for traversals and algorithms). Microsoft positions Fabric IQ as the foundation for “agentic AI” — grounding Copilot and custom AI agents in business-specific semantics so they can reason about concepts like “Customer” or “Order” rather than raw table names.
This is architecturally important to understand: Fabric IQ is an analytical capability layered on top of the lakehouse. It does not change where data lives (still OneLake), how data gets there (still ETL/ELT), or the fundamental centralization pattern. It adds semantic meaning to data that has already been moved into OneLake.
The Fabric IQ ontology stores entity type definitions (e.g., “Customer has properties: Name, Revenue, Segment”) and relationship type definitions (e.g., “Customer places Order”). It does not store entity instances — the actual customer records, order records, and relationship instances remain in OneLake tables. The ontology is a schema-level reference layer that points to OneLake data, not a persistent store of resolved entities.
This is a fundamental distinction. LegalFab’s knowledge graph stores resolved entity instances and their relationships persistently — it knows that “John Smith at 123 Main St” in System A is the same person as “J. Smith” in System B, and it maintains that resolution as a first-class graph node. Fabric IQ’s ontology defines what a “Person” is; it does not resolve whether two records refer to the same person.
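The difference between a schema-level definition and a resolved instance can be shown with two small data structures. This is a conceptual sketch only. EntityType, ResolvedInstance, and same_entity are hypothetical names and do not mirror the Fabric IQ ontology format or LegalFab's graph schema.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

RecordKey = Tuple[str, str]  # (system, record_id)


@dataclass(frozen=True)
class EntityType:
    """Schema-level definition: says what a Person is, holds no records."""
    name: str
    properties: Tuple[str, ...]


@dataclass(frozen=True)
class ResolvedInstance:
    """Instance-level node: asserts that several source records are one entity."""
    entity_type: str
    canonical_name: str
    source_records: FrozenSet[RecordKey]


def same_entity(instance: ResolvedInstance, a: RecordKey, b: RecordKey) -> bool:
    """True if both source records were resolved to this instance."""
    return a in instance.source_records and b in instance.source_records


# A type definition alone cannot answer "are these two records the same person?"
PERSON = EntityType("Person", ("name", "address"))

# A resolved instance can, because the resolution is stored with the node.
john = ResolvedInstance(
    entity_type="Person",
    canonical_name="John Smith",
    source_records=frozenset({("system_a", "1001"), ("system_b", "77")}),
)
linked = same_entity(john, ("system_a", "1001"), ("system_b", "77"))
```

The EntityType carries only property names, whereas the ResolvedInstance records which source rows were judged to be the same person, which is the persisted resolution the paragraph above refers to.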
A significant practical constraint: Fabric IQ’s AI agents can only bind to Direct Lake semantic models. Power BI models using Import mode or DirectQuery mode are not supported for entity and relationship binding. Given that the vast majority of existing Power BI deployments use Import or DirectQuery, this creates a substantial adoption barrier. Organizations would need to rebuild their semantic models as Direct Lake before Fabric IQ can leverage them.
Intellectual honesty demands acknowledging Microsoft’s strengths:
The comparison between LegalFab and Microsoft Fabric is not a feature-by-feature evaluation — it is an architectural philosophy debate. The question is: should data be moved to a central location for analysis (the lakehouse model), or should intelligence be built on top of data where it already lives (the data fabric model)?
For organizations in regulated industries that need to maintain data sovereignty, conduct cross-source investigations, and build persistent intelligence without moving sensitive data into yet another centralized repository, the data fabric architecture is not just preferable; it is the technically correct approach. Microsoft Fabric solves a different problem: consolidating diverse analytics workloads onto a single platform. It does this well. But calling it a “data fabric” would be architecturally inaccurate.
Microsoft Fabric consolidates your data into its platform. Fabric IQ adds semantic labels to that consolidated data. LegalFab builds intelligence on top of your data where it already lives — with entity resolution that works across every connected source, not just OneLake. One approach moves data. The other moves insight.