
Jun 5, 2025

Technical Architecture Comparison

LegalFab Knowledge Fabric vs. Microsoft Fabric

True Data Fabric vs. Data Lakehouse

Executive Summary

This document provides a detailed technical comparison between LegalFab’s Knowledge Fabric and Microsoft Fabric, including Microsoft’s November 2025 Fabric IQ announcement. The core argument is architectural. LegalFab implements a true data fabric pattern: it keeps data in place, federates queries across sources, and builds intelligence through a persistent knowledge graph with built-in entity resolution. Microsoft Fabric implements a data lakehouse pattern that centralizes data into OneLake. Fabric IQ, Microsoft’s recent semantic intelligence layer, adds ontology and graph capabilities on top of this lakehouse but does not change the underlying architecture: data still moves to OneLake, and entity resolution still requires third-party tools.

Key Distinction: Architecture Defines Capability
Microsoft does not primarily market Fabric as a “data fabric.” Their own documentation positions it as a “unified analytics platform” built on a lakehouse architecture. The name “Fabric” is a product brand, not an architectural descriptor. The November 2025 Fabric IQ announcement adds a semantic layer for AI agent grounding, but the foundational pattern remains: data is ingested into OneLake rather than federated in place. LegalFab, by contrast, implements the data fabric pattern as defined by Gartner: automated metadata management, knowledge graph–driven intelligence, data virtualization, and federated governance across distributed sources.

1. What Microsoft Fabric Actually Is

Microsoft Fabric is a SaaS analytics platform built on a data lakehouse architecture. Its foundational storage layer, OneLake, is logically unified but physically backed by Azure Data Lake Storage Gen2 (ADLS Gen2). Tabular data in OneLake is stored in Delta Lake format (Parquet files with ACID transaction logs). Multiple compute engines, including Spark, T-SQL (Synapse), and Analysis Services (Power BI), read from the same physical OneLake copy, avoiding inter-engine duplication. This is engine virtualization, not source data virtualization.

How Data Gets Into OneLake

The primary data integration pattern is ETL/ELT via Data Factory (the Azure Data Factory engine rebadged inside Fabric) and Dataflows Gen2. Data is extracted from source systems, transformed, and loaded into OneLake in Delta Parquet format. This is traditional data movement: data is extracted from its source system and copied into a centralized repository.

2. LegalFab Knowledge Fabric: Architecture

LegalFab’s Knowledge Fabric is a metadata-driven integration layer that implements the data fabric pattern. Source data remains in its original systems. The Knowledge Fabric stores only metadata and derived intelligence: entity resolution mappings, cross-system relationships, annotations, insights, and temporal snapshots. Actual records (documents, transaction records, communications, files) stay in their source systems and are accessed at runtime.
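
The metadata-only pattern described above can be illustrated with a minimal Python sketch. It is not LegalFab’s implementation: `SourceRef`, `SOURCES`, `entity_index`, and the system names are hypothetical stand-ins, and the in-memory dictionaries take the place of live connector calls. The point is only that the central index holds pointers, while record contents are read from the owning system at query time.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SourceRef:
    """Pointer to a record that stays in its source system (hypothetical shape)."""
    system: str      # illustrative system name, e.g. "dms" or "billing"
    record_id: str


# Stand-ins for live source systems; a real deployment would call connectors here.
SOURCES: Dict[str, Dict[str, dict]] = {
    "dms": {"doc-17": {"title": "Engagement letter", "pages": 4}},
    "billing": {"inv-9": {"amount": 1200, "currency": "GBP"}},
}

# The central store holds only metadata: which source records belong to an entity.
entity_index: Dict[str, List[SourceRef]] = {
    "matter:M-100": [SourceRef("dms", "doc-17"), SourceRef("billing", "inv-9")],
}


def fetch(ref: SourceRef) -> dict:
    """Resolve a pointer at query time by reading from the owning system."""
    return SOURCES[ref.system][ref.record_id]


def records_for(entity_id: str) -> List[dict]:
    """Return the current source records linked to an entity, without copying them."""
    return [fetch(ref) for ref in entity_index[entity_id]]
```

Because nothing is copied into the index, a change made in the source system is visible on the next call to `records_for` with no reload step.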

The Persistent Knowledge Graph

At the center of the Knowledge Fabric is a persistent knowledge graph that provides corporate memory with schema-bounded extraction and source provenance tracking. The graph models entities (Persons, Organizations, Matters, Documents, Addresses, Identifiers) and relationships (OWNS, CONTROLS, RELATED_TO, EMPLOYS, REPRESENTS, LOCATED_AT, HAS_IDENTIFIER). This is not a metadata catalog — it is an active intelligence layer that resolves entities across sources, tracks lineage, and enables graph traversal for investigative queries.
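
As a rough illustration of the graph model, the sketch below stores typed entities with a provenance field and follows selected relationship types in a breadth-first traversal. The class, method names, node identifiers, and source labels are assumptions made for this example; only the entity and relationship type names (Person, Organization, OWNS, CONTROLS, LOCATED_AT) come from the text.

```python
from collections import deque


class KnowledgeGraph:
    """Tiny in-memory property graph; a hypothetical stand-in for a graph store."""

    def __init__(self):
        self.nodes = {}   # node_id -> {"type": entity type, "source": provenance}
        self.edges = {}   # node_id -> list of (relation, target_id)

    def add_entity(self, node_id, entity_type, source):
        self.nodes[node_id] = {"type": entity_type, "source": source}

    def relate(self, src, relation, dst):
        self.edges.setdefault(src, []).append((relation, dst))

    def reachable(self, start, relations):
        """Breadth-first traversal that follows only the given relation types."""
        seen, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for relation, target in self.edges.get(current, []):
                if relation in relations and target not in seen:
                    seen.add(target)
                    queue.append(target)
        seen.discard(start)
        return seen


kg = KnowledgeGraph()
kg.add_entity("person:1", "Person", source="crm")
kg.add_entity("org:1", "Organization", source="registry")
kg.add_entity("org:2", "Organization", source="registry")
kg.add_entity("addr:1", "Address", source="registry")
kg.relate("person:1", "OWNS", "org:1")
kg.relate("org:1", "CONTROLS", "org:2")
kg.relate("org:2", "LOCATED_AT", "addr:1")

# Everything the person owns or controls, directly or indirectly.
controlled = kg.reachable("person:1", {"OWNS", "CONTROLS"})
```

Restricting the traversal to ownership and control edges is what turns a generic graph walk into an investigative query such as "which organizations does this person ultimately control?".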

Microsoft Fabric has no equivalent. Its metadata management relies on Purview — a separate Azure service, not built into Fabric’s core — that provides catalog-based governance: tagging, classification, sensitivity labels, and lineage tracking. Purview is a metadata catalog with governance workflows. It does not perform entity resolution, does not build graph-based relationships across data sources, and does not provide the semantic intelligence layer that a knowledge graph delivers.

Federated Connectivity: 200+ MCP Connectors

LegalFab integrates with 200+ data sources via Model Context Protocol (MCP) connectors supporting federated queries across databases (PostgreSQL, MySQL), SaaS platforms (Salesforce, HubSpot, Slack, Gmail), cloud storage (AWS S3, Azure Blob), and legal-specific systems (Aderant, Elite, Clio).
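
A federated query can be pictured as a fan-out over connectors, with each result tagged by its source. The sketch below is a simplified assumption, not the MCP SDK or LegalFab’s connector interface: `Connector`, `InMemoryConnector`, and `federated_search` are hypothetical names, and the in-memory rows stand in for live systems.

```python
from typing import Dict, List, Protocol


class Connector(Protocol):
    """Minimal interface assumed for this sketch; not an MCP type."""
    name: str

    def search(self, term: str) -> List[Dict]: ...


class InMemoryConnector:
    """Stand-in for a live source; rows stay inside the connector."""

    def __init__(self, name: str, rows: List[Dict]):
        self.name = name
        self._rows = rows

    def search(self, term: str) -> List[Dict]:
        needle = term.lower()
        return [r for r in self._rows if needle in str(r.get("name", "")).lower()]


def federated_search(connectors: List[Connector], term: str) -> List[Dict]:
    """Send the same query to every connector and tag each hit with its source."""
    results = []
    for connector in connectors:
        for row in connector.search(term):
            results.append({"source": connector.name, **row})
    return results


crm = InMemoryConnector("crm", [{"name": "Acme Ltd", "owner": "J. Smith"}, {"name": "Globex"}])
billing = InMemoryConnector("billing", [{"name": "ACME LTD", "balance": 250}])
results = federated_search([crm, billing], "acme")
```

Each connector applies the query against its own data, so only matching rows travel back to the caller.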

3. OSINT and External Intelligence

LegalFab’s Knowledge Fabric integrates natively with open-source intelligence (OSINT) sources to support its AML compliance and due diligence capabilities. These sources include corporate registries (Companies House, SEC EDGAR), beneficial ownership databases (PSC registers, UBO databases), sanctions lists (OFAC, OFSI, UN, EU), PEP (Politically Exposed Persons) databases, adverse media monitoring, court records and legal databases, identity verification providers, and credit bureaus.

These external sources are integrated into the knowledge graph through the entity resolution engine, which performs blocking, matching, and clustering. A query about a person or organization doesn’t just search internal data — it traverses the graph across all connected OSINT sources, resolving identities and surfacing cross-source relationships that would be invisible in a centralized data store.
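
The three stages named above (blocking, matching, clustering) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not LegalFab’s engine: the records, the postcode blocking key, the 0.85 similarity threshold, and the use of `difflib.SequenceMatcher` for name comparison are all choices made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Illustrative records from different (hypothetical) sources.
records = [
    {"id": "A1", "name": "John Smith", "postcode": "AB1 2CD"},
    {"id": "B7", "name": "Jon Smith", "postcode": "AB1 2CD"},
    {"id": "C3", "name": "Jane Doe", "postcode": "AB1 2CD"},
    {"id": "D4", "name": "John Smith", "postcode": "ZZ9 9ZZ"},
]


def block(recs):
    """Blocking: group records by a cheap key so only likely pairs are compared."""
    blocks = defaultdict(list)
    for r in recs:
        blocks[r["postcode"]].append(r)
    return blocks


def is_match(a, b, threshold=0.85):
    """Matching: fuzzy name similarity; the threshold is an assumed value."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def cluster(recs):
    """Clustering: merge matched pairs into groups with a union-find structure."""
    parent = {r["id"]: r["id"] for r in recs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for group in block(recs).values():
        for a, b in combinations(group, 2):
            if is_match(a, b):
                parent[find(a["id"])] = find(b["id"])

    clusters = defaultdict(set)
    for r in recs:
        clusters[find(r["id"])].add(r["id"])
    return sorted(sorted(c) for c in clusters.values())


resolved = cluster(records)
```

In this sketch "John Smith" and "Jon Smith" at the same postcode fall into one cluster, while the "John Smith" at a different postcode is never compared with them because it sits in another block. That is the usual blocking trade-off: fewer comparisons at the cost of possible missed matches, which production systems typically soften by using several blocking keys.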

Microsoft Fabric lacks native OSINT capabilities. It is a general-purpose analytics platform — it does not integrate with regulatory registries, sanctions lists, or beneficial ownership databases. Any such integration would require custom ETL pipelines to ingest this data into OneLake, which means copying sensitive regulatory data into yet another location and losing the real-time, federated query capability that OSINT investigations require.

4. Fabric IQ: Microsoft’s Semantic Layer (Announced November 2025)

Status: Public Preview — No GA Date Announced
Fabric IQ was announced at Microsoft Ignite in November 2025 and entered public preview on November 19, 2025. As of February 2026, it has no announced general availability date. Organizations evaluating this capability should expect breaking changes and evolving feature scope. Production deployments carry the inherent risk of preview-stage software.

What Fabric IQ Actually Is

Fabric IQ is a semantic intelligence layer added on top of the existing Fabric lakehouse. It introduces two components: an Ontology (entity type definitions, relationship types, business rules, and constraints) and a Fabric Graph (a graph-compute layer for traversals and algorithms). Microsoft positions Fabric IQ as the foundation for “agentic AI” — grounding Copilot and custom AI agents in business-specific semantics so they can reason about concepts like “Customer” or “Order” rather than raw table names.

This is architecturally important to understand: Fabric IQ is an analytical capability layered on top of the lakehouse. It does not change where data lives (still OneLake), how data gets there (still ETL/ELT), or the fundamental centralization pattern. It adds semantic meaning to data that has already been moved into OneLake.

The Ontology: Definitions, Not Instances

The Fabric IQ ontology stores entity type definitions (e.g., “Customer has properties: Name, Revenue, Segment”) and relationship type definitions (e.g., “Customer places Order”). It does not store entity instances — the actual customer records, order records, and relationship instances remain in OneLake tables. The ontology is a schema-level reference layer that points to OneLake data, not a persistent store of resolved entities.

This is a fundamental distinction. LegalFab’s knowledge graph stores resolved entity instances and their relationships persistently — it knows that “John Smith at 123 Main St” in System A is the same person as “J. Smith” in System B, and it maintains that resolution as a first-class graph node. Fabric IQ’s ontology defines what a “Person” is; it does not resolve whether two records refer to the same person.
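
The distinction between a type definition and a resolved instance can be made concrete with a small Python sketch. It does not reproduce the Fabric IQ ontology format or LegalFab’s graph schema; `EntityType`, `ResolvedEntity`, and the record identifiers are hypothetical and exist only to contrast the two levels.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class EntityType:
    """Schema-level definition: says what a Person is, holds no records."""
    name: str
    properties: Tuple[str, ...]


@dataclass
class ResolvedEntity:
    """Instance-level node: one real-world entity and the records that refer to it."""
    entity_type: EntityType
    canonical_name: str
    source_records: List[Tuple[str, str]] = field(default_factory=list)


PERSON = EntityType("Person", ("name", "address"))

# A resolved instance links records from two systems to the same person.
john = ResolvedEntity(
    PERSON,
    "John Smith",
    [("System A", "rec-1"), ("System B", "rec-9")],
)


def same_entity(entity: ResolvedEntity, ref_a, ref_b) -> bool:
    """True if both source records have been resolved to this entity."""
    return ref_a in entity.source_records and ref_b in entity.source_records
```

`PERSON` alone cannot answer whether two records describe the same individual; only the persisted `ResolvedEntity` carries that information.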

Direct Lake-Only Limitation

A significant practical constraint: Fabric IQ’s AI agents can bind only to Direct Lake semantic models. Power BI models using Import mode or DirectQuery mode are not supported for entity and relationship binding. Given that the vast majority of existing Power BI deployments use Import or DirectQuery, this creates a substantial adoption barrier. Organizations would need to rebuild their semantic models in Direct Lake mode before Fabric IQ can use them.

5. Head-to-Head Architectural Comparison

Dimension	LegalFab Knowledge Fabric	Microsoft Fabric
Architecture Pattern	Data Fabric — metadata-driven, federated, data stays in place	Data Lakehouse — centralized OneLake storage, Delta Lake format
Data Location	Remains in source systems; only metadata/mappings stored centrally	Ingested into OneLake via ETL/ELT; shortcuts offer limited read-only federation
Knowledge Graph	Persistent graph with entity resolution, relationship modeling, and provenance tracking	No native knowledge graph; relies on Purview (a separate service) for the metadata catalog
Entity Resolution	Built-in: blocking, matching, clustering across all sources	Not built in; requires third-party tools (e.g., Quantexa) or a custom implementation
OSINT Integration	Native: sanctions, PEP, corporate registries, beneficial ownership, adverse media, court records	None; general-purpose analytics platform with no regulatory source connectors
Data Integration	200+ MCP connectors with federated queries; two-way data flow (read + write back)	Data Factory ETL/ELT; Dataflows Gen2; shortcuts for limited sources (read-only)
Data Duplication	Eliminated by design — single source of truth remains in source systems	Reduced between engines (shared OneLake copy), but data is still duplicated from the source
Metadata Approach	Active metadata: continuous analysis, enrichment, discovery, quality monitoring, lineage	Passive metadata scanning via Admin APIs; Purview provides catalog and lineage
Vendor Lock-In	Low: data stays in existing systems; provider-agnostic LLM layer (LightLLM router)	High: data consolidated in OneLake; deep Microsoft stack dependency (Power BI, Azure, M365)
AI Layer	Provider-agnostic: OpenAI, Anthropic, Google, Llama, Mistral via LightLLM router with fallback	Microsoft Copilot integration, primarily Azure OpenAI
Deployment	SaaS, dedicated cloud, customer cloud, on-premises, hybrid, air-gapped	SaaS only (Azure-hosted); no on-premises or air-gapped option
Graph Type	Persistent knowledge graph storing resolved entity instances and relationships	Ontology (type definitions) + Graph compute layer (references OneLake data)

6. Honest Assessment: Where Each Platform Excels

Where LegalFab Has Clear Advantages

True data-in-place architecture: LegalFab’s metadata-driven federation keeps source data where it lives. This is architecturally fundamental — not a feature, but the design philosophy. Microsoft Fabric’s shortcuts offer a partial approximation but are limited to read-only access for a subset of cloud storage sources.
Knowledge graph intelligence: The persistent knowledge graph with cross-source entity resolution is a genuine capability gap. Microsoft has no equivalent in Fabric. Purview provides governance metadata, not semantic intelligence.
OSINT and regulatory intelligence: Native integration with sanctions, PEP, corporate registries, and adverse media sources — federated through the knowledge graph — is a domain-specific differentiator that Microsoft cannot match without significant custom development.
Deployment flexibility: On-premises, air-gapped, hybrid, and customer-cloud deployment options. Microsoft Fabric is SaaS-only on Azure. For regulated industries with data sovereignty requirements, this is decisive.

Where Microsoft Fabric Has Advantages

Intellectual honesty demands acknowledging Microsoft’s strengths:

Ecosystem breadth: Deep integration across Power BI, Azure, Microsoft 365, Teams, and Copilot. For organizations already embedded in the Microsoft stack, the integration is seamless.
General-purpose analytics: Fabric provides data science (Spark notebooks), real-time analytics, data warehousing, and BI visualization in a single platform. LegalFab is purpose-built for legal/compliance — it is not a general-purpose analytics workbench.
Open format commitment: Delta Lake is an open-source table format built on Parquet files. While vendor lock-in exists at the platform level, the data format itself is portable. This partially mitigates, but does not eliminate, the lock-in concern.

7. The Fundamental Difference

The comparison between LegalFab and Microsoft Fabric is not a feature-by-feature evaluation — it is an architectural philosophy debate. The question is: should data be moved to a central location for analysis (the lakehouse model), or should intelligence be built on top of data where it already lives (the data fabric model)?

LegalFab Approach

Data stays in source systems
Persistent knowledge graph stores resolved entities
Federated queries at runtime
No data duplication by design

Microsoft Fabric Approach

Data copied into centralized OneLake
Fabric IQ ontology defines entity types (preview)
Entity resolution via Quantexa (third-party)
Purview catalog for metadata/governance
Queries run against OneLake copies
Reduced inter-engine duplication only

For organizations in regulated industries that need to maintain data sovereignty, conduct cross-source investigations, and build persistent intelligence without moving sensitive data into yet another centralized repository, the data fabric architecture is not just preferable; it is the technically correct approach. Microsoft Fabric solves a different problem: consolidating diverse analytics workloads onto a single platform. It does this well. But calling it a “data fabric” would be architecturally inaccurate.

Microsoft Fabric consolidates your data into its platform. Fabric IQ adds semantic labels to that consolidated data. LegalFab builds intelligence on top of your data where it already lives — with entity resolution that works across every connected source, not just OneLake. One approach moves data. The other moves insight.
