Happy New Year! As we return from the holidays, I want to continue the conversation I started in my last video of 2024 about context graphs and agentic memories. While the original discussion focused on causality and explainability in decision-making systems, I want to shift attention to something more fundamental and immediately actionable: data traces as the foundation for epistemological reasoning.
Beyond the Context Graph Hype
My previous article critiqued the framing around "context graphs": not because the underlying ideas lack merit, but because they repackage decades-old concepts such as causal DAGs and decision trees under new marketing terminology laden with unnecessary hype. The core idea of decision traces is valuable: we need systems that can track how decisions influence one another and learn from these causal relationships.
However, explainability and causality analysis are genuinely hard problems. What I want to highlight here is something more accessible, yet equally crucial: data traces — the provenance information that grounds our knowledge in verifiable sources.
Data Traces: The Low-Hanging Fruit
In my previous implementations of memory systems for AI agents, I consistently maintained links to the original conversations or sources where entities, relations, and graph structures were first extracted. This seemingly simple practice extends knowledge graphs beyond mere facts and entities to include the sources of those facts — creating a foundation for epistemological reasoning.
This approach transforms a knowledge graph from a collection of statements into a justified knowledge base where each assertion carries information about the following (a minimal data-structure sketch appears after this list):
Where the information originated (conversation, document, observation)
When it was captured
How it was derived (extraction, inference, user statement)
What kind of epistemic status it holds (belief, verified fact, hypothesis, testimony)
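To make this concrete, here is a minimal sketch of what a provenance-carrying assertion could look like. All names here (`Assertion`, `Provenance`, `EpistemicStatus`, `DerivationMethod`) are illustrative, not taken from any particular library:

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class EpistemicStatus(Enum):
    BELIEF = "belief"                  # subjective interpretation
    VERIFIED_FACT = "verified_fact"    # checked against an authoritative source
    HYPOTHESIS = "hypothesis"          # derived, awaiting confirmation
    TESTIMONY = "testimony"            # reported by a user or third party


class DerivationMethod(Enum):
    EXTRACTION = "extraction"          # pulled directly from a source
    INFERENCE = "inference"            # derived from other assertions
    USER_STATEMENT = "user_statement"  # stated explicitly by the user


@dataclass
class Provenance:
    source_id: str            # e.g. a conversation or document identifier
    source_kind: str          # "conversation", "document", "observation", ...
    captured_at: datetime     # when the information was captured
    method: DerivationMethod  # how it was derived
    status: EpistemicStatus   # what epistemic standing it holds


@dataclass
class Assertion:
    subject: str
    predicate: str
    obj: str
    provenance: Provenance
    derived_from: list[str] = field(default_factory=list)  # parent assertion IDs
```

The point is simply that the source metadata lives alongside the triple itself, so no assertion can exist without an answer to "where did this come from?"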
Building an Epistemology Layer
With data traces in place, we can construct an epistemology layer that distinguishes between different types of knowledge (a scoring sketch follows this list):
Beliefs vs. Facts: Some information represents personal beliefs or subjective interpretations, while other data comes from verifiable observations or authoritative sources.
Source Quality: Different sources carry different epistemic weight. A statement from a research paper differs from one made in casual conversation, which in turn differs from speculative reasoning.
Temporal Validity: Knowledge can become outdated. Tracking sources allows us to reassess everything derived from a source when new information emerges or when we change our understanding of that source's reliability.
Derivation Chains: When we infer new knowledge from existing facts, maintaining the lineage lets us propagate uncertainty and revise conclusions when foundational assumptions change.
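Continuing the sketch above, one simple way to operationalize these distinctions is to assign each source kind an epistemic weight and let uncertainty propagate along derivation chains. The weights below are illustrative placeholders, not calibrated values:

```python
# Illustrative weights per source kind; placeholders, not calibrated values.
SOURCE_WEIGHTS = {
    "research_paper": 0.9,
    "official_document": 0.8,
    "user_statement": 0.6,
    "casual_conversation": 0.4,
    "speculative_reasoning": 0.2,
}


def epistemic_weight(assertion: Assertion, graph: dict[str, Assertion]) -> float:
    """Score an assertion by its source, capped by its weakest ancestor.

    Uncertainty propagates down derivation chains and never shrinks.
    Assumes chains are acyclic (a reasonable invariant for derivations).
    """
    base = SOURCE_WEIGHTS.get(assertion.provenance.source_kind, 0.3)
    parent_weights = [epistemic_weight(graph[p], graph)
                      for p in assertion.derived_from]
    return min([base] + parent_weights)
```

Taking the minimum over ancestors is one conservative choice; probabilistic combination rules would also work. The mechanics stay the same either way: the lineage is what makes any of them possible.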
Revalidation and Learning
The real power of data traces emerges when circumstances change. If we later discover that a particular source was unreliable, or if we gain additional context that alters our interpretation, we can revalidate everything derived from that information. This isn't just about correcting errors — it's about building systems that can genuinely learn and evolve their understanding over time.
This revalidation capability becomes essential for long-lived AI agents that must maintain coherent knowledge bases over extended periods while adapting to new information and changing contexts.
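Continuing the same sketch, revalidation reduces to a graph traversal: find everything extracted from the discredited source, then transitively flag whatever was derived from it:

```python
def assertions_from_source(graph: dict[str, Assertion], source_id: str) -> set[str]:
    """IDs of assertions extracted directly from the given source."""
    return {a_id for a_id, a in graph.items()
            if a.provenance.source_id == source_id}


def flag_for_revalidation(graph: dict[str, Assertion], source_id: str) -> set[str]:
    """Transitively collect everything that rests on a discredited source."""
    tainted = assertions_from_source(graph, source_id)
    changed = True
    while changed:  # fixed point: keep adding dependents until none are left
        changed = False
        for a_id, a in graph.items():
            if a_id not in tainted and tainted.intersection(a.derived_from):
                tainted.add(a_id)
                changed = True
    return tainted
```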
Information Architecture Before Complex Analysis
Here's my key point: data traces are primarily an information architecture problem, not a complex causality analysis problem. Unlike decision traces, which require sophisticated causal inference and modeling of influence relationships, data traces simply require disciplined bookkeeping — tracking and maintaining source metadata alongside the knowledge itself.
This makes data traces the low-hanging fruit for building more sophisticated agentic memory systems. They're easier to implement than full causality tracking, yet they provide immediate value and create the foundation necessary for more advanced explainability work.
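What "disciplined bookkeeping" means in practice is that the discipline can live at the graph's write path. Here is a sketch of an insert function that refuses provenance-free assertions, again reusing the hypothetical types from earlier:

```python
def add_assertion(graph: dict[str, Assertion], a_id: str, assertion: Assertion) -> None:
    """Store an assertion only if its provenance and lineage check out."""
    if assertion.provenance is None:
        raise ValueError(f"assertion {a_id} has no provenance; refusing to store")
    missing = [p for p in assertion.derived_from if p not in graph]
    if missing:
        raise ValueError(f"assertion {a_id} references unknown parents: {missing}")
    graph[a_id] = assertion
```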
A Practical Path Forward
My suggestion is straightforward (a short end-to-end sketch follows these steps):
Build data traces first — systematically track sources for all information entering your knowledge graphs
Follow the provenance — maintain these source links as knowledge propagates through inference and reasoning
Apply epistemological analysis — categorize and evaluate sources according to their epistemic properties
Then tackle causality — with this foundation in place, decision traces and explainability become more tractable problems
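Here is how those steps might fit together, reusing the sketches from the earlier sections (all IDs and facts are hypothetical):

```python
from datetime import datetime, timezone

graph: dict[str, Assertion] = {}

# 1. Build data traces: record the source as knowledge enters the graph.
add_assertion(graph, "a1", Assertion(
    "acme_corp", "headquartered_in", "berlin",
    Provenance("conv-042", "casual_conversation",
               datetime.now(timezone.utc),
               DerivationMethod.EXTRACTION, EpistemicStatus.TESTIMONY)))

# 2. Follow the provenance: inferred knowledge points back to its parents.
add_assertion(graph, "a2", Assertion(
    "acme_corp", "operates_in", "germany",
    Provenance("inference-001", "speculative_reasoning",
               datetime.now(timezone.utc),
               DerivationMethod.INFERENCE, EpistemicStatus.HYPOTHESIS),
    derived_from=["a1"]))

# 3. Apply epistemological analysis: the inference is capped by its
#    weakest link (0.2 here, from the speculative step itself).
print(epistemic_weight(graph["a2"], graph))

# Later: conv-042 turns out to be unreliable, so we flag everything
# that rests on it, transitively: {'a1', 'a2'}.
print(flag_for_revalidation(graph, "conv-042"))
```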
Decision traces and causal explainability matter enormously for building trustworthy AI agents. But they require a foundation — and that foundation is built on knowing where our knowledge comes from and what justifies our beliefs. Data traces give us that foundation.
Start simple. Track your sources. Build your epistemology layer. The causality and explainability will follow naturally from that groundwork.
This article expands on themes from my previous work on agentic memory systems and knowledge graphs. For more on the original context graphs discussion and my critique of decision trace frameworks, see my earlier video and article.