How should persistent memory be structured for agents?
Is this a database problem, a retrieval problem, or a representation problem? The answer shapes whether agent memory looks more like a knowledge graph, a vector store, or something entirely new.
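To make the contrast concrete, here is a minimal sketch (all names hypothetical, not a reference to any real system) of the same remembered fact stored two ways: a vector store answers "what is similar to this query?", while a graph store answers "what is connected to this entity?". The dot-product similarity is a toy stand-in for a real embedding index.

```python
from dataclasses import dataclass, field

@dataclass
class VectorMemory:
    """Retrieval-oriented memory: similarity search over embeddings."""
    items: list = field(default_factory=list)  # (embedding, text) pairs

    def add(self, embedding, text):
        self.items.append((embedding, text))

    def nearest(self, query):
        # Toy similarity: a dot product stands in for a real ANN index.
        def dot(a, b):
            return sum(x * y for x, y in zip(a, b))
        return max(self.items, key=lambda item: dot(item[0], query))[1]

@dataclass
class GraphMemory:
    """Representation-oriented memory: explicit (subject, relation, object) triples."""
    triples: set = field(default_factory=set)

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))

    def about(self, subject):
        return {(r, o) for s, r, o in self.triples if s == subject}

vm = VectorMemory()
vm.add([1.0, 0.0], "user prefers dark mode")
vm.add([0.0, 1.0], "deploy target is us-east-1")
print(vm.nearest([0.9, 0.1]))  # -> user prefers dark mode

gm = GraphMemory()
gm.add("user", "prefers", "dark mode")
print(gm.about("user"))  # -> {('prefers', 'dark mode')}
```

The vector store retrieves by fuzzy similarity but cannot answer structured queries; the graph answers structured queries but only about facts someone chose to encode as triples. The open question is whether agent memory needs one, both, or a representation neither captures.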
What patterns are reliable enough to deserve first-class composability in agent systems?
The question isn't what's theoretically irreducible, but what works consistently enough in practice to elevate into a shared building block. We're studying patterns across agent frameworks and real deployments to find that signal.
Related: Can agent behavior be fully specified in a declarative format?
We're testing whether markdown or YAML can fully capture agent behavior, or whether control flow inevitably requires imperative code. The mdagent project is our proving ground for this boundary.
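As an illustration of where that boundary shows up, here is a hypothetical sketch (not mdagent's actual format) of an agent spec expressed as plain declarative data, plus a tiny interpreter. Linear steps serialize naturally to markdown or YAML; the retry loop is the kind of control flow the declarative layer can only hint at, leaving the real semantics to imperative code.

```python
# Hypothetical declarative spec: this could round-trip through YAML or
# a markdown front-matter block without losing information.
SPEC = {
    "name": "summarizer",
    "steps": [
        {"action": "fetch", "source": "notes.txt"},
        {"action": "summarize", "max_words": 50, "retries": 2},
    ],
}

def run(spec, tools):
    """Interpret the spec. Note the retry loop: control flow the
    declarative format names ('retries: 2') but cannot itself express."""
    results = []
    for step in spec["steps"]:
        action = tools[step["action"]]
        attempts = step.get("retries", 0) + 1
        for i in range(attempts):
            try:
                results.append(action(step))
                break
            except RuntimeError:
                if i == attempts - 1:
                    raise
    return results

# Stub tools so the sketch runs standalone.
tools = {
    "fetch": lambda step: f"contents of {step['source']}",
    "summarize": lambda step: f"summary (<= {step['max_words']} words)",
}
print(run(SPEC, tools))
```

Whether every such interpreter detail can be pushed back into the declarative layer, or whether some imperative residue is inevitable, is exactly the boundary being tested.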
Related: What patterns are reliable enough to deserve first-class composability in agent systems?
When should the human specify precisely, and when should the agent decide? This is about finding handoff points that feel natural rather than bureaucratic.
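One candidate handoff rule, sketched minimally (the `irreversible` flag and the names here are assumptions for illustration, not a proposed API): the agent acts on its own for routine steps and pauses for human sign-off only on actions marked irreversible.

```python
def execute(actions, confirm):
    """Run a plan, deferring to the human only at marked handoff points.

    confirm: callable invoked for irreversible actions; returns True to proceed.
    """
    done = []
    for action in actions:
        if action.get("irreversible") and not confirm(action):
            continue  # human declined; skip rather than proceed
        done.append(action["name"])
    return done

plan = [
    {"name": "format code"},
    {"name": "delete branch", "irreversible": True},
]
print(execute(plan, confirm=lambda a: False))  # human says no -> ['format code']
print(execute(plan, confirm=lambda a: True))   # -> ['format code', 'delete branch']
```

A rule this crude is clearly bureaucratic in some settings and too permissive in others; finding thresholds that feel natural is the open part of the question.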
System design quality matters, not just output quality. We're looking for evaluation frameworks that capture elegance, maintainability, and composability — not just correctness.
When can a human look at a system's state and actually understand what it will do next? This cuts across visualization, inspection tools, and the fundamental question of how much complexity can be made transparent.
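One way to make that question concrete, as a deliberately small sketch (the state fields are invented for illustration): if the next action is a pure function of an inspectable state value, a human who can read the state can predict the behavior. Hidden mutable state is what breaks that guarantee.

```python
def next_action(state):
    """Pure function of visible state: inspecting `state` fully
    determines what the system does next."""
    if state["pending_reviews"]:
        return f"review {state['pending_reviews'][0]}"
    if state["inbox"]:
        return f"triage {state['inbox'][0]}"
    return "idle"

state = {"pending_reviews": ["PR-12"], "inbox": ["bug report"]}
print(next_action(state))  # -> review PR-12
```

Real agent systems fall short of this ideal in both directions: state too large to read, and behavior that depends on model internals no dump exposes. How much of each can be recovered by better visualization and inspection tools is the research question.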
This question defines the boundary of what needs to be built into agent runtimes versus what can be left to the model. It emerged from observing that some prompt constructs genuinely expand model capabilities while others are redundant.