Obsidian as My Thinking Backend for Cursor

Every Cursor session starts cold. The agent has no idea who you are, what you're building, what decisions you've already made, or how you like to write. You either re-explain everything, paste in a wall of context, or accept that you're collaborating with someone with permanent amnesia.

I wanted something different. A vault that carries my identity, preferences, project state, and process procedures into any AI session automatically. Not just Cursor. Claude, Gemini, whatever comes next. The files stay; the tools rotate.

If you've read my earlier post, "I Stopped Telling My AI to Remember Things," this builds on that work. Same vault philosophy. Different problem, and a much broader solution.

This is how I built that.

Building on QMD + Obsidian (and what's different)

In April I wrote about pairing my Obsidian vault with QMD inside OpenClaw. That setup solved a real problem: memory retrieval. QMD re-indexed the vault every five minutes and ran hybrid search (BM25 + vector similarity + reranking) over it, so when I asked "what did we discuss about X," relevant notes surfaced without me copy-pasting them. OpenClaw's heartbeat and auto-flush kept new context flowing into the vault between sessions.

It worked. For OpenClaw. In one container. For one agent.

The Ideaverse setup I run today is the next layer. Same Obsidian vault as the storage layer. Different architecture on top, and a wider job description.

|  | QMD + OpenClaw (April) | Ideaverse + Cursor (now) |
| --- | --- | --- |
| Primary job | Surface relevant notes from past context | Carry identity, structure, procedures, and project state into every session |
| Retrieval | Hybrid search (BM25 + vectors + reranking) | Keyword search via custom MCP (simpler, no npm deps) |
| Scope | One OpenClaw agent in Docker | Every Cursor workspace globally, plus any MCP-capable tool |
| Startup context | Agent workspace + indexed vault | Three briefing files (`Me.md`, `Vault Map.md`, `Skills Map.md`) |
| Procedures | Implicit in agent config | Explicit Skills (`sync`, `capture`, `projects-pull`, etc.) any AI can read |
| Boundaries | Vault folders by topic | Structural scope filtering (`personal` / `professional` / `sensitive`) |
| Downstream sync | None | Vault → git status → gbose.dev via automated pipelines |

The honest tradeoff: search got simpler before it gets better again. QMD's semantic retrieval beats my current keyword scorer for fuzzy questions like "what did we decide about port conflicts." I know that. Wiring QMD (or something like it) into the Ideaverse MCP is still on the list. But the system as a whole is more capable than QMD alone ever was, because retrieval was only one piece of the problem.

What the Ideaverse layer adds that QMD didn't:

Identity, not just memory. QMD helps an agent find notes. It doesn't tell the agent who I am, how I like to work, or what tone to write in. The three-file AI OS (Me.md, Vault Map.md, Skills Map.md) loads that before any search happens. Every session starts with a briefing, not amnesia.

Procedures, not just retrieval. Skills are executable runbooks. When I say "sync," the agent loads a four-phase pipeline. When I say "capture this," it knows exactly where today's note lives. QMD could surface a skill file; it couldn't tell the agent to follow one.

Cross-tool, not container-locked. QMD lived inside OpenClaw's Docker world. The Ideaverse MCP registers once in ~/.cursor/mcp.json and works from PortRadar, MetaCode, or any repo I open. Skills are plain markdown, so Claude Code or Gemini can read them too. One vault, many tools.

Scope and privacy by design. The MCP server filters what gets retrieved: work notes, finances, and health stay out of default personal scope. QMD indexed the whole vault path. Useful for recall, risky for boundaries.

The vault compounds outward. Projects pull git status from C:\Projects. Journal captures get triaged into the right notes. Blog metadata propagates to gbose.dev. QMD made the vault readable. Ideaverse makes it operational.

So: QMD was the right first move for agent memory. Ideaverse is the move from "my agent can remember" to "every AI session I open already knows who I am, how I work, and what I'm building." The search layer will catch up. The rest already did.

Definitions

A few terms worth pinning before going further.

Ideaverse. My personal knowledge vault. Plain markdown files at C:\Ideaverse, opened in Obsidian. Not a notes app. A structured context space: projects, writing, daily notes, process skills, health, finances. The key property is that it works with any tool that can read text files. Obsidian is the UI; the vault is the thing.

File over AI. The organizing philosophy. Knowledge lives in the files, not in any model's memory or any tool's database. When Claude loses context, when Cursor resets, when I switch from one AI to another, the vault is still there. Concrete consequence: I never rely on a model to "remember" something that matters. I write it down.

MCP (Model Context Protocol). An open standard for giving AI tools access to external data sources. An MCP server exposes tools that an AI client can call at runtime. In this setup: the vault is a data source, an MCP server sits in front of it, and Cursor calls the server to retrieve context on demand.

Thinking backend. The vault is not passive storage. It carries: who I am (for the AI), what my current projects are and their status, how I like to write (Writing Rules), what procedures to follow for recurring tasks (Skills), and privacy boundaries (what the AI may and may not read). Every AI session that connects to it starts informed.

Why I needed this

The standard workarounds for AI context management all have the same failure mode: they scale to one project, then break.

CLAUDE.md or AGENTS.md files in each repo work. Until you have eight repos and each file has drifted independently. Now you're maintaining eight copies of "who I am" in slightly different words.

Long Cursor User Rules (the Settings UI) inject context into every session. But that context is flat, always-on, and hard to version. You end up with a 2,000-character block that pollutes every session whether it's relevant or not.

System prompts per-tool solve nothing cross-tool. What I tell Claude in its system prompt doesn't follow me to Cursor. What Cursor knows doesn't follow me to a Gemini API call.

The real need was: one source of truth, tool-agnostic, retrievable on demand rather than always injected, and persistent across machine restarts, new repos, and new tools. The vault is that source of truth. The MCP layer is how AI tools query it without loading it all at once.

How I built it

The AI OS: three files

The vault has a lot of content. The AI doesn't need most of it at session start. The whole startup footprint is three files:

Meta/Me.md: who I am, how to work with me, and context boundaries (what a work AI may read vs. what a home AI may read)
Meta/Vault Map.md: vault structure, naming conventions, how to create and retrieve notes, maintenance rules
Meta/Skills Map.md: index of process skills; read the matching skill on demand, not all at once

Every entry point (AGENTS.md, CLAUDE.md at the vault root, per-repo pointers) forwards to these three. Any model, any tool, same three files. The rest of the vault enters context only when needed.

The MCP server

The vault doesn't inject itself. It waits to be asked.

I wrote a zero-dependency Node.js MCP server at Meta/_infra/ideaverse-mcp/server.js. No npm installs. It speaks JSON-RPC 2.0 over stdio and exposes three tools:

ideaverse_search(query, scope, limit): keyword scoring across all vault notes; returns ranked snippets with file paths. A title match scores 5x, a tag match 3x, and each body occurrence 1x. Capped at 15 results.
ideaverse_get(path, section): fetch one note or one section of a note by path. Section-level fetches keep token cost low; I set a hard 8,000-character cap.
ideaverse_map(area, scope): returns a compact routing index, one line per note (path, title, type, tags, one-sentence summary). Cheap. Use this to discover what exists before deciding what to fetch.
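The scoring described for ideaverse_search can be sketched in a few lines. This is an illustration under assumptions, not the actual server.js: the note shape (`path`, `title`, `tags`, `body`), the function names, and the choice to apply the weights once per query term are all hypothetical. Only the 5x/3x/1x weights and the 15-result cap come from the description above.

```javascript
// Hypothetical note shape: { path, title, tags: string[], body }.
// Weights from the post: title 5x, tag 3x, each body occurrence 1x.
const WEIGHTS = { title: 5, tag: 3, body: 1 };
const MAX_RESULTS = 15;

// Count non-overlapping, case-insensitive occurrences of a lowercase term.
function countOccurrences(text, term) {
  if (!term) return 0;
  const haystack = text.toLowerCase();
  let count = 0;
  let index = haystack.indexOf(term);
  while (index !== -1) {
    count += 1;
    index = haystack.indexOf(term, index + term.length);
  }
  return count;
}

// Assumption: the query is split on whitespace and each term is scored separately.
function scoreNote(note, query) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  let score = 0;
  for (const term of terms) {
    if (note.title.toLowerCase().includes(term)) score += WEIGHTS.title;
    if (note.tags.some((tag) => tag.toLowerCase().includes(term))) score += WEIGHTS.tag;
    score += WEIGHTS.body * countOccurrences(note.body, term);
  }
  return score;
}

// Rank notes by score, drop non-matches, and enforce the result cap.
function search(notes, query, limit = MAX_RESULTS) {
  return notes
    .map((note) => ({ path: note.path, score: scoreNote(note, query) }))
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.min(limit, MAX_RESULTS));
}

const exampleNotes = [
  {
    path: "Projects/PortRadar.md",
    title: "Building PortRadar",
    tags: ["project"],
    body: "PortRadar scans ports. Port conflicts are logged.",
  },
  { path: "Notes/misc.md", title: "Misc", tags: [], body: "nothing here" },
];

console.log(search(exampleNotes, "portradar"));
// [ { path: 'Projects/PortRadar.md', score: 6 } ]
console.log(search(exampleNotes, "shipping side projects"));
// []
```

The second query returns nothing because none of its words appear in either note, which is the limitation of pure keyword matching.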

The mental model: ideaverse_map is the table of contents, ideaverse_search is the index, ideaverse_get is the chapter. Progressive disclosure. You don't load the book; you navigate it.
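To make the "chapter" step concrete, here is a rough sketch of how a section-level fetch with a character cap might work. The function names and the heading-matching rule are assumptions for illustration; only the section parameter and the 8,000-character cap come from the tool description.

```javascript
const MAX_CHARS = 8000; // hard cap mentioned in the post

// Return the text under the heading titled `section`, up to the next
// heading of the same or higher level. Returns null if no heading matches.
// Simplification: lines starting with "#" inside code fences are treated as headings.
function getSection(markdown, section) {
  const lines = markdown.split(/\r?\n/);
  let start = -1;
  let level = 0;
  for (let i = 0; i < lines.length; i += 1) {
    const match = /^(#{1,6})\s+(.*?)\s*$/.exec(lines[i]);
    if (!match) continue;
    if (start === -1) {
      if (match[2].toLowerCase() === section.toLowerCase()) {
        start = i;
        level = match[1].length;
      }
    } else if (match[1].length <= level) {
      return lines.slice(start, i).join("\n");
    }
  }
  return start === -1 ? null : lines.slice(start).join("\n");
}

// Fetch a whole note or one section, truncated to the cap.
function getNote(markdown, section) {
  const text = section ? getSection(markdown, section) : markdown;
  if (text === null) return null;
  return text.length > MAX_CHARS ? text.slice(0, MAX_CHARS) : text;
}
```

Asking for one section of a long project note returns only that heading and its subsections, so the cost of a fetch tracks the size of the question rather than the size of the note.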

Scope filtering is built into all three tools. Default scope is personal, which excludes work notes and sensitive Finances/Health content. I can pass sensitive or professional explicitly. This means the AI doesn't accidentally surface work context in a personal session or health data when I'm debugging code.
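A minimal sketch of what path-based scope filtering could look like. The folder prefixes are assumptions (the post names Finances and Health as sensitive and mentions a Professional/Personal split, but not exact paths), and so is the rule that a non-default scope adds its notes on top of the personal ones.

```javascript
// Assumed folder names; adjust to the real vault layout.
const WORK_PREFIXES = ["Professional/"];
const SENSITIVE_PREFIXES = ["Finances/", "Health/"];

// Classify a vault-relative path as "professional", "sensitive", or "personal".
function classify(notePath) {
  const normalized = notePath.replace(/\\/g, "/");
  if (WORK_PREFIXES.some((prefix) => normalized.startsWith(prefix))) return "professional";
  if (SENSITIVE_PREFIXES.some((prefix) => normalized.startsWith(prefix))) return "sensitive";
  return "personal";
}

// Assumption: personal notes are always visible; other kinds only when
// their scope is requested explicitly.
function inScope(notePath, scope = "personal") {
  const kind = classify(notePath);
  return kind === "personal" || kind === scope;
}

console.log(inScope("Projects/PortRadar.md")); // true
console.log(inScope("Health/checkup.md")); // false
console.log(inScope("Health/checkup.md", "sensitive")); // true
```

Applying the same predicate inside all three tools means a note that is out of scope never reaches the ranking, the map, or a direct fetch.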

Global registration

The server is registered once in ~/.cursor/mcp.json:

{
  "mcpServers": {
    "ideaverse": {
      "command": "node",
      "args": ["C:\\Ideaverse\\Meta\\_infra\\ideaverse-mcp\\server.js"]
    }
  }
}

That's it. One config file, all workspaces. When I open a C:\Projects\portradar session or a C:\Projects\metacode session, the vault is already available. I never have to copy a config file into a new repo.
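Once the command is registered, the client launches it and exchanges newline-delimited JSON-RPC 2.0 messages with it over stdin and stdout. The sketch below shows the general shape of handling the `initialize`, `tools/list`, and `tools/call` methods. It is a simplified illustration, not the real server.js: the tool body is a placeholder, the `handleLine` name and the server version string are made up, and the fallback protocol version is an assumption.

```javascript
// Tool table. The run() body is a placeholder, not real vault logic.
const tools = {
  ideaverse_map: {
    description: "Compact routing index of vault notes",
    inputSchema: {
      type: "object",
      properties: { area: { type: "string" }, scope: { type: "string" } },
    },
    run: (args) => `map for area=${args.area ?? "all"}`,
  },
};

// Take one line of JSON-RPC input; return the response line,
// or null when the message is a notification (no id, no reply).
function handleLine(line) {
  let message;
  try {
    message = JSON.parse(line);
  } catch {
    return JSON.stringify({ jsonrpc: "2.0", id: null, error: { code: -32700, message: "Parse error" } });
  }
  if (message === null || typeof message !== "object") {
    return JSON.stringify({ jsonrpc: "2.0", id: null, error: { code: -32600, message: "Invalid Request" } });
  }
  const { id, method } = message;
  const params = message.params ?? {};
  if (id === undefined) return null;

  const reply = (result) => JSON.stringify({ jsonrpc: "2.0", id, result });
  const fail = (code, text) => JSON.stringify({ jsonrpc: "2.0", id, error: { code, message: text } });

  if (method === "initialize") {
    return reply({
      protocolVersion: params.protocolVersion ?? "2024-11-05", // assumed fallback
      capabilities: { tools: {} },
      serverInfo: { name: "ideaverse", version: "0.0.0" },
    });
  }
  if (method === "tools/list") {
    return reply({
      tools: Object.entries(tools).map(([name, tool]) => ({
        name,
        description: tool.description,
        inputSchema: tool.inputSchema,
      })),
    });
  }
  if (method === "tools/call") {
    if (!Object.hasOwn(tools, params.name)) return fail(-32602, `Unknown tool: ${params.name}`);
    const text = tools[params.name].run(params.arguments ?? {});
    return reply({ content: [{ type: "text", text }] });
  }
  return fail(-32601, `Method not found: ${method}`);
}

// Wiring, left commented so loading this sketch has no side effects:
// require("node:readline").createInterface({ input: process.stdin })
//   .on("line", (line) => { const out = handleLine(line); if (out) process.stdout.write(out + "\n"); });
```

Keeping the handler a pure string-in, string-out function makes it easy to exercise without spawning a process, and it needs nothing beyond Node's standard library.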

Global rules

~/.cursor/rules/gaurab-workflow.mdc has alwaysApply: true. It tells every Cursor session: use the ideaverse MCP tools for personal context; read Me.md at session start; on "sync" load the sync skill; on "capture" load the capture skill.

There are also agent-conduct.mdc (communication and code standards), git.mdc (commit protocol, loaded on demand), and pr.mdc (PR protocol, loaded on demand). All in ~/.cursor/rules/. None of them are repo-specific.

The rule in the Settings UI is just a short pointer to this folder. Not a paste of the actual content.

Skills

Skills are markdown files in Meta/Skills/. Each one is a reusable procedure: sync.md, capture.md, daily-note.md, journal-triage.md, projects-pull.md, website-sync.md, initialize-harness.md.

An AI reads a skill when it needs to perform that procedure. Not before. The Skills Map is the index; each skill is the full instructions.

Cursor discovers skills via thin pointer files in ~/.cursor/skills/{name}/SKILL.md. Each pointer just says: "read the real skill at C:\Ideaverse\Meta\Skills\{name}.md." One copy of the actual skill. No drift.
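The pointer files are small enough to generate. A sketch, assuming the directory layout quoted above; the `pointerFor` helper and the exact wording of the pointer text are hypothetical, and any metadata Cursor may expect inside SKILL.md is not shown.

```javascript
const path = require("node:path");

// Location of the real skills, as quoted in the post.
const VAULT_SKILLS_DIR = "C:\\Ideaverse\\Meta\\Skills";

// Build the pointer for one skill: where it goes and what it says.
function pointerFor(name) {
  const target = path.win32.join(VAULT_SKILLS_DIR, `${name}.md`);
  return {
    // Path relative to ~/.cursor/skills/
    file: path.posix.join(name, "SKILL.md"),
    // Hypothetical wording; the point is that it only redirects.
    content: `# ${name}\n\nRead the real skill at ${target} and follow it.\n`,
  };
}

for (const name of ["sync", "capture", "projects-pull"]) {
  const pointer = pointerFor(name);
  console.log(pointer.file);
}
// sync/SKILL.md
// capture/SKILL.md
// projects-pull/SKILL.md
```

Because the pointer carries no instructions of its own, regenerating it never changes behavior; only the vault copy of the skill does.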

Because the skill files are plain markdown, they work with Claude, Cursor, Gemini, or anything else that can read a file. Write once; run anywhere.

Sync pipelines

A few automation pieces keep the vault current:

projects-pull: scans every git repo under C:\Projects, reads the remote URL, last commit date, language, and README summary, and writes structured status back into the corresponding vault note. Run this after shipping code; the vault reflects it.
journal-triage: takes the ## Captured sections from daily notes and routes them to their proper destinations (project notes, writing ideas, personal todos).
midnight-sync-loop: PowerShell loop that runs after midnight, opens the Obsidian URI that triggers the Remotely Save plugin's sync, then runs the vault rebuild scripts. Vault is current by morning.
website-sync: when a project's status or a blog post's metadata changes in the vault, this propagates to gbose.dev.
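As an illustration of the projects-pull step, here is a sketch of reading the git-derived fields for one repo. It covers only the remote URL and last commit date; language and README summary are left out. The function names and the frontmatter keys are hypothetical, and the `run` parameter is injectable so the logic can be tried without a real repository.

```javascript
const { execFileSync } = require("node:child_process");

// Default runner: execute git inside repoDir and return trimmed stdout.
function runGit(repoDir, args) {
  return execFileSync("git", ["-C", repoDir, ...args], { encoding: "utf8" }).trim();
}

// Collect the fields that come straight from git.
function readRepoStatus(repoDir, run = runGit) {
  const safe = (args) => {
    try {
      return run(repoDir, args);
    } catch {
      return null; // e.g. no "origin" remote, or no commits yet
    }
  };
  return {
    remote: safe(["remote", "get-url", "origin"]),
    lastCommit: safe(["log", "-1", "--format=%cI"]), // committer date, ISO 8601
  };
}

// Render as YAML-style lines for a vault note; key names are hypothetical.
function toFrontmatter(status) {
  return [
    `repo_remote: ${status.remote ?? "unknown"}`,
    `last_commit: ${status.lastCommit ?? "unknown"}`,
  ].join("\n");
}
```

Treating a failed git command as a missing value rather than an error keeps one odd repo from stopping the scan of everything else under C:\Projects.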

The vault is the source of truth. Code status and site content are downstream.

Here is the architecture as a diagram:

Obsidian Vault (C:\Ideaverse)
         |
         v
ideaverse-mcp server.js
         |
         v
~/.cursor/mcp.json  <---  registered once, all workspaces
         |
         v
Cursor Agent Session  <---  ~/.cursor/rules/ (alwaysApply)
         ^
         |
~/.cursor/skills/  <---  point to  Meta/Skills/*.md

What it has made possible

The most concrete change: I open a Cursor session in any repo and the agent already knows who I am. It knows my writing rules before drafting anything. It knows the project status before suggesting next steps. It knows which parts of the vault are off-limits without me specifying.

A few specific examples from the last few weeks:

When I was writing the website redesign post, the agent read Writing Rules.md before drafting a word. No em dashes in the output. Concrete voice. It matched existing post tone because the rules were explicit and in context, not implicit in my head.

When I say "sync," a four-phase pipeline runs: scan git repos and update vault, triage daily captures, rebuild vault index, optionally push to gbose.dev. One word. The skill file has the full instructions. I don't repeat them every session.

When I start work on a new side project, I run "projects pull." The vault note for that project gets git metadata automatically. Status stays current without manual updates.

The blog post you are reading was drafted in a session where the agent had the full writing rules, vault conventions, and my tone in context from the start. The file ended up in the right folder with the right frontmatter because that structure is documented in Vault Map.md and the agent read it.

The larger thing: I am not re-establishing context with every session. Identity is persistent. Procedures are persistent. Voice is persistent. The vault carries continuity across sessions the way human memory carries it across days.

Pitfalls and challenges

Context bloat is the real risk. The naive version of this system injects the whole vault into every session. That's expensive, slow, and counterproductive. Getting progressive disclosure right took a few iterations. The current design: three small startup files plus on-demand MCP calls. Total startup cost is a few thousand tokens; full-session cost depends on what you ask.

Vault drift. YAML frontmatter inconsistencies, broken wikilinks, notes that reference projects that no longer exist. The sync scripts catch some of this. But drift is slow and cumulative. I run a sync-check script periodically, but the vault still has notes I haven't touched in months that are probably wrong.
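One of the drift checks, broken wikilinks, is easy to sketch. This is an illustration rather than the actual sync-check script: the `Map` of note name to body is a hypothetical input shape, and matching on the last path segment, case-insensitively, is an assumption about how links should resolve.

```javascript
// Extract wikilink targets: [[Note]], [[Note|alias]], [[Note#Heading]].
// Embeds such as ![[image.png]] match too and would need filtering in practice.
function wikilinkTargets(markdown) {
  const targets = [];
  const pattern = /\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]/g;
  for (const match of markdown.matchAll(pattern)) {
    targets.push(match[1].trim());
  }
  return targets;
}

// notes: Map of note name (file name without .md) -> markdown body.
// Returns every link whose target does not match a known note name.
function brokenLinks(notes) {
  const known = new Set([...notes.keys()].map((name) => name.toLowerCase()));
  const broken = [];
  for (const [name, body] of notes) {
    for (const target of wikilinkTargets(body)) {
      // Compare on the last path segment so [[Folder/Note]] resolves to "Note".
      const leaf = target.split("/").pop().toLowerCase();
      if (!known.has(leaf)) broken.push({ from: name, to: target });
    }
  }
  return broken;
}

const vault = new Map([
  ["PortRadar", "See [[MetaCode]] and [[Old Project|legacy]]."],
  ["MetaCode", "Back to [[PortRadar#Status]]."],
]);
console.log(brokenLinks(vault));
// [ { from: 'PortRadar', to: 'Old Project' } ]
```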

Windows path issues. mcp.json is JSON, so every backslash in a Windows path has to be escaped as a double backslash (C:\\Ideaverse\\...). Obvious once you know, annoying to debug when you don't. The server path, the skill paths, and the rule paths all have to be rechecked when they move; I've hit this more than once after a machine migration.
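One way to sidestep hand-escaping is to let `JSON.stringify` write the file contents. A small sketch; `buildMcpConfig` is a hypothetical helper, and the path is the one from the registration example.

```javascript
// String.raw keeps the backslashes exactly as typed (single, Windows-style).
const serverPath = String.raw`C:\Ideaverse\Meta\_infra\ideaverse-mcp\server.js`;

// Build the mcp.json text; JSON.stringify doubles each backslash for us.
function buildMcpConfig(scriptPath) {
  return JSON.stringify(
    { mcpServers: { ideaverse: { command: "node", args: [scriptPath] } } },
    null,
    2
  );
}

const json = buildMcpConfig(serverPath);
console.log(json.includes("C:\\\\Ideaverse\\\\Meta")); // true: the text has doubled backslashes
console.log(JSON.parse(json).mcpServers.ideaverse.args[0] === serverPath); // true: it round-trips
```

Writing that string to ~/.cursor/mcp.json after a migration gives a correctly escaped file without touching a single backslash by hand.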

Automation fragility. The midnight sync loop depends on Windows Task Scheduler triggering a PowerShell script that opens an Obsidian URI to start the Remotely Save plugin's sync. It works. But if Obsidian isn't open, the sync plugin doesn't run. If the plugin times out, the loop continues anyway. The automation covers the common case well and the edge cases poorly.

AI OS freshness. Me.md and Vault Map.md are the foundation every session stands on. If they're stale, every session inherits bad context. When I add a new area to the vault or change how I want the AI to behave, I have to remember to update the maps. This is not automatic. I've had sessions where the agent followed outdated instructions because I forgot to update Me.md after a workflow change.

What I still need to improve

Semantic search. The current ideaverse_search scores keyword matches: title, tags, body frequency. It misses conceptual connections. If I search for "shipping side projects" it won't surface a note titled "Building PortRadar" unless those exact words appear. My earlier QMD setup handled this better with hybrid BM25 + vector search inside OpenClaw. Wiring that retrieval quality into the Ideaverse MCP is the next meaningful search upgrade.

Richer project metadata. projects-pull gets the git remote, last commit date, language, and README summary. It doesn't get CI status, open issues, deployment health, or PR count. The vault project notes are informed about code, but not about the full state of a project.

Smarter context selection. The agent sometimes fetches more than it needs. A better manifest with richer summaries would let ideaverse_map be more selective before the agent decides what to fetch with ideaverse_get. Right now the routing is decent but not tight.

Automated session capture. When a session produces something worth keeping (a decision, a finding, an idea), I have to say "capture this" explicitly. A heartbeat approach (saving session summaries at regular intervals or before context compaction) would close the loop without depending on me remembering.

Structural boundary enforcement. The Professional/Personal separation is documented in Me.md and enforced by scope parameters in the MCP. But it's trust-based: the AI reads the rules and follows them. If I forget to pass the right scope, or if an AI doesn't read Me.md carefully, the boundary can blur. Making the scope enforcement structural (vault-level ACL rather than AI-level instruction) would be more robust.

The system works today. Sessions start informed, procedures run reliably, and the vault compounds over time. The gaps above are real, but they're improvements to something that already functions. That's a better place to be than the alternative, which was starting fresh every time.