# atlaso — complete docs

> A memory store that flags its own conflicts before your agent does.

Generated from the rendered docs site at https://www.atlaso.ai/docs. The same Markdown source ships with the SDK at https://github.com/imashishkh21/atlaso/tree/main/docs.

---

# atlaso SDK — Documentation

> A memory store that flags its own conflicts before your agent does.

The full rendered docs site lives at ****. The files in this folder are the same content as plain Markdown so you can read it without leaving GitHub. The website is the source of truth — when the two diverge, the website wins. Open an issue if you find drift.

---

## Read in this order

| Page | What it covers |
|---|---|
| [Getting Started](./getting-started.md) | Install, quickstart, your first deposit and recall. |
| [Concepts](./concepts.md) | Field 3.0 mental model — deposits, polarity, evidence, scope, confidence, conflict, idempotency. |
| [API Reference](./api-reference.md) | `Memory`, `AsyncMemory`, `UserHandle` — every method, signature, kwarg. |
| [Data Types](./data-types.md) | `Deposit`, `SearchResult`, `Diagnostics`, `AddResult`, `RetractResult`, `PeekView` … |
| [Errors](./errors.md) | The full exception hierarchy + which transport errors are retried. |
| [Architecture](./architecture.md) | The gate, retrieval pipeline, storage layer — with diagrams. |
| [CLI](./cli.md) | Every `atlaso` subcommand. |
| [MCP & Hooks](./mcp-and-hooks.md) | `atlaso mcp` for Claude Code / Cursor / Codex / Windsurf / Cline + auto-memory hooks. |
| [Configuration](./configuration.md) | Env vars, storage-path resolution, transport injection. |
| [Admin](./admin.md) | Cross-tenant operations behind a literal-confirm wall. |
| [Recipes](./recipes/) | Framework integrations — LangChain · LlamaIndex · DSPy · OpenAI Agents · CrewAI. |
## Quick links

- Website docs:
- Source:
- Issues:
- PyPI:

## Sixty-second pitch

```python
from atlaso import Memory

m = Memory()
user = m.for_user("alice")
user.add("Alice prefers oat milk in lattes")

results = user.recall("milk")
print(results.explain())  # bag-level verdict
for r in results:
    if r.has_disagreement:
        print("WARN", r.content, "conflicts with", r.conflict_peers)
    elif r.is_confident:
        print("OK ", r.content)
    else:
        print("? ", r.content)
```

Most memory stores return hits. Atlaso returns hits **plus a verdict on whether the hits are settled**. The defining mechanic is at [Concepts → Confidence & conflict](./concepts.md#confidence--conflict).

## Status

`atlaso==0.1.0a4` — research preview. Apache-2.0. Python 3.10+. Zero telemetry. Only runtime dependency is `httpx`.

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Getting Started

## Install

```bash
pip install atlaso
```

Tested on CPython 3.10, 3.11, 3.12, 3.13. The only runtime dependency is `httpx>=0.27,<0.29`. Atlaso never imports an LLM client.

### Optional extras

```bash
# MCP server (fastmcp) — real
pip install "atlaso[mcp]"

# Reserved namespaces — no code, just the name (warning on Memory() construction)
pip install "atlaso[langchain]"
pip install "atlaso[llamaindex]"
pip install "atlaso[dspy]"
pip install "atlaso[openai-agents]"
pip install "atlaso[crewai]"
```

### Verify

```bash
atlaso doctor
```

The doctor runs an end-to-end sanity check: imports, vendored engine, resolved storage path with source, and an `add` → `recall` → `retract` round-trip in a tempdir. Returns `0` on a healthy install.

---

## Quickstart

Five minutes from `pip install` to a working conflict-flagging loop.

### 1. Bind to a user

```python
from atlaso import Memory

m = Memory()                # resolves /.atlaso/ on first call
user = m.for_user("alice")  # frozen handle bound to alice
```

Always pass an authenticated identity to `for_user(...)` — never a value from a request body.
The handle pre-fills `user_id` on every call.

### 2. Write a deposit

```python
result = user.add("Alice prefers oat milk in lattes")
print(result.id)
print(result.deposit.polarity, result.deposit.evidence_grade)  # open anecdotal
```

Defaults: `polarity="open"`, `evidence_grade="anecdotal"`, `scope=None`. No Field 3.0 ceremony required for casual notes.

### 3. Recall

```python
results = user.recall("milk", limit=5)
print(results.explain())  # bag-level verdict
for r in results:
    print(r.is_confident, r.content)
```

### 4. Plant a conflict, watch the bag light up

```python
from atlaso import Scope

# env= satisfies the gate's provenance rule for narrow-negative
scope = Scope(model="gpt-5", dataset="prod-2026", env="prod")

user.add("threshold 0.7 is optimal", polarity="positive",
         evidence_grade="observed", scope=scope)
user.add("threshold 0.7 over-flags", polarity="negative",
         evidence_grade="observed", scope=scope)

hits = user.recall("threshold", scope=scope)
print(hits.has_disagreement)  # True
for h in hits:
    print(h.has_disagreement, h.content, "peers:", h.conflict_peers)
```

### 5. Supersede instead of update

Deposits are immutable. `m.update(...)` raises `AttributeError` pointing at `contradict()`.

```python
old = user.add("Alice prefers oat milk")
new = user.contradict(
    "Alice now prefers soy milk",
    contradicts=[old.id],
    reason="Apr-23 conversation update",
)
print(new.deposit.id, "supersedes", old.id)
```

### 6. Retract (soft by default, hard for GDPR)

```python
user.retract(new.deposit.id, reason="customer request")                # soft tombstone
user.retract(new.deposit.id, reason="GDPR erasure", hard_delete=True)  # irreversible
```

### 7. Health check

```python
diag = user.health(window_days=30)
print(diag.fmi, diag.coverage, diag.precision, diag.resolution, diag.density)
print(diag.explain())
# "Memory is healthy (FMI 72/100). Searches are returning confident answers."
```

---

## What's next

- [Concepts](./concepts.md) — Field 3.0 mental model in detail.
- [API Reference](./api-reference.md) — every method, every kwarg.
- [CLI](./cli.md) — `atlaso add`, `atlaso recall`, …
- [MCP & Hooks](./mcp-and-hooks.md) — run as an MCP server for Claude Code / Cursor / Codex.

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Concepts

Atlaso is built around **Field 3.0** — every deposit carries a polarity, an evidence grade, and a structured scope, so the retrieval pipeline can group hits, flag conflicts, and tell you whether the data is settled. This page is the mental model.

## Deposits

A deposit is an immutable, typed record of one thing the system observed. Every deposit carries:

- `content` — the plain-English claim.
- `polarity` — `positive` / `negative` / `cautionary` / `open`.
- `evidence_grade` — `anecdotal` / `observed` / `replicated` / `verified`.
- `scope` — six facets pinning the claim to a context.
- `tags`, `artifact_refs` — free-form context.
- `contradicts` — directional supersession edges to earlier deposits.
- Provenance — `author`, `author_role`, `task_id`, `created_at`.

### Immutability is a feature

Deposits cannot be edited in place. The SDK has no `update()` method — calling it raises `AttributeError` pointing at `contradict()`. Updating a fact creates a new deposit that supersedes the old one. The supersession edge is preserved in the `contradictions` table.

```python
m.update("anything")
# AttributeError: deposits are immutable.
# Use m.contradict(new_text, contradicts=[old.id], reason="…") instead.
```

### Writing one

```python
from atlaso import Memory, Scope

m = Memory()
user = m.for_user("alice")
user.add(
    "Threshold of 0.7 over-flags in production",
    polarity="negative",
    evidence_grade="observed",
    scope=Scope(model="gpt-5", env="prod"),
    tags=["threshold", "fp-rate"],
)
```

The **gate** may reject the write — see [Architecture · The gate](./architecture.md#the-gate).

---

## Polarity & evidence

### Polarity

A deposit's *direction*.

- `positive` — "X works" / "X is true".
- `negative` — "X doesn't work" / "X failed".
- `cautionary` — "X works *but*…" / "watch out for…".
- `open` — "we haven't tested yet" / a note, not a claim.

`positive`, `negative`, and `cautionary` are *directional*. When two or more distinct directional polarities appear in the same scope bag, the bag is flagged as in conflict — that includes `negative` + `cautionary` without a positive. `open` never triggers a conflict.

### Evidence grade

How well-supported the claim is.

- `anecdotal` — one person, one observation, no logs.
- `observed` — measured at least once with artifacts.
- `replicated` — multiple independent observations agree.
- `verified` — replicated and recorded with provenance.

### Why both axes

The gate weighs polarity against evidence to decide if a write should pass. A broad positive claim needs `replicated` or better. A narrow negative claim only needs provenance — a known failure is allowed in cheaply because downstream agents need to know.

### Author-role bump

When a deposit's `author_role` normalises to `redteam` (matches `redteam`, `red_team`, `red-team`, `RedTeam` …), `negative` and `cautionary` claims get a +1 evidence-grade bump at the gate. Positive and open deposits are unaffected.

### Pedagogical errors

```python
user.add("hello", polarity="strong")
# InputValidationError: polarity must be one of:
#   "positive"   — claim that X works / is true
#   "negative"   — claim that X doesn't work / failed
#   "cautionary" — claim with caveats
#   "open"       — note or open question
```

---

## Scope

A six-facet dataclass that pins each deposit to a concrete context.

```python
from atlaso import Scope

@dataclass(frozen=True, slots=True)
class Scope:
    model: str | None = None
    dataset: str | None = None
    env: str | None = None
    version: str | None = None
    n: int | None = None
    seed: int | None = None
    note: str | None = None  # folded into the FTS index
```

### Why it matters

Scope is the **grouping key** for confidence and conflict.
Two deposits with the same six facets land in the same scope bag. The bag is what gets a precision score, a conflict check, and the `is_confident` verdict.

A claim with no scope is broader than one with three facets — which is why broad positives need stronger evidence to pass the gate.

### Filtering recall

```python
hits = user.recall(
    "threshold",
    scope=Scope(env="prod"),  # only deposits scoped to env=prod
)
```

Partial matches work. Facets you don't pass aren't filtered.

### Breadth feeds the gate

The gate counts non-None scope facets. `≥2` is *narrow*; otherwise *broad*. Broad positives need `replicated`; narrow negatives need provenance (`artifact_refs` OR `env` OR `version`).

---

## Confidence & conflict

The single mechanic that makes atlaso different.

### Four flags on every hit

A `SearchResult` carries four dispersion-aware flags on top of the underlying `Deposit`:

- `is_confident: bool` — bag is multi-sample, near-unanimous on a dominant polarity, no directional conflict.
- `has_disagreement: bool` — bag contains two or more distinct directional polarities.
- `agreement_score: float` — fraction of the bag matching the dominant polarity, in `[0, 1]`.
- `conflict_peers: tuple[str, ...]` — deposit IDs of the opposing-polarity hits that triggered the conflict.

### The exact rule for `is_confident`

Inside the engine the bag-level fields are named `bag_precision` / `has_conflict` / `is_single_sample`. They're re-surfaced on the public `SearchResult` as `agreement_score`, `has_disagreement`, and `is_thin_evidence` respectively.

```python
# _engine/aware.py
@property
def is_confident(self) -> bool:
    return (
        self.bag_precision >= 0.99     # public: agreement_score
        and not self.is_single_sample  # public: is_thin_evidence
        and not self.has_conflict      # public: has_disagreement
    )
```

Three conditions must all hold. A single observation, even with high evidence grade, can never be `is_confident`. A fifty-fifty split, even with many observations, can never be `is_confident`.
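Because the flags are purely structural, you can reproduce them from a list of polarities alone. A simplified sketch, under two stated assumptions (`open` deposits are excluded when computing precision, and a bag with no directional claims is never confident; the engine's exact accounting isn't specified here):

```python
from collections import Counter

DIRECTIONAL = {"positive", "negative", "cautionary"}

def bag_flags(polarities: list[str]) -> dict:
    """Derive bag-level dispersion flags from raw polarities (illustrative only)."""
    directional = [p for p in polarities if p in DIRECTIONAL]
    counts = Counter(directional)
    has_conflict = len(counts) >= 2        # >= 2 distinct directional polarities
    is_single_sample = len(polarities) <= 1
    # Fraction of directional deposits on the dominant polarity.
    precision = max(counts.values()) / len(directional) if directional else 1.0
    return {
        "bag_precision": precision,
        "has_conflict": has_conflict,
        "is_confident": (
            bool(directional)  # assumption: an all-open bag is never confident
            and precision >= 0.99
            and not is_single_sample
            and not has_conflict
        ),
    }

assert bag_flags(["positive", "positive"])["is_confident"]    # unanimous bag of two
assert not bag_flags(["positive"])["is_confident"]            # single sample
assert bag_flags(["positive", "negative"])["has_conflict"]    # split bag
assert bag_flags(["negative", "cautionary"])["has_conflict"]  # conflict without a positive
```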
A unanimous bag of two — yes. The rule is intentionally strict.

### Structural, not semantic

Conflict is *structural*. Atlaso doesn't embed your text to decide if it disagrees. It groups by the structured scope and counts polarities.

### Aggregate flags are computed before slicing

`SearchResults.has_disagreement` looks at *every* bag the engine returns, not just the first `limit` results. A `limit=5` slice can't hide a conflict bag from your model. Intentional.

### `.explain()` in plain English

```python
print(results.explain())
# "5 hits across 2 bags · 1 bag in conflict · not confident"
```

---

## Idempotency

Bulk imports and retried writes need replay safety. Atlaso uses Stripe-style content-addressed keys with a 24-hour window.

### Why keys are required

`add()` is convenient but offers no replay safety. `add_many()` requires a per-item `idempotency_key` so retrying a partially-failed batch is safe.

### The helper

```python
from atlaso import idempotency_key

key = idempotency_key("alice", "Alice prefers oat milk", "2026-05-11")
# 'ak_b7c1d3e8a4f5...'
```

Pure blake2b-16 hash of the arguments, prefixed `ak_`. Same args → same key.

### Bulk imports

```python
from atlaso import AddItem, idempotency_key

items = [
    AddItem(
        content=row["text"],
        idempotency_key=idempotency_key("preferences-v1", row["id"]),
    )
    for row in rows
]
result = user.add_many(items)
print(result.committed, "new")
print(result.duplicates, "replays")
print(result.failed, "rejected")
```

### The rules

- **Same key + same content within 24h** → atlaso treats it as a replay. In `add_many()` the replayed deposit lands in `AddManyResult.duplicates` rather than `committed`.
- **Same key + different content** → `IdempotencyKeyConflict` with `.existing_id`.
- **After 24h** → the key is forgotten.

Keys are stored in `/_idempotency.db`, separate from your deposits.
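The helper's "blake2b-16, prefixed `ak_`" description is easy to replicate if you want keys without importing the SDK. A minimal sketch; how the real helper joins its arguments is an assumption here (NUL-separated), so the exact digests will differ:

```python
import hashlib

def idempotency_key_sketch(*parts: str) -> str:
    """Illustrative re-implementation: blake2b with a 16-byte digest,
    hex-encoded, prefixed 'ak_'. The argument-joining scheme is assumed."""
    joined = "\x00".join(parts).encode("utf-8")
    return "ak_" + hashlib.blake2b(joined, digest_size=16).hexdigest()

k1 = idempotency_key_sketch("alice", "Alice prefers oat milk", "2026-05-11")
k2 = idempotency_key_sketch("alice", "Alice prefers oat milk", "2026-05-11")
assert k1 == k2                                 # same args -> same key
assert k1.startswith("ak_") and len(k1) == 35   # 'ak_' + 32 hex chars
```

Content-addressing is what makes the 24-hour window safe: a retried batch re-derives the same keys, so replays are detected without any client-side state.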
---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# API Reference

The three public client classes: `Memory`, `AsyncMemory`, `UserHandle`.

## `Memory`

The synchronous front door. Long-lived per process. Storage resolves lazily on first call — no I/O in `__init__`.

### Constructor

```python
Memory(
    *,
    api_key: str | None = None,
    base_url: str | None = None,
    path: str | os.PathLike[str] | None = None,
    validate_user_id: Callable[[str], None] | None = None,
    timeout: float | httpx.Timeout = 10.0,
    max_retries: int = 2,
    suppress_async_warning: bool | None = None,
    transport: httpx.BaseTransport | None = None,
)
```

Every argument is keyword-only.

- `api_key` — reserved for the v0.2 remote backend; defaults to `ATLASO_API_KEY` env.
- `base_url` — reserved for v0.2; defaults to `ATLASO_BASE_URL` env or `https://api.atlaso.dev`.
- `path` — SQLite location. If `None`, resolved via the 5-step path walker (see [Configuration](./configuration.md)).
- `validate_user_id` — optional callback called inside `for_user()` and every lower-level call; raise to block.
- `timeout` — httpx timeout for the future remote backend.
- `max_retries` — transport-level retries (network + 5xx + 429).
- `suppress_async_warning` — silence `SyncInAsyncWarning`. Lever order: this arg > `ATLASO_ASYNC_WARNINGS=0` > `warnings.filterwarnings(...)`.
- `transport` — inject an `httpx.BaseTransport` for tests. **Passing an `httpx.Client` raises `ConfigValidationError`** — the threat model is a silent cross-tenant Authorization-header leak. Use `httpx.MockTransport(...)` for tests.

### `for_user(user_id) -> UserHandle`

Returns a frozen handle bound to `user_id`. Pre-fills `user_id` on every call so identity can't be fumbled per-method.

### `add(text, *, user_id, polarity="open", evidence_grade="anecdotal", scope=None, tags=None, contradicts=None, author=None) -> AddResult`

Write a deposit. Raises `InputValidationError`, `DepositRejectedError`, or `MissingContradictsError`.
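The gate policy behind `DepositRejectedError` can be paraphrased in pure Python. This is an illustrative simplification of the two rules stated in Concepts (broad positives need `replicated` or better; narrow negatives/cautionaries need provenance), not the engine's actual code, and `gate_sketch` is a hypothetical name:

```python
GRADES = ["anecdotal", "observed", "replicated", "verified"]

def gate_sketch(polarity: str, grade: str, facets: dict, artifact_refs=()):
    """Return (ok, reason). `facets` maps the six scope facet names to values."""
    non_null = sum(1 for v in facets.values() if v is not None)
    narrow = non_null >= 2  # >= 2 non-None facets is "narrow", else "broad"
    if polarity == "positive" and not narrow:
        if GRADES.index(grade) < GRADES.index("replicated"):
            return False, "broad positive needs replicated or better"
    if polarity in ("negative", "cautionary") and narrow:
        provenance = bool(artifact_refs) or facets.get("env") or facets.get("version")
        if not provenance:
            return False, "narrow negative/cautionary needs provenance"
    return True, "pass"

empty = dict.fromkeys(["model", "dataset", "env", "version", "n", "seed"])
assert gate_sketch("positive", "anecdotal", empty)[0] is False
assert gate_sketch("positive", "replicated", empty)[0] is True
assert gate_sketch("negative", "observed", dict(empty, model="gpt-5", env="prod"))[0] is True
```

The real gate also applies the red-team author-role bump and other invariants; treat this only as a mental model for reading `gate_reason` traces.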
### `recall(query, *, user_id, limit=10, scope=None) -> SearchResults`

Dispersion-aware search. `limit` is bounded `1..1000`. The returned `SearchResults` has bag-level `has_disagreement` / `is_confident` flags computed across **all** bags before slicing.

### `get(deposit_id, *, user_id) -> Deposit | None`

Fetch one deposit by id. **Returns `None` on miss** — deliberately distinct from `NotFoundError`, for `dict.get()` muscle memory.

### `peek(user_id, *, limit=10) -> PeekView`

REPL/Jupyter snapshot — recent deposits + FMI + a recent-disagreement flag.

### `list_recent(*, user_id, limit=20, offset=0) -> list[Deposit]`

Newest-first paginated list.

### `contradict(new_text, contradicts, *, reason, user_id) -> AddResult`

Atomic supersede. Writes a new deposit and marks one or more existing deposits as superseded. `reason` is required; an empty `contradicts` list raises `InputValidationError`.

### `retract(deposit_id, *, reason, user_id, hard_delete=False, force=False) -> RetractResult`

Soft retract by default (the deposit becomes invisible to recall but survives in the file). `hard_delete=True` is irreversible. `force` is reserved for v0.2 server-side protected-deposit flags; it has no effect on the local backend today.

### `health(*, user_id, window_days=30) -> Diagnostics`

FMI + four pillars. `window_days` bounded `1..365`.

### `add_many(items, *, user_id, on_gate_reject="skip") -> AddManyResult`

Bulk import. Each `AddItem` requires an `idempotency_key`. Skip-and-report semantics — successful items in `result.committed`, idempotent replays in `result.duplicates`, gate rejections in `result.failed`.

### `close()` / context manager

```python
with Memory() as m:
    user = m.for_user("alice")
    user.add("…")
# backend closed, httpx client closed
```

A caller-supplied transport is **not** closed for you.

### `update()` is a typo-blocker

```python
m.update("anything")
# AttributeError: deposits are immutable.
# Use m.contradict(new_text, contradicts=[old.id], reason="…") instead.
```

---

## `AsyncMemory`

The async mirror of `Memory`. Same kwargs and method shapes; data-plane methods are awaitable.

### Differences from `Memory`

- `transport` must be an `httpx.AsyncBaseTransport`. Passing `httpx.AsyncClient` raises `ConfigValidationError`.
- `aclose()` instead of `close()`.
- `__aenter__` / `__aexit__` instead of the sync context manager.
- `for_user(user_id)` stays **sync** — returns an `AsyncUserHandle` with no I/O. Only the handle's data methods are awaitable.
- `update()` is sync (it just raises with a pointer at `contradict()`).
- `suppress_async_warning` is accepted for parity but is a no-op.
- Calling `AsyncMemory` from sync code raises immediately rather than warning.

### Canonical FastAPI shape

```python
from contextlib import asynccontextmanager

from fastapi import Depends, FastAPI

from atlaso import AsyncMemory

memory: AsyncMemory | None = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    global memory
    memory = AsyncMemory()
    yield
    await memory.aclose()

app = FastAPI(lifespan=lifespan)

@app.post("/remember")
async def remember(text: str, user_id: str = Depends(current_user)):
    user = memory.for_user(user_id)  # sync, no I/O
    result = await user.add(text)
    return {"id": result.id}

@app.get("/recall")
async def recall(q: str, user_id: str = Depends(current_user)):
    user = memory.for_user(user_id)
    hits = await user.recall(q)
    return {
        "verdict": hits.explain(),
        "is_confident": hits.is_confident,
        "has_disagreement": hits.has_disagreement,
        "items": [
            {
                "content": h.content,
                "is_confident": h.is_confident,
                "has_disagreement": h.has_disagreement,
                "agreement_score": h.agreement_score,
            }
            for h in hits
        ],
    }
```

> v0.1 wraps the sync SQLite backend in `asyncio.to_thread()`. A native async backend ships in v0.2. The public API stays the same.

---

## `UserHandle`

Frozen slotted dataclass returned by `Memory.for_user(...)`. Holds the client and an authenticated `user_id`. Every method pre-fills `user_id` and delegates.
```python
m = Memory(validate_user_id=allowlist_check)
user = m.for_user(auth.user.id)  # validate runs once here
user.add("…")
hits = user.recall("…")
```

Method surface mirrors `Memory` minus the `user_id=` kwarg:

- `user.add(text, *, polarity="open", evidence_grade="anecdotal", scope=None, tags=None, contradicts=None, author=None)`
- `user.recall(query, *, limit=10, scope=None)`
- `user.get(deposit_id)`
- `user.peek(*, limit=10)`
- `user.list_recent(*, limit=20, offset=0)`
- `user.contradict(new_text, contradicts, *, reason)`
- `user.retract(deposit_id, *, reason, hard_delete=False, force=False)`
- `user.health(*, window_days=30)`
- `user.add_many(items, *, on_gate_reject="skip")`

`user.user_id` exposes the bound id. The handle is frozen — to rebind, call `for_user(...)` again on the underlying `Memory`. There is no `handle.rebind()`.

`AsyncMemory.for_user(...)` returns an `AsyncUserHandle` with the same method names but every data call awaitable.

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Data Types

Every result type the SDK returns. All are frozen, slotted dataclasses with a stable wire format (`to_dict()` includes a `"kind"` discriminator). Internal `dataclasses.asdict()` is forbidden in public code by a CI gate.

## `Deposit`

```python
@dataclass(frozen=True, slots=True)
class Deposit:
    id: str
    content: str
    polarity: Polarity             # positive | negative | cautionary | open
    evidence_grade: EvidenceGrade  # anecdotal | observed | replicated | verified
    scope: Scope
    user_id: str
    tags: tuple[str, ...]
    contradicts: tuple[str, ...]
    artifact_refs: tuple[str, ...]
    created_at: datetime
    author: str | None
    author_role: str | None
    repro_status: ReproStatus      # unreplicated | replicated | failed_repro
    task_id: str | None
```

Immutable. To "edit" a deposit, call `contradict()`.
`deposit.to_dict()` produces a stable JSON shape:

```python
{
    "kind": "deposit",
    "id": "7e3a1b2c-…",
    "content": "Alice prefers oat milk",
    "polarity": "open",
    "evidence_grade": "anecdotal",
    "scope": {"model": None, "dataset": None, ...},
    "user_id": "alice",
    "tags": ["preference"],
    "contradicts": [],
    "artifact_refs": [],
    "created_at": "2026-05-11T14:23:00.123456+00:00",
    "author": None,
    "author_role": None,
    "repro_status": "unreplicated",
    "task_id": None,
}
```

---

## `SearchResult`

```python
@dataclass(frozen=True, slots=True)
class SearchResult:
    deposit: Deposit
    score: float
    is_confident: bool
    has_disagreement: bool
    agreement_score: float  # [0, 1] — public alias for bag_precision
    is_thin_evidence: bool
    conflict_peers: tuple[str, ...] = ()

    # Hot-path property aliases (no __getattr__ magic):
    @property
    def id(self) -> str: ...
    @property
    def content(self) -> str: ...
    @property
    def polarity(self) -> Polarity: ...
    @property
    def evidence_grade(self) -> EvidenceGrade: ...
    @property
    def scope(self) -> Scope: ...
    @property
    def tags(self) -> tuple[str, ...]: ...

    def explain(self) -> str: ...
```

> **Not a `Deposit` subclass.** `isinstance(hit, Deposit)` is `False` on purpose. Composition, not inheritance — so legacy code that branches on the type can't silently drop the dispersion fields. Use `hit.deposit` when you need the underlying record.

### `SearchResults` (the container)

```python
@dataclass(frozen=True, slots=True)
class SearchResults:
    items: tuple[SearchResult, ...]
    has_disagreement: bool  # bag-level: any returned bag flagged
    is_confident: bool      # bag-level: at least one confident bag, no disagreement anywhere

    def __iter__(self): ...
    def __len__(self): ...
    def __getitem__(self, idx): ...
    def __bool__(self): ...
    def explain(self) -> str: ...
```

Iterable, length-able, indexable, truthy if non-empty. **Not a list subclass.** `has_disagreement` is computed over every bag the engine returns, not just the sliced top-`limit`.
A conflict can't hide from your agent because of pagination.

`result.to_dict()` uses `"kind": "search_result"` and flattens the most common `Deposit` fields to the top level so consumers can pull `r["content"]` without nesting through `r["deposit"]["content"]`.

---

## `Diagnostics` (Field Maturity Index)

```python
@dataclass(frozen=True, slots=True)
class Diagnostics:
    fmi: int           # 0..100
    coverage: float    # 0..1
    precision: float   # 0..1
    resolution: float  # 0..1
    density: float     # 0..1
    window_days: int
    deposit_count: int

    def explain(self) -> str: ...
```

### The formula

```python
FMI = 100 * (Coverage × Precision × Resolution × Density) ** (1/4)
```

Geometric mean, not arithmetic — a single weak pillar drops the score sharply. Intentional.

### The pillars

- **Coverage** — rolling 30-day fraction of `recall()` calls that returned at least one confident hit. Drops when users ask questions the field hasn't answered.
- **Precision** — average `bag_precision` across all multi-sample bags. Drops when bags are split.
- **Resolution** — fraction of conflict bags that have been *resolved* (at least one deposit is the target of a `contradicts` edge).
- **Density** — `1 − singleton_bags / total_bags`. Drops when most bags have only one deposit.

### Vacuous cases

Empty field → coverage 0.0, density 0.0. No multi-sample bags → precision 1.0 (vacuous). No conflicts → resolution 1.0 (vacuous).

```python
diag = user.health(window_days=30)
print(diag.fmi)  # e.g. 72
print(diag.explain())
# "Memory is healthy (FMI 72/100). Searches are returning confident answers."
```

---

## `AddResult`

```python
@dataclass(frozen=True, slots=True)
class AddResult:
    deposit: Deposit
    is_idempotent_replay: bool

    @property
    def id(self) -> str: ...
```

Returned by `add()` and `contradict()`. `is_idempotent_replay=True` when an identical-content write with the same key fell within the 24-hour window.
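The geometric-mean behaviour of the FMI formula in the Diagnostics section above is easy to check in isolation. Illustrative arithmetic only, not the SDK's code, and the rounding to an integer is an assumption:

```python
def fmi(coverage: float, precision: float, resolution: float, density: float) -> int:
    # Geometric mean of the four pillars, scaled to 0..100 (rounding assumed).
    return round(100 * (coverage * precision * resolution * density) ** 0.25)

print(fmi(1.0, 1.0, 1.0, 1.0))  # 100 — all pillars perfect
print(fmi(0.9, 0.9, 0.9, 0.9))  # 90
print(fmi(0.9, 0.9, 0.9, 0.1))  # 52 — one weak pillar drags the whole score
```

For comparison, an arithmetic mean of the last example's pillars would give 70; the geometric mean's drop to 52 is the "single weak pillar" behaviour the docs call intentional.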
## `AddItem` (bulk import row)

```python
@dataclass(frozen=True, slots=True)
class AddItem:
    content: str
    idempotency_key: str  # REQUIRED — no default
    polarity: Polarity = "open"
    evidence_grade: EvidenceGrade = "anecdotal"
    scope: Scope | None = None
    tags: tuple[str, ...] = ()
    contradicts: tuple[str, ...] = ()
```

## `BulkReject`

```python
@dataclass(frozen=True, slots=True)
class BulkReject:
    item: AddItem
    error: Exception  # typically a subclass of AtlasoError
    index: int
```

## `AddManyResult`

```python
@dataclass(frozen=True, slots=True)
class AddManyResult:
    committed: tuple[Deposit, ...]
    duplicates: tuple[Deposit, ...]  # idempotent replays
    failed: tuple[BulkReject, ...]
    is_partial: bool = False

    @property
    def total(self) -> int: ...
    @property
    def all_succeeded(self) -> bool: ...
```

Skip-and-report. Successful items commit even if others fail.

## `RetractResult`

```python
@dataclass(frozen=True, slots=True)
class RetractResult:
    deposit_id: str
    mode: str  # "soft" or "hard"
    contradicts_preserved: tuple[str, ...] = ()
```

`contradicts_preserved` is the tuple of deposit IDs whose `contradicts` edges pointed at the retracted deposit and remain intact (non-empty only after a soft retract).

## `PeekView`

```python
@dataclass(frozen=True, slots=True)
class PeekView:
    user_id: str
    deposits: tuple[Deposit, ...]
    total_count: int  # excludes soft-retracted (tombstoned) rows
    fmi: int          # 0..100
    has_recent_disagreements: bool

    def __iter__(self): ...
```

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Errors

Atlaso's exception hierarchy is pedagogical — error messages list every valid option and point at the right method when applicable. The error *is* the documentation.
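The "error is the documentation" pattern is worth copying in your own wrappers. A minimal sketch of the idea in plain Python (hypothetical `PolarityError`, not the SDK's `InputValidationError`):

```python
VALID_POLARITIES = {
    "positive": "claim that X works / is true",
    "negative": "claim that X doesn't work / failed",
    "cautionary": "claim with caveats",
    "open": "note or open question",
}

class PolarityError(ValueError):
    """Rejects a bad value and teaches the valid ones in the message."""
    def __init__(self, got: str):
        lines = [f'polarity must be one of (got "{got}"):']
        lines += [f'  "{name}" — {desc}' for name, desc in VALID_POLARITIES.items()]
        super().__init__("\n".join(lines))

try:
    raise PolarityError("strong")
except PolarityError as e:
    assert '"cautionary" — claim with caveats' in str(e)
```

The payoff: the caller never has to leave the traceback to learn the valid inputs.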
## Hierarchy

```text
AtlasoError                          # catch-all
├── ConfigurationError               # bad construction / call inputs
│   ├── ConfigValidationError        # construction-time
│   └── InputValidationError         # call-time
│       └── MissingContradictsError  # .candidate_parents
├── AuthenticationError              # HTTP 401 (who you are)
├── TransportError                   # network / HTTP base
│   ├── ConnectError                 # DNS / TCP / TLS — RETRIED
│   ├── RequestTimeoutError          # RETRIED
│   └── APIStatusError               # .status_code, .response, .request_id
│       ├── PermissionDeniedError    # 401 / 403 — NOT retried
│       ├── NotFoundError            # 404 — NOT retried
│       ├── RateLimitError           # 429 — .retry_after — RETRIED
│       └── ServerError              # 5xx — RETRIED
├── FieldError                       # gate / invariant base
│   └── DepositRejectedError         # .gate_reason
└── IdempotencyKeyConflict           # .idempotency_key, .existing_id
```

> Atlaso's errors deliberately avoid Python builtin names like `ConnectionError` and `TimeoutError`. Enforced by a CI gate so an upgrade can never silently shadow a builtin in user code.

## Pedagogical messages

```python
from atlaso import InputValidationError

try:
    user.add("hello", polarity="strong")
except InputValidationError as e:
    print(e)
    # polarity must be one of:
    #   "positive"   — claim that X works / is true
    #   "negative"   — claim that X doesn't work / failed
    #   "cautionary" — claim with caveats
    #   "open"       — note or open question
```

## Retry policy

The HTTP client retries transient failures up to `max_retries` times with exponential backoff.
**Retried:**

- `ConnectError` — DNS, TCP, TLS failures
- `RequestTimeoutError`
- `RateLimitError` — respects the `Retry-After` header
- `ServerError` — 5xx

**Not retried (would be silently destructive):**

- `PermissionDeniedError` — 401 / 403
- `NotFoundError` — 404
- `AuthenticationError` — wrong API key
- `InputValidationError`, `DepositRejectedError`, `IdempotencyKeyConflict` — caller bugs

## Rich error fields

```python
from time import sleep

from atlaso import (
    APIStatusError,
    RateLimitError,
    DepositRejectedError,
    IdempotencyKeyConflict,
    InputValidationError,
)

try:
    user.add(...)
except DepositRejectedError as e:
    print(e.gate_reason)  # human-readable rule trace

try:
    user.add_many(items, ...)
except IdempotencyKeyConflict as e:
    print(e.idempotency_key, "already mapped to", e.existing_id)

try:
    user.contradict(text, contradicts=[], reason="…")
except InputValidationError as e:
    # Memory.contradict() validates non-empty contradicts at call-time.
    print(e)

try:
    ...
except RateLimitError as e:
    sleep(e.retry_after or 1.0)

try:
    ...
except APIStatusError as e:
    print(e.status_code, e.request_id, e.response)
```

`SyncInAsyncWarning` and `FrameworkExtraReservedWarning` subclass `Warning` directly (not `UserWarning`) so a shop that sets `warnings.filterwarnings('error', category=UserWarning)` won't hard-fail on them.

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Glossary

Every Field 3.0 term defined in one place.

- **Field 3.0** — atlaso's append-only, dispersion-aware knowledge model.
- **Deposit** — one immutable record of an observation.
- **Polarity** — direction of a claim: `positive` / `negative` / `cautionary` / `open`. The first three are directional and feed conflict detection.
- **Evidence grade** — `anecdotal` / `observed` / `replicated` / `verified`. Weighed against scope breadth at the gate.
- **Scope** — six optional facets (`model`, `dataset`, `env`, `version`, `n`, `seed`) plus a free-form `note`. The grouping key for conflict.
- **Scope bag** — the set of deposits sharing the same six scope facets.
- **Gate** — the write-time policy that decides whether a deposit may enter the store, based on `polarity × scope_breadth × evidence_grade`.
- **Scope breadth** — *narrow* when ≥2 scope facets are non-null; *broad* otherwise.
- **Provenance** — at least one `artifact_refs` entry OR a scoped `env` / `version`. Required for narrow-negative and cautionary deposits.
- **Conflict** — a scope bag containing ≥2 distinct directional polarities (any pair from `{positive, negative, cautionary}`).
- **is_confident** — bag-level verdict; true iff `bag_precision >= 0.99` AND not `is_single_sample` AND no `has_conflict`.
- **agreement_score** — public alias for `bag_precision`.
- **FMI (Field Maturity Index)** — geometric mean of Coverage × Precision × Resolution × Density, scaled 0–100.
- **Coverage / Precision / Resolution / Density** — the four FMI pillars.
- **Idempotency key** — per-item dedup key required by `add_many()`; the `idempotency_key(*parts)` helper returns a content-addressed `ak_…` hash with a 24-hour window.
- **Soft retract** — adds an `atlaso:retracted=` tombstone; preserves the audit chain.
- **Hard delete** — irreversible; reserved for GDPR / PII erasure.
- **UserHandle** — frozen per-user view returned by `Memory.for_user(...)`.

---

**Source:** **Edit on GitHub:** **Updated:** 2026-05-12

---

# Frequently asked

**Does atlaso send any data over the network?**

No. v0.1 is local-only — no outbound HTTP calls exist under `src/atlaso`. The only runtime dependency is `httpx`. `ATLASO_TELEMETRY` is a name-reserved no-op; setting it truthy emits one `UserWarning` to confirm the no-op. v0.2 will introduce an *opt-in* remote backend.

**Why is v0.1 retrieval lexical (BM25) and not semantic?**

Fast, deterministic, no embedding service required, reproducible across machines. The dispersion layer doesn't require semantic recall — it requires *structured* recall grouped by scope.
The roadmap adds an optional embedding-backed path behind the same `recall()` signature; the BM25 path stays.

**Can I share one `Memory()` instance across users?**
Yes. Construct one per process. On each request, call `memory.for_user(authenticated_id)` for a `UserHandle` scoped to that user. The handle pre-fills `user_id` on every method call. Per-user data lives at `/users//field.db` — one SQLite file per tenant, with isolation enforced at the filesystem level.

**Why does `Memory(transport=httpx.Client(...))` raise?**
`httpx.Client` holds Authorization headers; sharing one across tenants risks silent leaks. Atlaso accepts only `httpx.BaseTransport` (the connection layer, no headers). For tests, use `httpx.MockTransport`. The rejection happens at construction with `ConfigValidationError`.

**Where does atlaso store data?**
Storage resolves lazily on the first write, in this order: (1) the `path=` kwarg to `Memory()`, (2) the `ATLASO_PATH` env var, (3) the nearest ancestor that already contains a `.atlaso/` directory, (4) the nearest ancestor with a project marker (`pyproject.toml`, `package.json`, `.git`, `Cargo.toml`, `go.mod`) — creates `.atlaso/` there, (5) cwd as a last resort. The walk stops at `$HOME` or the filesystem root. **There is no `~/.atlaso/` fallback for field data** — a global home-dir store would silently merge memories across projects.

**How is `is_confident` computed?**
For a scope bag: `bag_precision >= 0.99` AND not `is_single_sample` AND no `has_conflict`. All three must hold. A single observation, even with a high evidence grade, is never confident.

**Can I update a deposit?**
No — deposits are immutable. `m.update(...)` raises `AttributeError` with a pointer at `contradict()`. From a `Memory` instance, call `m.contradict(new_text, contradicts=[old_id], reason=..., user_id=...)`; from a bound `UserHandle`, call `user.contradict(new_text, contradicts=[old_id], reason=...)` (the handle pre-fills `user_id`).
Atlaso writes the new deposit and records contradiction edges from old → new, so the audit chain is preserved.

**Difference between `retract()` and `contradict()`?**
`contradict()` writes a **new** deposit and records a `contradicts` edge from the new one to each superseded id — both remain queryable for audit; the recall pipeline doesn't hard-filter contradicted targets but surfaces the edge so your agent can branch on it. `retract()` removes a deposit from `recall` (soft tombstone by default; `hard_delete=True` is irreversible) without writing a replacement.

**Why was my `add()` rejected with `DepositRejectedError`?**
The gate refused the write. `e.gate_reason` tells you which rule fired. The usual fix is a higher evidence grade, a narrower scope, or provenance (`artifact_refs`, `scope.env`, `scope.version`).

**Does atlaso work with Claude Code / Cursor / Codex / Windsurf / Cline?**
Yes, via MCP. `pip install "atlaso[mcp]"`, run `atlaso mcp`, and wire it into your client's MCP config. The server exposes nine tools (the full `Memory()` surface). Claude Code also has `atlaso install-hooks`, which registers a synchronous `UserPromptSubmit` recall hook and an async `Stop` deposit hook in `settings.json`.

**Should I use `Memory` or `AsyncMemory` in a web app?**
Use `AsyncMemory` inside any async runtime (FastAPI, Starlette, asyncio loops). v0.1's async backend wraps the sync SQLite engine in `asyncio.to_thread`, so calls don't block the event loop. Calling the synchronous `Memory` from an async handler emits `SyncInAsyncWarning` once — silence it with `Memory(suppress_async_warning=True)` or `ATLASO_ASYNC_WARNINGS=0` if intentional. `for_user()` stays sync on `AsyncMemory`; only the data methods are awaitable.

**What should I pass to `for_user()` — and what does `validate_user_id` do?**
An *authenticated* identity: whatever your auth middleware returned after verifying the user. The shape constraint is `^[A-Za-z0-9_\-:.@]{1,128}$`; spaces and slashes are rejected.
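The shape constraint is easy to check standalone. A minimal sketch using the documented regex — the helper name is illustrative; the SDK's real validator lives in `_validation.py`:

```python
import re

# The documented user_id shape: letters, digits, _ - : . @, 1–128 chars.
_USER_ID_RE = re.compile(r"^[A-Za-z0-9_\-:.@]{1,128}$")

def user_id_shape_ok(user_id: str) -> bool:
    """True iff user_id matches the documented shape constraint."""
    return _USER_ID_RE.fullmatch(user_id) is not None

# Accepted: "alice", "team:ops@example.com"
# Rejected: "bob smith" (space), "a/b" (slash), "" (empty)
```

Note this only checks *shape* — it says nothing about whether the id was actually authenticated, which is the property that matters.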
**Never pass a value from a request body.** For a single audit point, pass `validate_user_id=` to the `Memory()` constructor: it runs inside every `for_user(...)` and every lower-level call. Raise from it to block.

**How does `add_many()` handle partial failures and duplicate keys?**
Skip-and-report. Each `AddItem` requires an `idempotency_key`. Items that pass the gate land in `result.committed`; items where the key + identical content were seen in the last 24 hours land in `result.duplicates` (no second write); items the gate rejects land in `result.failed` as `BulkReject(item, error, index)` so you can retry the subset after fixing metadata. Same key + *different* content raises `IdempotencyKeyConflict` with `.existing_id`.

**Will the API change before v1.0?**
The public API is design-locked. New methods may be added. Existing signatures, return types, and exception classes follow semver inside the alpha track. Read the release notes before each upgrade.

---

**Updated:** 2026-05-12

---

# Architecture

How atlaso's moving parts fit together.

## One-line mental model

Append-only SQLite ledger of *typed* claims (polarity + evidence grade + scope). A **gate** rejects under-evidenced writes. A **dispersion-aware pipeline** groups hits by scope at read time and annotates each result with whether its neighbors agree, disagree, or conflict.

## Module map

```mermaid
graph TD
  App[App / Caller]
  MCP[atlaso.mcp.server
FastMCP]
  Admin[atlaso.admin<br/>cross-tenant ops]
  Mem[Memory / AsyncMemory<br/>+ UserHandle]
  Val[_validation<br/>user_id regex]
  Path[_path_resolver<br/>.atlaso/ discovery]
  Local[LocalBackend<br/>per-user FieldStore cache]
  Gate[_engine.gate<br/>write-time policy]
  Aware[_engine.aware<br/>bag stats + conflict]
  Retr[_engine.retrieval<br/>BM25 + decay]
  Maturity[_engine.maturity<br/>FMI geometric mean]
  Store[_engine.store<br/>FieldStore]
  DB[(SQLite + FTS5
field.db)]
  App --> Mem
  MCP --> Mem
  Admin --> Local
  Mem --> Val
  Mem --> Local
  Local --> Path
  Local --> Gate
  Local --> Aware
  Local --> Maturity
  Aware --> Retr
  Aware --> Store
  Maturity --> Aware
  Maturity --> Store
  Retr --> Store
  Store --> DB
```

## What happens on `add()`

1. Validate `user_id` against the strict regex (`_validation.py`).
2. Lazily resolve the storage path (constructor / `ATLASO_PATH` / ancestor walk / project marker / cwd).
3. Open or fetch the cached per-user `FieldStore` at `/users//field.db`.
4. Run the gate. If rejected, raise `DepositRejectedError(gate_reason=...)`.
5. Insert into `deposits` + mirror into the contentless `deposits_fts` FTS5 virtual table.
6. Write one `contradictions` edge per id in `contradicts=[...]`.
7. Re-read the row, translate it to a public `Deposit`, return `AddResult`.

```mermaid
sequenceDiagram
  autonumber
  participant App as Caller
  participant API as Memory.add
  participant V as _validation
  participant BE as LocalBackend
  participant G as _engine.gate
  participant S as FieldStore
  App->>API: add(text, user_id, polarity, evidence_grade, scope)
  API->>V: check_user_id_shape(user_id)
  API->>BE: add(...)
  BE->>BE: resolve path, open per-user FieldStore
  BE->>G: evaluate_deposit_request(...)
  G-->>BE: GateDecision(accept | reject + reason)
  alt accepted
    BE->>S: INSERT deposits + deposits_fts + contradictions
    S-->>BE: deposit_id
    BE-->>API: AddResult(deposit)
  else rejected
    BE-->>API: raise DepositRejectedError(gate_reason)
  end
```

## What happens on `recall()`

1. Sanitise the FTS5 query (strip ``" ( ) : * ? , ; . @ ! # $ % ^ & + = [ ] { } | \ < > ~ ` `` and OR-join the remaining tokens).
2. `_engine.aware.query_aware` fans out per polarity with freshness decay.
3. Group every candidate into `BagStats` keyed by the six scope facets.
4. Compute `bag_precision` / `is_single_sample` / `has_conflict` / `conflict_peers` per bag.
5. Write one summary row to `query_log` — powers FMI Coverage.
6.
Aggregate `has_disagreement` / `is_confident` across *all* bags *before* slicing.
7. Sort conflict bags first, flatten into per-deposit `SearchResult` rows.

```mermaid
sequenceDiagram
  autonumber
  participant App as Caller
  participant API as Memory.recall
  participant BE as LocalBackend
  participant A as _engine.aware
  participant S as FieldStore
  App->>API: recall(query, user_id, limit, scope)
  API->>BE: recall(...)
  BE->>BE: sanitise FTS5 query (strip syntax chars, OR-join)
  BE->>A: query_aware(store, query, scope_filter, limit)
  A->>S: list_recent(1000) and BM25 per polarity
  S-->>A: candidate deposits
  A->>A: group by scope_bag, compute BagStats
  A->>S: log_query(bag_key, is_confident, has_conflict, ...)
  A-->>BE: { results: [bags], field_health }
  BE->>BE: aggregate flags across ALL bags, sort conflicts first, flatten
  BE-->>API: SearchResults(items, has_disagreement, is_confident)
```

---

# The gate

A write-time policy that decides whether a deposit may enter the store, based on `polarity × scope_breadth × evidence_grade`.

> The bar is `expected_harm × scope_breadth`, not polarity. A broad positive claim ("X always works") needs replicated evidence; a narrow negative claim needs provenance so the field doesn't fill with "didn't work on my machine" sludge. Open questions are always cheap to record.
> — `_engine/gate.py`

The gate fires **at write time** and only decides whether the deposit is allowed. Conflict flagging happens at read time in `_engine/aware.py`.

## Scope breadth

```python
def _scope_breadth(scope: Scope) -> Literal["narrow", "broad"]:
    facets = sum(1 for f in (
        scope.model, scope.dataset, scope.env,
        scope.version, scope.n, scope.seed,
    ) if f is not None)
    return "narrow" if facets >= 2 else "broad"
```

## Rules

- `open` → always accept.
- `positive` + `broad` → requires `replicated` or `verified`.
- `positive` + `narrow` → requires `observed` or stronger.
- `negative` + `broad` → requires `observed` or stronger.
- `negative` + `narrow` → requires provenance: at least one `artifact_refs` entry OR a scoped `env` or `version`.
- `cautionary` → requires provenance.

## Author-role modifier

When `author_role` normalises to `redteam` (strip non-alphanumeric + lowercase, so `redteam`, `red_team`, `Red-Team`, `RedTeam` all match), `negative` and `cautionary` claims get a +1 evidence-grade bump. Positive and open deposits are unaffected.

## When rejected

```python
from atlaso import DepositRejectedError

try:
    user.add(
        "threshold 0.7 is always optimal",
        polarity="positive",
        evidence_grade="anecdotal",
    )
except DepositRejectedError as e:
    print(e.gate_reason)
    # "positive/broad claim requires evidence_grade>=replicated;
    #  upgrade evidence or narrow scope (e.g., add model= or dataset=)."
```

---

# Retrieval pipeline

```
query → filter → rank → aggregate
```

## BM25 + asymmetric freshness decay

```python
# _engine/retrieval.py
_HALF_LIVES = {
    "positive": 14.0,    # days
    "open": 14.0,
    "negative": 90.0,    # ~three months
    "cautionary": 90.0,
}
# rank score = bm25 * exp(-age_days * ln2 / half_life)
```

Asymmetric on purpose. A known failure is a constraint that should stay fresh for months. A positive observation needs to be re-evidenced more often before the dispersion pipeline takes it seriously.

## k distribution

```python
# _engine/aware.py — for recall(limit=N)
k_positive   = max(1, int(limit * 0.30))
k_negative   = max(1, int(limit * 0.30))
k_cautionary = max(1, int(limit * 0.20))
k_open       = max(1, int(limit * 0.20))
candidates = max(50, k * 10)  # candidate over-fetch
```

## Scope-bag aggregation

Engine field names (left) → public field names (right) on `SearchResult`:

- `bag_size` — total deposits in the bag.
- `dominant_polarity` — directional polarity with the most deposits.
- `bag_precision` → public `agreement_score` — fraction matching the dominant polarity.
- `is_single_sample` → public `is_thin_evidence`.
- `has_conflict` → public `has_disagreement` — the bag contains ≥2 distinct directional polarities (any pair drawn from `{positive, negative, cautionary}`).
- `conflict_peers` — ids of the opposing-polarity hits that flipped the bag.

## The `is_confident` rule

```python
is_confident = (
    bag_precision >= 0.99
    and not is_single_sample
    and not has_conflict
)
```

## Query log

Each `recall()` writes one row to `query_log` summarising the top hit's flags + bag size + result count. Powers **Coverage** in the Field Maturity Index.

> v0.1 retrieval is lexical BM25 — fast, deterministic, no embedding service required. v0.2 will add an optional embedding-backed recall behind the same `recall()` signature.

---

# Storage layer

One SQLite file per user, FTS5-indexed. WAL mode. No `~/.atlaso/` fallback for field data — atlaso writes alongside your project.

## On-disk layout

```
/.atlaso/
├── users/
│   └── /
│       └── field.db     # per-user store
├── _unscoped/
│   └── field.db         # atlaso.admin writes
└── _idempotency.db      # 24h Stripe-style dedup keys
```

## Schema

```sql
CREATE TABLE deposits (
    id             TEXT PRIMARY KEY,
    content        TEXT NOT NULL,
    polarity       TEXT NOT NULL CHECK (polarity IN ('positive','negative','cautionary','open')),
    evidence_grade TEXT NOT NULL CHECK (evidence_grade IN ('anecdotal','observed','replicated','verified')),
    author         TEXT NOT NULL,
    task_id        TEXT,
    repro_status   TEXT NOT NULL CHECK (repro_status IN ('unreplicated','replicated','failed_repro')),
    created_at     TEXT NOT NULL,
    scope_note     TEXT NOT NULL,
    scope_model    TEXT,
    scope_dataset  TEXT,
    scope_env      TEXT,
    scope_version  TEXT,
    scope_n        INTEGER,
    scope_seed     INTEGER,
    tags_json      TEXT NOT NULL,
    artifact_refs_json TEXT NOT NULL,
    author_role    TEXT
);

CREATE INDEX deposits_created_at ON deposits(created_at);
CREATE INDEX deposits_polarity   ON deposits(polarity);

CREATE VIRTUAL TABLE deposits_fts USING fts5(
    body,
    content=''  -- contentless: stores tokenization, not source text
);

-- No ON DELETE CASCADE — hard delete in _local.py removes contradiction
-- edges manually in FK-safe order.
CREATE TABLE contradictions (
    from_deposit_id TEXT NOT NULL,
    to_deposit_id   TEXT NOT NULL,
    reason          TEXT NOT NULL,
    created_at      TEXT NOT NULL,
    PRIMARY KEY (from_deposit_id, to_deposit_id),
    FOREIGN KEY (from_deposit_id) REFERENCES deposits(id),
    FOREIGN KEY (to_deposit_id)   REFERENCES deposits(id)
);

CREATE TABLE query_log (
    id               TEXT PRIMARY KEY,  -- UUID string
    created_at       TEXT NOT NULL,
    bag_key          TEXT,
    is_confident     INTEGER,
    has_conflict     INTEGER,
    is_single_sample INTEGER,
    bag_size         INTEGER,
    result_count     INTEGER
);
```

## FTS5 is contentless

`deposits_fts` uses `content=''` — it stores tokenization, not source text. The indexed `body` column is `content + tags + scope.note` merged at insert time, so one FTS query matches across all three fields.

## PRAGMAs

- `foreign_keys=ON`
- `journal_mode=WAL`
- `busy_timeout=5000`

Connections are serialised by a `threading.RLock` at the SDK boundary — multi-threaded applications are safe without user-side coordination.

## Soft retract vs hard delete

Soft retract adds an `atlaso:retracted=` tombstone tag. The deposit becomes invisible to `recall()`, but the row survives. Hard delete removes the row and triggers an FK-safe contentless delete on the FTS index. The default is soft. Use `hard_delete=True` only for GDPR / PII.

## No `~/.atlaso/` for field data

Atlaso intentionally never falls back to `~/.atlaso/` for the field database — a global home-dir store would silently merge memories across unrelated projects. (`atlaso install-hooks` does write hook shell scripts to `~/.atlaso/hooks/` — that's tooling, not field data.)

## The vendored engine

The `_engine/` directory is **byte-identical-vendored** from the upstream monorepo at wheel build time (a Hatch hook in `tools/vendor_engine.py`). A CI gate (`tools/verify_engine_parity.py`) blocks releases where the SDK's copy has drifted.

---

**Updated:** 2026-05-12

---

# CLI reference

`atlaso` is a CLI as much as a library.
Every subcommand mirrors a `Memory()` method and an MCP tool, so you can debug the same field three ways. Every data subcommand accepts `--json` for machine-readable output.

## Health & sanity

### `atlaso version`

```bash
atlaso version
```

Prints the installed SDK version.

### `atlaso check`

```bash
atlaso check
```

One-line import check — confirms `atlaso` and (if installed) `atlaso[mcp]` resolve.

### `atlaso doctor`

```bash
atlaso doctor
```

End-to-end install diagnostic. Checks import, vendored engine, resolved storage path, and runs an `add` → `recall` → `retract` round-trip in a tempdir. Returns `0` on success.

## Data operations

`--user` is the authenticated identity — never a value from a request body.

### `atlaso add`

```bash
atlaso add "Alice prefers oat milk" --user alice --tag preference

atlaso add "0.7 over-flags in prod" --user alice \
    --polarity negative --evidence observed \
    --scope-note "model=gpt-5; env=prod"
```

- `--polarity` `positive | negative | cautionary | open` — default `open`.
- `--evidence` `anecdotal | observed | replicated | verified` — default `anecdotal`.
- `--scope-note STR`, `--tag T` (repeatable).

### `atlaso recall`

```bash
atlaso recall "threshold" --user alice --limit 10 --explain
```

Prints a verdict line, then one row per hit, prefixed with `⚠` for disagreement, `✓` for confident, `·` otherwise.

### `atlaso get`

```bash
atlaso get 7e3a1b2c --user alice
```

Fetch one deposit by id. Exits `1` if not found.

### `atlaso list-recent`

```bash
atlaso list-recent --user alice --limit 50 --offset 0
```

### `atlaso peek`

```bash
atlaso peek alice --limit 10
```

`user_id` is **positional** on `peek` and `health`.

### `atlaso health`

```bash
atlaso health alice --window 30
```

### `atlaso contradict`

```bash
atlaso contradict "Alice now prefers soy milk" 7e3a1b2c 9c0f1d2a \
    --user alice --reason "Apr-23 conversation update"
```

Multiple deposit IDs are accepted — all are marked superseded by the new deposit atomically. `--reason` is **required**.
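The `⚠` / `✓` / `·` marker scheme that `atlaso recall` prints can be isolated as a tiny helper — an illustrative reimplementation of the documented convention, not the CLI's actual source:

```python
def verdict_prefix(has_disagreement: bool, is_confident: bool) -> str:
    """Mirror the documented markers: ⚠ disagreement, ✓ confident, · otherwise."""
    if has_disagreement:   # disagreement is checked first, so it wins
        return "⚠"
    if is_confident:
        return "✓"
    return "·"

rows = [
    ("0.7 over-flags in prod", True, False),
    ("Alice prefers oat milk", False, True),
    ("threshold 0.5 untested", False, False),
]
for content, disagree, confident in rows:
    print(verdict_prefix(disagree, confident), content)
```

The ordering matters: a bag can't be both conflicted and confident by the `is_confident` rule, but checking disagreement first keeps the helper safe against inconsistent inputs.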
### `atlaso retract`

```bash
# soft (default)
atlaso retract 7e3a1b2c --user alice --reason "user request"

# hard — irreversible, removes the row and clears the FTS index
atlaso retract 7e3a1b2c --user alice --reason "GDPR erasure" --hard
```

`--reason` is required. `--hard` is irreversible — use it only for legal erasure.

## Operational

### `atlaso mcp`

```bash
atlaso mcp
```

Starts the FastMCP stdio server. Requires `pip install "atlaso[mcp]"`. See [MCP & Hooks](./mcp-and-hooks.md).

### `atlaso install-hooks`

```bash
atlaso install-hooks --scope user
atlaso install-hooks --scope project --project-dir .
```

Writes `~/.atlaso/hooks/atlaso_recall_hook.sh` and `atlaso_deposit_hook.sh`, then merges `UserPromptSubmit` + `Stop` entries into Claude Code's `settings.json` (user or project scope). Idempotent on rerun.

### `atlaso uninstall-hooks`

```bash
atlaso uninstall-hooks --scope user
```

Removes only the atlaso hook entries from `settings.json`. Leaves the scripts at `~/.atlaso/hooks/` — delete them manually if you want a clean removal.

### `atlaso demo`

```bash
atlaso demo
```

Runs the narrated end-to-end demo. Requires a source checkout (it looks for `demo.py`).

---

**Updated:** 2026-05-12

---

# MCP & Hooks

Run atlaso as an MCP server (or as auto-fire hooks) inside Claude Code, Cursor, Codex, Windsurf, Cline — any MCP-compatible client.

## Why MCP

MCP gives an LLM client a uniform way to call external tools. Running `atlaso mcp` exposes the full `Memory` surface — same Field 3.0 vocabulary, same conflict-aware verdicts, zero per-framework adapter code.

## Install

```bash
pip install "atlaso[mcp]"
```

Pulls in `fastmcp>=0.2,<0.4`. Without the extra, `atlaso mcp` prints an install instruction and exits `1`.

## Run

```bash
atlaso mcp
```

Runs on **stdio** — the transport every MCP client speaks. No HTTP server to manage.
## Wire up Claude Code

```json
// ~/.config/claude/mcp_servers.json
{
  "mcpServers": {
    "atlaso": {
      "command": "atlaso",
      "args": ["mcp"]
    }
  }
}
```

## Tools exposed

Tool names match the Python method names exactly.

| Tool | Args |
|---|---|
| `add` | `content, user_id, polarity?, evidence_grade?, scope_note?, tags?` |
| `recall` | `query, user_id, limit?` |
| `get` | `deposit_id, user_id` |
| `list_recent` | `user_id, limit?, offset?` |
| `contradict` | `new_text, contradicts: list[str], reason, user_id` |
| `retract` | `deposit_id, reason, user_id, hard_delete?` |
| `health` | `user_id, window_days?` |
| `peek` | `user_id, limit?` |
| `add_many` | `user_id, items: list[dict], on_gate_reject?` |

`add_many` items[] schema:

```text
content          (required, str)
idempotency_key  (required, str)
polarity         (optional, default "open")
evidence_grade   (optional, default "anecdotal")
scope_note       (optional, str)
tags             (optional, list[str])
contradicts      (optional, list[str])
```

## ATLASO_API_KEY at startup

The server starts even if `ATLASO_API_KEY` is unset — a warning prints to stderr. In v0.1 the local backend doesn't require it; tool calls succeed against the local field. The env var is sniffed today so v0.2 (remote backend) can enforce auth without breaking your config.

---

## Claude Code hooks

Hooks make memory **automatic**. Where MCP gives the agent tools it has to remember to call, hooks fire on every prompt and turn-end regardless.

```bash
atlaso install-hooks --scope user
```

Writes:

- `~/.atlaso/hooks/atlaso_recall_hook.sh`
- `~/.atlaso/hooks/atlaso_deposit_hook.sh`

And registers them in `~/.claude/settings.json`:

- `UserPromptSubmit` — synchronous `atlaso recall `, prepends results to context via `hookSpecificOutput.additionalContext`.
- `Stop` — async fire-and-forget `atlaso add ` with `--tag claude-code --tag auto-deposit`.

Both hooks resolve `user_id` from `$ATLASO_USER`, falling back to `$USER`.
They fail silently on timeout or a missing CLI — the agent's turn is never broken by a hook.

### Remove

```bash
atlaso uninstall-hooks --scope user
```

Removes only the atlaso entries from `settings.json`. The shell scripts remain at `~/.atlaso/hooks/` — delete them manually if you want a clean removal.

---

**Updated:** 2026-05-12

---

# Configuration

Every knob is either a constructor kwarg or an env var. **There is no config file.**

## Environment variables

### `ATLASO_PATH`

Forces the base storage directory. Bypasses the project-marker walk. The path is expanded with `~` and resolved to absolute.

```bash
export ATLASO_PATH=/var/atlaso-data
python my_app.py
```

### `ATLASO_API_KEY`

Reserved for the v0.2 remote backend. The MCP server warns to stderr if it's unset; the v0.1 local backend doesn't require it.

### `ATLASO_BASE_URL`

Reserved for v0.2. Defaults to `https://api.atlaso.dev`.

### `ATLASO_ASYNC_WARNINGS`

Set to `0` / `false` / `no` / `off` to suppress `SyncInAsyncWarning` process-wide. Unknown values default to ON (no silent disable on a typo).

### `USER` (read-only fallback for hooks)

The Claude Code hook scripts at `~/.atlaso/hooks/` resolve `user_id` from `$ATLASO_USER`, with a fallback to the standard shell `$USER`. The SDK itself doesn't read these — they only affect hook behaviour.

### `ATLASO_TELEMETRY`

**Reserved no-op.** Has no effect in v0.1. If set truthy, atlaso emits one `UserWarning` to make the no-op explicit. The variable is name-reserved so future versions can introduce opt-in telemetry without colliding with anything you set today. **Atlaso ships zero telemetry, ever.**

## Storage path resolution

With no `path=` and no `ATLASO_PATH`, atlaso walks from cwd outward:

1. Explicit `path=` kwarg.
2. `ATLASO_PATH` env var.
3. Walk cwd → ancestors: the first existing `.atlaso/` directory wins.
4. Walk cwd → ancestors: the first directory with `pyproject.toml`, `package.json`, `.git`, `Cargo.toml`, or `go.mod` — creates `/.atlaso/`.
5.
cwd as a last resort — creates `/.atlaso/`.

The walk stops at `$HOME` or the filesystem root, depth-capped at 64. **There is no `~/.atlaso/` fallback** for field data.

```bash
atlaso doctor  # prints the resolved path + its source
```

## Transport injection

For tests, inject an `httpx.MockTransport`. Atlaso refuses an `httpx.Client` here on purpose — the client holds Authorization headers, and sharing one across tenants would leak them.

```python
import httpx
from atlaso import Memory

def stub(request: httpx.Request) -> httpx.Response:
    return httpx.Response(200, json={"ok": True})

m = Memory(transport=httpx.MockTransport(stub))
```

> `Memory(transport=httpx.Client(...))` raises `ConfigValidationError` immediately.

## Constructor knobs

```python
Memory(
    api_key=None,                 # v0.2 — env: ATLASO_API_KEY
    base_url=None,                # v0.2 — env: ATLASO_BASE_URL
    path=None,                    # storage — env: ATLASO_PATH
    validate_user_id=None,        # callable; raise to block
    timeout=10.0,
    max_retries=2,
    suppress_async_warning=None,  # env: ATLASO_ASYNC_WARNINGS
    transport=None,
)
```

---

**Updated:** 2026-05-12

---

# Admin API

`atlaso.admin` is the cross-tenant escape hatch. Every function takes a literal `confirm` string, so importing the module is a code-review event.

## Why a separate module

Per-user isolation is the default. Cross-tenant queries — search all users for a string, list all user IDs, write to the unscoped shard — are sometimes necessary (audits, migrations, debugging) but should never be reachable from tab-completion on `Memory`. Atlaso puts them in their own module and requires a literal confirm string so the function is *greppable* by auditors.
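The literal-confirm wall is a plain Python pattern and can be sketched standalone — an illustrative reimplementation of the idea, not atlaso's source (`CONFIRM` and this `search_across_users` are stand-ins):

```python
from typing import Literal

CONFIRM = "I_UNDERSTAND_THIS_CROSSES_TENANTS"

def search_across_users(
    query: str,
    *,
    confirm: Literal["I_UNDERSTAND_THIS_CROSSES_TENANTS"],
    limit: int = 100,
) -> list[str]:
    # Static checkers reject any other literal; the runtime check is the
    # backstop for untyped callers.
    if confirm != CONFIRM:
        raise ValueError("cross-tenant call requires the literal confirm string")
    return []  # a real implementation would fan out across tenant stores

# Auditors can grep the codebase for the confirm string to find call sites.
search_across_users("leaked api key", confirm=CONFIRM)
```

The design choice: the confirm string is an API *parameter*, not a config flag, so every call site carries the evidence of intent in the diff.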
## Shape

```python
from atlaso import Memory
from atlaso.admin import (
    search_across_users,
    search_across_users_async,
    list_all_user_ids,
    search_unscoped,
    add_unscoped,
)

m = Memory()
hits = search_across_users(
    m,
    "leaked api key",
    confirm="I_UNDERSTAND_THIS_CROSSES_TENANTS",
    limit=100,
)
```

The `confirm` parameter is a `Literal["I_UNDERSTAND_THIS_CROSSES_TENANTS"]` — static checkers reject any other value. Auditors can grep for the string to find every call site that crossed tenants.

## Functions

- `search_across_users(memory, query, *, confirm, limit=100) -> list[Deposit]`
- `search_across_users_async(memory, query, *, confirm, limit=100)`
- `list_all_user_ids(memory, *, confirm) -> list[str]`
- `search_unscoped(memory, query, *, confirm, limit=100) -> list[Deposit]`
- `add_unscoped(memory, text, *, confirm) -> str`

## v0.1 status

The admin transport is not wired in v0.1. Every function raises `NotImplementedError("v0.1.0a0: admin transport pending")`. The contract above is design-locked — when v0.2 ships the transport, your call sites don't change.

## Why "import is a code-review event"

Tab-completing `m.` never offers an admin verb. The word `admin` appears in the import statement, and the confirm string appears in every call. Any reviewer reading a diff can't miss it. Greppable by design.

---

**Updated:** 2026-05-12

---

# Atlaso + LangChain

`pip install atlaso[langchain]` is reserved namespace today — the extra installs only `atlaso` core and emits a `FrameworkExtraReservedWarning` once per process when LangChain is detected alongside it.
Until the dedicated adapter ships, wrap Atlaso into LangChain manually:

```python
from atlaso import Memory
from langchain.memory import BaseMemory

class AtlasoMemory(BaseMemory):
    memory_key: str = "history"

    def __init__(self, user_id: str, **kwargs):
        super().__init__(**kwargs)
        self._mem = Memory().for_user(user_id)

    @property
    def memory_variables(self) -> list[str]:
        return [self.memory_key]

    def load_memory_variables(self, inputs: dict) -> dict:
        hits = self._mem.recall(inputs.get("input", ""), limit=5)
        return {self.memory_key: "\n".join(h.content for h in hits if h.is_confident)}

    def save_context(self, inputs: dict, outputs: dict) -> None:
        if outputs.get("output"):
            self._mem.add(outputs["output"])

    def clear(self) -> None:
        # Atlaso has no `clear` — deposits are immutable evidence.
        # Use `m.retract(deposit_id, reason=...)` per-row instead.
        raise NotImplementedError(
            "Atlaso deposits are immutable; clear is not supported. "
            "Retract individual deposits via m.retract(...)."
        )
```

**What you lose by going through `BaseMemory`:** the dispersion fields (`is_confident`, `has_disagreement`, `agreement_score`, `conflict_peers`) don't fit `dict[str, str]`. The recipe filters on `is_confident` to avoid surfacing conflicted memories as authoritative. If your application can branch on disagreement (e.g. ask the user for confirmation), call `m.recall()` directly instead.

---

**Updated:** 2026-05-12

---

# Atlaso + LlamaIndex

`pip install atlaso[llamaindex]` is reserved namespace today; the full adapter lands based on revealed-intent signals over 60 days post-release. Today, wrap manually:

```python
from atlaso import Memory
from llama_index.core.memory.types import BaseMemory
from llama_index.core.llms import ChatMessage, MessageRole

class AtlasoLlamaMemory(BaseMemory):
    """LlamaIndex BaseMemory backed by Atlaso.
    Carries dispersion fields in metadata."""

    def __init__(self, user_id: str):
        self._mem = Memory().for_user(user_id)

    def get(self, input: str | None = None, **kwargs) -> list[ChatMessage]:
        if not input:
            recent = self._mem.list_recent(limit=10)
            return [ChatMessage(role=MessageRole.USER, content=d.content) for d in recent]
        results = self._mem.recall(input, limit=10)
        msgs = []
        for r in results:
            # LlamaIndex's BaseMemory carries metadata across the boundary —
            # use it to surface the dispersion fields downstream.
            msgs.append(
                ChatMessage(
                    role=MessageRole.USER,
                    content=r.content,
                    additional_kwargs={
                        "is_confident": r.is_confident,
                        "has_disagreement": r.has_disagreement,
                        "agreement_score": r.agreement_score,
                    },
                )
            )
        return msgs

    def put(self, message: ChatMessage) -> None:
        if message.content:
            self._mem.add(message.content)

    def reset(self) -> None:
        # See langchain.md note — Atlaso deposits are immutable.
        raise NotImplementedError("Use m.retract(deposit_id, reason=...) per row.")
```

**Why LlamaIndex over LangChain for Atlaso:** LlamaIndex's `BaseMemory.get()` returns `ChatMessage` objects with `additional_kwargs`, which lets the dispersion fields ride along. LangChain's `BaseMemory` returns `dict[str, str]` and drops them.

---

**Updated:** 2026-05-12

---

# Atlaso + DSPy

`pip install atlaso[dspy]` is reserved namespace. DSPy modules typically take a retriever as a kwarg — wrap Atlaso as a custom retriever:

```python
import dspy
from atlaso import Memory

memory = Memory()

class AtlasoRetrieve(dspy.Retrieve):
    """DSPy retriever backed by Atlaso.
    Carries dispersion fields in each Example so DSPy programs
    can branch on confidence."""

    def __init__(self, user_id: str, k: int = 5):
        super().__init__(k=k)
        self._user = memory.for_user(user_id)

    def forward(self, query_or_queries: str | list[str], k: int | None = None) -> dspy.Prediction:
        queries = (
            [query_or_queries]
            if isinstance(query_or_queries, str)
            else query_or_queries
        )
        passages = []
        for q in queries:
            results = self._user.recall(q, limit=k or self.k)
            for r in results:
                passages.append(
                    dspy.Example(
                        long_text=r.content,
                        is_confident=r.is_confident,
                        has_disagreement=r.has_disagreement,
                        agreement_score=r.agreement_score,
                    )
                )
        return dspy.Prediction(passages=passages)

# Use in a DSPy program:
class AssistantWithMemory(dspy.Module):
    def __init__(self, user_id: str):
        super().__init__()
        self.retrieve = AtlasoRetrieve(user_id=user_id, k=5)
        self.respond = dspy.Predict("question, context -> answer")

    def forward(self, question: str) -> dspy.Prediction:
        passages = self.retrieve(question).passages
        # Trust hierarchy: confident > unconfirmed > disagreement.
        # DSPy programs can use `is_confident` as a feature for the predictor
        # or filter conflicted passages out of context entirely.
        trusted = [p for p in passages if p.is_confident]
        ctx = "\n".join(p.long_text for p in (trusted or passages))
        return self.respond(question=question, context=ctx)
```

Atlaso's dispersion fields fit DSPy's example metadata cleanly — train your DSPy program on whether `is_confident` matters for downstream answer quality, and you get a confidence-aware retrieval pipeline without writing the confidence model yourself.

---

**Updated:** 2026-05-12

---

# Atlaso + OpenAI Agents SDK

`pip install atlaso[openai-agents]` is reserved namespace.
The OpenAI Agents SDK uses `@function_tool` decorators — wrap Atlaso verbs as tools and the agent picks them up:

```python
from atlaso import Memory
from agents import Agent, function_tool

memory = Memory()

@function_tool
def remember(text: str, user_id: str) -> str:
    """Save a fact to long-term memory for this user."""
    result = memory.add(text, user_id=user_id)
    return f"Saved as deposit {result.id}"

@function_tool
def recall(query: str, user_id: str) -> str:
    """Search long-term memory.

    Returns the verdict (action language) plus per-hit content + confidence
    flags so the agent can branch on disagreement instead of treating
    retrieval as authoritative.
    """
    results = memory.recall(query, user_id=user_id, limit=5)
    lines = [results.explain()]
    for r in results:
        prefix = "?" if r.has_disagreement else ("✓" if r.is_confident else "·")
        lines.append(f"{prefix} {r.content}")
    return "\n".join(lines)

@function_tool
def contradict(new_text: str, supersedes_deposit_id: str, reason: str, user_id: str) -> str:
    """When you learn something that contradicts an earlier memory, deposit
    the new finding AND mark the old one as superseded — atomically.

    Atlaso has no `update`; this is the canonical revision verb.
    """
    result = memory.contradict(
        new_text,
        contradicts=[supersedes_deposit_id],
        reason=reason,
        user_id=user_id,
    )
    return f"Recorded as deposit {result.id}; old fact marked superseded with an audit reason."

assistant = Agent(
    name="Assistant",
    instructions=(
        "You have long-term memory via remember/recall/contradict. "
        "Branch on the prefix in recall: ✓ = trust, ? = disagreement, · = unconfirmed."
    ),
    tools=[remember, recall, contradict],
)
```

Atlaso pairs especially well with the OpenAI Agents SDK because the agent loop already passes structured outputs back to the LLM — surfacing `is_confident` / `has_disagreement` as inline prefixes lets the model branch on retrieval quality without you writing extra orchestration.
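If you'd rather *order* context than prefix it, the same trust hierarchy can be expressed as a sort key. A standalone sketch over plain dicts — the field names mirror `SearchResult`, but nothing here imports atlaso:

```python
def trust_rank(hit: dict) -> int:
    """Lower is more trusted: confident < unconfirmed < disagreement."""
    if hit["has_disagreement"]:
        return 2
    if hit["is_confident"]:
        return 0
    return 1

hits = [
    {"content": "0.7 over-flags", "is_confident": False, "has_disagreement": True},
    {"content": "oat milk", "is_confident": True, "has_disagreement": False},
    {"content": "0.5 untested", "is_confident": False, "has_disagreement": False},
]
ordered = sorted(hits, key=trust_rank)
# Confident hits sort first, conflicted hits last — feed the top slice to the model.
```

Because `sorted` is stable, hits at the same trust level keep their retrieval order, so BM25 rank still breaks ties within each tier.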
---

**Updated:** 2026-05-12

---

# Atlaso + CrewAI

`pip install atlaso[crewai]` is reserved namespace. Today, wire Atlaso in as CrewAI's `external_memory`:

```python
from atlaso import Memory
from crewai import Agent, Crew, Task
from crewai.memory.external.external_memory import ExternalMemory

class AtlasoExternalMemory(ExternalMemory):
    """Atlaso-backed external memory for a CrewAI Crew."""

    def __init__(self, user_id: str = "crew"):
        super().__init__()
        self._mem = Memory().for_user(user_id)

    def save(self, value: str, metadata: dict | None = None, agent: str | None = None) -> None:
        self._mem.add(
            value,
            polarity="open",
            tags=[f"agent:{agent}"] if agent else (),
            scope=None,
        )

    def search(self, query: str, limit: int = 3, score_threshold: float = 0.0) -> list[dict]:
        results = self._mem.recall(query, limit=limit)
        return [
            {
                "context": r.content,
                "metadata": {
                    "is_confident": r.is_confident,
                    "has_disagreement": r.has_disagreement,
                    "score": r.score,
                },
            }
            for r in results
            if r.score >= score_threshold
        ]

crew = Crew(
    agents=[Agent(role="Researcher", goal="...", backstory="...")],
    tasks=[Task(description="...", expected_output="...")],
    external_memory=AtlasoExternalMemory(user_id="my-crew-id"),
)
```

CrewAI claims a "90% token-cost reduction" with external memory; Atlaso's dispersion-aware retrieval makes that reduction safer, because conflicted memories aren't treated as authoritative when they're fetched into context.

---

**Updated:** 2026-05-12