REST API

FastAPI server exposing the Pensyve memory runtime over HTTP.

uvicorn pensyve_server.main:app --reload --port 3000

Base URL: http://localhost:3000

Authentication

Auth is opt-in. Set PENSYVE_API_KEYS to a comma-separated list of keys. When set, every request must include:

Authorization: Bearer your-api-key

When PENSYVE_API_KEYS is unset, all endpoints are open.
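
A minimal client-side sketch of the rule above, in Python with no third-party dependencies. The helper name `build_headers` is hypothetical and not part of Pensyve. It adds the Authorization header only when a key is supplied, so the same code works against an open server.

```python
from typing import Optional


def build_headers(api_key: Optional[str] = None) -> dict:
    """Return request headers for the Pensyve REST API.

    `build_headers` is a hypothetical helper, not part of Pensyve.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the server was started with PENSYVE_API_KEYS set.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```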

Pagination

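The recall and inspect endpoints support cursor-based pagination. Pass the cursor value from the previous response (a memory ID) to fetch the next page: as a query parameter on /v1/recall, or as a request-body field on /v1/inspect. Each response includes a cursor field, which is null when there are no more results.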
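
The cursor loop can be sketched as a small generator. This is an illustration under assumptions, not a Pensyve client: `fetch_page` is a caller-supplied function that performs one request with the given cursor and returns the parsed JSON response.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield response pages until the server returns a null cursor.

    `fetch_page` is hypothetical: it takes the current cursor (None for the
    first page) and returns one parsed response containing a "cursor" field.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield page
        cursor = page.get("cursor")
        if cursor is None:
            # A null cursor means there are no more results.
            break
```

Because the transport is injected, the same loop serves both /v1/recall and /v1/inspect despite their different ways of passing the cursor.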

Endpoints

`GET /v1/health`

Health check.

Response:

{ "status": "ok", "version": "0.1.0" }

curl http://localhost:3000/v1/health

`POST /v1/entities`

Create or get an entity.

Request body:

Field	Type	Default	Description
`name`	`string`	required	Entity name
`kind`	`string`	`"user"`	`"agent"`, `"user"`, `"team"`, or `"tool"`

Response: 200

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "alice",
  "kind": "user"
}

curl -X POST http://localhost:3000/v1/entities \
  -H "Content-Type: application/json" \
  -d '{"name": "alice", "kind": "user"}'

`POST /v1/remember`

Store a semantic memory about an entity.

Request body:

Field	Type	Default	Description
`entity`	`string`	required	Entity name
`fact`	`string`	required	The fact to store
`confidence`	`number`	`0.8`	Confidence in `[0, 1]`

Response: 200

{
  "id": "a1b2c3d4-...",
  "content": "Alice prefers dark mode",
  "memory_type": "semantic",
  "confidence": 0.8,
  "stability": 1.0,
  "score": null
}

curl -X POST http://localhost:3000/v1/remember \
  -H "Content-Type: application/json" \
  -d '{"entity": "alice", "fact": "Alice prefers dark mode"}'

When PENSYVE_TIER2_ENABLED=true, additional facts are automatically extracted via LLM and stored alongside the explicit fact.

`POST /v1/recall`

Search memories. Fuses vector, BM25, graph, recency, and other signals.

Request body:

Field	Type	Default	Description
`query`	`string`	required	Search query
`entity`	`string \| null`	`null`	Filter to a specific entity
`limit`	`integer`	`5`	Max results per page
`types`	`string[] \| null`	`null`	Filter by memory type: `"episodic"`, `"semantic"`, `"procedural"`

Query parameters:

Param	Type	Description
`cursor`	`string \| null`	Memory ID to paginate from

Response: 200

{
  "memories": [
    {
      "id": "a1b2c3d4-...",
      "content": "Alice prefers dark mode",
      "memory_type": "semantic",
      "confidence": 0.8,
      "stability": 1.0,
      "score": 0.87
    }
  ],
  "contradictions": [],
  "cursor": null
}

The contradictions array is populated only when Tier 2 extraction is enabled. Each entry is {"description": "..."}.

curl -X POST http://localhost:3000/v1/recall \
  -H "Content-Type: application/json" \
  -d '{"query": "dark mode preference", "entity": "alice", "limit": 10}'
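
The same call can be built in Python with the standard library. This is a sketch, assuming the base URL shown at the top of this page; `build_recall_request` is a hypothetical helper. It shows that cursor travels in the query string while the other fields go in the JSON body.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:3000"  # assumed from the Base URL above


def build_recall_request(query: str, entity: Optional[str] = None,
                         limit: int = 5,
                         cursor: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/recall request."""
    url = f"{BASE_URL}/v1/recall"
    if cursor is not None:
        # cursor is a query parameter, not a body field, on this endpoint.
        url += "?" + urllib.parse.urlencode({"cursor": cursor})
    body = {"query": query, "entity": entity, "limit": limit}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Pass the result to `urllib.request.urlopen` to send it. When authentication is enabled, the request also needs the Authorization header.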

`POST /v1/recall_grouped`

Search memories and cluster them by source session in one round trip. This endpoint runs the same RRF fusion pipeline as /v1/recall, after which the core engine groups the results by episode_id. It is the canonical entry point for "memory as input to an LLM reader" workflows.

Internal benchmarking on LongMemEval_S found that this layout produces materially better reader accuracy than flat recall. Moving the session-clustering step into the engine eliminates a class of consumer-side reordering bugs and matches the prompt format the underlying retrieval pipeline is tuned for.

Request body:

Field	Type	Default	Description
`query`	`string`	required	Search query
`limit`	`integer`	`50`	Max memories to consider across all groups
`order`	`"chronological" \| "relevance"`	`"chronological"`	Group ordering (oldest-first or highest-scoring-first)
`max_groups`	`integer \| null`	`null`	Optional cap on the number of returned groups

Response: 200

{
  "groups": [
    {
      "session_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
      "session_time": "2026-01-01T10:00:00+00:00",
      "group_score": 0.92,
      "memories": [
        {
          "id": "a1b2c3d4-...",
          "content": "user: I bought three books yesterday",
          "memory_type": "episodic",
          "confidence": 1.0,
          "stability": 0.8,
          "score": 0.92
        },
        {
          "id": "b2c3d4e5-...",
          "content": "assistant: Nice — any you'd recommend?",
          "memory_type": "episodic",
          "confidence": 1.0,
          "stability": 0.8,
          "score": 0.92
        }
      ]
    },
    {
      "session_id": null,
      "session_time": "2026-02-01T09:00:00+00:00",
      "group_score": 0.51,
      "memories": [
        {
          "id": "c3d4e5f6-...",
          "content": "Alice prefers hardcover",
          "memory_type": "semantic",
          "confidence": 0.9,
          "stability": 1.0,
          "score": 0.51
        }
      ]
    }
  ]
}

session_id is null for semantic and procedural memories that have no episode ancestor; each surfaces as a singleton group. The default chronological order matches the LongMemEval-validated layout: oldest session first, with the memories inside each group in conversation order.

Errors:

400 Bad Request if order is not "chronological" or "relevance".
500 Internal Server Error if the underlying recall pipeline fails.

curl -X POST http://localhost:3000/v1/recall_grouped \
  -H "Content-Type: application/json" \
  -d '{"query": "How many books did I buy?", "limit": 50, "order": "chronological"}'
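
For the "memory as input to an LLM reader" workflow, a grouped response can be flattened into prompt text roughly as follows. This is a sketch: `render_groups` is a hypothetical helper, and the output layout (bracketed timestamps, indented lines) is an illustrative choice, not a format defined by Pensyve.

```python
def render_groups(response: dict) -> str:
    """Flatten a /v1/recall_grouped response into plain text for a prompt."""
    blocks = []
    for group in response["groups"]:
        # session_id is null for memories with no episode ancestor.
        label = group["session_id"] or "no session"
        lines = [f"[{group['session_time']}] session: {label}"]
        # Keep the order the server returned within each group.
        lines.extend(f"  {memory['content']}" for memory in group["memories"])
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

The function preserves the server's group order, so choosing "chronological" or "relevance" in the request is enough to control the layout.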

`POST /v1/episodes/start`

Begin tracking an interaction episode.

Request body:

Field	Type	Default	Description
`participants`	`string[]`	required	Entity names of the participants

Response: 200

{
  "episode_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}

Episodes expire after 30 minutes of inactivity.

curl -X POST http://localhost:3000/v1/episodes/start \
  -H "Content-Type: application/json" \
  -d '{"participants": ["alice", "my-agent"]}'

`POST /v1/episodes/message`

Add a message to an active episode.

Request body:

Field	Type	Default	Description
`episode_id`	`string`	required	Episode ID from `/v1/episodes/start`
`role`	`string`	required	Speaker role (e.g. `"user"`, `"assistant"`)
`content`	`string`	required	Message text

Response: 200

{ "status": "ok" }

Error: 404 if the episode ID is not found or has expired.

curl -X POST http://localhost:3000/v1/episodes/message \
  -H "Content-Type: application/json" \
  -d '{
    "episode_id": "f47ac10b-...",
    "role": "user",
    "content": "What is the status of project X?"
  }'

`POST /v1/episodes/end`

Close an episode and extract memories.

Request body:

Field	Type	Default	Description
`episode_id`	`string`	required	Episode ID
`outcome`	`string \| null`	`null`	`"success"`, `"failure"`, or `"partial"`

Response: 200

{
  "memories_created": 3
}

Error: 404 if the episode ID is not found.

curl -X POST http://localhost:3000/v1/episodes/end \
  -H "Content-Type: application/json" \
  -d '{"episode_id": "f47ac10b-...", "outcome": "success"}'
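
The three episode endpoints form a start, message, end lifecycle, which can be wrapped in a context manager. This is a sketch under assumptions: `episode` is a hypothetical helper, and `post(path, body)` is a caller-supplied function that sends a JSON POST request and returns the parsed response. Reporting "failure" when the block raises is a design choice of this example, not documented server behaviour.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def episode(post: Callable[[str, dict], dict],
            participants: List[str]) -> Iterator[Callable[[str, str], dict]]:
    """Start an episode, yield a function that adds messages, then end it."""
    started = post("/v1/episodes/start", {"participants": participants})
    episode_id = started["episode_id"]
    outcome = "success"

    def say(role: str, content: str) -> dict:
        return post("/v1/episodes/message",
                    {"episode_id": episode_id, "role": role, "content": content})

    try:
        yield say
    except Exception:
        # Assumption of this sketch: an exception maps to a failed outcome.
        outcome = "failure"
        raise
    finally:
        # Always close the episode so memories are extracted.
        post("/v1/episodes/end", {"episode_id": episode_id, "outcome": outcome})
```

Used as `with episode(post, ["alice", "my-agent"]) as say:`, followed by calls such as `say("user", "...")` inside the block.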

`DELETE /v1/entities/{entity_name}`

Archive or permanently delete all memories for an entity.

Path parameters:

Param	Type	Description
`entity_name`	`string`	Entity name

Query parameters:

Param	Type	Default	Description
`hard_delete`	`boolean`	`false`	Permanently delete instead of archiving

Response: 200

{
  "forgotten_count": 12
}

curl -X DELETE http://localhost:3000/v1/entities/alice

# Permanent deletion
curl -X DELETE "http://localhost:3000/v1/entities/alice?hard_delete=true"
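
When building this URL in code, the entity name should be percent-encoded before it goes into the path. A minimal sketch, assuming the base URL from the top of this page; `forget_url` is a hypothetical helper. This document does not say whether the server accepts names containing reserved characters.

```python
import urllib.parse

BASE_URL = "http://localhost:3000"  # assumed from the Base URL above


def forget_url(entity_name: str, hard_delete: bool = False) -> str:
    """Build the DELETE /v1/entities/{entity_name} URL."""
    # Percent-encode the name so it stays inside a single path segment.
    url = f"{BASE_URL}/v1/entities/{urllib.parse.quote(entity_name, safe='')}"
    if hard_delete:
        # Omitting the parameter keeps the default (archive, not delete).
        url += "?hard_delete=true"
    return url
```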

`POST /v1/inspect`

View all memories for an entity, grouped by type.

Request body:

Field	Type	Default	Description
`entity`	`string`	required	Entity name
`limit`	`integer`	`50`	Max results per page
`cursor`	`string \| null`	`null`	Memory ID to paginate from

Response: 200

{
  "entity": "alice",
  "episodic": [
    {
      "id": "...",
      "content": "Asked about project X",
      "memory_type": "episodic",
      "confidence": 1.0,
      "stability": 0.95,
      "score": null
    }
  ],
  "semantic": [],
  "procedural": [],
  "cursor": null
}

curl -X POST http://localhost:3000/v1/inspect \
  -H "Content-Type: application/json" \
  -d '{"entity": "alice"}'

`GET /v1/stats`

Memory statistics for the current namespace.

Response: 200

{
  "namespace": "default",
  "entities": 0,
  "episodic_memories": 42,
  "semantic_memories": 15,
  "procedural_memories": 3
}

curl http://localhost:3000/v1/stats

`POST /v1/consolidate`

Trigger background consolidation. Promotes repeated episodic memories to semantic, applies FSRS decay, and archives memories below threshold.

Request body: None.

Response: 200

{
  "promoted": 3,
  "decayed": 12,
  "archived": 1
}

curl -X POST http://localhost:3000/v1/consolidate

Error Responses

All errors return JSON:

{
  "detail": "Episode f47ac10b-... not found"
}

Status	Meaning
`400`	Invalid request body
`401`	Missing or invalid `Authorization: Bearer` token
`404`	Resource not found (episode, entity)
`422`	Validation error (Pydantic)
`500`	Internal server error
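
A client can turn these responses into exceptions by reading the detail field. This is a sketch: `PensyveAPIError` and `parse_error` are hypothetical names, not part of Pensyve. It falls back to the raw body if the response is not the JSON object described above. Note that detail is not always a string; FastAPI's default validation errors (422) carry a list of error objects there.

```python
import json


class PensyveAPIError(Exception):
    """Hypothetical client-side exception; not part of Pensyve."""

    def __init__(self, status: int, detail):
        super().__init__(f"HTTP {status}: {detail}")
        self.status = status
        self.detail = detail


def parse_error(status: int, body: bytes) -> PensyveAPIError:
    """Build an exception from an error response's status code and body."""
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        # Not JSON, or not a JSON object: keep the raw text instead.
        detail = body.decode("utf-8", errors="replace")
    return PensyveAPIError(status, detail)
```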

Environment Variables

Variable	Default	Purpose
`PENSYVE_PATH`	`~/.pensyve/`	SQLite database path
`PENSYVE_NAMESPACE`	`"default"`	Memory namespace
`PENSYVE_API_KEYS`	unset	Comma-separated API keys
`PENSYVE_TIER2_ENABLED`	`"false"`	Enable LLM-based Tier 2 extraction
`PENSYVE_TIER2_MODEL_PATH`	unset	Path to GGUF model for Tier 2
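
To illustrate the comma-separated format of PENSYVE_API_KEYS, here is one way such a value could be parsed. This is only a sketch: the server's actual parsing is not shown in this document, `parse_api_keys` is a hypothetical name, and trimming whitespace around each key is an assumption.

```python
from typing import Mapping, Set


def parse_api_keys(env: Mapping[str, str]) -> Set[str]:
    """Split PENSYVE_API_KEYS into a set of keys; empty means auth is off."""
    raw = env.get("PENSYVE_API_KEYS", "")
    # Assumption: surrounding whitespace and empty entries are ignored.
    return {key.strip() for key in raw.split(",") if key.strip()}
```

Calling `parse_api_keys(os.environ)` reads the live environment; an empty result corresponds to the open, unauthenticated mode described under Authentication.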
