REST API
FastAPI server exposing the Pensyve memory runtime over HTTP.
uvicorn pensyve_server.main:app --reload
Base URL: http://localhost:3000
Authentication
Auth is opt-in. Set PENSYVE_API_KEYS to a comma-separated list of keys. When set, every request must include:
Authorization: Bearer your-api-key
When PENSYVE_API_KEYS is unset, all endpoints are open.
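A minimal client-side sketch of the opt-in scheme, using only the Python standard library. The helper name `build_request` is hypothetical; the header format and base URL are taken from this document.

```python
import urllib.request

BASE_URL = "http://localhost:3000"  # base URL documented above


def build_request(path, api_key=None):
    """Build a GET request, attaching a bearer token only when a key is given.

    Hypothetical helper: when the server runs without PENSYVE_API_KEYS,
    the header can simply be omitted.
    """
    req = urllib.request.Request(BASE_URL + path)
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    return req


req = build_request("/v1/health", api_key="your-api-key")
print(req.full_url, req.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is left out so the sketch runs without a live server.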
Pagination
Recall and inspect endpoints support cursor-based pagination. Pass cursor (a memory ID) to fetch the next page. The response includes a cursor field, which is null when there are no more results.
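The cursor loop can be sketched as a small generator. Here `fetch_page` is a hypothetical callable that performs one request for a given cursor and returns the decoded JSON body; only the `cursor` field and its null-terminates-paging behavior come from this document.

```python
def iter_pages(fetch_page):
    """Yield response pages until the server reports a null cursor.

    fetch_page(cursor) is an assumed callable: it issues one recall or
    inspect request (cursor=None for the first page) and returns a dict.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield page
        cursor = page.get("cursor")
        if cursor is None:  # null cursor: no more results
            break


# Usage with a stand-in for the network call:
fake_pages = {
    None: {"memories": ["m1"], "cursor": "m1"},
    "m1": {"memories": ["m2"], "cursor": None},
}
for page in iter_pages(lambda c: fake_pages[c]):
    print(page["memories"])
```

Injecting the fetch function keeps the paging logic independent of whether the cursor travels in the query string or in the request body.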
Endpoints
GET /v1/health
Health check.
Response:
{ "status": "ok", "version": "0.1.0" }
curl http://localhost:3000/v1/health
POST /v1/entities
Create or get an entity.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | required | Entity name |
| kind | string | "user" | "agent", "user", "team", or "tool" |
Response: 200
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"name": "alice",
"kind": "user"
}
curl -X POST http://localhost:3000/v1/entities \
-H "Content-Type: application/json" \
-d '{"name": "alice", "kind": "user"}'
POST /v1/remember
Store a semantic memory about an entity.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| entity | string | required | Entity name |
| fact | string | required | The fact to store |
| confidence | number | 0.8 | Confidence in [0, 1] |
Response: 200
{
"id": "a1b2c3d4-...",
"content": "Alice prefers dark mode",
"memory_type": "semantic",
"confidence": 0.8,
"stability": 1.0,
"score": null
}
curl -X POST http://localhost:3000/v1/remember \
-H "Content-Type: application/json" \
-d '{"entity": "alice", "fact": "Alice prefers dark mode"}'
When PENSYVE_TIER2_ENABLED=true, additional facts are automatically extracted via LLM and stored alongside the explicit fact.
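For clients that build the /v1/remember body programmatically, a sketch like the following checks the documented confidence range before sending. The function name `remember_payload` is hypothetical, and the client-side check is an assumption layered on top of whatever validation the server performs.

```python
import json


def remember_payload(entity, fact, confidence=None):
    """Serialize a /v1/remember body; omit confidence to use the server default (0.8)."""
    body = {"entity": entity, "fact": fact}
    if confidence is not None:
        # The table above documents confidence as a number in [0, 1].
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        body["confidence"] = confidence
    return json.dumps(body)


print(remember_payload("alice", "Alice prefers dark mode", confidence=0.9))
```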
POST /v1/recall
Search memories. Fuses vector, BM25, graph, recency, and other signals.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| entity | string \| null | null | Filter to a specific entity |
| limit | integer | 5 | Max results per page |
| types | string[] \| null | null | Filter by memory type: "episodic", "semantic", "procedural" |
Query parameters:
| Param | Type | Description |
|---|---|---|
| cursor | string \| null | Memory ID to paginate from |
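Because cursor travels in the query string while the other fields go in the JSON body, it can help to see both assembled together. This is a sketch using the standard library; `build_recall_request` is a hypothetical helper, not part of any Pensyve client.

```python
import json
import urllib.parse
import urllib.request


def build_recall_request(query, entity=None, limit=5, types=None, cursor=None,
                         base_url="http://localhost:3000"):
    """Assemble a POST /v1/recall request without sending it."""
    url = f"{base_url}/v1/recall"
    if cursor is not None:
        # cursor is a query parameter, not a body field, for this endpoint
        url += "?" + urllib.parse.urlencode({"cursor": cursor})
    body = {"query": query, "entity": entity, "limit": limit, "types": types}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_recall_request("dark mode preference", entity="alice", limit=10)
print(req.get_method(), req.full_url)
```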
Response: 200
{
"memories": [
{
"id": "a1b2c3d4-...",
"content": "Alice prefers dark mode",
"memory_type": "semantic",
"confidence": 0.8,
"stability": 1.0,
"score": 0.87
}
],
"contradictions": [],
"cursor": null
}
The contradictions array is populated only when Tier 2 extraction is enabled. Each entry is {"description": "..."}.
curl -X POST http://localhost:3000/v1/recall \
-H "Content-Type: application/json" \
-d '{"query": "dark mode preference", "entity": "alice", "limit": 10}'
POST /v1/recall_grouped
Search memories and cluster them by source session in one round trip. This endpoint runs the same RRF fusion pipeline as /v1/recall, and the core engine then post-processes the results to group them by episode_id. It is the canonical entry point for "memory as input to an LLM reader" workflows.
Internal benchmarking on LongMemEval_S confirmed that this layout produces materially better reader accuracy than flat recall. Moving the session-clustering step into the engine eliminates a class of consumer-side reordering bugs and matches the prompt format the underlying retrieval pipeline is tuned for.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| query | string | required | Search query |
| limit | integer | 50 | Max memories to consider across all groups |
| order | "chronological" \| "relevance" | "chronological" | Group ordering (oldest-first or highest-scoring-first) |
| max_groups | integer \| null | null | Optional cap on the number of returned groups |
Response: 200
{
"groups": [
{
"session_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"session_time": "2026-01-01T10:00:00+00:00",
"group_score": 0.92,
"memories": [
{
"id": "a1b2c3d4-...",
"content": "user: I bought three books yesterday",
"memory_type": "episodic",
"confidence": 1.0,
"stability": 0.8,
"score": 0.92
},
{
"id": "b2c3d4e5-...",
"content": "assistant: Nice — any you'd recommend?",
"memory_type": "episodic",
"confidence": 1.0,
"stability": 0.8,
"score": 0.92
}
]
},
{
"session_id": null,
"session_time": "2026-02-01T09:00:00+00:00",
"group_score": 0.51,
"memories": [
{
"id": "c3d4e5f6-...",
"content": "Alice prefers hardcover",
"memory_type": "semantic",
"confidence": 0.9,
"stability": 1.0,
"score": 0.51
}
]
}
]
}
session_id is null for semantic and procedural memories that have no episode ancestor; they surface as singleton groups. The default chronological order matches the LongMemEval-validated layout: oldest session first, with within-group memories in conversation order.
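As an illustration of feeding grouped results to an LLM reader, the sketch below flattens a response into plain text while preserving the group and within-group order returned by the server. The text layout and the name `render_groups` are assumptions for illustration; the document does not specify the exact prompt format the pipeline is tuned for.

```python
def render_groups(response):
    """Turn a /v1/recall_grouped response into a plain-text context block."""
    lines = []
    for index, group in enumerate(response["groups"], start=1):
        # session_id is null for memories with no episode ancestor
        label = group["session_id"] or "no session"
        lines.append(f"Session {index} ({label}, {group['session_time']})")
        for memory in group["memories"]:
            lines.append(f"- {memory['content']}")
    return "\n".join(lines)


example = {
    "groups": [
        {
            "session_id": None,
            "session_time": "2026-02-01T09:00:00+00:00",
            "group_score": 0.51,
            "memories": [{"content": "Alice prefers hardcover"}],
        }
    ]
}
print(render_groups(example))
```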
Errors:
- 400 Bad Request if order is not "chronological" or "relevance".
- 500 Internal Server Error if the underlying recall pipeline fails.
curl -X POST http://localhost:3000/v1/recall_grouped \
-H "Content-Type: application/json" \
-d '{"query": "How many books did I buy?", "limit": 50, "order": "chronological"}'
POST /v1/episodes/start
Begin tracking an interaction episode.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| participants | string[] | required | Entity names of the participants |
Response: 200
{
"episode_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479"
}
Episodes expire after 30 minutes of inactivity.
curl -X POST http://localhost:3000/v1/episodes/start \
-H "Content-Type: application/json" \
-d '{"participants": ["alice", "my-agent"]}'
POST /v1/episodes/message
Add a message to an active episode.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| episode_id | string | required | Episode ID from /v1/episodes/start |
| role | string | required | Speaker role (e.g. "user", "assistant") |
| content | string | required | Message text |
Response: 200
{ "status": "ok" }
Error: 404 if the episode ID is not found or has expired.
curl -X POST http://localhost:3000/v1/episodes/message \
-H "Content-Type: application/json" \
-d '{
"episode_id": "f47ac10b-...",
"role": "user",
"content": "What is the status of project X?"
}'
POST /v1/episodes/end
Close an episode and extract memories.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| episode_id | string | required | Episode ID |
| outcome | string \| null | null | "success", "failure", or "partial" |
Response: 200
{
"memories_created": 3
}
Error: 404 if the episode ID is not found.
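The three episode endpoints are meant to be called in sequence: start, one message call per turn, then end. A sketch of that lifecycle follows, with the transport injected as `post(path, body)`, a hypothetical callable that sends a JSON POST and returns the decoded response. The name `record_episode` is also an assumption.

```python
def record_episode(post, participants, messages, outcome=None):
    """Run start -> message* -> end and return the memories_created count.

    post(path, body) is an assumed callable wrapping the HTTP layer.
    messages is an iterable of (role, content) pairs in conversation order.
    """
    started = post("/v1/episodes/start", {"participants": participants})
    episode_id = started["episode_id"]
    for role, content in messages:
        post("/v1/episodes/message",
             {"episode_id": episode_id, "role": role, "content": content})
    end_body = {"episode_id": episode_id}
    if outcome is not None:  # "success", "failure", or "partial"
        end_body["outcome"] = outcome
    return post("/v1/episodes/end", end_body)["memories_created"]
```

Since episodes expire after 30 minutes of inactivity, a long-running caller may want to handle a 404 from the message or end call by starting a fresh episode.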
curl -X POST http://localhost:3000/v1/episodes/end \
-H "Content-Type: application/json" \
-d '{"episode_id": "f47ac10b-...", "outcome": "success"}'
DELETE /v1/entities/{entity_name}
Archive or permanently delete all memories for an entity.
Path parameters:
| Param | Type | Description |
|---|---|---|
| entity_name | string | Entity name |
Query parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| hard_delete | boolean | false | Permanently delete instead of archiving |
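Entity names go into the URL path, so names containing spaces or other reserved characters need percent-encoding. The sketch below builds the request with the standard library; `build_forget_request` is a hypothetical helper, and how the server treats encoded characters such as `%2F` is not specified here.

```python
import urllib.parse
import urllib.request


def build_forget_request(entity_name, hard_delete=False,
                         base_url="http://localhost:3000"):
    """Assemble DELETE /v1/entities/{entity_name} without sending it."""
    # safe="" percent-encodes every reserved character, including "/"
    encoded = urllib.parse.quote(entity_name, safe="")
    url = f"{base_url}/v1/entities/{encoded}"
    if hard_delete:
        url += "?hard_delete=true"  # permanent deletion instead of archiving
    return urllib.request.Request(url, method="DELETE")


req = build_forget_request("my agent", hard_delete=True)
print(req.get_method(), req.full_url)
```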
Response: 200
{
"forgotten_count": 12
}
curl -X DELETE http://localhost:3000/v1/entities/alice
# Permanent deletion
curl -X DELETE "http://localhost:3000/v1/entities/alice?hard_delete=true"
POST /v1/inspect
View all memories for an entity, grouped by type.
Request body:
| Field | Type | Default | Description |
|---|---|---|---|
| entity | string | required | Entity name |
| limit | integer | 50 | Max results per page |
| cursor | string \| null | null | Memory ID to paginate from |
Response: 200
{
"entity": "alice",
"episodic": [
{
"id": "...",
"content": "Asked about project X",
"memory_type": "episodic",
"confidence": 1.0,
"stability": 0.95,
"score": null
}
],
"semantic": [],
"procedural": [],
"cursor": null
}
curl -X POST http://localhost:3000/v1/inspect \
-H "Content-Type: application/json" \
-d '{"entity": "alice"}'
GET /v1/stats
Memory statistics for the current namespace.
Response: 200
{
"namespace": "default",
"entities": 0,
"episodic_memories": 42,
"semantic_memories": 15,
"procedural_memories": 3
}
curl http://localhost:3000/v1/stats
POST /v1/consolidate
Trigger background consolidation. Promotes repeated episodic memories to semantic, applies FSRS decay, and archives memories below threshold.
Request body: None.
Response: 200
{
"promoted": 3,
"decayed": 12,
"archived": 1
}
curl -X POST http://localhost:3000/v1/consolidate
Error Responses
All errors return JSON:
{
"detail": "Episode f47ac10b-... not found"
}
| Status | Meaning |
|---|---|
| 400 | Invalid request body |
| 401 | Missing or invalid Authorization: Bearer token |
| 404 | Resource not found (episode, entity) |
| 422 | Validation error (Pydantic) |
| 500 | Internal server error |
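On the client side, the status code and the detail field can be folded into a single exception. The sketch below is one way to do that; `PensyveError` and `parse_error` are hypothetical names. It also tolerates bodies where detail is not a plain string or where the body is not a JSON object, which may happen for validation errors or proxy-generated responses.

```python
import json


class PensyveError(Exception):
    """Hypothetical client-side exception carrying the HTTP status and detail."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def parse_error(status, body_text):
    """Build a PensyveError from an error response body."""
    try:
        detail = json.loads(body_text).get("detail", body_text)
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: keep the raw text.
        detail = body_text
    return PensyveError(status, detail)


err = parse_error(404, '{"detail": "Episode f47ac10b-... not found"}')
print(err.status, err.detail)
```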
Environment Variables
| Variable | Default | Purpose |
|---|---|---|
| PENSYVE_PATH | ~/.pensyve/ | SQLite database path |
| PENSYVE_NAMESPACE | "default" | Memory namespace |
| PENSYVE_API_KEYS | unset | Comma-separated API keys |
| PENSYVE_TIER2_ENABLED | "false" | Enable LLM-based Tier 2 extraction |
| PENSYVE_TIER2_MODEL_PATH | unset | Path to GGUF model for Tier 2 |
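To make the defaults concrete, here is a sketch of how a deployment script might read these variables. It is not the server's actual parsing code: the function name `load_settings`, the whitespace trimming of keys, and the case-insensitive "true" check are all assumptions.

```python
import os


def load_settings(env=None):
    """Read the documented variables, applying the defaults from the table."""
    env = os.environ if env is None else env
    raw_keys = env.get("PENSYVE_API_KEYS", "")
    # Comma-separated list; blank entries are dropped (assumed behavior).
    api_keys = {key.strip() for key in raw_keys.split(",") if key.strip()}
    return {
        "path": os.path.expanduser(env.get("PENSYVE_PATH", "~/.pensyve/")),
        "namespace": env.get("PENSYVE_NAMESPACE", "default"),
        "api_keys": api_keys,
        "auth_required": bool(api_keys),  # auth is opt-in
        "tier2_enabled": env.get("PENSYVE_TIER2_ENABLED", "false").lower() == "true",
        "tier2_model_path": env.get("PENSYVE_TIER2_MODEL_PATH"),
    }


settings = load_settings({"PENSYVE_API_KEYS": "key-a, key-b"})
print(settings["auth_required"], sorted(settings["api_keys"]))
```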