memok-ai core library

Who this page is for

This site page summarizes the memok-ai npm package (SQLite + MemokPipelineConfig + memok-ai/bridge). It is not the OpenClaw install guide.

  • Library integrators — you ship a Node.js service, worker, or CLI and want long-text or conversation memory in SQLite.
  • OpenClaw readers crossing over — you want a clear boundary: what the plugin does vs what the core library exports.
  • Core contributors — you hack on galaxy8691/memok-ai itself (tests, pipelines, packaging).

Suggested reading paths:

  1. Integrators: Quickstart → Integration → Patterns → Environment variables (skim) → Links.
  2. OpenClaw operators: start at this site’s OpenClaw doc (/docs/openclaw-plugin) and the plugin README; return here only for bridge API shape and SQLite semantics.
  3. Contributors: Requirements → Installation (clone) → For contributors.

Note

Installers, openclaw memok setup, gateway/plugin compatibility, and troubleshooting live in https://github.com/galaxy8691/memok-ai-openclaw. This page focuses on the library.

Quickstart (about five minutes)

Install the package, create a fresh SQLite file once, run one ingest pass. You still need a valid OpenAI-compatible API key in your process environment (or pass openaiApiKey explicitly). memok-ai does not load .env for you.

bash
npm install memok-ai

TypeScript (minimal)

TypeScript
import { createFreshMemokSqliteFile, articleWordPipeline } from "memok-ai/bridge";

const dbPath = "./data/memok.sqlite";
createFreshMemokSqliteFile(dbPath); // omit if DB already exists

await articleWordPipeline("Your long text or consolidated chat …", {
  dbPath,
  openaiApiKey: process.env.OPENAI_API_KEY!,
  openaiBaseUrl: process.env.OPENAI_BASE_URL,
  llmModel: "gpt-4o-mini",
  llmMaxWorkers: 4,
  articleSentencesMaxOutputTokens: 8192,
  coreWordsNormalizeMaxOutputTokens: 32768,
  sentenceMergeMaxCompletionTokens: 2048,
});

Next: Integration for imports, recall/feedback, and production configuration; Patterns for typical service shapes.

Choose your path

Pick a row that matches your goal; the right column points to the doc or package you should follow first.

Goal | Recommendation | Why
Give OpenClaw assistants durable memory with a wizard | This site: /docs/openclaw-plugin, plus the plugin repo README | Gateway wiring, one-line installers, openclaw memok setup, and the compatibility matrix are maintained there, not in the core repo.
Call memok from your own HTTP API, queue worker, or script | npm package memok-ai (often import memok-ai/bridge only) | You own config, secrets, scheduling, and how recall is injected into prompts.
Fork or extend pipelines / SQLite schema | Clone https://github.com/galaxy8691/memok-ai and follow For contributors | You need the full source, tests, and CI scripts from the core repository.
Pure semantic search over embeddings in a hosted vector DB | A dedicated vector product | memok-ai optimizes for a structured, reinforceable graph in SQLite; different trade-offs (see Capabilities below).

How the plugin relates to the core

The gateway process loads the thin extension from memok-ai-openclaw; that extension depends on memok-ai (often under the alias memok-ai-core) and imports the stable surface from memok-ai-core/bridge. Your SQLite path is configured in the plugin/host layer. Without OpenClaw, your app imports memok-ai or memok-ai/bridge directly.

Mermaid source (paste into a Mermaid-capable editor or GitHub preview to render):

OpenClaw path

Text
flowchart LR
  subgraph host [OpenClawHost]
    Gateway[GatewayProcess]
  end
  subgraph pluginRepo [memok-ai-openclaw]
    Ext[PluginExtension]
  end
  subgraph corePkg [memok-ai npm]
    Bridge["memok-ai/bridge"]
    Db[(SQLite file)]
  end
  Gateway --> Ext
  Ext --> Bridge
  Bridge --> Db

Standalone Node app

Text
flowchart LR
  App[YourNodeApp] --> Entry["memok-ai or memok-ai/bridge"]
  Entry --> Db[(SQLite file)]

Integration guide

Import the full surface from memok-ai, or the stable subset from memok-ai/bridge for gateways and OpenClaw-style hosts. The plugin repo may list this package under an alias such as memok-ai-core; the npm registry name remains memok-ai.

TypeScript
// Full API surface (pipelines, SQLite helpers, types)
import {
  articleWordPipelineV2,
  buildPipelineContext,
} from "memok-ai";

// Stable subset for gateways / OpenClaw-style hosts
import {
  articleWordPipeline,
  dreamingPipeline,
} from "memok-ai/bridge";

You import | When
memok-ai/bridge | Gateways, bots, or minimal services that need article ingest, dreaming, recall, feedback, and DB bootstrap helpers
memok-ai | Custom tooling that also needs articleWordPipelineV2, buildPipelineContext, hardenDb, deeper dreaming exports, etc.

Typical integration steps:

  1. npm install memok-ai.
  2. Choose a dbPath.
  3. For a brand-new file, call createFreshMemokSqliteFile(dbPath) once (exported from both memok-ai and memok-ai/bridge); it creates the tables, dream_logs, and link indexes, and throws if the file already exists unless you pass { replace: true }.
  4. Build a MemokPipelineConfig object and pass it to the bridge functions.

Tip

Your application may call dotenv or read Kubernetes secrets; that is your host's responsibility. memok-ai never loads .env internally, so always pass secrets via MemokPipelineConfig or whatever config layer you already use.
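
The tip above can be sketched as a small config factory. MemokPipelineConfigLike below is an abridged stand-in for the real MemokPipelineConfig type (an assumption based only on the fields used on this page), so the sketch stays self-contained:

```typescript
// Abridged stand-in for MemokPipelineConfig; the real type exported by
// memok-ai has more fields (token ceilings, etc.). Illustration only.
interface MemokPipelineConfigLike {
  dbPath: string;
  openaiApiKey: string;
  openaiBaseUrl?: string;
  llmModel: string;
  llmMaxWorkers: number;
}

// Fail fast on a missing key instead of letting it surface deep inside a pipeline call.
function buildConfig(
  env: Record<string, string | undefined>,
  dbPath: string,
): MemokPipelineConfigLike {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    throw new Error("OPENAI_API_KEY is not set; inject it via your process manager");
  }
  return {
    dbPath,
    openaiApiKey,
    openaiBaseUrl: env.OPENAI_BASE_URL,
    llmModel: env.MEMOK_LLM_MODEL ?? "gpt-4o-mini",
    llmMaxWorkers: Number(env.MEMOK_LLM_MAX_WORKERS ?? 4),
  };
}
```

Call it once at startup (for example `buildConfig(process.env, "./data/memok.sqlite")`) and pass the result to the bridge functions, rather than reading process.env at each call site.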

Bridge entrypoints (articleWordPipeline, dreamingPipeline, etc.) take a full MemokPipelineConfig (or DreamingPipelineConfig). For low-level pipelines that accept { ctx }, import buildPipelineContext from memok-ai and pass PipelineLlmContext.

TypeScript (single ingest)

TypeScript
import { articleWordPipeline } from "memok-ai/bridge";

await articleWordPipeline(longText, {
  dbPath: "/path/to/memok.sqlite",
  openaiApiKey: process.env.OPENAI_API_KEY!,
  openaiBaseUrl: process.env.OPENAI_BASE_URL,
  llmModel: "gpt-4o-mini",
  llmMaxWorkers: 4,
  articleSentencesMaxOutputTokens: 8192,
  coreWordsNormalizeMaxOutputTokens: 32768,
  sentenceMergeMaxCompletionTokens: 2048,
});

End-to-end (ingest + sample recall + feedback)

TypeScript
import {
  type MemokPipelineConfig,
  createFreshMemokSqliteFile,
  articleWordPipeline,
  extractMemorySentencesByWordSample,
  applySentenceUsageFeedback,
} from "memok-ai/bridge";

const dbPath = "./data/memok.sqlite";
createFreshMemokSqliteFile(dbPath); // once for a new file; omit if DB exists, or pass { replace: true }

const memok: MemokPipelineConfig = {
  dbPath,
  openaiApiKey: process.env.OPENAI_API_KEY!,
  openaiBaseUrl: process.env.OPENAI_BASE_URL,
  llmModel: "gpt-4o-mini",
  llmMaxWorkers: 4,
  articleSentencesMaxOutputTokens: 8192,
  coreWordsNormalizeMaxOutputTokens: 32768,
  sentenceMergeMaxCompletionTokens: 2048,
};

await articleWordPipeline("Long article or consolidated chat …", memok);

const recall = extractMemorySentencesByWordSample({ ...memok, fraction: 0.2 });
// Build your LLM prompt from recall.sentences, then mark usage:

await applySentenceUsageFeedback({
  ...memok,
  sentenceIds: recall.sentences.map((s) => s.id),
});

To produce the v2 tuple without writing SQLite, use articleWordPipelineV2 with buildPipelineContext from memok-ai instead of articleWordPipeline.

Common integration patterns

These are sketches only; wire in error handling, retries, and your own LLM client. Types and return shapes follow the upstream package.

Pattern A — conversation-aware service: ingest consolidated text after a session, sample recall before the next model call, then apply feedback for sentences you actually used in the reply.

Pattern A (skeleton)

TypeScript
import {
  type MemokPipelineConfig,
  articleWordPipeline,
  extractMemorySentencesByWordSample,
  applySentenceUsageFeedback,
} from "memok-ai/bridge";

async function afterSession(memok: MemokPipelineConfig, transcript: string) {
  await articleWordPipeline(transcript, memok);
}

async function beforeReply(memok: MemokPipelineConfig) {
  const recall = extractMemorySentencesByWordSample({ ...memok, fraction: 0.2 });
  // const reply = await yourLlm(buildPrompt(recall.sentences));
  return recall;
}

async function reinforce(memok: MemokPipelineConfig, sentenceIds: number[]) {
  await applySentenceUsageFeedback({ ...memok, sentenceIds });
}

Pattern B — batch document ingest: loop over documents and call articleWordPipeline per document (or chunk) with the same MemokPipelineConfig.

Pattern B (skeleton)

TypeScript
import { articleWordPipeline, type MemokPipelineConfig } from "memok-ai/bridge";

async function indexDocs(memok: MemokPipelineConfig, docs: { id: string; body: string }[]) {
  for (const doc of docs) {
    await articleWordPipeline(doc.body, memok);
    // optionally log doc.id for traceability in your app
  }
}

Pattern C — scheduled maintenance: run dreamingPipeline on a timer or external scheduler (cron, worker). You must supply DreamingPipelineConfig including dreamLogWarn; the OpenClaw plugin wraps the same function with its own schedule.

Pattern C (skeleton)

TypeScript
import { dreamingPipeline, type DreamingPipelineConfig } from "memok-ai/bridge";

async function runNightlyDreaming(cfg: DreamingPipelineConfig) {
  await dreamingPipeline(cfg);
}

// Example only: prefer your platform scheduler instead of setInterval in production.
// setInterval(() => { void runNightlyDreaming(dreamCfg); }, 24 * 60 * 60 * 1000);
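
If you do run a long-lived process without an external scheduler, a small helper can compute the delay until the next local run hour. This is a generic sketch, not part of memok-ai:

```typescript
// Milliseconds until the next occurrence of a given local hour (e.g. 3 for 03:00).
// Generic helper for self-hosted scheduling; prefer cron or a platform scheduler.
function msUntilHour(hour: number, now: Date = new Date()): number {
  const next = new Date(now);
  next.setHours(hour, 0, 0, 0);
  if (next.getTime() <= now.getTime()) {
    next.setDate(next.getDate() + 1); // today's slot already passed; schedule tomorrow
  }
  return next.getTime() - now.getTime();
}

// Usage with the Pattern C skeleton above:
// setTimeout(() => { void runNightlyDreaming(dreamCfg); }, msUntilHour(3));
```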

Capabilities and design notes

memok-ai is a Node.js + TypeScript memory pipeline for long text and conversations. It extracts structured memory units with OpenAI-compatible LLM APIs and stores them in SQLite for recall, reinforcement, and dreaming.

  • End-to-end article pipeline (article-word-pipeline) with stable JSON tuples
  • SQLite paths for words, normal_words, sentences, and link tables
  • Dreaming pipeline (dreaming-pipeline): predream + story-word-sentence loops
  • OpenClaw plugin (separate repo) builds on the same bridge for per-turn recall and optional scheduled maintenance

Evaluation (upstream tested): with the OpenClaw plugin recall/report flow, effective utilization of candidate memories reflected in assistant replies exceeded 95% in their runs; your results will depend on model, task, and sampling.

What the OpenClaw plugin adds: per-turn recall; reinforcement via memok_report_used_memory_ids; optional predream/dreaming as graph maintenance rather than a pure append-only log.

How this differs from embedding-only stacks (a trade-off, not universally better or worse at retrieval):

Aspect | memok-ai | Typical hosted vector DB
Deployment | SQLite on your machine | Cloud API + billing
Recall signal | Word / normalized-word graph, weights, sampling | Embedding similarity
Explainability | Structured rows you can inspect | Mostly similarity scores
Privacy | Data stays local by default | Usually leaves your host

Upstream notes: informal timing on typical local setups (SSD, modest DB) is often on the order of 100 ms to persist a turn and sub-100 ms for recall queries; indicative only, not an SLA. Active DBs in the wild have reached on the order of 1k sentences and 100k+ link rows.

Authoritative narrative and changelog: https://github.com/galaxy8691/memok-ai/blob/main/README.md

Requirements

  • Node.js ≥20 (LTS recommended) and npm
  • OpenClaw: supported gateway and plugin API versions are documented in https://github.com/galaxy8691/memok-ai-openclaw; the core memok-ai package.json does not pin them.
  • First-time npm install in the core repo is often dominated by better-sqlite3 (native prebuild/compile); allow a few minutes on a cold cache.

Installation

1) Clone the core repo for development

bash
git clone https://github.com/galaxy8691/memok-ai.git
cd memok-ai
npm install
npm run build
npm test

  • npm install — runs prepare → npm run build
  • npm run build — tsc only
  • npm test — Vitest; some tests call LLMs when OPENAI_API_KEY is set
  • npm run ci — Biome + build + test

Core tests do not read .env; export variables in your shell. Mirror: https://gitee.com/wik20/memok-ai.

2) npm dependency — https://www.npmjs.com/package/memok-ai

bash
npm install memok-ai

3) OpenClaw plugin — use https://github.com/galaxy8691/memok-ai-openclaw and this site’s /docs/openclaw-plugin for gateway steps.

bash
git clone https://github.com/galaxy8691/memok-ai-openclaw.git
cd memok-ai-openclaw
# openclaw plugins install … && openclaw memok setup  (see plugin README)

Dreaming

Call dreamingPipeline from memok-ai/bridge with DreamingPipelineConfig: MemokPipelineConfig plus required dreamLogWarn, plus optional story tuning (maxWords, fraction, minRuns, maxRuns). The OpenClaw plugin schedules the same function.

Persistence: every run (success or failure) appends one row to the SQLite table dream_logs (dream_date, ts, status of 'ok' or 'error', log_json). Implement dreamLogWarn to log or forward non-fatal issues (for example, when dream_logs cannot be written); hard failures still throw after logging.
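
A minimal dreamLogWarn can forward to your logger and keep a copy for metrics or tests. The single-string-argument signature here is an assumption; verify it against the upstream DreamingPipelineConfig type:

```typescript
// Collected warnings, e.g. to export as a metric or assert on in tests.
const dreamWarnings: string[] = [];

// Assumed signature (one message string); check DreamingPipelineConfig upstream.
function dreamLogWarn(message: string): void {
  dreamWarnings.push(message);
  console.warn(`[memok dreaming] ${message}`);
}
```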

Monitoring and debugging:

  1. Treat dream_logs as your primary health signal for scheduled maintenance; tail the latest rows after each scheduled window.
  2. Open the DB read-only: SELECT * FROM dream_logs ORDER BY id DESC LIMIT 5;
  3. For failures, read status = 'error' and inspect log_json.error; compare log_json.predream and story sections across runs to see whether merges/decay are progressing.
  4. If you host dreaming yourself, alert on repeated errors and on sudden drops in predream counters; plugin users should align cron with upstream guidance and check the same table.
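
The checks above can be scripted once you fetch the latest rows with any SQLite client (better-sqlite3, node:sqlite, or the sqlite3 CLI, using the query from step 2). This sketch assumes only the dream_logs columns described above:

```typescript
// Row shape per the dream_logs columns documented above.
type DreamLogRow = {
  dream_date: string;
  ts: number;
  status: string;
  log_json: string;
};

// Summarize rows fetched newest-first, e.g. via
// SELECT dream_date, ts, status, log_json FROM dream_logs ORDER BY id DESC LIMIT 5;
function summarizeDreamLogs(rows: DreamLogRow[]): {
  total: number;
  errors: number;
  lastStatus?: string;
} {
  return {
    total: rows.length,
    errors: rows.filter((r) => r.status === "error").length,
    lastStatus: rows[0]?.status,
  };
}
```

Alert when errors is non-zero for several consecutive windows, or when lastStatus stays 'error'.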

Configuration priority (OpenClaw plugin)

For OPENAI_API_KEY, OPENAI_BASE_URL, and MEMOK_LLM_MODEL when using the separate OpenClaw plugin:

  • Existing process environment variables win.
  • Plugin config only fills missing values.
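
The two rules above amount to a nullish-coalescing merge where the process environment wins; this is a hypothetical illustration, not the plugin's actual code:

```typescript
// The three settings the plugin resolves: OPENAI_API_KEY, OPENAI_BASE_URL, MEMOK_LLM_MODEL.
interface LlmSettings {
  apiKey?: string;
  baseUrl?: string;
  model?: string;
}

// Environment wins; plugin config only fills the gaps.
function resolveSettings(env: LlmSettings, plugin: LlmSettings): LlmSettings {
  return {
    apiKey: env.apiKey ?? plugin.apiKey,
    baseUrl: env.baseUrl ?? plugin.baseUrl,
    model: env.model ?? plugin.model,
  };
}
```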

This core library never loads .env files; inject secrets via your process manager or gateway.

Environment variables

When to care about this table:

  • You run upstream tests or quick local scripts that assemble config from process.env.
  • You run the OpenClaw plugin process, which fills missing MEMOK_* defaults (see Configuration priority).

Production services should still prefer an explicit MemokPipelineConfig from your config service rather than implicit env coupling.

Who actually reads these?

  1. OpenClaw plugin process — may default missing fields from MEMOK_* when building a MemokPipelineConfig-shaped object
  2. This repo’s tests / legacy helpers — some paths read process.env when you do not pass an explicit config object
  3. Library integrators — should not rely on this table in production; construct MemokPipelineConfig explicitly

Per-stage model env names (e.g. MEMOK_V2_ARTICLE_CORE_WORDS_LLM_MODEL) are documented in upstream resolveModel helpers.
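
As a hedged sketch of that fallback order (the hypothetical helper below is for illustration; the upstream resolveModel helpers are authoritative):

```typescript
// Per-stage env name wins, then the global MEMOK_LLM_MODEL, then a caller default.
function resolveStageModel(
  env: Record<string, string | undefined>,
  stageEnvName: string, // e.g. "MEMOK_V2_ARTICLE_CORE_WORDS_LLM_MODEL"
  fallback: string,
): string {
  return env[stageEnvName] ?? env.MEMOK_LLM_MODEL ?? fallback;
}
```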

Variable | Required | Why | Effect when set
OPENAI_API_KEY | Yes (env-based flows) | Key for OpenAI-compatible endpoints | Used when config is assembled from env without an explicit openaiApiKey
OPENAI_BASE_URL | No | Self-hosted or proxy gateways | Overrides the default OpenAI host for the client
MEMOK_LLM_MODEL | No | Quick default model switch | Default model when not set in config
MEMOK_DB_PATH | No | Quick default SQLite path | Default ./memok.sqlite when env helpers resolve dbPath
MEMOK_LLM_MAX_WORKERS | No | Cap parallel LLM calls | An integer > 1 enables bounded parallelism in article stages
MEMOK_V2_ARTICLE_SENTENCES_MAX_OUTPUT_TOKENS | No | Bound the article sentence stage | Clamped token ceiling for that stage
MEMOK_CORE_WORDS_NORMALIZE_MAX_OUTPUT_TOKENS | No | Bound the normalization stage | Clamped token ceiling
MEMOK_SENTENCE_MERGE_MAX_COMPLETION_TOKENS | No | Bound merge completions | Clamped token ceiling
MEMOK_SKIP_LLM_STRUCTURED_PARSE | No | Debug / resilience | When truthy, skips strict structured parsing where implemented

Note

Full variable list and edge cases may evolve; cross-check https://github.com/galaxy8691/memok-ai/blob/main/README.md when upgrading versions.

Performance and tuning (qualitative)

No SLA numbers here—tune on your hardware and workload.

Topic | Practical notes
SQLite file | Put dbPath on fast local disk (SSD). For tests only, you may use :memory: if your integration supports it; confirm in upstream docs before relying on it in production.
Parallelism | llmMaxWorkers > 1 speeds up article stages but increases concurrent LLM calls; reduce it if you hit rate limits or memory pressure.
Token ceilings | articleSentencesMaxOutputTokens and related fields cap stage outputs; raising them can increase cost and latency, lowering them can reduce timeouts.
better-sqlite3 | Native module: cold installs are slower; runtime is usually dominated by LLM round-trips, not SQLite, for modest DB sizes.
Informal latency | Upstream reports on the order of 100 ms per-turn persist and sub-100 ms recall on typical local setups; anecdotal, reproduce on your own stack.

For contributors

Day-one workflow for the core repository (not the plugin):

  1. Clone galaxy8691/memok-ai and use Node 20+.
  2. Run npm install, then npm run ci before opening a PR.
  3. Set OPENAI_API_KEY only when running tests that intentionally call remote models (see CONTRIBUTING.md).
  4. Use npm run build after src/ edits if you skipped install hooks.

There is no memok-ai dev CLI binary. OpenClaw commands such as openclaw memok setup belong to the gateway; verify with openclaw --help and the plugin README.

Script | Purpose | Typical invocation
build | Compile TypeScript (tsc) | npm run build
test | Run Vitest once | npm test
lint | Biome check | npm run lint
format | Biome format --write | npm run format
ci | lint + build + test | npm run ci
prepare | Runs build on install | npm install (hook)

Contributing guide: https://github.com/galaxy8691/memok-ai/blob/main/CONTRIBUTING.md