Technical Knowledge Consolidation Pipeline

Run this pipeline to pay down documentation debt: it merges and sorts scattered technical notes into a clear, organized knowledge base.

How to use

Use this periodically (e.g., weekly) to maintain the quality of your memory files.

Prompt

ROLE: Senior Principal Software Architect (Monorepo & AI-Infrastructures)

CONTEXT:

  • REPOSITORY: Complex monorepo using Turborepo.
  • SOURCE DATA: Raw JSONL conversation history in ~/.gemini/tmp/bits/chats.
  • TARGET: Hierarchical memories/*.md files and GEMINI.md indexes.

TASK: REPRODUCIBLE KNOWLEDGE EXTRACTION PIPELINE

Execute a multi-stage pipeline to extract, consolidate, and sort technical gotchas.

STAGE 1: TECHNICAL PREPARATION (MANDATORY)

  1. You MUST NOT read JSONL files directly via standard read tools (too large).
  2. Write a Node.js helper script (tools/extract_history.mjs) that:
  • Uses readline to stream JSONL.
  • Extracts only user and gemini text content.
  • Captures the startTime from the session header or the first timestamp.
  • Outputs a clean text file for each session.
  3. Run this script over all history files and store temporary extracts.
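
The per-line extraction described in Stage 1 can be sketched as a pure function. This is a minimal sketch, not a definitive implementation: it assumes the record shape that the attached script reads (`type`, `content`, `startTime`, `timestamp`), and the name `extractEntry` is hypothetical.

```javascript
// Hypothetical helper: turn one JSONL line into { start, entry }.
// ASSUMPTION: records look like { type, content, startTime?, timestamp? },
// the shape the attached script reads; real history files may differ.
function extractEntry(line) {
  let record;
  try {
    record = JSON.parse(line);
  } catch {
    return { start: null, entry: null }; // skip blank or malformed lines
  }
  if (record === null || typeof record !== 'object') {
    return { start: null, entry: null };
  }
  // Prefer the session header's startTime, fall back to a per-record timestamp.
  const start = record.startTime || record.timestamp || null;
  if (record.type !== 'user' && record.type !== 'gemini') {
    return { start, entry: null };
  }
  // Content may be a plain string or an array of parts with a `text` field.
  const parts = Array.isArray(record.content) ? record.content : [record.content];
  const text = parts
    .map((part) => (typeof part === 'string' ? part : part?.text ?? ''))
    .join('')
    .trim();
  return { start, entry: text ? `ROLE: ${record.type}\n${text}` : null };
}
```

Keeping the parsing separate from the readline loop makes it easy to check against a few sample lines before running it over the full history.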

STAGE 2: SEMANTIC ANALYSIS & CHUNKING

  1. Process extracted text in chunks (max 50k tokens) using sub-agents.
  2. Identify "GOTCHAS": Non-obvious technical traps, dependency leaks, or build quirks.
  3. ZERO HALLUCINATION: Extract exact timestamps for "Date Discovered". If missing, use "UNKNOWN".
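
The chunking in Stage 2 could be approximated as follows. This is a minimal sketch under two assumptions: a rough estimate of 4 characters per token (a heuristic, not a real tokenizer) and entries separated by the `---` lines that the attached script writes. The names `estimateTokens` and `chunkByTokens` are hypothetical.

```javascript
// Hypothetical sketch: group extracted entries into chunks under a token budget.
// ASSUMPTION: ~4 characters per token; swap in a real tokenizer for accuracy.
const SEPARATOR = '\n\n---\n\n';
const estimateTokens = (text) => Math.ceil(text.length / 4);

function chunkByTokens(text, maxTokens = 50_000) {
  const chunks = [];
  let current = [];
  let used = 0;
  for (const entry of text.split(SEPARATOR)) {
    const cost = estimateTokens(entry);
    // Start a new chunk when adding this entry would exceed the budget.
    if (current.length && used + cost > maxTokens) {
      chunks.push(current.join(SEPARATOR));
      current = [];
      used = 0;
    }
    current.push(entry);
    used += cost;
  }
  if (current.length) chunks.push(current.join(SEPARATOR));
  return chunks;
}
```

Splitting on entry boundaries keeps each message intact. A single entry larger than the budget still ends up as its own oversized chunk, so very long messages would need an extra split.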

STAGE 3: SEMANTIC AUDIT & CONSOLIDATION

  1. AGGREGATE: Collect all existing **/memories/*.md files.
  2. MERGE: If a new gotcha overlaps with an existing one, merge them into a single, high-fidelity entry.
  3. PROMOTION: If a rule is applicable monorepo-wide, move it to root memories/.
  4. PRUNING: Remove generic/non-technical advice.

STAGE 4: ATOMIC FILE SYSTEM UPDATE (SORTING)

  1. Write to [scope]/memories/[category]-gotchas.md.
  2. Apply STRICT SORTING:
  • Header & Description at the top.
  • All "Date Discovered: UNKNOWN" entries first.
  • Dated entries sorted chronologically: OLDEST to NEWEST.
  3. FORMATTING: Use ## 🚨 [Gotcha Name] (H2).
  4. CLEANUP: Delete files that became empty and remove their references from GEMINI.md.
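
The ordering rule in Stage 4 can be sketched as a sort key. This is a minimal sketch, assuming each entry exposes its "Date Discovered" value as a string; the names `dateKey` and `sortEntries` and the `{ name, date }` entry shape are hypothetical.

```javascript
// Hypothetical sketch of the Stage 4 ordering: UNKNOWN first, then oldest to newest.
// ASSUMPTION: each entry is { name, date }, where date is the "Date Discovered" text.
function dateKey(dateText) {
  if (!dateText || dateText.includes('UNKNOWN')) return 0;
  const ms = Date.parse(dateText);
  return Number.isNaN(ms) ? 0 : ms; // unparseable dates are treated like UNKNOWN
}

function sortEntries(entries) {
  // Array.prototype.sort is stable, so entries with equal keys keep their original order.
  return [...entries].sort((a, b) => dateKey(a.date) - dateKey(b.date));
}
```

Mapping UNKNOWN to 0 places those entries ahead of every real timestamp, which matches the "UNKNOWN entries first" rule without a separate pass.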

STAGE 5: INDEXING & FINAL CLEANUP

  1. Ensure all memories/*.md are linked in the local GEMINI.md.
  2. DELETE all temporary scripts and text extracts created in Stage 1.
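
The link check in Stage 5 might look like the sketch below. It assumes a memory file counts as linked when its relative path appears anywhere in the GEMINI.md text, which is a simplification; the name `findUnlinked` is hypothetical.

```javascript
// Hypothetical sketch: report memory files that the local GEMINI.md does not mention.
// ASSUMPTION: a file is "linked" if its relative path appears anywhere in the index text.
function findUnlinked(indexText, memoryFiles) {
  return memoryFiles.filter((file) => !indexText.includes(file));
}
```

In practice the index text would come from reading GEMINI.md and the file list from listing the neighbouring memories/ directory; an empty result means the index is complete.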

OUTPUT:

Table of consolidated gotchas and confirmation of standardized file structure.

Attachments

tools/memories
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import readline from 'node:readline';
import { fileURLToPath } from 'node:url';

const ROOT = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..');
const CHATS = path.join(os.homedir(), '.gemini/tmp/bits/chats');
const OUT = path.join(ROOT, '.tmp/extracted_chats');

// mine: stream every JSONL chat log and write one plain-text extract per session.
const mine = async () => {
  fs.mkdirSync(OUT, { recursive: true });
  for (const f of fs.readdirSync(CHATS).filter(n => n.endsWith('.jsonl'))) {
    const rl = readline.createInterface({ input: fs.createReadStream(path.join(CHATS, f)), crlfDelay: Infinity });
    let start = 'UNKNOWN', entries = [];
    for await (const l of rl) {
      try {
        const r = JSON.parse(l);
        if (start === 'UNKNOWN') start = r.startTime || r.timestamp || start;
        if (r.type === 'user' || r.type === 'gemini') {
          const txt = (Array.isArray(r.content) ? r.content.map(i => i.text || '').join('') : r.content || '').trim();
          if (txt) entries.push(`ROLE: ${r.type}\n${txt}`);
        }
      } catch { /* skip blank or malformed JSONL lines */ }
    }
    if (entries.length) fs.writeFileSync(path.join(OUT, `${path.basename(f, '.jsonl')}.txt`), `START: ${start}\n\n${entries.join('\n\n---\n\n')}\n`);
  }
};

// sort: walk the repo and reorder gotcha entries (UNKNOWN dates first, then oldest to newest).
const sort = async (dir = ROOT) => {
  for (const e of fs.readdirSync(dir, { withFileTypes: true })) {
    const p = path.join(dir, e.name);
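    if (e.isDirectory()) {
      // Recurse into subdirectories, skipping dependency, VCS, and build output folders.
      if (!['node_modules', '.git', 'dist'].includes(e.name)) await sort(p);
    } else if (e.isFile() && (e.name.endsWith('-gotchas.md') || (path.basename(dir) === 'memories' && e.name.endsWith('.md')))) {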
      const c = fs.readFileSync(p, 'utf8'), blocks = [], head = [];
      let cur = null;
      for (const l of c.split('\n')) {
        if (l.startsWith('## 🚨')) {
          if (cur) blocks.push(cur);
          cur = { l: [l], d: 0 };
        } else if (cur) {
          cur.l.push(l);
          if (l.includes('**Date Discovered:**')) {
            const d = l.split('**Date Discovered:**')[1].trim();
            cur.d = d.includes('UNKNOWN') ? 0 : new Date(d).getTime() || 0;
          }
        } else head.push(l);
      }
      if (cur) blocks.push(cur);
      if (blocks.length) {
        const out = head.join('\n').trim() + '\n\n' + blocks.sort((a, b) => a.d - b.d).map(b => b.l.join('\n').trim()).join('\n\n');
        // Anchor the heading match to whole lines so "### 🚨" or inline mentions are not split.
        fs.writeFileSync(p, out.replace(/^(## 🚨 .*)$/gm, '\n$1\n').replace(/\n{3,}/g, '\n\n').trim() + '\n');
      }
    }
  }
};

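const [cmd] = process.argv.slice(2);
// Report failures and exit non-zero instead of leaving a rejected promise unhandled.
const fail = (err) => { console.error(err); process.exitCode = 1; };
if (cmd === 'mine') mine().catch(fail);
else if (cmd === 'sort') sort().catch(fail);
else console.log('Usage: node tools/memories.mjs [mine|sort]');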