Technical Knowledge Consolidation Pipeline
Run this prompt to clean up documentation debt. It merges and sorts technical notes from chat history into a clear, organized knowledge base.
How to use
Use this periodically (e.g., weekly) to maintain the quality of your memory files.
Prompt
ROLE: Senior Principal Software Architect (Monorepo & AI-Infrastructures)
CONTEXT:
- REPOSITORY: Complex monorepo using Turborepo.
- SOURCE DATA: raw JSONL conversation history in /Users/cebreus/.gemini/tmp/bits/chats.
- TARGET: Hierarchical memories/*.md files and GEMINI.md indexes.
TASK: REPRODUCIBLE KNOWLEDGE EXTRACTION PIPELINE
Execute a multi-stage pipeline to extract, consolidate, and sort technical gotchas.
STAGE 1: TECHNICAL PREPARATION (MANDATORY)
- You MUST NOT read JSONL files directly via standard read tools (too large).
- Write a Node.js helper script (tools/extract_history.mjs) that:
  - Uses readline to stream JSONL.
  - Extracts only user and gemini text content.
  - Captures the startTime from the session header or the first timestamp.
  - Outputs a clean text file for each session.
- Run this script over all history files and store temporary extracts.
STAGE 2: SEMANTIC ANALYSIS & CHUNKING
- Process extracted text in chunks (max 50k tokens) using sub-agents.
- Identify "GOTCHAS": Non-obvious technical traps, dependency leaks, or build quirks.
- ZERO HALLUCINATION: Extract exact timestamps for "Date Discovered". If missing, use "UNKNOWN".
STAGE 3: SEMANTIC AUDIT & CONSOLIDATION
- AGGREGATE: Collect all existing **/memories/*.md files.
- MERGE: If a new gotcha overlaps with an existing one, merge them into a single, high-fidelity entry.
- PROMOTION: If a rule is applicable monorepo-wide, move it to root memories/.
- PRUNING: Remove generic/non-technical advice.
STAGE 4: ATOMIC FILE SYSTEM UPDATE (SORTING)
- Write to [scope]/memories/[category]-gotchas.md.
- Apply STRICT SORTING:
  - Header & Description at the top.
  - All "Date Discovered: UNKNOWN" entries first.
  - Dated entries sorted chronologically: OLDEST to NEWEST.
- FORMATTING: Use ## 🚨 [Gotcha Name] (H2).
- CLEANUP: Delete files that became empty and remove their references from GEMINI.md.
STAGE 5: INDEXING & FINAL CLEANUP
- Ensure all memories/*.md are linked in the local GEMINI.md.
- DELETE all temporary scripts and text extracts created in Stage 1.
OUTPUT:
Table of consolidated gotchas and confirmation of standardized file structure.
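For a quick check of the Stage 5 indexing rule once the pipeline has run (a note for the person running the prompt, not part of the prompt text), a sketch along these lines can list memory files that an index does not mention. The findUnlinkedMemories name and the sample file names are hypothetical, and the sketch assumes a file counts as linked when its memories/... path appears anywhere in GEMINI.md.

```javascript
// Hypothetical helper: report memory files that a GEMINI.md index does not mention.
// ASSUMPTION: a file counts as "linked" when the index text contains its
// relative path (for example memories/build-gotchas.md) anywhere in it.
function findUnlinkedMemories(indexText, memoryFiles) {
  return memoryFiles.filter((file) => !indexText.includes(`memories/${file}`));
}

// Example with made-up file names and index content.
const indexText = [
  '# Project memory index',
  '- [Build gotchas](memories/build-gotchas.md)',
].join('\n');
const memoryFiles = ['build-gotchas.md', 'deps-gotchas.md'];

console.log(findUnlinkedMemories(indexText, memoryFiles)); // [ 'deps-gotchas.md' ]
```

In practice the two arguments could come from fs.readFileSync on the local GEMINI.md and fs.readdirSync on the sibling memories directory.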
Attachments
tools/memories
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import readline from 'node:readline';
import { fileURLToPath } from 'node:url';

// Repository root: this script is expected to live one level below it (tools/).
const ROOT = path.resolve(path.dirname(fileURLToPath(import.meta.url)), '..');
// Source chat history (JSONL) and the temporary output directory for extracts.
const CHATS = path.join(os.homedir(), '.gemini/tmp/bits/chats');
const OUT = path.join(ROOT, '.tmp/extracted_chats');

// mine: stream each JSONL session and write a plain-text extract of user/gemini turns.
const mine = async () => {
  fs.mkdirSync(OUT, { recursive: true });
  for (const f of fs.readdirSync(CHATS).filter(n => n.endsWith('.jsonl'))) {
    const rl = readline.createInterface({ input: fs.createReadStream(path.join(CHATS, f)), crlfDelay: Infinity });
    let start = 'UNKNOWN', entries = [];
    for await (const l of rl) {
      try {
        const r = JSON.parse(l);
        // The first record carrying startTime or timestamp sets the session date.
        if (start === 'UNKNOWN') start = r.startTime || r.timestamp || start;
        if (r.type === 'user' || r.type === 'gemini') {
          // content is either a string or an array of parts with a text field.
          const txt = (Array.isArray(r.content) ? r.content.map(i => i.text || '').join('') : r.content || '').trim();
          if (txt) entries.push(`ROLE: ${r.type}\n${txt}`);
        }
      } catch {
        // Skip blank or malformed lines instead of aborting the whole session.
      }
    }
    // Only replace the trailing extension, not an earlier ".jsonl" in the name.
    if (entries.length) fs.writeFileSync(path.join(OUT, f.replace(/\.jsonl$/, '.txt')), `START: ${start}\n\n${entries.join('\n\n---\n\n')}\n`);
  }
};

// sort: walk the repo and reorder gotcha blocks in memory files (UNKNOWN first, then oldest to newest).
const sort = async (dir = ROOT) => {
  for (const e of fs.readdirSync(dir, { withFileTypes: true })) {
    const p = path.join(dir, e.name);
    if (e.isDirectory()) {
      // Recurse into subdirectories, skipping dependency and build output folders.
      if (!['node_modules', '.git', 'dist'].includes(e.name)) await sort(p);
    } else if (e.name.endsWith('-gotchas.md') || (dir.endsWith('memories') && e.name.endsWith('.md'))) {
      const c = fs.readFileSync(p, 'utf8'), blocks = [], head = [];
      let cur = null;
      for (const l of c.split('\n')) {
        if (l.startsWith('## 🚨')) {
          // A new H2 gotcha heading closes the previous block.
          if (cur) blocks.push(cur);
          cur = { l: [l], d: 0 };
        } else if (cur) {
          cur.l.push(l);
          if (l.includes('**Date Discovered:**')) {
            const d = l.split('**Date Discovered:**')[1].trim();
            // Sort key: 0 for UNKNOWN or unparseable dates, so they come first.
            cur.d = d.includes('UNKNOWN') ? 0 : new Date(d).getTime() || 0;
          }
        } else head.push(l);
      }
      if (cur) blocks.push(cur);
      if (blocks.length) {
        const out = head.join('\n').trim() + '\n\n' + blocks.sort((a, b) => a.d - b.d).map(b => b.l.join('\n').trim()).join('\n\n');
        // Normalize spacing: one blank line around each gotcha heading (anchored so "### 🚨" lines are left alone).
        fs.writeFileSync(p, out.replace(/^(## 🚨 .*)$/gm, '\n$1\n').replace(/\n{3,}/g, '\n\n').trim() + '\n');
      }
    }
  }
};

const [cmd] = process.argv.slice(2);
if (cmd === 'mine') mine();
else if (cmd === 'sort') sort();
else console.log('Usage: node tools/memories.mjs [mine|sort]');
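
The extracts written by the mine command still have to be cut into chunks for Stage 2 of the prompt (max 50k tokens each). A minimal sketch of one way to do that follows. It assumes a rough estimate of 4 characters per token instead of a real tokenizer, and the chunkSession helper is hypothetical, not part of the script above.

```javascript
// Hypothetical helper: split one extracted session into chunks for Stage 2.
// ASSUMPTION: ~4 characters per token is a rough heuristic, not a real tokenizer.
const CHARS_PER_TOKEN = 4;
// Same separator the mine command writes between entries.
const SEPARATOR = '\n\n---\n\n';

function chunkSession(text, maxTokens = 50_000) {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  const chunks = [];
  let current = '';
  for (const entry of text.split(SEPARATOR)) {
    if (!entry) continue;
    // Hard-split any single entry that exceeds the budget on its own.
    for (let i = 0; i < entry.length; i += maxChars) {
      const piece = entry.slice(i, i + maxChars);
      const candidate = current ? current + SEPARATOR + piece : piece;
      if (current && candidate.length > maxChars) {
        // Adding this piece would overflow the chunk, so start a new one.
        chunks.push(current);
        current = piece;
      } else {
        current = candidate;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Example: three 10-character entries with a 5-token (20-character) budget.
const sample = ['a'.repeat(10), 'b'.repeat(10), 'c'.repeat(10)].join(SEPARATOR);
console.log(chunkSession(sample, 5).length); // 3
```

Splitting on the entry separator keeps whole user and gemini turns together where possible; only an entry larger than the budget is cut mid-text.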