diff --git a/RULES.md b/RULES.md
index 8a08d64..afcfe23 100644
--- a/RULES.md
+++ b/RULES.md
@@ -18,6 +18,25 @@ Dump in rules that I come across - could be anything, specific or general.
 14. Feed AI agents currated context that provides the right info without overwhelming them.
 15. Repos might contain contradictory information. AI agents will flag that and ask before proceeding to get clarification.
 
+# Dev vs. Production
+
+All projects are developed and tested in a dev VM, then pushed to production via Git.
+
+1. Dev work
+- build features
+- test
+2. Push to GitHub
+- commit
+- push
+3. Deploy to prod
+- git pull
+- restart
+4. Validate
+5. Tag release
+- v1.2.0
+
+Alternatively: create a branch, spin up a VM and work on it, test, open a PR, and merge to main.
+
 # AI Processes
 
 0. Prep: Cleans the working tree by analyzing any uncommitted work and doing the right thing with it (stash or commit). Also runs the entire current test suite and fixes any failures it encounters.
diff --git a/specs/2_systems_s/architecture.md b/specs/2_systems_s/architecture.md
index 70f3ba7..7a06a78 100644
--- a/specs/2_systems_s/architecture.md
+++ b/specs/2_systems_s/architecture.md
@@ -1 +1,185 @@
-architecture.md
+# **Long-Term Tech Stack**
+
+This is designed to be:
+
+* **Simple**
+* **Durable**
+* **Modular**
+* **Vendor-independent**
+* **AI-native**
+* **Low-maintenance**
+* **Future-proof**
+
+## Core Backend Layer (Stays Python Forever)
+
+### *FastAPI* {#fastapi}
+
+Your main backend framework indefinitely.
+
+* Lightweight
+* Fast
+* Async
+* Great docs
+* Perfect for APIs
+* Perfect for JSON-store backends
+
+### *Pydantic (v2)* {#pydantic-(v2)}
+
+For:
+
+* Validation
+* Schema definition
+* Type enforcement
+* Parsing & serializing JSON
+
+This gives you the “AI-Native Data Object” foundation.
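A minimal sketch of what the Pydantic (v2) bullets above could look like in practice, assuming a hypothetical `Customer` record. The class name and its fields are illustrative only and do not come from the spec. It shows validation, type enforcement, and JSON parsing/serializing in one place.

```python
from pydantic import BaseModel, Field, ValidationError


class Customer(BaseModel):
    """Hypothetical customer record; field names are assumptions for illustration."""

    customer_id: str
    name: str = Field(min_length=1)
    tags: list[str] = []


# Parse and validate raw JSON in one step.
raw = '{"customer_id": "c-001", "name": "Ada", "tags": ["pilot"]}'
customer = Customer.model_validate_json(raw)

# Serialize back to JSON, e.g. before writing the object to disk.
as_json = customer.model_dump_json()

# Invalid input raises ValidationError instead of producing a bad object.
try:
    Customer.model_validate_json('{"customer_id": "c-002", "name": ""}')
    rejected = False
except ValidationError:
    rejected = True
```

The same model class can then serve as both the on-disk schema and the FastAPI request/response type, which is one reading of the "AI-Native Data Object" idea.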
+
+### *Python typing* {#python-typing}
+
+Static typing \+ Pydantic gives you:
+
+* Safety
+* Predictable data models
+* AI-generated UIs
+* Clean code
+
+### *JSON object datastore on disk* {#json-object-datastore-on-disk}
+
+Your architecture:
+
+`/srv/data/json-db/lod/customers/*.json`
+
+`/srv/data/json-db/lod/files/`
+
+`/srv/data/json-db/bc/hubs/*.json`
+
+This becomes your **source of truth**, and it’s timeless:
+
+* No SQL migration pain
+* AI-friendly
+* Human-readable
+* Portable across servers
+* Works with agents
+* Simple backups
+* Easy to replicate across clouds
+
+### *Systemd services* {#systemd-services}
+
+For:
+
+* Backend uptime
+* Simple restarts
+* Zero Docker complexity
+* Predictability
+
+### *Caddy* {#caddy}
+
+One global reverse proxy.
+
+* SSL termination
+* Routing
+* Static file serving
+* Stable
+* Minimal overhead
+
+This will last you a decade.
+
+## Frontend Layer (Optional TS Later) {#frontend-layer-(optional-ts-later)}
+
+For now:
+
+* Simple JS
+* Plain HTML \+ small modals
+* Works fine for LOD
+
+Long term (as projects expand):
+
+* **React \+ TypeScript**
+* **HTMX** for hybrid pages
+* Shared component library
+* Shared TypeScript types autogenerated from Pydantic
+* Local static hosting (`/srv/apps/ui/…`)
+
+You only bring in TS when:
+
+* The UIs become larger
+* You need better autocompletion
+* Indexing dashboards get complex
+* You build cross-app shared components
+
+There is *no immediate need*.
+
+## AI Layer (Strong Long-Term Direction) {#ai-layer-(strong-long-term-direction)}
+
+### *Python for AI operations* {#python-for-ai-operations}
+
+Perfect for:
+
+* Embeddings
+* Chunking
+* Vectorization
+* STT
+* LLM calls
+* Agents
+* Parsing inbound data
+* Publishing pipelines
+
+### *Agentic pipelines* {#agentic-pipelines-(future)}
+
+* Chunking → JSONL files
+* Vectorization (Embeddings) → Python
+  * OpenAI embeddings
+  * Jina embeddings
+  * Cohere
+  * Local embeddings later (GGUF)
+  * SentenceTransformers
+* Python is the correct home for:
+  * running the embedder
+  * transforming the JSONL chunks
+  * updating embeddings
+  * building vector stores
+* MCP
+* OpenAI’s new agent tools
+* Event-driven systems
+* Scheduled analytical tasks (weeklies)
+* Lightweight Database for Metadata \+ Embeddings
+  * SQLite \+ DuckDB
+  * Qdrant
+  * LanceDB
+  * Weaviate (local mode)
+  * ChromaDB
+* The simplest long-term option I recommend for you:
+  * DuckDB or SQLite for metadata
+  * LanceDB or Qdrant for vectors
+  * Why?
+    * Very fast
+    * No server needed
+    * Easy to copy/backup
+    * Python-native
+    * AI-friendly
+    * Perfect with JSONL chunk pipelines
+* Your JSONL holds the raw chunks; your small local DB holds:
+  * chunk\_id
+  * metadata (source, tags, time ranges)
+  * vector embeddings
+  * up-to-date indexes
+* RAG layer \- Python
+* Consider agentic RAG sitting in this layer \- see the index main product development document: https://docs.google.com/document/d/1GedfgKY78INGREJ5lgNSOv0OIAMjm0Lj268I40u0JuY/edit?tab=t.0\#heading=h.ksb8pdnacvtr
+
+### *Python scripts & CLIs* {#python-scripts-&-clis}
+
+For:
+
+* Import/export
+* Data normalization
+* Periodic JSON cleanup
+* Building indexes
+* Summaries
+* AI-native publishing
+
+These will accumulate value in your system over the years.
+
+### *API-Driven* {#api-driven}
+
+Even inside your system, set things up so the pieces talk to each other via APIs rather than through tight coupling, so that modules can be upgraded, replaced, or outsourced.
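The "Chunking → JSONL files" and "SQLite for metadata" items in the Agentic pipelines list could fit together roughly as follows. This is a sketch using only the standard library. The function names, the fixed-width chunking, and the `chunks` table layout are assumptions for illustration. Embedding and the vector store are left out because they depend on whichever model and store is chosen.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def chunk_text(text: str, size: int = 40) -> list[str]:
    """Naive fixed-width chunking; real pipelines usually split on sentences or tokens."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def write_chunks(text: str, source: str, jsonl_path: Path, db_path: Path) -> int:
    """Write raw chunks to JSONL and their metadata to SQLite; return the chunk count."""
    chunks = chunk_text(text)

    # The JSONL file holds the raw chunks, one JSON object per line.
    with jsonl_path.open("w", encoding="utf-8") as f:
        for i, chunk in enumerate(chunks):
            f.write(json.dumps({"chunk_id": f"{source}-{i}", "text": chunk}) + "\n")

    # The small local DB holds chunk_id plus metadata (hypothetical columns).
    con = sqlite3.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS chunks "
            "(chunk_id TEXT PRIMARY KEY, source TEXT, n_chars INTEGER)"
        )
        con.executemany(
            "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?)",
            [(f"{source}-{i}", source, len(c)) for i, c in enumerate(chunks)],
        )
        con.commit()
    finally:
        con.close()
    return len(chunks)


# Demo in a throwaway directory.
workdir = Path(tempfile.mkdtemp())
count = write_chunks("x" * 100, "demo", workdir / "chunks.jsonl", workdir / "meta.db")
```

Keeping the raw text in JSONL and only IDs and metadata in SQLite means either side can be rebuilt from the other's keys, which fits the "easy to copy/backup" reasoning in the list.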
+
diff --git a/specs/2_systems_s/environments.md b/specs/2_systems_s/environments.md
index 3d521ab..512525b 100644
--- a/specs/2_systems_s/environments.md
+++ b/specs/2_systems_s/environments.md
@@ -1 +1,22 @@
-environments.md
\ No newline at end of file
+# **Environment and Infrastructure Layer**
+
+Dev and production machines, with a warm backup of the production machines available if possible.
+
+### *Sovereign Edge Center*
+
+* Static IP
+* Simple Ubuntu instance
+* NGINX or Caddy \+ FastAPI apps
+* JSON datastores
+
+### *Backup strategy*
+
+* QNAP → nightly rsync pull
+* Secondary QNAP for redundancy
+* JSON makes backup/restore trivial
+
+### *Power resilience (home infrastructure)*
+
+* UPS
+* Solar \+ generator integration
+* Redundant internet (Starlink)
diff --git a/specs/2_systems_s/system.md b/specs/2_systems_s/system.md
index 0db7194..c9a5c31 100644
--- a/specs/2_systems_s/system.md
+++ b/specs/2_systems_s/system.md
@@ -1 +1 @@
-system.md
\ No newline at end of file
+system.md - A system runs inside an environment and is project-specific: the combination of code, data, configuration, and infrastructure running together to produce behavior.
\ No newline at end of file
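One point that may be worth pinning down for the JSON datastores and the nightly rsync pull: a backup that copies files as they sit on disk could pick up a record that is only half written. A common way to reduce that risk is to write to a temporary file and rename it into place. The sketch below assumes one JSON file per record. `save_record` and `load_record` are hypothetical helper names, not part of the spec.

```python
import json
import os
import tempfile
from pathlib import Path


def save_record(directory: Path, record_id: str, record: dict) -> Path:
    """Write one record as <record_id>.json via a temp file and an atomic rename."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{record_id}.json"
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f, indent=2, sort_keys=True)
            f.flush()
            os.fsync(f.fileno())
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


def load_record(directory: Path, record_id: str) -> dict:
    """Read one record back from disk."""
    return json.loads((directory / f"{record_id}.json").read_text(encoding="utf-8"))


# Demo against a throwaway directory rather than the real /srv/data tree.
root = Path(tempfile.mkdtemp()) / "customers"
save_record(root, "c-001", {"name": "Ada"})
loaded = load_record(root, "c-001")
```

With this pattern readers and backups see either the old version of a record or the new one, and any leftover `*.tmp` files can be excluded from the rsync job.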