rag-architect
Design, debug, and optimize production RAG systems with expert architecture, hybrid search, and grounding strategies.
skill install https://www.promptspace.in/skills/rag-architect
Advanced RAG System Architecture & Debugging
Designing a production-ready Retrieval-Augmented Generation (RAG) system requires more than just a vector database and a prompt. The RAG Architect skill provides a developer-centric framework for building, hardening, and troubleshooting complex retrieval stacks, moving beyond generic implementations to high-performance architecture.
What it does
This skill acts as a senior systems architect for your AI pipeline. It analyzes ingestion workflows, document parsing, chunking strategies, embedding selection, and vector store performance. Whether you are building from scratch or fixing a broken implementation, it applies a rigorous, evidence-based methodology to ensure your agent stays grounded and accurate.
Supported Capabilities
- Architecture Design: Decisions for hybrid search, reranking, and context packing tailored to your specific corpus (Legal, Code, Product Docs, etc.).
- Truth-First Debugging: Systematic isolation of failures across the pipeline, from bad parsing to stale indexes and tenant leakage.
- Infrastructure Selection: Unbiased tradeoff analysis for vector databases (pgvector, Qdrant, Milvus), embedding models, and rerankers.
- Production Hardening: Implementing multi-tenant isolation, citation grounding, and incremental re-indexing strategies.
- Evaluation Frameworks: Establishing metrics for recall@k, precision, and faithfulness to ensure changes are data-driven rather than anecdotal.
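As a rough illustration of the retrieval metrics named above, recall@k and precision@k can be computed from a ranked list of retrieved document IDs and a labeled set of relevant IDs. This is a minimal sketch, not the skill's own implementation: the function names and the sample IDs are hypothetical, and precision@k here divides by k (one common convention). Faithfulness is omitted because it needs a judgment of the generated answer against its sources rather than a simple set comparison.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k slots filled by relevant documents (divides by k)."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


# Hypothetical labeled query: ranked retriever output and ground-truth relevant set.
retrieved_ids = ["d3", "d1", "d7", "d2", "d9"]
relevant_ids = {"d1", "d2", "d4"}

print(recall_at_k(retrieved_ids, relevant_ids, k=3))     # 1 of 3 relevant docs found
print(precision_at_k(retrieved_ids, relevant_ids, k=5))  # 2 of 5 slots are relevant
```

Averaging these values over a fixed set of labeled queries before and after a pipeline change is one way to keep comparisons data-driven rather than anecdotal.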
Why use this skill?
Standard LLM prompts often treat "bad answers" as model hallucinations. This skill identifies when the problem is actually a metadata filter mismatch, poor chunking semantics, or an inefficient reranker. It helps you reduce latency and cost by optimizing the weakest stage of your pipeline rather than over-relying on expensive long-context windows.
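One way to picture this kind of stage-by-stage isolation: given a document that should have supported the answer, check where it drops out of the pipeline. The sketch below is hypothetical throughout (the stage names, the `first_failing_stage` helper, and the sample IDs are illustrative assumptions, not part of the skill), and it assumes you can log the document IDs that survive each stage.

```python
from typing import Optional, Sequence, Tuple

# Each stage is (stage_name, document IDs that survived that stage), in pipeline order.
Stage = Tuple[str, Sequence[str]]


def first_failing_stage(expected_id: str, stages: Sequence[Stage]) -> Optional[str]:
    """Return the first stage where the expected document is missing.

    Returns None if the document survives every stage, which suggests the
    problem lies in prompt assembly or generation rather than retrieval.
    """
    for name, surviving_ids in stages:
        if expected_id not in surviving_ids:
            return name
    return None


# Hypothetical trace of a single query through the retrieval pipeline.
trace = [
    ("indexed", ["a", "b", "c"]),
    ("after_metadata_filter", ["a", "c"]),  # "b" is dropped by the filter
    ("top_k_candidates", ["a"]),            # "c" is not retrieved
    ("after_rerank", ["a"]),
]

print(first_failing_stage("b", trace))  # after_metadata_filter
print(first_failing_stage("c", trace))  # top_k_candidates
print(first_failing_stage("a", trace))  # None
```

A document lost at the filter stage points to a metadata mismatch, while one lost at the candidate stage points to chunking or embedding quality; neither is a model hallucination.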
Use cases
- Construct hybrid search pipelines combining semantic and keyword retrieval
- Reduce hallucination risk by enforcing strict source grounding protocols
- Optimize indexing strategies for low-latency document retrieval at scale
- Architect multi-stage re-ranking workflows to improve answer precision
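To make the hybrid search use case concrete, here is a minimal sketch that merges a semantic ranking and a keyword ranking with reciprocal rank fusion (RRF). The listing does not say which fusion method the skill recommends, so RRF is an assumption chosen for illustration; the function name, the constant k=60 (a commonly used default), and the document IDs are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a vector retriever and a keyword (e.g. BM25) retriever.
semantic_hits = ["d1", "d2", "d3"]
keyword_hits = ["d3", "d1", "d4"]

print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
# ['d1', 'd3', 'd2', 'd4']
```

Because RRF uses only ranks, it avoids having to normalize vector similarity scores against keyword scores. The fused list would typically feed a reranking stage before context packing.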
Known limitations
- Cannot perform the actual vector DB migration or infrastructure provisioning.
- Effectiveness is limited without access to specific log samples or retrieval metrics.
- Does not generate frontend UI.