csv-analyzer
Automate data profiling with type detection, statistical analysis, and quality flags saved to a Markdown report.
skill install https://www.promptspace.in/skills/csv-analyzer
Deep Data Quality Profiling for CSVs
The CSV Analyzer is a specialized skill that automates the first pass of most data science and engineering workflows: understanding the shape, quality, and hidden patterns of a dataset. Instead of manually writing repetitive pandas or SQL scripts to check for nulls and outliers, you run this skill to audit a CSV file column by column.
What it does
- Smart Type Detection: Goes beyond basic strings/ints to identify emails, URLs, UUIDs, dates, and categorical data.
- Statistical Deep-Dive: Calculates distributions, IQR-based outliers, and skewness for numeric data, alongside high-cardinality analysis for text.
- Data Quality Auditing: Flags mixed types, constant columns, leading/trailing whitespace, and near-duplicate columns.
- Relationship Mapping: Identifies strong Pearson correlations between numeric features to surface potential redundancies.
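To make the first two bullets concrete, here is a minimal sketch of pattern-based type detection and IQR outlier flagging using only the Python standard library. It is not the skill's actual implementation: the function names (`detect_type`, `column_type`, `iqr_outliers`), the regular expressions, and the conventional 1.5 x IQR fence are all assumptions chosen for illustration.

```python
import re
import statistics
from collections import Counter
from datetime import date

# Deliberately loose patterns; assumptions for this sketch, not the skill's rules.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
URL_RE = re.compile(r"^https?://\S+$", re.IGNORECASE)
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def detect_type(value):
    """Classify one raw CSV cell into a coarse semantic type."""
    v = value.strip()
    if v == "":
        return "empty"
    # Try numeric parsers first, from strictest to loosest.
    for label, parser in (("integer", int), ("float", float)):
        try:
            parser(v)
            return label
        except ValueError:
            pass
    if EMAIL_RE.match(v):
        return "email"
    if URL_RE.match(v):
        return "url"
    if UUID_RE.match(v):
        return "uuid"
    try:
        date.fromisoformat(v)  # ISO dates such as 2024-01-31
        return "date"
    except ValueError:
        return "string"


def column_type(values):
    """Return the most common non-empty cell type in a column."""
    kinds = Counter(detect_type(v) for v in values)
    kinds.pop("empty", None)
    return kinds.most_common(1)[0][0] if kinds else "empty"


def iqr_outliers(numbers):
    """Return values outside the 1.5 * IQR fences around the quartiles."""
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in numbers if x < low or x > high]
```

For example, `iqr_outliers([10, 12, 11, 13, 12, 11, 10, 12, 100])` flags only `100`. A real profiler would likely also sample rows and report the share of cells matching each type rather than a single majority label.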
Why use this skill?
While a standard LLM can look at a few rows of data, it cannot accurately calculate statistics or scan 10,000+ rows for anomalies without help. This skill leverages Bash and file-system tools to process large datasets reliably, generating a structured CSV_REPORT.md that serves as a permanent documentation artifact for your project. The output provides actionable recommendations for data cleaning (imputation, deduplication, etc.) that you can hand off to your agent or a data team.
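As a rough illustration of how a scan can become a Markdown artifact, the sketch below counts missing values per column and renders them as a Markdown table. The report layout, the helper names (`profile_missing`, `render_report`), and the choice to treat whitespace-only cells as missing are assumptions; the real CSV_REPORT.md format is not documented here.

```python
import csv
import io


def profile_missing(csv_text):
    """Count empty or whitespace-only cells per column.

    Returns {column_name: (missing_count, total_rows)}.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in (reader.fieldnames or [])}
    total = 0
    for row in reader:
        total += 1
        for name in counts:
            # Short rows yield None for absent fields; treat those as missing.
            if (row.get(name) or "").strip() == "":
                counts[name] += 1
    return {name: (missing, total) for name, missing in counts.items()}


def render_report(stats):
    """Render the missing-value stats as a small Markdown document."""
    lines = [
        "# CSV Report",
        "",
        "| Column | Missing | Missing % |",
        "|---|---|---|",
    ]
    for name, (missing, total) in stats.items():
        pct = 100 * missing / total if total else 0.0
        lines.append(f"| {name} | {missing} | {pct:.1f}% |")
    return "\n".join(lines) + "\n"


sample = "id,email\n1,a@x.com\n2,\n3,  \n"
report = render_report(profile_missing(sample))
```

Writing `report` to a file named `CSV_REPORT.md` would then be a single `pathlib.Path("CSV_REPORT.md").write_text(report)` call. For files on disk, passing an open file handle to `csv.DictReader` instead of `io.StringIO` keeps memory use flat, since rows are read one at a time.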
Use cases
- Generate comprehensive data quality reports for new datasets automatically.
- Detect outliers and missing value clusters before training ML models.
- Identify redundant columns using Pearson correlation analysis.
- Surface data integrity issues like mixed types or hidden whitespace.
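The last two use cases can be sketched in a few lines of plain Python. This is an illustrative approximation, not the skill's own code: the flag names, the `quality_flags` and `redundant_pairs` helpers, and the 0.95 correlation threshold are hypothetical choices.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Returns None when either list has zero variance (undefined correlation).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return None
    return cov / (sx * sy)


def redundant_pairs(columns, threshold=0.95):
    """List column-name pairs whose absolute correlation meets the threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if r is not None and abs(r) >= threshold:
            pairs.append((a, b))
    return pairs


def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def quality_flags(cells):
    """Return integrity flags for one column of raw string cells."""
    flags = []
    non_empty = [c for c in cells if c.strip() != ""]
    if len(set(non_empty)) == 1:
        flags.append("constant")
    if any(c != c.strip() for c in non_empty):
        flags.append("whitespace")
    # A column mixing numeric and non-numeric cells counts as mixed types.
    if len({_is_number(c.strip()) for c in non_empty}) > 1:
        flags.append("mixed_types")
    return flags
```

For instance, `quality_flags(["1", "2", "x"])` returns `["mixed_types"]`, and a column that is an exact multiple of another shows up in `redundant_pairs`. Pearson correlation only captures linear relationships, so a high threshold surfaces near-duplicates but will miss nonlinear redundancy.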