The cloud workspace for your data

Understand your data the moment you upload it.

Upload a spreadsheet or connect a database or API. KatCore labels every column, scores data quality, and answers your questions in natural language, all in your browser. No SQL, no setup.

app.katcore.io / sources
customers_q3.csv · 12,402 rows · 2.4 MB · CSV · v3 · Ready · Grade B · 87 / 100
Column · Type · Auto description
email · PII · Customer email address, unmasked, flagged for exposure
signup_date · Date · Account creation date, parsed as ISO-8601
mrr · Currency · Monthly recurring revenue, USD
region · Region · Sales territory, 6 distinct values
What was the growth trend of SaaS subscriptions in Q3 vs Q2?
Kat

SaaS subscriptions grew 14.2% in Q3, driven by the Enterprise tier (+22% seats). Source: sales_report.csv

Drop in any format
CSV · JSON · XLSX · PDF · TXT · DOCX · MD · Parquet · HTML
The flow

From messy file to trusted answers — in minutes.

Bring data in, understand and trust it, keep it fresh, and walk away with a report. Four steps, one workspace, zero pipelines to build.

01 Bring any data in

Drop it in. KatCore handles the rest.

Drag and drop your files with real per-file progress — or pull straight from a public URL, a REST API, or a SQL database. Parsing, cleaning, and storage happen for you.

File · URL · API · Database
Drop files to upload
CSV · JSON · XLSX · PDF · TXT · DOCX · MD · Parquet · HTML
up to 50 files · 100 MB each
orders_2025.xlsx · 100%
support_logs.pdf · 68%
events.parquet · 34%
02 Understand & trust it · the centerpiece

A 0–100 trust score for any dataset — and the exact fixes.

On ingest, every column is labeled and described automatically. Then the Readiness Audit scores your data across six weighted dimensions — fully explainable, no black box — and hands you a prioritized fix-list where every fix shows the points it recovers. The checklist literally is your score, decomposed.

KatCore reads every column, scores how trustworthy your data is, and answers your questions in natural language — so you can act, not audit.

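AI-Readiness Score: 87 / 100 · Grade B · Good
+9 projected after fixes → 96
Completeness · weight 25% · score 92
Validity · weight 25% · score 78
Uniqueness · weight 15% · score 95
PII Exposure · weight 15% · score 60
Consistency · weight 10% · score 84
Semantic · weight 10% · score 100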
Fix these to improve your score · 4 issues
Unmasked PII in email · 312 unmasked addresses · always critical · +15.0 pts
Outliers detected in mrr · IQR fences 12–840 · 47 rows outside · +8.5 pts
Unparseable dates in signup_date · rows 14, 89, 203 … (+19 more) · +4.0 pts
Inconsistent values in region · "USA" vs "U.S.A." · 2 variants · +1.5 pts
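
To make the idea of a score "across six weighted dimensions" concrete, here is a minimal sketch that assumes the overall score is a plain weighted mean of the dimension scores. That formula, and the names `WEIGHTS`, `readiness_score`, and `recoverable_points`, are illustrative assumptions, not KatCore's documented method.

```python
# Assumption: the overall score is a plain weighted mean of dimension scores.
# Weights are the percentages shown in the sample scorecard.
WEIGHTS = {
    "completeness": 0.25,
    "validity": 0.25,
    "uniqueness": 0.15,
    "pii_exposure": 0.15,
    "consistency": 0.10,
    "semantic": 0.10,
}


def readiness_score(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Return the weighted mean of per-dimension scores (each 0 to 100)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


def recoverable_points(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> dict[str, float]:
    """Points each dimension would add to the total if it were raised to 100."""
    return {name: weights[name] * (100 - scores[name]) for name in weights}


sample = {
    "completeness": 92,
    "validity": 78,
    "uniqueness": 95,
    "pii_exposure": 60,
    "consistency": 84,
    "semantic": 100,
}
print(round(readiness_score(sample), 2))
print(recoverable_points(sample))
```

Under this assumption the per-dimension headroom always sums to the gap between the score and 100, which is one way a fix-list can be "the score, decomposed". With the sample dimension scores, a plain weighted mean comes to 84.15 rather than the displayed 87, so the real aggregation presumably differs from this sketch.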
Suggested cleaning actions · preview → apply
Before
jordan.lee@acme.io
After · PII masking
j••••••@acme.io
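
The before/after preview above can be reproduced with a few lines of code. This is a sketch only: it assumes the mask keeps the first character of the local part and the full domain, and replaces the rest with a fixed run of six bullets. The function name `mask_email` and its parameters are hypothetical.

```python
def mask_email(address: str, mask_char: str = "•", mask_len: int = 6) -> str:
    """Mask an email address, keeping the first character and the domain.

    Assumption: a fixed-length mask is used so the masked value does not
    reveal the length of the original local part.
    """
    local, separator, domain = address.rpartition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local[0]}{mask_char * mask_len}@{domain}"


print(mask_email("jordan.lee@acme.io"))
```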
Which region had the highest churn last quarter?
Kat

EMEA had the highest churn at 6.8%, nearly double the 3.5% global average — concentrated in the SMB segment. Source: customers_q3.csv

03 Keep it fresh

Refresh on a schedule. Skip the pull when nothing changed.

Point KatCore at a URL or API and it re-ingests on a timezone-aware cron schedule. Smart polling caches ETag and Last-Modified, so an unchanged source is skipped as a no-op and data is never duplicated.

exchange_rates · daily 06:00 · 0 6 * * * · UTC · next run in 4h · Unchanged, skipped
inventory_api · every 15 min · */15 * * * * · pulled 2 min ago · New version v8
crm_contacts · hourly · 0 * * * * · Bearer auth · lineage tracked · v23
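
Skipping the pull when nothing changed is standard HTTP conditional-request behavior. The sketch below shows one way it could work with Python's standard library: cached validators go out as `If-None-Match` and `If-Modified-Since`, and a `304 Not Modified` reply is treated as a no-op. The names `Validators` and `fetch_if_changed` are hypothetical, and how KatCore implements polling internally is not stated here.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Validators:
    """Cache validators remembered from the previous successful pull."""
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(cached: Validators) -> dict[str, str]:
    """Build the request headers that make the GET conditional."""
    headers = {}
    if cached.etag:
        headers["If-None-Match"] = cached.etag
    if cached.last_modified:
        headers["If-Modified-Since"] = cached.last_modified
    return headers


def fetch_if_changed(
    url: str,
    cached: Validators,
    opener: Callable = urllib.request.urlopen,
) -> Tuple[Optional[bytes], Validators]:
    """Return (body, new_validators), or (None, cached) if the source is unchanged."""
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with opener(request) as response:
            body = response.read()
            fresh = Validators(
                etag=response.headers.get("ETag"),
                last_modified=response.headers.get("Last-Modified"),
            )
            return body, fresh
    except urllib.error.HTTPError as error:
        if error.code == 304:  # urllib surfaces 304 Not Modified as an HTTPError
            return None, cached
        raise
```

A scheduler would call `fetch_if_changed` on each cron tick and only create a new dataset version when the returned body is not `None`.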
04 Walk away with a report

Every audit becomes a notebook you can edit, re-run, and download.

The audit produces a real Jupyter notebook inside KatCore — an AI-Readiness scorecard, a natural-language narrative of findings with evidence, and a single DuckDB SQL block that applies every fix. Edit cells in place, re-run the audit to watch the score climb, or download the notebook to open anywhere.

Score, findings, evidence, and the exact SQL to fix your data — as a live notebook, not a dead PDF. Bring your own notebooks too.

Quality Report · Upload Notebook
Data_Quality_Report.ipynb
Markdown
AI-Readiness Scorecard
87 / 100 · Grade B · Good

Completeness 92 · Validity 78 · Uniqueness 95 · PII 60 · Consistency 84 · Semantic 100

Markdown

Findings. Column email holds 312 unmasked addresses (critical). mrr shows 47 outliers beyond the IQR fence [12, 840]. signup_date has 22 unparseable values at rows 14, 89, 203…

DuckDB SQL · remediation
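-- one block applies every fix
CREATE TABLE customers_clean AS
SELECT
  mask_email(email) AS email,                                  -- mask PII
  try_strptime(signup_date, '%Y-%m-%d') AS signup_date,        -- unparseable dates become NULL
  mrr,                                                         -- keep the revenue column in the clean table
  CASE WHEN region IN ('U.S.A.') THEN 'USA' ELSE region END AS region  -- unify variants
FROM customers_q3
WHERE mrr BETWEEN 12 AND 840;                                  -- drop rows outside the IQR fences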
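
For readers curious what a notebook with a scorecard, findings, and a SQL cell looks like on disk, here is a minimal sketch that assembles an nbformat 4 document with the standard library. The function `build_report` and its arguments are hypothetical. How KatCore generates or executes its notebooks is not described, so the code cell here only carries the SQL text.

```python
import json


def markdown_cell(text: str) -> dict:
    """A Jupyter markdown cell in nbformat 4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source: str) -> dict:
    """An unexecuted Jupyter code cell in nbformat 4 layout."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_report(score: int, grade: str, findings: list[str], fix_sql: str) -> dict:
    """Assemble scorecard, findings, and remediation SQL into one notebook dict."""
    findings_md = "\n".join(f"- {finding}" for finding in findings)
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            markdown_cell(f"# AI-Readiness Scorecard\n\n{score} / 100 · Grade {grade}"),
            markdown_cell(f"## Findings\n\n{findings_md}"),
            code_cell(fix_sql),
        ],
    }


notebook = build_report(
    87,
    "B",
    ["email holds 312 unmasked addresses", "mrr shows 47 outliers"],
    "SELECT 1;  -- placeholder for the remediation SQL",
)
with open("Data_Quality_Report.ipynb", "w", encoding="utf-8") as handle:
    json.dump(notebook, handle, indent=1, ensure_ascii=False)
```

Because the result is plain JSON in the public notebook format, the downloaded file opens in any Jupyter-compatible tool.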
Cross-dataset chat

Ask across many datasets.

Kat doesn't stop at a single file. It pulls in every dataset your question touches, registers them together, and writes one query that joins them — CTEs, window functions, and all — then answers in natural language with every source cited.

app.katcore.io / chat
sales_2025.csv + returns_2025.csv · 2 datasets · 1 query
Which regions are losing the most to returns, relative to sales this year?
Kat

EMEA has the steepest return rate at 9.4% — $84.2K refunded against $897K in sales, more than double the 4.1% company average. APAC follows at 6.7%. Sources: sales_2025.csv · returns_2025.csv

DuckDB SQL · one query, two datasets
-- one query spans both datasets
WITH sales AS (
  SELECT region, SUM(amount) AS revenue
  FROM sales_2025
  GROUP BY region
),
returns AS (
  SELECT region, SUM(amount) AS refunded
  FROM returns_2025
  GROUP BY region
)
SELECT
  s.region,
  round(r.refunded / s.revenue * 100, 1) AS return_rate
FROM sales s
JOIN returns r USING (region)
ORDER BY return_rate DESC;
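
To try the shape of this query without any setup, the sketch below runs the same SQL against two toy tables using Python's built-in `sqlite3` as a stand-in for DuckDB. The rows are illustrative only: the EMEA totals match the sample answer above, while the APAC amounts are invented so that the rate works out to the quoted 6.7%. Amounts are stored as `REAL` because SQLite would otherwise perform integer division.

```python
import sqlite3

QUERY = """
WITH sales AS (
  SELECT region, SUM(amount) AS revenue FROM sales_2025 GROUP BY region
),
returns AS (
  SELECT region, SUM(amount) AS refunded FROM returns_2025 GROUP BY region
)
SELECT s.region, round(r.refunded / s.revenue * 100, 1) AS return_rate
FROM sales s
JOIN returns r USING (region)
ORDER BY return_rate DESC;
"""


def return_rates() -> list[tuple[str, float]]:
    """Load toy sales and returns tables, then run the joined query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_2025 (region TEXT, amount REAL)")
    conn.execute("CREATE TABLE returns_2025 (region TEXT, amount REAL)")
    # Toy rows; several rows per region so SUM() has something to aggregate.
    conn.executemany(
        "INSERT INTO sales_2025 VALUES (?, ?)",
        [("EMEA", 500000.0), ("EMEA", 397000.0), ("APAC", 500000.0)],
    )
    conn.executemany(
        "INSERT INTO returns_2025 VALUES (?, ?)",
        [("EMEA", 84200.0), ("APAC", 20000.0), ("APAC", 13500.0)],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for region, rate in return_rates():
    print(region, rate)
```

Note that the inner join drops any region that has sales but no returns. Whether that is the desired behavior depends on the question being asked.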
Powered by AI

Real AI, exactly where it helps.

KatCore puts large language models and embeddings to work where they earn it — understanding your data, answering your questions, and cleaning the mess. The numbers you rely on stay deterministic.

It understands your columns

On ingest, an LLM reads and describes every column, and embeddings map what your data means — so search and answers are grounded in your real schema, not guesswork.

LLM + embeddings

It answers in natural language

Ask Kat a question. It reads your intent, finds the right files by meaning, writes and runs the query, then explains the result — with the source file cited.

Intent · retrieval · synthesis

It cleans with context

AI reconciles "USA" vs "U.S.A.", flags values that don't belong, parses messy dates, and detects PII — then writes the exact fix. You preview before anything changes.

LLM + embeddings + NER

Grounded, not guessed

Kat never invents a number. Every answer is computed from your actual data and traces back to the rows and the source file it came from — so you can verify it, not just trust it.

Computed · traceable

Scores, statistics, and duplicate detection stay deterministic and fully explainable — no model guesswork in the numbers you trust.
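
One example of a deterministic statistic is the IQR fence reported for outliers in the sample audit. The sketch below computes Tukey-style fences with the standard library. The 1.5 multiplier and the inclusive quartile method are conventional choices assumed here, not confirmed KatCore settings, and `iqr_fences` and `outliers` are hypothetical names.

```python
import statistics


def iqr_fences(values: list[float], k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Values that fall outside the fences, in their original order."""
    lower, upper = iqr_fences(values, k)
    return [value for value in values if value < lower or value > upper]


mrr = [10, 12, 14, 16, 18, 20, 22, 500]  # toy data, not from a real dataset
print(iqr_fences(mrr))
print(outliers(mrr))
```

The same input always yields the same fences, which is what makes a finding like "47 rows outside" reproducible and easy to verify.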

Why KatCore

Built to be trusted with your data.

Automatic understanding

Every column is labeled and described on ingest. KatCore understands your data before you even ask.

Natural-language answers

Ask a question, get a written answer with the numbers and the source file cited. No SQL required.

Trust score & quality

A 0–100 readiness score with a point-by-point fix-list. Know exactly what's wrong and how to fix it.

Isolated & encrypted

Every workspace is logically separated and encrypted at rest. Your data stays yours.

Built on infrastructure you already trust
Polars DuckDB Cloudflare R2 Railway Postgres
Pricing

Pricing that scales.

Start free and upload your first dataset in minutes. Upgrade when your team grows.

Individual
Free
$0/mo
  • 100 AI credits / mo
  • 1 project · 100 MB storage
  • File upload, semantic & chat
  • Community support
Get Started Free
Most Popular
Small teams
Starter
$10/mo
  • 2,000 AI credits / mo
  • 3 projects · 500 MB storage
  • File + URL sources
  • 50 quality audits · daily schedules
  • Email support
Upgrade to Starter
Growing teams
Growth
$20/mo
  • 10,000 AI credits / mo
  • 10 projects · 2 GB storage
  • All sources — URL, API, database
  • Unlimited quality audits
  • Priority support
Upgrade to Growth

Need higher limits or on-prem deployment? Talk to us.

Go from messy file to trusted answers — today.