Built to keep your data where it belongs
Savvina AI is self-hosted by design. Your data never leaves your infrastructure. On top of that, every layer of the stack — from the network edge to the query result — enforces strict controls to prevent accidental exposure.
Core principles
Data stays on your infrastructure
Savvina AI runs entirely within your own environment. Your query results, schema metadata, and connection credentials never leave your network.
LLM receives schema, not data
The LLM only ever sees table and column names, optional sample values, and business descriptions. Query results are never sent to any external model.
Read-only by design
The SQL validator enforces a strict allowlist — only SELECT and WITH statements are permitted. INSERT, UPDATE, DELETE, and all DDL are unconditionally rejected at the validation layer before any database connection is opened.
No phone-home, no license server
The Community Edition makes no external calls except to your configured LLM provider API. There is no telemetry, no usage tracking, and no license validation server.
10 layers — network edge to query result
Network Edge Middleware
Every request passes through three checks before any application code runs: ScannerGuard, which blocks exploit-scanner paths such as /wp-admin and /jndi: (Log4Shell probes); CSRF origin checks on mutating requests; and OWASP security headers (nosniff, DENY frames, HSTS, strict CSP).
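The edge checks described above can be sketched as plain functions. This is a minimal illustration, not the actual ScannerGuard code: the blocklist entries, header values, and function names below are assumptions chosen to match the behaviour listed in this section.

```python
from typing import Optional, Set
from urllib.parse import unquote

# Hypothetical blocklist; the real ScannerGuard rules are not shown in this document.
BLOCKED_FRAGMENTS = ("/wp-admin", "jndi:")

# Assumed header values that correspond to nosniff, DENY frames, HSTS, and a strict CSP.
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
}

MUTATING_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def is_scanner_path(raw_path: str) -> bool:
    """Return True if the URL-decoded, lowercased path contains a blocked fragment."""
    path = unquote(raw_path).lower()
    return any(fragment in path for fragment in BLOCKED_FRAGMENTS)


def passes_csrf_origin_check(
    method: str, origin: Optional[str], allowed_origins: Set[str]
) -> bool:
    """Safe methods always pass; mutating methods need an allowed Origin header."""
    if method.upper() not in MUTATING_METHODS:
        return True
    return origin is not None and origin in allowed_origins
```

In a middleware, a request would be rejected when `is_scanner_path` returns True or `passes_csrf_origin_check` returns False, and `SECURITY_HEADERS` would be merged into every response.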
Rate Limiting
SlowAPI enforces per-user-per-connection limits: 20 requests/minute on the chat endpoint, 5 requests/minute on connection tests, and configurable limits on all auth endpoints. Exceeded limits return 429.
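To make the per-user-per-connection behaviour concrete, here is a small standard-library sketch of a sliding-window limiter keyed by (user ID, connection ID). It illustrates the policy only; it is not how SlowAPI is implemented or configured, and the class and method names are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` for each (user, connection) key."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)

    def check(self, user_id: str, connection_id: str) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if the limit is exceeded."""
        now = self.clock()
        hits = self._hits[(user_id, connection_id)]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429
        hits.append(now)
        return 200
```

With `limit=20, window_seconds=60`, the 21st chat request from the same user on the same connection within a minute gets a 429, while the same user on a different connection is counted separately.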
Authentication & Org Isolation
JWT access tokens carry user ID, org ID, and role — validated on every protected route. Every database query is filtered by org_id from the token; a bare lookup by ID is never used. Refresh tokens are stored as SHA-256 hashes; reuse of a revoked token revokes all tokens for that user.
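The refresh-token handling above can be sketched as follows. This is an illustration under stated assumptions: the in-memory dictionary stands in for a database table, the names are hypothetical, and revoking a token when it is exchanged (rotation) is an assumption that the text does not spell out.

```python
import hashlib
import secrets
from typing import Dict, Optional


class RefreshTokenStore:
    """Stores only SHA-256 digests of refresh tokens, never the tokens themselves."""

    def __init__(self) -> None:
        # digest -> {"user_id": str, "revoked": bool}
        self._rows: Dict[str, dict] = {}

    @staticmethod
    def _digest(token: str) -> str:
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._rows[self._digest(token)] = {"user_id": user_id, "revoked": False}
        return token  # the plaintext goes to the client and is not kept server-side

    def rotate(self, token: str) -> Optional[str]:
        """Exchange a valid token for a new one; return None if the token is rejected."""
        row = self._rows.get(self._digest(token))
        if row is None:
            return None
        if row["revoked"]:
            # Reuse of a revoked token: revoke every token for this user.
            for other in self._rows.values():
                if other["user_id"] == row["user_id"]:
                    other["revoked"] = True
            return None
        row["revoked"] = True  # assumed rotation: a token is single-use
        return self.issue(row["user_id"])
```

If a stolen token is replayed after the legitimate client has already rotated it, the replay is rejected and the legitimate client's newer token is revoked as well, forcing a fresh login.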
Credential Encryption
Database passwords and API keys are stored Fernet-encrypted at rest (AES-128-CBC + HMAC-SHA256) and are never logged. Credentials are decrypted into memory only at the moment a query is dispatched; the plaintext is never written to disk or cache.
SQL Query Validation
Every query — LLM-generated, user-edited, or re-sorted — passes a two-layer validator. The base layer accepts only SELECT and WITH (CTEs), rejects all DML/DDL and blocked keywords, and auto-injects a LIMIT clause. Dialect-specific layers add PostgreSQL and MySQL function blocklists (pg_sleep, COPY TO, SLEEP, LOAD DATA, etc.).
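A two-layer check of this kind might look like the sketch below. It is a keyword-based approximation for illustration, not the product's validator: the keyword lists, `DEFAULT_LIMIT`, and function names are assumptions, and a production validator would more likely work on a parsed statement. A keyword scan is deliberately conservative, so it can also reject harmless queries whose string literals contain a blocked word.

```python
import re

# Illustrative lists; the product's actual blocklists are not shown in this document.
BLOCKED_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "TRUNCATE", "GRANT")
DIALECT_BLOCKLISTS = {
    "postgresql": ("PG_SLEEP", "COPY"),
    "mysql": ("SLEEP", "LOAD DATA"),
}
DEFAULT_LIMIT = 1000  # assumed default


class ValidationError(ValueError):
    """Raised when a statement fails validation."""


def _contains_keyword(upper_sql: str, keyword: str) -> bool:
    # Word boundaries keep column names such as updated_at from matching UPDATE.
    return re.search(r"\b" + re.escape(keyword) + r"\b", upper_sql) is not None


def validate_sql(sql: str, dialect: str) -> str:
    """Return the statement (with a LIMIT added if missing) or raise ValidationError."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValidationError("multiple statements are not allowed")
    upper = statement.upper()

    # Base layer: allowlist on the leading keyword, then the generic blocklist.
    if not re.match(r"(SELECT|WITH)\b", upper):
        raise ValidationError("only SELECT and WITH statements are permitted")
    for keyword in BLOCKED_KEYWORDS:
        if _contains_keyword(upper, keyword):
            raise ValidationError(f"blocked keyword: {keyword}")

    # Dialect layer: extra blocklist for the target database.
    for keyword in DIALECT_BLOCKLISTS.get(dialect, ()):
        if _contains_keyword(upper, keyword):
            raise ValidationError(f"blocked for {dialect}: {keyword}")

    # Simplification: any LIMIT, even inside a subquery, counts as present.
    if not re.search(r"\bLIMIT\s+\d+", upper):
        statement = f"{statement} LIMIT {DEFAULT_LIMIT}"
    return statement
```

Because the same function runs on LLM-generated, user-edited, and re-sorted queries, there is a single place where the read-only guarantee is enforced.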
Query Complexity Limits
CROSS JOIN is unconditionally rejected: cartesian products are almost never intentional and can produce unbounded result sets. A query with no WHERE clause is also rejected when it targets a table with more than 1 million rows, preventing accidental full sequential scans.
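The two complexity rules can be expressed as a single check. This is a sketch under assumptions: the function name is hypothetical, the list of referenced tables and the per-table row estimates are taken as inputs (how they are obtained is not described here), and the WHERE detection is a simple keyword test rather than a parse.

```python
import re
from typing import Dict, Iterable, Optional

LARGE_TABLE_ROWS = 1_000_000


def complexity_rejection(
    sql: str, referenced_tables: Iterable[str], row_estimates: Dict[str, int]
) -> Optional[str]:
    """Return a rejection reason, or None if the query passes both rules."""
    upper = sql.upper()
    if re.search(r"\bCROSS\s+JOIN\b", upper):
        return "CROSS JOIN is not allowed"
    if re.search(r"\bWHERE\b", upper) is None:
        for table in referenced_tables:
            # Tables with no estimate are treated as small in this sketch.
            if row_estimates.get(table, 0) > LARGE_TABLE_ROWS:
                return f"table '{table}' has more than 1 million rows; add a WHERE clause"
    return None
```

For example, `SELECT * FROM events` against a table estimated at 5 million rows is rejected, while the same query with `WHERE created_at > '2024-01-01'` passes.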
Schema Privacy Filtering
Per-connection privacy settings control exactly what schema metadata the LLM sees. Excluded schemas, tables, and columns are stripped from the prompt entirely. Sensitive columns (matching patterns like email, ssn, password, credit_card, token) are annotated [SENSITIVE] with sample values suppressed. Query results undergo the same masking — sensitive column values are replaced with [REDACTED] before being returned to the client.
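A minimal sketch of the sensitive-column handling, assuming the patterns listed above are matched as case-insensitive substrings of the column name. The function names and the exact prompt format are hypothetical.

```python
import re
from typing import Any, Dict, List

# Patterns taken from the list above; substring matching is an assumption.
SENSITIVE_PATTERN = re.compile(r"email|ssn|password|credit_card|token", re.IGNORECASE)


def is_sensitive(column: str) -> bool:
    return SENSITIVE_PATTERN.search(column) is not None


def describe_column_for_prompt(column: str, sample_values: List[Any]) -> str:
    """Build the schema line sent to the LLM; sensitive columns get no samples."""
    if is_sensitive(column):
        return f"{column} [SENSITIVE]"
    if not sample_values:
        return column
    samples = ", ".join(repr(value) for value in sample_values)
    return f"{column} (samples: {samples})"


def redact_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Replace values in sensitive columns before results go back to the client."""
    return [
        {col: "[REDACTED]" if is_sensitive(col) else val for col, val in row.items()}
        for row in rows
    ]
```

Using the same `is_sensitive` test for both the prompt and the result set keeps the two masking paths from drifting apart.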
Column-Schema Validation
After the LLM generates a query, column references are cross-checked against the live schema. Any column that does not exist — or is excluded by privacy settings — triggers an automatic self-correction LLM call before the error surfaces to the user.
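The cross-check and retry can be sketched as below. Everything here is illustrative: `generate` is a hypothetical callable standing in for the LLM call, it is assumed to return the SQL together with the (table, column) pairs it references, and the single retry is an assumed default.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Column = Tuple[str, str]  # (table, column)


def invalid_columns(
    referenced: List[Column], schema: Dict[str, Set[str]], excluded: Set[Column]
) -> List[Column]:
    """Columns that are missing from the live schema or hidden by privacy settings."""
    return [
        (table, column)
        for table, column in referenced
        if column not in schema.get(table, set()) or (table, column) in excluded
    ]


def generate_with_self_correction(
    generate: Callable[[Optional[str]], Tuple[str, List[Column]]],
    schema: Dict[str, Set[str]],
    excluded: Set[Column],
    max_retries: int = 1,
) -> str:
    """Call `generate`, feeding validation errors back, until the columns check out."""
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):
        sql, referenced = generate(feedback)
        bad = invalid_columns(referenced, schema, excluded)
        if not bad:
            return sql
        feedback = "Unknown or excluded columns: " + ", ".join(f"{t}.{c}" for t, c in bad)
    # Only after the retries are exhausted does the error reach the user.
    raise ValueError(feedback)
```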
Execution Mode Gate
Each connection has a configurable execution mode. Auto-execute runs validated queries immediately. Review First holds the query in a pending state until a human approves it. Generate Only returns the SQL without ever executing it — useful for DBA-review workflows or VPN-only databases.
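The three modes can be modelled as an enum with a small gate function. This is a sketch with hypothetical names; `run_query` stands in for whatever actually executes a validated query.

```python
from enum import Enum
from typing import Any, Callable, Dict


class ExecutionMode(Enum):
    AUTO_EXECUTE = "auto_execute"
    REVIEW_FIRST = "review_first"
    GENERATE_ONLY = "generate_only"


def gate(
    mode: ExecutionMode,
    sql: str,
    run_query: Callable[[str], Any],
    approved: bool = False,
) -> Dict[str, Any]:
    """Decide whether a validated query runs now, waits for approval, or never runs."""
    if mode is ExecutionMode.GENERATE_ONLY:
        return {"status": "generated", "sql": sql, "rows": None}
    if mode is ExecutionMode.REVIEW_FIRST and not approved:
        return {"status": "pending", "sql": sql, "rows": None}
    return {"status": "executed", "sql": sql, "rows": run_query(sql)}
```

Note that in Generate Only mode `run_query` is never called, even if `approved` is passed as True.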
Execution Limits & Audit Logging
Result sets are capped at 1,000 rows by default. Queries time out at 30 seconds (enforced at the database driver level). Every executed query is written to the audit log with user ID, org ID, connection ID, table names referenced, execution time, and bytes scanned.
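As an illustration of the row cap and the audit fields listed above, the sketch below caps any row iterator and defines a record with the same fields. The names are hypothetical, and the 30-second timeout is not shown because it is enforced by the database driver.

```python
import itertools
from dataclasses import dataclass
from typing import Any, Iterable, List, Tuple

DEFAULT_MAX_ROWS = 1000
_END = object()  # sentinel used to detect whether more rows were available


@dataclass(frozen=True)
class AuditRecord:
    """One audit-log entry per executed query (field names are assumed)."""
    user_id: str
    org_id: str
    connection_id: str
    tables: Tuple[str, ...]
    execution_ms: float
    bytes_scanned: int


def fetch_capped(
    rows: Iterable[Any], max_rows: int = DEFAULT_MAX_ROWS
) -> Tuple[List[Any], bool]:
    """Return at most `max_rows` rows plus a flag saying whether rows were cut off."""
    iterator = iter(rows)
    kept = list(itertools.islice(iterator, max_rows))
    truncated = next(iterator, _END) is not _END
    return kept, truncated
```

Returning the `truncated` flag lets the client tell the user that the result was cut at the cap rather than silently showing a partial answer.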
Credentials encrypted at rest
Database passwords and LLM API keys are stored using Fernet symmetric encryption (AES-128-CBC + HMAC-SHA256). The encryption key lives only in your .env file; Savvina AI never has access to it, and it never leaves your server. Plaintext credentials are decrypted into memory only at the moment a query is dispatched.
Questions or a vulnerability to report? Email info@savvina.ai.
