Security Architecture

Why clients can trust this.

Internal explainer for the team. Know it. Sell it. Mean it.

The Big Picture

Every client gets their own isolated server. Not a shared environment. Not a container on someone else's machine. A full Linux server, hardened from the ground up, with nothing exposed to the public internet.

Client VPS
Dedicated Linux server
Hardened install
Tailscale VPN
Encrypted tunnel
Authenticated devices only
Public Internet
No connection
No open ports

"Your data is on your server. Not ours. Not shared with anyone else."

Your Own Private Server

Per-client VPS isolation

Every customer runs on their own dedicated server. This is the foundation everything else builds on.

Why This Matters

Most AI platforms are multi-tenant — your data sits next to every other customer's on the same servers. One vulnerability exposes everyone. A dedicated server means a breach elsewhere literally cannot reach your data. There's no shared surface.

It's Your Server — We Hand Over the Keys

This is the fundamental difference. When you connect your Google account, your Carestack, your social media — you're granting access to your own server. Not our cloud. Not a shared platform. Your machine, in your name, that you own.

We set it up, harden it, and hand you the credentials. If you ever want to walk away, the server is yours. The data is yours. There's nothing to "export" — it's already on your hardware.

Dedicated, Not Shared

One Linux VPS per customer. No multi-tenant risk. No noisy neighbours. No shared databases. If another client's server is compromised, yours is completely unaffected — they're separate machines.

Hardened From the Ground Up

Standardised Linux install with locked-down ports, strict firewall rules, no unnecessary services running. Every server gets the same hardening treatment — it's automated, not manual. Nothing left to chance.

Nothing on the Public Internet

All services bind to the Tailscale VPN network only. No open ports. No public endpoints. No web-facing admin panels. The server simply doesn't exist on the public internet.

Data Sovereignty by Architecture

Australian clients get a Sydney VPS. Patient data stays in Australia — not by policy or promise, but because the server is physically in Sydney. There's no mechanism for data to leave.

"You're not granting access to our cloud service. You're granting access to your own server. We hand over the keys."

Invisible to the Internet

VPN-only access via Tailscale

Every connection to the server goes through Tailscale — a modern mesh VPN. No exceptions.

Why This Matters

Every major breach starts with an exposed endpoint. If a server isn't visible on the internet, it can't be port-scanned, can't be brute-forced, can't be targeted by automated attacks. The entire category of internet-facing vulnerabilities simply doesn't apply.

🔐
Every Device Authenticated

Each device on the network is cryptographically authenticated. You can't just know the address — your device has to be explicitly authorised to connect.

🛡
Every Connection Encrypted

All traffic is encrypted end-to-end through the VPN tunnel. Even if someone intercepted the network traffic, they'd see encrypted noise.

🚫
Zero Attack Surface

No open ports. No public endpoints. No attack surface. Even the management interface (OpenClaw Gateway) requires both VPN access and a valid authentication token.

Compare this to typical AI SaaS: public login pages, API endpoints exposed to the internet, shared infrastructure. Our clients have none of that.

Sensitive Data Detection

PII / PHI scanner — Australian identifiers

An offline scanner that detects sensitive personal and health information across the entire system. Runs nightly. Fully local — no data leaves the server.

Why This Matters

Under the Privacy Act, even accidental exposure of a Medicare number is a notifiable data breach. The health sector reports more breaches than any other industry in Australia. This scanner catches leaks before they become incidents — and it runs every night, not once a year.

What It Detects
Medicare numbers
Tax File Numbers
Individual Healthcare IDs
Credit card numbers
Person names
Email addresses
Phone numbers
Health context terms
ABN / ACN
IBAN codes
How It Works

Built on Microsoft Presidio (open source, MIT licensed). Runs entirely offline — no API calls, no data leaving the server.

Australian identifiers use checksum validation, not just pattern matching. It validates Medicare numbers with the same algorithm Medicare uses. It doesn't guess — it mathematically confirms.

Critical design: findings are redacted before output. Real PII never enters the security report, never reaches the AI. The scanner reports that it found something, not what it found.

What Gets Scanned

OpenClaw workspace and memory. Social tracker data. Published web reports. Collected website pages and blog posts. Council data assembly. Every .md, .txt, .log, .csv, .html, and .env file on the system.

"The scanner validates Medicare numbers with the same algorithm Medicare uses. It doesn't guess — it mathematically confirms."

Patient Data Never Reaches the AI

CDP de-identification proxy

When the AI connects to patient management systems (like Carestack), a local proxy sits between the connection and the AI. It strips all patient identifiers before the AI sees anything.

Think of it like Blu-ray HDCP encryption — data flows through every component encrypted, and only the TV screen decrypts it.

Why This Matters

When patient data crosses a border to reach a US-based AI service, it triggers APP 8 — cross-border disclosure. The practice remains legally liable for anything that happens to that data overseas, including if it ends up in training data, log files, or a vendor breach. Tokenisation means real patient data never leaves Australia, never enters AI logs, and never creates a cross-border disclosure event.

This means OpenClaw can work freely across bookings, medical records, patient notes — whatever it needs. All personal data is replaced with anonymous tokens before the AI sees it. A name becomes [name_a3f9], a date of birth becomes [dob_7e21]. The AI works with these tokens just as effectively — it can compare, sort, summarise, flag issues — without ever knowing the real data behind them. And it's safe by default: even if the AI wanders into a medical record you didn't point it at, there's nothing to leak. It's blind to the personal information, but fully capable of doing its job.

Carestack: John Smith proxy strips at entry
AI sees: [PATIENT_1] tokens through entire system
Reports: [PATIENT_1] logs, boards, memory — all tokenised
PII scanner: [REDACTED] catches leaks without escalating
SMS: "Hi John, your..." only rehydrated at verified exit
What Gets Stripped

Patient names. Dates of birth. Medicare numbers. Phone numbers (mobile and landline, AU and international format). Email addresses. Street addresses. Individual Healthcare Identifiers (IHI). Tax File Numbers. Credit card numbers.

What the AI Still Sees

Clinical codes and treatment types. Appointment dates and times. Financial amounts. Page structure and workflow steps. Error messages and system status. Everything it needs to work — nothing it shouldn't know.

Verified Exit Boundaries — The Only Places Real Data Appears
✓ Dentist's browser screen
✓ Patient SMS / email
✓ Carestack API calls
✓ Calendar bookings
✓ Printed documents
✓ WhatsApp to patient

Everything else — AI prompts, responses, disk writes, dashboards, project boards, council reports, backups — stays tokenised. Always.

"The AI never sees a patient's name. Not in memory. Not in logs. Not in reports. Only the dentist's screen and the patient's own phone see the real data."

Vault encryption: AES-256-GCM authenticated encryption (NIST SP 800-38D) with unique random 96-bit nonces per record. HMAC-SHA256 token derivation. All PII encrypted at rest in PostgreSQL — the database never stores plaintext.

Security Audit Every Night

12-point nightly review + AI security council

A 12-point infrastructure audit runs every night at 3:30 AM. Then four AI security specialists argue about what they found.

Why This Matters

The average time to detect a breach in Australia is 163 days. That's five months of exposure before anyone notices. A nightly audit compresses that detection window to 24 hours maximum. Issues found tonight are flagged by morning.

The 12-Point Check

Port lockdown. Firewall rules. Backup verification. Service health. Tailscale status. PII scan results. OpenClaw security audit. Version checks. Log review. Error analysis. Certificate status. Configuration drift detection.

Offensive

What could an attacker exploit? Thinks like a hacker trying to break in.

🛡
Defensive

Are protections adequate? Reviews each control and asks whether it's actually working.

🔒
Data Privacy

Is sensitive data handled correctly? Checks every data flow against privacy requirements.

🧐
Operational Realism

Are security measures practical or just theatre? Catches controls that look good on paper but don't work in practice.

Every night
3:30 AM automatic run
Instant alert
Critical findings push immediately
"Fix it"
Owner responds, system implements

"Most businesses get a security audit once a year. Our clients get one every night."

Protecting the AI From Bad Input

3-stage prompt injection defence

When the AI reads external content — emails, web pages, social media — attackers can try to hide instructions in that content. We have three layers of defence.

Why This Matters

Prompt injection is the #1 security risk for AI systems — it's OWASP's top LLM vulnerability. An attacker embeds instructions in an email or web page, and the AI follows them instead of yours. Without defences, a single malicious email could cause the AI to leak confidential data or take unauthorised actions.

1

Deterministic Sanitiser

Code-based, not AI-based

Traditional regex-based scanning reads all external content before the AI ever sees it. Known injection patterns are stripped. This layer doesn't need to be smart — it just needs to be fast and thorough.

2

Quarantine & Frontier Scan

Isolated, AI-reviewed

External data is placed in complete isolation — it cannot access anything else in the system. The most capable AI model available scans the quarantined content and assigns a risk score. The scanner itself is sandboxed.

3

Elevated Risk Assessment

Score-based decision

High confidence of safety → proceed. Low confidence → block and log reason. Any attempt to change config or system behaviour → ignore and report as an injection attempt. No grey area on config changes — those are always blocked.

Also Protected

Email-specific: all incoming email content is sanitised before AI classification. SSRF prevention: only http/https URLs accepted — file://, ftp://, javascript://, data:// schemes are rejected.

"External content goes through three security checkpoints before the AI reads it. Most systems have zero."

Who Sees What

Data classification & access control

Not all data is equal. The system classifies everything into three tiers and enforces access based on context — who's asking, and through which channel.

Why This Matters

AI assistants that can access everything and share anything are a liability. One message in the wrong channel and confidential financials, patient details, or strategic plans are exposed. Context-aware classification means the system enforces boundaries even when humans forget to.

Confidential

Owner direct message only.

Financial figures, CRM contact details, deal values, daily notes, personal emails, system memory.

Internal

Team channels OK, no external.

Strategic notes, council recommendations, tool outputs, knowledge base content, project tasks, system health.

Restricted

External only with explicit approval.

General knowledge responses only. Everything else requires "share this" before it leaves internal channels.

Context-Aware Enforcement

The system knows if it's in a DM, group chat, or external channel. Data surfaces accordingly. When the context is ambiguous, it defaults to the more restrictive tier. Never the other way.

Outbound Redaction

All outgoing messages are scanned for personal data, credential-looking strings, and sensitive information. Two layers: deterministic (pattern matching) AND AI-based (contextual understanding). Both must pass.

Hard Rules

The AI does NOT send emails on behalf of the owner — drafts only. Does NOT post to social media without explicit approval. Does NOT write to external systems (email, calendar, CRM) without permission. Every external action requires a human in the loop.

Built-In Platform Security

OpenClaw standard precautions

On top of everything we've built, OpenClaw itself ships with security precautions baked in.

Why This Matters

These are the basics that most AI deployments skip. Leaked API keys, credentials in git repos, secrets in log files — these are how real breaches happen. Not sophisticated attacks. Mundane mistakes that automated controls prevent.

Token-Based Gateway Auth

Every request to the gateway requires a valid authentication token. No token, no access.

File Permissions Enforced

Sensitive files like .env are locked to owner-only read (chmod 600). Not readable by other users or processes on the server.

Pre-Commit Hooks

Git hooks block API keys, bearer tokens, OAuth tokens, and other credential patterns from ever being committed to version control.

No Secrets in Logs

Secrets are never stored in log files and never sent to messaging channels. Environment files are excluded from version control.

SQL Injection Protection

Parameterised queries across all database operations. No raw string concatenation in SQL.

Controlled Updates

The system checks for OpenClaw updates but requires explicit approval before applying them. No silent auto-updates.

OAuth Credentials Under Lock and Key

Encrypted credential vault

When you connect Google services (YouTube, Analytics, Gmail, Calendar, etc.), the OAuth tokens that grant access are stored in a dedicated encrypted credential vault on your server — never in plain text, never in config files, never in the AI’s memory.

Why This Matters

OAuth tokens are the keys to your Google accounts. If a plain-text token.json file leaks — through a backup, a log file, or a misconfigured server — anyone can read your email or access your analytics. Encrypted vault storage means even if someone gains access to the database, the tokens are useless without the encryption key.

AES-256 Encryption at Rest

Every OAuth token is encrypted with AES-256-GCM before it hits the database. The encryption key is stored separately from the data.

Automatic Token Refresh

Google tokens expire every hour. The credential manager automatically refreshes them in the background — no manual intervention, no expired token errors, no credential files to manage.

One-Click Revoke

Disconnect any service instantly from the setup page. The token is deleted from the vault immediately — not just marked inactive, actually removed.

Localhost Only

The credential vault is only accessible from localhost on your server. It does not listen on any public port and cannot be reached from outside the VPN.

Minimal Scopes

Each service requests only the permissions it needs. YouTube gets read-only analytics. Calendar gets read-only events. No service gets more access than its specific purpose requires.

No Tokens in AI Context

OAuth tokens are never passed to the AI model. The AI requests data through server-side code that handles authentication — the model never sees, stores, or can leak an access token.

"Your Google credentials are encrypted at rest, auto-refreshed, accessible only from localhost, and never exposed to the AI. Disconnecting a service deletes the token permanently."

Backups You Can't Break Into

Encrypted hourly backups

Hourly automated backups. Encrypted before they leave the server. Two layers of protection even if someone compromises the backup destination.

Why This Matters

Ransomware doesn't just encrypt your data — it targets your backups first. Offline, encrypted backups that are verified hourly mean you can recover from a worst-case scenario without paying a ransom or losing more than an hour of data.

Database Backups

Runs hourly. Auto-discovers all databases on the system — no manual configuration needed. Bundles into an encrypted archive and uploads to Google Drive.

Two-layer encryption: Google Drive access credentials + separate archive password. You can't get to them even if you know where they are.

Code & Config Backups

Git autosync runs hourly — auto-commits workspace changes and pushes to the remote repository. Complete version history preserved.

Pre-commit hooks ensure no credentials make it into the repository. Browser profile cookies blocked from commits.

Hourly
Automatic backup cycle
7 days
Point-in-time restore
Instant alert
On any backup failure

Australian Regulations

Privacy Act, AHPRA, HRIP Act compliance

We're building for regulated industries. Australian dental is the first — and it has some of the strictest rules on earth.

Why This Matters

A dental practice that uses AI to process patient data without proper controls faces up to $50 million in penalties under the Privacy Act — and since June 2025, patients can sue directly. These aren't theoretical risks. The OAIC is actively running compliance sweeps on health providers right now.

Privacy Act 1988

All 13 Australian Privacy Principles apply. Health service providers have no small business exemption. Penalties up to $50 million, or 30% of annual turnover, or 3x the benefit gained — whichever is highest.

New statutory tort (June 2025): patients can now sue directly for serious privacy breaches.

AHPRA Advertising

The strictest healthcare advertising rules globally. Patient testimonials prohibited. Before/after photos need specific disclaimers. Can't even reply "thank you" to a review that mentions treatment outcomes.

Our Regulations Advisor scans all public content every night at 5 AM.

NSW HRIP Act 2002

Additional health privacy requirements for NSW practices. 7-year record retention for adults. Records for minors kept until the patient turns 25. Must log disposal or transfer of records.

Notifiable Data Breaches

Health sector is #1 for reported data breaches nationally (18% of all notifications). Health providers are covered regardless of turnover. Contain immediately, assess within 30 days, notify within 30 days.

APP 8: Cross-Border Disclosure — Our Answer

This is the big one for AI. If patient data crosses a border, the practice remains accountable for any breach by the overseas recipient. Our answer: Sydney VPS keeps data in Australia. CDP proxy tokenises patient identifiers before they reach any AI model. The AI processes tokens, not people.

This is why we take it seriously. Not because compliance is a feature to market — because getting it wrong costs a dental practice $50 million and its reputation.

Encryption at Rest UNDER CONSIDERATION

Currently evaluating full encryption at rest to complete the triple layer.

Why This Matters

If someone physically accesses the server — a data centre employee, a stolen drive, a decommissioned disk — unencrypted data is immediately readable. Encryption at rest means even physical access to the hardware reveals nothing without the decryption key.

In Transit

Tailscale encrypts all network traffic. Already done.

In Processing

CDP proxy tokenises patient data. Token vault uses AES-256-GCM. Already done.

At Rest

LUKS full-disk encryption. SQLite encryption extensions. Encrypted swap. Under evaluation.

"Data is already encrypted in transit and tokenised in processing. We're adding encryption at rest to complete the triple layer."

The Complete Picture

Every layer, when it runs, what it does.

Layer What When
VPS IsolationDedicated server per clientAlways
VPN OnlyTailscale, no public internetAlways
CDP ProxyPatient data tokenised at sourceEvery connection
PII ScannerAustralian identifiers with checksum validationNightly
Security Council4-perspective AI security auditNightly 3:30 AM
Regulations AdvisorAHPRA / Privacy Act content scanNightly 5:00 AM
Prompt Injection3-stage defence on external contentEvery input
Access Control3-tier data classificationEvery message
Encrypted BackupsDual-encrypted to Google DriveHourly
Encryption at RestFull-disk + database encryptionComing soon

"Security isn't a feature we added. It's the architecture we started with."