Security Architecture

Why clients can trust this.

Internal explainer for the team. Know it. Sell it. Mean it.

The Big Picture

Every client gets their own isolated server. Not a shared environment. Not a container on someone else's machine. A full Linux server, hardened from the ground up, with nothing exposed to the public internet.

Client VPS
Dedicated Linux server
Hardened install
Tailscale VPN
Encrypted tunnel
Authenticated devices only
Public Internet
No connection
No open ports

"Your data is on your server. Not ours. Not shared with anyone else."

Your Own Private Server

Per-client VPS isolation

Every customer runs on their own separate Linux server — a private VPS with its own operating system, its own firewall, and its own data. This is the foundation everything else builds on.

Why This Matters

Most AI platforms are multi-tenant — your data sits in a shared application alongside every other customer's. One vulnerability in the app exposes everyone. With per-client VPS isolation, each customer has their own operating system, their own processes, their own database. A compromise of another client's server doesn't give an attacker access to yours — there's no shared application layer to traverse.

Your Server — We Hand Over the Keys

This is the fundamental difference. When you connect your Google account, your Carestack, your social media — you're granting access to your own server instance. Not our cloud. Not a shared platform. A separate Linux install provisioned for you, with full admin credentials.

We set it up, harden it, and hand you the credentials. If you ever want to walk away, the data is yours. There's nothing to "export" — it's already sitting on your server. We can hand over full control or migrate to hardware you own.

Isolated, Not Shared

One Linux VPS per customer. Separate operating system, separate processes, separate database. No shared application layer. No noisy neighbours. Each server is its own isolated environment — not a partition of someone else's.

Hardened From the Ground Up

Standardised Linux install with locked-down ports, strict firewall rules, no unnecessary services running. Every server gets the same hardening treatment — it's automated, not manual. Nothing left to chance.

Nothing on the Public Internet

All services bind to the Tailscale VPN network only. No open ports. No public endpoints. No web-facing admin panels. The server simply doesn't exist on the public internet.

Data Sovereignty by Architecture

Australian clients get a Sydney VPS. Patient data stays in Australia — not by policy or promise, but because the server is physically in Sydney. There's no mechanism for data to leave.

"You're not granting access to our cloud service. You're granting access to your own server. We hand over the keys."

Invisible to the Internet

VPN-only access via Tailscale

Every connection to the server goes through Tailscale — a modern mesh VPN. No exceptions.

Why This Matters

Every major breach starts with an exposed endpoint. If a server isn't visible on the internet, it can't be port-scanned, can't be brute-forced, can't be targeted by automated attacks. The entire category of internet-facing vulnerabilities simply doesn't apply.

🔐
Every Device Authenticated

Each device on the network is cryptographically authenticated. You can't just know the address — your device has to be explicitly authorised to connect.

🛡
Every Connection Encrypted

All traffic is encrypted end-to-end through the VPN tunnel. Even if someone intercepted the network traffic, they'd see encrypted noise.

🚫
Zero Attack Surface

No open ports. No public endpoints. No attack surface. Even the management interface (OpenClaw Gateway) requires both VPN access and a valid authentication token.

Compare this to typical AI SaaS: public login pages, API endpoints exposed to the internet, shared infrastructure. Our clients have none of that.

Sensitive Data Detection

PII / PHI scanner — Australian identifiers

An offline scanner that detects sensitive personal and health information across the entire system. Runs nightly. Fully local — no data leaves the server.

Why This Matters

Under the Privacy Act, even accidental exposure of a Medicare number is a notifiable data breach. The health sector reports more breaches than any other industry in Australia. This scanner catches leaks before they become incidents — and it runs every night, not once a year.

What It Detects
Medicare numbers
Tax File Numbers
Individual Healthcare IDs
Credit card numbers
Person names
Email addresses
Phone numbers
Health context terms
ABN / ACN
IBAN codes
How It Works

Built on Microsoft Presidio (open source, MIT licensed). Runs entirely offline — no API calls, no data leaving the server.

Australian identifiers use checksum validation, not just pattern matching. It validates Medicare numbers with the same algorithm Medicare uses. It doesn't guess — it mathematically confirms.

Critical design: findings are redacted before output. Real PII never enters the security report, never reaches the AI. The scanner reports that it found something, not what it found.

What Gets Scanned

OpenClaw workspace and memory. Social tracker data. Published web reports. Collected website pages and blog posts. Council data assembly. Every .md, .txt, .log, .csv, .html, and .env file on the system.

"The scanner validates Medicare numbers with the same algorithm Medicare uses. It doesn't guess — it mathematically confirms."

Patient Data Never Reaches the AI

CDP de-identification proxy

When the AI connects to patient management systems (like Carestack), a local proxy sits between the connection and the AI. It strips all patient identifiers before the AI sees anything.

Think of it like Blu-ray HDCP encryption — data flows through every component encrypted, and only the TV screen decrypts it.

Why This Matters

When patient data crosses a border to reach a US-based AI service, it triggers APP 8 — cross-border disclosure. The practice remains legally liable for anything that happens to that data overseas, including if it ends up in training data, log files, or a vendor breach. Tokenisation means real patient data never leaves Australia, never enters AI logs, and never creates a cross-border disclosure event.

This means OpenClaw can work freely across bookings, medical records, patient notes — whatever it needs. All personal data is replaced with anonymous tokens before the AI sees it. A name becomes [name_a3f9], a date of birth becomes [dob_7e21]. The AI works with these tokens just as effectively — it can compare, sort, summarise, flag issues — without ever knowing the real data behind them. And it's safe by default: even if the AI wanders into a medical record you didn't point it at, there's nothing to leak. It's blind to the personal information, but fully capable of doing its job.

Carestack: John Smith proxy strips at entry
AI sees: [PATIENT_1] tokens through entire system
Reports: [PATIENT_1] logs, boards, memory — all tokenised
PII scanner: [REDACTED] catches leaks without escalating
SMS: "Hi John, your..." only rehydrated at verified exit
What Gets Stripped

Patient names. Dates of birth. Medicare numbers. Phone numbers (mobile and landline, AU and international format). Email addresses. Street addresses. Individual Healthcare Identifiers (IHI). Tax File Numbers. Credit card numbers.

What the AI Still Sees

Clinical codes and treatment types. Appointment dates and times. Financial amounts. Page structure and workflow steps. Error messages and system status. Everything it needs to work — nothing it shouldn't know.

Verified Exit Boundaries — The Only Places Real Data Appears
✓ Dentist's browser screen
✓ Patient SMS / email
✓ Carestack API calls
✓ Calendar bookings
✓ Printed documents
✓ WhatsApp to patient

Everything else — AI prompts, responses, disk writes, dashboards, project boards, council reports, backups — stays tokenised. Always.

"The AI never sees a patient's name. Not in memory. Not in logs. Not in reports. Only the dentist's screen and the patient's own phone see the real data."

Vault encryption: AES-256-GCM authenticated encryption (NIST SP 800-38D) with unique random 96-bit nonces per record. HMAC-SHA256 token derivation. All PII encrypted at rest in PostgreSQL — the database never stores plaintext.

Security Audit Every Night

12-point nightly review + AI security council

A 12-point infrastructure audit runs every night at 3:30 AM. Then four AI security specialists argue about what they found.

Why This Matters

The average time to detect a breach in Australia is 163 days. That's five months of exposure before anyone notices. A nightly audit compresses that detection window to 24 hours maximum. Issues found tonight are flagged by morning.

The 12-Point Check

Port lockdown. Firewall rules. Backup verification. Service health. Tailscale status. PII scan results. OpenClaw security audit. Version checks. Log review. Error analysis. Certificate status. Configuration drift detection.

Offensive

What could an attacker exploit? Thinks like a hacker trying to break in.

🛡
Defensive

Are protections adequate? Reviews each control and asks whether it's actually working.

🔒
Data Privacy

Is sensitive data handled correctly? Checks every data flow against privacy requirements.

🧐
Operational Realism

Are security measures practical or just theatre? Catches controls that look good on paper but don't work in practice.

Every night
3:30 AM automatic run
Instant alert
Critical findings push immediately
"Fix it"
Owner responds, system implements

"Most businesses get a security audit once a year. Our clients get one every night."

Protecting the AI From Bad Input

3-stage prompt injection defence

When the AI reads external content — emails, web pages, social media — attackers can try to hide instructions in that content. We have three layers of defence.

Why This Matters

Prompt injection is the #1 security risk for AI systems — it's OWASP's top LLM vulnerability. An attacker embeds instructions in an email or web page, and the AI follows them instead of yours. Without defences, a single malicious email could cause the AI to leak confidential data or take unauthorised actions.

1

Deterministic Sanitiser

Code-based, not AI-based

Traditional regex-based scanning reads all external content before the AI ever sees it. Known injection patterns are stripped. This layer doesn't need to be smart — it just needs to be fast and thorough.

2

Quarantine & Frontier Scan

Isolated, AI-reviewed

External data is placed in complete isolation — it cannot access anything else in the system. The most capable AI model available scans the quarantined content and assigns a risk score. The scanner itself is sandboxed.

3

Elevated Risk Assessment

Score-based decision

High confidence of safety → proceed. Low confidence → block and log reason. Any attempt to change config or system behaviour → ignore and report as an injection attempt. No grey area on config changes — those are always blocked.

Also Protected

Email-specific: all incoming email content is sanitised before AI classification. SSRF prevention: only http/https URLs accepted — file://, ftp://, javascript://, data:// schemes are rejected.

"External content goes through three security checkpoints before the AI reads it. Most systems have zero."

Who Sees What

Data classification & access control

Not all data is equal. The system classifies everything into three tiers and enforces access based on context — who's asking, and through which channel.

Why This Matters

AI assistants that can access everything and share anything are a liability. One message in the wrong channel and confidential financials, patient details, or strategic plans are exposed. Context-aware classification means the system enforces boundaries even when humans forget to.

Confidential

Owner direct message only.

Financial figures, CRM contact details, deal values, daily notes, personal emails, system memory.

Internal

Team channels OK, no external.

Strategic notes, council recommendations, tool outputs, knowledge base content, project tasks, system health.

Restricted

External only with explicit approval.

General knowledge responses only. Everything else requires "share this" before it leaves internal channels.

Context-Aware Enforcement

The system knows if it's in a DM, group chat, or external channel. Data surfaces accordingly. When the context is ambiguous, it defaults to the more restrictive tier. Never the other way.

Outbound Redaction

All outgoing messages are scanned for personal data, credential-looking strings, and sensitive information. Two layers: deterministic (pattern matching) AND AI-based (contextual understanding). Both must pass.

Hard Rules

The AI does NOT send emails on behalf of the owner — drafts only. Does NOT post to social media without explicit approval. Does NOT write to external systems (email, calendar, CRM) without permission. Every external action requires a human in the loop.

Built-In Platform Security

OpenClaw standard precautions

On top of everything we've built, OpenClaw itself ships with security precautions baked in.

Why This Matters

These are the basics that most AI deployments skip. Leaked API keys, credentials in git repos, secrets in log files — these are how real breaches happen. Not sophisticated attacks. Mundane mistakes that automated controls prevent.

Token-Based Gateway Auth

Every request to the gateway requires a valid authentication token. No token, no access.

File Permissions Enforced

Sensitive files like .env are locked to owner-only read (chmod 600). Not readable by other users or processes on the server.

Pre-Commit Hooks

Git hooks block API keys, bearer tokens, OAuth tokens, and other credential patterns from ever being committed to version control.

No Secrets in Logs

Secrets are never stored in log files and never sent to messaging channels. Environment files are excluded from version control.

SQL Injection Protection

Parameterised queries across all database operations. No raw string concatenation in SQL.

Controlled Updates

The system checks for OpenClaw updates but requires explicit approval before applying them. No silent auto-updates.

OAuth Credentials Under Lock and Key

Encrypted credential vault

When you connect Google services (YouTube, Analytics, Gmail, Calendar, etc.), the OAuth tokens that grant access are stored in a dedicated encrypted credential vault on your server — never in plain text, never in config files, never in the AI’s memory.

Why This Matters

OAuth tokens are the keys to your Google accounts. If a plain-text token.json file leaks — through a backup, a log file, or a misconfigured server — anyone can read your email or access your analytics. Encrypted vault storage means even if someone gains access to the database, the tokens are useless without the encryption key.

AES-256 Encryption at Rest

Every OAuth token is encrypted with AES-256-GCM before it hits the database. The encryption key is stored separately from the data.

Automatic Token Refresh

Google tokens expire every hour. The credential manager automatically refreshes them in the background — no manual intervention, no expired token errors, no credential files to manage.

One-Click Revoke

Disconnect any service instantly from the setup page. The token is deleted from the vault immediately — not just marked inactive, actually removed.

Localhost Only

The credential vault is only accessible from localhost on your server. It does not listen on any public port and cannot be reached from outside the VPN.

Minimal Scopes

Each service requests only the permissions it needs. YouTube gets read-only analytics. Calendar gets read-only events. No service gets more access than its specific purpose requires.

No Tokens in AI Context

OAuth tokens are never passed to the AI model. The AI requests data through server-side code that handles authentication — the model never sees, stores, or can leak an access token.

"Your Google credentials are encrypted at rest, auto-refreshed, accessible only from localhost, and never exposed to the AI. Disconnecting a service deletes the token permanently."

Backups You Can't Break Into

Encrypted hourly backups

Hourly automated backups. Encrypted before they leave the server. Two layers of protection even if someone compromises the backup destination.

Why This Matters

Ransomware doesn't just encrypt your data — it targets your backups first. Offline, encrypted backups that are verified hourly mean you can recover from a worst-case scenario without paying a ransom or losing more than an hour of data.

Database Backups

Runs hourly. Auto-discovers all databases on the system — no manual configuration needed. Bundles into an encrypted archive and uploads to Google Drive.

Two-layer encryption: Google Drive access credentials + separate archive password. You can't get to them even if you know where they are.

Code & Config Backups

Git autosync runs hourly — auto-commits workspace changes and pushes to the remote repository. Complete version history preserved.

Pre-commit hooks ensure no credentials make it into the repository. Browser profile cookies blocked from commits.

Hourly
Automatic backup cycle
7 days
Point-in-time restore
Instant alert
On any backup failure

Australian Regulations

Privacy Act, AHPRA, HRIP Act compliance

We're building for regulated industries. Australian dental is the first — and it has some of the strictest rules on earth.

Why This Matters

A dental practice that uses AI to process patient data without proper controls faces up to $50 million in penalties under the Privacy Act — and since June 2025, patients can sue directly. These aren't theoretical risks. The OAIC is actively running compliance sweeps on health providers right now.

Privacy Act 1988

All 13 Australian Privacy Principles apply. Health service providers have no small business exemption. Penalties up to $50 million, or 30% of annual turnover, or 3x the benefit gained — whichever is highest.

New statutory tort (June 2025): patients can now sue directly for serious privacy breaches.

AHPRA Advertising

The strictest healthcare advertising rules globally. Patient testimonials prohibited. Before/after photos need specific disclaimers. Can't even reply "thank you" to a review that mentions treatment outcomes.

Our Regulations Advisor scans all public content every night at 5 AM.

NSW HRIP Act 2002

Additional health privacy requirements for NSW practices. 7-year record retention for adults. Records for minors kept until the patient turns 25. Must log disposal or transfer of records.

Notifiable Data Breaches

Health sector is #1 for reported data breaches nationally (18% of all notifications). Health providers are covered regardless of turnover. Contain immediately, assess within 30 days, notify within 30 days.

APP 8: Cross-Border Disclosure — Our Answer

This is the big one for AI. If patient data crosses a border, the practice remains accountable for any breach by the overseas recipient. Our answer: Sydney VPS keeps data in Australia. CDP proxy tokenises patient identifiers before they reach any AI model. The AI processes tokens, not people.

This is why we take it seriously. Not because compliance is a feature to market — because getting it wrong costs a dental practice $50 million and its reputation.

Encryption at Rest UNDER CONSIDERATION

Currently evaluating full encryption at rest to complete the triple layer.

Why This Matters

If someone physically accesses the server — a data centre employee, a stolen drive, a decommissioned disk — unencrypted data is immediately readable. Encryption at rest means even physical access to the hardware reveals nothing without the decryption key.

In Transit

Tailscale encrypts all network traffic. Already done.

In Processing

CDP proxy tokenises patient data. Token vault uses AES-256-GCM. Already done.

At Rest (VPS)

LUKS full-disk encryption for cloud VPS. Under evaluation.

At Rest (Mac Mini)

FileVault full-disk encryption enabled by default on macOS. Already done for local AI deployments.

"For larger clients on dedicated Mac Mini hardware, encryption at rest is already solved — FileVault ships enabled. VPS encryption is next."

Private AI Processing

Local inference — another security plane

We already tokenise and redact sensitive data before it reaches cloud AI models. Local AI inference adds a fundamentally different layer: for the most sensitive operations, the data never leaves the building at all.

Why This Matters

This isn't about replacing cloud models or saving money. It's about collapsing the blast radius. When AI processing runs on a Mac in your office, patient data never crosses a network, never lands on an external server, never appears in a third-party log file. There is no cross-border transfer to assess, no vendor sub-processor to audit, no breach notification to draft — because the data never left the room. You can run directly on raw personal information inside the practice's own environment, because the processing stays within your local security boundary rather than being disclosed to an external model provider.

The Inflection Point

Three things converged in early 2026 to make this practical for small business:

Hardware Got Fast Enough

A Mac Mini M4 Pro (32 GB) runs 27–31B parameter models at 12–18 tokens/sec. Cost: ~AUD $3,000. Fits on a desk, runs silent. The upcoming M5 Pro (mid-2026) doubles AI throughput again.

Models Got Smart Enough

Google's Gemma 4 (April 2026) — open source, Apache 2.0. The top 31B model fits in 20 GB of RAM thanks to joint optimisations between Google and the Ollama team. It supports tool use and function calling, and benchmarks competitively with models 5× its size. It can do real agent work.

The Platform Supports It

OpenClaw natively connects to local model backends (Ollama, llama.cpp). Our routing layer (LiteLLM) already directs traffic between providers — adding a local inference endpoint is a config change, not a rebuild.

How It Fits the Security Architecture

Local AI
On-premise Mac Mini
Patient records, rehydration
← sensitive — routine →
Cloud AI
Claude, GPT via LiteLLM
Tokenised/redacted traffic

The routing layer decides per-request. Sensitive tasks stay local. Routine tasks use cloud models (still tokenised).

Use Cases

Patient Record Summarisation

Read and summarise clinical notes without tokenising names, dates, or conditions. The AI runs on the practice's own hardware — nothing to redact because nothing leaves.

AI-Assisted PII Verification

Previously, you couldn't ask an AI "did we catch all the PII?" — because asking the question is the breach. With local inference, you can have a second AI pass review tokenisation output, flag missed identifiers, and confirm redaction quality. The check is no longer the risk.

Token Rehydration Checks

When we need to rehydrate tokenised PII for a report or letter, local AI can validate and format the output without the real data ever touching a cloud model.

Overnight Batch Processing

Local models are slower than cloud. That's fine for batch jobs — compliance reviews, document classification, bulk record processing — that run overnight.

New Business Cases

Tasks previously closed for security reasons — like AI-assisted treatment planning or insurance pre-auth — become viable when processing is guaranteed private.

Industry Validation

At GTC 2026, NVIDIA CEO Jensen Huang announced NemoClaw — an enterprise security layer built on top of OpenClaw (the platform our system is built on). Jensen called OpenClaw "the operating system for personal AI."

NemoClaw uses the same architecture we're describing: a policy-enforced gateway that routes requests between local inference and cloud models based on data sensitivity. NVIDIA pairs it with their DGX Spark hardware (~USD $3–4K). We achieve the same outcome with commodity Apple Silicon and open-source models — no vendor lock-in.

Reference Hardware
Mac Mini M4 Pro — Available Now
  • • 32 GB unified memory, 275 GB/s bandwidth
  • • Runs Gemma 4 27B at 12–18 tok/sec
  • • ~AUD $3,000 / ~USD $2,000
  • • Silent, desktop form factor
Mac Mini M5 Pro — Expected Mid-2026
  • • Up to 64 GB unified memory
  • • ~4× AI throughput vs M4 generation
  • • Same price bracket expected
  • • Purpose-built for on-device AI workloads

"Tokenisation protects what leaves. Local inference reduces what has to leave in the first place. Together, they create a tighter privacy and security architecture for sensitive AI workflows."

The Complete Picture

Every layer, when it runs, what it does.

Layer What When
VPS IsolationDedicated server per clientAlways
VPN OnlyTailscale, no public internetAlways
CDP ProxyPatient data tokenised at sourceEvery connection
PII ScannerAustralian identifiers with checksum validationNightly
Security Council4-perspective AI security auditNightly 3:30 AM
Regulations AdvisorAHPRA / Privacy Act content scanNightly 5:00 AM
Prompt Injection3-stage defence on external contentEvery input
Access Control3-tier data classificationEvery message
Encrypted BackupsDual-encrypted to Google DriveHourly
Local AI InferenceOn-premise private AI for sensitive tasksPer-request routing
Encryption at RestVPS: LUKS full-disk (coming soon). Mac Mini: FileVault enabled by defaultVPS soon / Mac ✓

"Security isn't a feature we added. It's the architecture we started with."