Silicon Psyche Labs
See what your AI is actually doing.
We build behavioral telemetry for language models and agents. It measures posture, drift, sycophancy, hallucination risk and human-AI safety from the outside, with no access to weights.
The lab
Instruments for AI you can't see inside
Organizations deploy language models they cannot inspect — the model is a black box. Silicon Psyche Labs builds the instruments to classify, measure and track behavior over time, without access to weights, logits or training data. Why it matters: most failures don't announce themselves in the input. They show up in how the output behaves.
For developers & AI teams
Make one API call after your model responds and get back deterministic behavioral scores: drift, sycophancy, hallucination risk. Your first report takes about five minutes, and you need no access to the model's internals.
For trust & safety
Detect when a conversation turns risky — suicidality, dissociation, crisis — and when your AI is under adversarial attack: prompt injection, jailbreaking, manipulation. Get real-time alerts so you can take corrective action. Fully deterministic, with auditable named-rule scoring.
For enterprise & compliance
Audit vendors, catch silent model updates, and keep a privacy-safe behavioral record: posture sequences only, no raw text retained, and single-row GDPR erasure.
Products
One platform, five instruments
Each one answers a different question about model behavior. They share a single fine-tuned encoder and a common scoring model.
Posture analysis
Single-agent posture on every response: 7 classifiers plus DRM dyadic risk for human-AI conversations.
Agentic analysis
Multi-agent systems as a graph: Swiss-Cheese alignment, cross-agent contagion, and temporal forecast.
Psychological risk profile
The Cybersecurity Psychology Framework: a 100-indicator behavioral risk profile across 10 categories, for human, AI or hybrid subjects (Canale, 2025).
Incident archive
Privacy-safe forensic memory: posture-sequence snapshots, zero raw text, single-row GDPR erasure.
Retrieval drift
Detects when conversational context biases RAG retrieval away from the user's original topic.
Resources
Explore the platform
Documentation, examples, and live demos — everything you need to get started.
Knowledge Base
Full encyclopedia of metrics, classifiers, API, and workflows. Search in any language.
Case Studies
Real-world behavioral incidents — annotated with PSA metrics and forensic traces.
Calibration Sessions
Browse our AI behavioral calibration data library: 149 calibration sessions with full conversation transcripts.
PSA in Action
15 curated sessions with full PSA analysis — posture grids, DRM badges, behavioral signals, and per-session scores.
CPF → PSA Mapping
How every PSA metric maps to a specific Cybersecurity Psychology Framework indicator.
How it works
From text to insight in three steps
Send text
Any model response, via our API or the web app. No API keys, no model access needed.
Analyze
Posture classifiers and agentic graph analysis, computed deterministically in real time.
Get insights
Drift, anomalies, crisis signals and forecasts — each with a named, auditable reason.
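The three steps above might look roughly like this from a developer's side. This is only a sketch under stated assumptions: the endpoint URL, the request fields (session_id, text) and the report layout (scores, value, rule) are hypothetical placeholders, not the published API, and the sample report is hand-written for illustration rather than real output.

```python
import json
import urllib.request

# Hypothetical endpoint: this page does not publish a URL or a schema.
API_URL = "https://api.example.com/v1/analyze"


def build_analysis_request(response_text: str, session_id: str) -> urllib.request.Request:
    """Step 1 (send text): wrap one model response in a JSON POST request."""
    # Field names are assumptions made for this sketch.
    payload = {"session_id": session_id, "text": response_text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_insights(report: dict, threshold: float = 0.5) -> list[str]:
    """Step 3 (get insights): list scores at or above the threshold with their named rule."""
    lines = []
    for name, entry in sorted(report.get("scores", {}).items()):
        if entry["value"] >= threshold:
            lines.append(f"{name}: {entry['value']:.2f} ({entry['rule']})")
    return lines


# Hand-written stand-in for the report that step 2 (analysis) would return.
# The values and rule names are invented for illustration only.
sample_report = {
    "scores": {
        "drift": {"value": 0.12, "rule": "hypothetical-rule-a"},
        "sycophancy": {"value": 0.71, "rule": "hypothetical-rule-b"},
    }
}

request = build_analysis_request("Model output to analyze.", session_id="demo-1")
flagged = summarize_insights(sample_report)
```

In real use, the request would be sent with urllib.request.urlopen(request) and the JSON body of the reply passed to summarize_insights. The sketch stops short of the network call because the actual endpoint is not given here.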
Research
Grounded in published science
Every PSA metric traces to a specific indicator in the Cybersecurity Psychology Framework — a published taxonomy of 100 pre-cognitive vulnerabilities (Canale, 2025).
Start measuring your models today
Free to start. 37 deterministic metrics. Real-time analysis. No model access required.