Ground Truth Labeling Machine · Precision Biosignal Capture
Your labels are the bottleneck.
GTLM — Ground Truth Labeling Machine — is a controlled-environment biometric capture booth that automatically labels physiological and behavioral data at the resolution research has always needed, but never had.
Words are a lossy codec. Synchronized biosignals are the original file.
Most affective datasets use self-report or rater annotation. That label is a compressed, delayed, language-encoded reconstruction of a physiological event. It is not the thing. It is a description of a memory of the thing.
| | Standard workflow | With GTLM |
| --- | --- | --- |
| Label source | Human rater or self-report, post-session | Concurrent physiological measurement |
| Temporal resolution | 1-30 seconds per annotation | Sub-millisecond, all channels |
| Label type | Categorical (e.g. arousal 1-7) | Continuous, 32+ channels simultaneously |
| Inter-rater agreement | κ ≈ 0.4-0.6 | N/A — deterministic |
| Annotation time | 3-10× session duration | Zero — real time |
| Label entropy | ~3 bits per annotation | ~256 kbps continuous stream |
| What the model trains on | Description of a memory of a state | The state itself, measured directly |
Sample output · single frame · t = 12.847 s into session
Training on this means optimizing against a signal that was physically present, at the moment it was present — not against a rating someone gave 30 seconds later.
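As a static illustration of what one such labeled frame contains, the sketch below invents its field names and values; the real export schema is documented with each dataset.

```python
# Hypothetical single frame at t = 12.847 s. Field names and values are
# illustrative only, not the documented GTLM export schema.
frame = {
    "t": 12.847,                        # seconds on the shared session clock
    "raw": {                            # level (a): measured signal values
        "ecg_mv": 0.42,
        "eda_us": 3.18,
        "eeg_uv": [11.2, -4.7, 8.9],    # subset of EEG channels
    },
    "derived": {                        # level (b): computed metrics
        "hrv_rmssd_ms": 38.5,
        "scr_amplitude_us": 0.07,
    },
    "inferred": {                       # level (c): affective/cognitive estimate
        "valence": 0.31,
        "arousal": 0.64,
        "dominance": 0.22,
    },
    "behavior": {                       # level (d): behavioral annotation
        "gaze_xy": (0.44, 0.61),
        "action_units": ["AU12"],
    },
}
```

Every frame in a session carries the same four layers, all indexed to the same clock, which is what makes the stream usable as a dense label without post-hoc annotation.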
In plain language
What this is and why it exists
When researchers want to understand how someone feels, they usually ask them. But asking changes the answer. People guess, round up, misremember. The body does not have that problem.
The booth
GTLM is an enclosed space, roughly phone-booth-sized. A participant sits inside for a session. Ten sensors record what the body is doing: heart rate, brain electrical activity, sweat response, eye movement, breathing, and muscle tension. None of it requires the participant to do anything except be there.
The labeling problem
Raw physiological data is not useful on its own. Every recording needs to be annotated: what was happening at each moment, what the stimulus was, what the context was. In standard research, someone does this by hand. It takes weeks and introduces error. GTLM does it automatically, in real time, as the session runs.
What you receive
At the end of a session: one structured file. Every signal synchronized to a single clock. Each time point labeled with four layers of context. No manual cleanup, no raw wrangling. The data is ready to use.
Hardware
The Booth, Up Close
GTLM PRO, active capture session. EEG array, eye tracker, and depth camera visible on subject and booth walls.
Exterior: matte anodized aluminum chassis with Faraday-shielded smoked glass door. Footprint: 1.2 m × 1.2 m.
// 01 Specifications
Technical Specifications
Sensor Catalogue
Every GTLM shares the same booth enclosure, Faraday shielding, LSL middleware, and auto-labeling engine. The sensor suite is fully configurable — select any combination in the Build section below.
SPECIMEN GTLM·SC·2026
All sensors available for GTLM configuration, consumer and medical grade. Select any combination in the Build section below.
Audio / Visual

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Logitech Brio 4K | Facial capture | 60 fps | 4K UHD | RGB video |
| ReSpeaker 4-Mic Array | Spatial audio | 48 kHz | 24-bit | PCM audio |
| Intel RealSense D455 | Depth / proxemics | 90 fps | 848×480 | RGB-D |
Neurophysiological

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Emotiv EPOC X | EEG | 128 Hz | 14 channels | μV |
| Empatica E4 | GSR / EDA | 4 Hz | 0.01 μS | μS |
| Polar H10 | ECG / HRV | 130 Hz | R-R interval | ms |
| Garmin HRM-Pro | SpO₂ / PPG | 1 Hz | 1% | % |
| Hexoskin belt | Respiration | 128 Hz | chest excursion | a.u. |
Movement / Posture

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Noraxon IMU | Posture / IMU | 200 Hz | 6-DOF | deg/s, g |
| Zebris FDM | Floor pressure | 120 Hz | 4 sensors/cm² | N/cm² |
Eye Tracking

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Tobii Eye Tracker 5 | Oculometry | 90 Hz | 0.4° accuracy | gaze XY |
Neurophysiological — Medical Grade

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| BrainProducts LiveAmp 32 | EEG | 500 Hz | 32 channels | μV |
| Biopac ECG100C | 12-lead ECG | 1000 Hz | clinical | mV |
| Biopac EDA100C | Medical GSR | 2000 Hz | 0.001 μS | μS |
| Masimo EMMA | Capnography | 100 Hz | EtCO₂ mmHg | mmHg |
| cEEGrid array | Around-ear EEG | 500 Hz | 10 channels | μV |
Movement / Posture — Medical Grade

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Xsens DOT (×8) | Full-body mocap | 120 Hz | <1° RMS | deg, m |
| Delsys Trigno Wireless | EMG (8–16ch) | 2000 Hz | 16-bit | mV |
Voice / Affect

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Shure MXA310 | Clinical mic array | 96 kHz | 24-bit | PCM |
| On-device prosody engine | Prosody F0/jitter | real-time | 1 Hz | Hz, % |
// 02 Process
Process
From Subject to Dataset
Six steps from booth entry to structured multimodal output. Every step is standardized, reproducible, and auditable.
01
Enter the Booth
Subject steps in. Acoustically dampened (–42 dB), Faraday-shielded, climate controlled at 21°C ±0.5°C. No external noise, no signal bleed, no electromagnetic interference. The enclosure is the experiment's first control: it holds constant every environmental factor that would otherwise confound the measurement.
02
Sensor Initialization
All devices boot and synchronize via GTLM's proprietary LSL-compatible middleware. Baseline calibration runs for 90 seconds of resting-state capture. Impedance checks, electrode contact verification, and sampling rate confirmation are performed automatically. Any sensor below threshold triggers an alert before the session begins.
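As a sketch of what such a pre-session gate does, the snippet below flags any EEG channel whose electrode impedance exceeds a cutoff; the threshold value and function names are assumptions, not the actual GTLM middleware API.

```python
# Illustrative pre-session quality gate. The 20 kΩ cutoff is a commonly
# used EEG contact-quality threshold, not a documented GTLM value.
IMPEDANCE_LIMIT_KOHM = 20.0

def failing_channels(impedances_kohm):
    """Return the channels whose electrode impedance exceeds the limit."""
    return [ch for ch, z in impedances_kohm.items() if z > IMPEDANCE_LIMIT_KOHM]

readings = {"Fp1": 8.2, "Fp2": 31.5, "Cz": 12.0}  # invented values, in kΩ
alerts = failing_channels(readings)  # non-empty would block session start
```

In the described workflow, a non-empty alert list is raised to the experimenter before the session begins, so a poorly seated electrode never contaminates a recording.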
03
Session Begins
Subject interacts with a presented stimulus: video, audio, structured task, interview protocol, or physical product. Session duration is configurable between 5 and 120 minutes. The experimenter monitors signal quality via the external dashboard in real time without entering the booth or disturbing the electromagnetic environment.
04
Live Capture
Every sensor streams simultaneously. The LSL sync layer timestamps all streams to microsecond precision, enabling true multimodal temporal alignment. Jitter between streams is below 1 ms. Data is written to NVMe storage in real time with redundant backup. No network dependency during capture.
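The sync layer itself is proprietary, but the core alignment idea can be sketched: once every stream is stamped on the same clock, a sample from any stream can be looked up by nearest timestamp. The rates below match the catalogued Polar H10 (130 Hz) and Empatica E4 (4 Hz); the data itself is invented.

```python
import bisect

def nearest_sample(timestamps, values, t):
    """Return the value whose (sorted) timestamp is closest to t."""
    i = bisect.bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return values[i] if after - t < t - before else values[i - 1]

# Two streams on one session clock: ECG at 130 Hz, EDA at 4 Hz (10 s each).
ecg_t = [k / 130.0 for k in range(1300)]
eda_t = [k / 4.0 for k in range(40)]
ecg_v = list(range(1300))   # stand-in sample values
eda_v = list(range(40))

t = 5.003  # any query time on the shared clock
aligned = (nearest_sample(ecg_t, ecg_v, t), nearest_sample(eda_t, eda_v, t))
```

Because all streams share one reference clock, this lookup is exact rather than heuristic; no cross-correlation or drift correction is needed at analysis time.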
05
Auto-Labeling Engine
Proprietary software analyzes each timepoint across all modalities simultaneously. The output follows a four-level label hierarchy:
(a) Raw signal state: measured physiological values at each sample.
(b) Derived physiological state: computed metrics such as HRV, SCR amplitude, and EEG band power.
(c) Inferred affective/cognitive state: Valence-Arousal-Dominance coordinates and categorical probability distributions.
(d) Behavioral annotation: action units, postural events, gaze fixations, vocal events.
All labels are timestamped to the same microsecond reference frame as the raw signals.
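The derived level can be made concrete with one example metric. RMSSD is a standard time-domain HRV statistic computed from successive R-R intervals; a minimal sketch, with invented intervals in the millisecond units the sensor catalogue lists for ECG/HRV:

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive differences between R-R intervals (ms).
    A standard time-domain heart-rate-variability metric."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [812, 790, 805, 821, 798]  # five consecutive R-R intervals (ms), invented
hrv = rmssd(rr)                 # about 19.3 ms for these intervals
```

A derived metric like this is deterministic given the raw stream, which is why the comparison table can call the labels reproducible rather than rater-dependent.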
06
Export
Output is a structured dataset: time-indexed multimodal signals paired with label columns at every temporal resolution. Available export formats: CSV · Parquet · HDF5 · JSON-LD. Schema documentation is included with every export. Datasets are ready for ingestion into research pipelines without further preprocessing.
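A minimal sketch of downstream ingestion, assuming a hypothetical flat CSV layout; the column names below are invented, and the real schema is whatever ships in the export's documentation.

```python
import csv
import io

# Two-row excerpt standing in for a GTLM CSV export. Column names are
# illustrative only; consult the bundled schema documentation.
export = io.StringIO(
    "t,ecg_mv,eda_us,valence,arousal\n"
    "12.846,0.41,3.17,0.30,0.63\n"
    "12.847,0.42,3.18,0.31,0.64\n"
)

rows = [
    {k: float(v) for k, v in row.items()}
    for row in csv.DictReader(export)
]

# Example query: timestamps where inferred arousal exceeds a cutoff.
high_arousal = [r["t"] for r in rows if r["arousal"] > 0.635]
```

Because signals and labels arrive in one time-indexed table, queries like this replace what would otherwise be a manual annotation pass.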
// 03 Output
Signal Output Preview
Live Simulation
30-second multimodal capture window. Five synchronized signal streams rendered in real time.
Language compresses rich internal states into discrete tokens that lose temporal resolution, contextual co-occurrence, and physiological ground truth. Annotation happens hours after the event, by someone other than the subject, using vocabulary that does not exist for most internal states. Mean inter-rater agreement for standard emotion annotation corpora rarely exceeds a Cohen's κ of 0.6, a ceiling imposed by language, not by the complexity of the underlying cognition.
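Cohen's κ corrects raw agreement for chance: κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected from each rater's label frequencies. A minimal sketch of the computation, with invented rater data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two categorical label sequences."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n       # observed
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)  # chance
    return (p_o - p_e) / (1 - p_e)

r1 = ["high", "low", "low", "high", "low", "high"]  # invented ratings
r2 = ["high", "low", "high", "high", "low", "low"]
kappa = cohens_kappa(r1, r2)  # 1/3 for this pair of label sequences
```

Note that even four agreements out of six collapse to κ of only one third once chance agreement is removed, which is why raw percent agreement overstates annotation quality.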
"A word is a lossy approximation. A synchronized biosignal array is the original file."
The Labeling Bottleneck
The quality ceiling of any model is the label quality of its training data. This constraint is particularly acute for affective computing, RLHF, and human preference modeling. Natarajan et al. (2013)¹ demonstrated that label noise induces systematic bias that degrades model accuracy non-linearly. AffectNet (Mollahosseini et al., 2017)², one of the largest affective datasets, relies on crowdsourced keyword annotations, resulting in inter-rater reliability of approximately 0.55 for compound expressions. Biosignals bypass this bottleneck by providing continuous, high-resolution, physiologically grounded labels that do not require a human annotator to agree on a word.
"The most important variable in your model is not your architecture. It is the reliability of your labels."
Controlled Environment as Scientific Instrument
Signal quality degrades catastrophically in ambulatory settings. EEG artifact rejection rates exceed 40% in mobile recordings versus under 8% in controlled environments (Gramann et al., 2011)³. Electrodermal activity in unshielded environments introduces 50/60 Hz interference that masks genuine skin conductance response events. The GTLM booth is not a limitation on ecological validity; it is a methodological requirement for scientific validity. Faraday shielding (attenuation: –60 dB at 100 kHz), acoustic dampening (–42 dB SIL), and standardized ambient lighting (200 lux ±5%) are not product features. They are the experimental controls that make the data usable.
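The decibel figures quoted above convert to linear ratios via 10^(dB/20) for amplitude quantities such as field strength or sound pressure (power quantities use dB/10). So –60 dB of Faraday shielding passes one thousandth of the incident field amplitude:

```python
def db_to_amplitude_ratio(db):
    """Convert a decibel figure to a linear amplitude ratio.
    Applies to amplitude quantities; power quantities use db / 10."""
    return 10 ** (db / 20)

rf_passed = db_to_amplitude_ratio(-60)        # 0.001 of incident amplitude
acoustic_passed = db_to_amplitude_ratio(-42)  # about 0.008 of sound pressure
```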
Natarajan, N., Dhillon, I. S., Ravikumar, P. K., & Tewari, A. (2013). Learning with noisy labels. Advances in Neural Information Processing Systems (NeurIPS), 26.
Mollahosseini, A., Hasani, B., & Mahoor, M. H. (2017). AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. IEEE Transactions on Affective Computing, 10(1), 18–31.
Gramann, K., Gwin, J. T., Ferris, D. P., Oie, K., Jung, T.-P., Lin, C.-T., & Makeig, S. (2011). Cognition in action: imaging brain/body dynamics in mobile humans. Reviews in the Neurosciences, 22(6), 593–608.
Since you opened this page, approximately
0.0 hours of behavioral data
have been discarded by researchers using inferior annotation methods. Every session without GTLM is signal you cannot recover.
Select every variable — sensors, dimensions, materials, colour, and services. Prices shown are indicative approximations; final quote is confirmed on submission.
BUILD REF. GTLM-2026-???? · INDICATIVE CONFIG
Base Unit — included in all configurations
Faraday-shielded enclosure structure (–60 dB at 100 kHz)
Acoustic dampening (–42 dB SIL)
Climate control (21°C ±0.5°C)
Rack-mount processing unit + LSL sync hub
Internal display (stimulus delivery)
Wiring harness + cable management system
Base enclosure: €28,500
Sensors — Audio / Visual
Sensors — Neurophysiological
Sensors — Movement & Posture
Sensors — Eye Tracking
Enclosure — Dimensions
Enclosure — Exterior Material
Enclosure — Exterior Colour
Software & Services
Frequently Asked Questions
Can institutions lease a GTLM instead of purchasing?
Yes. Institutional lease arrangements are available for 6- and 12-month terms. Contact us for current availability and pricing at your location. Lease terms include your configured sensor suite, middleware license, and remote technical support.
Who owns the data, and does any of it leave the device?
All data collected using a GTLM system remains exclusively owned by the purchasing institution. GTLM Systems collects no usage telemetry. The auto-labeling engine runs entirely on-device, with no data transmission to external servers at any point during or after capture.
What space and power does the booth require?
The standard booth requires a floor area of 1.2 m × 1.2 m, a ceiling height of ≥2.5 m, and a 230 V / 16 A outlet. Compact and Extended configurations differ — see dimension options above. The unit is delivered flat-packed and assembled on-site; minimum doorway clearance required is 80 cm.
Do you provide documentation for ethics review?
Yes. The Ethics Documentation Package (available as an add-on) includes a full sensor disclosure with radiation emissions data, contact materials inventory, and participant information templates formatted for IRB, REC, and equivalent ethics bodies in EU, US, and UK jurisdictions.
What is the lead time from order to installation?
Typically 6–10 weeks from order confirmation, depending on sensor availability and delivery location. On-site installation is scheduled separately and typically occurs within 2 weeks of hardware delivery. Remote configuration is completed via video call within 5 business days of delivery.
// 06 About
About Us
Biometric research since 2013
Founded in London, we are a multidisciplinary team of computer scientists, neuroscience experts, MDs, psychologists, researchers, and hardware and software engineers. Since incorporation we have worked alongside universities, research institutions, and corporate customers to build instruments that measure what self-report cannot.
What we build
Over a decade of research has produced a suite of commercially available products — sold under our own brand and under the brands of our customers. Every product we ship is grounded in the same principle: physiological measurement is more reliable than retrospective annotation.
Who we work with
Our team has designed, developed, and deployed products for research institutions, government agencies, and Fortune 100 companies. Clients include Pedigree, Hasbro, NIH, Vodafone, and the United Nations — projects spanning consumer insight, clinical research, human factors, and applied neuroscience.
Selected clients & partners
Pedigree · Hasbro · NIH · Vodafone · United Nations
For investors
We are open to conversations with venture capital firms and strategic investors interested in the future of affective computing, AI training data infrastructure, and precision biometric capture. Get in touch for demos, case studies, and a detailed briefing on the commercial pipeline.
Lab visits are available at our partner facilities in four cities. Each session is 90 minutes and includes a full data capture demonstration, live signal visualization, and a post-session signal debrief with a GTLM researcher.