Ground Truth Labeling Machine · Precision Biosignal Capture
Your labels are the bottleneck.
GTLM — Ground Truth Labeling Machine — is a controlled-environment biometric capture booth that automatically labels physiological and behavioral data at the resolution research has always needed, but never had.
Words are a lossy codec. Synchronized biosignals are the original file.
Most affective datasets use self-report or rater annotation. That label is a compressed, delayed, language-encoded reconstruction of a physiological event. It is not the thing. It is a description of a memory of the thing.
| | Standard workflow | With GTLM |
| --- | --- | --- |
| Label source | Human rater or self-report, post-session | Concurrent physiological measurement |
| Temporal resolution | 1-30 seconds per annotation | Sub-millisecond, all channels |
| Label type | Categorical (e.g. arousal 1-7) | Continuous, 32+ channels simultaneously |
| Inter-rater agreement | κ ≈ 0.4-0.6 | N/A — deterministic |
| Annotation time | 3-10× session duration | Zero — real time |
| Label entropy | ~3 bits per annotation | ~256 kbps continuous stream |
| What the model trains on | Description of a memory of a state | The state itself, measured directly |
Sample output · single frame · t = 12.847 s into session
Training on this means optimizing against a signal that was physically present, at the moment it was present — not against a rating someone gave 30 seconds later.
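As a static illustration of what one such labeled frame contains, the sketch below invents its field names and values; the real export schema is documented with each dataset.

```python
# Hypothetical single frame at t = 12.847 s. Field names and values are
# illustrative only, not the documented GTLM export schema.
frame = {
    "t": 12.847,                        # seconds on the shared session clock
    "raw": {                            # level (a): measured signal values
        "ecg_mv": 0.42,
        "eda_us": 3.18,
        "eeg_uv": [11.2, -4.7, 8.9],    # subset of EEG channels
    },
    "derived": {                        # level (b): computed metrics
        "hrv_rmssd_ms": 38.5,
        "scr_amplitude_us": 0.07,
    },
    "inferred": {                       # level (c): affective/cognitive estimate
        "valence": 0.31,
        "arousal": 0.64,
        "dominance": 0.22,
    },
    "behavior": {                       # level (d): behavioral annotation
        "gaze_xy": (0.44, 0.61),
        "action_units": ["AU12"],
    },
}
```

Every frame in a session carries the same four layers, all indexed to the same clock, which is what makes the stream usable as a dense label without post-hoc annotation.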
In plain language
What this is and why it exists
When researchers want to understand how someone feels, they usually ask them. But asking changes the answer. People guess, round up, misremember. The body does not have that problem.
The booth
GTLM is an enclosed space, roughly phone-booth-sized. A participant sits inside for a session. Ten sensors record what the body is doing: heart rate, brain electrical activity, sweat response, eye movement, breathing, and muscle tension. None of it requires the participant to do anything except be there.
The labeling problem
Raw physiological data is not useful on its own. Every recording needs to be annotated: what was happening at each moment, what the stimulus was, what the context was. In standard research, someone does this by hand. It takes weeks and introduces error. GTLM does it automatically, in real time, as the session runs.
What you receive
At the end of a session: one structured file. Every signal synchronized to a single clock. Each time point labeled with four layers of context. No manual cleanup, no raw wrangling. The data is ready to use.
Hardware
The Booth, Up Close
GTLM PRO, active capture session. EEG array, eye tracker, and depth camera visible on subject and booth walls.
Exterior: matte anodized aluminum chassis with Faraday-shielded smoked glass door. Footprint: 1.2 m × 1.2 m.
// 01 Specifications
Technical Specifications
Sensor Catalogue
Every GTLM shares the same booth enclosure, Faraday shielding, LSL middleware, and auto-labeling engine. The sensor suite is fully configurable — select any combination in the Build section below.
SPECIMEN GTLM·SC·2026
All sensors available for GTLM configuration, consumer and medical grade. Select any combination in the Build section below.
Audio / Visual

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Logitech Brio 4K | Facial capture | 60 fps | 4K UHD | RGB video |
| ReSpeaker 4-Mic Array | Spatial audio | 48 kHz | 24-bit | PCM audio |
| Intel RealSense D455 | Depth / proxemics | 90 fps | 848×480 | RGB-D |
Neurophysiological

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Emotiv EPOC X | EEG | 128 Hz | 14 channels | μV |
| Empatica E4 | GSR / EDA | 4 Hz | 0.01 μS | μS |
| Polar H10 | ECG / HRV | 130 Hz | R-R interval | ms |
| Garmin HRM-Pro | SpO₂ / PPG | 1 Hz | 1% | % |
| Hexoskin belt | Respiration | 128 Hz | chest excursion | a.u. |
Movement / Posture

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Noraxon IMU | Posture / IMU | 200 Hz | 6-DOF | deg/s, g |
| Zebris FDM | Floor pressure | 120 Hz | 4 sensors/cm² | N/cm² |
Eye Tracking

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Tobii Eye Tracker 5 | Oculometry | 90 Hz | 0.4° accuracy | gaze XY |
Neurophysiological — Medical Grade

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| BrainProducts LiveAmp 32 | EEG | 500 Hz | 32 channels | μV |
| Biopac ECG100C | 12-lead ECG | 1000 Hz | clinical | mV |
| Biopac EDA100C | Medical GSR | 2000 Hz | 0.001 μS | μS |
| Masimo EMMA | Capnography | 100 Hz | EtCO₂ mmHg | mmHg |
| cEEGrid array | Around-ear EEG | 500 Hz | 10 channels | μV |
Movement / Posture — Medical Grade

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Xsens DOT (×8) | Full-body mocap | 120 Hz | <1° RMS | deg, m |
| Delsys Trigno Wireless | EMG (8–16ch) | 2000 Hz | 16-bit | mV |
Voice / Affect

| Device | Modality | Rate | Resolution | Signal |
| --- | --- | --- | --- | --- |
| Shure MXA310 | Clinical mic array | 96 kHz | 24-bit | PCM |
| On-device prosody engine | Prosody F0/jitter | real-time | 1 Hz | Hz, % |
// 02 Process
Process
From Subject to Dataset
Six steps from booth entry to structured multimodal output. Every step is standardized, reproducible, and auditable.
01
Enter the Booth
Subject steps in. Acoustically dampened (–42 dB), Faraday-shielded, climate controlled at 21°C ±0.5°C. No external noise, no signal bleed, no electromagnetic interference. The enclosure is the experiment's first control: it holds constant every environmental factor that would otherwise confound the measurement.
02
Sensor Initialization
All devices boot and synchronize via GTLM's proprietary LSL-compatible middleware. Baseline calibration runs for 90 seconds of resting-state capture. Impedance checks, electrode contact verification, and sampling rate confirmation are performed automatically. Any sensor below threshold triggers an alert before the session begins.
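As a sketch of what such a pre-session gate does, the snippet below flags any EEG channel whose electrode impedance exceeds a cutoff; the threshold value and function names are assumptions, not the actual GTLM middleware API.

```python
# Illustrative pre-session quality gate. The 20 kΩ cutoff is a commonly
# used EEG contact-quality threshold, not a documented GTLM value.
IMPEDANCE_LIMIT_KOHM = 20.0

def failing_channels(impedances_kohm):
    """Return the channels whose electrode impedance exceeds the limit."""
    return [ch for ch, z in impedances_kohm.items() if z > IMPEDANCE_LIMIT_KOHM]

readings = {"Fp1": 8.2, "Fp2": 31.5, "Cz": 12.0}  # invented values, in kΩ
alerts = failing_channels(readings)  # non-empty would block session start
```

In the described workflow, a non-empty alert list is raised to the experimenter before the session begins, so a poorly seated electrode never contaminates a recording.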
03
Session Begins
Subject interacts with a presented stimulus: video, audio, structured task, interview protocol, or physical product. Session duration is configurable between 5 and 120 minutes. The experimenter monitors signal quality via the external dashboard in real time without entering the booth or disturbing the electromagnetic environment.
04
Live Capture
Every sensor streams simultaneously. The LSL sync layer timestamps all streams to microsecond precision, enabling true multimodal temporal alignment. Jitter between streams is below 1 ms. Data is written to NVMe storage in real time with redundant backup. No network dependency during capture.
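The sync layer itself is proprietary, but the core alignment idea can be sketched: once every stream is stamped on the same clock, a sample from any stream can be looked up by nearest timestamp. The rates below match the catalogued Polar H10 (130 Hz) and Empatica E4 (4 Hz); the data itself is invented.

```python
import bisect

def nearest_sample(timestamps, values, t):
    """Return the value whose (sorted) timestamp is closest to t."""
    i = bisect.bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return values[i] if after - t < t - before else values[i - 1]

# Two streams on one session clock: ECG at 130 Hz, EDA at 4 Hz (10 s each).
ecg_t = [k / 130.0 for k in range(1300)]
eda_t = [k / 4.0 for k in range(40)]
ecg_v = list(range(1300))   # stand-in sample values
eda_v = list(range(40))

t = 5.003  # any query time on the shared clock
aligned = (nearest_sample(ecg_t, ecg_v, t), nearest_sample(eda_t, eda_v, t))
```

Because all streams share one reference clock, this lookup is exact rather than heuristic; no cross-correlation or drift correction is needed at analysis time.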
05
Auto-Labeling Engine
Proprietary software analyzes each timepoint across all modalities simultaneously. The output follows a four-level label hierarchy:
(a) Raw signal state: measured physiological values at each sample.
(b) Derived physiological state: computed metrics such as HRV, SCR amplitude, and EEG band power.
(c) Inferred affective/cognitive state: Valence-Arousal-Dominance coordinates and categorical probability distributions.
(d) Behavioral annotation: action units, postural events, gaze fixations, vocal events.
All labels are timestamped to the same microsecond reference frame as the raw signals.
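The derived level can be made concrete with one example metric. RMSSD is a standard time-domain HRV statistic computed from successive R-R intervals; a minimal sketch, with invented intervals in the millisecond units the sensor catalogue lists for ECG/HRV:

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive differences between R-R intervals (ms).
    A standard time-domain heart-rate-variability metric."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

rr = [812, 790, 805, 821, 798]  # five consecutive R-R intervals (ms), invented
hrv = rmssd(rr)                 # about 19.3 ms for these intervals
```

A derived metric like this is deterministic given the raw stream, which is why the comparison table can call the labels reproducible rather than rater-dependent.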
06
Export
Output is a structured dataset: time-indexed multimodal signals paired with label columns at every temporal resolution. Available export formats: CSV · Parquet · HDF5 · JSON-LD. Schema documentation is included with every export. Datasets are ready for ingestion into research pipelines without further preprocessing.
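A minimal sketch of downstream ingestion, assuming a hypothetical flat CSV layout; the column names below are invented, and the real schema is whatever ships in the export's documentation.

```python
import csv
import io

# Two-row excerpt standing in for a GTLM CSV export. Column names are
# illustrative only; consult the bundled schema documentation.
export = io.StringIO(
    "t,ecg_mv,eda_us,valence,arousal\n"
    "12.846,0.41,3.17,0.30,0.63\n"
    "12.847,0.42,3.18,0.31,0.64\n"
)

rows = [
    {k: float(v) for k, v in row.items()}
    for row in csv.DictReader(export)
]

# Example query: timestamps where inferred arousal exceeds a cutoff.
high_arousal = [r["t"] for r in rows if r["arousal"] > 0.635]
```

Because signals and labels arrive in one time-indexed table, queries like this replace what would otherwise be a manual annotation pass.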
// 03 Output
Signal Output Preview
Live Simulation
30-second multimodal capture window. Five synchronized signal streams rendered in real time.
Language compresses rich internal states into discrete tokens that lose temporal resolution, contextual co-occurrence, and physiological ground truth. Annotation happens hours after the event, by someone other than the subject, using vocabulary that does not exist for most internal states. Mean inter-rater agreement for standard emotion annotation corpora rarely exceeds a Cohen's κ of 0.6, a ceiling imposed by language, not by the complexity of the underlying cognition.
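Cohen's κ corrects raw agreement for chance: κ = (p_o − p_e) / (1 − p_e), where p_o is observed agreement and p_e is the agreement expected from each rater's label frequencies. A minimal sketch of the computation, with invented rater data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two categorical label sequences."""
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n       # observed
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)  # chance
    return (p_o - p_e) / (1 - p_e)

r1 = ["high", "low", "low", "high", "low", "high"]  # invented ratings
r2 = ["high", "low", "high", "high", "low", "low"]
kappa = cohens_kappa(r1, r2)  # 1/3 for this pair of label sequences
```

Note that even four agreements out of six collapse to κ of only one third once chance agreement is removed, which is why raw percent agreement overstates annotation quality.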
"A word is a lossy approximation. A synchronized biosignal array is the original file."
The Labeling Bottleneck
The quality ceiling of any model is the label quality of its training data. This constraint is particularly acute for affective computing, RLHF, and human preference modeling. Natarajan et al. (2013)¹ demonstrated that label noise induces systematic bias that degrades model accuracy non-linearly. AffectNet (Mollahosseini et al., 2017)², one of the largest affective datasets, relies on crowdsourced keyword annotations, resulting in inter-rater reliability of approximately 0.55 for compound expressions. Biosignals bypass this bottleneck by providing continuous, high-resolution, physiologically grounded labels that do not require a human annotator to agree on a word.
"The most important variable in your model is not your architecture. It is the reliability of your labels."
Controlled Environment as Scientific Instrument
Signal quality degrades catastrophically in ambulatory settings. EEG artifact rejection rates exceed 40% in mobile recordings versus under 8% in controlled environments (Gramann et al., 2011)³. Electrodermal activity in unshielded environments introduces 50/60 Hz interference that masks genuine skin conductance response events. The GTLM booth is not a limitation on ecological validity; it is a methodological requirement for scientific validity. Faraday shielding (attenuation: –60 dB at 100 kHz), acoustic dampening (–42 dB SIL), and standardized ambient lighting (200 lux ±5%) are not product features. They are the experimental controls that make the data usable.
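The decibel figures quoted above convert to linear ratios via 10^(dB/20) for amplitude quantities such as field strength or sound pressure (power quantities use dB/10). So –60 dB of Faraday shielding passes one thousandth of the incident field amplitude:

```python
def db_to_amplitude_ratio(db):
    """Convert a decibel figure to a linear amplitude ratio.
    Applies to amplitude quantities; power quantities use db / 10."""
    return 10 ** (db / 20)

rf_passed = db_to_amplitude_ratio(-60)        # 0.001 of incident amplitude
acoustic_passed = db_to_amplitude_ratio(-42)  # about 0.008 of sound pressure
```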
Natarajan, N., Dhillon, I. S., Ravikumar, P. K., & Tewari, A. (2013). Learning with noisy labels. Advances in Neural Information Processing Systems (NeurIPS), 26.
Mollahosseini, A., Hasani, B., & Mahoor, M. H. (2017). AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild. IEEE Transactions on Affective Computing, 10(1), 18–31.
Gramann, K., Gwin, J. T., Ferris, D. P., Oie, K., Jung, T.-P., Lin, C.-T., & Makeig, S. (2011). Cognition in action: imaging brain/body dynamics in mobile humans. Reviews in the Neurosciences, 22(6), 593–608.
Since you opened this page, approximately
0.0 hours of behavioral data
have been discarded by researchers using inferior annotation methods. Every session without GTLM is signal you cannot recover.
Select every variable — sensors, dimensions, materials, colour, and services. Prices shown are indicative approximations; final quote is confirmed on submission.
BUILD REF. GTLM-2026-???? · INDICATIVE CONFIG
Base Unit — included in all configurations
Faraday-shielded enclosure structure (–60 dB at 100 kHz)
Acoustic dampening (–42 dB SIL)
Climate control (21°C ±0.5°C)
Rack-mount processing unit + LSL sync hub
Internal display (stimulus delivery)
Wiring harness + cable management system
Base enclosure: €28,500
Sensors — Audio / Visual
Sensors — Neurophysiological
Sensors — Movement & Posture
Sensors — Eye Tracking
Enclosure — Dimensions
Enclosure — Exterior Material
Enclosure — Exterior Colour
Software & Services
Frequently Asked Questions
Can institutions lease a GTLM instead of purchasing?
Yes. Institutional lease arrangements are available for 6- and 12-month terms. Contact us for current availability and pricing at your location. Lease terms include your configured sensor suite, middleware license, and remote technical support.
Who owns the data, and does any of it leave the device?
All data collected using a GTLM system remains exclusively owned by the purchasing institution. GTLM Systems collects no usage telemetry. The auto-labeling engine runs entirely on-device, with no data transmission to external servers at any point during or after capture.
What space and power does the booth require?
The standard booth requires a floor area of 1.2 m × 1.2 m, a ceiling height of ≥2.5 m, and a 230 V / 16 A outlet. Compact and Extended configurations differ — see dimension options above. The unit is delivered flat-packed and assembled on-site; minimum doorway clearance required is 80 cm.
Do you provide documentation for ethics review?
Yes. The Ethics Documentation Package (available as an add-on) includes a full sensor disclosure with radiation emissions data, contact materials inventory, and participant information templates formatted for IRB, REC, and equivalent ethics bodies in EU, US, and UK jurisdictions.
What is the lead time from order to installation?
Typically 6–10 weeks from order confirmation, depending on sensor availability and delivery location. On-site installation is scheduled separately and typically occurs within 2 weeks of hardware delivery. Remote configuration is completed via video call within 5 business days of delivery.
// 06 About
About Us
Biometric research since 2013
Founded in London, we are a multidisciplinary team of computer scientists, neuroscience experts, MDs, psychologists, researchers, and hardware and software engineers. Since incorporation we have worked alongside universities, research institutions, and corporate customers to build instruments that measure what self-report cannot.
What we build
Over a decade of research has produced a suite of commercially available products — sold under our own brand and under the brands of our customers. Every product we ship is grounded in the same principle: physiological measurement is more reliable than retrospective annotation.
Who we work with
Our team has designed, developed, and deployed products for research institutions, government agencies, and Fortune 100 companies. Clients include Pedigree, Hasbro, NIH, Vodafone, and the United Nations — projects spanning consumer insight, clinical research, human factors, and applied neuroscience.
Selected clients & partners
Pedigree · Hasbro · NIH · Vodafone · United Nations
For investors
We are open to conversations with venture capital firms and strategic investors interested in the future of affective computing, AI training data infrastructure, and precision biometric capture. Get in touch for demos, case studies, and a detailed briefing on the commercial pipeline.
Lab visits are available at our partner facilities in four cities. Each session is 90 minutes and includes a full data capture demonstration, live signal visualization, and a post-session signal debrief with a GTLM researcher.