A quarterly clinic model cannot see what determines outcomes.
Glaucosim is a browser-and-phone layer that runs a longitudinal home cadence of clinically grounded tests, captures medication adherence, and surfaces a trend to the eye-care professional before the next appointment.
Most home tools cover one test and depend on dedicated hardware. Glaucosim runs a multi-modality session on the devices the patient already owns.
Continuous IOP needs an implant or a contact lens (Eyemate, Triggerfish). Home perimetry needs a VR headset or a tablet kiosk (Olleyes, Heru, RadiusXR, Imo Vifa). Home anterior-segment imaging needs a clip-on lens. Home tools that ship without dedicated hardware cover a single test — refraction (EyeQue, Easee), VF (MRF, iPad ZEST), or screening (Peek Vision).
Glaucosim is the only point in the top-right quadrant covering visual function + anterior segment + IOP screen (β) + adherence in one home session, on devices the patient already owns. Peek Vision is the closest conceptual peer but is built for community-screening triage, not longitudinal glaucoma monitoring.
Per-measurement precision is lower than instrument-bound counterparts. The trade is an order-of-magnitude increase in sampling cadence, and slope-estimate variance falls as 1/n³ when test occasions are added [5].
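The 1/n³ scaling is the ordinary least-squares slope variance under an assumed linear trend with equally spaced test occasions — a sketch:

```latex
\operatorname{Var}(\hat{\beta})
  = \frac{\sigma^2}{\sum_{i=1}^{n} (t_i - \bar{t})^2},
\qquad
t_i = i\,\Delta
\;\Rightarrow\;
\sum_{i=1}^{n} (t_i - \bar{t})^2
  = \frac{\Delta^2\, n(n^2 - 1)}{12}
  \approx \frac{\Delta^2 n^3}{12},
\quad\text{so}\quad
\operatorname{Var}(\hat{\beta}) \approx \frac{12\,\sigma^2}{\Delta^2 n^3}.
```

Tripling the number of occasions at a fixed cadence Δ cuts slope-estimate variance by roughly 27× — the case for trading per-test precision for sampling density.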
How do we run clinical-grade tests remotely, without dedicated devices, and still trust the data?
Visual acuity, contrast and perimetry each assume a different luminance window. A test outside its window is not interpretable.
Stimulus angle, optotype size, and pixel pitch all depend on the patient-to-screen distance — and on which eye is actually being tested.
Peripheral perimetry assumes central fixation. A 4° saccade away from target makes the stimulus land at the wrong location.
No additional hardware. No data leaves the device until results are signed and synced.
Iris-pinhole projection from MediaPipe FaceMesh.
EAR + hand-landmark + iris occlusion fusion.
Iris-relative-to-canthi, Kalman-filtered.
Calibrated webcam-mean luminance proxy + glare.
Anterior-segment focus, exposure, framing scorer.
Each module reads all five channels before allowing a trial. Out-of-band readings prompt re-positioning or invalidate the affected stimulus. Every event is logged for retrospective audit.
We use the interpupillary distance (IPD) as the real-world anchor — the population mean for adults is 63 mm (SD ~3.5 mm) [7]. MediaPipe FaceMesh returns the two iris-center landmarks (468 left, 473 right). We measure the IPD in pixels and recover patient-to-screen distance from the pinhole projection.
d = fpx · IPD / IPDpx — d: patient-to-camera distance (mm) · fpx: camera focal length (px), recovered with a one-time on-screen calibration step · IPD: real interpupillary distance, fixed at 63 mm · IPDpx: live pixel distance between iris centers (FaceMesh 468 ↔ 473).
Why IPD and not iris diameter: the iris edge is harder to segment reliably under variable lighting and lashes, while iris centers are detected by MediaPipe with sub-pixel stability and remain visible even when the lid covers part of the limbus.
SIMILAR TRIANGLES · REAL IPD FIXED AT 63 MM · PIXEL IPD INVERTS WITH DISTANCE
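The similar-triangles relation, as a minimal sketch (helper names are illustrative; the production logic lives in Glaucosim's calibration code):

```javascript
// Pinhole-projection distance recovery from interpupillary distance.
// Assumes the population-mean IPD of 63 mm as the real-world anchor.
const IPD_MM = 63;

// fPx: camera focal length in pixels, from the one-time on-screen
//      calibration step.
// ipdPx: live pixel distance between FaceMesh iris centers (468, 473).
function distanceMm(fPx, ipdPx) {
  // Similar triangles: real IPD / distance = pixel IPD / focal length.
  return (fPx * IPD_MM) / ipdPx;
}
```

Pixel IPD inverts with distance: halving the measured pixel span doubles the recovered d.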
Monocular tests assume the operator knows which eye is tested. At home, a left-eye trial labelled as right-eye produces a clean, plausible, incorrect record. Glaucosim verifies cover state from three independent signals — any single one of which is brittle alone.
Eye Aspect Ratio (EAR): open eye ≈ 0.27–0.32; closed eye < 0.15. Threshold calibrated per subject over a 25-frame baseline at session start [8].
PER-EYE STATE GATES EVERY STIMULUS · LIVE @ ~30 HZ
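The EAR leg of the fusion can be sketched from landmark geometry alone (per-subject threshold calibration and the other two signals are omitted here):

```javascript
// Eye Aspect Ratio from six eyelid landmarks (Soukupova & Cech).
// p1..p6 are [x, y] points: p1/p4 the eye corners, p2/p3 the upper
// lid, p6/p5 the lower lid. Landmark indices come from FaceMesh.
const dist = (a, b) => Math.hypot(a[0] - b[0], a[1] - b[1]);

function eyeAspectRatio([p1, p2, p3, p4, p5, p6]) {
  // Vertical lid gaps normalised by the horizontal eye width:
  // open eye keeps EAR high; a closed lid collapses the numerator.
  return (dist(p2, p6) + dist(p3, p5)) / (2 * dist(p1, p4));
}
```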
Perimetry assumes the patient is looking at the central target. If gaze drifts, the stimulus meant to land at 21° lands at 17° or 25°, and the threshold at the labelled location is wrong without the algorithm knowing.
Gaze is computed as iris position relative to the eye corners, in a head-relative frame — so translating the head does not move the vector; only a saccade does. A 1-D Kalman filter is applied to each component, with measurement noise inflated during blinks.
Reads as g = (xiris − xcenter) / weye: the offset of the iris center from the eye's center, normalised by the width of the eye opening.
After a 30-frame baseline g₀ captured at session start, drift is Δ = g − g₀. Stimuli presented while ‖Δ‖ > 4° are flagged and excluded from the ZEST posterior update. Heijl-Krakau blind-spot catch trials run in parallel for the standard reliability indices.
DRIFTED STIMULI ARE DROPPED FROM THE BAYESIAN POSTERIOR · FL / FP / FN COMPUTED IN PARALLEL
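A minimal sketch of the per-component 1-D filter; the noise values q and r are illustrative, not Glaucosim's tuned constants:

```javascript
// 1-D Kalman filter over one gaze component, random-walk state model.
// During blinks the measurement noise R is inflated so the estimate
// coasts on the prediction instead of chasing a corrupted measurement.
function makeKalman({ q = 1e-3, r = 1e-2 } = {}) {
  let x = 0; // state estimate (gaze component)
  let p = 1; // estimate variance
  return function update(z, { blink = false } = {}) {
    const R = blink ? r * 100 : r; // inflate measurement noise on blink
    p += q;                        // predict: variance grows by q
    const k = p / (p + R);         // Kalman gain
    x += k * (z - x);              // correct toward measurement z
    p *= 1 - k;
    return x;
  };
}
```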
Visual function thresholds are luminance-dependent. Acuity assumes ISO 8596 background; Pelli-Robson assumes ~85 cd/m²; perimetry assumes a dim room so stimulus contrast reaches operating range.
Glaucosim derives an operational ambient proxy from the webcam: mean greyscale intensity of the central patch, exposure-compensated, calibrated against an on-screen reference step at session start.
Lproxy ≈ k · ⟨Igrey⟩ / ecam — ⟨Igrey⟩: mean intensity of central patch · ecam: camera exposure from MediaStream constraints · k: per-device constant from a 5 s on-screen reference.
EACH TEST DEFINES ITS OWN WINDOW · OUT-OF-WINDOW SESSIONS ARE TAGGED ADVISORY OR REJECTED
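A sketch of the gating logic; k, the window bounds, and the helper names are illustrative:

```javascript
// Ambient-luminance proxy from the webcam (sketch).
// meanGrey: mean greyscale intensity of the central patch (0-255)
// exposure: camera exposure read from MediaStream constraints
// k: per-device constant from the 5 s on-screen reference step
function ambientProxy(meanGrey, exposure, k) {
  return (k * meanGrey) / exposure; // exposure-compensated intensity
}

// Each test defines its own window; out-of-window sessions are
// tagged advisory or rejected before any trial runs.
function gateSession(proxy, [lo, hi]) {
  return proxy >= lo && proxy <= hi ? "ok" : "out-of-window";
}
```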
A patient's phone records a short anterior-segment clip per take. To be useful for surface review, each frame has to be in focus, well exposed, and framed on the iris. A quality scorer runs over every frame so the patient is guided in real time.
Q = α·Fvar + β·Ehist + γ·Riris − δ·Mblur — Fvar: Laplacian variance (focus) · Ehist: exposure flatness · Riris: iris coverage from FaceMesh ROI · Mblur: motion blur from optical-flow magnitude.
Only takes that pass the threshold are kept. The voice avatar tells the patient to come a little closer, hold still, or retake.
ONE FRAME PER EYE · PHONE OR LAPTOP · IMAGES ENCRYPTED AT REST
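A sketch of the per-frame scorer combining the four features; the weights α…δ and the keep/retake threshold here are placeholders, not the shipped constants:

```javascript
// Per-frame quality score: Q = a*fVar + b*eHist + g*rIris - d*mBlur.
// Features are assumed pre-normalised to [0, 1]; weights illustrative.
function qualityScore({ fVar, eHist, rIris, mBlur },
                      { a = 0.4, b = 0.2, g = 0.3, d = 0.1 } = {}) {
  return a * fVar + b * eHist + g * rIris - d * mBlur;
}

// Real-time guidance: only takes above threshold are kept; otherwise
// the voice avatar asks the patient to retake.
function guidance(q, threshold = 0.6) {
  return q >= threshold ? "keep" : "retake";
}
```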
ZEST Bayesian adaptive thresholding on the 54-location grid. Same family as SITA.
At each of the 54 locations, the threshold is treated as a probability distribution, not a single number. Every stimulus shifts that distribution toward the patient's true value. The test stops at a given location only when the distribution is tight enough to commit.
Termination when posterior SD < 1.5 dB. Drifted-gaze stimuli (Model 03) are dropped from the update.
Turpin showed ZEST ≈ SITA in threshold accuracy with fewer presentations [9]. Schulz validated iPad ZEST against HFA in glaucoma [10].
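The per-location update can be sketched as a discrete Bayesian filter; the psychometric-function parameters here are illustrative:

```javascript
// P(seen | threshold): the stimulus at stimDb attenuation is seen when
// the location's sensitivity (threshDb) exceeds it, softened by a
// logistic slope plus false-positive/false-negative floors.
function psi(stimDb, threshDb, { slope = 1, fp = 0.03, fn = 0.03 } = {}) {
  const p = 1 / (1 + Math.exp(-(threshDb - stimDb) / slope));
  return fp + (1 - fp - fn) * p;
}

// One ZEST trial: multiply the pmf over candidate thresholds (dB) by
// the likelihood of the observed response, then renormalise.
function zestUpdate(pmf, dbs, stimDb, seen) {
  const post = pmf.map((p, i) => {
    const ps = psi(stimDb, dbs[i]);
    return p * (seen ? ps : 1 - ps);
  });
  const z = post.reduce((s, v) => s + v, 0);
  return post.map(v => v / z);
}

// Terminate at a location once the posterior SD drops below 1.5 dB.
function posteriorMeanSd(pmf, dbs) {
  const mean = dbs.reduce((s, d, i) => s + d * pmf[i], 0);
  const variance = dbs.reduce((s, d, i) => s + (d - mean) ** 2 * pmf[i], 0);
  return { mean, sd: Math.sqrt(variance) };
}
```

A "seen" response at a given attenuation shifts posterior mass toward higher sensitivity; drifted-gaze trials are simply never passed to `zestUpdate`.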
Contactless pressure screen — the laptop speaker emits, the phone camera watches. Research-only signal. Not a replacement for Goldmann.
The eye is a viscoelastic ball under pressure. Drive it with low-frequency sound and it has a mechanical resonance whose frequency depends on the stiffness of the cornea + sclera — and within a patient, that stiffness is dominated by intraocular pressure. Higher IOP, stiffer eye, higher resonance frequency.
We don't listen for an echo with the microphone. The speaker drives the eye; the phone selfie camera tracks the iris landmarks frame-by-frame. The iris is rigidly coupled to the cornea, so its sub-pixel motion in the video is a direct read-out of the eye vibrating under acoustic excitation.
The pipeline: a linear chirp drives the eye from f0 = 12 Hz to f1 = 22 Hz over T = 5 s; the resonance peak f* is expected in 14–20 Hz; the per-patient mapping f* → mmHg lives in iop-mmhg.js.
Proposed validation at Shiley: acoustic mmHg estimate vs same-day Goldmann across IOP ranges, retest reliability over 4 weeks.
FOR RESEARCH ONLY · NOT A TONOMETER · NOT A SUBSTITUTE FOR GOLDMANN
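A sketch of the peak search over the iris-displacement trace — a single-bin DFT scanned across the expected band. Sample rate and band edges follow the parameters above; the mmHg mapping itself stays in iop-mmhg.js:

```javascript
// Power of the displacement trace at one candidate frequency
// (single-bin DFT; samples is the per-frame iris displacement).
function bandPower(samples, fs, freq) {
  let re = 0, im = 0;
  for (let n = 0; n < samples.length; n++) {
    const w = (2 * Math.PI * freq * n) / fs;
    re += samples[n] * Math.cos(w);
    im -= samples[n] * Math.sin(w);
  }
  return re * re + im * im;
}

// Scan the expected resonance band and return the peak frequency f*.
function resonancePeakHz(samples, fs, { lo = 14, hi = 20, step = 0.5 } = {}) {
  let best = lo, bestP = -Infinity;
  for (let f = lo; f <= hi; f += step) {
    const p = bandPower(samples, fs, f);
    if (p > bestP) { bestP = p; best = f; }
  }
  return best;
}
```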
ETDRS / Bailey-Lovie logMAR on a physically calibrated display, at the patient's measured distance.
A 20/20 letter is defined as one that occupies exactly 5 arcminutes of visual angle. The patient is rarely 4 m from a laptop, so the optotype is physically resized in real time to preserve that same angular subtense at the live measured distance.
Computing the letter height in millimetres is the easy step. Rendering that height correctly on a screen the browser refuses to describe is the hard one — DOM physical units (1cm, 1mm) are reference units pinned to 96 DPI, not the actual display.
Glaucosim recovers the device's pixel pitch by identifying the screen, not by asking the patient. The user agent, screen.width × screen.height, and devicePixelRatio together fingerprint the device against an internal database of iPads, iPhones, MacBooks, Android flagships and common external monitors (Studio Display, Dell UltraSharp, LG UltraFine, BenQ PD27) — each indexed to a known CSS DPI. For external displays on macOS we read the monitor label exposed by the Window Management API, which the OS derives from the EDID.
25.4 mm/inch divided by the device's CSS DPI gives mm per CSS pixel — the same CSS pixels that window.innerWidth reports. A webcam cross-check optionally validates the estimate by comparing the measured iris-pair pixel span against the size expected at the live distance. Source: core/calibration.js.
Sloan optotypes, 2-down-1-up staircase, 0.1 logMAR step, 5 reversals [12]. Clinically meaningful Δ ≈ 0.1 logMAR [13].
DISTANCE FROM MODEL 01 · OPTOTYPE HEIGHT RECOMPUTED PER FRAME
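The sizing computation, as a sketch; mmPerCssPx is assumed to come from the device-fingerprint calibration described above:

```javascript
// A 20/20 (logMAR 0) optotype subtends 5 arcminutes; each +0.1 logMAR
// scales the angular size by 10^0.1.
const ARCMIN = Math.PI / (180 * 60); // radians per arcminute

function letterHeightMm(distanceMm, logMar = 0) {
  const arcmin = 5 * Math.pow(10, logMar);
  // Physical height that subtends the target angle at this distance.
  return 2 * distanceMm * Math.tan((arcmin * ARCMIN) / 2);
}

// Convert to CSS pixels via the recovered pixel pitch; recomputed per
// frame as the live distance changes.
function letterHeightCssPx(distanceMm, mmPerCssPx, logMar = 0) {
  return letterHeightMm(distanceMm, logMar) / mmPerCssPx;
}
```

At 600 mm a 20/20 letter is under a millimetre tall, which is why the pixel-pitch calibration has to be exact.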
Pelli-Robson, age-normed. Background luminance gated by Model 04 before the run starts.
Pelli-Robson fixes letter size well above acuity threshold, then varies only one thing: contrast. Letters are shown in triplets that step down 0.15 log units of contrast. The contrast threshold is the last triplet the patient reads with at least two of three letters correct.
C: Michelson contrast — (Lmax − Lmin) / (Lmax + Lmin). Normal log CS ≈ 1.95; ≤ 1.5 is impaired [14].
CS loss often precedes detectable acuity change in early glaucoma — and is sensitive to drug-induced ocular-surface change.
LETTER SIZE FIXED · ONLY CONTRAST VARIES · LAST CORRECT TRIPLET = THRESHOLD
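The triplet progression, as a sketch; the chart layout is simplified (first triplet at full contrast) and may differ from the shipped constants:

```javascript
// Pelli-Robson steps down 0.15 log units of contrast per triplet.
function tripletContrast(tripletIndex) {
  const logCS = 0.15 * tripletIndex; // log contrast sensitivity
  return Math.pow(10, -logCS);       // contrast actually rendered
}

// Threshold = log CS of the last triplet read with >= 2/3 correct.
function logCsThreshold(correctPerTriplet) {
  let last = -1;
  correctPerTriplet.forEach((c, i) => { if (c >= 2) last = i; });
  return last < 0 ? null : 0.15 * last;
}
```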
Four graded outputs from a single frame per eye. Phone or laptop — patient picks the device.
Per-frame score combining Fvar (Laplacian focus), Ehist (exposure flatness), Riris (iris ROI coverage from FaceMesh) and Mblur (motion blur). Reported alongside the three clinical grades so reviewers see how confident the capture is.
MediaPipe FaceMesh segments the bulbar conjunctiva ROI in the primary-gaze frame. Redness index = ⟨R / (R + G + B)⟩ over the ROI, illumination-normalised against the patient's own ambient-lit cheek patch. Continuous score → ordinal Efron 0–4.
Upper-eyelid skin patch sampled in CIE L*a*b*. The melanin proxy ITA° = arctan( ( L* − 50 ) / b* ) · 180/π; we report the within-patient ΔITA° vs an infraorbital cheek reference patch, then map to the Periocular Hyperpigmentation Severity Scale (Sheth 2014).
Prostaglandin-associated periorbitopathy. MRD1 (pupil-center → upper-lid-margin) converted to mm via per-frame IPD scale, plus an upper-lid-sulcus depth proxy from shadow contrast. Mapped to the Aakalu 0–3 ordinal scale.
EVERY FRAME TAGGED WITH DEVICE · DISTANCE · LUX · Q · MODEL VERSION · TIME
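The ITA° computation above, as a standalone sketch — atan2 stands in for the bare arctan so the angle stays defined as b* → 0:

```javascript
// Individual Typology Angle from a CIE L*a*b* skin patch.
// lab: { L, b }. Lower ITA = darker/more pigmented skin.
function itaDegrees({ L, b }) {
  return (Math.atan2(L - 50, b) * 180) / Math.PI;
}

// Within-patient delta: upper-lid patch vs infraorbital cheek
// reference, isolating periocular darkening from baseline skin tone.
function deltaIta(lidLab, cheekLab) {
  return itaDegrees(lidLab) - itaDegrees(cheekLab);
}
```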
V0 ships with hand-engineered features per output. V1+ is a multi-task CNN, on the three clinical grades only, trained on labels the clinician writes in the dashboard. Image quality stays deterministic.
Each grading is computed deterministically from MediaPipe landmarks + per-pixel color in a stable ROI. Calibrated against published reference photographs of each scale.
Q = α·Fvar + β·Ehist + γ·Riris − δ·Mblur · gates retake in real time · reported alongside the three grades so reviewers see the capture's confidence.
ROI = bulbar conjunctiva mask (MediaPipe) · feature = ⟨R / (R+G+B)⟩ normalised against the patient's cheek patch · ordinal map to Efron 0–4 via reference-photo LUT.
Upper-lid skin patch in CIE L*a*b* · feature = ITA° + ΔITA° vs cheek · ordinal map to POHSS 0–3.
FaceMesh upper-lid + pupil → MRD1 (mm) via per-frame IPD scale · sulcus shadow contrast as a depth proxy · ordinal map to Aakalu PAP 0–3.
Every clinician review in the dashboard adds three ordinal labels per take. The platform is the labelling tool.
Versioned model files; predictions never overwrite labels. The dashboard surface lets a fellow drag a slider to re-grade — every correction lands in the training set. UCSD-labelled corpus stays UCSD-owned.
The standard 25-item PRO, voice or tap, on a home cadence rather than annual.
25 questions split into 12 subscales — general vision, near, distance, peripheral, ocular pain, role limitations, dependency, social, mental, color, driving, plus a general-health item. Each response is rescaled 0–100. Subscale = mean of items; composite = mean of vision-targeted subscales.
Calibrated and validated by Mangione et al. 2001 [15]. The shift here is cadence: we run it every 90 days at home, so trajectory becomes visible.
VOICE OR TAP · ~7 MIN · 90-DAY CADENCE
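The scoring rule, as a sketch — items are assumed pre-rescaled to 0–100, and the subscale keys are illustrative:

```javascript
const mean = xs => xs.reduce((s, x) => s + x, 0) / xs.length;

// Subscale = mean of its items; composite = mean of the vision-targeted
// subscales, with the general-health item excluded per the VFQ-25 rule.
function vfqComposite(subscales) {
  const visionTargeted = Object.entries(subscales)
    .filter(([name]) => name !== "general_health")
    .map(([, items]) => mean(items));
  return mean(visionTargeted);
}
```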
Reminders, single-tap confirmation, structured missed-dose reason — then overlaid on visual-function trend.
Adherence is invisible because self-report at the next visit overstates it by ~31% versus objective measurement [3]. Glaucosim turns adherence into a continuously logged variable: a reminder fires at every scheduled dose, the patient confirms with one tap, and missed doses are captured with a structured reason rather than a generic apology.
The clinician dashboard overlays missed-dose density on the MD trend, so adherence vs progression is one chart — and a behavioural conversation has a concrete artefact behind it.
ADHERENCE BECOMES A VARIABLE, NOT A SELF-REPORT
[1] Olthoff CMG et al. Ophthalmology 2005
[2] Newman-Casey PA et al. Ophthalmology 2015
[3] Friedman DS et al. IOVS 2014
[4] Stagg BC et al. JAMA Ophthalmol 2022
[5] Chauhan BC et al. Br J Ophthalmol 2008
[6] Sakata R et al. Am J Ophthalmol 2021
[7] Caroline P, André M. Contact Lens Spectrum 2002
[8] Soukupová T, Čech J. CVWW 2016 (Eye Aspect Ratio)
[9] Turpin A et al. IOVS 2003 (ZEST validation)
[10] Schulz AM et al. JAMA Ophthalmol 2018 (iPad ZEST)
[11] Heijl A et al. Acta Ophthalmol 1989
[12] Bailey IL, Lovie JE. Am J Optom 1976
[13] Rosser DA et al. Br J Ophthalmol 2003
[14] Pelli DG et al. Clin Vis Sci 1988
[15] Mangione CM et al. Arch Ophthalmol 2001 (NEI VFQ-25)