# Agent Task: Generate MonStim Analyzer Demo Data for Portfolio Website
## Context
This task is for a portfolio website demo. The goal is to create a standalone Python script that generates a realistic synthetic H-reflex EMG dataset and exports it as a single demo_data.json file. No real patient/animal data is used. The JSON will be consumed by a Plotly.js interactive demo page embedded in a static GitHub Pages site — no server, no Python in the browser.
The script should live at tools/generate_demo_data.py in the MonStim-Analyzer repo.
## What You Need to Understand First
Before writing anything, read the following source files to understand existing signal processing APIs you should reuse:
- `monstim_signals/transform/filtering.py`: bandpass filter implementation
- `monstim_signals/transform/amplitude.py`: amplitude calculation methods (RMS, peak-to-trough, etc.)
- `monstim_signals/transform/plateau.py`: M-max plateau detection
- `monstim_signals/domain/recording.py`: `Recording` data model (`scan_rate`, `stim_amplitude`, `channel_types`, `raw_view`)
Read all four files in full before proceeding. The goal is to reuse these exact functions on synthetic raw waveform arrays rather than re-implementing them. This makes the demo data genuinely representative of what the real app computes.
## Script Specification: `tools/generate_demo_data.py`
### Purpose
Generate a synthetic but physiologically realistic single-session EMG H-reflex recruitment dataset and write it to tools/demo_data.json.
### Physiological Parameters (do not change these — they reflect real H-reflex biology)
| Parameter | Value | Notes |
|---|---|---|
| Sampling rate | 30,000 Hz | Typical for MonStim acquisitions |
| Recording window | 80 ms total | Pre-stim: 10 ms, post-stim: 70 ms |
| Number of sweeps / recordings | 35 | Stimulus intensities spanning recruitment curve |
| Stimulus intensities | 0.5 mA → 12.0 mA, log-spaced | Covers sub-threshold through M-max saturation |
| Stimulus delivery time | 10 ms into the window | Index = 300 samples at 30 kHz |
| M-wave onset latency post-stim | ~5–6 ms | Corresponds to ~150–180 samples post-stim |
| M-wave peak latency post-stim | ~8–10 ms | |
| M-wave duration (window) | ~6 ms | Use 5–11 ms post-stim as analysis window |
| H-wave onset latency post-stim | ~25 ms | |
| H-wave peak latency post-stim | ~28–30 ms | |
| H-wave duration (window) | ~8 ms | Use 24–32 ms post-stim as analysis window |
| Background noise floor | ~0.05 mV RMS | Gaussian white noise |
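For reference, the table above might be collected as module-level constants at the top of the script. This is a sketch; the constant names are illustrative, not from the repo:

```python
import numpy as np

SCAN_RATE = 30_000                                   # Hz
WINDOW_MS = 80.0                                     # total recording window
NUM_SAMPLES = int(SCAN_RATE * WINDOW_MS / 1000)      # 2400 samples
STIM_ONSET_MS = 10.0
STIM_INDEX = int(SCAN_RATE * STIM_ONSET_MS / 1000)   # sample 300

NUM_SWEEPS = 35
# Log-spaced stimulus intensities from 0.5 to 12.0 mA
STIM_MA = np.logspace(np.log10(0.5), np.log10(12.0), NUM_SWEEPS)

M_WINDOW_MS = (5.0, 11.0)    # post-stim M-wave analysis window
H_WINDOW_MS = (24.0, 32.0)   # post-stim H-wave analysis window
NOISE_RMS_MV = 0.05          # Gaussian noise floor
```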
### Waveform Shape
Each recording sweep should be synthesized as follows:
- Baseline noise: Gaussian white noise at 0.05 mV RMS across all 2400 samples
- Stimulus artifact: a sharp biphasic spike at the stimulus sample (index 300):
  - Duration: 3 samples
  - Amplitude: scales slightly with stimulus intensity (1–3 mV peak) — clipped by the real hardware
- M-wave (compound muscle action potential): a biphasic Gaussian-windowed waveform
  - Positive peak then negative trough (ratio ~1.5:1)
  - Peak at stim+9 ms; model as `A_m * sin(2π * 200Hz * t) * exp(-t²/(2σ²))` with σ = 1.2 ms
  - Amplitude `A_m` follows a sigmoid as a function of stimulus intensity: threshold ~2 mA, saturation ~7 mA, max amplitude ~1.2 mV peak-to-trough
  - Use: `A_m = A_m_max / (1 + exp(-k*(stim - stim_m_threshold)))` with `k = 1.2`
- H-wave (H-reflex): a smaller biphasic waveform at a longer latency
  - Peak at stim+29 ms; same shape model as the M-wave, with σ = 1.8 ms and a dominant frequency of ~100 Hz
  - Amplitude `A_h` follows an inverted-U (bell curve) over stimulus intensity: appears at ~1.5 mA, peaks around 4–5 mA (~0.4 mV), disappears above ~8 mA
  - Use: `A_h = A_h_max * exp(-((stim - stim_h_peak)**2) / (2 * sigma_h**2))` with `stim_h_peak = 4.0`, `sigma_h = 1.5`
  - Only present when `A_h > 0.02 mV` (otherwise below the noise-floor threshold)
- Bandpass filter: apply to the full waveform using the existing `monstim_signals.transform.filtering` functions (100–3500 Hz Butterworth, order 4, `filtfilt`). If importing the module is inconvenient from `tools/`, implement `scipy.signal.butter` + `filtfilt` inline with identical parameters.
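The synthesis steps above could be sketched roughly as follows. This is a simplified illustration using the inline scipy fallback the spec permits; the function name and exact artifact/wave shaping are assumptions (for instance, it does not enforce the 1.5:1 peak/trough ratio), and the real script should prefer the `monstim_signals.transform.filtering` helpers where importable:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def synthesize_sweep(stim_ma, rng, scan_rate=30_000, n=2400, stim_idx=300):
    """One synthetic sweep: noise + artifact + M-wave + H-wave, then bandpass."""
    t = np.arange(n) / scan_rate                    # time axis in seconds
    emg = rng.normal(0.0, 0.05, n)                  # 0.05 mV RMS noise floor

    # Stimulus artifact: sharp 3-sample biphasic spike, 1-3 mV with intensity
    art = 1.0 + 2.0 * min(stim_ma / 12.0, 1.0)
    emg[stim_idx] += art
    emg[stim_idx + 1] -= art
    emg[stim_idx + 2] += 0.3 * art

    # M-wave: sigmoid recruitment, Gabor-like shape centered near stim+9 ms
    a_m = 1.2 / (1.0 + np.exp(-1.2 * (stim_ma - 2.0)))
    tm = t - (stim_idx / scan_rate + 0.009)
    emg += a_m * np.sin(2 * np.pi * 200 * tm) * np.exp(-tm**2 / (2 * 0.0012**2))

    # H-wave: inverted-U recruitment, slower (~100 Hz) shape near stim+29 ms
    a_h = 0.4 * np.exp(-((stim_ma - 4.0) ** 2) / (2 * 1.5**2))
    if a_h > 0.02:
        th = t - (stim_idx / scan_rate + 0.029)
        emg += a_h * np.sin(2 * np.pi * 100 * th) * np.exp(-th**2 / (2 * 0.0018**2))

    # Inline Butterworth bandpass with the spec's parameters (100-3500 Hz, order 4)
    b, a = butter(4, [100, 3500], btype="band", fs=scan_rate)
    return filtfilt(b, a, emg)
```

Note that multiplying the sinusoid by the Gaussian envelope keeps each wave biphasic and time-localized, which is what the spec's shape model describes.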
### Amplitude Extraction
After generating the filtered waveforms, compute for each sweep:
- M-wave amplitude: RMS in the M-wave window (5–11 ms post-stim) — use the existing `amplitude.py` function
- H-wave amplitude: RMS in the H-wave window (24–32 ms post-stim)
- M-wave `peak_to_trough`: peak minus trough in the window (store this too)
- H-wave `peak_to_trough`: peak minus trough in the window
Do not use the plateau/M-max detection for the demo — just store the raw per-sweep amplitudes.
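If importing `amplitude.py` from `tools/` proves awkward, the two per-window metrics amount to the following. This helper is hypothetical; the real app's functions should be preferred so the demo matches what MonStim actually computes:

```python
import numpy as np

def window_metrics(emg_mv, scan_rate, stim_onset_ms, window_ms):
    """RMS and peak-to-trough inside a post-stimulus window (hypothetical helper)."""
    # Convert the post-stim window (ms) to absolute sample indices
    start = int(scan_rate * (stim_onset_ms + window_ms[0]) / 1000)
    stop = int(scan_rate * (stim_onset_ms + window_ms[1]) / 1000)
    seg = np.asarray(emg_mv)[start:stop]
    rms = float(np.sqrt(np.mean(seg**2)))
    p2t = float(seg.max() - seg.min())
    return rms, p2t
```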
## Output Schema: `tools/demo_data.json`
Produce exactly this JSON structure (no extra nesting, no deviation):
```jsonc
{
  "meta": {
    "scan_rate": 30000,
    "num_samples": 2400,
    "stim_onset_ms": 10.0,
    "m_window_ms": [5.0, 11.0],
    "h_window_ms": [24.0, 32.0],
    "channel_name": "Tibialis Anterior (Synthetic)",
    "generated_at": "<ISO-8601 timestamp>"
  },
  "recordings": [
    {
      "index": 0,
      "stim_ma": 0.5,
      "time_ms": [/* 2400 floats, 2 decimal places, from 0.0 to 79.97 ms */],
      "emg_mv": [/* 2400 floats, filtered waveform, 5 significant figures */],
      "m_wave": {
        "window_ms": [5.0, 11.0],
        "amplitude_rms_mv": 0.0,
        "amplitude_p2t_mv": 0.0,
        "present": false
      },
      "h_wave": {
        "window_ms": [24.0, 32.0],
        "amplitude_rms_mv": 0.0,
        "amplitude_p2t_mv": 0.0,
        "present": false
      }
    }
    /* ... 34 more recording objects ... */
  ],
  "recruitment_curve": {
    "stim_ma": [/* 35 floats */],
    "m_wave_rms_mv": [/* 35 floats */],
    "h_wave_rms_mv": [/* 35 floats */],
    "m_wave_p2t_mv": [/* 35 floats */],
    "h_wave_p2t_mv": [/* 35 floats */]
  }
}
```
Encoding rules:
- `time_ms`: round to 2 decimal places; `emg_mv`: round to 5 significant figures. This keeps the file size manageable.
- `present` flag: `true` if the wave's `amplitude_rms_mv > 0.02` (i.e., above the noise floor)
- Use Python's `json` module; do NOT serialize numpy types directly (call `.tolist()` on arrays and `round(float(x), 5)` on scalars)
- Target file size: under 1.5 MB. If the file exceeds this, reduce `num_samples` to 1200 (a 40 ms window with the same relative timing) and recalculate accordingly. Update `meta` to match whatever you choose.
- Set a fixed random seed (`np.random.seed(42)`) for reproducibility
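As a sketch of the serialization rules, rounding to significant figures can be done with a string round-trip, which also guarantees plain Python floats. The `sig5` helper name is illustrative, not from the repo, and the payload below is truncated to the `meta` block:

```python
import json
from datetime import datetime, timezone

def sig5(x):
    """Round to 5 significant figures via string round-trip (avoids numpy scalars)."""
    return float(f"{float(x):.5g}")

# Minimal payload following the schema keys in this spec (recordings omitted)
payload = {
    "meta": {
        "scan_rate": 30000,
        "num_samples": 2400,
        "stim_onset_ms": 10.0,
        "m_window_ms": [5.0, 11.0],
        "h_window_ms": [24.0, 32.0],
        "channel_name": "Tibialis Anterior (Synthetic)",
        "generated_at": datetime.now(timezone.utc).isoformat(),
    },
    "recordings": [],
    "recruitment_curve": {},
}
# Compact separators (no whitespace) help stay under the 1.5 MB budget
text = json.dumps(payload, separators=(",", ":"))
```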
## Execution & Validation
After writing the script:
- Run it: `python tools/generate_demo_data.py` from the repo root (activate the `alv_lab` conda environment first)
- Verify the output file exists at `tools/demo_data.json`
- Run these validation checks and report the results:
  - File size in KB
  - Number of recordings in the JSON
  - Max M-wave RMS amplitude across all recordings
  - Max H-wave RMS amplitude across all recordings
  - Stimulus intensity at which the H-wave is maximal
  - Stimulus intensity at which the M-wave first exceeds 0.1 mV RMS (M-threshold)
  - The first recording's `emg_mv` min and max values (sanity check on filter/noise)
- If any validation fails (e.g., the H-wave never appears, the M-wave never saturates, or the file exceeds 1.5 MB), fix the synthesis parameters and re-run before reporting results.
## Deliverables
Return the following in your response:
- The complete source of `tools/generate_demo_data.py`
- The full terminal output from running it (including validation checks)
- The complete contents of `tools/demo_data.json` (paste the entire file — it should be under 1.5 MB)
- A brief note on any deviations from this spec (e.g., if you had to adjust physiological parameters to get realistic-looking curves, explain what and why)
The agent consuming this output will use the JSON to build a Plotly.js interactive page and needs the schema to be exactly as specified. Do not invent extra fields or change key names.
