Editor's Note
gsDesignNB
>
Install
npx skills add https://github.com/keaven/gsDesignNB --skill gsDesignNBgsDesignNB — Comprehensive AI Skill
Last updated: 2026-04-30 — matches package version 0.3.2
Package purpose
gsDesignNB is an R package for planning, monitoring, and adapting clinical trials whose primary endpoint is a recurrent event count analyzed with a negative binomial (NB) model. It extends the gsDesign ecosystem to handle the special features of NB designs: information that depends on exposure and dispersion (not just event counts), protocol event gaps, calendar-time interim looks, and blinded/unblinded sample size re-estimation.
Statistical model
Conditional on a subject-level gamma frailty, events follow a Poisson process with constant rate. Marginally, the event count $Y_i$ for subject $i$ with exposure $t_i$ is:
- Mean: $\mu_i = \lambda_g t_i$
- Variance: $\mu_i + k \mu_i^2$
where $\lambda_g$ is the group rate and $k$ is the dispersion parameter. The treatment effect is the log rate ratio $\theta = \log(\lambda_2 / \lambda_1)$, estimated via NB GLM with log link and exposure offset.
Fisher information per subject: $\mathcal{I}_i = \mu_i / (1 + k \mu_i)$.
Wald vs score test
mutze_test() supports test_type = "wald" (default) and test_type = "score":
- Wald test: Fits arm-specific NB GLMs; $z = \hat\theta / \text{SE}(\hat\theta)$.
- Score test: Fits a pooled null NB GLM via
MASS::glm.nb(); score statistic $U = \sum (y_i - \hat\mu_i) / (1 + \hat k_0 \hat\mu_i)$; Fisher information $I_0 = W_1 W_2 / (W_1 + W_2)$; $z = U / \sqrt{I_0}$.
The score test is slightly more conservative under $H_0$ and better controls Type I error in small-sample / low-rate scenarios.
Two-variance sample size formula
sample_size_nbinom() supports test_type = "wald" (default) and test_type = "score":
- Wald (single variance): $n_1 = (z_\alpha + z_\beta)^2 V_1 / \theta^2$
- Score (two variance): $n_1 = (z_\alpha \sqrt{V_0} + z_\beta \sqrt{V_1})^2 / \theta^2$
where $V_1 = (1/\mu_1 + k_1)(1 + 1/r) + (1/\mu_2 + k_2)/r$ is the alternative-hypothesis variance, and $V_0 = (1/\mu_0 + k_0)(1 + 1/r)$ uses a pooled rate $\lambda_0 = (\lambda_1 + r \lambda_2) / (1 + r)$ under $H_0$.
The score formula is matched to the score test's null-variance reference distribution, but it is not automatically more powerful. In the package superiority grid, Wald/Zhu--Lakkis sizing paired with the score test preserved Type I error and gave slightly higher score-test power than score sizing because it retained a small sample-size margin. Compare both sizing rules and verify Type I error and power by simulation for the design setting.
Core workflow
sample_size_nbinom() → gsNBCalendar() → sim_gs_nbinom() or sim_ssr_nbinom() → summarize
- Fixed design:
sample_size_nbinom()computes sample size/power under planned rates, dispersion, piecewise accrual, exponential dropout, max follow-up, and event gaps. - GS design:
gsNBCalendar()converts the fixed design to agsNBobject (inheritsgsDesign) with calendar-time analysis looks and NB-specific information. - Simulation:
sim_gs_nbinom()(fixed GS timing) orsim_ssr_nbinom()(adaptive SSR) generates trial-level operating characteristics. - Summary:
check_gs_bound()+summarize_gs_sim(), orsummarize_ssr_sim().
Function reference
Sample size and design
| Function | Purpose |
|---|---|
sample_size_nbinom() | Fixed-design sample size/power; test_type = "wald" (V₁ only) or "score" (V₀+V₁) |
compute_info_at_time() | Information at a given calendar time (internal to sample_size_nbinom) |
gsNBCalendar() | Calendar-time group sequential design from fixed design result |
toInteger() | Round design to integer sample sizes (S3 method for gsNB) |
Data simulation
| Function | Purpose |
|---|---|
nb_sim() | Simulate recurrent events — piecewise accrual, exponential dropout, gamma-Poisson mixture |
nb_sim_seasonal() | Same as nb_sim() but with season-varying event rates |
Interim data handling
| Function | Purpose |
|---|---|
cut_data_by_date() | Censor simulated data at a calendar date; aggregate events/exposure per subject |
get_cut_date() | Find analysis date from calendar, event, completer, or information targets |
get_analysis_date() | Find calendar time when target event count is reached |
cut_date_for_completers() | Find calendar time when target completer count is reached |
cut_completers() | Convenience wrapper: find completer date + cut data |
Statistical analysis and information
| Function | Purpose |
|---|---|
mutze_test() | Wald or score test for treatment effect (test_type = "wald" / "score") |
calculate_blinded_info() | Blinded dispersion/rate estimation and statistical information |
estimate_nb_mom() | Method-of-moments NB parameter estimation (blinded or unblinded) |
Group sequential simulation
| Function | Purpose |
|---|---|
sim_gs_nbinom() | Monte Carlo GS trial simulation with calendar-time cuts |
check_gs_bound() | Apply GS boundaries to simulation results on chosen info scale |
summarize_gs_sim() | Operating characteristic summaries (power, futility, per-analysis) |
Sample size re-estimation
| Function | Purpose |
|---|---|
blinded_ssr() | Blinded SSR using pooled NB fit (Friede & Schmidli 2010) |
unblinded_ssr() | Unblinded SSR using arm-specific rate/dispersion estimates |
sim_ssr_nbinom() | Full adaptive simulation engine (No adaptation / Blinded / Unblinded) |
summarize_ssr_sim() | Summary of SSR simulation results |
Utilities
| Function | Purpose |
|---|---|
run_ssr_shiny() | Interactive Shiny interface for SSR exploration |
preview_pkgdown_site() | Serve docs locally with correct CSS (not pkgdown::preview_site()) |
Re-exported from gsDesign
gsDesign(), gsBoundSummary(), and all sf*() spending functions are re-exported so users
can work in one namespace.
Jensen's inequality correction for event gaps
When a protocol defines a minimum gap $\gamma$ between counted events, the naive effective rate $\lambda_{\text{eff}} = \lambda / (1 + \lambda\gamma)$ is biased upward due to Jensen's inequality applied over the gamma-distributed subject-level rates. The second-order correction is:
$$ \lambda_{\text{eff}} \approx \frac{\lambda}{1 + \lambda\gamma} \left(1 - \frac{k\lambda\gamma}{(1 + \lambda\gamma)^2}\right) $$
When it matters: $k \geq 0.5$ and gap $\geq 20$ days. A 64-scenario simulation sweep showed the naive formula underpowers by 1–2.5 pp in these regimes; the correction reduces effective-rate approximation error from ~8% to ~1%.
When it is negligible: $k \leq 0.1$ or gap $\leq 10$ days.
The same gap rule flows through sample_size_nbinom(), nb_sim(), and cut_data_by_date() so
that planning, simulation, and interim cutting are consistent.
Three kinds of "gaps"
| Concept | Scale | Package interface |
|---|---|---|
| Analysis gap $\Delta_i$ | Trial calendar | analysis_times in gsNBCalendar(), search criteria in get_cut_date() |
| Timing heterogeneity | Accrual, dropout, follow-up | accrual_rate/duration, dropout_rate, max_followup, trial_duration |
| Event gap $g$ | Patient time | event_gap argument, carried through planning → simulation → interim cut |
SSR guidance
sim_ssr_nbinom()supports strategies:"No adaptation","Blinded SSR","Unblinded SSR".bound_infocontrols which information scale drives boundaries:"unblinded_ml"(default),"blinded_ml","unblinded_mom","blinded_mom".- For SSR studies, treat the final test statistic as central. The current production SSR study uses the score test for power simulations and compares Wald vs score Type I error at the same nominal one-sided $\alpha = 0.025$.
- In that non-binding Type I comparison, Wald is mildly anti-conservative (about 2.8%--3.1%) and score is conservative (about 2.1%--2.3%). Prefer the score final test when Type I calibration is the priority, then check whether Wald sizing, a higher target power, or a modest information margin is needed for power.
- SSR adaptation is driven mainly by interim nuisance estimates and the adaptation cap; the score test reduces anti-conservative rejection after adaptation rather than eliminating the need to simulate adapted sample-size behavior.
- Blinded SSR preserves masking. Unblinded SSR may be more sample-efficient, but the relative behavior is design-specific and should be compared under the trial's nuisance range, adaptation cap, and final test statistic.
- Spending is not accelerated when observed information exceeds planned: uses
min(planned IF, actual IF). - Futility assessment is deferred until ≥ 30% of planned information.
Time-scale checklist
Before writing or reviewing code, confirm all of these use the same time unit:
enroll_rate$durationandenroll_rate$ratefail_rate$ratedropout_rate$rateanddropout_rate$durationtrial_durationmax_followup- Calendar cut times /
analysis_times event_gap
If units are mixed, fix that first.
Key arguments shared across functions
| Argument | Used in | Meaning |
|---|---|---|
lambda1, lambda2 | sample_size_nbinom() | Control and experimental event rates |
dispersion / k | Multiple | NB dispersion (scalar or arm-specific vector) |
accrual_rate, accrual_duration | sample_size_nbinom() | Piecewise enrollment |
enroll_rate | nb_sim(), sim_gs_nbinom(), sim_ssr_nbinom() | data.frame(rate, duration) |
fail_rate | Simulation functions | data.frame(treatment, rate, dispersion) |
dropout_rate | Multiple | data.frame(treatment, rate, duration) for exponential dropout |
max_followup | Multiple | Maximum per-subject follow-up |
trial_duration | sample_size_nbinom() | Total calendar time of the trial |
event_gap | sample_size_nbinom(), nb_sim(), cut_data_by_date() | Minimum gap between counted events |
rr0 | sample_size_nbinom() | Null rate ratio (default 1; set < 1 for non-inferiority) |
test_type | mutze_test(), sample_size_nbinom(), sim_gs_nbinom(), sim_ssr_nbinom() | "wald" (default) or "score" |
S3 class hierarchy
sample_size_nbinom_result: output ofsample_size_nbinom(); hasprint()andsummary().gsNB: output ofgsNBCalendar(); inherits fromgsDesign; hassummary(),toInteger().nb_sim_data/nb_sim_seasonal: output ofnb_sim()/nb_sim_seasonal(); dispatchescut_data_by_date().mutze_test: output ofmutze_test(); hasprint().
Vignettes
| Vignette | Topic |
|---|---|
sample-size-nbinom | Fixed-design NB sample size planning |
simulation-example | Basic recurrent-event data generation |
seasonal-simulation | Seasonal rate simulation |
group-sequential-simulation | Calendar-time GS design workflow |
ssr-example | Short SSR workflow |
ssr-simulation-study | Large SSR simulation case study |
completers-interim-example | Completer-based interim looks |
non-inferiority-example | Non-inferiority design |
verification-simulation | Simulation verification of sample size and test |
score-vs-wald-simulation | 2×2 comparison: Wald/score sizing × Wald/score test (interactive DT/plotly) |
ai-skills | Demonstrates how to use SKILL.md and llms.txt to guide package-native workflows |
blinded-info-diagnostics | Blinded information estimation diagnostics |
blinded-info-diagnostics | Edge cases for blinded information |
verification-simulation | Design-formula verification against simulation |
Common pitfalls
- Do not re-implement SSR logic if
sim_ssr_nbinom()covers the workflow. mutze_test()may fall back to Poisson based onpoisson_threshold; this is defensive, not a model choice.event_gapchanges both counted events and at-risk exposure; it is not cosmetic.check_gs_bound()acceptsinfo_colto select the information column; use that rather than renaming.- For pkgdown preview, use
gsDesignNB::preview_pkgdown_site(), notpkgdown::preview_site().
Disease-area calibration
Published NB trial parameters for reference:
| Therapeutic area | ~λ (events/month) | ~k | Gap |
|---|---|---|---|
| Heart failure hospitalizations | 0.3 | 0.5 | Extended hospitalization-free window |
| COPD exacerbations | 0.5–0.8 | 0.5–1.0 | ≥ 28 symptom-free days |
| MS relapses | 1–2/year | moderate–high | Variable |
| Asthma exacerbations | Variable | ~0.8 | Protocol-specific |
Dependencies
- Runtime: gsDesign (≥ 3.9.0), data.table, simtrial, MASS, stats
- Suggested: ggplot2, dplyr, gt, DT, shiny, foreach, doFuture, future, testthat, knitr