CLAUDE CODE MARKETPLACES

gsDesignNB

>

npx skills add https://github.com/keaven/gsDesignNB --skill gsDesignNB
SKILL.md

gsDesignNB — Comprehensive AI Skill

Last updated: 2026-04-30 — matches package version 0.3.2

Package purpose

gsDesignNB is an R package for planning, monitoring, and adapting clinical trials whose primary endpoint is a recurrent event count analyzed with a negative binomial (NB) model. It extends the gsDesign ecosystem to handle the special features of NB designs: information that depends on exposure and dispersion (not just event counts), protocol event gaps, calendar-time interim looks, and blinded/unblinded sample size re-estimation.

Statistical model

Conditional on a subject-level gamma frailty, events follow a Poisson process with constant rate. Marginally, the event count $Y_i$ for subject $i$ with exposure $t_i$ is:

  • Mean: $\mu_i = \lambda_g t_i$
  • Variance: $\mu_i + k \mu_i^2$

where $\lambda_g$ is the group rate and $k$ is the dispersion parameter. The treatment effect is the log rate ratio $\theta = \log(\lambda_2 / \lambda_1)$, estimated via NB GLM with log link and exposure offset.

Fisher information per subject: $\mathcal{I}_i = \mu_i / (1 + k \mu_i)$.

Wald vs score test

mutze_test() supports test_type = "wald" (default) and test_type = "score":

  • Wald test: Fits arm-specific NB GLMs; $z = \hat\theta / \text{SE}(\hat\theta)$.
  • Score test: Fits a pooled null NB GLM via MASS::glm.nb(); score statistic $U = \sum (y_i - \hat\mu_i) / (1 + \hat k_0 \hat\mu_i)$; Fisher information $I_0 = W_1 W_2 / (W_1 + W_2)$; $z = U / \sqrt{I_0}$.

The score test is slightly more conservative under $H_0$ and better controls Type I error in small-sample / low-rate scenarios.

Two-variance sample size formula

sample_size_nbinom() supports test_type = "wald" (default) and test_type = "score":

  • Wald (single variance): $n_1 = (z_\alpha + z_\beta)^2 V_1 / \theta^2$
  • Score (two variance): $n_1 = (z_\alpha \sqrt{V_0} + z_\beta \sqrt{V_1})^2 / \theta^2$

where $V_1 = (1/\mu_1 + k_1)(1 + 1/r) + (1/\mu_2 + k_2)/r$ is the alternative-hypothesis variance, and $V_0 = (1/\mu_0 + k_0)(1 + 1/r)$ uses a pooled rate $\lambda_0 = (\lambda_1 + r \lambda_2) / (1 + r)$ under $H_0$.

The score formula is matched to the score test's null-variance reference distribution, but it is not automatically more powerful. In the package superiority grid, Wald/Zhu--Lakkis sizing paired with the score test preserved Type I error and gave slightly higher score-test power than score sizing because it retained a small sample-size margin. Compare both sizing rules and verify Type I error and power by simulation for the design setting.

Core workflow

sample_size_nbinom() → gsNBCalendar() → sim_gs_nbinom() or sim_ssr_nbinom() → summarize
  1. Fixed design: sample_size_nbinom() computes sample size/power under planned rates, dispersion, piecewise accrual, exponential dropout, max follow-up, and event gaps.
  2. GS design: gsNBCalendar() converts the fixed design to a gsNB object (inherits gsDesign) with calendar-time analysis looks and NB-specific information.
  3. Simulation: sim_gs_nbinom() (fixed GS timing) or sim_ssr_nbinom() (adaptive SSR) generates trial-level operating characteristics.
  4. Summary: check_gs_bound() + summarize_gs_sim(), or summarize_ssr_sim().

Function reference

Sample size and design

FunctionPurpose
sample_size_nbinom()Fixed-design sample size/power; test_type = "wald" (V₁ only) or "score" (V₀+V₁)
compute_info_at_time()Information at a given calendar time (internal to sample_size_nbinom)
gsNBCalendar()Calendar-time group sequential design from fixed design result
toInteger()Round design to integer sample sizes (S3 method for gsNB)

Data simulation

FunctionPurpose
nb_sim()Simulate recurrent events — piecewise accrual, exponential dropout, gamma-Poisson mixture
nb_sim_seasonal()Same as nb_sim() but with season-varying event rates

Interim data handling

FunctionPurpose
cut_data_by_date()Censor simulated data at a calendar date; aggregate events/exposure per subject
get_cut_date()Find analysis date from calendar, event, completer, or information targets
get_analysis_date()Find calendar time when target event count is reached
cut_date_for_completers()Find calendar time when target completer count is reached
cut_completers()Convenience wrapper: find completer date + cut data

Statistical analysis and information

FunctionPurpose
mutze_test()Wald or score test for treatment effect (test_type = "wald" / "score")
calculate_blinded_info()Blinded dispersion/rate estimation and statistical information
estimate_nb_mom()Method-of-moments NB parameter estimation (blinded or unblinded)

Group sequential simulation

FunctionPurpose
sim_gs_nbinom()Monte Carlo GS trial simulation with calendar-time cuts
check_gs_bound()Apply GS boundaries to simulation results on chosen info scale
summarize_gs_sim()Operating characteristic summaries (power, futility, per-analysis)

Sample size re-estimation

FunctionPurpose
blinded_ssr()Blinded SSR using pooled NB fit (Friede & Schmidli 2010)
unblinded_ssr()Unblinded SSR using arm-specific rate/dispersion estimates
sim_ssr_nbinom()Full adaptive simulation engine (No adaptation / Blinded / Unblinded)
summarize_ssr_sim()Summary of SSR simulation results

Utilities

FunctionPurpose
run_ssr_shiny()Interactive Shiny interface for SSR exploration
preview_pkgdown_site()Serve docs locally with correct CSS (not pkgdown::preview_site())

Re-exported from gsDesign

gsDesign(), gsBoundSummary(), and all sf*() spending functions are re-exported so users can work in one namespace.

Jensen's inequality correction for event gaps

When a protocol defines a minimum gap $\gamma$ between counted events, the naive effective rate $\lambda_{\text{eff}} = \lambda / (1 + \lambda\gamma)$ is biased upward due to Jensen's inequality applied over the gamma-distributed subject-level rates. The second-order correction is:

$$ \lambda_{\text{eff}} \approx \frac{\lambda}{1 + \lambda\gamma} \left(1 - \frac{k\lambda\gamma}{(1 + \lambda\gamma)^2}\right) $$

When it matters: $k \geq 0.5$ and gap $\geq 20$ days. A 64-scenario simulation sweep showed the naive formula underpowers by 1–2.5 pp in these regimes; the correction reduces effective-rate approximation error from ~8% to ~1%.

When it is negligible: $k \leq 0.1$ or gap $\leq 10$ days.

The same gap rule flows through sample_size_nbinom(), nb_sim(), and cut_data_by_date() so that planning, simulation, and interim cutting are consistent.

Three kinds of "gaps"

ConceptScalePackage interface
Analysis gap $\Delta_i$Trial calendaranalysis_times in gsNBCalendar(), search criteria in get_cut_date()
Timing heterogeneityAccrual, dropout, follow-upaccrual_rate/duration, dropout_rate, max_followup, trial_duration
Event gap $g$Patient timeevent_gap argument, carried through planning → simulation → interim cut

SSR guidance

  • sim_ssr_nbinom() supports strategies: "No adaptation", "Blinded SSR", "Unblinded SSR".
  • bound_info controls which information scale drives boundaries: "unblinded_ml" (default), "blinded_ml", "unblinded_mom", "blinded_mom".
  • For SSR studies, treat the final test statistic as central. The current production SSR study uses the score test for power simulations and compares Wald vs score Type I error at the same nominal one-sided $\alpha = 0.025$.
  • In that non-binding Type I comparison, Wald is mildly anti-conservative (about 2.8%--3.1%) and score is conservative (about 2.1%--2.3%). Prefer the score final test when Type I calibration is the priority, then check whether Wald sizing, a higher target power, or a modest information margin is needed for power.
  • SSR adaptation is driven mainly by interim nuisance estimates and the adaptation cap; the score test reduces anti-conservative rejection after adaptation rather than eliminating the need to simulate adapted sample-size behavior.
  • Blinded SSR preserves masking. Unblinded SSR may be more sample-efficient, but the relative behavior is design-specific and should be compared under the trial's nuisance range, adaptation cap, and final test statistic.
  • Spending is not accelerated when observed information exceeds planned: uses min(planned IF, actual IF).
  • Futility assessment is deferred until ≥ 30% of planned information.

Time-scale checklist

Before writing or reviewing code, confirm all of these use the same time unit:

  • enroll_rate$duration and enroll_rate$rate
  • fail_rate$rate
  • dropout_rate$rate and dropout_rate$duration
  • trial_duration
  • max_followup
  • Calendar cut times / analysis_times
  • event_gap

If units are mixed, fix that first.

Key arguments shared across functions

ArgumentUsed inMeaning
lambda1, lambda2sample_size_nbinom()Control and experimental event rates
dispersion / kMultipleNB dispersion (scalar or arm-specific vector)
accrual_rate, accrual_durationsample_size_nbinom()Piecewise enrollment
enroll_ratenb_sim(), sim_gs_nbinom(), sim_ssr_nbinom()data.frame(rate, duration)
fail_rateSimulation functionsdata.frame(treatment, rate, dispersion)
dropout_rateMultipledata.frame(treatment, rate, duration) for exponential dropout
max_followupMultipleMaximum per-subject follow-up
trial_durationsample_size_nbinom()Total calendar time of the trial
event_gapsample_size_nbinom(), nb_sim(), cut_data_by_date()Minimum gap between counted events
rr0sample_size_nbinom()Null rate ratio (default 1; set < 1 for non-inferiority)
test_typemutze_test(), sample_size_nbinom(), sim_gs_nbinom(), sim_ssr_nbinom()"wald" (default) or "score"

S3 class hierarchy

  • sample_size_nbinom_result: output of sample_size_nbinom(); has print() and summary().
  • gsNB: output of gsNBCalendar(); inherits from gsDesign; has summary(), toInteger().
  • nb_sim_data / nb_sim_seasonal: output of nb_sim() / nb_sim_seasonal(); dispatches cut_data_by_date().
  • mutze_test: output of mutze_test(); has print().

Vignettes

VignetteTopic
sample-size-nbinomFixed-design NB sample size planning
simulation-exampleBasic recurrent-event data generation
seasonal-simulationSeasonal rate simulation
group-sequential-simulationCalendar-time GS design workflow
ssr-exampleShort SSR workflow
ssr-simulation-studyLarge SSR simulation case study
completers-interim-exampleCompleter-based interim looks
non-inferiority-exampleNon-inferiority design
verification-simulationSimulation verification of sample size and test
score-vs-wald-simulation2×2 comparison: Wald/score sizing × Wald/score test (interactive DT/plotly)
ai-skillsDemonstrates how to use SKILL.md and llms.txt to guide package-native workflows
blinded-info-diagnosticsBlinded information estimation diagnostics
blinded-info-diagnosticsEdge cases for blinded information
verification-simulationDesign-formula verification against simulation

Common pitfalls

  • Do not re-implement SSR logic if sim_ssr_nbinom() covers the workflow.
  • mutze_test() may fall back to Poisson based on poisson_threshold; this is defensive, not a model choice.
  • event_gap changes both counted events and at-risk exposure; it is not cosmetic.
  • check_gs_bound() accepts info_col to select the information column; use that rather than renaming.
  • For pkgdown preview, use gsDesignNB::preview_pkgdown_site(), not pkgdown::preview_site().

Disease-area calibration

Published NB trial parameters for reference:

Therapeutic area~λ (events/month)~kGap
Heart failure hospitalizations0.30.5Extended hospitalization-free window
COPD exacerbations0.5–0.80.5–1.0≥ 28 symptom-free days
MS relapses1–2/yearmoderate–highVariable
Asthma exacerbationsVariable~0.8Protocol-specific

Dependencies

  • Runtime: gsDesign (≥ 3.9.0), data.table, simtrial, MASS, stats
  • Suggested: ggplot2, dplyr, gt, DT, shiny, foreach, doFuture, future, testthat, knitr
Installs0
GitHub Stars1
LanguageHTML
AddedJun 10, 2026
View on GitHub