| name | adding-distribution |
| description | How to add a univariate distribution to stochastic-rs-distributions. Covers SimdXxx struct, sampling pattern (transformation / ziggurat / rejection / inversion), DistributionExt closed-form moments/pdf/cdf/cf, KS-test, and the py_distribution! macro. |
Adding a distribution: stochastic-rs-distributions
Each distribution lives at stochastic-rs-distributions/src/<name>.rs
and ships a SimdXxx<T> struct that implements:
- The rand_distr::Distribution<T> trait (per-sample sample(rng)).
- Bulk fillers fill_slice(rng, dst) and fill_slice_fast(dst) (with internal RNG seed advancement).
- DistributionExt for closed-form pdf / cdf / characteristic function / moments.
- The py_distribution! macro at the bottom for Python exposure.
The §1.5 audit note "DistributionExt is 18/19 closed-form (not 3/19)"
plus the feedback_no_statrs_distributions memory entry are the
load-bearing constraints: closed-form math, written from scratch in
this crate, never statrs::distribution::*.
1. Pick a sampling strategy
Three patterns, in order of preference:
| Pattern | When to use | Reference impl |
|---|---|---|
| Transformation | Closed-form F^{-1}(U) exists and is fast to evaluate. | SimdExponential, SimdLogNormal |
| Ziggurat | Density is unimodal & smooth; need throughput. | SimdNormal, SimdGamma |
| Rejection / inversion | Density has heavy tails or kink; need correctness. | SimdInverseGamma, SimdNig |
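The transformation row can be illustrated with a minimal, std-only sketch of inverse-transform sampling for the exponential distribution. Everything named here is hypothetical: `SplitMix64` merely stands in for the crate's seeded RNG, and `sample_exponential` is not claimed to be how SimdExponential is written.

```rust
/// SplitMix64: a tiny std-only generator standing in for the crate's
/// seeded RNG (an assumption of this sketch, not the crate's choice).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform draw strictly inside (0, 1), built from the top 52 bits.
    pub fn next_open01(&mut self) -> f64 {
        ((self.next_u64() >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Inverse-transform draw from Exp(rate): F^{-1}(u) = -ln(1 - u) / rate.
pub fn sample_exponential(rng: &mut SplitMix64, rate: f64) -> f64 {
    let u = rng.next_open01();
    // ln_1p(-u) computes ln(1 - u) accurately for small u.
    -(-u).ln_1p() / rate
}
```

Keeping the uniform draw strictly inside (0, 1) avoids ln(0), so every sample is finite and positive.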
For tail-heavy distributions (NIG, Variance-Gamma, CGMY), the rejection
step needs a documented acceptance ratio in the source comments, because the
reviewer needs to verify that the proposal density majorises the target.
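As an illustration of what a documented acceptance ratio can look like, here is a std-only sketch of rejection sampling for a half-normal target with an Exp(1) proposal. The target, the proposal, and all names are chosen for the example only (it is deliberately not a heavy-tailed case), and `SplitMix64` is a stand-in for the crate's RNG.

```rust
/// std-only SplitMix64 stand-in for the crate's seeded RNG (assumption).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    pub fn next_open01(&mut self) -> f64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        // Top 52 bits, shifted by one half, land strictly inside (0, 1).
        ((z >> 12) as f64 + 0.5) / (1u64 << 52) as f64
    }
}

/// Target density f(x) = sqrt(2/pi) * exp(-x^2 / 2) on x >= 0.
pub fn half_normal_pdf(x: f64) -> f64 {
    if x < 0.0 { 0.0 } else { (2.0 / std::f64::consts::PI).sqrt() * (-0.5 * x * x).exp() }
}

/// Proposal density g(x) = exp(-x) on x >= 0.
pub fn proposal_pdf(x: f64) -> f64 {
    (-x).exp()
}

/// Envelope constant M = sup f/g = sqrt(2e/pi), about 1.3155 (attained at x = 1),
/// so the expected acceptance ratio is 1/M, about 0.760.
pub fn envelope_constant() -> f64 {
    (2.0 * std::f64::consts::E / std::f64::consts::PI).sqrt()
}

/// Returns one accepted sample and the number of proposals it took.
pub fn sample_half_normal(rng: &mut SplitMix64) -> (f64, u32) {
    let mut tries = 0;
    loop {
        tries += 1;
        let x = -rng.next_open01().ln(); // Exp(1) proposal via inversion
        // f(x) / (M * g(x)) simplifies to exp(-(x - 1)^2 / 2).
        let accept_prob = (-0.5 * (x - 1.0) * (x - 1.0)).exp();
        if rng.next_open01() < accept_prob {
            return (x, tries);
        }
    }
}
```

A reviewer can check the comment on `envelope_constant` two ways: numerically, by asserting `half_normal_pdf(x) <= M * proposal_pdf(x)` on a grid, and empirically, by comparing the observed acceptance ratio with 1/M.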
2. Mandatory surface
use crate::traits::DistributionExt;
use crate::traits::FloatExt;
use rand::RngCore;
use rand::SeedableRng;
use rand_xoshiro::Xoshiro256PlusPlus;

pub struct SimdFoo<T: FloatExt> {
    pub a: T,
    pub b: T,
    // Interior mutability: fill_slice_fast(&self, ..) advances this RNG.
    rng: std::cell::RefCell<Xoshiro256PlusPlus>,
}

impl<T: FloatExt> SimdFoo<T> {
    /// Seeds the internal RNG from the thread-local generator.
    pub fn new(a: T, b: T) -> Self {
        let seed = rand::random::<u64>();
        Self::with_seed(a, b, seed)
    }

    /// Deterministic construction from an explicit seed.
    pub fn with_seed(a: T, b: T, seed: u64) -> Self {
        Self {
            a,
            b,
            rng: std::cell::RefCell::new(Xoshiro256PlusPlus::seed_from_u64(seed)),
        }
    }

    /// Draws the seed from a caller-supplied RNG.
    pub fn from_seed_source<R: RngCore>(a: T, b: T, mut src: R) -> Self {
        let mut bytes = [0u8; 8];
        src.fill_bytes(&mut bytes);
        let seed = u64::from_le_bytes(bytes);
        Self::with_seed(a, b, seed)
    }

    /// Bulk fill using the caller's RNG.
    pub fn fill_slice<R: rand::Rng + ?Sized>(&self, rng: &mut R, dst: &mut [T]) {}

    /// Bulk fill using the internal RNG.
    pub fn fill_slice_fast(&self, dst: &mut [T]) {}
}

impl<T: FloatExt> rand_distr::Distribution<T> for SimdFoo<T> {
    fn sample<R: rand::Rng + ?Sized>(&self, rng: &mut R) -> T {
        todo!("draw one sample using the strategy from section 1")
    }
}
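The role of the RefCell-wrapped RNG and of with_seed can be seen in a small std-only sketch. `ToyUniform` and `SplitMix64` are hypothetical stand-ins (for a SimdXxx struct and for Xoshiro256PlusPlus respectively); the point is only that a `&self` filler can advance internal state and that equal seeds give equal streams.

```rust
use std::cell::RefCell;

/// std-only SplitMix64, standing in for Xoshiro256PlusPlus (assumption).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        // Top 53 bits mapped to [0, 1).
        (z >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical toy distribution: Uniform(a, b) with an internal seeded RNG.
pub struct ToyUniform {
    pub a: f64,
    pub b: f64,
    rng: RefCell<SplitMix64>,
}

impl ToyUniform {
    pub fn with_seed(a: f64, b: f64, seed: u64) -> Self {
        Self { a, b, rng: RefCell::new(SplitMix64(seed)) }
    }

    /// Takes &self: the RefCell lets the RNG state advance between calls.
    pub fn fill_slice_fast(&self, dst: &mut [f64]) {
        let mut rng = self.rng.borrow_mut();
        for x in dst.iter_mut() {
            *x = self.a + (self.b - self.a) * rng.next_f64();
        }
    }
}
```

Two instances built with the same seed fill identical slices, while a second fill on the same instance yields fresh values. One design consequence worth keeping in mind: a RefCell field makes the struct `!Sync`, so it cannot be shared by reference across threads.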
3. DistributionExt: closed-form math
impl<T: FloatExt> DistributionExt<T> for SimdFoo<T> {
    fn pdf(&self, x: T) -> T { todo!("closed-form pdf") }
    fn cdf(&self, x: T) -> T { todo!("closed-form cdf") }
    fn cf(&self, u: T) -> num_complex::Complex<T> { todo!("closed-form characteristic function") }
    fn mean(&self) -> T { todo!("closed-form mean") }
    fn variance(&self) -> T { todo!("closed-form variance") }
    fn skewness(&self) -> T { unimplemented!("skewness not implemented for SimdFoo") }
    fn kurtosis(&self) -> T { unimplemented!("kurtosis not implemented for SimdFoo") }
}
The 5 distributions that currently call unimplemented! (per the
project_distribution_ext_status memory) are intentional: where the
literature has no closed form (e.g. NIG raw moments require Bessel-K
identities), the panic is a documentation device, and users should use
empirical moments via crate::estimators::*.
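For a distribution where every quantity does have a closed form, the implementation can look like the following std-only sketch for Exp(rate). `ClosedForm` is a hypothetical local trait that mirrors the method names above; it is not the crate's DistributionExt, and the characteristic function is returned as an (re, im) pair instead of num_complex::Complex so the example has no dependencies.

```rust
/// Hypothetical local stand-in for the crate's DistributionExt trait;
/// the complex cf value is returned as (re, im) to stay std-only.
pub trait ClosedForm {
    fn pdf(&self, x: f64) -> f64;
    fn cdf(&self, x: f64) -> f64;
    fn cf(&self, u: f64) -> (f64, f64);
    fn mean(&self) -> f64;
    fn variance(&self) -> f64;
    fn skewness(&self) -> f64;
    fn kurtosis(&self) -> f64;
}

pub struct Exponential {
    pub rate: f64,
}

impl ClosedForm for Exponential {
    fn pdf(&self, x: f64) -> f64 {
        if x < 0.0 { 0.0 } else { self.rate * (-self.rate * x).exp() }
    }

    fn cdf(&self, x: f64) -> f64 {
        // -expm1(-rate * x) = 1 - exp(-rate * x) without cancellation near 0.
        if x < 0.0 { 0.0 } else { -(-self.rate * x).exp_m1() }
    }

    fn cf(&self, u: f64) -> (f64, f64) {
        // rate / (rate - i u) = rate (rate + i u) / (rate^2 + u^2)
        let denom = self.rate * self.rate + u * u;
        (self.rate * self.rate / denom, self.rate * u / denom)
    }

    fn mean(&self) -> f64 { 1.0 / self.rate }

    fn variance(&self) -> f64 { 1.0 / (self.rate * self.rate) }

    fn skewness(&self) -> f64 { 2.0 }

    // Excess kurtosis; which convention the crate uses is an assumption here.
    fn kurtosis(&self) -> f64 { 6.0 }
}
```

A cheap sanity check for any such implementation is that cf(0) equals 1 and that a central finite difference of cdf agrees with pdf.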
4. Source-file documentation
The //! header MUST include the distribution's defining formula in LaTeX and a literature citation with the equation number.
Example: SimdNig cites Barndorff-Nielsen (1997) eq. 3; SimdCgmy
cites Carr-Geman-Madan-Yor (2002) eq. 3.4.
5. Testing: KS test + reference comparison
Two mandatory tests:
#[cfg(test)]
mod tests {
    use super::*;
    use crate::stats::ks_test;

    #[test]
    fn ks_test_passes() {
        let d = SimdFoo::<f64>::with_seed(2.0, 3.0, 42);
        let mut samples = vec![0.0; 100_000];
        d.fill_slice_fast(&mut samples);
        let p = ks_test(&samples, |x| d.cdf(x));
        assert!(p > 0.05, "KS p-value = {p}");
    }

    #[test]
    fn moments_match_closed_form() { ... }
}
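For readers unfamiliar with what a one-sample KS test measures, the textbook computation can be sketched as follows. This is not the source of crate::stats::ks_test, whose implementation is not shown here; `ks_statistic` and `ks_p_value` are hypothetical names, and the p-value uses the plain asymptotic Kolmogorov series with lambda = sqrt(n) * D_n.

```rust
/// One-sample KS statistic D_n = sup_x |F_n(x) - F(x)| against a model CDF.
pub fn ks_statistic(samples: &[f64], cdf: impl Fn(f64) -> f64) -> f64 {
    let mut xs = samples.to_vec();
    xs.sort_by(|a, b| a.total_cmp(b));
    let n = xs.len() as f64;
    let mut d: f64 = 0.0;
    for (i, &x) in xs.iter().enumerate() {
        let f = cdf(x);
        // The empirical CDF jumps from i/n to (i+1)/n at x (0-based i).
        let above = (i as f64 + 1.0) / n - f;
        let below = f - i as f64 / n;
        d = d.max(above).max(below);
    }
    d
}

/// Asymptotic p-value Q(lambda) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lambda^2),
/// with lambda = sqrt(n) * D_n.
pub fn ks_p_value(d: f64, n: usize) -> f64 {
    let lambda = (n as f64).sqrt() * d;
    if lambda < 0.2 {
        // Q is within about 1e-12 of 1 here, and the series converges slowly.
        return 1.0;
    }
    let mut sum = 0.0;
    let mut sign = 1.0;
    for k in 1..=100 {
        let term = (-2.0 * (k * k) as f64 * lambda * lambda).exp();
        sum += sign * term;
        if term < 1e-16 {
            break;
        }
        sign = -sign;
    }
    (2.0 * sum).clamp(0.0, 1.0)
}
```

With the `p > 0.05` threshold used in the test above, a correct sampler is still expected to fail for roughly one seed in twenty, which is one reason to pin the seed with with_seed.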
Plus the workspace-level distribution_ext_vs_reference integration
test (in stochastic-rs-distributions/tests/): add a row for the new
distribution comparing pdf/cdf/cf at fixed reference points to a
manually-computed Mathematica/scipy table.
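The shape of such a reference row can be sketched in a std-only form. The layout of the real distribution_ext_vs_reference test is not shown in this document, so `RefRow`, `EXP_RATE2_TABLE`, and `max_abs_error` are hypothetical; the reference values are for Exp(rate = 2) and were hand-computed from e^-1 and e^-2.

```rust
/// One row of a hypothetical reference table: a point x plus expected values.
pub struct RefRow {
    pub x: f64,
    pub pdf: f64,
    pub cdf: f64,
}

/// Exp(rate = 2) reference values, hand-computed from e^-1 and e^-2.
pub const EXP_RATE2_TABLE: [RefRow; 3] = [
    RefRow { x: 0.0, pdf: 2.0, cdf: 0.0 },
    RefRow { x: 0.5, pdf: 0.7357588823428847, cdf: 0.6321205588285577 },
    RefRow { x: 1.0, pdf: 0.2706705664732254, cdf: 0.8646647167633873 },
];

pub fn exp_pdf(rate: f64, x: f64) -> f64 {
    rate * (-rate * x).exp()
}

pub fn exp_cdf(rate: f64, x: f64) -> f64 {
    1.0 - (-rate * x).exp()
}

/// Largest absolute deviation of the closed forms from the table.
pub fn max_abs_error(rate: f64, table: &[RefRow]) -> f64 {
    table.iter().fold(0.0, |worst: f64, row| {
        worst
            .max((exp_pdf(rate, row.x) - row.pdf).abs())
            .max((exp_cdf(rate, row.x) - row.cdf).abs())
    })
}
```

An integration test would then assert something like `max_abs_error(2.0, &EXP_RATE2_TABLE) < 1e-12`; the tolerance is a judgment call that depends on how many digits the reference table carries.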
6. Python wrapper: py_distribution!
Append at the bottom of src/foo.rs:
py_distribution!(PyFoo, SimdFoo,
sig: (a, b, seed = None, dtype = None),
params: (a: f64, b: f64),
);
The macro generates PyFoo with __new__, sample(n), and sample_par(m, n),
all routed through the IntoF32 / IntoF64 shims. Then register the class in
stochastic-rs-py/src/lib.rs:
use stochastic_rs_distributions::foo::PyFoo;
m.add_class::<PyFoo>()?;
7. CLAUDE.md / prelude updates
- stochastic-rs-distributions/CLAUDE.md: list the new distribution.
- The umbrella CLAUDE.md workspace layout doesn't list individual distributions; only update if the count crosses a notable boundary.
8. Anti-patterns
- Do not import statrs::distribution::*. The feedback_no_statrs_distributions memory entry is explicit.
- Do not return 0.0 from unimplemented moments. Use unimplemented!("...") so callers fail loudly.
- Do not sample without a seeded path. Both new (thread-local seed) and with_seed (explicit) are mandatory.
- Do not skip the LaTeX //! header: the rust-docs need the formula for users skimming.
9. Reference impls
- SimdNormal (normal.rs): ziggurat; the canonical reference.
- SimdExponential (exponential.rs): transformation; thinnest possible.
- SimdGamma (gamma.rs): ziggurat with shape > 1, transformation fallback for shape ≤ 1.
- SimdNig (nig.rs): rejection; tail-heavy; Bessel-K-based pdf.
- SimdCgmy (cgmy.rs): Lévy density with rejection; CGMY 2002.
Related SKILLs
- add-jump-process: consumes a distribution as the jump-size parameter D.
- python-bindings: py_distribution! macro details.
- stats-estimator: for an MLE / MoM estimator that fits the distribution to data.