code-best-practice
// PyMC-Marketing coding conventions, preferred implementations, and style guidelines.
| name | code-best-practice |
| description | PyMC-Marketing coding conventions, preferred implementations, and style guidelines. |
| disable-model-invocation | true |
This document outlines the coding conventions, preferred implementations, and style guidelines for contributing to pymc-marketing.
We follow strict linting and formatting rules enforced by Ruff and MyPy.
# Good
def calculate_metric(data: pd.DataFrame, col: str) -> xarray.DataArray: ...
# Bad
def calculate_metric(data, col): ...
Use pydantic for runtime validation of user inputs in __init__ methods.
All public classes and functions must have docstrings following the NumPy style.
Use the .. code-block:: python directive for code examples.
def expected_purchases(self, future_t: int) -> xarray.DataArray:
    """
    Compute expected number of future purchases.

    Parameters
    ----------
    future_t : int
        Number of time periods to predict.

    Returns
    -------
    xarray.DataArray
        The expected number of purchases.

    Examples
    --------
    .. code-block:: python

        model = MyModel(data)
        model.fit()
        model.expected_purchases(future_t=12)

    References
    ----------
    .. [1] Fader, P. S., et al. (2005). "Counting Your Customers..."
    """
Avoid Python loops for mathematical operations. Use numpy, xarray, or pytensor (via pymc) broadcasting.
# Bad: Iterating over customers
results = []
for customer in customers:
    results.append(calculate_val(customer))
# Good: Vectorized operation
results = alpha * np.exp(-beta * data)
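To make the guideline above concrete, here is a minimal sketch that checks a loop and its vectorized form give the same numbers, then uses broadcasting to evaluate several decay rates at once. The values of `alpha`, `beta`, `betas`, and `data` are illustrative assumptions, not taken from the library.

```python
import numpy as np

# Illustrative values (assumptions for this sketch).
alpha, beta = 2.0, 0.5
data = np.array([0.0, 1.0, 2.0, 3.0])

# Loop version: one Python-level call per element.
loop_results = np.array([alpha * np.exp(-beta * value) for value in data])

# Vectorized version: a single array expression.
vectorized_results = alpha * np.exp(-beta * data)

# Broadcasting: evaluate three decay rates against all data points at once.
# betas[:, None] has shape (3, 1) and data[None, :] has shape (1, 4).
betas = np.array([0.1, 0.5, 1.0])
grid = alpha * np.exp(-betas[:, None] * data[None, :])  # shape (3, 4)
```

The second row of `grid` matches `vectorized_results`, since it uses the same decay rate.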
Use pm.Data for Mutable Inputs:
Allows you to change data (e.g., for out-of-sample predictions) without rebuilding the model graph.
# In build_model
self.model_coords = {"customer_id": unique_ids}
with pm.Model(coords=self.model_coords) as self.model:
    # Mutable data container
    x_data = pm.Data("x_data", data[cols], dims="customer_id")
    ...
Batch Dimensions (Coords):
Use named dimensions (dims) instead of raw shapes. This integrates with xarray for post-processing.
# Good
alpha = pm.Normal("alpha", mu=0, sigma=1, dims="channel")
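As a rough illustration of why named dimensions help in post-processing, the sketch below builds a stand-in for posterior draws of `alpha` as an `xarray.DataArray` and then reduces and selects by name instead of by axis position. The random draws and the channel names are assumptions made for the example.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
channels = ["tv", "radio", "search"]  # hypothetical channel names

# Stand-in for posterior draws of "alpha" with dims (chain, draw, channel).
alpha = xr.DataArray(
    rng.normal(size=(2, 50, 3)),
    dims=("chain", "draw", "channel"),
    coords={"chain": [0, 1], "draw": np.arange(50), "channel": channels},
    name="alpha",
)

# Reduce over the sampling dimensions by name, keeping one value per channel.
alpha_mean = alpha.mean(dim=("chain", "draw"))

# Select one channel by label rather than by integer position.
tv_draws = alpha.sel(channel="tv")
```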
HSGP for Gaussian Processes: For time-varying parameters (like in MMM), prefer Hilbert Space Gaussian Processes (HSGP) over standard GPs. HSGP approximates the GP with a fixed set of $m$ basis functions, reducing the cost for $n$ data points from $O(n^3)$ to $O(n \cdot m)$.
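The basis-function idea can be sketched in plain NumPy. This is an illustration of the approximation for a one-dimensional squared-exponential kernel, not PyMC's implementation. The function names and the values of `m`, `L`, `sigma`, and `ell` are assumptions chosen for the example.

```python
import numpy as np


def hsgp_basis(x, m, L):
    """Laplacian eigenfunctions on [-L, L] and the square roots of their eigenvalues."""
    j = np.arange(1, m + 1)
    sqrt_eigvals = j * np.pi / (2 * L)
    phi = np.sqrt(1 / L) * np.sin(sqrt_eigvals[None, :] * (x[:, None] + L))
    return phi, sqrt_eigvals  # phi has shape (n, m)


def expquad_spectral_density(omega, sigma, ell):
    """Spectral density of the 1D squared-exponential kernel."""
    return sigma**2 * np.sqrt(2 * np.pi) * ell * np.exp(-0.5 * (ell * omega) ** 2)


x = np.linspace(-1, 1, 200)
m, L, sigma, ell = 30, 4.0, 1.0, 1.0  # illustrative settings

phi, sqrt_eigvals = hsgp_basis(x, m, L)
weights = expquad_spectral_density(sqrt_eigvals, sigma, ell)

# One approximate GP draw costs a single (n, m) matrix-vector product.
rng = np.random.default_rng(0)
f_draw = phi @ (np.sqrt(weights) * rng.standard_normal(m))

# For verification only: compare the implied kernel with the exact one.
approx_kernel = (phi * weights) @ phi.T
exact_kernel = sigma**2 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)
max_abs_error = np.max(np.abs(approx_kernel - exact_kernel))
```

Forming the full kernel matrix is done here only to check accuracy. In a model, only the `(n, m)` basis matrix and the `m` coefficients are needed.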
When implementing functionality like sensitivity analysis, optimization routines, or complex transformations, prefer pytensor operations over pure Python/NumPy.
import pytensor.tensor as pt
# Good: PyTensor implementation
def saturation(x, alpha):
    return 1 - pt.exp(-alpha * x)
# This graph can now be differentiated with respect to alpha
Models should inherit from base classes (MMM, CLVModel) and implement specific lifecycle methods:
__init__: Validate inputs with validate_call or pydantic, and store settings in the model_config dictionary.
build_model: Define pm.Data containers and random variables.
_extract_predictive_variables: Convert input data from pandas to xarray.
Use a default_model_config property to define default priors. This lets users override specific priors without rewriting the whole model.
Always use pymc_extras.prior.Prior for defining distributions. This provides a dictionary-based specification that is serializable and easy for users to modify.
from pymc_extras.prior import Prior
@property
def default_model_config(self) -> dict:
    return {
        "alpha": Prior("Weibull", alpha=2, beta=10),
        "beta": Prior("Normal", mu=0, sigma=1),
    }
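One way the override could work is a simple dictionary merge in `__init__`, sketched below. To keep the example dependency-free, `PriorSpec` is a hypothetical stand-in for `pymc_extras.prior.Prior`, and `ExampleModel` is a hypothetical class. The actual base classes may merge configuration differently.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PriorSpec:
    """Hypothetical stand-in for pymc_extras.prior.Prior."""

    distribution: str
    parameters: dict = field(default_factory=dict)


class ExampleModel:
    """Hypothetical model class; only the config handling is shown."""

    def __init__(self, model_config: Optional[dict] = None) -> None:
        # User-supplied entries replace defaults key by key.
        self.model_config = {**self.default_model_config, **(model_config or {})}

    @property
    def default_model_config(self) -> dict:
        return {
            "alpha": PriorSpec("Weibull", {"alpha": 2, "beta": 10}),
            "beta": PriorSpec("Normal", {"mu": 0, "sigma": 1}),
        }


# Override only the "beta" prior; "alpha" keeps its default.
model = ExampleModel(model_config={"beta": PriorSpec("Normal", {"mu": 0, "sigma": 5})})
```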
Write tests with pytest.
Use @pytest.mark.parametrize to test multiple scenarios efficiently.
@pytest.mark.parametrize("future_t", [1, 10])
def test_expected_purchases(model, future_t):
    pred = model.expected_purchases(future_t=future_t)
    assert pred.shape == (4000, 100)  # (chains*draws, customers)
Before submitting a PR:
Run make lint (runs Ruff).
Run pre-commit run mypy --all-files.
Run make test.