Model Configuration#
This notebook provides details about the model configuration in PyMC-Marketing. This API provides a very flexible way to express the model Priors and likelihoods so that we can have enough expressiveness to build custom models.
Setup#
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
from pymc_extras.prior import Prior
from pymc_marketing.model_config import ModelConfigError, parse_model_config
az.style.use("arviz-darkgrid")
plt.rcParams["figure.figsize"] = [12, 7]
plt.rcParams["figure.dpi"] = 100
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = "retina"
seed = sum(map(ord, "Create priors to reflect your marketing beliefs"))
rng = np.random.default_rng(seed)
Prior Distributions#
The Prior class is our way to expression distributions and relationships between them.
Basic Usage#
Every prior will need a distribution name that comes from PyMC. View the full list in the documentation here.
scalar_distribution = Prior("Normal")
scalar_distribution
Prior("Normal")
If not, then an exception will be raised.
try:
    Prior("UnknownDistribution")
except Exception as e:
    print(e)
PyMC doesn't have a distribution of name 'UnknownDistribution'
Specific parameters can be passed as keyword arguments but are not required if the PyMC distribution has defaults.
scalar_distribution = Prior("Normal", mu=0, sigma=3.5)
scalar_distribution
Prior("Normal", mu=0, sigma=3.5)
There will be a check at initialization against the PyMC distribution.
try:
    Prior("Normal", mu=0, b=1)
except Exception as e:
    print(e)
Parameters {'b', 'mu'} are not a subset of the pymc distribution parameters {'mu', 'sigma', 'tau'}
However, there are some limitations to that. Take this invalid parameterization below:
invalid_distribution = Prior("Normal", mu=1, sigma=1, tau=2)
The create_variable method is used to make variables from this distribution and it isn’t until then that there is an error.
with pm.Model():
    try:
        invalid_distribution.create_variable("mu")
    except Exception as e:
        print(e)
Can't pass both tau and sigma
This goes for incorrect values as well!
invalid_distribution = Prior("Normal", mu=1, sigma=-1)
invalid_distribution
Prior("Normal", mu=1, sigma=-1)
The dims parameter are used to express the dimensions of the distribution. This variable is not part of a model so the values are not reflected with the distribution itself.
vector_distribution = Prior("Normal", dims="channel")
vector_distribution
Prior("Normal", dims="channel")
If there are dimensions, then the coords need to exist in the larger model. But they exist separately and can defined in advance.
Tip
Each Prior instance is just instructions to create a variable, so they can be used multiple times!
coords = {"channel": ["C1", "C2"]}
with pm.Model(coords=coords) as model:
    alpha = vector_distribution.create_variable("alpha")
    beta = vector_distribution.create_variable("beta")
pm.model_to_graphviz(model)
If the coords were not specified then this would cause a PyMC error.
with pm.Model() as model:
    try:
        vector_distribution.create_variable("var")
    except Exception as e:
        print(e)
"Dimensions {'channel'} are unknown to the model and cannot be used to specify a `shape`."
The variables can get arbitrarily large as well by providing additional dims in a tuple.
matrix_distribution = Prior("Normal", dims=("channel", "geo"))
matrix_distribution
Prior("Normal", dims=("channel", "geo"))
tensor_distribution = Prior("Normal", dims=("channel", "geo", "store"))
tensor_distribution
Prior("Normal", dims=("channel", "geo", "store"))
Hierarchical Variables#
Hierarchical variables can be defined by using distributions as the parameters of another distribution. The parent distributions will usually have a larger dimensionality than each of the parameters.
hierarchical_variable = Prior(
    "Normal",
    mu=Prior("Normal"),
    sigma=Prior("HalfNormal"),
    dims="channel",
)
We can use the to_graph method to create visualize the variable with dummy coordinates.
Note
No need to worry about variable naming! The child parameters will all be automatically named based on the parent.
hierarchical_variable.to_graph()
Warning
The validity of the parameter values will not be checked.
There can be negative sigma parameter in a Normal distribution whether that comes from value or another distribution!
Prior("Normal", mu=1, sigma=-1)
Prior("Normal", mu=1, sigma=-1)
Prior("Normal", sigma=Prior("Normal"))
Prior("Normal", sigma=Prior("Normal"))
Tip
Model reparamterizations can help with model convergence depending on the model posterior!
For Normal distribution, the common non-centered parameterization is supported with the centered flag.
non_centered_hierarchical_variable = Prior(
    "Normal",
    mu=Prior("Normal"),
    sigma=Prior("HalfNormal"),
    dims="channel",
    # Flag for non-centered
    centered=False,
)
non_centered_hierarchical_variable.to_graph()
Other distributions can be hierarchical as well. Just use distributions for parameters of a parent distribution. For instance, the Beta distribution has two positive parameters, alpha and beta which can be reflect as HalfNormal distributions.
zero_to_one_variable = Prior(
    "Beta",
    alpha=Prior("HalfNormal"),
    beta=Prior("HalfNormal"),
    dims="channel",
)
zero_to_one_variable.to_graph()
Transformations#
The transform variable can be used for any of the distributions to change its domain.
These transformations will be taken from pytensor.tensor or pm.math in that order.
Below is a non-centered hierarchical Normal distribution that is put through a sigmoid function in order to change the domain to (0, 1).
hierarchical_zero_to_one_distribution = Prior(
    "Normal",
    mu=Prior("Normal"),
    sigma=Prior("HalfNormal"),
    dims="channel",
    centered=False,
    transform="sigmoid",
)
hierarchical_zero_to_one_distribution.to_graph()
Prior Visualization#
For scalar distributions, make use of the preliz attribute of the model in order to visualize.
Tip
Defaults for the parameters will have to be provided or there will be an error.
beta_distribution = Prior("Beta", alpha=1, beta=4)
beta_distribution.preliz.plot_pdf();
 
Tip
Using the constrain method can help hone in on a prior. It is a wrapper around PreliZ’s maxent function.
constrained_distribution = Prior("Normal").constrain(lower=4, upper=6)
constrained_distribution
Prior("Normal", mu=4.999999933423391, sigma=0.5102139704250529)
To keep a parameter value fixed, pass it to the Prior instance. In the following example, the mean is fixed at 2.
constrained_distribution = Prior("Gamma", mu=2).constrain(lower=1, upper=10)
constrained_distribution
Prior("Gamma", mu=2.0, sigma=0.7030365085992409)
For more complicated distributions, consider using the sample_prior method in order to access the prior. Or PreliZ’s functions like predictive explorer.
Note
Any coordinates will have to be passed like in pm.Model.
complicated_distribution = Prior(
    "Normal",
    mu=Prior("Normal", sigma=1),
    sigma=Prior("Normal", mu=-1, sigma=0.15, transform="exp"),
    centered=False,
    dims="channel",
    transform="sigmoid",
)
coords = {
    "channel": ["C1", "C2"],
}
prior = complicated_distribution.sample_prior(
    coords=coords, samples=1_000, random_seed=rng
)
prior
Sampling: [var_raw_mu, var_raw_offset, var_raw_sigma_raw]
<xarray.Dataset>
Dimensions:            (chain: 1, draw: 1000, channel: 2)
Coordinates:
  * chain              (chain) int64 0
  * draw               (draw) int64 0 1 2 3 4 5 6 ... 994 995 996 997 998 999
  * channel            (channel) <U2 'C1' 'C2'
Data variables:
    var                (chain, draw, channel) float64 0.336 0.3785 ... 0.7889
    var_raw            (chain, draw, channel) float64 -0.6811 -0.496 ... 1.318
    var_raw_mu         (chain, draw) float64 -0.527 1.038 1.526 ... 0.8435 1.136
    var_raw_offset     (chain, draw, channel) float64 -0.4456 0.08964 ... 0.5605
    var_raw_sigma      (chain, draw) float64 0.3459 0.4046 ... 0.4709 0.3259
    var_raw_sigma_raw  (chain, draw) float64 -1.062 -0.905 ... -0.7531 -1.121
Attributes:
    created_at:                 2024-09-09T13:18:06.084229+00:00
    arviz_version:              0.20.0.dev0
    inference_library:          pymc
    inference_library_version:  5.15.1The prior variables can be visualized in whatever way works best. Here is a KDE of the variables to see the marginal distributions.
fig, axes = plt.subplots(
    nrows=2, ncols=1, figsize=(10, 9), sharex=True, sharey=True, layout="constrained"
)
az.plot_posterior(prior, var_names=["variable"], grid=(2, 1), ax=axes)
fig.suptitle("Prior Distribution", fontsize=18, fontweight="bold");
 
Use any workflow to make sense of the prior distribution! For instance, the joint distribution which shows these two channels are correlated due to the hierarchical generation.
def plot_correlation(df: pd.DataFrame, x: str = "C1", y: str = "C2") -> plt.Axes:
    corr = df.loc[:, [x, y]].corr().iloc[0, 1]
    title = f"Joint distribution between {x} and {y} ({corr = :.3f})"
    return df.plot.scatter(x=x, y=y, title=title)
ax = prior["variable"].to_series().unstack().pipe(plot_correlation)
padding = 0.025
bounds = (0 - padding, 1 + padding)
ax.set(xlim=bounds, ylim=bounds);
 
Automatic Broadcasting#
The broadcasting of the dims will be handled automatically.
For instance, the mu of the variable needs to be transposed in order to work with ("channel", "geo") dims.
With all this functionality, we can see that the prior distributions can become quite expressive!
def create_2d_variable(mu_dims, sigma_dims) -> Prior:
    mu = Prior(
        "Normal",
        mu=Prior("Normal"),
        sigma=Prior("HalfNormal"),
        dims=mu_dims,
    )
    sigma = Prior(
        "Normal",
        mu=Prior("Normal"),
        sigma=Prior("HalfNormal"),
        centered=False,
        dims=sigma_dims,
        transform="exp",
    )
    return Prior(
        "Normal",
        mu=mu,
        sigma=sigma,
        dims=("channel", "geo"),
    )
variable_2d = create_2d_variable(mu_dims="channel", sigma_dims="geo")
variable_2d.to_graph()
And the user can spend more time thinking about just the assumptions of the variables rather than logic to ensure the dimensions work.
different_assumptions_2d = create_2d_variable(mu_dims="channel", sigma_dims="channel")
different_assumptions_2d.to_graph()
Serialization#
The to_dict and from_dict methods can be helpful for storage of the distributions.
variable_2d_dict = variable_2d.to_dict()
variable_2d_dict
{'dist': 'Normal',
 'kwargs': {'mu': {'dist': 'Normal',
   'kwargs': {'mu': {'dist': 'Normal'}, 'sigma': {'dist': 'HalfNormal'}},
   'dims': ('channel',)},
  'sigma': {'dist': 'Normal',
   'kwargs': {'mu': {'dist': 'Normal'}, 'sigma': {'dist': 'HalfNormal'}},
   'centered': False,
   'dims': ('geo',),
   'transform': 'exp'}},
 'dims': ('channel', 'geo')}
Prior.from_dict(variable_2d_dict).to_graph()
Use in PyMC-Marketing#
Distributions will be expressed in this manner throughout the package including but not limited to:
- MMM components 
Backwards compatibility#
The from_dict method will create a Prior instance from the previous format.
For instance, take this previous configuration:
old_model_config = {
    "alpha": {
        "dist": "Normal",
        "kwargs": {
            "mu": 0,
            "sigma": 1,
        },
    },
    "beta": {
        "dist": "Laplace",
        "kwargs": {
            "mu": 1,
            "b": 0.5,
        },
    },
}
This can be parsed with the Prior.from_dict constructor for each key. Much more consise too!
new_model_config = {
    name: Prior.from_dict(key) for name, key in old_model_config.items()
}
new_model_config
{'alpha': Prior("Normal", mu=0, sigma=1),
 'beta': Prior("Laplace", mu=1, b=0.5)}
new_model_config["alpha"].to_dict()
{'dist': 'Normal', 'kwargs': {'mu': 0, 'sigma': 1}}
The parse_model_config function will do just this and is used internally. It also provides some deprecation warnings.
parse_model_config(old_model_config)
{'alpha': Prior("Normal", mu=0, sigma=1),
 'beta': Prior("Laplace", mu=1, b=0.5)}
As well as the ability to catch some errors while parsing.
invalid_model_config = {
    "alpha": {
        "dist": "InvalidDistribution",
    },
    "beta": {
        "dist": "Normal",
        "kwargs": {"mu": "one", "sigma": "two"},
    },
    "gamma": {
        "dist": "HalfNormal",
        "kwargs": {"mu": 1},
    },
}
try:
    parse_model_config(invalid_model_config)
except ModelConfigError as e:
    print(e)
3 errors occurred while parsing model configuration. Errors: Parameter alpha: PyMC doesn't have a distribution of name 'InvalidDistribution', Parameter beta: Parameters must be one of the following types: (int, float, np.array, Prior, pt.TensorVariable). Incorrect parameters: {'mu': <class 'str'>, 'sigma': <class 'str'>}, Parameter gamma: Parameters {'mu'} are not a subset of the pymc distribution parameters {'sigma', 'tau'}
Appendix: Validation#
Behind the scenes, the Prior class uses th validate_call from Pydantic to ensure that the parameters are valid.`
try:
    Prior()
except Exception as e:
    print(e)
1 validation error for __init__
distribution
  Missing required argument [type=missing_argument, input_value=ArgsKwargs(<unprintable tuple object>), input_type=ArgsKwargs]
    For further information visit https://errors.pydantic.dev/2.9/v/missing_argument
try:
    Prior(1)
except Exception as e:
    print(e)
1 validation error for __init__
1
  Input should be a valid string [type=string_type, input_value=1, input_type=int]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
%load_ext watermark
%watermark -n -u -v -iv -w -p pymc_marketing,pytensor
Last updated: Mon Sep 09 2024
Python implementation: CPython
Python version       : 3.11.5
IPython version      : 8.16.1
pymc_marketing: 0.8.0
pytensor      : 2.22.1
pandas    : 2.1.2
matplotlib: 3.8.4
arviz     : 0.20.0.dev0
numpy     : 1.24.4
pymc      : 5.15.1
Watermark: 2.4.3
