Simulate a Node Using Poisson Regression
node_poisson.Rd
Data from the parents is used to generate the node using Poisson regression: the covariate-specific lambda is predicted for each observation and the node is then sampled from a Poisson distribution with that rate.
Arguments
- data
A data.table (or something that can be coerced to a data.table) containing all columns specified by parents.
- parents
A character vector specifying the names of the parents that this particular child node has. If non-linear combinations or interaction effects should be included, the user may specify the formula argument instead.
- formula
An optional formula object describing how the node should be generated, or NULL (default). If supplied, it should start with ~, having nothing else on the left-hand side. The right-hand side may contain any valid formula syntax, such as A + B or A + B + I(A^2), allowing non-linear effects. If this argument is defined, there is no need to define the parents argument. For example, using parents=c("A", "B") is equivalent to using formula= ~ A + B.
- betas
A numeric vector with the same length as parents, specifying the causal beta coefficients used to generate the node.
- intercept
A single number specifying the intercept that should be used when generating the node.
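As a rough illustration of what a one-sided formula such as ~ A + B + I(A^2) expands to, the base R function model.matrix() can be used. This is only a sketch: the data and column names below are made up, and it is an assumption that the function handles formula in a comparable way internally.

```r
# Hypothetical parent data; the column names A and B are illustrative only.
dat <- data.frame(A = c(1, 2, 3), B = c(0.5, 1.5, 2.5))

# One-sided formula: nothing on the left-hand side of "~".
form <- ~ A + B + I(A^2)

# model.matrix() expands the right-hand side into one column per term.
# The automatically added intercept column is dropped here, because the
# intercept is supplied through its own argument.
X <- model.matrix(form, data = dat)[, -1, drop = FALSE]

colnames(X)  # "A" "B" "I(A^2)"
```

Under this reading, a formula with three terms would need three coefficients in betas, one per column.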
Details
Essentially, this function calculates the linear predictor defined by the betas coefficients, the intercept and the values of the parents. The exponential function is then applied to this predictor and the result is passed to the rpois function. The result is a draw from a subject-specific Poisson distribution, resembling the user-defined Poisson regression model.
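The steps described above can be sketched in base R as follows. This is a minimal illustration rather than the package's actual implementation; the data set, the seed and the coefficient values are all made up for the example.

```r
set.seed(42)

# Hypothetical parent data; column names and distributions are illustrative.
dat <- data.frame(A = rnorm(1000),
                  B = rbinom(1000, size = 1, prob = 0.5))

parents <- c("A", "B")
betas <- c(0.2, 1.3)
intercept <- -1

# Step 1: linear predictor from the intercept, the betas and the parent values.
lin_pred <- as.vector(intercept + as.matrix(dat[, parents]) %*% betas)

# Step 2: apply the exponential function to obtain a subject-specific rate.
lambda <- exp(lin_pred)

# Step 3: draw one Poisson count per row, each with its own rate.
Y <- rpois(n = nrow(dat), lambda = lambda)

head(Y)
```

Because rpois() is vectorised over lambda, every row receives a draw from its own Poisson distribution without an explicit loop.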
Formal Description:
Formally, the data generation can be described as:
$$Y \sim Poisson(\lambda),$$
where \(Poisson()\) means that the variable is Poisson distributed with:
$$P_\lambda(k) = \frac{\lambda^k e^{-\lambda}}{k!}.$$
Here, \(k\) is the count and \(e\) is Euler's number. The parameter \(\lambda\) is determined as:
$$\lambda = \exp(\texttt{intercept} + \texttt{parents}_1 \cdot \texttt{betas}_1 + ... + \texttt{parents}_n \cdot \texttt{betas}_n),$$
where \(n\) is the number of parents (length(parents)).
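The probability mass function given above can be checked numerically against the dpois function from base R. The rate of 2.5 and the range of counts are arbitrary values chosen for illustration.

```r
lambda <- 2.5   # arbitrary rate, for illustration only
k <- 0:5        # counts at which to evaluate the mass function

# Direct evaluation of lambda^k * e^(-lambda) / k!
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)

# Built-in Poisson probability mass function
pmf_builtin <- dpois(k, lambda = lambda)

all.equal(pmf_manual, pmf_builtin)  # TRUE
```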
For example, given intercept=-15, parents=c("A", "B") and betas=c(0.2, 1.3), the data generation process is defined as:
$$Y \sim Poisson(\exp(-15 + A \cdot 0.2 + B \cdot 1.3)).$$
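To make the example concrete, the rate can be evaluated by hand for two hypothetical individuals. The parent values below are made up: with A = 0 and B = 0 the rate is exp(-15), roughly 3.1e-07, so a count of zero is almost certain, whereas with A = 10 and B = 10 the linear predictor is -15 + 2 + 13 = 0 and the rate is exp(0) = 1.

```r
intercept <- -15
betas <- c(A = 0.2, B = 1.3)

# Hypothetical parent values for two individuals.
dat <- data.frame(A = c(0, 10), B = c(0, 10))

# lambda = exp(intercept + A * 0.2 + B * 1.3), evaluated row by row
lambda <- exp(intercept + dat$A * betas[["A"]] + dat$B * betas[["B"]])

lambda
```

The strongly negative intercept keeps the expected count close to zero unless the parents take large values.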