
Generates a node from the data of its parents using negative binomial regression: the betas are applied to the design matrix to obtain a linear predictor, and the node values are then sampled with the rnbinom function.

Usage

node_negative_binomial(data, parents, formula=NULL, betas,
                       intercept, theta)

Arguments

data

A data.table (or something that can be coerced to a data.table) containing all columns specified by parents.

parents

A character vector specifying the names of the parents that this particular child node has. If non-linear combinations or interaction effects should be included, the user may specify the formula argument instead.

formula

An optional formula object describing how the node should be generated, or NULL (default). If supplied, it should start with ~ and have nothing on the left-hand side. The right-hand side may contain any valid formula syntax, such as A + B or A + B + I(A^2), allowing non-linear effects. If this argument is defined, there is no need to define the parents argument. For example, using parents=c("A", "B") is equivalent to using formula= ~ A + B.

betas

A numeric vector with length equal to the number of parents, specifying the causal beta coefficients used to generate the node.

intercept

A single number specifying the intercept that should be used when generating the node.

theta

A single number specifying the theta parameter (size argument in rnbinom).
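In the mu parameterisation of rnbinom, the variance of the distribution is mu + mu^2 / size, so smaller values of theta produce stronger overdispersion relative to the mean. The following base R check illustrates this; the values of mu and theta are arbitrary choices made for this illustration and are not taken from the package.

```r
set.seed(1)

# arbitrary illustrative values (assumptions, not package defaults)
mu <- 5
theta <- 0.5

# draw a large sample so that the empirical moments are stable
x <- rnbinom(n = 1e5, size = theta, mu = mu)

mean(x)                                 # close to mu
var(x)                                  # close to the theoretical variance
theoretical_var <- mu + mu^2 / theta    # 5 + 25 / 0.5 = 55
```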

Details

This function uses the linear predictor defined by the betas and the input design matrix to sample from a subject-specific negative binomial distribution. It does so by calculating the linear predictor from the data, betas and intercept, exponentiating it, and passing the result to the mu argument of the rnbinom function of the stats package.
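The steps described above can be sketched in base R. This is a minimal illustration of the general technique under stated assumptions, not the package's actual implementation: the data, the parent names A and B, and the coefficient values are all made up for this example.

```r
set.seed(42)

# hypothetical parent data (names and values are illustrative only)
data <- data.frame(A = rnorm(100),
                   B = rbinom(100, size = 1, prob = 0.5))

betas <- c(0.3, 0.8)   # one coefficient per parent, in the order A, B
intercept <- -1
theta <- 2

# linear predictor: intercept plus design matrix times betas
design <- as.matrix(data[, c("A", "B")])
eta <- intercept + as.vector(design %*% betas)

# exponentiate to obtain the subject-specific mean, then sample
mu <- exp(eta)
out <- rnbinom(n = nrow(data), size = theta, mu = mu)
```

Because mu is a vector, each row gets its own mean, which is what makes the distribution subject-specific, while theta is shared across all rows.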

Author

Robin Denz

Value

Returns a numeric vector of length nrow(data).

Examples

library(simDAG)

set.seed(124554)

dag <- empty_dag() +
  node("age", type="rnorm", mean=50, sd=4) +
  node("sex", type="rbernoulli", p=0.5) +
  node("smoking", type="negative_binomial", theta=0.05,
       formula= ~ -2 + sexTRUE*1.1 + age*0.4)

sim_dat <- sim_from_dag(dag=dag, n_sim=100, sort_dag=FALSE)