Simulate a Node Using Multinomial Regression
node_multinomial.Rd
The node is generated from the data of its parents using multinomial regression: the covariate-specific probability of each class is predicted, and the class of each individual is then sampled from a multinomial distribution with those probabilities.
Usage
node_multinomial(data, parents, betas, intercepts,
                 labels=NULL, output="factor",
                 return_prob=FALSE)
Arguments
- data
A data.table (or something that can be coerced to a data.table) containing all columns specified by parents.
- parents
A character vector specifying the names of the parents that this particular child node has.
- betas
A numeric matrix with length(parents) columns and one row for each class that should be simulated, specifying the causal beta coefficients used to generate the node.
- intercepts
A numeric vector with one entry for each class that should be simulated, specifying the intercepts used to generate the node.
- labels
An optional character vector giving the factor levels of the generated classes. If NULL (default), the integers are simply used as factor levels.
- output
A single character string specifying the output format. Must be one of "factor" (default), "character" or "numeric". If the argument labels is supplied, the output will be coerced to "character" by default.
- return_prob
Either TRUE or FALSE (default). Specifies whether to return the matrix of class probabilities or not. If you are using this function inside of a node call, you cannot set this to TRUE because it will return a matrix. It may, however, be useful when using this function by itself, or as a probability generating function for the node_competing_events function.
Details
This function works essentially like the node_binomial function. First, the matrix of betas coefficients is used together with the values of the parents nodes and the intercepts to calculate the expected subject-specific probability of each possible category, using the standard multinomial regression equations. Based on those probabilities, the rcategorical function then draws a single category for each individual.
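The two steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: it assumes a plain softmax over all classes (the exact parameterization used by node_multinomial, for example whether one class is treated as a reference, is not stated here), it uses sample() as a stand-in for rcategorical, and the parent values, coefficients and intercepts are made up.

```r
set.seed(42)

# Hypothetical parent data: two covariates for five individuals
n <- 5
parents <- cbind(sex = rbinom(n, size = 1, prob = 0.5),
                 age = rnorm(n, mean = 50, sd = 4))

# Assumed coefficients: one row per class, one column per parent
betas <- matrix(c(0.2, 0.4, 0.1, 0.5, 1.1, 1.2), ncol = 2)
intercepts <- c(1, 1, 1)

# Linear predictor for every individual (rows) and class (columns)
lin_pred <- sweep(parents %*% t(betas), 2, intercepts, "+")

# Softmax; subtracting the row maximum first avoids overflow in exp()
lin_pred <- lin_pred - apply(lin_pred, 1, max)
probs <- exp(lin_pred) / rowSums(exp(lin_pred))

# Draw exactly one class per individual from its own probability vector
classes <- apply(probs, 1, function(p) {
  sample(seq_along(p), size = 1, prob = p)
})
```

Here probs has one row per individual and one column per class, which mirrors the shape described for return_prob=TRUE, while classes holds a single integer class per individual.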
Since this function produces categorical output, it may be difficult to use this node type as a parent for other nodes. It is nevertheless possible with a user-defined node type (see node_custom for information on how to define those).
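One common way to let a categorical variable act as a parent is to dummy-code it before computing the child's linear predictor. The base R sketch below shows only that computation; it does not use the node_custom interface, and the data, coefficient values and the logistic child model are hypothetical choices made for illustration.

```r
set.seed(1)

# Hypothetical simulated data containing a categorical node called UICC
dat <- data.frame(UICC = factor(c(1, 3, 2, 3, 1), levels = 1:3))

# Dummy-code the factor; the first level serves as the reference category,
# so the intercept column is dropped and two indicator columns remain
dummies <- model.matrix(~ UICC, data = dat)[, -1, drop = FALSE]

# Assumed effects of the two non-reference classes on the child node
coefs <- c(UICC2 = 0.5, UICC3 = 1.5)
lin_pred <- -1 + as.vector(dummies %*% coefs)

# Example child: a binary variable generated from a logistic model
child <- rbinom(nrow(dat), size = 1, prob = plogis(lin_pred))
```

A user-defined node type could wrap a computation of this kind, taking the simulated data and returning the generated child values.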
Value
Returns a vector of length nrow(data). Depending on the arguments used, this vector may be of type character, numeric or factor. If return_prob=TRUE was used, it instead returns a numeric matrix with nrow(data) rows and one column per possible class.
Examples
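library(simDAG)

set.seed(3345235)

# age and sex are root nodes; UICC is a child node with three classes
dag <- empty_dag() +
  node("age", type="rnorm", mean=50, sd=4) +
  node("sex", type="rbernoulli", p=0.5) +
  node("UICC", type="multinomial", parents=c("sex", "age"),
       # 3 x 2 matrix: one row per class, one column per parent
       betas=matrix(c(0.2, 0.4, 0.1, 0.5, 1.1, 1.2), ncol=2),
       intercepts=1)

# simulate 100 individuals from the DAG
sim_dat <- sim_from_dag(dag=dag, n_sim=100)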