

All fully-capitalized words and phrases have the meanings specified in RFC 2119.

Agent-Based Model
Agent-Based Models

Agent-based modeling is a modeling paradigm in which population-level phenomena are emergent from interactions among simple agents. An agent-based model is a model constructed using this paradigm.

Boundary Store
Boundary Stores

Compartments interact through boundary stores that represent how the compartments affect each other. For example, between an environment compartment and a cell compartment, there might be a boundary store to track the flux of metabolites from the cell to the environment and vice versa.


Compartment
Compartments

We organize our models into compartments, each of which is like an agent in an agent-based model. Each compartment stores a collection of processes that operate on the compartment’s stores simultaneously. Compartments can be nested and interact with neighbor, parent, and child compartments through boundary stores. Thus, a model might contain a compartment for the environment that contains two child compartments for the two cells in the environment. For more details, see our guide to compartments.


Deriver
Derivers

Derivers run after all processes have run for a timepoint and compute values from the state of the model. These computed values are generally stored in the global store. For example, one common deriver uses the cell’s mass and density to compute the volume. For details on the particular derivers available, see the documentation for vivarium.core.repository.
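As a rough illustration of the volume example, the sketch below derives a volume from mass and density. It is plain Python rather than Vivarium's deriver API; the function name `derive_volume`, the state keys, and the units are assumptions made for illustration.

```python
def derive_volume(state):
    """Compute volume from the 'mass' and 'density' entries of a state dict."""
    return state['mass'] / state['density']


# Hypothetical values: mass in fg and density in fg/fL, so volume is in fL.
global_state = {'mass': 1339.0, 'density': 1100.0}

# A deriver runs after the processes and writes its result back into the state.
global_state['volume'] = derive_volume(global_state)
```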


Divider
Dividers

When a cell divides, we have to decide how to generate the states of its daughter cells. Dividers specify how to generate these daughter cells, for example by assigning half of the value of the variable in the mother cell to each of the daughter cells. We assign a divider to each variable in the schema. For more details, see the documentation for vivarium.core.repository.
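The idea of assigning one divider per variable can be sketched in plain Python. The function names `split_divider` and `copy_divider`, and the choice to copy density unchanged, are assumptions for illustration and are not the dividers shipped with Vivarium.

```python
def split_divider(value):
    """Give each daughter half of the mother's value."""
    half = value / 2
    return [half, half]


def copy_divider(value):
    """Give each daughter the mother's value unchanged (hypothetical)."""
    return [value, value]


mother = {'mass': 1200.0, 'density': 1100.0}

# One divider per variable, as a schema would specify.
dividers = {'mass': split_divider, 'density': copy_divider}

daughters = [{}, {}]
for variable, value in mother.items():
    daughter_values = dividers[variable](value)
    for daughter, daughter_value in zip(daughters, daughter_values):
        daughter[variable] = daughter_value
```

Mass is an extensive quantity, so halving it conserves the total across the daughters, while an intensive quantity such as density is more naturally copied.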

Embedded Timeseries

An embedded timeseries has nearly the same shape as a simulation state dictionary, only each variable’s value is a list of values over time, and there is an additional time key. For details, see the guide on simulation data formats.
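To make the shape concrete, the following sketch builds an embedded timeseries from a few state snapshots. The helper `append_snapshot`, the store and variable names, and the values are all hypothetical; the guide on simulation data formats remains the authoritative description.

```python
def append_snapshot(timeseries, state):
    """Append each variable value in a nested state dict to the matching list."""
    for key, value in state.items():
        if isinstance(value, dict):
            # Inner dicts mirror stores, so recurse with the same nesting.
            append_snapshot(timeseries.setdefault(key, {}), value)
        else:
            timeseries.setdefault(key, []).append(value)


# Hypothetical simulation states at three timepoints.
snapshots = {
    0: {'cell': {'mass': 1.0, 'volume': 2.0}},
    1: {'cell': {'mass': 1.1, 'volume': 2.2}},
    2: {'cell': {'mass': 1.2, 'volume': 2.4}},
}

embedded = {'time': []}
for time, state in sorted(snapshots.items()):
    embedded['time'].append(time)
    append_snapshot(embedded, state)
```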


Emitter
Emitters

While a simulation is running, the current state is stored in stores, but this information is overwritten at each timestep with an updated state. When we want to save off variable values for later analysis, we send these data to one of our emitters, each of which formats the data for a storage medium, for example a database or a Kafka message. We then query the emitter to get the formatted data.
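The emit-then-query pattern can be sketched with a minimal in-memory emitter. The class `ListEmitter` and its `emit` and `get_data` methods are hypothetical names chosen for this illustration and should not be read as Vivarium's emitter interface.

```python
import copy


class ListEmitter:
    """Hypothetical emitter that keeps emitted records in a Python list."""

    def __init__(self):
        self._records = []

    def emit(self, time, state):
        # Copy the state, since the simulation overwrites it at each timestep.
        self._records.append({'time': time, 'state': copy.deepcopy(state)})

    def get_data(self):
        # Return a new list so callers cannot reorder the stored records.
        return list(self._records)


emitter = ListEmitter()
state = {'cell': {'mass': 1.0}}
for time in range(3):
    emitter.emit(time, state)
    state['cell']['mass'] += 0.5  # the live state is overwritten in place

data = emitter.get_data()
```

The deep copy is the important design choice here: without it, every saved record would point at the same live state and show only the final values.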


Exchange
Exchanges

The flux between a cell and its environment. This is stored in a boundary store.


Experiment
Experiments

Vivarium defines simulations using vivarium.core.experiment.Experiment objects. These simulations can contain arbitrarily nested compartments, and you can run them to simulate your model over time. See the documentation for the Experiment class and our guide to experiments for more details.


Masking

When Vivarium passes stores to processes, it includes only the variables the process has requested. We call this filtering masking.
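Conceptually, masking amounts to filtering a store down to the requested variables. The sketch below shows that idea with a plain dict; the function `mask` and the variable names are assumptions for illustration, not Vivarium internals.

```python
def mask(store, requested):
    """Return a view of the store containing only the requested variables."""
    return {name: store[name] for name in requested}


# Hypothetical store with three variables; the process asks for two of them.
cytoplasm = {'glucose': 5.0, 'atp': 3.2, 'lactose': 0.1}
visible_to_process = mask(cytoplasm, ['glucose', 'atp'])
```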

Multiscale Model
Multiscale Models

Multiscale models use different spatial and temporal scales for their component sub-models. For example, Vivarium models a cell’s internal processes and the interactions between cells and their environment at different temporal scales since these processes require different degrees of temporal precision.

Path Timeseries

A path timeseries is a flattened form of an embedded timeseries where keys are paths in the simulation state dictionary and values are lists of the variable value over time. We describe simulation data formats in more detail in our guide to simulation data formats.
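The flattening can be sketched as a short recursive function. Representing each path as a tuple of keys is an assumption made here for illustration, as are the function name `to_path_timeseries` and the example data.

```python
def to_path_timeseries(embedded, path=()):
    """Flatten an embedded timeseries into a dict keyed by path tuples."""
    flat = {}
    for key, value in embedded.items():
        if isinstance(value, dict):
            # Descend into nested stores, extending the path as we go.
            flat.update(to_path_timeseries(value, path + (key,)))
        else:
            flat[path + (key,)] = value
    return flat


# Hypothetical embedded timeseries with one store and a time key.
embedded = {
    'cell': {'mass': [1.0, 1.1, 1.2], 'volume': [2.0, 2.2, 2.4]},
    'time': [0, 1, 2],
}
path_timeseries = to_path_timeseries(embedded)
```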


Port
Ports

When a process needs access to part of the model state, it will be provided a store. The ports of a process are what the process calls those stores. When running a process, you provide a store to each of the process’s ports. Think of the ports as physical ports into which a cable to a store can be plugged.


Process
Processes

A process in Vivarium models a cellular process by defining how the state of the model should change at each timepoint, given the current state of the model. During the simulation, each process is provided with the current state of the model and the timestep, and the process returns an update that changes the state of the model. Each process is an instance of a process class.

To learn how to write a process, check out our process-writing tutorial. For a detailed guide to processes, see our guide to processes.
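The state-in, update-out contract described above can be sketched without Vivarium at all. The class `GrowthSketch` below is a hypothetical stand-in that does not subclass vivarium.core.process.Process; its method name, port name, and growth rule are assumptions made for illustration, and the process-writing tutorial remains the reference for real processes.

```python
class GrowthSketch:
    """Hypothetical stand-in for a process; not a Vivarium process class."""

    def __init__(self, growth_rate=0.01):
        # Fractional mass gain per second (illustrative parameter).
        self.growth_rate = growth_rate

    def next_update(self, timestep, states):
        # Read the current state through a port named 'global' (assumed name).
        mass = states['global']['mass']
        # Return the change in mass over the timestep, not the new mass.
        return {'global': {'mass': mass * self.growth_rate * timestep}}


process = GrowthSketch(growth_rate=0.01)
states = {'global': {'mass': 1000.0}}
update = process.next_update(2.0, states)
```

Note that the sketch returns a change rather than a new value and leaves `states` untouched; how an update is combined with the old value is decided separately by an updater.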

Process Class
Process Classes

A process class is a Python class that defines a process’s model. These classes can be instantiated, and optionally configured, to create processes. Each process class must subclass either vivarium.core.process.Process or another process class.

Raw Data

The primary format for simulation data is “raw data.” See the guide on simulation data formats.


Schema
Schemas

A schema defines the properties of a set of variables by associating with each variable a set of schema key-value pairs.

Schema Key
Schema Keys
Schema Value
Schema Values
Schema Key-Value Pair
Schema Key-Value Pairs

Each variable is defined by a set of schema key-value pairs. The available keys are defined in vivarium.core.experiment.Store.schema_keys. These keys are described in more detail in the documentation for vivarium.core.experiment.Store.


Store
Stores

The state of the model is broken down into stores, each of which represents the state of some physical or conceptual subset of the overall state. For example, a cell model might have a store for the proteins in the cytoplasm, another for the transcripts in the cytoplasm, and one for the transcripts in the nucleus. Each variable must belong to exactly one store.


Template
Templates

A template describes a genetic element, its binding site, and the available downstream termination sites on genetic material. A chromosome has operons as its templates which include sites for RNA binding and release. An mRNA transcript also has templates which describe where a ribosome can bind and will subsequently release the transcript. Templates are defined in template specifications.

Template Specification
Template Specifications

Template specifications define templates as dict objects with the following keys:

  • id (str): The template name. You SHOULD use the name of the associated operon or transcript.

  • position (int): The index in the genetic sequence of the start of the genetic element being described. In a chromosome, for example, this would denote the start of the modeled operon’s promoter. On mRNA transcripts (where we are describing how ribosomes bind), this SHOULD be set to 0.

  • direction (int): 1 if the template should be read in the forward direction, -1 to proceed in the reverse direction. For mRNA transcripts, this SHOULD be 1.

  • sites (list): A list of binding sites. Each binding site is specified as a dict with the following keys:

    • position (int): The offset in the sequence from the template position to the start of the binding site. This value is not currently used and MAY be set to 0.

    • length (int): The length, in base-pairs, of the binding site. This value is not currently used and MAY be set to 0.

    • thresholds (list): A list of tuples, each of which has a factor name as the first element and a concentration threshold as the second. When the concentration of the factor exceeds the threshold, the site will bind the factor. For example, in an operon the factor would be a transcription factor.

  • terminators (list): A list of terminators, which halt reading of the template. As such, which genes are encoded on a template depends on which terminator halts transcription or translation. Each terminator is specified as a dict with the following keys:

    • position (int): The index in the genetic sequence of the terminator.

    • strength (int): The relative strength of the terminator. For example, if there remain two terminators ahead of RNA polymerase, the first of strength 3 and the second of strength 1, then there is a 75% chance that the polymerase will stop at the first terminator. If the polymerase does not stop, it is guaranteed to stop at the second terminator.

    • products (list): A list of the genes that will be transcribed or translated should transcription/translation halt at this terminator.
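The keys above can be put together into a small example specification. All identifiers and numbers below (`example_operon`, `tfA`, `geneA`, `geneB`, the positions and thresholds) are made up for illustration. The helper `stop_probabilities` is also hypothetical: it assumes that the chance of halting at a terminator equals its strength divided by the total strength of the terminators still ahead, a rule inferred from the 75% example above rather than taken from Vivarium's code.

```python
# Hypothetical template specification using the keys described above.
template = {
    'id': 'example_operon',
    'position': 0,
    'direction': 1,
    'sites': [
        {'position': 0, 'length': 0, 'thresholds': [('tfA', 0.3)]},
    ],
    'terminators': [
        {'position': 100, 'strength': 3, 'products': ['geneA']},
        {'position': 200, 'strength': 1, 'products': ['geneA', 'geneB']},
    ],
}


def stop_probabilities(terminators):
    """Overall probability of halting at each terminator, in reading order.

    Assumes positive strengths and that the stop chance at a terminator is
    its strength over the summed strength of all remaining terminators.
    """
    remaining = sum(terminator['strength'] for terminator in terminators)
    reach = 1.0  # probability of reaching the current terminator
    probabilities = []
    for terminator in terminators:
        stop = terminator['strength'] / remaining
        probabilities.append(reach * stop)
        reach *= 1 - stop
        remaining -= terminator['strength']
    return probabilities


probabilities = stop_probabilities(template['terminators'])
```

Under this assumed rule, the last terminator always has a stop chance of 1, which matches the statement that the polymerase is guaranteed to stop at the second terminator.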


Timepoint
Timepoints

We discretize time into timepoints and update the model state at each timepoint. We collect data from the model at each timepoint. Note that compartments may run with different timesteps depending on how finely we need to discretize time.


Timeseries

“Timeseries” can refer to the general way in which we store simulation data or to an embedded timeseries. See the guide on simulation data formats for details.


Timestep
Timesteps

The amount of time elapsed between two timepoints. This is the amount of time for which processes compute an update. For example, if we discretize time into two-second intervals, then each process will be asked to compute an update for how the state changes over the next two seconds. The timestep is two seconds.


Topology
Topologies

A topology defines how stores are associated to ports. This tells Vivarium which store to pass to each port of each process during the simulation. See the constructor documentation for vivarium.core.experiment.Experiment for a more detailed specification of the form of a topology.
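As a conceptual sketch only, a topology can be thought of as a mapping from port names to store paths. The tuple-of-keys path format, the port names `internal` and `external`, and both helper functions are assumptions for illustration; the Experiment constructor documentation defines the form Vivarium actually expects.

```python
def get_path(tree, path):
    """Follow a tuple of keys down a nested dict and return what is there."""
    node = tree
    for key in path:
        node = node[key]
    return node


def stores_for_ports(state, topology):
    """Look up the store that each port of a process should receive."""
    return {port: get_path(state, path) for port, path in topology.items()}


# Hypothetical model state with nested stores.
state = {
    'cell': {'cytoplasm': {'glucose': 5.0}},
    'environment': {'glucose': 20.0},
}

# Hypothetical topology for one process with two ports.
topology = {
    'internal': ('cell', 'cytoplasm'),
    'external': ('environment',),
}

stores = stores_for_ports(state, topology)
```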


Tree
Hierarchy

We nest the stores of a model to form a tree called a hierarchy. Each internal node is a store and each leaf node is a variable. This tree can be traversed like a directory tree, and stores are identified by paths. For details, see the hierarchy guide. Note that this used to be called a tree.


Update
Updates

An update describes how the model state should change due to the influence of a process over some period of time (usually a timestep).


Updater
Updaters

An updater describes how an update should be applied to the model state to produce the updated state. For example, the update could be added to the old value or replace it. Updaters are described in more detail in the documentation for vivarium.core.repository.
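The two behaviors mentioned above, adding to the old value and replacing it, can be sketched as plain functions. The names `accumulate`, `replace`, and `apply_update`, and the per-variable mapping, are hypothetical and are not Vivarium's registered updaters.

```python
def accumulate(current, update):
    """Add the update to the old value."""
    return current + update


def replace(current, update):
    """Discard the old value and use the update instead."""
    return update


def apply_update(state, update, updaters):
    """Return a new state with each updated variable passed through its updater."""
    new_state = dict(state)
    for name, value in update.items():
        new_state[name] = updaters[name](state[name], value)
    return new_state


# Hypothetical variables, each with its own updater.
updaters = {'mass': accumulate, 'is_dead': replace}
state = {'mass': 100.0, 'is_dead': False}
update = {'mass': 2.5, 'is_dead': True}

new_state = apply_update(state, update, updaters)
```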


Variable
Variables

The state of the model is a collection of variables. Each variable stores a piece of information about the full model state. For example, the concentration of glucose in the cytoplasm might be a variable, while the concentration of glucose-6-phosphate in the cytoplasm is another variable. The extracellular concentration of glucose might be a third variable. As these examples illustrate, variables often track the amount of a molecule in a physical region. Exceptions exist, though: for instance, whether a cell is dead could also be a variable.

Each variable is defined by a set of schema key-value pairs.

Whole-Cell Model
Whole-Cell Models

Whole-cell models seek to simulate a cell by modeling the molecular mechanisms that occur within it. For example, a cell’s export of antibiotics might be modeled by the transcription of the appropriate genes, translation of the produced transcripts, and finally complexation of the translated subunits. Ideally the simulated phenotype is emergent from the modeled processes, though many such models also include assumptions that simplify the model.