Connect data to simulations

By specifying data in a structured way, it is possible to automate the creation of simulations and to connect them to the data. There are many ways to do this; one approach that we have found useful is the following:

  1. Define the data in a .json format (which can be imported as a dict)
    1. The top level corresponds to an experiment
    2. Each experiment has a field that defines the input(s)
    3. Each experiment has a field for each observable (i.e. the quantity measured)
  2. Make sure that the model has one feature for each observable (or handle the mapping however you see fit)
  3. Create one simulation per experiment, using that experiment's input
  4. Iterate over the experiments, simulate, and iterate over the observables/features

The data file

Here is an example of such a data file:

data.json
{
  "dose1": {
    "inputs": { "activation": { "t": [5], "f": [0, 1], "unit": "nM", "type": "Piecewise constant" } },
    "time unit": "m",
    "observables": {
      "Rp": {
        "time": [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55],
        "mean": [
          0.0022111802021002169, 0.0017466503673310578, 0.076964673078638177,
          0.060160763706753204, 0.050972701882345074, 0.037680234623606632,
          0.02794873341532917, 0.034981410094982145, 0.028260930343034556,
          0.028707701931256373, 0.019013511890922985, 0.021886781125434334
        ],
        "SEM": [
          0.0033467258866709568, 0.0033467258866709568, 0.0035224608493589068,
          0.0045910672016274707, 0.0051341955689141313, 0.0034730839444259542,
          0.0036044146415795809, 0.0038831063575777217, 0.0019989076589235627,
          0.0047650618307062019, 0.0036907308145306206, 0.0018120692445672429
        ]
      }
    }
  },
  "dose2": {
    "inputs": { "activation": { "t": [5], "f": [0, 2], "unit": "nM", "type": "Piecewise constant" } },
    "time unit": "m",
    "observables": {
      "Rp": {
        "time": [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55],
        "mean": [
          0.0030111802021002169, 0.0022719853305979516, 0.126991710579753,
          0.08527788255432267, 0.076459052823517615, 0.056520351935409947,
          0.041923100122993759, 0.0325327113883334, 0.039307986709475058,
          0.026698162796068427, 0.028520267836384477, 0.0328301716881515
        ],
        "SEM": [
          0.0033467258866709568, 0.0033467258866709568, 0.00528369127403836,
          0.006886600802441206, 0.0077012933533711966, 0.0052096259166389313,
          0.0054066219623693712, 0.004659727629093266, 0.0029983614883853441,
          0.0071475927460593024, 0.0055360962217959313, 0.0071475927460593024
        ]
      }
    }
  }
}
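Since everything downstream assumes this structure, it can pay off to sanity-check the data file before building simulations: every observable needs time, mean, and SEM vectors of equal length. A minimal sketch (the function `check_data` is ours, not part of any API; the inline snippet mirrors the data.json format above):

```python
import json

def check_data(data):
    """Check that every observable has time, mean and SEM vectors of equal length."""
    for experiment, experiment_data in data.items():
        for observable, obs in experiment_data["observables"].items():
            n = len(obs["time"])
            assert len(obs["mean"]) == n, f"{experiment}/{observable}: mean length != time length"
            assert len(obs["SEM"]) == n, f"{experiment}/{observable}: SEM length != time length"

# A minimal inline snippet with the same structure as data.json above:
snippet = json.loads("""
{
  "dose1": {
    "inputs": {"activation": {"t": [5], "f": [0, 1], "unit": "nM", "type": "Piecewise constant"}},
    "time unit": "m",
    "observables": {"Rp": {"time": [0, 5], "mean": [0.0022, 0.0017], "SEM": [0.0033, 0.0033]}}
  }
}
""")
check_data(snippet)  # passes silently; raises AssertionError on a mismatch
```

In practice you would call `check_data(data)` right after loading data.json, so that format errors surface before any simulation is created.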

The model file

Here is an example of a model file, which contains features for the observables in the data file:

M1.txt
########## NAME
M1
########## METADATA
time_unit = m
########## MACROS
########## STATES
d/dt(R) = r3+r2-r1
d/dt(Rp) = r1-r2-r3

R(0) = 1.0
Rp(0) = 0.0

########## PARAMETERS
k1 = 1.0
k2 = 0.0001
kfeed = 1000000.0

########## VARIABLES
r1 = R*A*k1
r2 = Rp*k2
r3 = Rp*kfeed

########## FUNCTIONS
########## EVENTS
########## OUTPUTS
########## INPUTS
A = activation @ 1

########## FEATURES
Rp = Rp
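Note that the two state equations sum to zero, so R + Rp = 1 is conserved. For a piecewise-constant input, M1 therefore reduces to a single linear ODE, d/dt(Rp) = k1*A*(1 - Rp) - (k2 + kfeed)*Rp, which can be solved exactly on each constant-input interval. The following stand-alone sketch (pure NumPy, no sund required; purely illustrative, using the default parameter values above, which are starting guesses rather than fitted values) exploits this:

```python
import numpy as np

# Parameter values from the model file above (initial guesses, not fitted values)
k1, k2, kfeed = 1.0, 0.0001, 1000000.0

def simulate_rp(times, t_switch=5.0, A_before=0.0, A_after=1.0, Rp0=0.0):
    """Exact solution of M1 for a single piecewise-constant input step.

    Because R + Rp = 1, the system reduces to
    d/dt(Rp) = k1*A*(1 - Rp) - (k2 + kfeed)*Rp,
    which is linear for constant A and can be stepped analytically.
    """
    def step(Rp_start, A, dt):
        a = k1 * A
        b = k1 * A + k2 + kfeed
        ss = a / b  # steady state for this input level
        return ss + (Rp_start - ss) * np.exp(-b * dt)

    out = []
    for t in times:
        if t <= t_switch:
            out.append(step(Rp0, A_before, t))
        else:
            Rp_switch = step(Rp0, A_before, t_switch)
            out.append(step(Rp_switch, A_after, t - t_switch))
    return np.array(out)

rp = simulate_rp([0, 5, 10, 55])  # Rp over the "dose1" input step
```

This is only a hand-rolled illustration of what the model encodes; in the rest of this page the sund toolbox does the simulation.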

Automatically creating the simulations

Now that we have data which defines the inputs and observables, and a model file which has features for all observables, we can create simulations from the data and model file:

import json

import numpy as np
import sund

with open("data.json", "r") as f:
    data = json.load(f)

# Get all times in all observables for each experiment:
for experiment in data:
    all_times = [t
                 for observable in data[experiment]["observables"]
                 for t in data[experiment]["observables"][observable]["time"]]
    data[experiment]["all_times"] = list(sorted(np.unique(all_times)))

# Load the model (we assume it is already installed)
model = sund.load_model("M1")

# Create the simulation objects using `data` and `model`
sims = {}

for experiment, experiment_values in data.items():
    time_unit = experiment_values["time unit"]
    activity = sund.Activity(time_unit = time_unit) # adapt to whichever time unit you have in data
    for input_key, input_values in experiment_values["inputs"].items():
        if input_values["type"] == "Piecewise constant":
            activity.add_output("piecewise_constant", input_key, t=input_values["t"], f=input_values["f"])
        elif input_values["type"] == "constant":
            activity.add_output("constant", input_key, t=input_values["t"], f=input_values["f"])

    sims[experiment] = sund.Simulation(models = [model], activities = [activity], time_unit = time_unit, time_vector = experiment_values["all_times"])
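On the "Piecewise constant" convention used in the data file: with t = [5] and f = [0, 1], the signal equals f[0] before the first breakpoint and switches to f[1] at t = 5, so f always has one more element than t. As a plain-NumPy illustration of this convention (not the sund implementation):

```python
import numpy as np

def piecewise_constant(t, breakpoints, values):
    """Evaluate a piecewise-constant signal: values[0] before breakpoints[0],
    values[i] from breakpoints[i-1] onwards. len(values) == len(breakpoints) + 1."""
    values = np.asarray(values)
    return values[np.searchsorted(breakpoints, t, side="right")]

activation = piecewise_constant(np.array([0.0, 4.9, 5.0, 10.0]), [5], [0, 1])
# -> array([0, 0, 1, 1])
```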

Compare simulations and data by iterating over the experiments and observables/features

Once you have the simulations corresponding to the experiments in the data, it is possible to, e.g. in an objective function, iterate over the experiments, simulate each one, and compare all observables to the corresponding features:

def objective_function(params, sims, data):
    cost = 0  # Initialize the total cost

    for experiment, experiment_data in data.items():
        try:
            # Simulate the experiment with the given parameters
            sims[experiment].simulate(parameter_values=params, reset=True)
        except Exception:
            # Penalize parameter sets for which the simulation fails
            return 1e30

        # Iterate over all observables in the experiment
        for observable_key, observable_data in experiment_data["observables"].items():
            # Find the index of the observable in the simulation's feature list
            feature_idx = sims[experiment].feature_names.index(observable_key)

            # Find the time points in the simulation that match the observable's time points
            t_idx = np.isin(sims[experiment].time_vector, observable_data["time"])

            # Extract the simulated data for the observable at the matching time points
            y_sim = sims[experiment].feature_data[t_idx, feature_idx]

            # Extract the experimental mean and standard error of the mean (SEM)
            y_exp = np.array(observable_data["mean"])
            SEM = np.array(observable_data["SEM"])

            # Compute the squared error, normalized by SEM, and add it to the total cost
            cost += np.sum(np.square((y_sim - y_exp) / SEM))

    return cost
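If you want to check the cost computation itself without installing sund, the simulation object can be replaced by a minimal stand-in. Everything below (`MockSim`, `compute_cost`, the example data) is ours, purely for illustration; the cost loop mirrors the objective function above, minus the failure handling:

```python
import numpy as np

class MockSim:
    """Minimal stand-in for a sund simulation object (illustration only)."""
    def __init__(self, time_vector, feature_names, feature_data):
        self.time_vector = np.array(time_vector)
        self.feature_names = list(feature_names)
        self.feature_data = np.array(feature_data)

    def simulate(self, parameter_values=None, reset=True):
        pass  # a real simulation would recompute feature_data here

def compute_cost(sims, data):
    """Same cost loop as the objective function above, without the failure guard."""
    cost = 0
    for experiment, experiment_data in data.items():
        sims[experiment].simulate(parameter_values=None, reset=True)
        for observable, obs in experiment_data["observables"].items():
            feature_idx = sims[experiment].feature_names.index(observable)
            t_idx = np.isin(sims[experiment].time_vector, obs["time"])
            y_sim = sims[experiment].feature_data[t_idx, feature_idx]
            cost += np.sum(np.square((y_sim - np.array(obs["mean"]))
                                     / np.array(obs["SEM"])))
    return cost

data = {"exp1": {"observables": {"Rp": {"time": [0, 5],
                                        "mean": [0.0, 1.0],
                                        "SEM": [0.5, 0.5]}}}}
sims = {"exp1": MockSim(time_vector=[0, 5], feature_names=["Rp"],
                        feature_data=[[0.0], [0.0]])}

cost = compute_cost(sims, data)  # ((0 - 0)/0.5)**2 + ((0 - 1)/0.5)**2 = 4.0
```

A mock like this is also a convenient way to unit-test an objective function before handing it to an optimizer.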