Validation toolkit


Disease models can be used, as in the ProValue trial, to complement the outcomes of a clinical trial. For example, a disease model can take the outcomes of a trial and extrapolate them over the follow-up period after the trial itself has finished. This makes disease models useful tools for researchers and decision makers in health care. To support decisions effectively, it is crucial that the disease model itself is valid. The validity of a model can be shown through a validation process that covers the inputs of the model, its calculations, and the precision of its outputs. Each of these validation steps uses different techniques: the structure of a model can be validated with a face validity check; internal validation is done by performing sensitivity analyses and extreme value analyses; and external validation compares the outcomes of a disease model with real-world data, e.g. data obtained from previous clinical trials. The purpose of this document is to describe the steps we, the Prosit modelling group, undertook to perform an internal validation. On this page we describe the validation toolkit we have set up. The results of the internal validation are presented elsewhere.


We developed our validation toolkit in LibreOffice, the same software we used to develop our six diabetes models, and built it on the same template as the models to keep the look and feel consistent. To be able to validate all diabetes models using the same input, we developed an input distributor which distributes the same input to all models. The validation toolkit is implemented primarily with cell formulas and macro programming. As a next step, we created one validation file for each sub-model. Within this file, the validation toolkit was tailored to validate the inputs of one specific sub-model.


We can perform deterministic sensitivity analyses (DSA) and extreme value analyses (EVA) with our validation toolkit.

Deterministic Sensitivity Analyses

We decided to use a 10% and a 25% variation of each parameter in the DSA to show the effects of a moderate and a large change in the input parameter. We vary the cohort characteristics, the quality of life (QoL) values, the cost values, and the initial distribution over health states. The proportional variation of input parameters is the same for all validation files. To have a comparator for the validation results, we start the validation process with a single model run using the baseline parameters. The overall QoL and costs of this model run are stored on the results page of the validation file as reference values. Afterwards, a single parameter is changed and another run of the sub-model is performed. The validation file automatically retrieves the calculated total QoL and total costs of this model run. The input parameters are then reset to their original values to prepare the model for the validation of the next parameter. In the end, we present the effect of each change using tornado plots for the costs, the QoL, and the cost-effectiveness. In all diagrams we compare the results of the DSA to those of the baseline calculations. Furthermore, we create cost-effectiveness planes that show all variations as a scatter plot.
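The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toolkit itself is built from cell formulas and macros, and the function toy_model, its parameter names, and its numbers are hypothetical stand-ins for a sub-model run.

```python
from copy import deepcopy

# Relative variations used in the DSA: -25%, -10%, +10%, +25%.
VARIATIONS = (-0.25, -0.10, 0.10, 0.25)


def toy_model(params):
    """Hypothetical stand-in for one sub-model run; returns (total QoL, total costs)."""
    total_qol = params["qol_healthy"] * 10 + params["qol_sick"] * 5
    total_costs = params["cost_healthy"] * 10 + params["cost_sick"] * 5
    return total_qol, total_costs


def one_way_dsa(model, baseline, variations=VARIATIONS):
    """Vary one parameter at a time and record the difference to the baseline run."""
    ref_qol, ref_costs = model(baseline)  # reference values from the baseline run
    results = []
    for name in baseline:
        for rel in variations:
            params = deepcopy(baseline)  # every run starts from the original values
            params[name] = baseline[name] * (1 + rel)
            qol, costs = model(params)
            results.append({
                "parameter": name,
                "variation": rel,
                "delta_qol": qol - ref_qol,
                "delta_costs": costs - ref_costs,
            })
    return results


# Made-up baseline inputs, for illustration only.
baseline = {"qol_healthy": 0.9, "qol_sick": 0.5,
            "cost_healthy": 100.0, "cost_sick": 1000.0}
dsa = one_way_dsa(toy_model, baseline)
```

Each entry in dsa holds the change in total QoL and total costs relative to the baseline run, which is the kind of data a tornado plot or a cost-effectiveness plane would be drawn from.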

Quality of Life Values

We perform a one-way DSA for all QoL values of all health states included in the specific sub-model by altering each parameter by -25%, -10%, 10%, and 25%.


A one-way DSA for all cost values of all health states included in the specific sub-model is conducted in the same way, by altering each parameter by -25%, -10%, 10%, and 25%.

Cohort Characteristics

In all sub-models, the overall cohort consists of 16 distinct sub-cohorts. Each sub-cohort has its own characteristics, e.g. age, blood pressure, etc. To vary a specific attribute of the overall cohort, the parameter describing this attribute is increased or decreased simultaneously in every sub-cohort, with the altered value calculated for each sub-cohort individually. With this approach, we calculate a one-way DSA for all attributes defining the characteristics of the cohort, varying them by -25%, -10%, 10%, and 25%.
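As an illustration, the simultaneous variation of one attribute across all sub-cohorts could look like the following Python sketch. The attribute names and values are hypothetical, and the function vary_attribute is not part of the toolkit.

```python
def vary_attribute(sub_cohorts, attribute, rel_change):
    """Return copies of the sub-cohorts with one attribute scaled in every sub-cohort."""
    varied = []
    for cohort in sub_cohorts:
        new_cohort = dict(cohort)  # leave the baseline sub-cohort untouched
        # The altered value is computed from each sub-cohort's own baseline value.
        new_cohort[attribute] = cohort[attribute] * (1 + rel_change)
        varied.append(new_cohort)
    return varied


# 16 hypothetical sub-cohorts with made-up characteristics.
sub_cohorts = [{"age": 40 + i, "systolic_bp": 120 + 2 * i} for i in range(16)]
varied = vary_attribute(sub_cohorts, "systolic_bp", 0.10)
```

Only the selected attribute changes; all other characteristics of each sub-cohort keep their baseline values.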

Input Distribution over Health States

Similar to the cohort characteristics, the initial distributions are varied for each sub-cohort individually. The proportion of the cohort starting in a given health state is varied relative to its proportion in the baseline cohort. The proportions in the remaining health states are adjusted so that the overall distribution still adds up to 100%, while the ratios among the proportions that are not subject to the variation stay the same. Each proportion is varied by -25%, -10%, 10%, and 25%.
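One way to implement such an adjustment is to rescale the remaining proportions by a common factor. The Python sketch below assumes this is how the redistribution works; the health state names and proportions are made up, and the input distribution is assumed to sum to 1.

```python
def vary_proportion(distribution, state, rel_change):
    """Scale the proportion of one health state and rescale the others to sum to 1.

    The remaining proportions share one scaling factor, so the ratios among
    them stay the same as in the baseline distribution.
    """
    target = distribution[state] * (1 + rel_change)
    if not 0.0 <= target <= 1.0:
        raise ValueError("varied proportion must stay between 0 and 1")
    rest = 1.0 - distribution[state]  # baseline share of all other states
    if rest == 0.0:
        if target != 1.0:
            raise ValueError("no other populated state to absorb the change")
        return dict(distribution)
    scale = (1.0 - target) / rest
    return {s: (target if s == state else p * scale)
            for s, p in distribution.items()}


# Hypothetical initial distribution of one sub-cohort.
initial = {"no_complication": 0.6, "mild": 0.3, "severe": 0.1}
varied = vary_proportion(initial, "mild", 0.25)
```

In this example the proportion in "mild" rises from 0.3 to 0.375, and the other two states shrink by the same factor, keeping their 6:1 ratio. A state with a baseline proportion of 0 stays at 0 under any relative variation.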

Extreme Value Analyses

The EVA only covers the cohort characteristics. The extreme values were obtained from a clinical expert. They describe the highest and lowest values which could possibly be observed in a human. We used the same extreme values in all validation files. We report the EVA using a tornado plot and a scatter plot with the baseline calculation as a reference value.
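A minimal Python sketch of such an analysis is given below. It assumes that the extreme value of an attribute is applied to every sub-cohort at once; the text does not specify this detail. The bounds in EXTREMES are placeholders, not the values provided by the clinical expert, and toy_outcome is a hypothetical stand-in for a sub-model run.

```python
def toy_outcome(sub_cohorts):
    """Hypothetical stand-in for a sub-model run; returns a single outcome value."""
    total = sum(0.1 * c["age"] + 0.05 * c["systolic_bp"] for c in sub_cohorts)
    return total / len(sub_cohorts)


# Placeholder bounds for illustration only, NOT the clinical expert's values.
EXTREMES = {"age": (18, 110), "systolic_bp": (60, 250)}


def extreme_value_analysis(model, sub_cohorts, extremes):
    """Set one attribute at a time to its lowest and highest value in all sub-cohorts."""
    reference = model(sub_cohorts)  # baseline calculation used as reference value
    deltas = {}
    for attribute, (low, high) in extremes.items():
        for label, value in (("low", low), ("high", high)):
            varied = [{**c, attribute: value} for c in sub_cohorts]
            deltas[(attribute, label)] = model(varied) - reference
    return deltas


sub_cohorts = [{"age": 50, "systolic_bp": 130} for _ in range(16)]
eva = extreme_value_analysis(toy_outcome, sub_cohorts, EXTREMES)
```

The resulting differences to the reference value are what a tornado plot of the EVA would display.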

Input Distributor

With this file, users only have to enter the cohort characteristics once. The values are then distributed to all Prosit diabetes models, provided that the file names are spelled correctly and the model files are in the same folder as the distributor. The input values are distributed by macros. The destination files should be closed before starting the distribution.
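The two preconditions on file names and file location can be checked before a distribution is started. The following Python sketch shows such a check; it is not part of the distributor, which works with macros, and the file names in MODEL_FILES are hypothetical because the actual names are not given here.

```python
import tempfile
from pathlib import Path

# Hypothetical file names; the real model file names are not given in the text.
MODEL_FILES = ["model_a.ods", "model_b.ods", "model_c.ods"]


def find_missing_models(folder, model_files):
    """Return the expected model files that are not present in the given folder."""
    folder = Path(folder)
    return [name for name in model_files if not (folder / name).is_file()]


# Demonstration with a temporary folder that contains only two of the three files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model_a.ods").touch()
    (Path(tmp) / "model_b.ods").touch()
    missing = find_missing_models(tmp, MODEL_FILES)

print(missing)
```

An empty result would indicate that all expected files are in place; a misspelled or misplaced file shows up by name.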


As of now, the toolkit cannot automatically perform an external validation. Other methods of internal validation, such as probabilistic sensitivity analysis, are also not yet part of the toolkit. The presented toolkit can only internally validate the six sub-models in the Prosit framework. We decided to use 10% and 25% as standard values for the DSA, but these values are not fixed: they can be changed easily, and further percentages can be added.


The toolkit is a powerful support in the model development process, but it is limited in its field of usage. Firstly, we decided not to vary the size of each sub-cohort. Because there are 16 sub-cohorts, even a large change in size, e.g. 25%, in one of them has only a very small effect on the outcomes. Secondly, we do not artificially put a part of a sub-cohort into a health state if this health state had an initial proportion of 0%. This means that “empty” health states stay empty during their DSA and do not generate any results of interest during the internal validation. Thirdly, we vary each QoL value on its own. This creates unrealistic situations during the validation process in which a severe health state has a higher QoL value than a less severe one. The toolkit was set up in this way because it enables us to examine the impact of each QoL value individually. Lastly, the toolkit only validates the relationship between input and output. It does not look at the effects an altered model structure or altered transition probabilities might have.
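The unrealistic QoL orderings mentioned above can be made visible with a simple check. The Python sketch below is an illustration only and not a feature of the toolkit; the health states and QoL values are made up.

```python
def ordering_violations(qol_by_severity):
    """Find neighbouring health states where the more severe one has the higher QoL.

    qol_by_severity is a list of (state, qol) pairs ordered from the least
    to the most severe health state.
    """
    pairs = zip(qol_by_severity, qol_by_severity[1:])
    return [(milder, worse) for (milder, q_mild), (worse, q_worse) in pairs
            if q_worse > q_mild]


# Hypothetical baseline QoL values, ordered by increasing severity.
baseline_qol = [("mild", 0.80), ("moderate", 0.70), ("severe", 0.60)]

# Vary only the QoL of the most severe state by +25%, as in a one-way DSA.
varied_qol = [(state, qol * 1.25 if state == "severe" else qol)
              for state, qol in baseline_qol]

violations = ordering_violations(varied_qol)
```

Here the varied QoL of "severe" (0.75) exceeds that of "moderate" (0.70), so this pair is reported, while the baseline values produce no violation.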


This paper describes how the Prosit validation toolkit works. The results of the validation are described elsewhere. Within its scope, the toolkit supports the development team, as it can automatically perform major parts of an internal validation. Repeated validation is a necessary step after each major change in the sub-models. The toolkit therefore saves time in the validation process, which we can put into the development of the models.