Virological, serological and clinical outcomes in chronic hepatitis B virus infection: development and validation of the HEPA-B simulation model


Hepatitis B virus (HBV) is the world’s leading cause of chronic liver disease, liver cancer and liver-related mortality.1 More than 250 million people live with chronic HBV worldwide, and 800 000 people die annually from HBV-related causes, such as cirrhosis and hepatocellular carcinoma (HCC).1 2 It is estimated that HBV will cause more than 1.3 million annual deaths and 30 million lost years of life over the next 20 years.3 The urgent need to address HBV has been emphasised in the WHO call to ‘eliminate the public health threat of viral hepatitis by 2030’.4

Tools to reduce the burden of HBV are available in the forms of vaccination, which prevents infection, and treatment with nucleoside or nucleotide analogues, which inhibit HBV DNA replication and can reduce the risk of developing cirrhosis and/or HCC.5 However, major implementation challenges exist globally, particularly in sub-Saharan Africa where more than 60 million people have chronic HBV and 100 000 people die from its complications every year.2 4 6 With limited resources, clinicians and policymakers in these settings face difficult decisions determining which persons would benefit from treatment and which public health strategies should be prioritised to address HBV.7

Simulation models are tools that can be used to evaluate public health interventions for transmissible and chronic infections.8–10 By incorporating epidemiological data from multiple sources into the model, investigators can compare the clinical impact of different treatment strategies that have not been directly compared in clinical trials. Models can also provide complementary information to clinical trials and observational studies by projecting data beyond the duration of time-limited studies and can be used in cost-effectiveness analyses to guide policy decisions.8 Simulation models have provided influential evidence in other global disease prevention and treatment efforts, including in HIV, tuberculosis and malaria.11–17 Our objective was to develop and validate a novel microsimulation model detailing the natural history of chronic HBV.


Analytical overview

We developed and validated the HEPA-B Model, a novel state-transition, Monte Carlo microsimulation model of the natural history of chronic HBV disease. The model structure consists of health states related to chronic HBV infection that are mutually exclusive and collectively exhaustive. Simulated people move between health states based on monthly transition probabilities derived from clinical trial and observational data. Model outcomes include the cumulative incidence of serological and clinical events in the natural history of chronic HBV: loss of HBV ‘e’ antigen (HBeAg) for people who gain antibodies against HBeAg, loss of HBV surface antigen (HBsAg), cirrhosis, HCC and death. We simulated clinical events during the HBeAg-negative phases of infection based on definitions from the European Association for the Study of the Liver (online supplemental table 1), because the majority of adults with chronic HBV are in these phases.18 19 We validated the model by comparing its projected outcomes with observational data from two different HBV-endemic regions, Taiwan and The Gambia.20 21


Model structure

The model includes four phases of untreated, chronic HBV infection, defined by the presence or absence of HBeAg (ie, positive or negative) and the level of HBV DNA activity (ie, infection or hepatitis) (online supplemental figure 1).18 At model start, a simulated person with HBV draws randomly for demographic and clinical characteristics from a population distribution, including a distribution of chronic HBV phases. Every month, each person has a probability of advancing to the next chronic HBV phase (online supplemental figure 2). People exit the HBV phases if they die or experience HBsAg loss, which is the ‘functional cure’ of HBV infection.22 The transition probability to HBsAg loss for people in HBeAg-negative phases of infection is conditioned on age and the current HBV DNA level.23 24 Because HBV DNA levels fluctuate monthly, the ‘current HBV DNA level’ at each month is derived from a moving average of HBV DNA values over the prior 12 months.
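The 12-month moving average described above can be illustrated with a minimal sketch. The function name, the example values and the choice to average on a log10 scale are assumptions made for illustration; the source does not state the scale or the handling of the first months of a simulation.

```python
def current_hbv_dna(history, window=12):
    """Return the mean of the most recent `window` monthly HBV DNA values.

    Assumption: values are log10 copies/mL. If fewer than `window` months
    are available, the mean of the available months is used.
    """
    recent = list(history)[-window:]
    return sum(recent) / len(recent)


# Hypothetical monthly log10 HBV DNA values for one simulated person.
monthly_log10_dna = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 4.7, 5.0, 5.1, 4.9, 5.0, 3.0]

# Only the last 12 values contribute; a single low month is damped by the average.
smoothed = current_hbv_dna(monthly_log10_dna)
```

Smoothing in this way keeps a single month's fluctuation from abruptly changing the HBV DNA stratum that conditions downstream transition probabilities.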

In addition to progressing through the HBV phases, each simulated person in the model has a monthly probability of developing cirrhosis, HCC and decompensated liver disease.19 We did not specifically simulate the development of fibrosis. Decompensated cirrhosis was included with decompensated liver disease, which was modelled as a distinct health state from cirrhosis. The monthly probability of developing cirrhosis or HCC is conditioned on age, sex and current HBV DNA level; incident HCC is also conditioned on the presence of cirrhosis.25–27 Simulated people with cirrhosis or HCC incur a monthly probability of developing decompensated liver disease independent of current HBV DNA level. Decompensated liver disease is defined as a deterioration of liver function, which can occur from progressive cirrhosis or HCC and is characterised by jaundice, ascites, hepatic encephalopathy, hepatorenal syndrome or variceal haemorrhage.28 People who develop both cirrhosis and HCC are classified as having decompensated liver disease. At the end of each month, each person has a probability of death, based on their age, sex and presence of cirrhosis, HCC or decompensated liver disease.29 People with either cirrhosis or HCC have a higher monthly mortality compared with people without either; people with decompensated liver disease have a higher mortality than people with cirrhosis or HCC.29
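One way to picture the monthly cycle described in this paragraph is the sketch below. It is not the HEPA-B implementation: the class, the function, the dictionary keys and every probability are hypothetical, and the conditioning on age, sex and HBV DNA level is omitted for brevity. The order of evaluation within a month is also an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    cirrhosis: bool = False
    hcc: bool = False
    decompensated: bool = False
    alive: bool = True


def step_month(person, probs, rng):
    """Advance one simulated person by one month (all probabilities hypothetical)."""
    if not person.alive:
        return person
    # Incident cirrhosis.
    if not person.cirrhosis and rng.random() < probs["cirrhosis"]:
        person.cirrhosis = True
    # Incident HCC, conditioned on the presence of cirrhosis.
    if not person.hcc:
        key = "hcc_with_cirrhosis" if person.cirrhosis else "hcc_without_cirrhosis"
        if rng.random() < probs[key]:
            person.hcc = True
    # Having both cirrhosis and HCC is classified as decompensated liver disease;
    # otherwise either condition carries a monthly risk of decompensation.
    if not person.decompensated:
        if person.cirrhosis and person.hcc:
            person.decompensated = True
        elif (person.cirrhosis or person.hcc) and rng.random() < probs["decompensation"]:
            person.decompensated = True
    # Death is evaluated at the end of the month, using the most severe state.
    if person.decompensated:
        p_death = probs["death_decompensated"]
    elif person.cirrhosis or person.hcc:
        p_death = probs["death_cirrhosis_or_hcc"]
    else:
        p_death = probs["death_baseline"]
    if rng.random() < p_death:
        person.alive = False
    return person


# Hypothetical monthly probabilities, for illustration only.
example_probs = {
    "cirrhosis": 0.001,
    "hcc_without_cirrhosis": 0.0002,
    "hcc_with_cirrhosis": 0.002,
    "decompensation": 0.005,
    "death_baseline": 0.0005,
    "death_cirrhosis_or_hcc": 0.001,
    "death_decompensated": 0.003,
}

rng = random.Random(0)  # fixed seed so the trace is reproducible
person = Person()
for _ in range(120):  # ten years of monthly cycles
    step_month(person, example_probs, rng)
```

Repeating this loop over a large simulated cohort and tallying events gives the cumulative incidence curves that a microsimulation of this kind reports.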

Inputs and data sources

Model inputs include epidemiological characteristics of the cohort population and transition probabilities between HBV phases (table 1). Whenever possible, we used region-specific parameters from observational studies. Important parameters include rate of HBeAg loss (pooled rate, 6.46% per year),30 duration of time in HBeAg-positive chronic hepatitis (mean, 5.0 years (SD, 0.67 years)),31 and incidence of HBeAg-negative chronic hepatitis (2.56%–10.77% per year).32 Return to HBeAg-negative chronic infection from HBeAg-negative chronic hepatitis occurs at 1.01% per year, and people remain at risk to return to HBeAg-negative chronic hepatitis.33
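Because the model advances in monthly steps while the inputs above are reported per year, annual values must be converted to monthly transition probabilities. The sketch below uses the standard constant-hazard conversion; treating the reported 6.46% per year as a constant rate, and the function name itself, are assumptions, since the source does not describe the conversion used.

```python
import math


def annual_rate_to_monthly_prob(annual_rate):
    """Convert an annual event rate to a monthly probability.

    Assumes a constant hazard over the year: p = 1 - exp(-rate / 12).
    """
    return 1.0 - math.exp(-annual_rate / 12.0)


# Pooled rate of HBeAg loss from the text, treated here as a constant annual rate.
p_hbeag_loss = annual_rate_to_monthly_prob(0.0646)

# Applying the monthly probability for 12 consecutive months recovers
# the annual probability implied by the same rate.
annual_prob = 1.0 - (1.0 - p_hbeag_loss) ** 12
```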

Table 1

Inputs for key transitions in chronic HBV used to populate the HBV simulation model

We derived trajectories of varying levels of HBV DNA from observational cohorts with serial HBV DNA testing, specific to HBV phase.34 35 We derived parameters related to the incidence of cirrhosis, HCC and decompensated liver disease from the REVEAL cohort in Taiwan and other sources (table 1).20 25 We simulated age-specific, sex-specific and country-specific mortality from WHO life tables, which we assumed to estimate mortality rates for people without HBV, cirrhosis or HCC.36 To maintain age-specific and sex-specific differences in mortality for each health state, we used multivariable HRs stratified by disease stage instead of using mortality rates, as we wanted estimates to be consistent across levels of other confounding factors. For people with HBV without liver-related complications, we multiplied the baseline age-specific and sex-specific incidence rate of death by the rate ratio of 1.05.29 We multiplied age-specific and sex-specific mortality risk by HRs of 2.0, 4.4 and 6.0 for people with HBV-related cirrhosis, HCC and decompensated liver disease, respectively.37
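The mortality multipliers above can be applied as in the following sketch. The multipliers are those quoted in the text; the function name, the state labels, the example baseline rate and the constant-hazard conversion to a monthly probability are assumptions for illustration.

```python
import math

# Multipliers quoted in the text (rate ratio 1.05; HRs 2.0, 4.4 and 6.0).
MORTALITY_MULTIPLIER = {
    "hbv_no_complications": 1.05,
    "cirrhosis": 2.0,
    "hcc": 4.4,
    "decompensated": 6.0,
}


def monthly_death_prob(baseline_annual_rate, state):
    """Scale a life-table annual death rate by the state-specific multiplier,
    then convert to a monthly probability (constant-hazard assumption)."""
    rate = baseline_annual_rate * MORTALITY_MULTIPLIER[state]
    return 1.0 - math.exp(-rate / 12.0)


# Hypothetical baseline: 12 deaths per 1000 person-years for a given age and sex.
baseline = 0.012
probs_by_state = {s: monthly_death_prob(baseline, s) for s in MORTALITY_MULTIPLIER}
```

Scaling the baseline rate, rather than replacing it, preserves the age-specific and sex-specific gradient of the life table within each health state.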


We followed the framework for model validation established by the International Society for Pharmacoeconomics and Outcomes Research and the Society for Medical Decision Making.38 We tested the model’s face validity through a critical review from clinical experts in hepatology and infectious diseases (online supplemental file 1), the model’s internal validity by evaluating the model’s equations for accuracy and consistency and the model’s external validity by comparing model results to actual event data.38 We used the disease progression inputs described above (table 1), and we populated the model with baseline demographic characteristics from the REVEAL cohort25 31 34 to project the cumulative incidence of HBeAg loss, HBsAg loss, cirrhosis and HCC (online supplemental table 2). The REVEAL study clarified the association between baseline HBV DNA and incidence of HBsAg loss, cirrhosis and HCC.20 We examined the model’s ability to replicate these findings by stratifying model-simulated HBsAg loss, cirrhosis and HCC by five categories of HBV DNA at the start of HBeAg-negative chronic infection (ie, at time of HBeAg loss): HBV DNA<300 copies/mL, 300–10⁴ copies/mL, 10⁴–10⁵ copies/mL, 10⁵–10⁶ copies/mL and >10⁶ copies/mL. We incorporated model inputs from a variety of sources to simulate the phase transitions of chronic HBV and the monthly changes in HBV DNA as a function of the time spent in each chronic HBV phase. We calculated incidence rates for the model-simulated outcomes by dividing the number of simulated events (ie, HBeAg loss, HBsAg loss, cirrhosis and HCC) by the person-years at risk for those outcomes. In calculating incidence rates of HBeAg and HBsAg loss, simulated persons were censored at the time of the event or death; for the outcomes of cirrhosis and HCC, simulated persons were censored after the time of either of those events, HBsAg loss or death.
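The incidence rate calculation described above (events divided by person-years at risk, with follow-up ending at the event or at censoring) can be sketched as follows. The record format and the example cohort are hypothetical.

```python
def incidence_rate_per_100py(records):
    """Events per 100 person-years.

    `records` is an iterable of (years_at_risk, had_event) pairs, where
    years_at_risk ends at the event or at censoring, whichever comes first.
    """
    records = list(records)
    events = sum(1 for _, had_event in records if had_event)
    person_years = sum(years for years, _ in records)
    return 100.0 * events / person_years


# Hypothetical cohort of five simulated people followed for up to 10 years.
cohort = [
    (10.0, False),  # censored at end of follow-up
    (4.0, True),    # event at year 4
    (6.0, True),    # event at year 6
    (10.0, False),
    (10.0, False),
]
rate = incidence_rate_per_100py(cohort)  # 2 events over 40 person-years
```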

We also replicated results of a natural history study of chronic HBV based on three decades of community serosurveys in the Gambia from 1974 to 2008,21 which details the natural history of chronic HBV in sub-Saharan Africa. We used the disease progression inputs described above (table 1), and the demographic characteristics of the study cohort (online supplemental table 2) to project the cumulative incidence of HBeAg loss (for those HBeAg-positive) and HBsAg loss. We did not include incidence rates of cirrhosis, HCC and mortality in this validation due to limitations in ascertaining these outcomes from the community-based serosurvey.21

Statistical analysis

We compared the incidence rates of model-projected clinical outcomes (ie, HBeAg loss, HBsAg loss, cirrhosis and HCC) to the 95% CIs in published meta-analyses when available.39 We calculated mean absolute error, root-mean-square percentage error and intraclass correlation coefficients (ICC) with corresponding CIs to compare time-to-event curves for HBeAg loss in individuals with HBeAg-positive serology, HBsAg loss, cirrhosis and HCC over a 10-year simulation.39 40 We stratified our time-to-event simulation for HBsAg loss, cirrhosis and HCC by the five HBV DNA categories mentioned above. We also compared mean absolute error, root-mean-square percentage error and ICC values for the incidence of cirrhosis and HCC outcomes combined at each HBV DNA level over each year of simulation. We calculated ICC using the random coefficients of a two-way mixed effects model with absolute agreement using the ‘irr’ package in R. We defined an ICC of 0.80–0.90 and above 0.90 to indicate good and excellent model consistency, respectively.40
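For readers who want to see how such goodness-of-fit measures are computed, the sketch below gives a plain Python illustration. It is a stand-in, not the analysis code: the source used the ‘irr’ package in R, and the exact error definitions, the single-measure form of the ICC and the example values here are assumptions.

```python
import math


def mean_absolute_error(sim, obs):
    """Mean of |simulated - observed| across time points."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def rms_percentage_error(sim, obs):
    """Root-mean-square of relative errors, in percent (observed values must be non-zero)."""
    return 100.0 * math.sqrt(sum(((s - o) / o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def icc_absolute_agreement(sim, obs):
    """Two-way, absolute-agreement, single-measure ICC from ANOVA mean squares."""
    rows = list(zip(sim, obs))  # one row per time point, two 'raters'
    n, k = len(rows), 2
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# Hypothetical cumulative incidence curves (proportions) at four time points.
observed = [0.10, 0.20, 0.30, 0.40]
simulated = [0.12, 0.18, 0.33, 0.40]

mae = mean_absolute_error(simulated, observed)
rmspe = rms_percentage_error(simulated, observed)
icc = icc_absolute_agreement(simulated, observed)
```

The absolute-agreement form penalises a systematic offset between the simulated and observed curves, which a consistency-only ICC would ignore.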

Patient and public involvement

Patients and the public were not involved in the design, conduct, reporting or dissemination plans of this simulation modelling study.


Simulation of viral (HBV DNA) changes

A person-level trace analysis displayed the lifetime trajectories of HBV DNA for a random selection of people (online supplemental figures 3 and 4). All simulated people had a net reduction in HBV DNA level during HBeAg-positive chronic hepatitis. People who remain in HBeAg-positive chronic infection for the duration of the simulation had the highest HBV DNA levels. Those who terminate the simulation in HBeAg-negative chronic infection are likely to have a lower HBV DNA level compared with people who progress to HBeAg-negative chronic hepatitis. HBsAg loss at any point in the model is associated with a low HBV DNA level at the end of the simulation.

Simulation of serological markers: HBeAg and HBsAg

Model-projected HBeAg loss was similar to observed data from natural history studies of chronic HBV from Taiwan and The Gambia (figure 1; table 2). All metrics that compared the model-projected and observed cumulative incidence of HBeAg loss showed a close fit: mean absolute error, 12.4%; root-mean-square percentage error, 9.3%; ICC, 0.969 (95% CI: 0.728 to 0.990) (table 3). When populated with epidemiological characteristics from Taiwan from the REVEAL study, the model-simulated incidence rate of HBsAg loss was 2.35 per 100 person-years (PY), which is within the reported 95% CI from a meta-analysis of studies from the Western Pacific region (1.23 to 2.64 per 100 PY) (table 2).23 For people in the HBeAg-negative phases of infection, model-projected HBsAg loss over 10 years ranged from 0.78/100 PY for HBV DNA>10⁶ copies/mL to 3.34/100 PY for HBV DNA<300 copies/mL. In comparing model-projected and observed annual cumulative incidence of HBsAg loss at each HBV DNA level over 10 years, we calculated a composite ICC of 0.889 (95% CI: 0.542 to 0.959) (table 3; online supplemental figure 5). In a lifetime simulation of chronic HBV among persons infected since birth, the incidence rates of HBeAg loss and HBsAg loss (in HBeAg-negative phases of infection) were 5.18 and 2.35 per 100 PY, respectively (online supplemental figure 6).

Figure 1

Comparison of cumulative incidence of HBeAg loss from model-projected outcomes and observational data from the REVEAL study (A) and from a natural history study in the Gambia (B). The model was initialised with cohort characteristics of the REVEAL study31 (Panel A) and the cohort characteristics of a natural history study in the Gambia21 (Panel B). HBeAg loss was simulated using the incidence rate reported in each study. The percentage of the cohort with HBeAg positivity at each year of age is shown for the simulation (solid line) and observational study (dashed line). HBeAg, HBV ‘e’ antigen.

Table 2

Model-simulated clinical outcomes of chronic hepatitis B, in comparison to observed estimates reported in the literature

Table 3

Composite goodness of fit measures between model-simulated outcomes and observed estimates

Simulation of cirrhosis, HCC and death

Model projections of the 10-year cumulative incidence of cirrhosis and HCC increased with successively higher levels of HBV DNA at the time of HBeAg loss (figure 2), consistent with the observed association between baseline HBV DNA level on enrolment and ascertainment of liver-related complications in the REVEAL study.20 25 36 The model most closely simulated cumulative incidence of cirrhosis and HCC for people with HBV DNA between 10⁵ and 10⁶ copies/mL at the time of HBeAg loss (mean absolute error, 12.1%; root-mean-square error, 22.4%) (table 3). Visual inspection of the HBV DNA-stratified incidence of HCC (figure 2) also demonstrates the best fitting 10-year cumulative incidence curves for HBV DNA between 10⁵ and 10⁶ copies/mL at the time of HBeAg loss. The composite cumulative incidence of cirrhosis and HCC across all HBV DNA stratifications at each year of simulation demonstrated a mean absolute error of 16.3%, root-mean-square percentage error of 28.7% and ICC 0.971 (95% CI: 0.959 to 0.98) (table 3). In a lifetime simulation of HBV disease, incidence of mortality increased with older age and the prevalence of cirrhosis, HCC and decompensated liver disease (online supplemental figure 7).

Figure 2

Comparison of cumulative incidence of cirrhosis (A) and hepatocellular carcinoma (B) from model projections and observational data in REVEAL study. After initialising the model with cohort characteristics of the REVEAL study on cirrhosis25 (Panel A) and hepatocellular carcinoma36 (Panel B), the cumulative incidence of cirrhosis and hepatocellular carcinoma was calculated for each baseline HBV DNA stratification. The model-projected incidence of cirrhosis and hepatocellular carcinoma were higher at higher baseline levels of HBV DNA, recapitulating the results of the REVEAL study. HBV, hepatitis B virus; HCC, hepatocellular carcinoma.

The simulated incidence rate of cirrhosis ranged from 285/100 000 PY for people with HBV DNA<300 copies/mL to 2093/100 000 PY for people with HBV DNA>10⁶ copies/mL. Compared with the REVEAL study, across all HBV DNA levels, the simulation model projected cirrhosis at each year of simulation with an ICC 0.965 (95% CI: 0.942 to 0.979) (table 3). The incidence rate of HCC was projected by the model to be 300/100 000 PY, which was within the 95% CI for HCC incidence rates reported in two different meta-analyses: one evaluating people in inactive carrier states in the Western Pacific region (95% CI: 210 to 630/100 000 PY)41 and the other evaluating people in HBeAg-negative phases of infection (95% CI: 210 to 1230/100 000 PY).42 The simulated incidence rate of HCC ranged from 57/100 000 PY for people with HBV DNA<300 copies/mL to 1654/100 000 PY for people with HBV DNA>10⁶ copies/mL. Compared with the REVEAL study across all HBV DNA levels, the model projected HCC at each year of simulation with an ICC 0.977 (95% CI: 0.962 to 0.986) (table 3).


We developed the HEPA-B model, a novel state-transition Monte Carlo simulation model of chronic HBV infection and then validated model-projected outcomes against observational data. We incorporated extensive natural history data and individual person-level correlations with age, sex and HBV DNA, derived from longitudinal observational studies in two HBV-endemic regions.41 43 We simulated loss of HBsAg, which varied based on HBV DNA level in the HBeAg-negative phases of illness. We found an ICC above 0.95 between model projections and observed outcomes of cirrhosis and HCC, indicating excellent agreement. Furthermore, we demonstrated that the model-projected cumulative incidence of cirrhosis and HCC reflected the key role played by HBV DNA in the pathogenesis of liver-related complications.

The HEPA-B simulation model includes the capacity to project serologic and HBV DNA changes, a feature not explicitly included in previously published models of chronic HBV disease.42 44 45 A 2015 systematic review provided a critical assessment of 16 published HBV simulation models and economic analyses and found varying quality due to model structures that simplified natural history.42 44 Although more recently published HBV simulation models have incorporated a wider variety of health states (eg, decompensated liver disease) and demonstrated more complexity in their simulation of HBV disease progression,46 47 these models do not simulate person-level changes in serological and viral markers, which limits the ability to assess heterogeneity in outcomes for a population of individuals with HBV, particularly as they age. Similarly, currently published simulation studies of HBV are unable to evaluate management strategies that incorporate alternative HBV DNA-based treatment thresholds. As such, currently published models have been effective in evaluating broad public health programmes in HBV, such as population-level screening, vaccination and treatment46–48; however, they have not yet been used to evaluate more detailed clinical strategies.

Our model incorporates monthly changes of HBV DNA within each HBV phase of illness, which adds an important level of detail given the central importance of HBV DNA in the natural history and clinical progression of chronic HBV. We demonstrated that model-simulated evolution of HBV DNA is consistent with current understanding of the course of natural infection: (1) the highest HBV DNA levels occur during HBeAg-positive chronic infection; (2) the highest within-subject and between-subject fluctuation in HBV DNA levels occurs during the HBeAg-positive hepatitis phase; (3) HBV DNA levels are most stable during HBeAg-negative infection; and (4) HBV DNA rises to higher levels during HBeAg-negative hepatitis.35

The model incorporates the association of HBV DNA with liver-related outcomes, such as cirrhosis and HCC, and with critical events in HBV natural history, including HBsAg loss. The overall incidence rate of cirrhosis projected by the model was lower than the incidence rate reported in the REVEAL study, although the DNA-stratified trends were similar. Cirrhosis incidence varies widely between settings and populations, and its observed incidence may be a function of the prevalence of other factors, including comorbidities, substance use, genetic and environmental conditions that predispose to chronic liver disease and age distribution.28 48 We found extremely close agreement between observed and simulated incidence rates and cumulative incidence of cirrhosis and HCC. Although we found relatively lower agreement between observed and simulated HBsAg loss, which may be attributed to the relatively lower frequency (and high imprecision) of reported HBsAg loss in the clinical cohorts evaluated,24 the model-projected incidence rate of HBsAg loss was within the 95% CI reported from a meta-analysis of people in the Western Pacific region.23

Currently, the HEPA-B model can be used to project the burden of chronic HBV disease in a population and the incidence of cirrhosis, HCC, decompensated liver disease and liver-related mortality. Future incorporation of antiviral therapy into the model will impart the ability to address critical questions in the clinical management of HBV. For example, by incorporating time-varying parameters for HBeAg, HBsAg and HBV DNA, we will be able to investigate the clinical impact of alternative treatment initiation and cessation criteria.18 19 Currently, many people with chronic HBV live in a ‘grey zone’ of disease activity, with no clear consensus on when to initiate treatment.18 19 While HBV experts have argued for expanding HBV treatment eligibility, there have not been robust clinical trial or observational cohort data to support this expansion.49 50 One modelling study in South Korea by Lim et al demonstrated that certain expanded treatment criteria, such as a lower alanine aminotransferase threshold or treatment of all people with elevated HBV DNA, would avert many HBV-related deaths and be cost-effective.51 Thus, simulation modelling has played and will continue to play an important role in determining the clinical impact and cost-effectiveness of new treatment eligibility criteria. We will also be able to use the HEPA-B model to evaluate the use of point-of-care testing in resource-limited settings, treatment simplification and novel therapies, and to determine the impact of biomarkers on disease monitoring.52

This modelling analysis has several limitations. First, we did not validate the model projections of cirrhosis and HCC to observational data from sub-Saharan Africa, given limited data from that region. However, the model projections of HBeAg and HBsAg loss from sub-Saharan Africa replicated the observational findings from a large study from West Africa, and further validation can be performed as more data from this region emerge. Second, the model does not include within-host immunological factors and changes in aminotransferases associated with HBV progression, control or reactivation; it also does not incorporate behavioural or environmental factors that affect liver disease, such as family history of HCC, alcohol use disorder or coinfection with other viruses including HIV.53 Finally, the model does not explicitly account for the genetic diversity of HBV, including differences in natural history that are rooted in different genotypes or the prevalence of precore mutations, which may impact the evolution of changes in HBeAg status and risk of cirrhosis and HCC.54 55 The model can be further developed to address questions related to these aspects of disease. Despite the simplifying assumptions above, the model described here recapitulates important elements of chronic HBV that relate to functional cure and disease progression.
