Z-scores are a means of expressing the deviation of a given anatomic or physical measurement from a size- or age-specific population mean. Z-scores can be applied to echocardiographic measurements, height, weight, and blood pressure, and thus may assist in clinical assessment and decision-making.
In diseases that affect the aortic diameter, serial diameter measurements of the aortic root are useful for monitoring disease progression. Z-scores of the aortic diameter are also useful aids in diagnosis and determination of therapeutic effects. The use of Z-scores facilitates the detection of pathological increases in aortic root diameter above that expected due to normal growth, which appears as an increased Z-score over time. We discuss Z-scores in detail in the attached audio-visual presentation.
Centiles (also called percentiles) are a common alternative to Z-scores. They are easy to interpret and have been used to monitor development in pediatrics, including aortic root dilatation. However, centiles are less sensitive to changes in the aortic root diameter, particularly at the extremes. For example, if a hypothetical patient (with a body surface area (BSA) of 1.87 m²) has an aortic root that increases from 3.56 to 3.69 cm (1.3 mm difference), the percentile increases from the 99th to the 99.7th. This difference sounds small, but it corresponds to a Z-score increase from +2.33 to +2.75, which is a more visually obvious difference. Z-scores can therefore quantify growth status outside of the percentile ranges. Z-scores also provide: (i) a standardized measure allowing comparison across different ages, genders, and measures and (ii) a continuous variable allowing generation of summary statistics such as mean and SD.
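The compression of percentiles at the extremes can be checked numerically. The following is a minimal sketch, assuming the measurement is normally distributed around its size-specific mean; the function names are illustrative and are not taken from any Z-score calculator.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1), on which Z-scores are defined.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_to_z(percentile: float) -> float:
    """Convert a percentile (strictly between 0 and 100) to a Z-score."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def z_to_percentile(z: float) -> float:
    """Convert a Z-score to the corresponding percentile."""
    return 100.0 * STANDARD_NORMAL.cdf(z)


# The hypothetical patient from the text: 99th percentile, then 99.7th.
z_before = percentile_to_z(99.0)
z_after = percentile_to_z(99.7)
print(round(z_before, 2), round(z_after, 2))
```

A shift of 0.7 percentile points at this end of the distribution corresponds to roughly 0.4 standard deviations, which is why the Z-score scale makes the change easier to see.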
In adult practice, Z-scores are less commonly used. Instead, aortic root diameter is often reported with respect to a single “normal range.” However, this approach is inaccurate in growing children because the normal range of measurements will be impacted by patient size and age. Therefore, the interpretation of these measurements during childhood presents a unique challenge, specifically in determining whether a given measurement is within the expected range. One approach to the description of clinical and echocardiographic variables is to express measurements in terms of Z-scores. In current practice, there is a lack of understanding of how Z-scores are calculated and interpreted. Here, we review the literature on Z-scores, focusing on application in thoracic aortic aneurysms.
What is a Z-Score?
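A Z-score expresses how many standard deviations an observed measurement lies from the population mean:
Z = (χ − μ) / σ
where χ = the observed measurement, μ = the expected measurement (population mean), and σ = the population standard deviation (adapted from ).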
A measurement above the population mean will have a positive Z-score, whereas a measurement below the population mean will have a negative Z-score. The greater the deviation of the Z-score from zero (in a positive or negative direction), the greater the magnitude of deviation from the mean. A value that is 2 standard deviations above the mean (the 97.7th percentile) will have a Z-score of +2.0. Z-scores make clinical interpretation simple because of the mean of 0 and normal range of -2.0 to +2.0. A change in Z-score value over time is interpreted as a change in the size of the cardiovascular structure beyond what would be expected from the normal growth of that person.
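The definition and the -2.0 to +2.0 normal range can be sketched in a few lines. The reference mean and standard deviation below are hypothetical values chosen for easy arithmetic, not published norms, and body size is held constant between the two measurements for simplicity.

```python
def z_score(observed: float, expected_mean: float, sd: float) -> float:
    """Number of standard deviations by which `observed` departs from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (observed - expected_mean) / sd


def within_normal_range(z: float, limit: float = 2.0) -> bool:
    """True when the Z-score lies inside the conventional -2.0 to +2.0 range."""
    return -limit <= z <= limit


# Hypothetical reference values for one body size (illustration only).
EXPECTED_CM = 3.0
SD_CM = 0.25

z_first = z_score(3.25, EXPECTED_CM, SD_CM)   # first visit
z_second = z_score(3.5, EXPECTED_CM, SD_CM)   # follow-up visit
print(z_first, z_second, within_normal_range(z_second))
```

In real follow-up the expected mean also changes as the child grows, so each visit is scored against the reference values for the body size at that visit.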
For a Z-score to be calculated, the mean and standard deviation for that body structure (e.g., aortic root diameter) must be determined in the population. The mean and standard deviation have been calculated in many individual studies of varying sample sizes. These are empiric observations that are not “written in stone,” but rather vary somewhat among different studies. The individual studies can be used to generate nomograms. This is achieved by selecting a cohort of individuals and calculating their BSA based on one of the available BSA equations. A parameter of interest (e.g., aortic root diameter) is then recorded for each individual, allowing generation of a scatterplot (Figure 2A) and calculation and plotting of a regression equation and confidence intervals. This scatterplot can then be transformed into a nomogram (Figure 2B), allowing one to determine the Z-score for an individual patient given their BSA and parameter of interest (e.g., aortic root diameter).
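The nomogram-building steps described above can be sketched as follows. This is a simplified illustration, not the method of any particular study: it assumes a straight-line relationship between BSA and diameter and a single residual standard deviation across all body sizes, whereas published nomograms often use transformed variables. The cohort values are invented purely for demonstration.

```python
from math import sqrt


def fit_nomogram(bsa_values, diameters):
    """Ordinary least-squares line of diameter on BSA, plus the residual SD."""
    n = len(bsa_values)
    if n < 3 or n != len(diameters):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(bsa_values) / n
    mean_y = sum(diameters) / n
    sxx = sum((x - mean_x) ** 2 for x in bsa_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bsa_values, diameters))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(bsa_values, diameters)]
    # n - 2 degrees of freedom because two parameters were estimated.
    residual_sd = sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, residual_sd


def nomogram_z(bsa, diameter, slope, intercept, residual_sd):
    """Z-score of a diameter relative to the fitted line at the given BSA."""
    expected = intercept + slope * bsa
    return (diameter - expected) / residual_sd


# Invented cohort (BSA in m², aortic root diameter in cm), for illustration only.
COHORT_BSA = [0.5, 0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
COHORT_DIAMETER = [1.6, 2.0, 2.2, 2.5, 2.8, 3.1, 3.3]

slope, intercept, residual_sd = fit_nomogram(COHORT_BSA, COHORT_DIAMETER)
print(nomogram_z(1.0, 2.6, slope, intercept, residual_sd))
```

A real nomogram would need a far larger cohort, and the choice of BSA equation used for the cohort carries through to every Z-score derived from it.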
Calculation of Z-scores
There are a number of web-based calculation tools for Z-score measurement. The largest is http://zscore.chboston.org, having collected baseline data over the past 12 years, while www.parameterz.com offers Z-score measurements based on a large number of smaller individual publications. There is also a Z-score calculator available on the Marfan Foundation website (www.marfan.org/dx/zscore) to aid in the detection of a dilated aortic root in an individual with suspected or confirmed Marfan Syndrome. Recently, the Cardio Z App for the iPad/iPhone was made available, revolutionizing the ease with which Z-scores can be calculated in the clinical environment. Z-score values representing the size of the aorta can be determined from the aortic annulus, sinuses of Valsalva, sinotubular junction, and ascending aorta.
BSA has been found to be more useful than age, height, or weight alone for the accurate measurement of the size of different cardiovascular structures. There are a number of different formulas that have been established for the estimation of BSA. The most commonly used formulas include: Haycock, Du Bois, Boyd, Gehan and George, and Mosteller. The Haycock formula (BSA (m²) = 0.024265 × weight (kg)^0.5378 × height (cm)^0.3964) has been recognized as the most accurate method of calculating BSA. This formula was generated from only 81 subjects, including a variety of ages (infants to adults) and ethnic groups (Black, Hispanic, and White). Because BSA is used in determining the normal distribution of aortic sizes for different ages and body sizes, variations and uncertainties in BSA calculations can have a major impact on the accuracy of Z-scores.
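As a worked illustration, the Haycock formula quoted above translates directly into code. The function name and the example patient (70 kg, 175 cm) are hypothetical.

```python
def bsa_haycock(weight_kg: float, height_cm: float) -> float:
    """Body surface area in m² using the Haycock coefficients quoted in the text."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


# Hypothetical adult patient, for illustration only.
example_bsa = bsa_haycock(weight_kg=70.0, height_cm=175.0)
print(round(example_bsa, 2))
```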
Z-scores have significant advantages over alternative methods of assessing aortic diameter, especially in the pediatric population. However, sources of limitation include measurement error, validity of nomograms, inconsistent use of BSA equations (at different ages in a child’s development), and uncertainty about the natural history of Z-scores. These limitations may significantly influence Z-score values and may falsely indicate changes in the size of a structure where true variability does not exist.
There are several formulas available for calculating BSA, which have marked discrepancies in the values they produce and therefore are limited in their accuracy. Furthermore, the validity of the studies used to develop these formulas may be questionable. Often, the studies utilize small sample sizes and do not indicate which patient demographic they represent (see Table 1 for a comparison of the most widely used BSA formulas). In addition, many BSA equations tend to over- or underestimate BSA in certain populations. Therefore, clinicians must be mindful of which BSA formula is used when interpreting Z-scores. Furthermore, it is important to be consistent in the choice of Z-score calculator, while also being aware that the accuracy of the specific BSA equation utilized in each Z-score calculation will be affected by changes in body mass and age. The user must keep in mind these limitations in the evidence base of Z-scores.
Table 1. Comparison of the most widely used BSA formulas.
|Formula|Equation|Sample Size|Age (Years)|Gender (F:M)|Main Limitation(s)|
|---|---|---|---|---|---|
|Banerjee (1955)| |15|18-44|0%|Small sample size. Only relevant to Indian population. Inaccurate in SE Asian population.|
|Boyd (1935)| |197|Unclear*|Unclear*|BSA overestimated if: infant, short, obese. BSA underestimated if: tall, thin [14, 24, 25]. Study demographics unclear.|
|Du Bois (1916)| |9|Not stated|Not stated|BSA underestimated if: infant/child, obese [8, 14, 26, 27]. Significant patient heterogeneity. Study demographics unclear. Nutritional status of study sample is unrepresentative.|
|Gehan (1970)| |401|Infants-adults|Not stated|BSA overestimated if: short, obese. BSA underestimated if: tall, thin, increasing body size [14, 24, 26]. Study demographics unclear. Inaccurate in SE Asian population.|
|Haycock (1978)| |81|ELBW infants-adults|Not accessible|BSA overestimated if: infant, short, obese. BSA underestimated if: tall, thin, increasing body size [24, 27, 28]. Inaccurate in SE Asian population.|
|Jones (1994)| |28|3.1-10.5|46%|Small sample size. Only 4 males included. Narrow age range.|
|Meban (1983)| |79|11-42 weeks gestation|Not stated|Only pathological human fetuses studied.|
|Mosteller (1987)| |0|NA|NA|BSA overestimated if: short, obese. BSA underestimated if: infant, tall, thin, low body size [14, 24, 32]. Less accurate simplification of the Gehan equation.|
|Shuter (2000)| |42|Not stated|Not stated|Small sample size. Patient demographics unavailable.|
|Yu (2003)| |3951|20-91|54%|No subjects under 20 years of age. Formula only validated in Chinese individuals. Whole body scanning method does not take into account overlapping and shading body parts.|
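The discrepancy between formulas can be explored with a short comparison. The sketch below uses the Haycock coefficients quoted earlier in the text; the Mosteller and Du Bois coefficients are not given in this article and are the commonly published forms, included here as an assumption. The example patients are hypothetical.

```python
from math import sqrt


def bsa_haycock(weight_kg: float, height_cm: float) -> float:
    # Coefficients as quoted in the text.
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


def bsa_mosteller(weight_kg: float, height_cm: float) -> float:
    # Commonly published form (assumption, not taken from this article).
    return sqrt(weight_kg * height_cm / 3600.0)


def bsa_du_bois(weight_kg: float, height_cm: float) -> float:
    # Commonly published form (assumption, not taken from this article).
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


BSA_FORMULAS = {
    "Haycock": bsa_haycock,
    "Mosteller": bsa_mosteller,
    "Du Bois": bsa_du_bois,
}


def compare_bsa(weight_kg: float, height_cm: float) -> dict:
    """BSA in m² from each formula for the same patient."""
    return {name: f(weight_kg, height_cm) for name, f in BSA_FORMULAS.items()}


def bsa_spread(weight_kg: float, height_cm: float) -> float:
    """Difference between the largest and smallest BSA estimate."""
    values = compare_bsa(weight_kg, height_cm).values()
    return max(values) - min(values)


# Hypothetical patients: a small child and an adult.
for weight, height in [(10.0, 75.0), (70.0, 175.0)]:
    estimates = compare_bsa(weight, height)
    print(weight, height, {k: round(v, 3) for k, v in estimates.items()})
```

Because the expected diameter in a nomogram is read off at the calculated BSA, any difference between formulas shifts the expected value and therefore the Z-score, which is why the same formula should be used as the one underlying the nomogram.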
Z-scores are usually calculated using BSA; however, a weight-only equation also exists for the calculation of BSA (BSA = 0.1023 × weight^0.68). This may be a more convenient tool, but it lacks the valuable adjustment for height in patients, which is a sensitive factor to consider when assessing the aortic diameter.
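The loss of height information can be made concrete. This sketch reads the weight-only equation as 0.1023 × weight^0.68 with weight in kilograms (the unit is an assumption, as the text does not state it) and contrasts it with the Haycock formula quoted earlier. The two patients are hypothetical.

```python
def bsa_weight_only(weight_kg: float) -> float:
    """Weight-only BSA estimate in m²; kilograms are assumed for the weight."""
    return 0.1023 * weight_kg ** 0.68


def bsa_haycock(weight_kg: float, height_cm: float) -> float:
    """Haycock BSA in m², using the coefficients quoted in the text."""
    return 0.024265 * weight_kg ** 0.5378 * height_cm ** 0.3964


# Two hypothetical patients with the same weight but different heights.
short_patient = {"weight_kg": 70.0, "height_cm": 160.0}
tall_patient = {"weight_kg": 70.0, "height_cm": 190.0}

for patient in (short_patient, tall_patient):
    print(
        patient["height_cm"],
        round(bsa_weight_only(patient["weight_kg"]), 3),
        round(bsa_haycock(**patient), 3),
    )
```

The weight-only estimate is identical for both patients, whereas the height-adjusted estimate separates them, so a nomogram indexed to the weight-only value would assign both the same expected aortic diameter.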
The introduction of web-based Z-score calculators, such as http://www.parameterz.com, has revolutionized the ease with which we can calculate Z-scores in the clinical environment. However, these Z-score calculating programs stratify their data using geographically-specific nomograms. Such geographical studies are not available worldwide, and therefore care must be taken to ensure the most accurate geographical region is used for analysis. One must also remember that these nomograms do not take ethnic diversity into account. Despite recent efforts to improve the accuracy of nomograms, there are still numerical and interpretative uncertainties [4, 10-13]. Such nomograms may produce widely different Z-score values. This is because many nomograms utilize a small sample size, with an underrepresentation of information across age groups (particularly neonates and premature infants). There is a lack of complete information on certain cardiovascular structures and racial and gender differences in the literature [14-16]. In addition, the use of formalin-fixed pathological specimens to determine base data for nomograms is limited by their availability and may significantly underestimate the dimensions of cardiac structures in vivo, thus producing inappropriate clinical tools [17, 18].
To maintain statistical confidence in Z-scores with extreme values, nomograms must adequately represent the heteroscedasticity (change in variance) across body sizes of individuals. Inappropriate averaging of variance may lead to under- or overestimation of Z-score values for children at the extremes of body size. In addition, obesity may skew Z-score data and therefore produce measurement bias when interpreting Z-scores. This is a particular problem in patients with cardiovascular disease. Consequently, an obese patient’s Z-score may be an underestimation of the true value. Dallaire et al. explored this problem and suggested that the use of multivariable models with weight and height as independent predictors of Z-scores should be explored to reduce this potential pitfall. Van Kimmenade et al. concluded that, because we are facing an obesity epidemic, the use of Z-scores that correlate with height rather than BSA/weight may be more accurate in evaluating aortic root measurements in those with Marfan Syndrome.
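The effect of averaging the variance can be illustrated with a toy model. Every coefficient below is hypothetical and chosen only to show the direction of the error: one model uses a single pooled SD, and the other lets the SD grow with BSA.

```python
def expected_diameter_cm(bsa_m2: float) -> float:
    # Hypothetical regression line, for illustration only.
    return 1.0 + 1.15 * bsa_m2


def sd_pooled(bsa_m2: float) -> float:
    # One averaged SD applied to every body size (hypothetical value).
    return 0.25


def sd_size_dependent(bsa_m2: float) -> float:
    # Hypothetical heteroscedastic model: spread increases with body size.
    return 0.10 + 0.10 * bsa_m2


def z_score(bsa_m2: float, diameter_cm: float, sd_model) -> float:
    """Z-score using the SD supplied by `sd_model` at this BSA."""
    return (diameter_cm - expected_diameter_cm(bsa_m2)) / sd_model(bsa_m2)


# The same absolute excess (0.4 cm above expected) at two body sizes.
for bsa in (0.5, 2.0):
    observed = expected_diameter_cm(bsa) + 0.4
    print(
        bsa,
        round(z_score(bsa, observed, sd_pooled), 2),
        round(z_score(bsa, observed, sd_size_dependent), 2),
    )
```

Under these assumptions the pooled SD understates the Z-score of the small child and overstates that of the large patient, which is the under- and overestimation at the extremes of body size described above.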
Measurement error can be a significant limiting factor when determining the validity of Z-scores; therefore, technicians must take consistent measurements of the aortic diameter to minimize observer bias. There are clear guidelines from the American Society of Echocardiography Pediatric and Congenital Heart Disease Council regarding accurate measurement of the proximal aorta (Table 2). While pediatric measurements are made in systole, adult measurements are made in diastole, which can give significantly different measurements. Care must be taken when interpreting Z-scores recorded before the implementation of the 2010 guidelines. Before the advent of these guidelines, discrepancies in inclusion of vessel wall thickness, axis of measurements, and stage of the cardiac cycle provided important sources of marked variability. Furthermore, dilatations of the aorta are not homogeneous, and therefore a single measurement may not represent the true scale of the pathology. These factors may contribute to intra- and inter-observer bias and affect the reliability of earlier studies. Even small changes in aortic diameter can represent significant disease progression in Z-score calculations. Together, these factors may lead to inappropriate treatment strategies such as lifelong medical therapy, which can expose patients to unnecessary side effects and financial burden, or high-risk surgical interventions.
Table 2. Guidelines for the Measurement of the Proximal Aorta
Furthermore, data on the non-pathological natural history of Z-scores is limited. Should the aortic Z-score remain identical in a normal or aneurysmal child from infancy to young adulthood? We simply do not know. Currently, randomized controlled trials (RCTs) investigating aneurysmal pathology rely on Z-score changes as a measure of therapeutic efficacy [21, 22]. The natural history of Z-scores in normal and pathological states remains largely unknown, therefore limiting the meaningful interpretation of Z-scores.
Z-scores are commonly used in pediatric settings to evaluate the diameter of the ascending aorta and aortic root. However, raw values of aortic root size are usually reported in adults. The rationale for this is that height stabilizes in adulthood and is unlikely to change over time. However, this is inaccurate, especially in elderly patients who lose height from their young adult maximum. In addition, there is wide variability in size among the population, which suggests that gender and height may be significant confounding factors when interpreting aortic root values in these patients.
Knowing these limitations, careful interpretation of Z-scores in relation to patients and recognition of information gaps in the literature are essential to improve the clinical interpretation of Z-scores.
In light of the evidence base, Z-scores are a convenient tool for diagnosing and monitoring cardiovascular disease. In addition, they are widely used in RCTs to determine treatment efficacy in aortic aneurysmal disease.
However, there are some notable limitations to the use of Z-scores. All varieties of BSA calculation directly and substantially impact aortic Z-score determination. Some of these limitations can be overcome by calculating Z-scores using consistent and generalizable nomograms. This may require consistent use of specific Z-score nomograms to accurately reflect the structure measured (e.g., aorta) and the gender, race, height, and weight of the patient. Additionally, measurement bias is a contributing factor to inaccuracies when determining aortic root size. To reduce the impact of intra- and inter-observer bias, consistent reporting of aortic root measurements, ideally by experienced technicians, is required, with abnormal measurements reviewed and confirmed by the interpreting cardiologist/cardiothoracic surgeon. As we face an obesity epidemic, it is also important to consider the accuracy of BSA-based Z-score calculations, and whether height-based calculations should be implemented for obese individuals.
We recommend that further investigation be performed into the natural history of Z-scores in non-pathological states, to assure that current interpretations of therapeutic strategies in RCTs are accurate. Specifically, we feel that clear-cut evidence is needed to show that a decreasing Z-score as a pediatric patient ages truly represents a positive therapeutic (pharmacological) effect, and not simply a normal Z-score progression with increasing body size. We have investigations underway on this specific quandary.