The use of a reference population to derive a reference interval is as old as clinical chemistry itself. The underlying concept is that a patient with disease will be distinguishable from healthy individuals because the test results will fall outside the “reference interval” or “normal range.” This concept has a degree of validity when the analyte in question has a Gaussian distribution and there is a clear association between an abnormal result and a symptomatic disease state. For many analytes, the situation is more complicated. The distribution of the reference population may be non-Gaussian and the distinction between health and disease more nuanced. An example is cholesterol, where the overlap between those with and without cardiovascular disease is marked, even in individuals with the extreme phenotype resulting from familial hypercholesterolemia (1).
Use of patient self-reference (using the patient as their own “normal”) overcomes the problem of broad non-Gaussian reference ranges and overlap of healthy and diseased populations. The emphasis then shifts to detection of significant change within an individual. For chronic disease management, this requires that every patient has a baseline test. This is not practical in acute care, where the first presentation may be the first encounter with the healthcare system. However, if the analyte is known to change, rapid serial sampling can be used to detect acute changes in an individual. This approach found application in the rapid serial measurement of creatine kinase and its MB isoenzyme for early diagnosis of acute myocardial infarction (2, 3).
There are also situations where no reference interval is defined and a specific cutoff is used instead, for example, one derived from an ROC curve, as was originally done for troponin in the diagnosis of myocardial infarction. Diagnostic cutoffs are selected to define what is considered to be the disease population. The diagnostic cutoff for diabetes mellitus, for instance, is chosen to match the risk of subsequent development of diabetic complications.
Traditional reference intervals are typically based on the central 95% of the reference population. There are good statistical reasons for this. The number required to reliably estimate a 95% distribution with confidence is relatively small and easily obtained for most tests. Current CLSI recommendations call for 120 samples within each relevant age and sex stratum. This is the minimum sample size that allows a 90% CI to be calculated, and the CI around the 2.5th and 97.5th limits is tight. In addition, outliers in the data have less impact on the reference limits. For the upper 97.5% cutoff, if high outliers make up more than 2.5% of the data values, they will begin to pull the limit up. The same applies at the 2.5% cutoff. Hence, when the tail at either end of the distribution contains more than 2.5% outlier values, the reference interval will be affected. When a more exacting reference limit is used, such as the 99th percentile (as is currently recommended for troponin) (4), the statistics become more demanding and the bias imposed by outliers more marked (5). First consider the impact of outliers. Here, outliers amounting to just 1% of the values obtained will affect the 99th percentile. If more than 1% are outliers, they will produce an inappropriately high 99th percentile. Then consider the number of samples required to calculate a CI. For a reference population of 200, a 95% reference interval with 95% CI can be derived but not the 99th percentile with 95% CI. To calculate a 90% CI around the 99th percentile, the minimum sample size is 299, and to calculate the 95% CI, the minimum sample size is 368.
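The sample sizes quoted above can be reproduced with a simple nonparametric argument, sketched here in Python. The sketch assumes (the text does not state the derivation) that a two-sided CI around an upper percentile p can only be obtained when the largest of n observations exceeds the true percentile with probability of at least 1 - alpha/2, which is the condition p^n <= alpha/2. The function name `min_sample_size` is illustrative and not taken from any statistics package.

```python
import math


def min_sample_size(percentile, confidence):
    """Smallest n for which a nonparametric two-sided CI can be placed
    around an upper percentile (expressed as a fraction, e.g. 0.99).

    Assumption: the CI needs the sample maximum to lie above the true
    percentile with probability >= 1 - alpha/2, i.e. percentile**n <= alpha/2.
    """
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be a fraction between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be a fraction between 0 and 1")
    alpha = 1.0 - confidence
    # Solve percentile**n <= alpha/2 for n and round up to a whole sample.
    return math.ceil(math.log(alpha / 2.0) / math.log(percentile))


print(min_sample_size(0.99, 0.90))   # 299
print(min_sample_size(0.99, 0.95))   # 368
print(min_sample_size(0.975, 0.90))  # 119
print(min_sample_size(0.975, 0.95))  # 146
```

Under this assumption, the 99th percentile figures match those given in the text, the 97.5th percentile with a 90% CI needs 119 samples (close to the recommended 120), and a sample of 200 is enough for a 95% CI around the 97.5th percentile (146) but not around the 99th (368).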
There is one further caveat that relates to how a statistics package is used. As the upper limit of a two-sided interval, the 99th percentile corresponds to the central 98% of a reference population (not 99%): the interval between the 1st and 99th percentiles. When using the package to calculate the 99th percentile, the limits must be set appropriately. The package should calculate either the upper limit only, as a one-sided limit below which 99% of the distribution lies, or both upper and lower limits of the central 98% of the distribution (which correspond to the 1st and 99th centiles). Stipulating the upper and lower limits of 99% of the distribution will calculate the 0.5th and 99.5th points. This error appears to have occurred in some of the literature and is why the method of calculation is stipulated in the IFCC recommendations on cardiac biomarkers.
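The difference between these settings can be illustrated with a short Python sketch. It is not the behavior of any particular statistics package: the helper names `percentile` and `central_interval`, the linear-interpolation percentile rule, and the evenly spaced toy data are all assumptions made for illustration.

```python
import math


def percentile(values, pct):
    """Percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def central_interval(values, coverage_pct):
    """Lower and upper limits of the central coverage_pct% of the data."""
    tail = (100.0 - coverage_pct) / 2.0  # percentage cut from each end
    return percentile(values, tail), percentile(values, 100.0 - tail)


# Toy data: 1001 evenly spaced values, so percentiles are easy to read off.
values = list(range(1001))

print(percentile(values, 99))        # 990.0 (one-sided 99th percentile)
print(central_interval(values, 98))  # (10.0, 990.0): 1st and 99th percentiles
print(central_interval(values, 99))  # (5.0, 995.0): 0.5th and 99.5th percentiles
```

In this toy data set, requesting the central 99% moves the upper limit from 990 to 995. This is the overestimate of the 99th percentile described above.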
Conventional selection of an appropriate population for calculating a reference interval has always been relatively straightforward. Healthy ambulant outpatients or blood donors have been a preferred source. When it comes to cardiac biomarkers, life becomes a little more complicated. Studies on high-sensitivity cardiac troponin T (hs-cTnT), and more recently on high-sensitivity cardiac troponin I (hs-cTnI), have shown that levels of these biomarkers, even within the reference interval, predict future cardiac events and predict the presence of underlying structural cardiac disease (6). The first study that used progressive selection of apparently normal individuals based first on a health questionnaire, then on the results of physical examination, biochemical testing, and cardiac imaging, showed a progressive reduction in the 99th percentile, as more stringent criteria were applied (7). Interestingly, this only occurred for high-sensitivity troponin methods. Progressive selection of the population according to predefined criteria to exclude asymptomatic underlying medical conditions, referred to as “coning,” and its impact on the 99th percentile was subsequently confirmed for hs-cTnT and hs-cTnI (8). Appropriate patient selection is therefore required when generating reference intervals for cardiac biomarkers. Clearly, it is not practical to undertake routine cardiac imaging, but the use of surrogate markers for ventricular function such as N-terminal pro–B-type natriuretic peptide (NT-proBNP) is appropriate to exclude subclinical disease (9, 10).
In this issue of The Journal of Applied Laboratory Medicine, the authors report the collection of a sample bank at the 2015 AACC Annual Meeting in Atlanta, GA, and at the University of Maryland to act as a sample set for Wu et al. (11). The sample collection followed the recommendations of the IFCC (9, 10). Subjects completed a self-administered health questionnaire and underwent biochemical testing, including NT-proBNP measurement, to exclude those with disease and define a sample set that would be considered an appropriate reference population.
The argument about what constitutes a “normal” cardiac population can be challenging. Troponin levels appear to rise with age, although this may reflect underlying, undiagnosed disease (12). In addition, the majority of patients presenting to the emergency department have comorbidities. However, the selection used in this study represents a pragmatic population. The nature of the screening used will exclude the majority of asymptomatic disease, and the selection represents the low-risk population likely to attend the emergency department with chest pain. It has recently been shown that rapid and safe rule-out of acute myocardial injury, including myocardial infarction, can be achieved by either a low cutoff for hs-cTnI on admission or serial measurement combining a small δ change with all values remaining below the 99th percentile (13). This sample set offers the potential to generate a reliable 99th percentile, independent of any single manufacturer, that can be used clinically, since subjects were appropriately screened by survey and laboratory testing to exclude those with known or undiagnosed cardiac conditions.
Finally, there is now the unique opportunity to provide a 99th percentile on the same sample set for cTnI across assays. To date, the ability to compare values from different assay platforms has been absent from the clinical and laboratory community; whether or not true standardization of cTnI ever occurs is a matter of debate (14). However, knowledge of the comparability of values will prove invaluable in assessing and refining the findings from clinical studies where different methods have been used. I would encourage manufacturers and researchers to use this resource.
Modeling of reference limits was performed by using the Analyse-it add-in for Excel (https://analyse-it.com).
Nonstandard abbreviations: hs-cTnT, high-sensitivity cardiac troponin T; hs-cTnI, high-sensitivity cardiac troponin I; NT-proBNP, N-terminal pro–B-type natriuretic peptide.
see article on page 711
Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form.
Employment or Leadership: None declared.
Consultant or Advisory Role: P. Collinson, IFCC Task Force on Cardiac Troponin.
Stock Ownership: None declared.
Honoraria: None declared.
Research Funding: None declared.
Expert Testimony: None declared.
Patents: None declared.
- Received February 14, 2017.
- Accepted February 22, 2017.
- © 2017 American Association for Clinical Chemistry