Background: Antinuclear antibody (ANA) testing is routinely performed during evaluation of patients with a suspected connective tissue disease (CTD), yet the question of which method is most appropriate remains controversial. The purpose of this study was to evaluate the clinical utility of ANA testing by an enzyme immunoassay (EIA), an immunofluorescence assay (IFA), and a multiplex immunoassay (MIA) in a routine laboratory population.
Methods: Samples (n = 1000) were collected from specimens submitted for ANA testing by EIA (Bio-Rad). All samples were subsequently analyzed by IFA (Zeus) and MIA (Bio-Rad). The sample cohort was weighted to represent the routine testing population. Diagnostic information was obtained by chart review.
Results: For the diagnosis of a CTD, ROC curve analysis demonstrated no significant differences between IFA (area under the curve 0.81) and EIA (0.84) (P = 0.25), with overlay of a single point for the MIA. When normalized to a specificity of approximately 90%, the sensitivities of the MIA, EIA, and IFA were 67%, 67%, and 56%, respectively. By varying the clinical cutoff, the IFA could achieve the highest sensitivity of 94%; however, the corresponding specificity was only 43%. In contrast, a strongly positive EIA had a specificity of 97%, although, at this cutoff, the sensitivity was only 40%.
Conclusions: Although the overall diagnostic performance of the IFA, EIA, and MIA was not statistically different, the clinical sensitivity and specificity varied dramatically based on the positive/negative cutoff. Knowledge about the performance characteristics of each method will significantly aid in the interpretation of ANA testing.
Patients being tested for possible diagnosis of a connective tissue disease will benefit from the information presented here. Evidence presented on the relative diagnostic sensitivity and specificity of various methods used for ANA testing will allow better characterization of how this testing might perform in a clinical laboratory. Knowledge in the field of diagnostic testing for connective tissue diseases, specifically related to ANA methods and interpretations, will be advanced by the information presented.
Dating to the late 1940s and early 1950s, testing for antinuclear antibodies (ANAs) has remained one of the first-line tests for evaluation of patients with suspected connective tissue diseases (CTDs), also referred to as ANA-associated rheumatic diseases (1–4). ANAs are a heterogeneous group of antibodies so-named because the majority are specific for nuclear antigens, such as double-stranded DNA (dsDNA) and ribonucleoproteins. A variety of methods are currently available for ANA testing in the clinical laboratory (2, 5, 6). The immunofluorescence assay (IFA) was the first method used for routine clinical ANA testing (7). This method is still used by a large number of laboratories, with most performed on an HEp-2 cellular substrate. However, IFAs have some analytical disadvantages, including subjective interpretation and limited automation. To overcome these issues, enzyme immunoassays (EIAs) were developed (8–10). Although many variations exist, most EIAs for ANA testing use an HEp-2 nuclear extract, supplemented with certain purified antigens, in place of the intact cell. The theory behind this method is that the majority of antigens present in the nucleus would be represented in the extract, thereby maintaining the broad antigen sensitivity required for an ANA screening test. Hundreds, perhaps thousands, of potential antigens are present in the nucleus of an HEp-2 cell. However, some antigens show the highest sensitivity and specificity for individual CTDs. These clinically relevant antigens are the basis of the newest ANA method available to the clinical lab, the multiplex immunoassay (MIA) (11–16). In this assay, a set of specific antigens are coupled to fluorescent microbeads. The patient sample is interrogated using a mixture of these antigen-specific beads. In the case of an MIA test for ANA, positivity for one of the antigen specificities will result in the sample being identified as “ANA positive.”
From a laboratory process standpoint, EIAs and MIAs offer many advantages over IFAs, particularly for large-volume laboratories. For the last 15 years, our laboratory has used the EIA as the primary method for our general ANA test. In 2011, the American College of Rheumatology released a position statement citing the opinion that IFA is the gold standard for ANA testing, primarily because of the sensitivity of the IFA for the diagnosis of various CTDs. This opinion raised many questions from our clinicians regarding the diagnostic performance of our EIA method and the MIA method by extension. The purpose of this study was to compare the diagnostic sensitivity and specificity of an IFA, EIA, and MIA in the representative patient population of a clinical laboratory.
Samples (n = 1000) were collected from specimens submitted to the Antibody Immunology Laboratory at the Mayo Clinic. Samples were identified based on the clinical ANA result obtained by EIA (Bio-Rad). This testing was performed in accordance with all manufacturers' instructions on the Triturus semi-automated platform (Grifols). Using a cutoff of ≤1.0 U, a total of 273 negative samples were randomly selected for inclusion in this study (the manufacturer's recommended reference range was <1.0 U). In addition, using ranges established by the laboratory, 225 weak positive (1.1–2.9 U), 250 positive (3.0–5.9 U), and 252 strong positive (≥6.0 U) samples were collected. At the time of collection, no information regarding indication for testing, differential diagnosis, or clinical symptoms was accessed. This study was approved by the Mayo Clinic Institutional Review Board.
All samples selected for this study were subsequently analyzed by IFA and MIA. The IFA testing was performed by Mayo Medical Laboratories New England using HEp-2 cells (Zeus Scientific). All samples were screened at a dilution of 1:40; samples positive at 1:40 were titered to a final dilution of 1:640. The MIA testing was performed by the Antibody Immunology Laboratory on the BioPlex® 2200 (Bio-Rad). A sample was identified as positive for an ANA by the MIA if a positive result was obtained for at least 1 of the 11 included antigens. No samples were flagged for reagent blank bead or internal standard errors.
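The MIA interpretation rule described above reduces to a logical OR across the bead set. A minimal Python sketch of that rule, assuming each antigen result has already been reduced to a positive/negative flag; the antigen labels are placeholders, because the 11 specificities are not named in the text:

```python
from typing import Mapping


def mia_ana_positive(antigen_results: Mapping[str, bool]) -> bool:
    """Return True if at least one antigen-specific bead is positive."""
    return any(antigen_results.values())


# Hypothetical panel of 11 placeholder antigens, all negative.
panel = {f"antigen_{i}": False for i in range(1, 12)}
all_negative = mia_ana_positive(panel)

# A single positive specificity is enough to call the sample ANA positive.
panel["antigen_4"] = True
one_positive = mia_ana_positive(panel)
```

Because any single positive bead drives the overall call, the specificity of the screen depends on the specificity of every individual antigen cutoff.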
Samples were selected for this study to be distributed uniformly across the reportable range of the ANA EIA method. Clinical diagnoses for all patients in this study were obtained by chart review. Diagnoses in the CTD group (total n = 227) included systemic lupus erythematosus (n = 75), inflammatory myopathy (n = 20), Sjögren's syndrome (n = 42), sclerotic disease (n = 26), overlap syndrome (n = 8), mixed CTD (n = 4), and undifferentiated CTD (n = 52). However, this uniformly distributed range of ANA EIA test results does not reflect the typical distribution of the ANA EIA test results for the overall patient population served by the Antibody Immunology Laboratory. In a 6-week period from September to October 2010, a total of 1590 ANA EIA tests were performed. The distribution of test results within that population was as follows: 1202 (75.6%) negative (≤1.0 U); 216 (13.6%) weak positive (1.1–2.9 U); 79 (5.0%) positive (3.0–5.9 U); and 93 (5.9%) strong positive (≥6.0 U). All analyses were weighted by the inverse of the sampling probabilities to make the study population representative of the overall patient population (17). This step is necessary to accurately estimate sensitivities and specificities of the tests (18). For sensitivity and specificity calculations in the weighted cohort, patients with a diagnosis of a CTD (weighted n = 76) were compared to all other diagnoses in the cohort (weighted n = 924). In addition, sensitivities and specificities were calculated in the unweighted cohort (CTD, n = 227; non-CTD, n = 773). Sensitivities and specificities were calculated using the cutoffs described above. ROC curves were constructed for the 3 ANA methods and the areas under the curve were calculated. Positive and negative likelihood ratios (LRs) were calculated. 
Interval LRs, defined as the probability of obtaining a test result in a specific range when a CTD is present divided by the probability of obtaining that test result when a CTD is absent, were calculated to capture the magnitude of abnormality of the test results (19). The 95% CIs were obtained for the LRs (20). Analyses were performed using SAS version 9.3 (SAS Institute) and R 3.0.2 (R Foundation for Statistical Computing).
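The inverse-probability weighting described above can be illustrated with the stratum counts given in the text. The sketch below is one plausible reading, not the authors' SAS or R code: each stratum weight is the population proportion divided by the sample proportion, and scaling the weighted cohort to n = 1000 is an assumption made here for convenience.

```python
# EIA result distribution in the 6-week reference period (from the text).
population = {"negative": 1202, "weak_positive": 216,
              "positive": 79, "strong_positive": 93}
# Number of samples selected per stratum for the study (from the text).
sampled = {"negative": 273, "weak_positive": 225,
           "positive": 250, "strong_positive": 252}


def stratum_weights(population, sampled):
    """Weight = population proportion / sample proportion for each stratum."""
    pop_total = sum(population.values())
    samp_total = sum(sampled.values())
    return {
        stratum: (population[stratum] / pop_total) / (sampled[stratum] / samp_total)
        for stratum in population
    }


weights = stratum_weights(population, sampled)
# Weighted stratum sizes; these sum to the original cohort size of 1000.
weighted_counts = {s: weights[s] * sampled[s] for s in sampled}
```

Under these assumptions the negative stratum is weighted up (to about 756 of 1000 samples) and the three positive strata are weighted down, which mirrors the 75.6% negative rate seen in routine testing.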
Analytical comparison between ANA methods
The analytical concordance between the ANA IFA and EIA results is shown in Table 1. The IFA and EIA results were each divided into 3 categories based on titer and arbitrary units, respectively. A total of 580 samples had a titer of ≤1:40 by IFA, which would correspond to a negative interpretation. Of those, 515 (88.8%) were also negative by EIA, with a quantitative result of ≤1.0 U. However, a modest number of samples (n = 65; 11.2%) were classified as positive based on the EIA method, despite having an IFA titer of ≤1:40. For samples in which the ANA IFA titer was ≥1:320 (n = 149), 61 (40.9%) and 52 (34.9%) had EIA values of 1.1–5.9 U or ≥6.0 U, respectively, with the remaining 36 samples (24.2%) having a negative EIA result of ≤1.0 U. In contrast, at the more moderate titers of 1:80 and 1:160 (n = 271), the majority of samples (n = 205; 75.6%) had EIA results that were classified as negative, with only 66 samples (24.4%) having a confirmatory positive EIA.
The analytical correlation of the EIA and IFA with the MIA was also assessed (Table 1). In samples where the IFA was ≤1:40 and the EIA was ≤1.0 U (n = 515), the majority (n = 490; 95.1%) were negative by MIA, whereas 25 samples (4.9%) were positive. This high frequency of negative MIA results was observed consistently in samples with EIA results ≤1.0 U, despite IFA titers of 1:80/1:160 (n = 180; 87.8%) or ≥1:320 (n = 33; 91.7%). Further, the frequency of positive MIA results correlated with the EIA unit value. For samples with EIA results of 1.1–5.9 U, the frequencies of positive MIA results were 25.4% (IFA ≤1:40), 41.0% (IFA 1:80/1:160), and 37.7% (IFA ≥1:320). These frequencies increased to 100% (IFA ≤1:40), 80% (IFA 1:80/1:160), and 88% (IFA ≥1:320) for EIA results ≥6.0 U.
Diagnostic comparison between ANA methods
The clinical sensitivity and specificity for each of the 3 ANA methods were then compared in terms of their ability to distinguish patients with a CTD from all others within the cohort. ROC analysis is shown in Fig. 1. EIA unit values were treated continuously, while the sensitivities and specificities were calculated at the individual titers for the IFA. Using the weighted data, which represents the general testing population of the laboratory, the ROC curves for the EIA and IFA essentially overlay one another, with no statistically significant differences between the areas under the curve for the IFA (0.81) compared to the EIA (0.84) (P = 0.25) (Fig. 1A). Also shown in Fig. 1A is a single point (triangle) representing the MIA. Because the MIA is purely qualitative (positive or negative), ROC analysis is not meaningful. However, it is important to note that the single point representing the MIA also falls on the EIA/IFA ROC curves. In Fig. 1B, ROC analysis of the unweighted data is shown. Similar to the weighted analysis, there is no statistical difference in the areas under the curve between the EIA (0.82) and the IFA (0.76).
Because ANA testing is interpreted as “positive” or “negative” based on specific cutoff values, clinical sensitivities and specificities at varying cutoffs were compared for the 3 methods (Table 2). For MIA, a strictly qualitative result, a sensitivity of 67% was observed in the weighted population, with a corresponding specificity of 87%. These values correspond to positive and negative LRs of 5.28 and 0.38, respectively. In comparison, in the unweighted analysis, the specificity decreased to 69% with an increase in the sensitivity to 86%. In the weighted cohort, the IFA demonstrated the highest sensitivity of 94% at a cutoff of 1:40, although this was in the context of a specificity of 43%. As the titer cutoff was increased to 1:80 and 1:160, the sensitivity decreased to 84% and 70%, respectively, with a corresponding increase in specificity to 62% at 1:80 and 77% at 1:160. Because of these lower specificities, the positive LRs for the IFA only reached a modest 3.07 at a cutoff of 1:160. Conversely, at the highest sensitivity, a significant negative LR of 0.14 was achieved. In contrast, using the weighted analysis, the EIA achieved the highest specificity at 97% (at a cutoff of 6.0 U). Even at the lowest cutoff of 1.1 U, the specificity remained relatively high at 80%. However, this increased specificity for the EIA came at the expense of sensitivity. At the cutoff of 1.1 U, the observed sensitivity was 74%, which decreased to 40% at a cutoff of 6.0 U. Because of the high specificities, the positive LRs for the EIA exceeded those of the IFA, ranging from 3.67 to 13.37. In contrast, the negative LRs, ranging from 0.32 to 0.62, while still moderate, did not reach the clinical significance of the IFA. When assessing the unweighted population, both the IFA and EIA showed trends similar to the MIA, specifically an increase in sensitivity with a decrease in specificity.
In addition, in the unweighted analysis, the EIA achieved the highest specificity, while the IFA consistently showed the highest sensitivity, consistent with the weighted analysis.
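The positive and negative LRs reported above follow from the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. A short Python sketch using the rounded percentages from the text; because the inputs are rounded, the outputs match the published values only approximately:

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a dichotomized test.

    Both inputs are proportions between 0 and 1; specificity must be
    strictly between 0 and 1 to avoid division by zero.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Weighted-cohort operating points quoted in the text.
ifa_1_40 = likelihood_ratios(0.94, 0.43)   # IFA at a cutoff of 1:40
eia_6u = likelihood_ratios(0.40, 0.97)     # EIA at a cutoff of 6.0 U
```

With these inputs the negative LR of the IFA at 1:40 comes out near 0.14 and the positive LR of the EIA at 6.0 U near 13.3, in line with the reported 0.14 and 13.37.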
To directly compare the MIA to other methods, clinical sensitivities were calculated at a given specificity (Table 3). For the qualitative MIA, the specificity was determined to be 87% in patients with no autoimmune disease. To make an appropriate comparison, cutoffs for the IFA and EIA were identified that resulted in a specificity of 90%, which were 1:320 and 1.6 U, respectively. At this specificity, the sensitivity of the EIA for diagnosis of a CTD was 67%, which is comparable to the 67% sensitivity of the MIA. In contrast, the sensitivity of the IFA at a specificity of 90% was significantly lower at 56% (P < 0.01). This analysis could not be performed using the unweighted cohort, since some tests did not reach the required specificity of 90%.
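The IFA operating points quoted in the text (1:40, 1:80, 1:160, and 1:320 in the weighted cohort) can be joined into an empirical ROC curve and integrated with the trapezoidal rule. This is an illustrative sketch only: the 1:640 point is not reported in the text and is omitted, and the published area was computed by the authors from the full data, so agreement is approximate.

```python
def trapezoid_auc(points):
    """Area under an empirical ROC curve.

    points: iterable of (sensitivity, specificity) pairs, one per cutoff.
    The curve is anchored at (0, 0) and (1, 1) in (1 - specificity,
    sensitivity) space and integrated with the trapezoidal rule.
    """
    roc = sorted([(0.0, 0.0), (1.0, 1.0)] + [(1.0 - sp, se) for se, sp in points])
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(roc, roc[1:])
    )


# (sensitivity, specificity) for IFA cutoffs 1:40, 1:80, 1:160, 1:320.
ifa_points = [(0.94, 0.43), (0.84, 0.62), (0.70, 0.77), (0.56, 0.90)]
ifa_auc = trapezoid_auc(ifa_points)
```

The four points give an area of about 0.81, close to the area under the curve reported for the IFA in the weighted analysis.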
A similar analysis of sensitivity and specificity was performed using a combination of the IFA and EIA methods, with interval LRs used for comparison (Table 4). If both the EIA and IFA were negative (≤1.0 U and ≤1:40, respectively), the likelihood of a CTD was low, as indicated by the positive LR of 0.19. Interestingly, if the EIA was ≤1.0 U, a positive IFA result of 1:80/1:160 or ≥1:320 only increased the positive LR to 0.50 and 1.12. If the EIA fell into the weak positive or positive range, with a corresponding negative IFA, an insignificant positive LR of 0.61 was observed. In this same EIA range, a positive IFA increased the positive LR from 2.66 to 3.08. Lastly, a strong positive EIA result (≥6.0 U) was associated with the highest positive LRs, ranging from 6.17 to 18.50. Interestingly, a clinically relevant positive LR was observed for a strong positive EIA result, even if the IFA resulted at a titer of ≤1:40.
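An interval LR is the probability of a result falling in a given range when a CTD is present divided by the same probability when a CTD is absent. The combined EIA/IFA cells of Table 4 cannot be reproduced from the text alone, so the sketch below applies the definition to the EIA by itself, using the rounded weighted-cohort sensitivities and specificities quoted for the 1.1 U and 6.0 U cutoffs. The resulting values are illustrative and are not taken from Table 4.

```python
def interval_lrs(cutoffs):
    """Interval LRs from cumulative operating points.

    cutoffs: list of (sensitivity, specificity) pairs at ascending
    positivity cutoffs. Returns one LR per result interval, from the
    lowest interval (below the first cutoff) to the highest.
    """
    # P(result >= cutoff | CTD) is the sensitivity at that cutoff;
    # P(result >= cutoff | no CTD) is 1 - specificity.
    p_ctd = [1.0] + [se for se, _ in cutoffs] + [0.0]
    p_non = [1.0] + [1.0 - sp for _, sp in cutoffs] + [0.0]
    return [
        (p_ctd[i] - p_ctd[i + 1]) / (p_non[i] - p_non[i + 1])
        for i in range(len(p_ctd) - 1)
    ]


# EIA cutoffs of 1.1 U and 6.0 U (sensitivity, specificity) from the text.
eia_interval_lrs = interval_lrs([(0.74, 0.80), (0.40, 0.97)])
# Intervals: <=1.0 U, 1.1-5.9 U, >=6.0 U
```

Under these rounded inputs the three EIA intervals give LRs of roughly 0.33, 2.0, and 13.3, which shows how the magnitude of the result, and not only its positive or negative status, shifts the likelihood of a CTD.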
In 2011, the American College of Rheumatology released a position statement titled “Methodology of Testing for Antinuclear Antibodies.” This statement included a review of the literature related to clinical testing for ANAs, along with several recommendations for laboratories. One of these recommendations is that laboratories using newer methods need to provide data to physicians regarding clinical sensitivity and specificity of the chosen method in comparison to the traditional IFA. This study fulfills this recommendation by assessing the clinical performance of ANA testing by IFA, EIA, and MIA in our laboratory patient population. With this approach, rather than testing well-defined populations of patients with CTDs and disease controls, patients were selected at random from samples submitted to the clinical laboratory for ANA testing, presumably for a variety of clinical indications. Although the initial cohort was selected to equally represent negative, weak positive, positive, and strong positive ANA results, weighting of the group allowed for definition of a patient group that reflects the population for whom testing is ordered. It is important to recognize that this approach is meant to represent the laboratory's testing population and not the general population. Although each individual disease is relatively rare, the prevalence of CTDs as a group is approximately 1%–2%. Without weighting, the prevalence of a CTD diagnosis in our cohort is 22.7%. After the weighting algorithm is applied, which is based on the frequency of test results in our laboratory, the prevalence of CTDs drops to 7.6%. While still higher than in the general population, this is to be expected, as testing is likely ordered in patients with some clinical indication of a CTD, therefore driving a higher disease prevalence.
The weighting algorithm was an important step because, although generally thought otherwise, the diagnostic sensitivity and specificity of a test do vary with disease prevalence. When the analysis was performed in the cohort without weighting, the sensitivity of the ANA testing increased with a corresponding decrease in specificity. This is similar to the observations of Brenner and Gefeller, who showed that sensitivity increases and specificity decreases when the prevalence of a disease increases (18). Because the prevalence of CTDs is relatively low, it would be extremely difficult to collect enough samples to represent the entire population and acquire a reasonable number of samples from affected patients. Collection of a population including an overrepresentation of the disease group, followed by application of the weighting algorithm, is a viable option (17).
Strictly analytical comparisons between ANA methods are challenging. Quantitative comparisons are not possible because different methods, specifically IFA and EIA, report results by vastly different mechanisms (titer and arbitrary units, respectively). Qualitative (positive/negative) agreement is also a challenge, since it requires that a single cutoff be established for each assay. For this study, qualitative agreement between methods was assessed by grouping results into negative, weak positive/positive, and strong positive categories. For results in which the IFA was ≤1:40, 88.8% also had a negative EIA result (≤1.0 U). Correspondingly, for samples with an IFA of ≥1:320, 75.8% had an EIA that would have been interpreted as weak positive, positive, or strong positive. These two categories represent the concordant negative and concordant positive groups, respectively. However, this same approach demonstrates that there are samples that fall into the categories of “IFA positive/EIA negative” and “IFA negative/EIA positive.” Results that are “IFA positive/EIA negative” are not unexpected. These results have generally been explained by the increased number of antigens in the HEp-2 cells compared to that in the lysates used for EIAs. In contrast, the finding of “IFA negative/EIA positive” may be surprising. In many studies, IFA has been used to define cohorts of CTD patients, potentially excluding patients with a CTD who are ANA negative by IFA (21, 22). Our data suggest that there may be some antibodies that react with antigens not well represented or “hidden” in the HEp-2 cells, such as SS-A, or that may bind to specific epitopes with increased exposure in the solid-phase EIA (23–26). To further complicate the comparison, approximately 5% of samples negative by both IFA (≤1:40) and EIA (≤1.0 U) were found to be positive on the MIA method. This is quite curious, given that the primary concern regarding multiplex assays has focused on the limited antigen repertoire. 
Again, whether this is related to how some antibodies recognize different epitopes between solid-phase assays using purified antigens compared to cellular substrates remains to be clarified. It would be interesting to know if certain antigen specificities are enriched in this group, or if these antibodies are associated with a CTD. Unfortunately, the number of these samples in our weighted cohort (n = 25) was not sufficient to address these questions. A study that targets this antibody phenotype and the corresponding clinical presentations would be an important addition to the field.
Because the analytical comparison has significant limitations, diagnostic utility for the three methods was assessed in our testing population, which presents its own challenges. Although diagnosis of a CTD is largely based on clinical evaluation, ANA and other specific autoantibody testing may provide supporting information, particularly in patients with compatible clinical symptoms. This scenario raises the question of how to assess sensitivity and specificity of a test when it is, in part, used to make the diagnosis. Although this issue cannot be entirely discarded, ANA testing plays only a small role in the diagnosis of a CTD, and a positive result would never be diagnostic for any specific CTD. In addition, the purpose of this study was only to compare ANA testing by different methods, with a relative comparison of their diagnostic utilities. Overall, no statistically significant differences in the clinical utility of the IFA, EIA, or MIA methods were identified, as based on comparison of the AUCs from the ROC analysis. However, depending on which specific cutoff is used by a laboratory, differences in the sensitivities and specificities of the various methods were noted. These differences could have significant clinical implications when interpreting a “positive” or “negative” ANA result.
A positive ANA result, while not diagnostic, may increase the likelihood for a diagnosis of a CTD. The utility of a positive result is determined largely by the specificity of the given test. For ANA testing, the EIA demonstrated relatively high specificities, resulting in the highest positive LRs of the 3 methods. In contrast, the IFA showed more modest positive LRs, reflecting the comparatively poor specificities of this method, even at relatively high titers. The MIA, with its qualitative interpretation, showed a positive LR of 5.28. Taken together, these data suggest that a positive result by EIA or MIA increases the likelihood of a CTD more than a positive IFA result does, although this is heavily influenced by the diagnostic cutoff used for interpretation.
A negative ANA, on the other hand, is considered to be useful for ruling out a CTD. The utility of a negative result to exclude a diagnosis relies on high sensitivity of the test. The American College of Rheumatology position statement indicates that IFA testing “should remain the gold standard for ANA testing,” which is based largely on studies that have demonstrated that this method displays the highest sensitivity for most CTDs, and for systemic lupus erythematosus in particular. Data in this patient cohort corroborate these findings, with the IFA demonstrating the highest sensitivity, although this was only observed at relatively low titers. In contrast, the sensitivities of the MIA and EIA did not reach those of the IFA. As a result, the IFA showed the best negative LR of 0.14 compared to both the EIA and MIA. However, it should be noted that even the negative LR of a negative IFA result using a low-titer cutoff only moderately decreases the likelihood of a CTD. Based on the analysis in this testing cohort, it seems that none of the three methods included has sufficient sensitivity such that a negative result conclusively excludes a diagnosis of a CTD.
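The effect of an LR on the likelihood of disease can be made concrete with Bayes' theorem in odds form: post-test odds = pretest odds × LR. The sketch below assumes, purely for illustration, a pretest probability equal to the 7.6% weighted CTD prevalence in this cohort; an individual patient's pretest probability depends on the clinical presentation and may differ substantially.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Assumed pretest probability: weighted CTD prevalence in the cohort.
pretest = 0.076

# LRs quoted in the text for the weighted cohort.
after_negative_ifa = post_test_probability(pretest, 0.14)   # IFA negative at 1:40
after_positive_mia = post_test_probability(pretest, 5.28)   # MIA positive
after_strong_eia = post_test_probability(pretest, 13.37)    # EIA >= 6.0 U
```

Under this assumption, a negative IFA lowers the probability from 7.6% to roughly 1%, a positive MIA raises it to roughly 30%, and a strong positive EIA raises it to roughly 52%.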
In an attempt to improve the positive and negative predictive value of ANA testing, a combination of EIA and IFA was assessed. Results ≥1:320 by IFA and ≥6.0 U by EIA had the highest positive LR (13.88) for a diagnosis of a CTD. However, this appears to be no different than the positive LR of 13.3 for an isolated EIA result of ≥6.0 U. In addition, for patients with a strong positive EIA result and a negative or weak positive IFA result, there is still a moderate increase in the likelihood of the individual having a CTD. These data suggest that if a patient has a positive ANA by this EIA method, addition of an IFA is unlikely to rule out (negative IFA) or significantly add to the positive predictive value (positive IFA) of a CTD. In contrast, if a patient has a positive IFA, additional testing by EIA may be useful. In this scenario, if the EIA is positive, the predictive value of the results is enhanced over the IFA alone, whereas if the EIA is negative, the likelihood of a CTD is substantially decreased.
There are several limitations to this study that should be taken into account. A single kit or assay was included to represent each of the 3 methodologies. Given that there may be some kit-to-kit variation, results from the EIA in this study, for example, may not be generalizable to all EIA kits on the market. As an extension of this, the diagnostic utility determined in this study is a reflection of our patient cohort. For laboratories with different testing populations, the observed sensitivities and specificities, particularly at the different cutoffs, may vary significantly. Laboratories may consider similar studies, the results of which may be a more accurate assessment of ANA testing in their patient population using their specific kit and method of interest.
This study compared the clinical sensitivity, specificity, and overall diagnostic performance of the three primary ANA methods currently available to the clinical laboratory. Although, overall, the methods displayed similar diagnostic utility, differences in sensitivity and specificity were observed. All methodologies for ANA testing have limitations, and the performance of a given method will be affected by the patient population (rheumatologic compared to general population) and how the testing will be used (to confirm or rule out a diagnosis). Laboratories must take these considerations into account when choosing the method to be implemented for ANA screening.
Nonstandard abbreviations: ANA, antinuclear antibody; CTD, connective tissue disease; IFA, immunofluorescence assay; EIA, enzyme immunoassay; MIA, multiplex immunoassay; LR, likelihood ratio.
Authors' Disclosures or Potential Conflicts of Interest: Upon manuscript submission, all authors completed the author disclosure form.
Employment or Leadership: L.A. Brunelle, Mayo Medical Laboratories.
Consultant or Advisory Role: None declared.
Stock Ownership: None declared.
Honoraria: None declared.
Research Funding: M.R. Snyder, Bio-Rad.
Expert Testimony: None declared.
Patents: None declared.
Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, and final approval of manuscript.
- Received April 19, 2016.
- Accepted April 27, 2016.
- © 2016 by American Association for Clinical Chemistry