- Research article
- Open Access
Development and validation of a computerized expert system for evaluation of automated visual fields from the Ischemic Optic Neuropathy Decompression Trial
BMC Ophthalmology volume 6, Article number: 34 (2006)
The objective of this report is to describe the methods used to develop and validate a computerized system to analyze Humphrey visual fields obtained from patients with non-arteritic anterior ischemic optic neuropathy (NAION) and enrolled in the Ischemic Optic Neuropathy Decompression Trial (IONDT). The IONDT was a multicenter study that included randomized and non-randomized patients with newly diagnosed NAION in the study eye. At baseline, randomized eyes had visual acuity of 20/64 or worse; non-randomized eyes had visual acuity of better than 20/64 or belonged to patients who refused randomization. Visual fields were measured before treatment using the Humphrey Field Analyzer with the 24-2 program, foveal threshold, and size III stimulus.
We used visual fields from 189 non-IONDT eyes with NAION to develop the computerized classification system. Six neuro-ophthalmologists ("expert panel") drafted definitions for visual field pattern defects using 19 visual fields representing a range of pattern defect types. The expert panel then used 120 visual fields, classified using these definitions, to refine the rules, generating revised definitions for 13 visual field pattern defects and 3 levels of severity. These definitions were incorporated into a rule-based computerized classification system run on Excel® software. The computerized classification system was used to categorize visual field defects for an additional 95 NAION visual fields, and the expert panel was asked to independently classify the new fields and then to indicate whether they agreed with the computer classification. To account for test variability over time, we derived an adjustment factor from the pooled short-term fluctuation. We examined change in defects with and without adjustment in visual fields of study participants who demonstrated a visual acuity decrease within 30 days of NAION onset (progressive NAION).
Despite an agreed-upon set of rules, agreement among the expert panel was poor when their independent visual field classifications were compared. A majority did concur with the computer classification for 91 of 95 visual fields. Remaining classification discrepancies could not be resolved without modifying existing definitions.
Without using the adjustment factor, the visual fields of 14 of 22 (63.6%) patients with progressive NAION and no central defect, and of all 7 patients with a paracentral defect, worsened within 30 days of NAION onset. After applying the adjustment factor, the visual fields of the same 14 patients with no initial central defect, and of 5 of the 7 patients with a paracentral defect, were seen to worsen.
The IONDT developed a rule-based computerized system that consistently defines pattern and severity of visual fields of NAION patients for use in a research setting.
The Ischemic Optic Neuropathy Decompression Trial (IONDT) was a randomized clinical trial designed to test the safety and efficacy of optic nerve decompression surgery (ONDS) combined with careful follow-up for treatment of non-arteritic anterior ischemic optic neuropathy (NAION), as well as to document the natural history of NAION. Using visual acuity as the primary outcome measure, the IONDT demonstrated that ONDS is not effective and may be harmful.
For NAION, characterized clinically as causing visual field loss, conclusions about treatment efficacy and natural history based on visual acuity outcomes alone may be inadequate. For this reason, change in the visual field, as measured by the Humphrey Visual Field Analyzer (HVF), was a planned secondary outcome in the IONDT. The HVF® (Zeiss Humphrey, San Leandro, CA, USA) provides a standardized testing environment, quantitative assessment of threshold sensitivity to spots of light at fixed points throughout the visual field, and data regarding the reliability of patients' responses.
In the IONDT, we found no difference between visual fields from ONDS and careful follow-up groups at 6 months using the HVF global measure, "mean deviation" (MD). However, MD by itself may be an insufficient measure for assessment of visual fields in eyes with NAION. For example, the classical patterns of defect encountered in NAION may shift without changing average loss or there may be important changes in sensitivity within small areas of the visual field corresponding to nerve fiber bundle defects. These changes in area or size may not be detected when averaged into the MD measure, warranting a more detailed analysis of the quantitative visual field testing.
Development and validation of a system for classifying and assessing change in visual fields is complex due to the lack of "gold standards". Glaucoma trials have utilized a number of approaches for evaluating progression, but the algorithms seldom include classifications based upon defect type [3–6]. Although the Optic Neuritis Treatment Trial (ONTT) investigators categorized visual field defects, they did not use strict definitions for classification, and patterns of field loss were qualitatively rather than quantitatively determined.
Despite a variety of anticipated challenges, in the IONDT we set out to develop a rule-based computerized system for classifying and analyzing visual fields. Our intent was to create logic-based computer algorithms that reliably reproduced the clinical classifications of visual field defects encountered in NAION so as to evaluate the IONDT visual fields. The computer algorithm is intended for use in a clinical research setting where standardization of classification is required.
We have previously described the IONDT eligibility criteria, randomization procedure, and visit protocols in detail. Briefly, patients aged 50 years or older were eligible for randomization into surgical or careful follow-up groups if they had symptoms and signs characteristic of NAION for 14 days or less in one eye. Patients with visual acuity of 20/64 or less in the study eye comprised the "regular entry" group, while patients otherwise eligible but with visual acuity better than 20/64 were enrolled only if visual acuity decreased to 20/64 or worse within 30 days of onset of symptoms ("late entry" group). Patients with acuity better than 20/64 and otherwise eligible and patients who refused randomization were followed as part of a non-randomized group. Institutional review boards at participating institutions approved the protocol and all participating patients provided signed informed consent.
We completed visual field testing of study and non-study eyes of all enrolled patients at baseline, for both randomized and non-randomized eyes. In the IONDT, automated perimetry was performed by trained certified visual field technicians using the HVF, 24-2 program with stimulus size III, full threshold strategy, and with foveal sensitivity measured concurrently. Visual fields for the study eye were measured before those for the non-study eye. For randomized patients, visual fields were obtained at the baseline examination; if randomization took place more than 24 hours after baseline, visual fields were re-measured. Clinical Centers measured visual fields prospectively. For randomized patients, this was at the 3, 6, and 12-month visits, at study closeout (minimum of 5 years of follow-up), and at approximately annual intervals between the 12-month and closeout visits. Visual fields for non-randomized patients were obtained at the baseline examination, at either the 6- or 12-month visit or both, at closeout, and at approximately one-year intervals between the 12-month and closeout visits. All patients were followed for at least 5 years. Visual field data were evaluated as a secondary outcome measure and not utilized for decision-making during the conduct of the trial.
Methods used to develop the computerized visual field classification system
Development of classification system
In 1994, we formed a Visual Field Committee (VFC), which included an "expert panel" of six IONDT neuro-ophthalmologists (AA, SMC, SEF, LNJ, GK, SAN) with expertise in the interpretation of visual fields, five methodologists (KD, JK, PL, RWS, PDW), and a programmer (LL). The number of experts required on the panel was decided after a statistical computation determined that the chance of six experts agreeing on ten patterns by guessing would be 0.00001. A majority of the experts needed to agree to categorize a field defect as a specific pattern. The chance of this degree of concordance occurring by guessing alone was 0.01215. For any field in which the agreement among panelists was not significantly better than guessing, the field was considered 'non-classifiable'.
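The chance-agreement figures quoted above can be reproduced directly. The sketch below assumes six independent raters each guessing uniformly among ten patterns; the 0.01215 figure corresponds to exactly four of six raters coinciding on some pattern (with only six votes, no two patterns can each attract four votes, so the ten per-pattern probabilities simply add).

```python
from math import comb

n_experts, n_patterns = 6, 10
p = 1 / n_patterns  # chance a given expert guesses a given pattern

# All six experts guess the same pattern (any one of the ten).
p_all_agree = n_patterns * p**n_experts
print(f"{p_all_agree:.5f}")  # 0.00001

# Exactly four of six experts coincide on some pattern by guessing.
# At most one pattern can receive 4+ of the 6 votes, so the ten
# per-pattern probabilities are disjoint and sum directly.
p_four_agree = n_patterns * comb(6, 4) * p**4 * (1 - p)**2
print(f"{p_four_agree:.5f}")  # 0.01215
```

This reproduces both probabilities stated in the text, supporting the reading that the 0.01215 threshold refers to a four-of-six majority.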
The VFC established the protocol for developing visual field defect categories, training and evaluating the expert panel, developing and testing a computerized classification system, and defining progression using this system. The Committee based the sequence of steps used to develop and validate the computerized expert system (see Figure 1) on the approach of Molino and associates. The expert panel formulated the definitions of the various types of field defects (e.g., normal, absolute defect, diffuse depression), all of which were based solely on data available within the 24-2 visual field.
We used 189 visual fields to develop the computer-based classification system, none of which were associated with patients enrolled in the IONDT. Eighty-one visual fields were included from patients with NAION who were screened but not eligible for the IONDT. Reasons for ineligibility included refusal of enrollment, age < 50 years, unknown onset of symptoms or onset > 14 days, inability to read English, myocardial infarction within the last 6 months, a visual condition precluding reliable visual acuity measurement, and current anticoagulant or corticosteroid use. One hundred eight visual fields from NAION patients not screened or enrolled in the IONDT, and seen at Doheny Eye Institute (n = 24), Emory University (n = 50), or the University of Missouri's Mason Eye Institute (n = 34), were also used to develop the computer-based system, following institutional review board approval at each institution. All visual fields used to develop the computer-based classification system had been evaluated for reliability and had < 3 fixation losses, < 3 false positive responses, and < 3 false negative responses.
The expert panel first formulated initial operational definitions of visual field defects corresponding to the 52 points in the 24-2 Humphrey scale (see Figure 2). Global guidelines included the following rules:
If a field is classified as normal or absolute (no perception of stimuli), no other classification may be made.
A depressed point is defined as equal to, or greater than, 4 dB loss.
Fields are classified even though they appear unreliable from the HVF indices (i.e., short term fluctuation).
Severity is based upon subjective judgment. Only the arcuate/altitudinal category may have more than one severity with a separate severity assignable to the arcuate and the altitudinal components.
Definitions were refined through an iterative process using an "evaluation set" of visual fields until consensus was reached, as follows:
The VFC director reviewed the 189 visual fields to select 19 with one or more representative defects (evaluation set), and then sent the evaluation set to each of the 6 expert panelists, along with instructions, a grading form, and proposed definitions for 13 types of defects and for levels of severity. Members of the expert panel independently reviewed the fields and definitions, and, after telephone and face-to-face meetings, agreed upon modified definitions of pattern defects.
The VFC Director used the modified pattern defect definitions to re-classify the 19 visual fields in the evaluation set. Each member of the expert panel independently reported the degree to which s/he agreed with the classification of each field, choosing from among the following choices: excellent, good, uncertain, poor, or bad. At the same time, the panelists were instructed to categorize the severity (density) of each defect as mild, moderate, severe, or absolute (Table 1). Because there was again lack of agreement among the expert panel on the classification, the group met face-to-face to discuss and revise the existing definitions for a second time. Disagreements were resolved by allowing three categories of field defects: peripheral, paracentral, and central, as well as a category of "other" which could be used only for visual fields that were impossible to fit into any other specific category.
Using the revised definitions derived from the evaluation set, the VFC then sent a "training set" of 97 masked, representative, non-IONDT NAION fields to the expert panel for classification. To assess the ability of the panelists to apply the rules reliably, 11 duplicate fields from the training set and 12 fields from the evaluation set were included for a total of 120 fields.
At least five of six (83%) panelists independently agreed on the defect classification for 55 of 120 fields comprising the training set. Agreement on classification of the remaining 65 fields was achieved through a series of four interactive reconciliation meetings of the expert panel, held either by teleconference or in person. These discussions resulted in further refinement and finalization of the pattern definitions and consensus on classification of all fields in the training set.
The final classification system included "normal" and 13 different rule-based field or defect types, shown in Table 2. Severity was restricted to mild, moderate, and severe, and was defined subjectively.
The computer-based expert system
The VFC Director (SEF) and programmer (LL) translated the rules for the defect definitions and the general rules into a point-by-point set of algorithms applied using logical statements included with standard Excel software (computer-based expert system).
The computer-based expert system, constructed as a rule-based system on an Excel® platform running under Windows 98®, evaluated each field quadrant by quadrant. Quadrants were then analyzed in combination as needed to encompass definitions of all identified types of defects. The programmer translated each rule into a logical statement that could be found true or false, taking the form "if ... then". A truth table was utilized to define specific types of field defects, based upon definitions of the expert panel. Two forms of logical statements were used to identify pattern defects. The first statement was based upon average dB loss within a quadrant. If the average loss did not meet the criteria for depression (i.e., 4 dB), then the alternative statement, based on the number of disturbed points within a quadrant, was used to determine the presence of pattern defects. Thus, the number of disturbed points was used primarily to find mild defects that were missed by averaging.
For instance, if the average dB loss was greater in the periphery than in the central field by 5 dB, then an arcuate defect was defined as present in that quadrant (see definition in Table 2). If the central dB loss was greater by 5 dB than the periphery, then a central defect was present. If no pattern defect was found by averaging, then disturbed point algorithms were used to find mild or smaller pattern defects within a quadrant or defect-appropriate combination of quadrants. A fixed predetermined number of disturbed points had to be within the boundary of a pattern defect for that defect to be considered present. For example, a superior arcuate defect is defined as four depressed points within one or both upper quadrants (see definition in Table 2).
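As a concrete illustration of the two forms of logical statement, the sketch below classifies the upper quadrants using the thresholds quoted above (a 5 dB periphery-versus-center averaging rule, four depressed points for a superior arcuate). The function name, the flat lists of point values, and the boundary handling are illustrative assumptions; the actual IONDT system operated on the 24-2 point layout and many more defect types.

```python
DEPRESSED = 4  # dB loss defining a disturbed point (per the global rules)

def classify_upper_quadrants(peripheral_losses, central_losses):
    """Two-stage rule sketch for a superior arcuate defect.

    peripheral_losses / central_losses: dB loss values (positive = loss)
    at the peripheral and central points of the upper quadrants.
    Hypothetical simplification of the IONDT rule set.
    """
    avg_periph = sum(peripheral_losses) / len(peripheral_losses)
    avg_central = sum(central_losses) / len(central_losses)

    # First form: averaging. Periphery depressed 5 dB more than center.
    if avg_periph - avg_central >= 5:
        return "superior arcuate"

    # Second form: disturbed-point count, catching mild defects that
    # averaging misses (four depressed points in the upper quadrants).
    depressed = sum(1 for d in peripheral_losses if d >= DEPRESSED)
    if depressed >= 4:
        return "superior arcuate (mild)"

    return "no arcuate defect"
```

The second branch shows why the point-count form matters: a field with four clustered 4–5 dB points averages well under the 5 dB criterion yet still harbors a mild arcuate defect.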
Some pattern defects were determined by the presence or absence of other defects. For example, if there were superior and inferior altitudinal defects and a central scotoma, then the pattern was defined as diffuse depression. If there was both a paracentral scotoma and a central scotoma, then the pattern was defined as a central scotoma alone.
Average dB loss within a pattern defect and number of disturbed points was used to classify a defect as mild, moderate, or severe. The severity classification of the expert panel was used to define the boundaries for each type of defect. Table 3 shows how severity for an altitudinal defect was determined using 23 altitudinal defects in the training set identified by the expert panel. Severity for other defects was similarly determined (number and type of defect used to determine severity scores: 9 paracentral scotomas; 26 arcuate defects; 20 diffuse depression defects; and 3 nasal step defects). Classification as an absolute defect (i.e. no response to the brightest stimulus at all points tested on the 24-2 HVF) required use of actual sensitivity rather than relative sensitivity loss.
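A minimal sketch of the severity mapping follows. The cut-points here are purely illustrative placeholders; in the IONDT, the boundaries for each defect type were derived from the expert panel's severity grading of the training set (e.g., Table 3 for altitudinal defects) and are not reproduced in the text.

```python
def severity_from_avg_loss(avg_db_loss, cutpoints=(8, 16)):
    """Map average dB loss within a defect to a severity grade.

    cutpoints: (upper bound for mild, upper bound for moderate) in dB.
    These values are hypothetical; the IONDT boundaries differed by
    defect type and were set from the panel's graded training fields.
    """
    mild_max, moderate_max = cutpoints
    if avg_db_loss <= mild_max:
        return "mild"
    if avg_db_loss <= moderate_max:
        return "moderate"
    return "severe"
```

Note that "absolute" is deliberately absent: as stated above, classifying a defect as absolute required actual (not relative) sensitivity, i.e., no response to the brightest stimulus at all tested points.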
Definition of change across time
Calculation of SFC
To measure change in visual field defects over time (i.e., from baseline to a follow-up visit) we planned to analyze visual fields at multiple time points and compare defect type and severity. We anticipated that change in an individual's visual field could entail change in defect type, defect severity, both defect type and severity, identification of a new defect at follow-up not observed at baseline, or disappearance of a defect observed at baseline.
Because spurious changes in visual fields caused by patient learning effects or by short-term fluctuation were possible, we decided to use the Humphrey global index short term fluctuation (SF), a measure of intra-test variability, as a standard by which to determine the normal variation within an individual's visual fields over a fixed time period. SF is determined by measuring the difference between two threshold measurements at 10 different points across the entire field during the same test. The average of these differences constitutes the SF. Clinically, a small SF (1 to 2 dB) indicates a reliable field. To estimate the normal variation of an individual's visual fields measured at baseline and follow-up, we used a pooled estimate, SFC, for both visits, calculated as follows:
where SFC is half of the 95% confidence interval on the pooled estimate of SF across both visits, SFbaseline is the SF measured for the visual field at baseline, and SFfollow-up is the SF measured at follow-up (i.e., from the visual field obtained at the 6 month visit for determining change from baseline to the 6 month visit, or from the visual field obtained at the 12 month visit for change from baseline to the 12 month visit).
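One way to write the pooled estimate explicitly is shown below. The quadratic pooling of the two visits' SF values and the 1.96 normal-quantile factor are assumptions consistent with the description of SFC as half of a 95% confidence interval, not a formula confirmed by the source:

```latex
SF_{\mathrm{pooled}} = \sqrt{\frac{SF_{\mathrm{baseline}}^{2} + SF_{\mathrm{follow\text{-}up}}^{2}}{2}},
\qquad
SF_{C} = 1.96 \times SF_{\mathrm{pooled}}
```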
When there was an apparent change in defect type from baseline to follow-up we removed the effect of normal variation by using the value of SFC to "adjust" the follow-up visit Humphrey visual field at key points used by the computerized expert system to differentiate between defects types. The adjustment was made in the direction that would decrease the probability of detecting a change in defect type from baseline to follow-up. The adjusted data was then reclassified. For example, if an individual had visual fields classified as having a superior arcuate defect at baseline and a superior altitudinal defect at the follow-up visit, that patient's SFC for these visits was subtracted from the points that distinguish an arcuate from an altitudinal defect in the computerized expert system, i.e., paracentral points 21 and 22, in the follow-up field. This adjusted follow-up visual field was then re-classified. If the adjusted visual field was still classified as having a superior altitudinal defect, then the superior portion of the follow-up field would be classified as having changed from baseline, from a superior arcuate to a superior altitudinal defect. On the other hand, if the adjusted follow-up visual field was classified as having a superior arcuate defect, then the follow-up visual field was classified as "not changed" with a superior arcuate defect for both visits. This general approach was used to distinguish an arcuate from an altitudinal defect, and a central from a paracentral defect.
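The adjustment-and-reclassify step can be sketched as follows. The function name, the dictionary representation of a field, and the default discriminating points (21 and 22, as in the arcuate-versus-altitudinal example above) are illustrative assumptions; in the IONDT the classifier was the rule-based Excel® expert system.

```python
def reclassify_with_adjustment(baseline_type, followup_field, sfc,
                               classify, key_points=(21, 22)):
    """Hedge an apparent change in defect type against normal variation.

    followup_field: dict mapping point index -> dB loss at follow-up.
    sfc: pooled short-term-fluctuation half-interval for the two visits.
    classify: a field-classification callable (hypothetical stand-in
    for the computerized expert system).
    """
    # Subtract SFC at the points that discriminate between the two
    # defect types, i.e., in the direction that makes a change in
    # classification from baseline *less* likely.
    adjusted = dict(followup_field)
    for p in key_points:
        adjusted[p] = max(0, adjusted[p] - sfc)

    adjusted_type = classify(adjusted)
    changed = adjusted_type != baseline_type
    return adjusted_type, changed
```

With a toy classifier that calls a field "superior altitudinal" when points 21 and 22 both show at least 6 dB loss (and "superior arcuate" otherwise), a follow-up field of {21: 7, 22: 8} survives as a true change only when SFC is small: an SFC of 2.5 pulls both points below the boundary and the field is reclassified as "not changed".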
Change in severity was also determined after applying SFC but was only evaluated in fields whose defects were classified as not changed.
Validation of the classification scheme
A set of 95 non-IONDT NAION visual fields was sent to the expert panel as a "validation set"; of these, 22 were masked duplicates chosen systematically from the original training set (every fifth field listed in ID numeric order). The level of agreement on classification of these fields among the expert panel and the corresponding agreement of the computer with the panel members' classifications is shown in Table 4. Reliability of individual panel members in re-classifying defects in the 22 duplicate visual fields from the training set averaged 57% (range: 32% to 77%), despite a common set of definitions derived and finalized by consensus.
Figures 3 through 7 show representative visual fields used in the validation process and illustrate the type of disagreement that was found. Figure 3 shows an example of visual fields for which the expert panel members independently arrived at pattern and severity classifications that were exactly the same as the computerized classification. Figure 4 shows a visual field in which the members agreed among themselves but not with the computer classification, and Figures 5, 6 and 7 show fields for which there was little agreement among the expert panel during independent classification.
We then used an alternative validation approach, whereby the panelists were asked to agree or disagree with the computer's classification. We changed the question posed to panel members from one of application of the rules to classify the defects in this visual field to "does the consistent application of consensus-derived rules applied by the computer program result in a classification of this visual field that is clinically acceptable?" There were only 4 of 95 instances in which the majority (≥ 50%) of the panelists did not believe that the computer classification was clinically acceptable (Table 5). Specific differences were:
Identification of an additional mild superior altitudinal defect by the computer, but not by the panel members, for one field;
Classification by the computer of one field as having a severe diffuse defect and as a combination of three separate defects (superior arcuate, inferior arcuate, and central scotoma) by the expert panel;
Classification by the computer as an altitudinal or arcuate defect, and by the expert panel as an arcuate or altitudinal defect, respectively, in 2 visual fields.
Figure 7 is an example of the third type of disagreement listed. Although two members of the panel concurred with the computer classification (superior arcuate, inferior altitudinal, and central scotoma), two members classified the inferior defect as an arcuate and one member classified the superior defect as an altitudinal defect. One member believed that only a superior arcuate was present. Investigation revealed that if the computer algorithm were modified to allow concordance with the panel members, other classification errors would result; therefore, these discrepancies were allowed to stand. Thus, there was majority agreement of the expert panel and the computer classification in 91 of 95 (96%) fields.
Validation of change
To test our approach to defining change, we examined the "change" in the study eye visual field from baseline to the randomization visit for IONDT late entry patients. It is reasonable to expect that a majority of late entry patients experienced a change in the central visual field in addition to the measured change in visual acuity. Table 6 shows the unadjusted number and type of central defects observed in visual fields of 47 IONDT late entry patients at the baseline and randomization visits. Using data without any adjustment for normal variation, we found that 14 of 22 (63.6%) patients who had neither a paracentral nor central defect at baseline developed a central defect by the randomization visit. In addition, all 7 patients who started with a paracentral defect developed a central defect by the randomization visit. Of 18 patients starting with a central defect, only one changed to a paracentral defect at randomization.
When we applied the adjustment, SFC, for each patient's normal variation to the visual fields of the late entry eyes, the classification of 2/47 randomization fields was different. Five rather than the initial 7 patients who had a paracentral defect at baseline had a central defect at randomization (see Table 7). All other defect changes remained the same. A Stuart-Maxwell chi-square test of homogeneity showed that the shift in distribution of defects from baseline to randomization as shown in Table 7 is statistically significant (p = 0.0003). There was no observed change in severity (average dB loss) for the central defect of the 17 study participants who had a central defect at both baseline and randomization (mean 11.5 dB versus 6.7 dB at baseline and randomization, respectively; p = 0.09) after SFC adjustment. Figures 8 and 9 show examples of visual fields obtained at baseline and randomization visits in two late entry IONDT study participants; these examples show the type of change detected by the computerized system.
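The Stuart-Maxwell test of marginal homogeneity used above compares the baseline and randomization marginal distributions of a square defect-by-defect cross-classification. The sketch below implements the standard statistic; the example table is invented for illustration (it is not the Table 7 data), and for a 2 × 2 table the statistic reduces to McNemar's test.

```python
import numpy as np

def stuart_maxwell(table):
    """Stuart-Maxwell chi-square test of marginal homogeneity.

    table: square k x k array of counts, rows = baseline category,
    columns = follow-up category. Returns (statistic, df); the p-value
    is the upper tail of a chi-square distribution with df degrees of
    freedom. For k = 2 this reduces to McNemar's test.
    """
    n = np.asarray(table, dtype=float)
    k = n.shape[0]
    row, col = n.sum(axis=1), n.sum(axis=0)
    d = (row - col)[:-1]  # k-1 marginal differences (last is redundant)

    # Covariance matrix of d under the null of marginal homogeneity.
    S = np.diag((row + col - 2 * np.diag(n))[:-1])
    for i in range(k - 1):
        for j in range(k - 1):
            if i != j:
                S[i, j] = -(n[i, j] + n[j, i])

    stat = float(d @ np.linalg.solve(S, d))
    return stat, k - 1

# Illustrative 2x2 table: equals McNemar's (6-2)^2 / (6+2) = 2.0.
stat, df = stuart_maxwell([[5, 6], [2, 5]])
```

A perfectly symmetric table (identical marginals) yields a statistic of zero, as expected when the distribution of defects has not shifted between visits.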
Automated perimetry facilitates the collection of quantitative data on the pattern and severity of visual field defects. To date, however, full use has not been made of quantitative data for detection, characterization, or progression of visual field defects in ischemic optic neuropathy. In the IONDT, we developed and validated a reliable rule-based system. This system proved capable of consistently defining pattern and severity of visual field defects detected in eyes with NAION enrolled in the IONDT. It also allowed for evaluation of change in visual field defect and severity over time. Development of this system entailed devising definitions for visual field defects encountered in NAION patients using an iterative process and expert panel and defining progression. All decision rules were based upon the opinion of the expert panel; these rules then provided the basis by which all field classifications were made. Further testing of the system showed that this rule-based computer program is a valid method for analysis of the patterns and severities of visual field data for individuals with NAION.
Development and validation of a system for classifying visual fields is complex, given that there is no existing "gold standard" for defect classification and that experts were unable to reach agreement on defect classification, at least in this study. This type of problem is well known in medicine. For instance, studies validating the use of computer-assisted diagnosis tools [8, 10, 11] suggest that computer diagnoses differ from human expert diagnoses to about the same extent as human experts disagree among themselves. The diagnostic variability in this study was similar to the performance of humans and computers in validations of other expert systems for interpretations lacking a "gold standard", for which agreement ranged from 50% to 70% [8, 10, 11]. Given that computerized diagnosis may be no better than that of an expert panel, the principal reason for utilizing a computerized system in the context of a clinical trial is to reduce inconsistency by eliminating intra- and inter-observer variability. For example, we found that members of the expert panel often did not classify a visual field the same way they had previously classified it. Thus, use of a computerized system reduces variability, although not necessarily the original bias of the expert panel in classification of visual field defects. Once incorporated into a computer system, the criteria for categorizing the pattern and severity of visual field defects are, according to Hirsbrunner and colleagues, "explicit, obvious, and standardized." Such attributes are essential within the context of randomized clinical trials.
Development and validation of a classification system for visual fields requires several steps [8, 10, 11]. First, an expert panel must achieve consensus on a set of rules for classifying defects. Second, the experts must apply the rules successfully, i.e., with a rate of agreement not meaningfully different from that reported for similar classification systems in other medical contexts. Third, the consistent application of the rules by a computerized system must produce classifications that do not disagree with the panel more than the expert panel disagrees with itself. Finally, the computerized system must produce reasonable defect classifications, defined as classifications with which the expert panel rarely disagrees.
We recognized that more than one interpretation was possible for a given distribution of disturbed points on a visual field and that it was not going to be possible for all the experts to agree on a gold standard to evaluate the computerized system. Thus, we elected to accept the computerized determination given that the panel considered it to be consistent with clinical interpretation.
A quantified or computerized analysis of visual fields that approximates a human interpretation of an automated visual field faces particular challenges in three areas: detection, progression, and characterization of the defect. Difficulties in detection of a defect relate primarily to distinguishing appropriately between short-term and long-term fluctuation. This problem is further compounded in various disease states, such as glaucoma, in which the pathological process itself produces fluctuation in sensitivity. The Ocular Hypertension Treatment Study used multiple confirmation fields to diagnose the presence or absence of a defect and provides an example of a method to deal with clinical detection of visual field defects. More advanced models of visual field perturbations, such as those by De la Rosa and colleagues, utilize an approach for rapid assessment of glaucomatous field defects based upon multiple correlations. Although the IONDT computerized system cannot distinguish between short- and long-term fluctuation when detecting a defect pattern within a single field, it does use a standard set of rules for classification and detection and thus provides for consistent identification and classification of defects.
Progression of field defects is a common end-point for glaucoma studies. The issue, once again, is determining change, but from an abnormal as opposed to a normal baseline. Katz reviewed scoring methods employed by two multicenter clinical trials, the Advanced Glaucoma Intervention Study (AGIS) and the Collaborative Initial Glaucoma Treatment Study (CIGTS). These studies utilized a cumulative score (0–20), based upon depression of adjacent points occurring within specified regions of the visual field. Depression was defined by the total deviation plot on the HVF printout in the AGIS and by probability values in the CIGTS. McNaught and co-workers developed a linear model of pointwise sensitivity values against time to identify progression in normal tension glaucoma. By any of these methods, detection and progression could be determined operationally, based on the sensitivity and reliability required in a particular study. The IONDT used change, defined as decibel loss or an increased number of points within defects identified at baseline, to detect progression using the computerized classification system, after adjusting for measured within-individual variations in performance.
In contrast to detection and progression of visual field defects, characterization is a more complex task. It requires pattern recognition open to multiple interpretations and preferences (e.g., "lumping" versus "splitting"). Typically, glaucoma visual field interpretation does not address characterization. In one of the few clinical trials to utilize pattern recognition as an outcome for visual field testing, the Optic Neuritis Treatment Trial (ONTT) established 15 monocular types of field defects (14 local plus diffuse) of three different severities occurring in optic neuritis. The Director and Associate Director of the ONTT Visual Field Reading Center reviewed visual fields separately, then together, to "reach a consensus on the final classification for each visual field." Initial agreement was noted for 76.3% of the HVFs: 81.5% on location and 74% on shape. Complete agreement in every category was achieved in only 47.4% of 309 affected eyes. In a masked retest, agreement on shape was present for 76.2% of 42 cases [7, 18]. The same investigators have recently developed a similar classification methodology for visual fields obtained in the Ocular Hypertension Treatment Study (OHTS). Complete agreement in classification among three readers was achieved in 64%–66% of defects, and majority agreement was achieved in an additional 31%–33%.
Other methods have been used to characterize visual fields. For example, neural networks have been touted as a means of allowing computers to "learn" how to categorize visual fields correctly, even in the absence of specified rules. In the supervised class of artificial neural networks, the system requires a training set of "correctly" categorized visual fields from which to learn [20–23]. This poses a circularity: in the absence of rules, how is such a training set derived in the first place? Henson and associates suggest that unsupervised neural networks can resolve this dilemma, as they are self-classifying. However, the patterns produced correspond to the number of nodes used in the network and do not necessarily correspond to clinically identified field defects.
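The self-classifying approach can be illustrated with a toy one-dimensional self-organizing map. This is a minimal sketch, not the published method, and it makes the limitation above concrete: the number of output "patterns" is fixed in advance by `n_nodes`, with no guarantee that the prototypes align with clinically meaningful defect types.

```python
import numpy as np

# Toy 1-D self-organizing map (unsupervised): each node holds a prototype
# "field"; inputs are assigned to their best-matching node. Illustrative
# only -- not the network of Henson and associates.
def train_som(fields, n_nodes=4, epochs=200, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(n_nodes, fields.shape[1]))
    for t in range(epochs):
        a = lr * (1 - t / epochs)                        # decaying learning rate
        for x in fields[rng.permutation(len(fields))]:
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))  # best-matching unit
            for j in range(n_nodes):                     # neighborhood update
                w[j] += a * np.exp(-abs(j - bmu)) * (x - w[j])
    return w

def classify(field, w):
    """Assign a field to the index of its nearest prototype node."""
    return int(np.argmin(((w - field) ** 2).sum(axis=1)))
```

Whatever structure the data contain, every input is forced into one of `n_nodes` self-discovered categories — which is precisely why such output need not map onto recognized clinical patterns.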
In designing the computerized system for evaluating IONDT visual fields, we encountered several methodological issues that could have influenced the definitions of defect classification and/or change. First, fields were obtained using the full threshold strategy rather than SITA, which resulted in prolonged testing times; SITA strategies were unavailable at the outset of patient recruitment and had not been completely validated by the end of the study. Second, because the IONDT did not formally train study patients in performing Humphrey visual field examinations before collecting study data, some observed changes may reflect learning effects over time. The importance of such training was not generally recognized in 1992, when the IONDT began. However, the testing method used in the IONDT is probably generalizable, given that most patients in a clinical setting do not undergo visual field training sessions. Despite these methodological issues, the observed changes in pattern classification and severity of IONDT visual fields were remarkably consistent over time, suggesting that substantial classification errors in the computer system algorithms are unlikely.
Another concern relating to study validity was the failure of the 6 experts to agree completely on a sizable proportion of defect classifications for the test fields during the initial validation. The number of experts we included differed substantially from the number used in virtually all other prospective trials involving visual fields; this was a deliberate decision, intended to ensure rigor and to avoid chance agreement. The observed lack of concordance among the 6 experts is most likely due to both the number of experts (6 rather than the usual 2 plus a tie-breaker) and the independence of the reviewers (experts from geographically dispersed clinical centers). Indeed, we believe members of our expert panel were more likely to manifest true independence in decision-making than experts at a single reading center.
In summary, we developed a computerized method for analyzing automated perimetric data in the IONDT. This validated rule-based system is capable of consistently defining the pattern and severity of visual field defects encountered in patients with NAION. Its primary use is in the research setting; these methods are not meant to replace clinical evaluation. Once incorporated into a computer system, the criteria for categorizing the pattern and severity of visual field defects are explicit, transparent, and standardized. Such attributes are essential within the context of randomized clinical trials.
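To make the idea of explicit, standardized rules concrete, here is a deliberately simplified sketch of one rule in the spirit of the system. The thresholds, the "80% of points" fraction, and the function itself are hypothetical — the actual 13 pattern definitions and 3 severity levels are those published by the study — but the sketch shows why a coded rule, unlike a reader's impression, applies identically to every field.

```python
import numpy as np

# Hypothetical, simplified altitudinal-defect rule (NOT the IONDT
# definitions): call a defect altitudinal when most points on one side
# of the horizontal midline are depressed and most on the other are not.
def classify_altitudinal(total_dev, cutoff=-5.0, frac=0.8):
    """total_dev: 2-D grid of total deviation values (dB), rows ordered
    superior to inferior. Returns a pattern label."""
    mid = total_dev.shape[0] // 2
    up_dep = (total_dev[:mid] <= cutoff).mean()   # fraction depressed above
    lo_dep = (total_dev[mid:] <= cutoff).mean()   # fraction depressed below
    if up_dep >= frac and lo_dep <= 1 - frac:
        return "superior altitudinal"
    if lo_dep >= frac and up_dep <= 1 - frac:
        return "inferior altitudinal"
    return "other"
```

Given a 6 × 6 grid with the superior half depressed by 12 dB and the inferior half intact, the rule returns "superior altitudinal" every time it is applied — the reproducibility that motivated encoding the expert panel's definitions in software.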
- AGIS : Advanced Glaucoma Intervention Study
- CIGTS : Collaborative Initial Glaucoma Treatment Study
- HVF : Humphrey Visual Field Analyzer
- IONDT : Ischemic Optic Neuropathy Decompression Trial
- MD : mean deviation, a Humphrey Visual Field Analyzer global index
- NAION : non-arteritic anterior ischemic optic neuropathy
- OHTS : Ocular Hypertension Treatment Study
- ONDS : optic nerve decompression surgery
- ONTT : Optic Neuritis Treatment Trial
- SF : short term fluctuation, a Humphrey Visual Field Analyzer global index
- SFC : the pooled estimate of short term fluctuation across two visits
- SITA : Swedish Interactive Thresholding Algorithm
- VFC : Visual Field Committee for the Ischemic Optic Neuropathy Decompression Trial
Ischemic Optic Neuropathy Decompression Trial Research Group: The Ischemic Optic Neuropathy Decompression Trial (IONDT): design and methods. Control Clin Trials. 1998, 19: 276-296. 10.1016/S0197-2456(98)00003-8.
Ischemic Optic Neuropathy Decompression Trial Research Group: Optic nerve decompression surgery for nonarteritic anterior ischemic optic neuropathy (NAION) is not effective and may be harmful. JAMA. 1995, 273: 625-632. 10.1001/jama.273.8.625.
Advanced Glaucoma Intervention Study investigators: Advanced glaucoma intervention study: 2. Visual field test scoring and reliability. Ophthalmology. 1994, 101: 1445-1455.
Katz J, Sommer A, Gaasterland D, Anderson DR: Comparison of analytic algorithms for detecting glaucomatous visual field loss. Arch Ophthalmol. 1991, 109: 1684-1689.
Leske MC, Heijl A, Hyman L, Bengtsson B: Early Manifest Glaucoma Trial: Design and baseline data. Ophthalmology. 1999, 106: 2144-53. 10.1016/S0161-6420(99)90497-9.
Musch DC, Lichter PR, Guire KE, Standardi CL, the CIGTS Study Group: The Collaborative Initial Glaucoma Treatment Study: study design, methods, and baseline characteristics of enrolled patients. Ophthalmology. 1999, 106: 653-662. 10.1016/S0161-6420(99)90147-1.
Keltner JL, Johnson CA, Beck RW, Cleary PA, Spurr JO, the Optic Neuritis Study Group: Quality control functions of the visual field reading center (VFRC) for the Optic Neuritis Study Group. Control Clin Trials. 1993, 14: 143-159. 10.1016/0197-2456(93)90016-7.
Molino G, Marzuoli M, Molino F, Battista S, Bar F, Torchio M, Lavelle SM, Corless G, Cappello N: Validation of ICTERUS, a knowledge-based expert system for jaundice diagnosis. Methods Inf Med. 2000, 39: 311-318.
Choplin NT, Edwards RP: Basic principles of visual field interpretation. Visual Field Testing with the Humphrey Field Analyzer. 1995, SLACK Incorporated, Thorofare, NJ
Hernandez C, Sancho JJ, Belmonte MA, Sierra C, Sanz F: Validation of the medical expert system RENOIR. Comput Biomed Res. 1994, 27: 456-71. 10.1006/cbmr.1994.1034.
Sutton GC: How accurate is computer-aided diagnosis?. Lancet. 1989, 2 (8668): 905-908. 10.1016/S0140-6736(89)91560-2.
Hirsbrunner H-P, Fankhauser F, Jenni A, Funkhouser A: Evaluating a perimetric expert system: experience with Octosmart. Graefes Arch Clin Exp Ophthalmol. 1990, 228: 237-241. 10.1007/BF00920027.
Werner EB, Saheb N, Thomas D: Variability of static visual threshold responses in patients with elevated IOPs. Arch Ophthalmol. 1982, 100: 1627-1631.
Keltner JL, Johnson CA, Quigg JM, Cello KE, Kass MA, Gordon MO for the Ocular Hypertension Treatment Study Group: Confirmation of visual field abnormalities in the Ocular Hypertension Treatment Study. Arch Ophthalmol. 2000, 118: 1187-1194.
De la Rosa MG, Reyes JAA, Sierra MAG: Rapid assessment of the visual field in glaucoma using an analysis based on multiple correlations. Graefes Arch Clin Exp Ophthalmol. 1990, 228: 387-391. 10.1007/BF00927247.
Katz J: Scoring systems for measuring progression of visual field loss in clinical trials of glaucoma treatment. Ophthalmology. 1999, 106: 391-395. 10.1016/S0161-6420(99)90052-0.
McNaught AI, Crabb DP, Fitzke FW, Hitchings RA: Modelling series of visual fields to detect progression in normal-tension glaucoma. Graefes Arch Clin Exp Ophthalmol. 1995, 233: 750-755. 10.1007/BF00184085.
Keltner JL, Johnson CA, Spurr JO, Beck RW, Optic Neuritis Study Group: Baseline visual field profile of optic neuritis: the experience of the Optic Neuritis Treatment Trial. Arch Ophthalmol. 1993, 111: 231-234.
Keltner JL, Johnson CA, Cello KE, Edwards MA, Bandermann SE, Kass MA, Gordon MO for the Ocular Hypertension Study Group: Classification of visual field abnormalities in the Ocular Hypertension Treatment Study. Arch Ophthalmol. 2003, 121: 643-65. 10.1001/archopht.121.5.643.
Brigatti L, Hoffman D, Caprioli J: Neural networks to identify glaucoma with structural and functional measurements. Am J Ophthalmol. 1996, 121: 511-521.
Goldbaum MH, Sample PA, Chan K, Williams J, Lee TW, Blumenthal E, Girkin CA, Zangwill LM, Bowd C, Sejnowski T, Weinreb RN: Comparing Machine Learning Classifiers for Diagnosing Glaucoma from Standard Automated Perimetry. Invest Ophthalmol Vis Sci. 2002, 43: 162-169.
Keating D, Mutlukan E, Evans A, McGarvie J, Damato B: A back propagation neural network for the classification of visual field data. Phys Med Biol. 1993, 38: 1263-1270. 10.1088/0031-9155/38/9/006.
Kelman SE, Perell HF, D'Autrechy L, Scott RJ: A neural network can differentiate glaucoma and optic neuropathy visual fields through pattern recognition. Perimetry Update 1990/91. Proceedings of the Sixth International Perimetric Society Meeting, Malmo, Sweden, June 17–20, 1990. Edited by: Mills RP, Heijl A. 1991, Amsterdam/New York: Kugler Publications, 287-290.
Henson DB, Spenceley SE, Bull DR: Spatial classification of glaucomatous visual field loss. Br J Ophthalmol. 1996, 80: 526-531.
Feldon SE: Computerized expert system for evaluation of automated visual fields from the Ischemic Optic Neuropathy Decompression Trial: Methods, baseline fields, and six-month longitudinal follow-up. Trans Am Ophthalmol Soc. 2004, 102: 269-303.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2415/6/34/prepub
The Ischemic Optic Neuropathy Decompression Trial study was supported under cooperative agreements by the National Eye Institute, Bethesda, Maryland, EY09608, EY09545, EY09556, EY09555, EY09554, EY09576, EY09565, EY09551, EY09599, EY09584, EY09578, EY09572, EY09575, EY09567, EY09598, EY09550, EY09553, EY09566, EY09569, EY09579, EY09571, EY09568, EY09557, EY09552, EY09570, EY09582, and EY09626 and a Challenge Grant to University of Rochester Department of Ophthalmology from Research to Prevent Blindness.
Members of the Ischemic Optic Neuropathy Decompression Trial Research Group were as follows.
Allegheny General Hospital, Pittsburgh, Pa: John Kennerdell, MD (Principal Investigator); Anna Bruchis, MD (Coordinator);
St. Louis University Eye Institute, St. Louis University Health Sciences Center, St. Louis, Mo: Sophia Chung, MD (Principal Investigator); Dawn Govreau (Coordinator); John Holds, MD; John Selhorst, MD;
Carolinas Medical Center, Charlotte, NC: Mark Malton, MD (Principal Investigator); Amy Rogers (Coordinator); Timothy Saunders, MD
Cleveland Clinic Foundation, Cleveland, Oh: Gregory Kosmorsky, DO (Principal Investigator); Karen King, COT (Coordinator); Tami Fecko; Deborah Ross, CRA;
Doheny Eye Institute, Los Angeles, Ca: Steven Feldon, MD (Principal Investigator); Lori Levin, MPH (Coordinator); Kerry Zimmerman, MS (Coordinator); Kathy Friedberg, COMT; Nahid Sadaati, CO;
Emory University, Atlanta, Ga: Nancy J. Newman, MD (Principal Investigator); Donna Loupe, BA (Coordinator); Ted Wojno, MD
Henry Ford Hospital, Detroit, Mi: Barry Skarf, MD (Principal Investigator); Mark Croswell; Wendy Gilroy Clements; George Ponka, COMT;
University of Texas, Houston, Tx: Rosa Tang, MD (Principal Investigator); Melissa Hamlin (Coordinator); Jewel Curtis; Kirk Mack; Portia Tello;
Jules Stein Eye Institute, Los Angeles, Ca: Anthony Arnold, MD (Principal Investigator); Janet Buckley (Coordinator); Robert Goldberg, MD; Lynn Gordon, MD; Howard Krauss, MD; Robert Stalling;
W.K. Kellogg Eye Center, University of Michigan, Ann Arbor, Mi: Wayne Cornblath, MD (Principal Investigator); Barbara Michael (Coordinator);
Mason Institute of Ophthalmology, University of Missouri, Columbia, Mo: Lenworth N. Johnson, MD (Principal Investigator); Gaye Baker (Coordinator); Coy Cobb, CRA, COT; Sharon Turner, COT;
Mayo Clinic, Rochester, Mn: Brian Younge, MD (Principal Investigator); Jacqueline Leavitt, MD (Co-Principal Investigator); Rebecca Nielsen, LPN (Coordinator); Barbara Eickhoff, COT; James Garrity, MD; Jacqueline Ladsten; Kathleen Lebarron; Thomas Link, BA; Jay Rostvold; Karen Weber
Medical College of Virginia, Richmond, Va: Warren Felton III, MD (Principal Investigator); Tammy Anderson (Coordinator); George Sanborn, MD;
Michigan State University, East Lansing, Mi: David Kaufman, DO (Principal Investigator); Eric Eggenberger, DO (Co-Principal Investigator); Suzanne Bickert, RN (Coordinator); Robert Granadier, MD; Sandra Holliday; Thomas Moore, MD;
State University of New York, Syracuse, NY: Deborah Friedman, MD (Principal Investigator); Patricia Jones (Coordinator); Thomas Bersani, MD;
University of California, San Francisco, Ca: Jonathan Horton, MD (Principal Investigator); Maeve Chang, BA (Coordinator); Lou Anne Aber, COA; Stuart Seiff, MD
University of Florida, Gainesville, Fl: John Guy, MD (Principal Investigator); Z. Suzanne Zam, BS (Coordinator); Revonda Burke (Coordinator);
University of Illinois, Chicago, Il: James Goodwin (Principal Investigator); Allen Putterman, MD
University of Kentucky, Lexington, Ky: Robert Baker, MD (Principal Investigator); Judy Beck (Coordinator); Michael Hanson; Toni Scoggins, COA
University of Maryland, Baltimore, Md: Shalom Kelman, MD (Principal Investigator); Charlotte Frank (Coordinator); Rani Kalsi;
University of South Carolina, Columbia, SC: Kakarla Chalam, MD (Principal Investigator); Shirley Hackett (Coordinator);
University of Utah, Salt Lake City, Ut: Kathleen Digre, MD (Principal Investigator); Jolyn Erickson (Coordinator); Terrell Blackburn (Coordinator, 1992–1993, deceased); Richard Anderson, MD; Paul Langer, MD; Paula Morris; Sandra Osborn; Bhupendra Patel, MD; Sandra Staker; Judith Warner, MD
University of Virginia, Charlottesville, Va: Steven Newman, MD (Principal Investigator); Christine Evans, COMT (Coordinator); Carolyn Harrell, COA; Helen Overstreet, RN; James Scott, RBP; Lillian Tyler, COA
West Virginia University, Morgantown, WV: John Linberg, MD (Principal Investigator); Brian Ellis, MD (Principal Investigator); Charlene Campbell, COT; Gordon McGregor;
William Beaumont Hospital, Royal Oak, Mi: Edward Cohn, MD (Principal Investigator); Kristi Cummings (Coordinator); Patricia Manatrey (Coordinator); Sara Casey; Robert Granadier, MD; Virginia Regan; David Roehr; Patricia Streasick
Chairman's Office, University of Maryland School of Medicine, Baltimore, Md (1992–2003): Shalom Kelman, MD (Study Chairman); Michael Elman, MD (Vice Chairman); Charlotte Frank, MS (Administrator);
Coordinating Center, University of Maryland School of Medicine, Baltimore, Md (1992–1998): Kay Dickersin, PhD (Director); Frank Hooper, ScD (Deputy Director); Roberta Scherer, PhD (Project Coordinator); Barbara Crawley, MS; Michael Elman, MD (1992–1994); Cheryl Hiner; Lucy Howard; Patricia Langenberg, PhD; Olga Lurye; Janet Masiero, MBA; Robert McCarter, ScD; Sara Riedel; Michelle Sotos; Laureen Spioch; Joann Starr (1992–1994); Judy Urban; Mark Waring; P. David Wilson, PhD; Jie Zhu; Qi Zhu, MS;
Coordinating Center, Brown University School of Medicine, Providence, RI (1998–2005): Kay Dickersin, PhD; Laureen Spioch (1998–2000); Jie Zhu (1998–2004); Qi Zhu, MS (1998–2005).
Coordinating Center, Johns Hopkins Bloomberg School of Public Health, Baltimore, Md (2005–present): Kay Dickersin, PhD; Roberta Scherer, PhD (Co-Investigator)
National Eye Institute, Bethesda, Md: Donald Everett, MA
Data Analysis Committee: Barbara Crawley, MS; Kay Dickersin, PhD; Frank Hooper, ScD; Patricia Langenberg, PhD; Robert McCarter, ScD; Roberta Scherer, PhD; P. David Wilson, PhD
Data and Safety Monitoring Committee: Marian Fisher, PhD (Chair); Phil Aitken, MD; Roy Beck, MD; Andrea LaCroix, PhD; Simmons Lessell, MD; Reverend Kenneth MacLean; Kay Dickersin, PhD (ex officio); Michael Elman, MD (ex officio, 1992–1994); Donald Everett, MA (ex officio); Shalom Kelman, MD (ex officio)
Executive Committee: Shalom Kelman, MD (Chair); Kay Dickersin, PhD; Michael Elman, MD (1992–1994); Donald Everett, MA; Frank Hooper, ScD
Quality Assurance Committee: Frank Hooper, ScD, (Chair); Shalom Kelman, MD; Roberta Scherer, PhD;
Steering Committee: Shalom Kelman, MD (Chair); Kay Dickersin, PhD; Michael Elman, MD (1992–1994); Donald Everett, MA; Steven Feldon, MD; Frank Hooper, ScD; David Kaufman, DO; Nancy J. Newman, MD; Z. Suzanne Zam, BS
Surgical Quality Assurance Committee: Robert Baker, MD; Steven Feldon, MD; Robert Granadier, MD; Frank Hooper, ScD; Shalom Kelman, MD; Gregory Kosmorsky, DO; Stuart R. Seiff, MD
Visual Field Committee: Steven E. Feldon, MD (Chair); Anthony Arnold, MD; Sophia Chung, MD; Kay Dickersin, PhD; Lenworth N. Johnson, MD; Joanne Katz, ScD; Gregory Kosmorsky, DO; Patricia Langenberg, PhD; Lori Levin, MPH; Steven A. Newman, MD; Roberta W. Scherer, PhD; P. David Wilson, PhD
The author(s) declare that they have no competing interests.
SEF conceived the computer expert system, contributed to the design of the development and validation studies, chaired the expert panel, interpreted the study findings, and contributed to writing and editing the manuscript. LL coordinated activities of the expert panel, wrote the algorithm for the computer system. RWS contributed to the design of the development and validation studies, coordinated the validation studies, contributed to interpreting the study findings, and contributed to writing and editing the manuscript. AA collaboratively developed the algorithms for identification of the visual field patterns and participated in the validation studies as a member of the expert panel. SMC collaboratively developed the algorithms for identification of the visual field patterns and participated in the validation studies as a member of the expert panel. LNJ collaboratively developed the algorithms for identification of the visual field patterns and participated in the validation studies as a member of the expert panel. GK collaboratively developed the algorithms for identification of the visual field patterns and participated in the validation studies as a member of the expert panel. SAN collaboratively developed the algorithms for identification of the visual field patterns and participated in the validation studies as a member of the expert panel. JK contributed to the design of the development and validation studies, interpretation of study findings. PL contributed to the design of the development and validation studies, provided statistical advice, and contributed to writing and editing the manuscript. PDW contributed to the design of the development and validation studies, provided statistical advice, and contributed to writing and editing the manuscript. SEK served as scientific Chair of the Ischemic Optic Neuropathy Decompression Trial, contributed to the design of the study. 
KD contributed to the design of the development and validation studies, interpreted the study findings, and contributed to writing and editing the manuscript. All authors read and approved the final manuscript.
About this article
Cite this article
Feldon, S.E., Levin, L., Scherer, R.W. et al. Development and validation of a computerized expert system for evaluation of automated visual fields from the Ischemic Optic Neuropathy Decompression Trial. BMC Ophthalmol 6, 34 (2006). https://doi.org/10.1186/1471-2415-6-34
- Visual Field
- Expert Panel
- Visual Field Defect
- Optic Neuritis Treatment Trial
- Advanced Glaucoma Intervention Study