We evaluated the effect of combining SAP and OCT measurements on the ability of ANN classifiers to discriminate between normal and glaucomatous tests. We have previously demonstrated that the use of pre-processed RNFLT measurements based on A-scans improved the diagnostic performance of MLCs compared to the conventional RNFLT parameters presented by the instrument. For SAP, the Pattern Deviation probability plots and maps provide probability values for all test points, highlight those points with values falling outside the age-corrected normal limits, and also account for the effects of media opacities on light sensitivity across the visual field. The performance benefits of input data based on pattern deviation scores have been shown previously.
The combination of the structural and functional information contained in the OCT and SAP test data, respectively, can be viewed as a type of information integration. The simplest way to integrate the different types of data is to construct a vector that consists of all OCT and SAP measurements. We additionally attempted to construct and evaluate the performance of novel input parameters that fuse both structural and functional measurements. Integrating information about the structure-function relationship of glaucomatous damage through data fusion presents some advantages over the simple combination of the two different types of data. Instead of relying on MLCs to learn about the structure-function relationship from limited training data, the fusion process allows prior knowledge about the topographic relationship between structural and functional measurements, obtained from other large, independent datasets, to be incorporated directly into the classification problem. Controlling the incorporation of knowledge into MLCs can also counteract the lack of insight into the way stochastic processes like ANNs represent and use the acquired knowledge in their classification decisions. Our ANNs with input based on the novel parameters showed a high degree of agreement in their classification decisions, as reflected in the presented odds ratio values (Figure 4). The higher odds ratios for the ANNs based on fused input data could indicate that these classifiers are more robust, since the likelihood of a false positive or false negative test result by both the fused OCT- and SAP-based ANNs was significantly lower.
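As an illustration of how agreement between two classifiers can be summarized by an odds ratio, the following minimal sketch cross-tabulates the correct and incorrect decisions of two classifiers on the same tests. This is one common formulation and is only assumed to resemble the computation behind Figure 4; the function name `agreement_odds_ratio`, the continuity correction and the example data are hypothetical.

```python
import numpy as np

def agreement_odds_ratio(correct_a, correct_b, continuity=0.5):
    """Odds ratio from the 2x2 table of correct/incorrect decisions.

    correct_a, correct_b: boolean sequences, True where each classifier
    classified the test correctly. `continuity` is added to every cell
    (an assumed safeguard against division by zero in small samples).
    """
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("both classifiers must be scored on the same tests")

    both_right = np.sum(a & b) + continuity
    both_wrong = np.sum(~a & ~b) + continuity
    only_a_right = np.sum(a & ~b) + continuity
    only_b_right = np.sum(~a & b) + continuity

    # Concordant cells in the numerator, discordant cells in the denominator:
    # values well above 1 mean the two classifiers tend to succeed and fail
    # on the same tests.
    return (both_right * both_wrong) / (only_a_right * only_b_right)

# Made-up example: eight tests scored for two hypothetical classifiers.
oct_ann_correct = [1, 1, 1, 1, 0, 0, 0, 1]
sap_ann_correct = [1, 1, 1, 0, 0, 0, 1, 1]
ratio = agreement_odds_ratio(oct_ann_correct, sap_ann_correct)
```

An odds ratio near 1 would indicate that the errors of the two classifiers are essentially unrelated, whereas larger values indicate that their decisions tend to coincide.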
Bowd et al. have previously shown that MLCs trained on combinations of OCT- and SAP-derived input performed at least as well as MLCs trained on each input type alone, while the use of data with reduced complexity (by means of the backward elimination technique) further improved MLC performance. Our results did not show a significant improvement when using input that simply combined OCT and SAP measurements compared to using SAP or OCT measurements separately. However, the combination of fused OCT and SAP parameters showed a significant improvement compared to ANNs based on SAP parameters alone, and compared to the best-performing commercially available algorithms in both the SAP and Stratus OCT instruments. This improvement was not specific to our ANN, but could also be seen with another MLC, a relevance vector machine (RVM) classifier, that we constructed and tested for comparison purposes. We did not report the results of our RVM since its performance was very similar to that of our ANN.
The use of principal component analysis for dimensionality reduction of the OCT and fused OCT data instead of a non-linear dimensionality reduction algorithm could have affected the results. Even though non-linear dimensionality reduction techniques might provide better representations of complex data, their extensions to new data are iterative in nature without exact numerical solutions in most cases.
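The practical appeal of principal component analysis in this setting is that new test data can be projected with a single closed-form matrix product, without any iterative fitting. A minimal sketch with NumPy follows; the function names `fit_pca` and `project`, the array sizes and the random data are illustrative assumptions and do not reproduce the actual OCT preprocessing.

```python
import numpy as np

def fit_pca(X, n_components):
    """Estimate the PCA mean and principal axes from training data only."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X_new, mean, components):
    """Exact out-of-sample extension: center, then one matrix product."""
    return (np.asarray(X_new, dtype=float) - mean) @ components.T

# Synthetic stand-in data: 50 training tests and 5 new tests, 6 features each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 6))
X_new = rng.normal(size=(5, 6))

mean, components = fit_pca(X_train, n_components=3)
Z_new = project(X_new, mean, components)  # reduced input for a classifier
```

Because the mean and axes are estimated on the training data alone, the same projection can be reused inside each cross-validation fold without leaking information from the test cases.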
The performance of machine learning classifiers depends on their training process. During training, it is important to present learning examples with a known outcome (i.e. 'true' normal and 'true' glaucoma cases) that cover all disease stages, so that the MLC can create representative classification decision boundaries. The inclusion of cases with an uncertain condition (i.e. patients characterized as glaucoma suspects) would adversely affect the false positive and false negative rates of classification, and therefore our evaluation of the specificity and sensitivity of the classifier.
The recruitment of healthy persons was based on a random population sample, with the majority of individuals having no previous experience of ophthalmic examinations. In our attempt to include healthy individuals that do not represent supernormal subjects, we did not exclude persons with cataract, since it is a condition often seen in older population groups and in patients with glaucoma. The rates of misclassified tests could be partly explained by our choice of a reference standard based on ONH morphology, which did not exclude patients with normal SAP and OCT test results. The bias in selecting a structure- or function-related reference standard affects the accuracy of combinatorial analyses through erroneous estimates of specificity, sensitivity and correlation measures of the examined structural and functional parameters. We did not base the definition of normality and glaucoma on either SAP or OCT test indices. Our choice of reference standard was instead based on clinical examination of ONH morphology. Even though this structure-based reference standard relates more to RNFL morphology than to function as measured by the visual field, it has not shown a high degree of correlation with OCT measurements. The significant differences in age and refraction between healthy individuals and glaucoma patients are accounted for in both the pattern deviation probability-based SAP input and the age- and refraction-corrected OCT input. Even though the 10-fold cross-validation process can account for certain bias pertaining to sample variability, further evaluation on an independent group of subjects is needed to support the general applicability of our findings. Future studies should also evaluate the fusion process with data from the new generation of spectral-domain OCT instruments, which provide higher spatial resolution and improved algorithms for detecting and analyzing the RNFL.
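For readers less familiar with the procedure, a 10-fold cross-validation split can be sketched as follows. Stratifying the folds by diagnosis, the function name `stratified_kfold_indices` and the group sizes are assumptions made for illustration and are not taken from the study protocol.

```python
import numpy as np

def stratified_kfold_indices(labels, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every case is tested exactly once.

    Cases of each class are shuffled and dealt round-robin across folds,
    so each fold keeps roughly the overall healthy/glaucoma proportion.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(labels.shape[0], dtype=int)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        fold_of[idx] = np.arange(idx.size) % n_folds
    for k in range(n_folds):
        yield np.flatnonzero(fold_of != k), np.flatnonzero(fold_of == k)

# Hypothetical sample: 60 healthy (0) and 40 glaucoma (1) tests.
y = np.array([0] * 60 + [1] * 40)
for train_idx, test_idx in stratified_kfold_indices(y, n_folds=10, seed=1):
    # Train the classifier on train_idx and score it on test_idx here.
    pass
```

Each fold is used once for testing while the remaining nine are used for training, which reduces the dependence of the estimated sensitivity and specificity on any single partition but does not replace validation on an independent group of subjects.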
The incorporation of knowledge about known rules into black box classifiers could enable the construction of ANN-based systems that are more closely related to grey box models (i.e. models with known general structure but also unknown parameters), allowing for greater insight into the classification process and more effective sensitivity analyses of the test input parameters. Such advantages could facilitate the practical deployment of ANNs as decision support systems in glaucoma diagnostics.