Strategies to estimate the characteristics of 24-hour IOP curves of treated glaucoma patients during office hours

Background It is known that office-hour measurements might not adequately estimate IOP mean, peaks and fluctuations in healthy subjects. The purpose of the present study is to verify whether office-hour measurements in patients in different body positions can estimate the characteristics of 24-hour intraocular pressure (IOP) in treated POAG patients. Methods The 24-hour IOP curves of 70 eyes of 70 caucasian patients with treated glaucoma were analyzed. Measurements were taken at 9 AM; 12, 3, 6, and 9 PM; and 12, 3, and 6 AM, both in the supine (TonoPen XL) and sitting (Goldmann tonometer) positions. The ability of five strategies to estimate IOP mean, peak and fluctuation was evaluated. Each method was analyzed both with regression of the estimate error on the real value and with “hit or miss” analysis. Results The least biased estimate of the Peak IOP was obtained using measurements from both supine and sitting positions, also yielding the highest rate of correct predictions (which was significantly different from 3 of the remaining 4 strategies proposed, p < 0.05). Strategies obtained from the combination of supine, sitting and peak measurements resulted to be least biased for the Mean IOP and the IOP Fluctuation estimate, but all strategies were not found significantly different in terms of correct prediction rate (the only significant difference being between the two strategies based on sitting or supine measurements only, with the former being the one with the highest correct prediction rate). Conclusions The results of this study remark the concept that IOP is a dynamic parameter and that intensive measurement is helpful in determining its characteristics. All office-hour strategies showed a very poor performance of in correctly predicting the considered parameters within the thresholds used in this paper, all scoring a correct prediction rate below 52 %.


Background
Intraocular pressure (IOP) is the main risk factor for the development and progression of glaucoma [1][2][3][4] and the only treatable one. IOP parameters (mean, peak and fluctuation) should be measured at diagnosis and strictly monitored in order to address the efficacy of IOPlowering interventions. Mean IOP has been consistently recognized as a major risk factor for glaucoma and its progression [1,2,4] whereas the role of peak [5,6] and fluctuation [7][8][9][10][11][12] as independent risk factors is still controversial.
Many studies report that mean IOP is not significantly different when measured during office-hour and 24-hour [13][14][15][16], but that office-hour data may significantly underestimate IOP peak and fluctuation: the majority of glaucoma patients had their IOP peaks outside office hours, most frequently occurring in night hours [14,[16][17][18][19].
In addiction to that supine office hour IOP measurements were described to better estimate IOP peaks than sitting measurements alone [16].
The most precise procedure to investigate IOP characteristics is 24-hour phasing [13-16, 20, 21] though it is unpractical, expensive and can be performed in a small subgroup of patients in few institutions. In addiction to that, due to the unavailability of IOP-measuring techniques that can be used while the patient sleeps, nighttime evaluations require awakening of patients, potentially causing artifacts related to stress.
In 2009 the group of Leonardi developed a disposable contact lens sensor (CLS) that allows continous IOP monitoring (Sensimed AG, Lausanne, Switzerland) [22]. Nowadays the major limitation of this technology is the fact that the results of IOP evaluations are not provided in the habitual mmHg units but in an arbitrary unit and a direct comparison between the two methods cannot be performed yet [23].
The difficulty in obtaining 24-hour curves and the possible discrepancies between 24-hour and office-hour data led our [14] and other groups [16,18,19] to develop strategies to estimate 24-hour parameters by office-hour data. We showed that the collection of supine and sitting office-hour measurements may enhance the correct identification of 24-hour IOP characteristics in both healthy subjects and untreated POAG [14].
The purpose of the present study was to verify whether office-hour measurements taken in different body positions can estimate the characteristics of 24-hour IOP in POAG patients using different IOP-lowering treatments.

Methods
This study was a retrospective analysis of 24-hour IOP curves of POAG treated patients, collected in the context of clinical trials investigating the circadian effect of antiglaucoma drugs. It was conducted at the Eye Clinic of San Paolo Hospital, University of Milan, Italy, after approval by the local Ethics Committee of San Paolo Hospital in Milan, and according to the tenets of the Declaration of Helsinki and national laws for the protection of personal data. Written informed consent was obtained from all the study participants.

Study population
Seventy caucasian patients were enrolled (39 men and 31 women): 19 of them were treated with timolol (twice a day), 29 with latanoprost (once a day), ten with brimonidine (twice a day) and 12 with the fixed combination dorzolamide/timolol (FCDT, twice a day). These treatments options were part of their standard care.
To be included in the study, patients had to have glaucomatous visual fields (abnormal mean defect and corrected pattern standard deviation on at least two consecutive, reliable Humphrey 30-2 full-threshold tests), optic nerve head (ONH) changes (presence of concentric enlargement of the optic cup, localized notching, or both, as evaluated by means of color stereophotographs), and/or retinal nerve fiber layer (RNFL) defects (presence of focal or diffuse neuroretinal rim thinning, as evaluated by means of a scanning laser ophthalmoscope). Patients with ocular hypertension were excluded. Patients with untreated POAG were not included in this study.
Exclusion criteria included angle-closure glaucoma, secondary glaucomas, corneal abnormalities preventing reliable IOP measurement, previous filtration surgery, having one eye, pregnancy, significant disturbances of wake-sleep rhythms, and/or the regular use of hypnotic drugs reported by the patients. Eligibility was verified by means of a complete ophthalmic assessment.

Twenty-four-Hour IOP evaluation
The methodology to assess 24-hour IOP is described in previous papers [14,24] and summarized in Fig. 1.
The patients were hospitalized in the morning at 7 AM and stayed for the following 24 h. The awake period lasted from approximately 6:30 AM to 11:00 PM. IOP was measured at 9 AM; 12,3,6, and 9 PM; and 12, 3 and 6 AM both in the supine and sitting positions.
For the daytime measurements (9 AM-9 PM), patients were asked to go to bed and relax for approximately 15 min, after which supine IOP was measured in both eyes. After approximately 10 min, a second IOP value was measured at the slit lamp. During the night, the patients were awakened approximately 10 min before each measurement to prevent a sudden increase in IOP. The IOP supine measurements were taken with a handheld electronic tonometer (TonoPen XL; Bio-Rad, Glendale, CA); the IOP sitting measurements on the other hand were taken with Goldmann applanation tonometer at the slit lamp. Every measurement by TonoPen XL consisted of a variable number of readings until the coefficient of variation was less than 5 %. All measurements were taken at each time point by two well-trained glaucoma specialists who had obtained good accordance between their measurements (κ = 0.82 with both tonometers). If the measurements differed by >2 mm Hg, a third measurement was taken; the mean of two or the median of three recordings was used for the analysis.

Peak IOP estimator strategies
Five parameters were tested in their ability of extrapolating peak IOP from office-hour readings: 1. The highest value obtained from the office-hour curve in the sitting position.

Mean IOP and IOP fluctuations estimator strategies
The 24-hour mean IOP and IOP fluctuations in habitual body position were compared to those calculated from: 1. Office-hour readings only in the sitting position (four measurements). 2. Office-hour readings only in the supine position (four measurements). 3. Office-hour sitting readings (four measurements) + the peak IOP, as estimated with the better of the previous formulas. 4. A combination of sitting and supine office-hour readings (four + four measurements). 5. A combination of sitting and supine office-hour readings (four + four measurements) + the estimated peak IOP.

Statistical analysis
We considered the 24-hour curves obtained in habitual body position-that is sitting readings during waking time (from 9 AM to 9 PM) and supine readings during night time (from 12 to 6 AM). These curves were compared to the readings of the same 24-hour curves obtained during office hours (from 9 AM to 6 PM) in both supine and sitting positions in order to evaluate the ability of off-h readings to predict 24-h characteristics.
The following parameters were calculated: mean and range of the difference between estimate and 24-hour IOP parameter (expressed as absolute values, i.e., both an underestimation of −4 mm Hg and an overestimation of +4 mm Hg counting as 4 mm Hg. Linear models and generalized linear models were used to assess the quality of the analyzed estimators. First, we analyzed the relation between the real value to be estimated and the estimate error (specifically the difference between the estimate and the real value). For most of the cases, a linear regression approach properly modeled the relation between the real value and the estimate. For two strategies (Strategy 1 for the peak value and Strategy 1 for the fluctuations) a Zero Inflated Compound Poisson Model (ZICP) was used to model the estimate error dependency on the real value due its peculiar skewed distribution and high zero counts. A ZICP model can be broken down in two parts: the first part uses a binomial distribution to model the zero/non zero outcome of the Estimate Error, while the second part models the distribution of the continuous Estimate Error value when not zero.
Next, we adopted a "hit or miss" approach to assess the error rate of each strategy, considering an estimate error within ± 2 mmHg as a correct prediction for the Peak IOP and Mean IOP, ± 1 mmHg for IOP Fluctuations. In this context, a multinomial logit model was used on the various strategies to assess the odds of overestimation, underestimation and correct prediction. For the two strategies for which the ZICP was used, a completely correct prediction odds (0 mmHg error) was also derived from the binomial part of the model.
All numeric predictors (i.e. the real values) were mean centered, so the intercept corresponds to the estimates for the mean predictor value. Then, an analysis by treatment groups was conducted to assess differences in estimation. In this case a classical ANOVA approach was chosen and a post-hoc correction (Tukey-Kramer) was used for multiple comparisons.
Finally, an overall comparison of the strategies was performed fitting a logit model for each feature considered (Mean, Peak and Fluctuation) with the Strategy as the predictor (where each strategy represented a level) and "Hit" or "Miss" as the response variable, according to the above classification of hits or misses. A Subject random effect was included to account for the fact that each strategy estimated the same real value for each subject. Then a post hoc analysis (Tukey-Kramer) was performed to compare the various strategies in terms of odds of hits or misses.
All calculations were made in R scripting environment.

Results
The mean age was 73.1 ± 9.09 years. 24-hour IOP data in habitual body position are shown in Table 1.
Mean and peak IOP were lower in patients treated with latanoprost than in patients treated with timolol (respectively P = 0.03 and 0.05); significant differences were not found among the other groups. Differences in fluctuation between groups were negligible.
The analysis focused on the accuracy of predictions from the various methods tested. Each method was analyzed both with regression of the estimate error (estimated valuereal value) on the real value and with "hit or miss" analysis. For each of the three variables analyzed (Mean, Peak and Fluctuation) no significant differences were found among the treatment groups (p-values all greater than 0.13, ANOVA).

Regression analysis Peak estimate
The five strategies presented above were analyzed, in terms of difference of the estimated value from the corresponding real value (Estimate Error). Strategies from 2 to 5 were analyzed using simple regression with Gaussian error distribution using the real Peak IOP value as a predictor of the Estimate Error. The results are reported in Table 2. All strategies showed a significant dependency of the Estimate Error on the real Peak value with a negative slope, i.e. overestimation was more likely for smaller values and underestimation for larger values. Strategy 5 showed the smallest slope, yielding the least biased estimate of the Peak value.
Strategy 1 required a detailed analysis due to its peculiar Estimate Error distribution: only underestimation errors were allowed with a relatively high rate of correct prediction (zero Estimate Error). To properly model such negatively skewed distribution with high zero counts we reversed the sign of the Error and used a ZICP model. This allowed accurate modeling of the odds of correct predictions and of the skewed negative errors. Particularly, the odds of correct prediction did not depend on the peak value (p-value = 0.3) and was significantly different from 1 (odds = 0.52, probability of correct prediction = 0.34, p-value < 0.01). The mean prediction for non zero errors was significantly dependent on the real peak IOP value (p < 0.001) in a non linear fashion, as depicted in Fig. 2.

Mean estimate
Five strategies for estimating the Mean IOP value were analyzed as for the Peak strategies. In this case only linear regression analysis was necessary. Results are presented in Table 3. Strategies 3 and 4 showed a non significant dependency on the real Mean IOP value and a global mean non significantly different from zero (pvalues > 0.05). Strategy 1 and 5 showed a non significant dependency of the Estimate Error on the Real Mean IOP but had a significant offset (negative for Strategy 1, positive for Strategy 5).

Fluctuation estimate
As for the Peak Value, Strategies from 2 to 5 were analyzed using simple regression with Gaussian error distribution using the real value as a predictor of the estimate error. The results are reported in Table 4.
As in the Peak estimate, one of the Strategies, specifically Strategy 1, showed a limiting effect to overestimation, yielding a relatively high rate (although much lower than for Strategy 1 for the Peak value) of correct predictions (zero error) and a negatively skewed error distribution. Again, this was modeled using a ZICP model and the odds of correct prediction did not depend on the real fluctuation value (p-value = 0.60) and was significantly different from 1 (odds = 0.13, probability of correct prediction = 0.11, p-value < 0.01). The mean prediction for non zero errors was significantly dependent on the real IOP fluctuation value (p < 0.001) in a non linear fashion, as depicted in Fig. 3.

Hit or miss analysis
A second step of the analysis was aimed at the characterization, for each strategy, of the probability of yielding clinically reliable estimates of the quantities of interest. To test this, we chose a rage of clinical tolerance (±2 mmHg for the IOP Peak estimate and the Mean IOP estimate, ± 1 for the IOP Fluctuation estimate). Errors within the tolerance range were considered as "Hit", errors outside the range were counted as "Miss" (subdivided in overestimates, "Over" and underestimates, "Under"). Then, a multinomial logit model was used for each strategy to model the "hit or miss". This approach had the advantage of allowing a comparison the different strategies independent of the error distribution. Hits were used as the reference category, so the model summary tables (see below) present the logit coefficients of "Over" versus "Hit" and "Under" versus "Hit". Table 5 shows the results for the multinomial logit model for IOP Peak strategies. Each model compares the selected strategy in terms of odds of overestimation or underestimation with respect to correct hits (within ±2 mmHg). The first Strategy had no "Over"  Figure 4 shows scatter plots of the estimated values versus the real values, with misses highlighted in red. Table 6 shows the results for the multinomial logit model for Mean IOP strategies. Each model compares the selected strategy in terms of odds of overestimation or underestimation with respect to correct hits (within ±2 mmHg). Strategy 2 showed the highest rate of Overestimates, being even higher that correct Hits. Strategy 1 showed the highest Hit rate and no correlation of errors with the true Mean IOP value. All strategies except for Strategy 2 showed no significant correlation with the true Mean IOP value. Figure 5 shows scatter plots of the estimated values versus the real values, with misses highlighted in red. Table 7 shows the results for the multinomial logit model for IOP Fluctuation strategies. Each model compares the selected strategy in terms of odds of overestimation or underestimation with respect to correct hits (within ± mmHg). The strategies with the highest Hit rate and more balanced Over/Underestimation rate were Strategy 4 and 5 (which yielded the exact same estimates for all subjects), although showing a strong correlation with the real IOP fluctuations, especially for the odds of underestimations, being more likely for higher Fluctuation value. Figure 6 shows scatter plots of the estimated values versus the real values, with misses highlighted in red.

Treatment group analysis
In general, no significant differences were found among the treatment groups.  In brackets, the standard errors of the coefficient estimates. Asterisks represent the significance level according to the legend in the footnotes Note: *p < 0.1; **p < 0.05; ***p < 0.01

Discussion
The results of this study on glaucoma patients treated with different IOP lowering eye drops remark the concept that IOP is a dynamic parameter and that intensive measurement is helpful in determining its characteristics.  The first two rows report the odds of Overestimates (first row) an Underestimate (second row) with respect to the Hits at the average Peak value. The third and fourth rows report the logit coefficients for Over and Underestimate for the Peak Value (i.e. how the odds vary with the real Peak value). Asterisks represent the significance level according to the legend in the footnotes. The second half of the table reports the actual Hits, Over and Underestimate counts for each strategy Note: *p < 0.1; **p < 0.05; ***p < 0.01  The first two rows report the odds of Overestimates (first row) an Underestimate (second row) with respect to the Hits at the average Mean IOP value. The third and fourth rows report the logit coefficients for Over and Underestimate for the Mean IOP Value (i.e. how the odds vary with the real Mean IOP value). Asterisks represent the significance level according to the legend in the footnotes. The second half of the table reports the actual Hits, Over and Underestimate counts for each strategy Note: *p < 0.1; **p < 0.05; ***p < 0.01  Our dataset confirms the critical role of IOP variations during night hours in glaucoma: IOP peaked outside office-hour in 65 %, which is similar to 52-66 % of other studies [13,19,25]. This remarks the importance of considering the 24-hour characteristics of IOP, at least in critical patients, in order to better tailor treatments to individual IOP patterns [15].
Different strategies to predict the 24-hour rhythm of IOP have been reported in literature [14,16,18,19]. In general, supine IOP is higher than sitting IOP due to increased episcleral venous pressure, and the concept that supine measurement may be used to predict 24-hour IOP peak dates 1975 [26].
Correction formulas to predict peak from both sitting [16] and supine [14,16] measurements have also been suggested. Water drinking test [27] and ibopamine [28] have also been suggested to predict peak IOP. The strategy used in this study (a combination of office-hour supine and sitting measurements) had been previously used on untreated subjects [14], and we clearly showed that the relevant 24-hour IOP characteristics that can be missed by routine examination (ie office-hour sitting measurements) can be frequently detected when sitting and supine readings are associated.
When compared to our previous report, this study seems to suggest that, in general, office hour measurements are not adequate to correctly estimate peak, mean and fluctuations values within the acceptance thresholds considered in this paper in treated glaucoma patients.
The best office-hour strategy to estimate the peak IOP was the combination of sitting and supine office hour measurements (strategy 5). The error in this strategy was the least correlated with the real IOP value when compared with other strategies. In addition it was not affected by a significant mean offset (i.e. yelding the overall least biased estimate). Strategy 1 and 5 were similar in terms of number of correct peak predictions and resulted not to be significantly different upon the correct prediction analysis. It is important to notice, although, that strategy 1 was greatly affected by the highly negatively skewed distribution (i.e. allowed only underestimations) which were not counterbalanced by the zero error predictions (34 %). From a clinical point of view this is of particular importance, showing that sitting office hour measurements tend to underestimate the correct peak value. This error is not only significantly correlated with the real Peak IOP value, but its non linear relation with the real Peak IOP is difficult to model and to compensate for. All office-hour strategies for the Mean IOP estimate, except Strategy 2, showed similar features. The best strategy in terms of number of correct predictions was sitting office hour measurments alone (strategy 1), although the odds of correct predictions were not significantly different from strategies 3, 4 and 5. Strategies 1, 3 and 4 could be considered equal since no correlation of the errors with the real values or significant mean offset (i.e. no systematic bias) could be found; they performed equally in terms of correct predictions (about 50 %) and the wrong prediction rate was not correlated with the real Mean IOP values. Strategy 5 had a significant mean offset (0.6 mmHg) but this did not affect the correct prediction rate significantly. Strategy 2 was the worst both in terms of correct predictions and in terms of significant systematic bias (especially, exhibited a strong correlation of the errors with the real Mean IOP value).
The IOP Fluctuation strategies were affected by several flaws. Strategy 1 performed very poorly in terms of correct predictions and, as for Strategy 1 for the Peak IOP, suffered from a non linear strong correlation of the error rate with the real IOP value and a skewed error distribution, allowing only underestimates. The best in terms of correct predictions were strategies 4 and 5 (which yielded the same estimates for all subjects) although they were not significantly different from the other strategies. All strategies showed a strong correlation of the error with the real IOP Fluctuation value, underestimating higher values.
As shown in Table 8 using sitting office-hour measurements, a correct identification of all parameters (peak, mean and fluctuation) was achieved in 24 % of cases, which was very similar to untreated patients (20 %) [14], with a slight increased percentage of cases fully characterized adding supine office-hour values (27 %).
The analyses of this study may be prone to a number of possible limitations that have been previously described and discussed in details [14] . The accuracy of 24-hour data may be affected by a number of factors, including hospitalization, disturbed sleep, sudden waking and exposure to light at night [29,30]. The use of two tonometers could also be a specific limit of this paper [31,32]. Still, this 24-hour procedure is strictly standardized and controlled, and it has been largely used in our center for nearly two decades [14,24,[33][34][35][36]. Different characteristics of the study groups (in particular, baseline IOP) may largely influence our findings. In fact, we found that mean 24-hour IOP was higher with timolol than with others glaucoma medications, thus confirming previous findings, showing a substantial effect of timolol during the day, but no measurable effect at night [24]. Also the choice of timepoints was critical in determining the results: in particular, latanoprost patients received their medication after the 9-pmmeasurement, a fact that could explain the high percentage of IOP peaks outside office-hours. Adherence to treatment was not a concern, as medications were administered by study personnel during the study period.

Conclusions
The results of this study remark the concept that IOP is a dynamic parameter and that intensive measurement is helpful in determining its characteristics. The methods of this study have been previously used for untreated subjects, for whom we clearly showed that relevant 24-hour IOP characteristics may be missed by routine examination (ie office-hour sitting measurements), whereas a combination of officehour sitting and supine measurements can provide more accurate information.
The results of this study on treated glaucoma patients show a very poor performance of all office-hour strategies in correctly predicting the considered parameters within the thresholds used in this paper, all scoring a correct prediction rate below 52 %. Some strategies allowed a simple linear modeling. In these cases a model to correct for the systematic bias could be theoretically applicable, but a much larger sample size would be required for precise parameter estimation. In all cases, however, the residual standard error was relatively high (ranging from 1.4 to 3.2) so that a model correction (i.e. zero mean error) would hardly succeed in obtaining reliable estimates. Data are percentages Full characterization: mean IOP, peak, and fluctuations, respectively, within 1, 1, 2 mm Hg from the 24-hour value Full absence of characterization: mean IOP, peak, and fluctuations, respectively, outside 1, 1, 2 mm Hg from the 24-hour value