Date Due: Dec. 4, 1999
SD 461 Interim Report
Image Processing for Clinical Grading of Ocular Redness
Submitted by:
Janine Cullen 95119481
Shane Pounder 95101537
Kimberly Whitear 95105986
TABLE OF CONTENTS
1 Introduction
2 Anatomy of the eye
3 Problem Statement
4 Criteria and Constraints
5 Website Design
5.1 Banner
5.2 Sliders
5.3 User instructions
6 Data Collection
7 Image Processing
7.1 Sclera Isolation
7.2 Redness Features
7.2.1 Percent Redness
7.2.2 Red Difference
7.3 Edge Features
7.3.1 Edge Detection
7.3.2 Measure of Number of Arteries
7.3.3 Density of arteries
8 Pattern Recognition
9 Future Directions
9.1 Web Site and Data Collection
9.2 Image Processing
9.3 Pattern Recognition
10 Timeline Changes and Problems Encountered
10.1 Web Site and Data Collection
10.2 Image Processing
10.3 Pattern Recognition
10.4 Other Milestones
11 Conclusions
1 Introduction
There is a subjective element to any medical diagnosis. This is especially true in the field of optometry where professionals are constantly classifying ocular health based on experience rather than quantified characteristics.
One specific symptom used to determine ocular health is the level of redness evident on the sclera of the eye. Excessive redness is known as conjunctival hyperaemia. Currently, many optometrists classify the level of conjunctival hyperaemia based on a comparison of the eye with a set of control images showing different intensities of disorder. Unfortunately, there is not a fixed set of images that is universally accepted.
There is also no scale that is used consistently by optometrists. Some practitioners grade the eye using a numerical classifier while others qualitatively classify the degree of redness in the eye. Furthermore, even if two optometrists grade redness on the same scale, there can still be a great deal of variation as they are still using a subjective grading method.
2 Anatomy of the eye
In order to design feature extraction algorithms for the image of an eye, it is important to understand the anatomy of the eye itself. The most important part of the eye pertaining to this project is the sclera. The sclera, commonly referred to as the white of the eye, is composed of tough, dense connective tissue that protects and shapes the eyeball. (See Figure 1.) This will be the region of interest for analysis of the images. A transparent, mucous membrane called the conjunctiva lines the inner eyelid and then folds back to cover the surface of the sclera. This part of the conjunctiva is called the ocular or bulbar conjunctiva. A large network of blood vessels is found in the choroid coat, one of the regions of the vascular tunic that makes up the middle layer of the eyeball. This coat is loosely joined to the sclera and provides all of the nourishment for the surrounding tissues through its blood vessels.
Figure 1: Anatomy of the Eye
Since the ocular conjunctiva is very thin, blood vessels are clearly visible beneath it. However, there are also fine blood vessels that run through this membrane. When the conjunctiva becomes irritated, the blood vessels enlarge, making the eye appear red in colour. One particular contact-lens-induced condition causing redness is conjunctival hyperaemia. Irritation of the conjunctiva can be caused by bacteria trapped underneath the lens from handling or simply by the lens itself. This condition varies in severity from person to person depending on their sensitivity and the nature of the irritation.
3 Problem Statement
The main problem to be addressed by this workshop is the lack of consistency in judging the degree of conjunctival hyperaemia in eyes. The workshop will use image processing and pattern recognition techniques with expert grading data to produce an automated classifier that will consistently and accurately grade images of eyes.
4 Criteria and Constraints
It is important that the system developed satisfy a number of criteria so that it will be useful.
The main constraint on the system is to provide a better classification system than previously developed. This constraint will be measured by the correlation between the expert grading and the software grading. The classification system of the previous workshop resulted in a correlation of 0.94 with none of the grades falling outside the range of values from the experts. The classifier developed during this workshop must improve upon the correlation from last year without introducing errors into the results.
5 Website Design
After careful review of the data obtained by the previous group, a decision was made to redo the website. A number of features were added, removed or changed in order to eliminate any potential data collection errors. Since the purpose of the website is to collect accurate expert data using an online survey, it is important that careful consideration be given to the survey layout and organization. The first step involved changing the hyperlink to the actual survey to a large, salient icon. Other less crucial information such as email addresses and links to related sites were left as is.
5.1 Banner
On the old website, sample images of eyes with respective gradings were placed on a scrollable, vertical banner at the left-hand side of the screen. However, this banner appeared on the introduction screen only and was not available to the user as a reference throughout the survey. The banner is now positioned horizontally across the top of every screen, with four sample eyes rated as 20, 45, 70 and 95. The intent is that users will refer to the sample image gradings in order to interpolate their own grades. The 1-100 grading scale will be new to many experts; the sample images will serve as a guide for using the scale.
5.2 Sliders
Inspection of discrepancies in the previously collected expert data revealed a few unexpected, isolated grades of 1 assigned to images that were otherwise rated significantly higher. The old website used drop-down menus to rate the nasal and temporal sides of the sclera. These drop-down menus defaulted to a value of 1, which could explain the skewed data points: an expert who skipped a menu would have submitted a grade of 1 unintentionally. A slide bar is a much better tool for grading images, as it gives the user better visual feedback and the position of the value indicator can be set to a default of zero.

Figure 2: Sample survey frame with sliders
5.3 User instructions
In addition to providing the user with detailed instructions on how to complete the survey, a sample of the survey images and rating process was included to help guide the user.
6 Data Collection
The data collected from the website will be used with the classifiers to determine grades that are consistent with the expert grading.
Once all of the images have been graded by the expert, an exit survey will come up. The exit survey will gather demographic information from our experts as well as information regarding their current grading techniques. There are also fields for additional comments from the experts about grading and the project.
Originally, the data collection was set to take place during the first two weeks of November. Owing to technical problems in creating the slider used for grading, the data collection will instead take place during the month of December. The request for expert data will be coordinated with the optometry department. Experts will include optometrists and academics in the field as well as optometry students.
The data will be analysed to determine certain statistical features such as the mean, the maximum and the minimum. The results of the analysis will be used to help determine the classifier.
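The statistical summary described above can be sketched as follows. This is only an illustration in Python (the workshop's own code is in Matlab); the function name, the dictionary layout and the sample grades are hypothetical.

```python
from statistics import mean


def summarise_grades(grades_by_image):
    """Return {image_id: (mean, maximum, minimum)} of the expert grades per image."""
    summary = {}
    for image_id, grades in grades_by_image.items():
        if not grades:
            continue  # no expert has graded this image yet
        summary[image_id] = (mean(grades), max(grades), min(grades))
    return summary


# Hypothetical grades on the 1-100 scale, for illustration only.
example = {"eye_01": [20, 25, 30], "eye_02": [70, 80]}
stats = summarise_grades(example)
```

A large gap between the maximum and minimum for one image is a quick indication of how much the experts disagree on it.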
7 Image Processing
The tool used to complete the image processing work is Matlab version 5.2. The built-in image processing toolbox provides both the functionality and flexibility necessary to perform the required analysis.
There are three major components to the image processing portion of the project. In order to isolate the desired features of the sclera, the sclera itself must first be isolated. Once this isolation has been completed to satisfaction, the code for redness and edge features can then be applied to the resulting images. The following sections explain each of these components in more detail.
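7.1 Sclera Isolation
The sclera isolation algorithm is built on last year’s work. However, the original isolation routine makes a number of assumptions that are not always valid and, consequently, produces less than optimal results. The assumptions addressed in this section are that the iris lies in the centre of the image, that equal portions of sclera appear on either side of the iris, and that no sclera is visible above or below the iris.
To better explain why these assumptions are sometimes false, consider the following examples: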



Figure 3: Isolation of Sclera (panels, left to right: Original Eye, Previous Isolation, New Isolation)
Using the previous isolation algorithm, too much of the left portion of the sclera is obscured. This occurs because the old algorithm used only four user-defined points to determine what portion of the image to blacken. The new algorithm that has been devised corrects for these assumptions.
The iris is isolated in the same way as before. The user selects the centre of the iris as well as a point on the outer edge of the iris. This allows a circle to be drawn centred on the given centre position and with a radius defined by the distance to the edge point. Since the assumption that an iris is a perfect circle generally holds, this method provides good isolation of the iris.
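The iris circle described above follows directly from the two user-selected points. A minimal sketch in Python (the workshop's code is in Matlab; the function names and pixel coordinates here are hypothetical):

```python
import math


def iris_circle(centre, edge_point):
    """Return (cx, cy, radius) of the circle centred on `centre` through `edge_point`."""
    cx, cy = centre
    ex, ey = edge_point
    return cx, cy, math.hypot(ex - cx, ey - cy)


def inside_iris(x, y, circle):
    """True if pixel (x, y) lies on or inside the iris circle."""
    cx, cy, radius = circle
    return (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2


# Hypothetical user clicks: iris centre and a point on its outer edge.
circle = iris_circle((100, 80), (130, 120))
```

Every pixel for which `inside_iris` is true would be blackened before the redness and edge features are computed.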
It is no longer assumed that there are equal portions of sclera on either side of the eye or that the iris is in the centre of the image. To address this, the user is prompted for two additional points (see Figure 4). The four original positions are still used: centre of the iris (1), outer edge of the iris (2), upper right edge of nasal fold (3) and lower right edge of nasal fold (4). These are joined by the upper left edge of the eye (5) and the lower left edge of the eye (6). This allows four parabolic curves to be drawn, each using one of points (3) through (6) and one of two vertices located in line with the centre of the iris. If the point used in addition to the vertex is on the left side of the eye, only the left portion of the curve is plotted; the reverse is true for the right side of the eye.

Figure 4: User-defined points
As can be seen from the sample images, the assumption that there is no sclera visible above or below the iris sometimes produces poor results. To accommodate this, two further user-defined points, (7) and (8), must be selected, bringing the total to eight. These two points are used as the vertices of the four parabolic curves that surround the sclera.
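Each boundary curve can be sketched as half of a parabola fixed by one vertex and one corner point. The sketch below is in Python for illustration (the workshop's code is in Matlab) and assumes a parabola with a vertical axis in image coordinates, y = vy + a(x - vx)^2; the function name and the sample coordinates are hypothetical.

```python
def half_parabola(vertex, corner):
    """Return (y_at, x_range) for the half parabola running from `vertex` to `corner`.

    The curve is y = vy + a * (x - vx) ** 2, with `a` chosen so that the
    curve passes through the corner point.
    """
    vx, vy = vertex
    px, py = corner
    if px == vx:
        raise ValueError("corner must not share the vertex's x coordinate")
    a = (py - vy) / (px - vx) ** 2

    def y_at(x):
        return vy + a * (x - vx) ** 2

    # Only the side of the parabola between the vertex and the corner is used.
    x_range = (min(vx, px), max(vx, px))
    return y_at, x_range


# Hypothetical points: upper vertex (7) and upper left edge of the eye (5).
y_at, x_range = half_parabola((100, 40), (20, 80))
```

Calling the function four times, once for each of points (3) through (6) with the vertex above or below the iris, gives the closed outline around the sclera.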
7.2 Redness Features
The previous group focused most of their image processing efforts on extracting a measure of redness with which to determine a rating for each image. Unfortunately, their code was misplaced so the features were re-coded.
7.2.1 Percent Redness
One of the measures used was percent redness. This feature was defined by the following equation:
where n_black = the number of black pixels
In the end, the previous group rejected this measure since it produced the same moderate redness rating for any pixel that was close to grey (R=G=B). However, it was decided that this method might be useful in the classifier later on.
In order to test the code, a pure red and a pure white "eye" were constructed in a drawing program. The red eye produced a score of 1 and the white eye produced a score of 1/3. Based on the given equation, the results obtained from the test verify that this feature is working as anticipated.
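The test described above can be reproduced with a short sketch. The formula used here is an assumption: it is one form consistent with the reported scores (1 for pure red, 1/3 for pure white or any grey), namely the mean of R / (R + G + B) over the non-black pixels. Python is used only for illustration; the workshop's code is in Matlab.

```python
def percent_redness(pixels):
    """Mean of R / (R + G + B) over non-black pixels (assumed form of the feature).

    `pixels` is an iterable of (R, G, B) tuples; pure black pixels mark the
    masked-out region and are excluded from both the sum and the count.
    """
    total = 0.0
    counted = 0
    for r, g, b in pixels:
        if r == 0 and g == 0 and b == 0:
            continue  # masked pixel, ignored
        total += r / (r + g + b)
        counted += 1
    if counted == 0:
        raise ValueError("image contains only black pixels")
    return total / counted


# Synthetic "eyes" like those drawn for the test, each with some masked pixels.
red_eye = [(255, 0, 0)] * 4 + [(0, 0, 0)] * 2
white_eye = [(255, 255, 255)] * 4 + [(0, 0, 0)] * 2
```

Under this assumed form, a grey pixel such as (128, 128, 128) also scores 1/3, which matches the weakness the previous group observed.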
7.2.2 Red Difference
This feature measures the difference between the red and blue plus green values of the pixels; black pixels are ignored. The redness difference metric provided reasonably good results for most of last year’s data, but the metric did not perform well for all of the images. This feature is defined by the following equation:
where, again, n_black = the number of black pixels
The resulting values are to be evaluated for their usefulness in the classifier. Testing was again performed to ensure the feature is working as planned. The red eye produced a value of 1 and the white eye produced a value of 0.
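A sketch of the same test for the red difference feature follows. The exact normalisation is an assumption: halving the green-plus-blue term and dividing by 255 is simply one choice that reproduces the reported values (1 for pure red, 0 for pure white), and the actual equation may differ. Python is used only for illustration; the workshop's code is in Matlab.

```python
def red_difference(pixels):
    """Mean of (R - (G + B) / 2) / 255 over non-black pixels (assumed normalisation).

    `pixels` is an iterable of 8-bit (R, G, B) tuples; pure black pixels mark
    the masked-out region and are ignored.
    """
    total = 0.0
    counted = 0
    for r, g, b in pixels:
        if r == 0 and g == 0 and b == 0:
            continue  # masked pixel, ignored
        total += (r - (g + b) / 2) / 255
        counted += 1
    if counted == 0:
        raise ValueError("image contains only black pixels")
    return total / counted


# Synthetic "eyes" like those drawn for the test, each with some masked pixels.
red_eye = [(255, 0, 0)] * 4 + [(0, 0, 0)] * 2
white_eye = [(255, 255, 255)] * 4 + [(0, 0, 0)] * 2
```

Unlike percent redness, this form gives every grey pixel a score of 0, so it separates red from grey regardless of brightness.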
7.3 Edge Features
7.3.1 Edge Detection
One possible technique for counting the number of arteries present in the sclera is the zero-crossing method, which involves the use of a low-pass filter. While code already exists for this feature using a Laplacian of Gaussian (LoG) filter, the results were somewhat unpredictable and difficult to analyse. Small vessels often appeared as fragments in the filtered image, so a much higher overall number of vessels was reported than was actually present. Since this feature is used in all of the old data analysis and comparison, it will still be used as a classifier feature. Alterations, however, will be made in an attempt to rectify the problem of vessel fragments. Previously, the problem was addressed by ignoring all edge clusters of fewer than 10 pixels. This does not solve the problem completely, since some fragments are larger than 10 pixels and small areas of localised redness may be discounted by mistake. One approach is to vary the sensitivity threshold used by the algorithm.
Another issue with this feature involves the definition of what constitutes a single artery. Most arteries originating at the visible edge of the sclera have at least one branch; those branches then split off into other branches. With the edge-detection and nearest neighbour clustering technique, these multi-branched arteries will be reported as single arteries. While this is technically true, it is not a good representation of what an expert would report as the actual number of arteries.
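The cluster-size filter mentioned above (ignoring edge clusters of fewer than 10 pixels) can be sketched as a connected-component pass over a binary edge map. This is an illustration in Python, not the previous group's code: the use of 8-connectivity, the function name and the list-of-lists image layout are assumptions.

```python
from collections import deque


def remove_small_clusters(edges, min_size=10):
    """Clear 8-connected clusters smaller than `min_size` from a binary edge map.

    Returns (cleaned_map, number_of_clusters_kept).
    """
    rows, cols = len(edges), len(edges[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [row[:] for row in edges]
    kept = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not edges[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one whole cluster of edge pixels.
            cluster = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                cluster.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and edges[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if len(cluster) < min_size:
                for r, c in cluster:
                    cleaned[r][c] = 0  # treat the fragment as noise
            else:
                kept += 1
    return cleaned, kept
```

The number of clusters kept is the vessel count this approach reports, which is why a branched artery is counted once and a fragmented one several times.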
7.3.2 Measure of Number of Arteries
One of the features that might help characterise the degree of conjunctival hyperaemia is the number of visible arteries on the sclera. Many of the arteries visible on the eye cross each other, making it difficult to do a straight count. One possible solution is to trace the outside of the sclera and count the arteries along that path; as the program traces the path, arteries would be detected by changes in the colour channels. While this would give a count of the number of arteries around the perimeter of the eye, some arteries branch into smaller arteries that would be missed by this feature. A more effective metric would detect arteries in more places than just around the outside. A circular or elliptical path will be set around the pupil, and the number of arteries passing through this path will be counted. The path will then be increased in size and another measurement will be taken. Figure 5 shows a sample of what the paths may look like on the eye. New paths will be traced until the paths no longer cross any of the sclera. The resulting feature would consider the number of arteries throughout the entire eye and not just around the perimeter. This metric should be roughly proportional to the actual number of arteries.
Figure 5: Concentric vessel detection paths
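The concentric-path count can be sketched as follows, assuming a binary vessel mask has already been produced (for example by thresholding a colour channel, which is not shown). The sketch uses circular paths only and is written in Python for illustration; the function names, the sampling density and the mask layout are hypothetical.

```python
import math


def count_crossings(vessel_mask, centre, radius, samples=360):
    """Count background-to-vessel transitions along one circular path."""
    rows, cols = len(vessel_mask), len(vessel_mask[0])
    cx, cy = centre
    values = []
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        x = int(round(cx + radius * math.cos(angle)))
        y = int(round(cy + radius * math.sin(angle)))
        inside = 0 <= y < rows and 0 <= x < cols
        values.append(1 if inside and vessel_mask[y][x] else 0)
    # The path is closed, so sample k is compared with sample k - 1 cyclically;
    # each 0 -> 1 transition is counted as one vessel crossing the path.
    return sum(1 for k in range(samples) if values[k] and not values[k - 1])


def total_crossings(vessel_mask, centre, radii):
    """Sum the crossings over a set of concentric paths of growing radius."""
    return sum(count_crossings(vessel_mask, centre, r) for r in radii)
```

A single long vessel is counted once per path it crosses, so the total grows with both the number and the length of the vessels; this fits the expectation that the metric is roughly proportional to the number of arteries.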
7.3.3 Density of arteries
Localised clustering of arteries may consciously or unconsciously influence the way in which an expert grades an eye. In order to determine the extent of localised irritation, a new feature will be used. This feature will represent the degree of localised irritation on the eye. This can be implemented by sectioning the isolated sclera into windows for analysis, similar to superimposing a grid on the image. Initially, windows of size 10x10 will be used; if this window size is not producing effective results, the window size can easily be changed. The percent redness will be detected for each window in the grid. When the first portion of the analysis is complete, there will be redness numbers for each section of the grid. If one square has a large redness result while the rest do not, there is likely a localised trauma in that section. The feature can be turned into a metric to use in a classifier. This feature will attempt to replicate the effects that smaller areas of irritation in the eye may have on the expert grade. Support for the use of this feature may come from the feedback collected on the exit survey of the web site.
Figure 6: An example of localised trauma
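The windowed analysis can be sketched as below, assuming a per-pixel redness score is already available and that masked pixels are marked with None. How the window scores are turned into a single metric is not yet decided in this report; the difference between the reddest window and the average window is used here purely as a hypothetical example. Python is used for illustration only.

```python
def window_redness(redness_map, window=10):
    """Mean redness of each window x window block, skipping masked (None) pixels."""
    rows, cols = len(redness_map), len(redness_map[0])
    scores = []
    for top in range(0, rows, window):
        for left in range(0, cols, window):
            values = [redness_map[r][c]
                      for r in range(top, min(top + window, rows))
                      for c in range(left, min(left + window, cols))
                      if redness_map[r][c] is not None]
            if values:  # windows that are fully masked are left out
                scores.append(sum(values) / len(values))
    return scores


def localisation_score(scores):
    """Hypothetical metric: how far the reddest window stands above the average."""
    return max(scores) - sum(scores) / len(scores)
```

A score near zero suggests evenly distributed redness, while a large score suggests one section that is much redder than the rest.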
8 Pattern Recognition
The major goal of this workshop is to create a classifier that will successfully use the features of the eye to come up with a meaningful grade. Pattern recognition techniques will be used to come up with a classifier that will effectively mimic the grading performed by experts in the field.
The previous attempt to create a classifier used expert data in order to train the system for grading. Once again, expert data will be used to make the classifier for the project. It would be impossible to create a classifier that will consistently repeat the grading given by all experts, as there is extreme variation in the expert data. A method must be determined to decide whether a given classifier is effective. The effectiveness of a given classifier can be measured by using the correlation between the expert data and the software results. This correlation must be greater than 0.94, the value from last year’s workshop.
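The effectiveness measure can be sketched as a correlation coefficient between the mean expert grades and the software grades. The report does not state which correlation measure is used; the Pearson coefficient below is an assumption, and Python is used for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)
```

A classifier would meet the stated criterion when `pearson(expert_means, software_grades)` exceeds 0.94.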
Previously, a scale of one to twenty was used for grading but, through consultation with the optometry department, this has been changed to a scale from one to one hundred. It is easier for people to think in terms of percentages as opposed to fractions of real numbers. This allows for more precision in the expert ratings. Most importantly, this larger scale gives more freedom for the classifier to determine a very precise grade so that more subtle changes in eyes can be detected over time.
The previous workshop group was moderately successful at creating two one-dimensional classifiers using LoG luminance and edge pixel features. The mean expert grades were used along with the calculated feature to train the program; using a polynomial-fit function, a grade was determined for the image being graded. While these classifiers correlated well with the expert data, a more rigorous classifier can be built to consider more than one feature at a time, thus improving the correlation.
It was determined that the classifier for this workshop should be a two-dimensional classifier. The one-dimensional classifier does not consider enough different features of the eye, whereas a two-dimensional classifier could consider two different features. In some eyes, there could be a network of many small arteries apparent in the conjunctiva that would result in a high irritation grade from the experts. However, if the feature used in the one-dimensional classifier was a redness feature, the automatic grading may underrate the eye. A two-dimensional classifier could look at both the redness feature and an arterial feature to give an accurate grading for the eye.
The use of a two-dimensional classifier is a discriminant problem in pattern recognition. Discriminants that will break down a graph into sections for each score must be determined. Figure 7 shows a sample of what the sectioned graph might look like.

Figure 7: Two-dimensional Sample Classifier
The actual section lines will be determined mathematically using expert data correlated with feature values. From these discriminants, a set of mathematical classifiers will be created such that the two feature values will be entered into the expression and a final grade will be determined. Figure 7 is an example of a classifier graph where the two features have been graded between one and one hundred. In practice, the discriminants will actually divide the graph into more precise ratings from one to one hundred.
Multiple features will be coded and tested for the two-dimensional classifier. An algorithm will be created that will allow for simple testing of the different features. The algorithm will take the expert data for the eyes as well as the feature results and return a sectioned contour graph broken down into the different ratings. For each image, the classifier will be trained on the other images and their expert data; the held-out image will then be graded by that classifier. This will be repeated for every picture until there is a rating for all the images.
For each combination of features, the two-dimensional classification will be calculated. The correlation between the classification and the expert data will be checked to determine which of the two-dimensional classifiers is the most accurate.
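The leave-one-out procedure described above can be sketched as follows. The discriminant-based classifier has not yet been designed, so a simple nearest-neighbour average in the two-dimensional feature space is swapped in as a stand-in; it is not the workshop's classifier. Python is used for illustration, and all names and sample values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D feature points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def leave_one_out_grades(features, expert_means, k=3):
    """Grade each image using only the other images (stand-in k-nearest-neighbour rule).

    `features` is a list of (feature_1, feature_2) pairs, one per image, and
    `expert_means` holds the mean expert grade for the same images.
    """
    predicted = []
    for i, target in enumerate(features):
        # Train on every image except the one being graded.
        others = [(squared_distance(target, f), g)
                  for j, (f, g) in enumerate(zip(features, expert_means))
                  if j != i]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        predicted.append(sum(g for _, g in nearest) / len(nearest))
    return predicted


# Hypothetical feature pairs and mean expert grades, for illustration only.
features = [(0, 0), (0, 1), (10, 10), (10, 11)]
expert_means = [10, 20, 80, 90]
grades = leave_one_out_grades(features, expert_means, k=1)
```

Repeating this for each pair of features and correlating the resulting grades with the expert means would show which feature combination performs best.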
It is possible that one combination of features will be an effective classifier for some images and another classifier would work better for other images. For example, in highly irritated eyes, the density of arteries might be a more effective feature than the number of arteries, yet when the eyes are less irritated the opposite may be true. Multiple classifiers may be necessary depending on the image characteristics.
9 Future Directions
The final product of this workshop will be a classification algorithm that can accurately assess the level of conjunctival hyperaemia. If the software were to be released, the final product would be a fully trained algorithm. Many more images would have to be made to train the software.
9.1 Web Site and Data Collection
The web site will be running for data collection during December and the start of January. Experts will be invited during this time to complete the eye survey that is on the website. The results from the survey will be analysed for statistical features such as the mean grade, maximum grade and minimum grade. These features of the expert data will be used for calculating the classifier and finally for testing the effectiveness of the classifier.
9.2 Image Processing
The final features will be coded in Matlab so that they return usable values for the pattern recognition. Further changes to the features may also be required as the classifiers are tested and the usefulness of the different features is tested.
9.3 Pattern Recognition
The majority of the work remaining in the workshop is in the creation of the classifier. Once all of the data has been collected and analysed, the pattern recognition portion of the project will commence. Using the feature results and the expert data, different one-dimensional classifiers will be created for each feature. These one-dimensional classifiers will be used to produce two-dimensional classifiers incorporating two features.
10 Timeline Changes and Problems Encountered
The timeline was altered slightly from the original design plan due to unforeseen obstacles in some of the tasks. Other changes, due to new insights into the workshop requirements, were required as the workshop progressed.
10.1 Web Site and Data Collection
The web site design and structure followed the original timeline. Coding of the web site, however, lasted into December, although it was scheduled to finish at the end of October. The delay arose when sliders were chosen for the grade selections. The sliders are Java applets, and it was difficult to incorporate the applets into the existing Perl script. The obstacle was getting the applet to communicate with the existing script and submit the grade to the output files. This unfortunately pushed the data collection and analysis portion of the workshop back into December and January. The expert data is not required until the pattern recognition portion of the workshop, so the collection and analysis can occur concurrently with the final image processing work in January.
10.2 Image Processing
The feature selection brainstorming started a bit early and ended on time. Originally, the features were not defined in image processing terms so they had to be refined. Before the features were coded, however, some additional image processing was required. The isolation of the sclera performed by the previous group did not adequately expose enough of the sclera for analysis. The isolation was added to the timeline and was completed before any feature coding was started. Once the features were defined, the redness features were coded using Matlab by the start of December. The edge-based features were discussed but the coding of these features has been scheduled for the beginning of January as we wrap up the outstanding image processing work.
10.3 Pattern Recognition
The original breakdown of pattern recognition tasks was altered to include the 1-D classifier creation, the 2-D classifier creation and the evaluation of the classifiers. The timelines for the pattern recognition were also altered to be consistent with the new task breakdown. The creation and evaluation of the 2-D classifier is expected to be the largest portion of the second term tasks.
10.4 Other Milestones
All other milestones set out in the timeline, such as report writing and presentation planning, remain unchanged from the original timeline.
11 Conclusions
The work completed during the first half of the workshop has laid the foundation for producing an effective grading system for eyes. Half of the image processing has been completed and a plan for the remainder of this work is in place. The sclera isolation from last year was modified to expose more of the sclera for analysis. The results from the previous group showed that a two-dimensional classifier would be more effective, as it could consider both redness and arterial features.
The web site was altered from the previous year to include some extra features. Sliders replaced drop-down boxes for the grading, as sliders are easier to use and interpret. In order to test expert repeatability, certain images will be repeated during the survey, including some of the control images from the banner of the survey.
12 References
Martini & Timmins (1999) Human Anatomy. Third Edition. Toronto: Pearson Education Canada.
Handler, C., Nagaraj, V. & Nichols, S. (1999) Automated Grading of Ocular Images Using Image Processing Techniques. University of Waterloo, Ontario.