Reproducibility and Reliability of the Wilson-Richmond Categorisation Tool in the Assessment of the Morphological Traits of the Lips

The aim of this article is to demonstrate the reproducibility and reliability of the Wilson-Richmond Categorisation Tool for the assessment of lip morphology. This categorisation system was developed following the identification of various morphological features of the vermilion of the lips in a review of three-dimensional facial scans of 2,246 15-year-olds (1,095 male and 1,151 female), collected as part of the Avon Longitudinal Study of Parents and Children. The Wilson-Richmond Categorisation Tool evaluates the six main areas associated with the lips, namely the philtrum, Cupid's bow, the nasolabial angle, the upper and lower vermilion and the sub-lip region. The tool can be used to categorise an individual's lip morphology from a topographical perspective and to assess and compare the lip changes that occur during growth. This is the first in a series of papers that will describe the use of the Wilson-Richmond Categorisation Tool in the assessment of two ethnically different growing populations. In this study, three-dimensional laser scans of 40 individuals were reviewed and morphologically categorised. In almost all of the classification categories the intra-examiner reliability was greater than the inter-examiner reliability; however, both showed high levels of agreement, with the lower double vermilion border and the philtrum width proving to be the most reliable and reproducible categories. The least reliable were the lower vermilion contour and lip-chin shape in both the intra- and inter-examiner groups, but these percentage agreements were still sufficiently high to indicate good reproducibility and reliability. The intra- and inter-examiner percentage agreements were also higher than previously reported figures.
Conclusion: This study has shown that the Wilson-Richmond Categorisation Tool is a reproducible and reliable method of assessing the various morphological features of the lips and shows good inter- and intra-examiner reliability. The Wilson-Richmond Categorisation Tool can provide a standardised means of assessment by which different, growing ethnic groups may be compared with a view to identifying population associations. The second paper in this series utilises the WRCT to assess the morphological characteristics of two different ethnic populations of 12-year-old school children.


Introduction
The lips provide the border to the oral cavity and have several well-defined functions. They are involved in the production of sounds, mastication, maintaining an oral seal and providing sensory information about food prior to its placement in the oral cavity, and they have a role in sexual attraction and intimacy. The lips also play a pivotal role in both verbal and non-verbal communication. A recent study has also highlighted that children may deploy selective attention to the mouth of a talking face when learning speech [1]. The lips frame the orthodontist's work, and it is therefore important to understand not only the effect that orthodontics has upon this structure but also the effect of normal growth, particularly as orthodontic treatment is often undertaken in patients of pubertal age and studies suggest that facial growth continues into adulthood [2]. The need to predict accurately both growth and the orthodontic effects upon the lips is reinforced by the fact that the smile is one of the key criteria by which patients judge the success of their own orthodontic treatment [3].
Rains and Nanda [4] highlighted the scarcity of investigations in the published literature on the orthodontic effects upon the soft tissue profile before the 1950s, and a similar observation was made by Riolo [5]. In contrast, there has been a considerable amount of research on lip growth following cleft lip and palate repair and on changes in lip contour following orthognathic surgery [6][7][8][9]. Wilson, et al. [10] further highlighted the scarcity of research undertaken since Rains and Nanda's comments with respect to the vermilion of the lips in a normal population. Studies that have looked into this area have attempted to describe and classify the traits they found. Examples include the three-dimensional study undertaken by Mori, et al. [11], who classified the morphology of the philtrum columns of a small sample of five- to six-year-old children into four types (the fourth being the flat type), and the work of a panel of experts for the National Human Genome Research Institute, who summarised the anatomy of the oral region and defined and illustrated the terms that describe the major characteristics of the lips and mouth [12] and of the nose and philtrum [13].
Wilson, et al. [10] reviewed the characteristics of lips in a normal 15-year-old population and described the various lip traits and associations present in order to devise the Wilson-Richmond Categorisation Tool (WRCT). The tool can be used to aid identification of the various morphological features of the lips; that study also found that certain morphological features had a high level of association. In the present study the WRCT has been used to categorise morphological features of the lips, with the aim of demonstrating its reproducibility and reliability (Figures 1 & 2) in the assessment of the various lip traits present in a population sample. A subsequent paper will apply the WRCT to a population of 12-year-old Welsh school children.

Method
The author (SH) received a training package on the WRCT scoring system from its developer (CW). This involved reviewing three-dimensional scans of an initial 45-patient sample, which both examiners (CW and SH) scored independently. The resultant scores were then compared, and the rationale for the respective WRCT classifications was ascertained and discussed. Once both examiners were content that SH was proficient in the use of the WRCT, a random study sample was selected. Forty randomly selected three-dimensional facial scans were obtained from the Avon Longitudinal Study of Parents and Children (ALSPAC) [14]. Since 2006/2007 the ALSPAC study sample has been recalled and three-dimensional scans have been undertaken using Konica Minolta Vivid 900 laser scanners [15,16]. The data collected from these three-dimensional laser scans have been proven reliable and reported extensively in the literature [17,18].
The forty three-dimensional scans were imported into Geomagic Qualify 10 (a reverse engineering software package), where each image was processed and viewed using the grey undertexture, which was found to highlight the morphological features. The software allowed 360° rotation of the facial scan and provided the ability to select specific viewpoints from which each morphological feature could be scored against the WRCT. The author used six standardised views of each individual in the study sample on which to conduct the WRCT assessment (Figure 3). The facial scans were reviewed and the lip traits were scored against the WRCT. The two examiners reviewed the calibration data independently and the inter-examiner error was then calculated (Table 1). Intra-examiner agreement (SH) was assessed with a one-week interval between the WRCT assessment time points (Table 2).

Statistical Analysis
Inter- and intra-examiner agreement was calculated for each trait as the percentage agreement between the respective sets of scores.
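As an illustration of the percentage-agreement calculation described above, the following is a minimal Python sketch. The function names, trait labels and scores are hypothetical and are not taken from the study data; the sketch assumes each trait is scored as a single category per scan.

```python
from typing import Dict, Sequence


def percentage_agreement(scores_a: Sequence[int], scores_b: Sequence[int]) -> float:
    """Percentage of scans given the same category in both sets of scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both sets of scores must cover the same scans")
    if len(scores_a) == 0:
        raise ValueError("At least one scan is required")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return 100.0 * matches / len(scores_a)


def agreement_by_trait(
    first: Dict[str, Sequence[int]], second: Dict[str, Sequence[int]]
) -> Dict[str, float]:
    """Apply percentage_agreement to every trait present in the first score set."""
    return {trait: percentage_agreement(first[trait], second[trait]) for trait in first}


# Hypothetical category scores for five scans (illustrative only, not study data).
examiner_1 = {"philtrum_width": [1, 2, 2, 3, 1], "lip_chin_shape": [1, 1, 2, 3, 2]}
examiner_2 = {"philtrum_width": [1, 2, 2, 3, 1], "lip_chin_shape": [1, 2, 2, 3, 1]}

inter_examiner = agreement_by_trait(examiner_1, examiner_2)
print(inter_examiner)
```

The same function can be used for intra-examiner agreement by passing one examiner's scores from the two assessment time points instead of the scores of two examiners.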

Results
All forty randomly selected three-dimensional facial scans were reviewed. The results of the percentage intra- and inter-examiner reliability calculations are outlined in Tables 1 & 2 respectively. As expected, in almost all of the WRCT categories the intra-examiner reliability was greater than the inter-examiner reliability. The only exceptions were the lower double vermilion border and the lower lip tone. The highest agreement in both the intra-examiner and inter-examiner groups was for the lower double vermilion border and the philtrum width. The lowest agreement in both groups was for the lower vermilion contour and lip-chin shape.

Discussion
Wilson, et al. [10], in developing the classification tool, highlighted the considerable variation in normal lip morphology and, by creating a visual/numerical tool, provided a method of classifying and identifying trends in these phenotypical traits. Wilson, et al. [10] reported a high level of inter- and intra-examiner agreement for most aspects of the WRCT. However, they highlighted that the least reliable aspect was the assessment of the lower lip vermilion contour, for which they found the intra- and inter-examiner reliability to be 79% and 33% respectively. They recommended dichotomisation of the lower lip results in order to improve reliability (90% inter-examiner and 67% intra-examiner respectively). Whilst in the author's experience this was the most difficult aspect of the WRCT in which to achieve calibration, the results of this study showed that a high level of agreement (70% inter- and intra-examiner agreement) could be achieved and that dichotomisation of this aspect of the WRCT may not be required. This could be because the author received a more comprehensive training package, or because use of the tool has matured since its development and the training provided by its developers has improved accordingly. The morphological appearance, trends and associations of this study sample are not reported here as they have already been reported in the much larger study undertaken by Wilson, et al. [10]. More importantly, this study has shown that an examiner new to the WRCT can be calibrated and can use the tool to assess a series of scanned images from a study population in order to classify individuals according to the morphological appearance of their lips. This type of analysis would not have been possible with more traditional landmarking techniques, which favour exact measurements with small margins of error and ignore the subtleties of the lip contours, grooves and indentations. The detailed examination of the topography of the lips afforded by the WRCT provides a unique insight into lip morphology (and can be likened to a detailed Admiralty chart of the ocean bed or the hill contours on an Ordnance Survey map). It is the aspiration of the authors that this tool will allow a detailed insight into the soft tissue characteristics of different ethnic populations and the potential identification of changes due to growth in a key aspect of the oral soft tissue environment for many medical specialties.
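The effect of the dichotomisation recommended by Wilson, et al. [10] can be sketched in Python as follows. The four-category scale, the grouping of categories into two classes and the scores are all assumptions made for illustration; the way in which the lower lip categories were actually merged is not described here.

```python
from typing import Dict, List, Sequence


def percentage_agreement(scores_a: Sequence[int], scores_b: Sequence[int]) -> float:
    """Percentage of scans given the same category in both sets of scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("Score sets must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return 100.0 * matches / len(scores_a)


def dichotomise(scores: Sequence[int], grouping: Dict[int, int]) -> List[int]:
    """Collapse each original category into one of two merged classes."""
    return [grouping[score] for score in scores]


# Hypothetical scores on an assumed four-category scale (not study data).
first_pass = [1, 2, 3, 4, 2, 1, 3, 4, 2, 1]
second_pass = [1, 1, 3, 3, 2, 2, 4, 4, 2, 1]

# Assumed grouping: categories 1-2 become class 0, categories 3-4 become class 1.
grouping = {1: 0, 2: 0, 3: 1, 4: 1}

raw_agreement = percentage_agreement(first_pass, second_pass)
merged_agreement = percentage_agreement(
    dichotomise(first_pass, grouping), dichotomise(second_pass, grouping)
)
print(raw_agreement, merged_agreement)
```

Because two scores that agree before merging must also agree afterwards, dichotomisation can only maintain or raise the percentage agreement, at the cost of descriptive detail. This is why achieving acceptable agreement without it preserves more of the information captured by the tool.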

Conclusion
This study has shown that the Wilson-Richmond Categorisation Tool is a reproducible and reliable method of assessing the various morphological features of the lips. The tool has been developed on epidemiological data and shows good inter- and intra-examiner reliability. The Wilson-Richmond Categorisation Tool can provide a standardised means of assessment by which different, growing ethnic groups may be compared with a view to identifying population associations. The second paper in this series utilises the WRCT to assess the morphological characteristics of two different ethnic populations of 12-year-old school children.


Figure 3: An example of the standardised views of the 360° laser scan used.