Kefelegn Kebede* and Ashenafi Getachew Megersa
Received: August 27, 2024; Published: September 03, 2024
*Corresponding author: Kefelegn Kebede, School of Animal and Range Sciences, Haramaya University, Ethiopia
DOI: 10.26717/BJSTR.2024.58.009159
Traditional methods for analyzing categorical variables in livestock and poultry, such as logistic regression and canonical discriminant analysis, often rely on assumptions. When these assumptions are violated, conclusions can become inaccurate. To overcome these limitations, alternative methods like the Classification and Regression Tree (CART) technique are increasingly used. CART is a nonparametric method that does not require assumptions of normality or linearity. It effectively handles complex datasets, making it a valuable tool in animal science research. However, its application in poultry science, especially in Ethiopia, is limited. There is a notable gap in research applying CART to differentiate chicken breeds based on egg quality traits, which is essential for making informed breeding decisions. This study evaluates how breed affects egg quality traits and uses the Classification Tree Algorithm (CTA) to differentiate between Bovan Brown and Fayoumi chicken breeds. Data were collected from 1,341 eggs, including 600 from Bovan Brown and 741 from Fayoumi. We measured various external and internal egg quality traits: egg weight, egg length, egg width, albumen height, albumen weight, yolk color, yolk height, yolk weight, shell weight, and shell thickness. The CTA was then used to classify the breeds based on these traits, with statistical analyses performed using JMP Pro version 18. The CTA model showed high accuracy, with R² values of 0.989 for training data and 0.930 for validation data.
Egg weight was the most significant predictor, accounting for 94.4% of the variation in breed classification. While albumen weight and albumen height also contributed, their impact was less significant. Other traits had minimal effects on breed differentiation. The model achieved nearly perfect classification rates, as shown by confusion matrix and ROC analyses. This study confirms that egg weight is the most critical factor for distinguishing between Bovan Brown and Fayoumi chicken breeds, with other traits being secondary. These findings highlight the effectiveness of CTA in poultry breeding and suggest that future research could improve breed classification by incorporating more complex models or additional data types.
Keywords: Breed Differentiation; Bovan; Fayoumi; Egg Quality Traits; Machine Learning Algorithm
Abbreviations: AH: Albumen Height; AW: Albumen Weight; EL: Egg Length; EWi: Egg Width; EW: Egg Weight; SW: Shell Weight; ST: Shell Thickness; YC: Yolk Colour; YH: Yolk Height; YW: Yolk Weight; CTA: Classification Tree Algorithm; LSM: Least Square Means; BV: Bovan; FM: Fayoumi; ROC: Receiver Operating Characteristic; AUC: Area Under the Curve
Traditional statistical methods for analyzing categorical response variables in livestock and poultry, such as mortality, calving ease, stillbirth, litter size, fertility, and breed discrimination, often rely on logistic regression, canonical discriminant analysis, and data transformation techniques [1-3]. These methods depend on assumptions like normality, constant variance, linearity, and non-multicollinearity. Violating these assumptions can lead to significant errors in result interpretation [4]. To address these issues, it is important to use more robust statistical methods that do not rely on such assumptions. The Classification and Regression Tree (CART) technique is one such method. CART is a powerful tool in data mining [5]. Unlike traditional approaches, CART does not assume normality, constant variance, or linearity. It has several advantages: it is nonparametric, can handle multivariate analyses, manages multiway splits, and works with various types of variables. It also adapts well to data transformations and missing values and provides a graphical representation that helps interpret complex interactions. CART has been widely used in fields like business [6] and medicine [7,8], and it also shows promise in animal science. For example, [1] used CART to identify factors affecting the number of lambs reared from a fertilized ewe.
Another study applied CART to predict lamb mortality in Polish Merino sheep [9], while [2] explored the link between PrP genotypes and litter size in different breeds. Additionally, [3] used CART to study the impact of cage density, genotype, and season on Japanese quail fertility. Similarly, [10] assessed how egg quality traits affected fertility in Japanese quail, identifying specific egg dimensions related to higher fertility rates. CART has also been used to prepare cows for artificial insemination [11], analyze milk yield and udder traits in goats [12], and investigate factors affecting milk quality [13]. Moreover, [1] used ANOVA with transformations to standardize calving ease data, ensuring a normal distribution and homogeneity of variance. Despite its advantages, CART's application in poultry science in Ethiopia is still limited. There is also a lack of research using CART to differentiate chicken breeds based on egg quality traits. This study aims to address this gap by evaluating how breed affects egg quality traits and using CART to distinguish between chicken breeds based on these traits. The results will help breeders make informed decisions about breeding strategies, focusing on selecting animals with desirable traits [1].
Description of the Study Area
The study was conducted at the poultry research farm of Haramaya University, situated 505 km to the east of Addis Ababa. The site is positioned at an elevation of 1980 meters above sea level, with coordinates of 9° 26'N latitude and 42° 3'E longitude. The average annual maximum and minimum temperatures are recorded at 23.4°C and 8.25°C, respectively, while the region experiences an average annual rainfall of 741.6 mm.
In this experiment, two exotic chicken breeds—Bovan and Fayoumi—were used. These breeds were raised under consistent housing and feeding conditions to ensure uniformity. Initially, the chickens were housed in brooder facilities equipped with incandescent heating lamps for the first eight weeks. After this period, they were transferred to the grower house for the growth phase and subsequently moved to the layer house, where they were kept under the deep litter system during the laying phase. Throughout the experiment, the chickens had continuous access to water and were provided with a nutritionally balanced diet tailored to their specific needs. During the first eight weeks, they were fed a standard ration containing 20% crude protein (CP) and 2800 Kcal/kg metabolizable energy (ME). As they transitioned into the growth phase (9 to 20 weeks), the feed was adjusted to 16% CP and 2800 Kcal/kg ME. Finally, during the laying period, the feed was further modified to 16.5% CP and 1750 Kcal/kg ME to meet the specific dietary requirements of the hens. To ensure the health and well-being of the chickens, they were vaccinated against major viral diseases. Additionally, a veterinary professional closely monitored their health throughout the study to maintain optimal conditions.
A total of 1,341 eggs, comprising 600 from Bovan Brown and 741 from Fayoumi, were used for this study. The recorded traits included albumen height, albumen weight, egg length, egg width, egg weight, shell weight, yolk colour, yolk height, yolk weight, and shell thickness. Egg weight, length, and width were measured on intact eggs using a sensitive weighing scale and a digital calliper, respectively. To assess internal egg quality, the eggs were carefully cracked onto a flat glass surface. The albumen and yolk were then separated, and their respective weights were recorded after measuring albumen height with a tripod micrometre. Following this, the eggshells were gently washed and left to air dry for 48 hours, after which shell thickness was determined.
For all statistical analyses in this study, JMP Pro version 18 [14] was used.
Egg quality traits (EW, AH, AW, YH, YW, SW, YC, EL, and EWi) were subjected to a one-way analysis of variance using the general linear model procedure of JMP Pro to determine the effect of breed. Treatment means were separated using Student's t-test at the 5% significance level. The linear model employed was:
$$Y_{ij} = \mu + B_i + e_{ij}$$
where:
$Y_{ij}$ = Observed value of the egg quality traits
$\mu$ = Overall mean
$B_i$ = Fixed effect of the ith breed (i = Bovans Brown, Fayoumi)
$e_{ij}$ = Random residual error term
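As an illustration of the one-way model above, the following minimal sketch computes the F statistic for a breed effect in plain Python. It is not the JMP Pro procedure used in the study; the function name `one_way_anova_f` and the egg weights are hypothetical and serve only to show how the between-breed and residual sums of squares are formed.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way fixed-effect ANOVA."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    # Between-breed sum of squares: n_i * (breed mean - overall mean)^2
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual sum of squares: deviation of each egg from its own breed mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical egg weights in grams, NOT the study data
bovan = [62.1, 64.8, 63.0, 65.2, 61.9]
fayoumi = [43.5, 45.1, 44.0, 46.2, 42.8]

f_stat, df_b, df_w = one_way_anova_f([bovan, fayoumi])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

With only two breeds, this F statistic equals the square of the pooled two-sample t statistic, which is why the ANOVA and the t-test lead to the same conclusion about the breed effect.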
The classification tree algorithm (CTA) is a method used for analyzing a criterion variable that has two or more categories. Researchers use CTA to segment data based on combinations of independent variables, thereby identifying patterns that best predict the outcome of the criterion variable [15]. The CTA method, a form of recursive partitioning, is particularly useful for classification tasks. It selects the best predictor using various impurity or diversity measures, with the goal of creating subsets of data that are as homogeneous as possible with respect to the target variable [5]. The CTA process begins at the root node, which contains the complete, unsplit data set. The data are then split into binary nodes, recursively creating child nodes until homogeneous subsets are achieved. Each split aims to maximize homogeneity within the resulting subsets, continuing until the index of homogeneity meets specified criteria. This process involves assigning a predicted outcome class to each node and continuing the splits until further division is impossible. The final subsets, which are not subject to further division, are called terminal nodes (leaves). The number of leaves indicates the tree size, while the depth is determined by the number of edges between the root and the most distant leaves. CTA uses p-values with a Bonferroni correction as the splitting criterion. Pruning is employed to remove redundant branches and enhance the accuracy of the model.
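The core step of recursive partitioning, choosing one binary split, can be sketched as follows. This is a simplified illustration and not the JMP Pro implementation: it ranks candidate thresholds by the reduction in the likelihood-ratio chi-square (G²) rather than by adjusted p-values, and the function names and the ten eggs are hypothetical.

```python
import math


def g2(counts):
    """Likelihood-ratio chi-square of a node: -2 * sum(n_k * ln(n_k / n))."""
    n = sum(counts)
    return -2.0 * sum(c * math.log(c / n) for c in counts if c > 0)


def best_split(values, labels, classes=("BV", "FM")):
    """Return (threshold, G2 reduction) of the best binary split on one trait."""
    def node_g2(rows):
        return g2([sum(1 for _, lab in rows if lab == c) for c in classes])

    rows = sorted(zip(values, labels))
    parent = node_g2(rows)
    best = (None, 0.0)
    for i in range(1, len(rows)):
        if rows[i][0] == rows[i - 1][0]:
            continue  # no threshold can separate equal values
        threshold = (rows[i][0] + rows[i - 1][0]) / 2
        # Impurity removed by splitting the parent into the two child nodes
        reduction = parent - node_g2(rows[:i]) - node_g2(rows[i:])
        if reduction > best[1]:
            best = (threshold, reduction)
    return best


# Hypothetical egg weights (g) and breed labels, NOT the study data
egg_weight = [43.0, 44.5, 46.1, 47.8, 50.2, 58.9, 61.4, 63.0, 64.2, 66.7]
breed = ["FM", "FM", "FM", "FM", "FM", "BV", "BV", "BV", "BV", "BV"]

threshold, reduction = best_split(egg_weight, breed)
print(threshold, round(reduction, 2))
```

A full tree would apply `best_split` to every predictor at each node, keep the best one, and recurse on the two child nodes until a stopping rule is met.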
The quality of the CTA models was assessed using several metrics: average squared error, cumulative lift, Kolmogorov–Smirnov statistics, misclassification rate, and the area under the ROC curve. Lower average squared error and misclassification rate, alongside higher cumulative lift, Kolmogorov–Smirnov statistics, and ROC area, indicated better model quality.
Analysis of Variance
Table 1 below presents the independent t-test results that reveal significant differences in egg quality traits between the Bovan and Fayoumi chicken breeds, highlighting the effect of breed on various egg quality traits. The analysis underscores the importance of breed selection in poultry production, particularly concerning egg quality traits, which are crucial for both consumer preference and industrial applications. The analysis of egg weight (EW) reveals that Bovan Brown chickens have a significantly higher mean (63.46 g) compared to Fayoumi chickens (44.33 g), indicating a substantial breed effect (P < 0.05). The greater egg weight in Bovan Browns is likely due to their larger body size and more efficient feed conversion. Previous research supports that larger breeds generally produce heavier eggs [16]. This trait is important in the egg market, where heavier eggs are preferred and command higher prices [17]. Albumen height (AH), an important indicator of egg quality, is also significantly higher in Bovan Browns (10.85 cm) compared to Fayoumis (6.65 cm), with a significant breed effect observed (P < 0.05). Higher albumen height is associated with better egg freshness and quality, which are valued in both table eggs and those used in the food industry [18,19]. The superior albumen height in Bovan Browns may be related to genetic factors that enhance protein deposition in the albumen [20,21]. Similarly, albumen weight (AW) is significantly higher in Bovan Browns (42.04 g) compared to Fayoumis (26.07 g), reflecting a significant breed effect (P < 0.05).
Note: Different superscripts (a, b) in the same row indicate a significant (P < 0.05) effect of breed on that trait. AH = Albumen Height, AW = Albumen Weight, EL = Egg Length, EW = Egg Weight, EWi = Egg Width, SW = Shell Weight, YC = Yolk Colour, YH = Yolk Height, YW = Yolk Weight, and ST = Shell Thickness.
Albumen weight contributes directly to overall egg weight and is a key determinant of egg quality. The marked difference in albumen weight may result from variations in metabolic efficiency and genetic predispositions [22,23]. This finding is relevant for the egg processing industry, where albumen is a valuable product [24,25]. Yolk height (YH) and yolk weight (YW) also differ significantly between the breeds. Bovan Browns show higher yolk height (14.36 cm) and yolk weight (15.71 g) compared to Fayoumis (14.00 cm and 14.34 g, respectively; P < 0.05). These traits affect consumer preference and nutritional value, influencing the egg’s visual appeal and nutrient content [26,27]. The larger yolk in Bovan Browns is likely related to their overall egg size, as yolk traits often correlate with egg size [28,29]. In terms of shell quality, Bovan Browns have a significantly higher shell weight (SW) (4.99 g) compared to Fayoumis (4.00 g), indicating a breed effect (P < 0.05). However, shell thickness (ST) does not significantly differ between the breeds, with Bovan Browns and Fayoumis showing similar values (0.81 cm and 0.79 cm, respectively). This suggests that while Bovan Browns have heavier shells, the thickness remains similar, possibly due to a trade-off between weight and thickness [30,31].
Heavier shells in Bovan Browns may offer better protection during handling and transport [32]. Interestingly, the only trait where Fayoumi chickens excelled was yolk colour (YC), with Fayoumis showing a more intense yolk colour (1.37) compared to Bovan Browns (1.23), which was statistically significant (P < 0.05). Yolk colour is influenced by diet, particularly carotenoid content, and is a key factor in consumer preference [33,34]. The darker yolk in Fayoumis may result from differences in diet absorption efficiency or genetic factors that enhance carotenoid deposition [35]. Lastly, Bovan Browns have significantly greater egg length (5.45 cm) and egg width (4.23 cm) compared to Fayoumis (5.05 cm and 3.76 cm, respectively; P < 0.05). These differences highlight the impact of breed on egg dimensions, which can affect packing and handling processes [36].
Classification Tree Algorithm (CTA)
This study employed CTA analysis to differentiate between two exotic chicken breeds, Bovan Brown (BV) and Fayoumi (FM), based on a range of egg quality traits. The model exhibited strong predictive accuracy, with R² values of 0.989 for the training set and 0.930 for the validation set, underscoring the model's robustness. These high R² values suggest that the selected predictors, namely albumen height (AH), albumen weight (AW), egg length (EL), egg width (EWi), egg weight (EW), shell weight (SW), shell thickness (ST), yolk colour (YC), yolk height (YH), and yolk weight (YW), were highly effective in distinguishing between the chicken breeds. This result is consistent with previous research, which has also highlighted the efficacy of these traits in breed discrimination [37]. The analysis began with the root node [Node 0] (Figure 1), encompassing all 1006 observations, where the G² value of 1383.4 and LogWorth of 280.1 suggested substantial variability to be explained. The model generated five splits, with egg weight (EW) and albumen height (AH) emerging as primary discriminators. The first split occurred at EW<52.0 g [Node 1], forming two distinct groups. The majority (553 observations) fell below this threshold, predominantly classified as FM (Prob = 0.994), with a minimal presence of BV (Prob = 0.006). This outcome aligns with prior research that identified egg weight as a critical factor in breed differentiation [38].
The model's precision in classifying nearly all eggs at this node as FM (550 out of 553) highlights EW's significance as a predictor. Further splitting at AH<9.2 [Node 3] resulted in a node where all remaining observations (524) were classified as FM, demonstrating near-perfect classification. This suggests that a combination of low EW and low AH is almost exclusively indicative of FM chicken eggs. Previous studies have also underscored the importance of albumen traits, particularly albumen height, in distinguishing between breeds [39]. A small subgroup (29 observations) within the EW<52.0 g and AH≥9.2 cm [Node 4] revealed a mixed breed classification, with 10.3% classified as BV and 89.7% as FM. This indicates a nuanced relationship between these traits and breed classification, where higher AH slightly increases the likelihood of an egg being from the BV breed. However, FM's dominance in this node suggests that even within these trait ranges, FM eggs are more prevalent. In contrast, the subset of eggs with EW≥52.0 g [Node 2] (453 observations) was predominantly classified as BV (Prob = 0.986). Further refinement using AW and AH showed that higher albumen weights (AW≥32.8) [Node 6] and albumen heights (AH≥7.6) [Node 10] were particularly associated with the BV breed, with this node exhibiting near-perfect classification accuracy. This supports the idea that these traits are strong indicators of the BV breed [40].
An interesting finding emerged in the subset where EW≥52.0 and AW<32.8 [Node 5] (11 observations), which showed a near-even split between BV and FM. Despite some classification uncertainty indicated by a G² value of 15.2 and a LogWorth of 2.8, considering AH (AH≥7.6) [Node 10] led to 83.3% of the eggs being classified as BV, suggesting that AW's interaction with AH provides clearer differentiation. Overall, the CTA highlighted the dominant role of egg weight and albumen traits in breed discrimination. The model's high accuracy across both training and validation phases underscores the reliability of these predictors. However, the findings also reveal that while these traits are generally effective for breed classification, nuanced interactions can lead to mixed classification in edge cases. Future research could explore these interactions further, possibly incorporating additional egg quality traits to enhance classification precision [41].
Egg Quality Traits Importance
The analysis of trait importance revealed that egg weight (EW) was the most influential factor in breed classification, accounting for 94.4% of the model's discriminative power. This aligns with previous research that has consistently identified egg weight as a primary determinant in poultry breed differentiation [38]. Albumen weight (AW) was the second most significant predictor, with a G² value of 48.7. Although AW's influence was less than that of EW, it was important for refining breed identification, especially when EW alone was not sufficient. This highlights AW's role as a useful secondary indicator for distinguishing chicken breeds [40]. Albumen height (AH) also contributed to the model, with a G² value of 27.8 from two splits. While its impact was smaller compared to EW, AH might interact with traits like AW to improve breed discrimination in specific cases. This finding is consistent with earlier research showing that albumen height has a limited but significant effect on breed classification [39]. Other egg quality traits, such as yolk height (YH), yolk weight (YW), shell weight (SW), egg length (EL), egg width (EWi), shell thickness (ST), and yolk color (YC), had minimal impact on the model, with G² values of 0.0. This indicates that these traits did not significantly differentiate the breeds in this dataset, likely due to the dominant role of EW [41] (Table 2).
Note: AH = Albumen Height, AW = Albumen Weight, EL = Egg Length, EW = Egg Weight, EWi = Egg Width, SW = Shell Weight, YC = Yolk Colour, YH = Yolk Height, YW = Yolk Weight, and ST = Shell Thickness.
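The trait contributions can be approximately recovered from the training counts quoted in the text. The sketch below assumes that G² is the likelihood-ratio chi-square of a node, that is, -2 Σ n_k ln(n_k / n), and that a trait's contribution is the sum of the G² reductions over the splits that use it. Both are assumptions about how the software reports these values, and the per-node counts are derived here from the leaf counts given in the terminal leaf report rather than copied from Table 2.

```python
import math


def g2(bv, fm):
    """G^2 of a node from its breed counts: -2 * sum(n_k * ln(n_k / n))."""
    n = bv + fm
    return -2.0 * sum(c * math.log(c / n) for c in (bv, fm) if c > 0)


# Training counts as (BV, FM), derived from the counts quoted in the text
root = (450, 556)
node1, node2 = (3, 550), (447, 6)    # EW < 52.0 / EW >= 52.0
node3, node4 = (0, 524), (3, 26)     # AH < 9.2 / AH >= 9.2 within node 1
leaf2, leaf3 = (0, 24), (3, 2)       # EW < 49.0 / EW >= 49.0 within node 4
node5, node6 = (5, 6), (442, 0)      # AW < 32.8 / AW >= 32.8 within node 2
leaf4, leaf5 = (0, 5), (5, 1)        # AH < 7.6 / AH >= 7.6 within node 5

# Each split: (trait, parent, left child, right child)
splits = [
    ("EW", root, node1, node2),
    ("AH", node1, node3, node4),
    ("EW", node4, leaf2, leaf3),
    ("AW", node2, node5, node6),
    ("AH", node5, leaf4, leaf5),
]

contribution = {}
for trait, parent, left, right in splits:
    reduction = g2(*parent) - g2(*left) - g2(*right)
    contribution[trait] = contribution.get(trait, 0.0) + reduction

total = sum(contribution.values())
share_ew = contribution["EW"] / total

print(f"root G2 = {g2(*root):.1f}")
for trait, value in contribution.items():
    print(f"{trait}: G2 = {value:.1f}, share = {value / total:.3f}")
```

Under these assumptions the root G² and the contributions of AW and AH come out close to the reported 1383.4, 48.7, and 27.8, and the EW share is close to the reported 94.4%.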
Terminal Leaf Report and Confusion Matrix
This report provides detailed insights into how the egg quality traits influence breed discrimination. The CTA results indicate that egg weight (EW) and albumen height (AH) are critical predictors for discriminating between the two breeds. The tree’s structure begins with a primary split on EW, with a threshold at 52.0 g, which effectively distinguishes the majority of the FM breed when combined with albumen height. This finding aligns with previous research, where egg weight has been consistently identified as a key factor in breed differentiation [38] (Table 3).

Leaf 1) EW<52.0 & AH<9.2: This leaf, representing the largest proportion of FM eggs (Prob = 0.999), contained 524 instances of FM and none of BV. The near-perfect classification suggests that eggs lighter than 52.0 g and with a low albumen height are almost exclusively characteristic of the FM chicken breed. This result emphasizes the strong predictive power of these two traits when combined, which is consistent with findings by [39], who noted that albumen height is particularly crucial in differentiating between poultry breeds.

Leaf 2) EW<52.0 & AH≥9.2 & EW<49.0: In this leaf, all 24 instances were FM, with a classification probability of 0.985. The lower egg weight (<49.0 g) alongside a higher albumen height continues to strongly indicate the FM breed.

Leaf 3) EW<52.0 & AH≥9.2 & EW≥49.0: This leaf presents a more balanced classification, with three instances of BV and two of FM (Prob = 0.562 for BV and 0.438 for FM).
Note: AH = Albumen Height, AW = Albumen Weight, EL = Egg Length, EW = Egg Weight, and YW = Yolk Weight, BV = Bovan Brown, FM = Fayoumi
The increase in egg weight, closer to the 52.0 g threshold, introduces some ambiguity in classification, suggesting that while FM is still likely, there is a significant probability of encountering the BV breed as well. This scenario underscores the complexity of using egg quality traits for breed discrimination, particularly when dealing with overlapping characteristics [37].

Leaf 4) EW≥52.0 & AW<32.8 & AH<7.6: All five instances in this leaf were FM, with a classification probability of 0.92. This result indicates that even when egg weight exceeds 52.0 g, if the albumen weight and height are low, the egg is most likely from the FM breed. These findings align with those of [41], who also found that FM eggs typically exhibit lower albumen measurements compared to other breeds.

Leaf 5) EW≥52.0 & AW<32.8 & AH≥7.6: This leaf exhibited a dominant BV presence, with five instances of BV and only one of FM (Prob = 0.79 for BV). This outcome suggests that when albumen height increases beyond 7.6, despite a lower albumen weight, there is a strong likelihood of the egg belonging to the BV breed. This interaction between albumen height and weight reflects the nuanced nature of breed classification, where certain trait combinations can reverse expected trends [40].

Leaf 6) EW≥52.0 & AW≥32.8: The final leaf, which represents the most robust classification for BV (Prob = 0.999), contained 442 instances of BV and none of FM. High egg weight combined with high albumen weight serves as a definitive marker for the BV breed.
The near-perfect accuracy of this classification highlights the distinctiveness of BV eggs when these traits are maximized. This finding is consistent with earlier studies, which have shown that BV eggs tend to be larger and heavier with more substantial albumen content [37]. The classification tree algorithm effectively identified key egg quality traits that differentiate between BV and FM breeds, particularly highlighting the importance of egg weight, albumen height, and albumen weight. The results underscore the role of these traits as reliable indicators of breed, with specific thresholds for each trait providing clear classification rules. However, the analysis also reveals that while these traits are generally effective, there are instances of ambiguity where certain combinations, such as higher AH in lighter eggs, can lead to mixed classification. These findings suggest that while the model is robust, further refinement could involve exploring additional traits or interactions to enhance accuracy, particularly in edge cases where the classification is less clear.
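The six leaf rules described above can be written as a small decision function. This is a sketch that hard-codes the thresholds reported in the text; the function name `classify_egg` is hypothetical, each leaf is assigned the breed with the higher reported probability, and the inputs are assumed to be in the same units as the reported thresholds.

```python
def classify_egg(ew, ah, aw):
    """Return (leaf number, predicted breed) from the reported leaf rules.

    ew: egg weight, ah: albumen height, aw: albumen weight.
    """
    if ew < 52.0:
        if ah < 9.2:
            return 1, "FM"   # Leaf 1: light egg, low albumen height
        if ew < 49.0:
            return 2, "FM"   # Leaf 2: very light egg, high albumen height
        return 3, "BV"       # Leaf 3: mixed leaf, BV slightly more likely
    if aw >= 32.8:
        return 6, "BV"       # Leaf 6: heavy egg, heavy albumen
    if ah < 7.6:
        return 4, "FM"       # Leaf 4: heavy egg, low albumen weight and height
    return 5, "BV"           # Leaf 5: heavy egg, low albumen weight, higher AH


# Breed means reported in the analysis of variance (EW, AH, AW)
print(classify_egg(63.46, 10.85, 42.04))  # Bovan Brown means
print(classify_egg(44.33, 6.65, 26.07))   # Fayoumi means
```

An egg with the reported Bovan Brown means falls into Leaf 6, and one with the reported Fayoumi means falls into Leaf 1, the two largest and purest leaves of the tree.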
Confusion Matrix Analysis
The confusion matrix results offer valuable insights into the model's classification accuracy. In the training dataset, the model achieved perfect classification for Breed BV, correctly identifying all 450 instances, while for Breed FM, it accurately classified 553 out of 556 instances, yielding a near-perfect classification rate of 0.995. Only three instances of Breed FM were misclassified as Breed BV, reflecting a very low error rate of just 0.5%. The validation dataset further demonstrated the model's strong predictive performance. Out of 150 instances of Breed BV, 149 were correctly classified, and 184 out of 185 instances of Breed FM were accurately identified. The misclassification rates remained minimal, with just one instance of each breed assigned to the other, resulting in accuracy rates of 0.993 for BV and 0.995 for FM. These high accuracy rates underscore the model's robust generalization capability, indicating that it performs nearly as well on unseen data as it does on the training data [42] (Table 4).
Note: Values before brackets are counts, while those in brackets are percentages.
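The per-breed classification rates follow directly from the confusion matrix counts. The sketch below uses the counts quoted in the text; the off-diagonal validation counts are implied by the totals (150 BV and 185 FM), and the function names are hypothetical.

```python
# (actual breed, predicted breed) -> count, from the counts quoted in the text
training = {("BV", "BV"): 450, ("BV", "FM"): 0,
            ("FM", "BV"): 3, ("FM", "FM"): 553}
validation = {("BV", "BV"): 149, ("BV", "FM"): 1,
              ("FM", "BV"): 1, ("FM", "FM"): 184}


def per_class_rates(matrix, classes=("BV", "FM")):
    """Share of each actual breed that was classified correctly."""
    rates = {}
    for actual in classes:
        total = sum(matrix[(actual, predicted)] for predicted in classes)
        rates[actual] = matrix[(actual, actual)] / total
    return rates


def misclassification_rate(matrix):
    """Share of all eggs assigned to the wrong breed."""
    wrong = sum(n for (actual, predicted), n in matrix.items()
                if actual != predicted)
    return wrong / sum(matrix.values())


train_rates = per_class_rates(training)
valid_rates = per_class_rates(validation)
for name, matrix, rates in (("training", training, train_rates),
                            ("validation", validation, valid_rates)):
    print(name, {k: round(v, 3) for k, v in rates.items()},
          round(misclassification_rate(matrix), 4))
```

The training and validation sets together contain 1006 + 335 = 1,341 eggs, which matches the total number of eggs collected.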
Receiver Operating Characteristic (ROC) Analysis
The ROC analysis further validates the model's efficacy, with both breeds BV and FM achieving perfect area under the curve (AUC) scores of 1.0 for the training data. This indicates that the model distinguishes between the two breeds with almost no overlap or ambiguity in predictions. The validation data AUC scores, while slightly lower at 0.994 for both breeds, still reflect near-perfect classification performance, confirming the model's excellent ability to differentiate between the breeds with high confidence (Figure 2).

The results from the Classification and Regression Tree (CART) analysis demonstrate the model's exceptional capability to differentiate between poultry breeds based on egg quality traits. The model exhibited outstanding fit statistics, low error rates, and near-perfect classification accuracy across both the training and validation datasets, highlighting its robustness. The minor decline in performance observed on the validation data compared to the training data is a typical outcome, reflecting the model's generalization capacity. The performance metrics, including the low Root Average Squared Error (RASE) and Mean-Log p values, indicate that the model is not only accurate but also precise in its predictions. These results align with previous research that employed machine learning algorithms for breed classification based on phenotypic traits, thereby reinforcing the validity of this approach [43]. Additionally, the near-perfect AUC scores suggest that the model has substantial potential for practical applications, such as enhancing breeding programs, optimizing egg production, and advancing other aspects of poultry science.
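The training AUC can be approximated from the terminal leaf report alone, because every egg in a leaf receives the same predicted probability. The sketch below assumes the leaf probabilities and training counts quoted in the text (with P(BV) taken as one minus the reported FM probability where only the latter is given) and uses the rank-based definition of AUC: the probability that a randomly chosen BV egg scores higher than a randomly chosen FM egg, counting ties as one half. The function name is hypothetical.

```python
# (P(BV) in leaf, BV count, FM count) for the six terminal leaves
leaves = [
    (0.001, 0, 524),   # Leaf 1
    (0.015, 0, 24),    # Leaf 2
    (0.562, 3, 2),     # Leaf 3
    (0.080, 0, 5),     # Leaf 4
    (0.790, 5, 1),     # Leaf 5
    (0.999, 442, 0),   # Leaf 6
]


def auc_from_leaves(leaves):
    """Rank-based AUC: P(score_BV > score_FM) + 0.5 * P(tie)."""
    n_bv = sum(bv for _, bv, _ in leaves)
    n_fm = sum(fm for _, _, fm in leaves)
    wins = 0.0
    for score_bv, bv, _ in leaves:
        for score_fm, _, fm in leaves:
            if score_bv > score_fm:
                wins += bv * fm          # BV egg ranked above FM egg
            elif score_bv == score_fm:
                wins += 0.5 * bv * fm    # same leaf, so the pair is tied
    return wins / (n_bv * n_fm)


auc = auc_from_leaves(leaves)
print(f"training AUC = {auc:.4f}")
```

Under these assumptions the value is slightly below one and rounds to the reported training AUC of 1.0; the small shortfall comes from the two mixed leaves (Leaf 3 and Leaf 5).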
This study shows that Bovan chickens outperform Fayoumi chickens in most egg quality traits, except for yolk colour. These results align with previous research, which highlights the role of breed in determining egg quality. Such findings have important implications for breeding programs and commercial egg production. They emphasize the need for careful breed selection to enhance egg quality and meet market demands. The Classification Tree Algorithm (CTA) is highly effective for distinguishing between chicken breeds based on egg quality traits. Egg weight emerged as the most significant predictor. Although albumen weight and albumen height also contributed to classification, their impact was less than that of egg weight. The model's high accuracy in both training and validation datasets confirms the CTA's reliability for breed differentiation.
The authors declare that they have no conflicts of interest.
This research was conducted with the financial support of Haramaya University.