Discriminant Function Analysis (DFA) is a commonly preferred method for sex estimation as it renders prediction and subsequent identification more mathematical, objective, and direct [17]. However, DFA has several assumptions which should ideally be satisfied prior to its application. Discriminant analysis, being a parametric approach, assumes a normal distribution and homogeneity of the variance-covariance matrices. The method is sensitive to outliers and, in order to avoid overfitting, requires a sufficiently large sample size, i.e., at least 3–4 times the number of independent variables [27, 62, 63]. DFA has previously been employed for sex estimation with different markers across the skeletal framework [83,84,85,86,87,88,89], including the acetabulum [2, 4, 12, 16,17,18, 20, 24, 25, 44, 79]. These previous investigations highlighted the vertical acetabular diameter as one of the most discriminatory sex variables [17, 19, 25, 44, 81]. Discriminant Function Analysis in the present study indicates that sex can be differentiated through a demarking point, with males on the higher side (positive group centroid value) and females on the lower side (negative group centroid value). Patriquin et al. [19], Bubalo et al. [4], Steyn & Iscan [23], and Patriquin [44] reported similar findings for South African, Croatian, Greek, and African populations, respectively. Accuracy percentages obtained with LDFA across numerous studies are listed in Table 5. Accuracies obtained for males of the training set in the present study are comparable to those reported previously [2, 4, 12, 18, 19, 22, 23], while accuracies obtained for males of both the training and test sets in the present study are higher than those reported by Patriquin for South African white males [19], and by Steyn and Patriquin for South African and Greek populations [24]. Cross-validation accuracies reported by Macaluso for a French male population [78] are higher than those obtained here for the Iberian male test set.
For females of the training set, however, accuracy percentages obtained herein were higher only than those reported by Patriquin for black females [19]. For females of the test set, the present Iberian population yielded the lowest accuracy percentages when compared to previous studies [19, 78]. For the total population, the training set of the Iberian population yielded the lowest accuracy percentages, with the exception of Steyn and Patriquin [24], who reported a marginally lower accuracy of 82.50%. For the test set/cross-validation, the present study reported the lowest accuracy percentages. These differences in accuracy percentages, however, were not statistically significant. It is also prudent to mention here the lack of homogeneity with regard to test groups and cross-validation. While the present study employed a test (holdout) group comprising 15% of the total study sample, certain other investigations utilised leave-one-out cross-validation (LOOCV). This, too, could have contributed, in part, to the observed, albeit non-significant, differences in accuracy.
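The difference between a holdout test group and LOOCV can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of the present study: the diameters (in mm) are invented, the holdout indices are chosen by hand, and a simple midpoint-of-means rule stands in for a univariate discriminant function.

```python
from statistics import mean

# Hypothetical vertical acetabular diameters in mm (invented for illustration).
males = [56.1, 54.3, 57.8, 55.0, 53.2, 58.4, 52.9, 56.7, 55.5, 54.8]
females = [49.2, 50.8, 48.5, 51.9, 47.7, 50.1, 52.4, 49.9, 53.5, 48.8]
data = [(x, "M") for x in males] + [(x, "F") for x in females]


def fit_cutoff(train):
    """Midpoint between the two group means (stand-in for a univariate LDFA)."""
    m = mean(x for x, s in train if s == "M")
    f = mean(x for x, s in train if s == "F")
    return (m + f) / 2


def predict(x, cutoff):
    """Values above the cutoff are classified as male."""
    return "M" if x > cutoff else "F"


def holdout_accuracy(data, test_idx):
    """Fit once on the retained cases, score once on the held-out cases."""
    held = set(test_idx)
    train = [d for i, d in enumerate(data) if i not in held]
    test = [d for i, d in enumerate(data) if i in held]
    cutoff = fit_cutoff(train)
    return sum(predict(x, cutoff) == s for x, s in test) / len(test)


def loocv_accuracy(data):
    """Refit n times, each time scoring the single case that was left out."""
    hits = 0
    for i, (x, s) in enumerate(data):
        cutoff = fit_cutoff(data[:i] + data[i + 1:])
        hits += predict(x, cutoff) == s
    return hits / len(data)


# 3 of 20 cases (15%) held out; the indices are an arbitrary, hypothetical choice.
print(holdout_accuracy(data, [4, 9, 16]), loocv_accuracy(data))
```

A holdout estimate rests on a small number of test cases and therefore varies with the particular cases drawn, whereas LOOCV scores every case exactly once. Accuracies obtained under the two schemes are consequently not strictly like-for-like.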
Table 5 Sex estimation accuracy with the acetabular diameter, reported across literature
In comparison to LDFA, QDFA yielded marginally higher accuracy percentages for females of the training set. Males and the total population, however, garnered lower accuracy with QDFA. For males of the test set, QDFA gave higher accuracy percentages, whereas for females and the total population, LDFA yielded higher accuracy. Such varying patterns in accuracy call into question intervening factors which can influence the association between acetabular diameter and sex. Future investigations should attempt to establish the influence of such factors, for example age, stature, or body mass index, on the accuracy of the acetabular diameter in estimating sex. It is highly plausible that when such additional factors are taken into consideration, QDFA might prove to be the more accurate statistical approach, attributable to its use of a more flexible, i.e., quadratic, decision boundary. It is also important to note here that not all assumptions associated with DFA were satisfied in the present study, primarily with regard to normality of the sample. This, too, could have resulted in the inconsistencies in accuracy percentages observed between the linear and quadratic approaches.
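The contrast between a linear and a quadratic decision boundary can be illustrated for a single variable. The sketch below assumes normally distributed groups with equal prior probabilities and uses invented diameters (in mm) in which the male spread is deliberately wider; the function names are hypothetical and do not reproduce the software used in the present study.

```python
import math
from statistics import mean, variance

# Hypothetical diameters in mm; the male spread is deliberately wider.
males = [50.0, 54.0, 58.0, 62.0, 56.0]
females = [49.0, 50.0, 51.0, 50.5, 49.5]


def log_density(x, mu, var):
    """Log of the normal density with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def pooled_variance(a, b):
    """Single within-group variance shared by both sexes (linear case)."""
    num = (len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)
    return num / (len(a) + len(b) - 2)


def lda_predict(x, m=males, f=females):
    """Shared variance: the rule reduces to one cut point between the means."""
    var = pooled_variance(m, f)
    male_score = log_density(x, mean(m), var)
    female_score = log_density(x, mean(f), var)
    return "M" if male_score > female_score else "F"


def qda_predict(x, m=males, f=females):
    """Group-specific variances: the score difference is quadratic in x."""
    male_score = log_density(x, mean(m), variance(m))
    female_score = log_density(x, mean(f), variance(f))
    return "M" if male_score > female_score else "F"


for x in (40.0, 50.0, 57.0):
    print(x, lda_predict(x), qda_predict(x))
```

With a shared variance the two rules agree near the group means. Because the quadratic rule allows each sex its own variance, it can produce two cut points, so the two approaches may disagree for values far from both means; this is one way in which accuracies under LDFA and QDFA can diverge.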
Logistic Regression Analysis (LRA) constitutes another commonly utilised statistical approach for sex estimation [38, 41, 42, 90,91,92,93]. LRA, being a semiparametric approach, has fewer assumptions to satisfy when compared to DFA; primarily, a large sample size is warranted. LRA is more flexible than DFA: it does not mandate a normally distributed sample, linearly related predictor variables, or homoscedasticity, and it works well with both discrete and continuous data [27]. Despite this inherent flexibility, the use of LRA for sexing the acetabulum is relatively unreported [12, 75, 80]. Furthermore, Nagesh et al. [80] employed the acetabulum-pubis index for sex estimation, as opposed to just the acetabular diameter, and measured the acetabular diameter using a procedure different from that of the present study. Macaluso [78], on the other hand, employed the acetabular diameter and Logistic Regression Analysis for sex estimation and reported accuracies ranging from 84.10 to 89.60%, higher than those obtained herein with Iberian populations (Table 5).
It is, however, worth mentioning that discrepancies in accuracy between the aforementioned studies and the present study could also be attributed to the underlying sexual dimorphism present in the population(s) under scrutiny. Possible differences in sexual dimorphism between the populations studied in previous research and the present study sample may be one of the factors responsible for the observed differences.
ROC curves plotted for the aforementioned traditional statistical methods demonstrated acceptable discriminatory power [71] between the sexes when using this acetabular attribute.
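For a single measurement, the area under the ROC curve (AUC) can be read as the probability that a randomly chosen male has a larger value than a randomly chosen female, with ties counted as one half. The following minimal sketch computes the AUC in this pairwise manner; the function name and the diameters (in mm) are hypothetical and are not taken from the study sample.

```python
def pairwise_auc(male_values, female_values):
    """AUC as the share of male-female pairs in which the male value is larger.

    Ties contribute one half, so a variable carrying no sex information
    gives a value near 0.5 and perfect separation gives 1.0.
    """
    wins = 0.0
    for m in male_values:
        for f in female_values:
            if m > f:
                wins += 1.0
            elif m == f:
                wins += 0.5
    return wins / (len(male_values) * len(female_values))


# Hypothetical diameters in mm (invented for illustration).
males = [56.1, 54.3, 57.8, 55.0, 53.2]
females = [49.2, 50.8, 53.5, 51.9, 47.7]
print(pairwise_auc(males, females))
```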
Machine learning approaches
Artificial Neural Networks (ANN), a form of supervised Machine Learning (ML), are being increasingly incorporated into sex estimation investigations [94,95,96,97]. Unlike DFA, ML, and by extension ANN, does not mandate satisfying any assumptions regarding distribution of the sample. An extensive literature search revealed that ANN has not been employed for sexing the acetabulum so far. As a result, findings of our study could not be corroborated by previous literature. However, convolutional neural networks have previously been utilised to estimate sex from the acetabular morphology and yielded an accuracy of 74.60% [98]. The neural networks used within the present research were built using different activation functions: hyperbolic tangent and sigmoid within the input and hidden layers, and hyperbolic tangent, sigmoid, and softmax within the output layer. Given that the required output takes the form of 0 or 1, sigmoid activation presents as the ideal choice for input and hidden layers, whereas softmax activation is apt for the output layer as the intended objective is to classify subjects into mutually exclusive classes. However, hyperbolic tangent within the output layer yielded higher accuracy when compared to the softmax function. In-depth research into how varying activation functions can impact observed sex estimation accuracy, and possible reasons for this, is currently wanting.
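How the three activation functions relate to one another can be shown numerically. The sketch below uses hypothetical output-layer inputs (the values of z are invented) and illustrates two standard identities: a two-class softmax equals a sigmoid applied to the difference of its inputs, and the hyperbolic tangent is a rescaled sigmoid with outputs in (-1, 1) rather than (0, 1).

```python
import math


def sigmoid(z):
    """Logistic function; outputs lie in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Normalised exponentials; outputs are positive and sum to 1."""
    top = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical pre-activation values for the "male" and "female" output nodes.
z_male, z_female = 2.0, 0.5

p_softmax = softmax([z_male, z_female])[0]
p_sigmoid = sigmoid(z_male - z_female)  # same value as the two-class softmax
t = math.tanh(z_male)                   # equals 2 * sigmoid(2 * z_male) - 1

print(p_softmax, p_sigmoid, t)
```

Because tanh outputs are centred on 0, a tanh output node would be thresholded at 0 rather than at 0.5; this is a matter of scaling and does not by itself explain the accuracy differences reported above.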
Support Vector Machines, another class of supervised Machine Learning approaches, enable both regression and classification. An effective application of SVC warrants that the data be linearly separable, and this is often achieved through the use of kernels. An advantage of SVC for sexing is that it is applicable even with smaller datasets. While SVC has previously been utilised for sex estimation with different skeletal markers [46, 99, 100], its usage for sexing the acetabulum is presently lacking. This prevented a comparative evaluation of our results. Preliminary scatter plot evaluations of the training set data points indicated that a linear separation is feasible for the data at hand, i.e., males and females occupy, by and large, different spaces which can be linearly separated. This finding is corroborated by both the centroid values obtained with DFA [4, 19, 23, 44] and the known anatomical size differences between the two sexes [26]. Linear kernels are additionally advantageous as they are often simpler and quicker to train. Nevertheless, given the occasionally higher accuracy observed with QDFA, it might be beneficial to incorporate additional factors such as age, and to investigate sexing accuracy using polynomial kernels in future investigations. In fact, the dynamic shape metamorphosis of the pelvis across the adult human lifespan has already been noted [101].
Decision Trees are another mode of non-parametric supervised Machine Learning, applicable to both regression and classification problems. Being, primarily, a non-statistical approach, Decision Trees require no assumptions regarding distribution or variance. However, they do warrant certain non-statistical assumptions such as the discretization of continuous variables. Decision Trees have previously been utilised for sexing the pelvis [27, 63], and also the acetabulum [98, 102]. However, Yusuf et al. [102] reported cumulative accuracies (involving multiple variables pooled together) for sexing with Decision Trees, whereas Cao et al. [98] focussed their investigation on morphological sex differences of the acetabulum. Different Decision Trees can be constructed using specific growing methods such as CHAID (Chi-square Automatic Interaction Detection), CRT (Classification and Regression Trees), QUEST (Quick, Unbiased, Efficient Statistical Tool), etc. CHAID and CRT trees carry out splits using a chi-squared test and computations of Gini impurity, respectively. QUEST trees, on the other hand, split on the assumption that the target variable is a continuous variable. The three growing methods differ not only in the splitting method employed, but also in the kind of data they can handle. While CHAID and QUEST work with categorical variables, CRT works equally well with categorical and continuous data. Klales et al. utilised all three growing methods for sexing the pelvis and reported similar accuracy values with each [27]. In keeping with these findings, and given the data flexibility accorded by CRT, this method alone was employed within the present study. Accuracy obtained with DTC herein could not be compared with previous literature due to a lack of similar data. Future investigations should attempt to decipher how the use of different growing algorithms can impact the accuracy and bias associated with sexing the acetabulum, if at all.
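The Gini-based splitting used by CRT-type trees can be sketched for a single continuous variable. The example below is a simplified illustration with invented diameters (in mm) and hypothetical function names; it searches only for the first split and omits stopping rules and pruning, so it should not be read as the implementation used in the present study.

```python
def gini(labels):
    """Two-class Gini impurity: 0 for a pure node, 0.5 for an even mix."""
    if not labels:
        return 0.0
    p = sum(1 for s in labels if s == "M") / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split.

    Candidate thresholds are midpoints between consecutive distinct values;
    the one giving the lowest size-weighted Gini impurity is kept.
    """
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [s for _, s in pairs[:i]]
        right = [s for _, s in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = (low + high) / 2, impurity
    return best_threshold, best_impurity


# Hypothetical diameters in mm with their known sex (invented for illustration).
values = [48.0, 50.0, 52.0, 54.0, 56.0, 58.0]
labels = ["F", "F", "F", "M", "M", "M"]
print(best_split(values, labels))
```

With a single predictor, each split is simply a cut point on the diameter, which is why a shallow tree with nearly pure leaves can suffice.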
kNN is a type of non-parametric supervised ML classification approach, and thus does not warrant satisfying any assumptions regarding sample distribution. Classification within kNN ensues on the basis of patterns observed within the data, as opposed to predetermined labels. The k in kNN denotes the number of most similar individuals in a reference sample (training set), and subsequent classification is undertaken based on the group identities of these similar individuals [27]. kNN algorithms have previously been utilised for sexing the pelvis [27, 49]. However, there is a scarcity of literature regarding their application for sexing the acetabulum. Since kNN classification relies greatly on the nearest neighbours, different k values can impact obtained accuracy values significantly. Within the present study, k values of 50 and 75 proved to be the most reliable for sex estimation. Further attempts should be made to utilise different ‘k’ values in order to better understand how the performance of kNN classification changes with k.
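The effect of k on classification can be seen in a small sketch. The reference sample below is invented and far smaller than the study sample, so the k values used here (3 and the whole sample) are purely illustrative and do not correspond to the k of 50 and 75 reported above; the function name is hypothetical.

```python
def knn_predict(x, reference, k):
    """Classify x by majority vote among its k nearest reference individuals.

    Distance is the absolute difference in diameter. With an even k a tied
    vote falls to "F", so an odd k is preferable in practice.
    """
    nearest = sorted(reference, key=lambda item: abs(item[0] - x))[:k]
    male_votes = sum(1 for _, sex in nearest if sex == "M")
    return "M" if 2 * male_votes > len(nearest) else "F"


# Hypothetical reference sample: (diameter in mm, sex); 5 males and 4 females.
reference = [
    (56.1, "M"), (54.3, "M"), (57.8, "M"), (55.0, "M"), (53.2, "M"),
    (49.2, "F"), (50.8, "F"), (48.5, "F"), (51.9, "F"),
]

print(knn_predict(49.0, reference, 3))               # vote among local neighbours
print(knn_predict(49.0, reference, len(reference)))  # vote among everyone
```

When k approaches the size of the reference sample, every case receives the label of the majority group regardless of its measurement. Very small k values, in turn, make the vote sensitive to individual outliers, which is why accuracy should be examined across a range of k.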
ROC curves plotted for the machine learning approaches indicated excellent discrimination [71] with ANN and acceptable discrimination with DFA, LRA, SVC, and DTC, while the aforementioned activation functions yielded outstanding discrimination [71] between the two sexes.
In the present study, DTC yielded the highest accuracy percentages for females and the combined population, and ANN garnered the most accurate results for males. The improved performance observed with ML approaches herein is in agreement with previously undertaken studies [45, 46, 100, 103,104,105,106,107]. The high performance measures observed with DTC can be attributed to the inherent pruning characteristic of Decision Trees, which prevents overfitting of data. Furthermore, the use of a single variable in the present study could have also contributed to the observed high accuracy through the creation of a simple tree with pure leaf nodes, as opposed to complex trees with consistently declining purity. Artificial Neural Networks, in turn, generate high accuracy by modelling heteroscedasticity much more efficiently, and through their ability to predict unseen/unknown cases through generalization. A high AUC value for the ANN ROC curves further validates the accurate performance of Neural Networks (Table 4).
Males of the training set garnered higher accuracy percentages in comparison to females with most statistical approaches, with the exception of QDFA and DTC. For the test set, however, higher accuracy percentages were, by and large, observed in females. The only exceptions to this pattern were LDFA and QDFA, wherein males of the test set demonstrated higher accuracy. Previously undertaken investigations with LDFA and LRA have also indicated such variable findings, with certain studies reporting higher accuracy in males [12, 19, 23, 44, 78], and certain others illustrating higher accuracy percentages for females [4, 12, 19, 24,