Factor structure and trends in SF-12v2 health-related quality of life scores among pre- and post-pandemic samples in Thailand: confirmatory factor analysis and Rasch analysis

This study examined changes in item endorsement patterns in the general Thai population between two time points, before and after the COVID-19 pandemic. The DIF analysis revealed significant shifts in some of the eight subscales and two summary components across sociodemographic subgroups, which contributed to corresponding changes in the affected mean scores.

Factor analysis

As in previous studies [17, 26, 42, 43, 44], the CFA did not support a single-factor model with all 12 items of the SF-12v2 loaded onto one latent construct. However, the CFA validated the two-factor structure of the Thai SF-12v2, with the 12 items loading onto their respective summary components; this structure remained consistent across all three datasets. These findings suggest that the Thai general population conceptualized the SF-12v2 items in much the same way as other populations, in alignment with the developer’s original model [17, 25, 42]. The psychometric evaluation further supported the use of both the PCS and MCS for HRQoL measurement, with evidence of strong convergent and discriminant validity and satisfactory internal consistency across datasets.
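Convergent validity and internal consistency of this kind are often summarized from standardized loadings as the average variance extracted (AVE) and composite reliability (CR). The sketch below assumes those Fornell-Larcker-style indices, which the text above does not specify, and uses hypothetical loadings rather than the study's estimates.

```python
# Hypothetical standardized loadings for one latent component
# (illustrative values only, not estimates from the study).
loadings = [0.9, 0.8, 0.7, 0.6]


def average_variance_extracted(lams):
    """Mean of the squared standardized loadings."""
    return sum(lam * lam for lam in lams) / len(lams)


def composite_reliability(lams):
    """Squared sum of loadings divided by itself plus the summed
    error variances, where each error variance is 1 - loading^2."""
    total = sum(lams) ** 2
    error = sum(1 - lam * lam for lam in lams)
    return total / (total + error)


ave = average_variance_extracted(loadings)
cr = composite_reliability(loadings)
print(f"AVE = {ave:.3f}, CR = {cr:.3f}")
```

Under commonly cited rules of thumb, an AVE of at least 0.50 and a CR of at least 0.70 are read as adequate. A loading of 0.70 corresponds to the factor explaining about half of an item's variance (0.70² = 0.49).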

Consistent with the CFA results and a previous study [45], the Rasch analysis did not support a unidimensional structure for the 12-item SF-12v2. The model demonstrated significant item–trait interactions and failed to meet the unidimensionality criterion in the combined dataset, despite achieving satisfactory reliability (PSI = 0.84–0.86). Nevertheless, the Rasch analysis supported the eight domain subscales, the PCS, and the MCS, which satisfied the assumptions of the Rasch model (nonsignificant item–trait interactions, unidimensionality, and reliability) across all three datasets without requiring subtest construction. These findings contradict those of studies in populations with stroke [16] or lung cancer [46], where Rasch models supported only the MCS. The different characteristics of the tested populations could account for this discrepancy.

Comparison of mean HRQoL scores before and after the pandemic

Analysis of mean score differences across sociodemographic variables revealed that older participants and those with comorbidities reported significantly lower mean scores on the physical health-related subscales (PF, RP, GH, and BP) and the PCS than their counterparts. Scores on the mental health-related subscales (SF and RE) were also significantly lower among participants with comorbid conditions. However, no significant differences in MCS scores were detected across sociodemographic subgroups. These findings partially align with previous studies conducted in the general Polish population [10] and among older adults in the Netherlands [47] and China [44], which reported lower PCS scores among older and comorbid participants. However, those studies also found significantly lower MCS scores among participants with chronic illnesses. This discrepancy may stem from differences in analytical methods. The aforementioned studies used ANOVA and independent-sample t-tests, which do not account for confounders, whereas this study employed a GLM, enabling adjustment for multiple sociodemographic variables and thus yielding more robust associations. Additionally, comorbidities in the current sample were primarily related to physical illness, likely exerting a limited influence on mental health. Therefore, the MCS might be insensitive to meaningful differences in mental health among health status subgroups.
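The point about confounder adjustment can be illustrated with a toy example. In the sketch below, synthetic data (not study data) are constructed so that a score depends only on a hypothetical sociodemographic confounder, while comorbidity is more common where that confounder is present. An unadjusted comparison of means then shows a comorbidity gap that disappears once the confounder enters a linear model, used here as a simple stand-in for the GLM. All names and numbers are assumptions.

```python
# Synthetic toy data (not from the study): (confounder, comorbidity, count).
# By construction, the score depends only on the confounder.
cells = [(0, 0, 40), (0, 1, 10), (1, 0, 10), (1, 1, 40)]
rows, scores = [], []
for confounder, comorbid, count in cells:
    for _ in range(count):
        # Design row: intercept, comorbidity, confounder
        rows.append([1.0, float(comorbid), float(confounder)])
        scores.append(50.0 - 5.0 * confounder)


def mean(values):
    return sum(values) / len(values)


# Unadjusted comparison, analogous to comparing raw group means.
with_comorbidity = [s for r, s in zip(rows, scores) if r[1] == 1.0]
without_comorbidity = [s for r, s in zip(rows, scores) if r[1] == 0.0]
unadjusted_gap = mean(with_comorbidity) - mean(without_comorbidity)


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


intercept, adjusted_gap, confounder_effect = ols(rows, scores)
print(f"unadjusted gap: {unadjusted_gap:.2f}")
print(f"adjusted gap:   {adjusted_gap:.2f}")
```

In this constructed case the unadjusted gap is entirely due to the uneven distribution of the confounder across comorbidity groups, which is the kind of distortion that covariate adjustment is meant to remove.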

The GLM revealed significant differences in the RE subscale and the MCS, while the physical-related subscales and the PCS remained unchanged across the pre- and post-COVID-19 datasets. The RE subscale appeared to contribute significantly to the observed MCS differences, consistent with the CFA findings in which RE02 and RE03 had the highest loadings on the MCS across all three datasets (standardized coefficient: 0.781–0.958). These findings suggest that participants’ mental health improved after the COVID-19 pandemic. Conversely, prior studies have reported a significant decline in both the physical and mental health of participants following COVID-19 infection [48, 49]. Several explanations may account for these discrepancies. First, different instruments, recall periods, and health domains were used to assess HRQoL. Second, the proportion of older adults (aged ≥ 60 years) was higher in the post-COVID-19 dataset (25.15%) than in the pre-COVID-19 dataset (13.2%). Older adults generally report better mental health than younger participants, aligning with Thai SF-36v2 data, which indicate higher MCS scores among older Thai adults than among younger participants (aged between 18 and 29) [50]. This pattern is quite possibly a statistical artifact of the proprietary scoring of the summary scores. The component summary scores of the SF-12v2 are calibrated to approximate the summary scores of the SF-36v2, which are in turn based on analysis of SF-36v1 data. The proprietary scoring is based on an orthogonal EFA solution analyzing the standardized subscale scores, and the consequence of applying negative scoring coefficients to negative standardized subscale scores is that lower PCS scores cause higher MCS scores and vice versa [51]. Thus, older individuals have lower physical health scores, resulting in higher mental health scores by definition. Third, as the postpandemic dataset was collected after the acute phase of infection, the SF-12v2 items may lack the sensitivity to detect residual health impacts. Similarly, previous research demonstrated that the PCS and MCS scores before COVID-19 infection, as well as at 3 and 6 months following the baseline assessment, were higher than the baseline scores, and both summary scores demonstrated a slight upward trend over time [20]. These findings suggest that COVID-19 infection had a negative impact on both physical and mental health during the acute phase of the illness, yet the effects diminished by six months after the baseline assessment. Therefore, it is difficult to determine whether the higher MCS scores were attributable to the pandemic.
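The scoring artifact raised under the second explanation can be made concrete with a small sketch. It assumes summary scores of the form 50 + 10 × (weighted sum of standardized subscale scores) and uses hypothetical weights that only mimic the sign pattern of an orthogonal solution; they are not the licensed SF-12v2 coefficients.

```python
SUBSCALES = ["PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH"]

# Hypothetical weights for illustration only. The key feature is the sign
# pattern: physical subscales get negative weights in the mental summary,
# and mental subscales get negative weights in the physical summary.
PCS_WEIGHTS = {"PF": 0.42, "RP": 0.35, "BP": 0.32, "GH": 0.25,
               "VT": 0.03, "SF": -0.01, "RE": -0.19, "MH": -0.22}
MCS_WEIGHTS = {"PF": -0.23, "RP": -0.12, "BP": -0.10, "GH": -0.02,
               "VT": 0.24, "SF": 0.27, "RE": 0.43, "MH": 0.49}


def summary_score(z_scores, weights):
    """Norm-based summary score from standardized subscale scores."""
    return 50 + 10 * sum(weights[name] * z_scores[name] for name in SUBSCALES)


# Two hypothetical respondents with identical (average) mental subscales.
# The second is one standard deviation below average on physical subscales.
healthy = {name: 0.0 for name in SUBSCALES}
impaired = dict(healthy, PF=-1.0, RP=-1.0, BP=-1.0, GH=-1.0)

for label, person in [("healthy", healthy), ("impaired", impaired)]:
    pcs = summary_score(person, PCS_WEIGHTS)
    mcs = summary_score(person, MCS_WEIGHTS)
    print(f"{label}: PCS = {pcs:.1f}, MCS = {mcs:.1f}")
```

Although both respondents report the same mental subscale scores, the physically impaired respondent receives the higher MCS, because negative weights multiply negative physical z-scores.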

DIF analysis across different sociodemographic subgroups

Adequate Rasch model fit for the eight subscales, as well as for the PCS and MCS, enabled the DIF analysis to assess item endorsement across sociodemographic factors and time points. Significant DIF was observed for the PF subscale by source (pre- and post-COVID-19 datasets), age, and health status. Graphical inspection revealed that post-COVID-19, older, and unhealthy participants had mean scores in each class interval that were consistently below those of their counterparts. These results indicated greater item endorsement difficulty and significantly lower PF scores in these groups, except in the comparison by source. The absence of a significant difference in PF scores by source may be attributed to methodological differences: the DIF analysis used two-way ANOVA, whereas the GLM assessed mean differences.

Similarly, significant DIF for the PCS was observed by source, age, and health status; however, mean PCS differences were not significant when considering only the source. The DIF graph also indicated that the lines representing the post-COVID-19 dataset and older and unhealthy participants were consistently below those of their counterparts, suggesting that the endorsement difficulty of the PCS was greater for those participants, yielding lower mean PCS scores. The MCS exhibited significant DIF by source and age, despite the absence of significant mean MCS differences across age subgroups. DIF graphs suggested that adults aged 18 to 36 years and prepandemic participants had greater endorsement difficulty for MCS items. Notably, the ICCs indicated that the lines representing sociodemographic subgroups crossed for all subscales and summary components, suggesting significant nonuniform, rather than uniform, DIF [18].
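The reading of crossing lines as nonuniform DIF can be sketched as a simple check on class-interval means. This is a descriptive heuristic applied to hypothetical values, not the two-way ANOVA used for DIF testing, and the function name and labels are illustrative assumptions.

```python
def dif_pattern(group_a, group_b, tol=1e-9):
    """Describe how two groups' class-interval means relate.

    Each argument lists one group's observed mean score per class
    interval, ordered from the lowest to the highest trait level.
    """
    diffs = [a - b for a, b in zip(group_a, group_b)]
    a_above = any(d > tol for d in diffs)
    a_below = any(d < -tol for d in diffs)
    if a_above and a_below:
        # The gap changes sign, so the plotted lines cross.
        return "crossing (suggests nonuniform DIF)"
    if a_above or a_below:
        # One group is never above the other. Lines that do not cross
        # but are not parallel can still reflect nonuniform DIF.
        return "same direction (consistent with uniform DIF)"
    return "no gap"


# Hypothetical class-interval means for three trait levels.
print(dif_pattern([1.0, 2.0, 3.0], [1.5, 2.5, 3.5]))
print(dif_pattern([1.0, 2.0, 3.5], [1.5, 2.0, 3.0]))
```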

Research implications

In this study, the CFA indicated that MH03 and VT02 had low factor loadings (< 0.70) on the MCS subscale, suggesting that these items—and possibly the MCS subscale overall—may be insensitive to variations in mental health among the general Thai population. Future health surveys in Thailand should include participants with a broader spectrum of physical and mental health conditions to more accurately capture HRQoL using the physical and mental health subscales of the SF-12v2.

Because the validation showed that the physical subscales and the PCS effectively differentiated HRQoL across sociodemographic subgroups, the SF-12v2 could be incorporated into clinical assessment to evaluate treatment effectiveness among patients with physical illnesses. Furthermore, the PSI values for the eight domain subscales and the two summary components (PCS and MCS) were all greater than 0.7, indicating sufficient reliability [39]. These results support the use of these scales for comparing HRQoL levels between groups undergoing different treatment procedures, which can facilitate selection of the most appropriate clinical intervention to achieve the best clinical outcomes.
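For reference, the PSI is commonly computed as the share of variance in person location estimates that is not attributable to measurement error. The sketch below assumes that formulation and uses hypothetical person estimates and standard errors, not study data.

```python
from statistics import mean, variance


def person_separation_index(estimates, standard_errors):
    """(observed variance - mean error variance) / observed variance."""
    observed = variance(estimates)  # sample variance of person locations
    error = mean([se ** 2 for se in standard_errors])
    return (observed - error) / observed


# Hypothetical person locations (logits) and their standard errors.
locations = [-2.0, -1.0, 0.0, 1.0, 2.0]
errors = [0.6, 0.5, 0.5, 0.5, 0.6]

psi = person_separation_index(locations, errors)
print(f"PSI = {psi:.3f}")
```

The index rises as respondents are spread more widely along the trait or as their locations are estimated more precisely, which is why it is read in a similar way to other reliability coefficients.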

Although the SF-12v2 may not adequately capture some positive states such as calmness and peacefulness (MH03) or vitality (VT02), it can effectively assess other important aspects of mental health, such as role limitations due to emotional problems, depression, and social functioning, to identify the psychological impact of chronic conditions. These measures may support the identification of mental health consequences and guide the most appropriate coping strategies for patients with chronic illnesses. If the goal of HRQoL measurement is to measure positive feelings, the SF-12v2 can be complemented with mental health-specific instruments to provide a more comprehensive evaluation of mental well-being within the target population.

Future studies are also encouraged to further validate the SF-12v2 questionnaire using a clinimetric approach [52].

Study limitations

Some limitations of this study should be acknowledged. First, post-COVID-19 participants were not included; consequently, the mean score changes for both the eight domain subscales and the two summary components may not reflect the impact of the COVID-19 pandemic. Second, most comorbidities were related to physical illnesses; therefore, the SF-12v2 may have been insufficiently sensitive to detect mental health differences. Participants in future studies should have both physical and mental health conditions to better represent the general Thai population. Third, the two summary component scores were calculated using a proprietary scoring method, which may have caused inconsistencies between item-level data and summary scores. Future research should further develop scoring coefficients derived from CFA for the Thai population to address this issue. Fourth, of the eight subscales, GH, VT, BP, and SF are each based on a single item, whereas the others include two items. Single-item subscales cannot independently fit a measurement model, limiting their psychometric interpretability [19]. Caution should be exercised when interpreting findings based on subscale scores. Fifth, CFI and TLI might be inflated when using polychoric correlations; therefore, these indices should be interpreted together with other fit indices, such as RMSEA and SRMR.
