The present study uses data from the PROMPt project, a prospective screening and indicated prevention implementation study [50]. It was conducted in the municipal area of Dresden, Germany, from 10/2018 until 09/2022. Children were screened for disruptive behavior and emotional problems during their regular health check-ups at their pediatrician (U9-U11, usually for 5 to 10 year olds) using the parent report of the Strengths and Difficulties Questionnaire (SDQ) [18, 19]. Based on the SDQ results and the pediatrician’s clinical expertise, parents received feedback and, if indicated, a recommendation for further action from their pediatrician. For children with disruptive behavior problems, participation in the indicated prevention program adapted from the Baghira group training for children with oppositional and aggressive behavior by Aebi et al. [1] (short: Baghira training) was recommended. The indicated prevention program “Mutig werden mit Til Tiger” (Becoming brave with Til Tiger, short: Tiger training; Ahrens-Eipper et al. [2]) was recommended for children who screened positive for emotional problems. Children with abnormal (i.e. clinically relevant) levels of mental health problems were advised to contact counseling or therapeutic institutions for further diagnostics and, if necessary, treatment.
Process for training participation
Families interested in participating in the recommended indicated prevention program were to contact the study team directly. If a family did not do so within 3–4 weeks despite a prevention recommendation, the study team attempted to contact them up to five times to determine their interest in participating, provided that consent to be contacted had been given. In addition to screening during pediatric health check-ups, other access routes to the indicated prevention programs were possible and observed in PROMPt, e.g. self-referral or recommendation by friends or acquaintances if no regular health check-up was coming up soon or if the child had private health insurance.
All interested families were invited to an on-site initial interview with a psychologist (project team member) at the TUD Dresden University of Technology (TUD) to evaluate whether the child would benefit from the indicated prevention program and whether the inclusion criteria were fulfilled. Exclusion criteria for training participation were a diagnosed disruptive behavior or emotional ICD-10 mental disorder or unstable medication in the last 6 months, a current psychotherapeutic treatment, and acute endangerment of self or others. These criteria ensured that children’s symptomatology was within the prevention spectrum and in line with clinical guidelines. Families who had entered through an additional route also filled in the SDQ during the interview. Children evaluated as eligible for the Baghira or Tiger training were assigned by the study team to a training group of similarly aged children (± 1 year) comprising three to five children (M = 4.24, SD = 0.50; the ideal group size of six was reduced due to the Covid-19 pandemic). If children were eligible for both trainings due to co-occurring disruptive behavior and emotional problems, the study team member, the parents and, if applicable, the child jointly decided which training was more appropriate. In six cases, both trainings were conducted consecutively (three each with Baghira or Tiger training first). For training participation, all legal guardians gave their written informed consent and children gave their verbal consent.
Assessments
Questionnaire assessments took place at four time points: screening at the pediatrician’s office or during the initial interview by the study team, T0 shortly after screening or before the start of training, T1 approximately 6 months after screening or shortly after the end of training, and T2 approximately 12 months after screening (i.e. 6 months after T1 or the end of training). At screening, the SDQ and a short project-specific questionnaire including socio-demographic data were completed in paper-and-pencil format. For training participants, T0 and T1 were administered via tablet at the TUD and T2 via online questionnaire. Participating families received 10€ for each fully completed assessment. Families not participating in a training filled in all questionnaires at T0, T1 and T2 online. These families could win family games in a raffle for each fully completed assessment. All families gave their written informed consent for participation in the questionnaire assessments.
Measures
Strengths and Difficulties Questionnaire (SDQ)
The German version of the SDQ for 4 to 17 year olds [18, 19] was administered as a screening instrument at screening, T1 and T2. It consists of 25 items, with 5 items each for the sub-scales emotional problems, conduct problems, hyperactivity/inattention, peer relationship problems and prosocial behavior. The questions referred to the past six months and were rated on a 0- to 2-point response scale (0 = not true, 1 = somewhat true, 2 = certainly true). For the project flow and the current analyses, the sub-scales conduct problems (to determine disruptive behavior) and emotional problems were considered, because they were the basis for the decision on the prevention indication at screening. When calculating the sub-scales’ sum scores, a maximum of two missing values per sub-scale were imputed by the child’s mean score on the respective sub-scale (corresponding to Ravens-Sieberer et al. [37]). Higher scores indicated greater problems. For the PROMPt project, the usual cut-offs for categorization [18] into normal, borderline and abnormal scores (conduct problem scale 0–2, 3, 4–10; emotional problem scale 0–3, 4, 5–10) were slightly modified in order to reach more children with a potential prevention indication (conduct problem scale 0–2, 3–5, 6–10; emotional problem scale 0–3, 4–6, 7–10). In the analysis sample, the total SDQ showed good internal consistency (α = 0.85–0.86 depending on measurement time point). In previous work, the SDQ sub-scales of the parent version were found to have moderate to satisfactory test-retest reliability and concurrent validity [45].
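The scoring rule described above can be illustrated with a minimal sketch. It assumes that items are already coded 0 to 2 in the problem direction, that missing answers are represented as `None`, and that a fractional score resulting from imputation falls into the band whose lower bound it has reached; none of these details are specified in the text, and the function names are hypothetical.

```python
from typing import Optional, Sequence

# Modified PROMPt cut-offs from the text, stored as the lower bounds
# of the borderline and abnormal bands for each sub-scale.
CUTOFFS = {
    "conduct": (3, 6),    # normal 0-2, borderline 3-5, abnormal 6-10
    "emotional": (4, 7),  # normal 0-3, borderline 4-6, abnormal 7-10
}


def subscale_score(items: Sequence[Optional[int]],
                   max_missing: int = 2) -> Optional[float]:
    """Sum score of one sub-scale (five items rated 0-2).

    Up to `max_missing` unanswered items are replaced by the child's
    mean of the answered items on the same sub-scale.
    """
    answered = [x for x in items if x is not None]
    n_missing = len(items) - len(answered)
    if n_missing > max_missing or not answered:
        return None  # too many gaps: no score is computed
    item_mean = sum(answered) / len(answered)
    return sum(answered) + n_missing * item_mean


def categorize(score: float, subscale: str) -> str:
    """Map a sum score to normal / borderline / abnormal.

    Assumption (not from the text): fractional scores are compared
    against the lower bounds of the bands without rounding.
    """
    borderline, abnormal = CUTOFFS[subscale]
    if score >= abnormal:
        return "abnormal"
    if score >= borderline:
        return "borderline"
    return "normal"


# One missing conduct item: 6 answered points + imputed mean of 1.5
example = subscale_score([2, 1, None, 2, 1])
print(example, categorize(example, "conduct"))
```

With three or more missing items the function returns `None`, which corresponds to a sub-scale score that cannot be calculated.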
KINDL-R
We administered the German parent versions of the KINDL-R Quality of Life Questionnaire for Children, using the Kiddy-KINDL-R version for children not attending school and the Kid-/Kiddo-KINDL-R version for schoolchildren [34, 35]. Each version consists of six sub-scales (physical well-being, emotional well-being, self-esteem, family, friends and everyday functioning) with four items each. Items are rated on a 5-point response scale (never, seldom, sometimes, often, all the time). Both versions are identical except for the sub-scale everyday functioning, which contains questions about either nursery school/kindergarten or school. For the analyses, we combined both versions (referred to as “KINDL”), because many of the children attending nursery school/kindergarten at T0 had started school by T2 and therefore switched between versions, and because score calculation was identical. Sum scores were calculated with a maximum of 30% missing values being imputed by the child’s mean score on the respective sub-scale, according to the manual. Higher scores indicated greater quality of life. In the analysis sample, the Kiddy-KINDL-R and Kid-/Kiddo-KINDL-R showed good internal consistency (α = 0.86–0.89 depending on version and measurement time point). The KINDL has been shown to have moderate to satisfactory test-retest reliability and convergent validity [9, 34, 49].
Sample
A flowchart of the analysis samples is shown in Fig. 1. Further details on the full study sample of the PROMPt project can be found in Weniger et al. [51]. Briefly, n = 3739 study invitations to families during the regular pediatric health check-ups were documented. Of these, n = 3231 children were screened at the pediatrician’s office (response rate based on documented study invitations: 86.4%), of whom n = 387 had to be excluded due to a lack of written informed consent, resulting in a total study sample of n = 2844. Because their SDQ screening result was missing, another n = 19 subjects had to be excluded from the total sample to form the screening sample for the prevalence analysis (SDQ screening), which finally included n = 2816 children. For the analysis of quality of life (KINDL at T0), n = 1104 children were assessed, as the KINDL T0 result was missing for a further n = 1721 subjects. For the analyses of program effectiveness, a separate sample was composed based on the group assignment following the SDQ screening, pediatricians’ appraisal and initial interview. For this, the total sample was extended by n = 120 children entering the project via other access routes. A further n = 13 subjects had to be excluded because group assignment was not possible, resulting in an assigned group sample of n = 2951. Of these, n = 1932 were evaluated as normal with no recommendation for prevention participation (Normal). A total of n = 934 children were recommended to participate in one of the two indicated prevention programs offered by the study team, of whom n = 337 participated (Training): n = 192 in the Baghira training (Baghira) and n = 145 in the Tiger training (Tiger). Families of n = 597 children declined training participation despite the recommendation (NoTraining), of whom n = 330 received a Baghira training recommendation (NoBaghira), n = 207 received a Tiger training recommendation (NoTiger) and n = 60 received a recommendation for both Baghira and Tiger training.
The latter were categorized based on their screening SDQ score as NoTiger (n = 39) if their emotional problems score was higher than their conduct problems score, and as NoBaghira (n = 21) if their conduct problems score was higher than or equal to their emotional problems score. This mirrors the training decision in the initial interview, in which children took part in the training appropriate for their main problem and the Baghira training was chosen more frequently in cases of uncertainty. This resulted in a total of n = 351 in the NoBaghira and n = 246 in the NoTiger group. Finally, n = 85 children were evaluated as having abnormal or clinically significant disruptive behavior or emotional problems or did not fulfill the inclusion criteria for participation (Abnormal). As many families dropped out of the study after screening, children were excluded from the analyses if the respective questionnaire data for at least one sub-scale were missing at all measurement time points. Due to the study design, in which only the SDQ was filled in during the screening and all other questionnaires were first administered at T1, the number of participants with data missing at all time points differed strongly between the SDQ (n = 6) and the KINDL (n = 1504). Therefore, separate analyses were run for the SDQ data (n = 2945 with n = 1928 Normal, n = 337 Training (n = 192 Baghira, n = 145 Tiger), n = 595 NoTraining (n = 350 NoBaghira, n = 245 NoTiger), n = 85 Abnormal) and the KINDL data (n = 1441 with n = 907 Normal, n = 334 Training (n = 191 Baghira, n = 143 Tiger), n = 146 NoTraining (n = 83 NoBaghira, n = 63 NoTiger), n = 54 Abnormal). Because the screenings in the PROMPt project were tied to the regular health check-ups at the pediatricians, the same child could enter the project twice as distinct cases in the data set.
In this study’s analysis samples, this occurred for four children: two children were categorized once as Training and once as NoTraining, and two children were categorized twice as Normal. All four of them were included in the SDQ assigned group sample with both data entries. In the KINDL assigned group sample, only one of the children categorized once as Training and once as NoTraining was included with both entries; the other three were included with only one entry each.
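The sample flow for the assigned group and SDQ analysis samples can be retraced with a few lines of arithmetic. The sketch below only recombines counts reported in the text; the variable names are illustrative.

```python
# Counts as reported in the sample description.
invited, screened, no_consent = 3739, 3231, 387
other_routes, unassignable = 120, 13

# Response rate and total study sample.
response_rate = round(100 * screened / invited, 1)
total_sample = screened - no_consent

# Assigned group sample: total sample plus other access routes,
# minus children for whom no group assignment was possible.
assigned = total_sample + other_routes - unassignable

groups = {"Normal": 1932, "Training": 337, "NoTraining": 597, "Abnormal": 85}
training = {"Baghira": 192, "Tiger": 145}

# Children recommended for both trainings (n = 60) were split by
# their dominant SDQ problem score: 21 to NoBaghira, 39 to NoTiger.
no_training = {"NoBaghira": 330 + 21, "NoTiger": 207 + 39}

# SDQ analysis sample: six children lacked SDQ data at all time points.
sdq_groups = {"Normal": 1928, "Training": 337, "NoTraining": 595, "Abnormal": 85}
sdq_sample = assigned - 6

print(response_rate, total_sample, assigned, sdq_sample)
```

The group counts add up to the assigned group sample, and the SDQ group counts add up to the SDQ analysis sample.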
Fig. 1 Flowchart of the analysis samples. n = number of participants; SDQ = Strengths and Difficulties Questionnaire; KINDL = Kiddy-KINDL-R and Kid-/Kiddo-KINDL-R Quality of Life Questionnaire for Children; Normal = children evaluated as normal with no recommendation for indicated prevention participation; Training = children who participated in an indicated prevention program after recommendation; NoTraining = children who did not participate in an indicated prevention program despite recommendation; Abnormal = children with abnormal or clinically significant disruptive behavior or emotional problems or who did not fulfill inclusion criteria for participation in an indicated prevention program; Baghira = children who participated in the Baghira training; Tiger = children who participated in the Tiger training; NoBaghira = children who did not participate in the Baghira training despite a recommendation, including children with a recommendation for both trainings and an SDQ conduct problems score higher than or equal to their emotional problems score; NoTiger = children who did not participate in the Tiger training despite a recommendation, including children with a recommendation for both trainings and an SDQ emotional problems score higher than their conduct problems score
Prevention programs
The indicated prevention programs Baghira training and Tiger training were conducted at the TUD by certified trainers (psychologists or master’s students in psychology).
Baghira training
The Baghira training (adapted from Aebi et al. [1]) consisted of nine weekly 90 min group sessions for learning strategies for anger control and resolving conflicts appropriately. The sessions covered emotion and self-awareness, dealing with anger and aggression, impulse control, conflict- and problem-solving, empathy, change of perspective and giving feedback. After introducing the topic of the session, the trainer helped the children practice the desired behavior in role plays. Further, the training included a reward program for encouraging the use of appropriate behavior. Additionally, children were to practice the learned strategies during the week. Adaptations to the original Baghira training comprised slightly modified material to fit the younger target group (5–10 year olds instead of 8–13 year olds), a reduction from 120 min to 90 min per session, the inclusion of a short break and an added 90 min information evening for parents. The latter was conducted concurrently with one of the training sessions to give parents information and advice on how to react appropriately in everyday situations and how to help their child handle anger and frustration.
Tiger training
The Tiger training [2] consisted of two one-on-one appointments with the child followed by nine weekly group sessions of 60 min each. In the former, the child got acquainted with the trainer and the shy hand puppet Til Tiger, who learned to become brave together with the child during the training. In the latter, the trainer and Til Tiger introduced different topics, e.g. doing something in front of others, rejecting something, making a justified demand, defending themselves against teasing without violence, which were then practiced in role plays. Furthermore, children were encouraged to practice learned strategies during the week. Additionally, a short version of progressive muscle relaxation was applied at the end of each session.
Data analysis
Statistical analyses were carried out using the software Stata version 17.0 [44]. Descriptive statistics (number of participants, n; percent, %; mean, M; standard deviation, SD) were calculated for the SDQ screening results and the KINDL T0 data to describe the current status of disruptive behavior and emotional problems as well as quality of life among children from the general population using the pediatrician screening sample. To assess the association between disruptive behavior and emotional problems and quality of life, linear regression models were calculated for the KINDL score and the categorized SDQ result, adjusted for child’s sex and age. For both the descriptive statistics of the SDQ and KINDL and the regression models, subjects who entered the project via access routes other than the screening at the pediatrician were excluded, as they contacted the study team themselves with a need for prevention and might therefore bias the results. However, the results including these subjects can be found in the supplementary material. Further descriptive statistics were provided regarding socio-demographic and clinical characteristics of the analysis samples, separately for the assigned groups Normal, Training, NoTraining and Abnormal (based on the recommendation by the pediatrician after the screening with the SDQ and, where applicable, the initial interview) and the training groups Baghira, Tiger, NoBaghira and NoTiger. Baseline characteristics were compared using Chi-squared tests for categorical variables and t-tests or Kruskal-Wallis tests for metric variables. Post hoc Bonferroni-corrected Dunn’s tests of pairwise comparisons were carried out following significant Kruskal-Wallis tests.
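To illustrate the Bonferroni correction in the post hoc step, the following sketch enumerates the pairwise comparisons among the four assigned groups and adjusts a set of p-values by the number of comparisons. The raw p-values are hypothetical placeholders, not study results, and Dunn's test itself is not computed here.

```python
from itertools import combinations

groups = ["Normal", "Training", "NoTraining", "Abnormal"]
pairs = list(combinations(groups, 2))  # all pairwise comparisons


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values, one per pair, for illustration only.
raw_p = [0.001, 0.004, 0.020, 0.030, 0.200, 0.600]
adjusted = bonferroni(raw_p)

alpha = 0.05
significant = [pair for pair, p in zip(pairs, adjusted) if p < alpha]

for pair, p in zip(pairs, adjusted):
    print(pair, round(p, 3))
```

Four groups yield six pairwise comparisons, so each raw p-value is multiplied by six before it is compared with the alpha level of 0.05.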
To assess the effectiveness of the indicated prevention programs, an intention-to-treat approach was applied. Linear mixed effect model regressions combined with the robust Huber/White/sandwich estimator were calculated for SDQ conduct and emotional problem scores and KINDL sub-scores, focusing on the interactions measurement time point × assigned group in the total sample (pediatrician sample and other access routes). Two models were calculated, using T0 and Normal (model 1) or Training (model 2) as references. Subjects were included as random effects, and children’s age and sex were taken into account as covariates. To disentangle possible effects by training group, equivalent models for the interactions measurement time point × training group were calculated. For better comparability of the outcomes and interpretability of beta coefficients as effect sizes according to Cohen’s specification (≥ 0.20 small, ≥ 0.50 medium, ≥ 0.80 large effect; Cohen [11]), questionnaire scores were standardized by the pooled standard deviation by assigned group or training group at Screening (SDQ) or T0 (KINDL). The alpha level for all analyses was set a priori at 0.05. No correction for multiple testing regarding program effectiveness was applied. We decided to accept a 5% Type 1 error rate for each single test in exchange for a lower Type 2 error rate, thereby favoring sensitivity over robustness of findings in our observational study, in accordance with [39].
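The standardization step can be sketched as follows. The text does not give the formula, so the sketch assumes the common degrees-of-freedom-weighted pooled standard deviation across groups; the label "negligible" for values below 0.20 and the example group sizes and standard deviations are illustrative additions, not taken from the study.

```python
import math


def pooled_sd(groups):
    """Pooled SD from (n, sd) pairs, one pair per group.

    Assumed formula: sqrt(sum((n_i - 1) * sd_i^2) / sum(n_i - 1)).
    """
    numerator = sum((n - 1) * sd ** 2 for n, sd in groups)
    denominator = sum(n - 1 for n, _ in groups)
    return math.sqrt(numerator / denominator)


def standardize(score, sd_pooled):
    """Express a raw questionnaire score in pooled-SD units."""
    return score / sd_pooled


def cohen_label(beta):
    """Classify a standardized coefficient by Cohen's thresholds."""
    size = abs(beta)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "negligible"  # label chosen here; not part of the text


# Hypothetical baseline group sizes and SDs, for illustration only.
baseline = [(100, 2.0), (50, 3.0)]
sd_p = pooled_sd(baseline)
print(round(sd_p, 3), cohen_label(-0.35))
```

After this transformation, a beta coefficient of 0.50 corresponds to a difference of half a pooled baseline standard deviation, which is what allows it to be read against Cohen's thresholds.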
Description and handling of missing data
Mixed effect model regressions are robust against missing data [47]. However, subjects with missing baseline values are omitted from calculation. In the current study, this would further reduce the number of subjects in the KINDL analyses due to missing data at T0 despite available data at T1 or T2. An overview of available data separately for the total sample and the analysis samples by assigned group and training group (Tables S1, S2, S3 and S4) as well as a comparison between children excluded from vs. included in the KINDL analysis sample (Table S5) can be found in the supplementary material. As the T0 and T1 questionnaire assessments in the Training group took place at the TUD in combination with the training, dropouts were rare in this group. In contrast, families in the NoTraining and Abnormal groups decided against training or sought help elsewhere, which often resulted in little interest in the further assessments and higher dropout rates. Thus, we assumed that the missing pattern depended on the assigned group but not on the KINDL baseline level, i.e. that data were not missing completely at random (MCAR) but rather missing at random (MAR). Little’s Chi-squared test for the assumption of MCAR was significant (Chi2 distance (919) = 1148.47, p < .001), and Little’s Chi-squared test for the assumption of covariate-dependent missingness with overall group (assigned group and training group) as auxiliary variable was not significant (Chi2 distance (5514) = 2375.46, p = 1.00). These results are consistent with the missing pattern being MAR rather than MCAR. Thus, we performed multiple imputation prior to the mixed model analyses. A comparison of sample characteristics between completers (data available at all three measurement time points) and non-completers (data available at one or two but not all three measurement time points) can be found in the supplementary material in Table S6 (SDQ analysis sample) and Table S7 (KINDL analysis sample).
The multiple imputation model included SDQ emotional problems and conduct problems scores and all KINDL sub-scores at all measurement time points, child’s sex and age, as well as overall group as an auxiliary variable. The seed was randomly set to 7492. A priori, we selected 15 imputations and conducted sensitivity analyses with up to 50 imputations. As the results did not differ substantially, the number of 15 imputations was retained for reasons of efficiency. Further, we also ran the analyses without prior multiple imputation; these results can be found in the supplementary material.
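The text does not state how estimates from the imputed data sets were combined. A common approach, assumed here purely for illustration, is pooling with Rubin's rules: the pooled estimate is the mean of the per-imputation estimates, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The estimates and standard errors in the example are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Combine per-imputation results with Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])   # average sampling variance
    between = variance(estimates)                   # variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical coefficients from m = 3 imputed data sets; the study
# used 15 imputations.
est, se = pool_rubin([0.28, 0.31, 0.34], [0.10, 0.11, 0.10])
print(round(est, 3), round(se, 3))
```

Because the between-imputation term shrinks only slowly beyond a moderate number of imputations, pooled results from 15 and 50 imputations would be expected to be similar, which matches the sensitivity analysis described above.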