To address this research gap, our study leveraged the extensive dataset of the Adolescent Brain Cognitive Development (ABCD) study, which includes a shortened version of SRS (short SRS, SSRS) and comprehensive phenotypic data from over 10,000 children across the United States. We primarily investigated correlates of SSRS among typically developing children. Meanwhile, we utilized a subset diagnosed with ASD as positive controls, since numerous studies and ongoing projects have referred to the SRS as indicative of 'autistic-like traits', a dimension present across the general population and potentially sharing etiology with ASD [11]. Given this presumption, we formulated four primary hypotheses. First, children with higher SSRS scores are likely to exhibit demographic profiles similar to those with ASD, such as a higher male-to-female ratio. Second, the traits may show a high degree of heritability and a correlation with the polygenic risk score (PRS) for ASD, an acknowledged genetic factor that explains 40-76% of the etiology [11,12,13]. Third, the traits may share similar neural correlates with ASD, such as atypical development of the frontal and temporal lobes, reduced gray and white matter volumes, and functional or structural alterations in the default mode network (DMN) [14,15,16,17]. Lastly, children with higher SSRS scores are anticipated to display a spectrum of physical and mental health comorbidities commonly associated with ASD. Through systematic examination of these hypotheses, our study aims to provide a comprehensive characterization of the SRS phenotype in children.
The ABCD Study enrolled 11,876 children aged 9 to 11 years through school systems at 21 research sites across the United States between 2016 and 2018 [18]. The catchment areas of these sites encompass over one-fifth of the population within this age range in the United States [18, 19]. This study used ABCD Data Release 4.0. We excluded children with a missing or invalid SSRS total score (N = 1906), as the SSRS was our primary measure of interest. All participants and their guardians provided informed consent [18].
This study examined the relationship between SSRS scores and various characteristics known to be associated with ASD. These characteristics include the proportion of males [20], PRS for ASD [21], structural brain alterations [22], white matter integrity [7], brain connectivity [23], cognitive function [24], behavioral problems [25], sleep disturbances [26], and psychotic-like symptoms [27].
By design, the ABCD study excluded children diagnosed with "moderate or severe" ASD, intellectual disability, or major neurological conditions [18]. During recruitment, trained interviewers conducted comprehensive face-to-face assessments with the children and their guardians [28]. Diagnosis of ASD was based on the parent-reported response to the question, "Has your child been diagnosed with autism spectrum disorder?" Additionally, parents reported whether their children with ASD were attending regular classes at school, which might reflect disease severity.
The SSRS is an 11-item parent-reported instrument (each item scored from 0 to 3, giving a total score from 0 to 33) derived from the 65-item Social Responsiveness Scale (SRS). SSRS data were collected at the one-year follow-up assessment. The SRS is a validated screening tool for ASD and could reflect disease severity [10, 12, 29, 30]. A higher score indicated more pronounced autistic-like traits. To address the right-skewed distribution of SSRS scores and to explore potential non-linear associations with outcomes, non-ASD children were categorized into four groups based on the 33rd, 66th, and 95th percentiles of their SSRS scores, referred to as Q1, Q2, Q3, and the top 5% (Figure S1). Children with ASD were classified as a distinct group.
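As an illustration of the grouping step, the following Python sketch assigns scores to Q1, Q2, Q3, and the top 5% using cut-points at the 33rd, 66th, and 95th percentiles. The nearest-rank percentile definition, the handling of scores that fall exactly on a cut-point (assigned to the lower group), and the function names are assumptions made for this example; they are not taken from the study's analysis code.

```python
import math


def nearest_rank_percentile(sorted_values, pct):
    """Return the nearest-rank percentile (pct in 0-100) of a sorted list."""
    n = len(sorted_values)
    rank = max(1, math.ceil(pct * n / 100))
    return sorted_values[rank - 1]


def assign_ssrs_groups(scores):
    """Map each score to 'Q1', 'Q2', 'Q3', or 'Top 5%'.

    Assumption: a score equal to a cut-point goes to the lower group.
    """
    ordered = sorted(scores)
    p33 = nearest_rank_percentile(ordered, 33)
    p66 = nearest_rank_percentile(ordered, 66)
    p95 = nearest_rank_percentile(ordered, 95)
    groups = []
    for score in scores:
        if score <= p33:
            groups.append("Q1")
        elif score <= p66:
            groups.append("Q2")
        elif score <= p95:
            groups.append("Q3")
        else:
            groups.append("Top 5%")
    return groups


# Toy data only: 100 distinct values, not real SSRS scores.
toy_scores = list(range(100))
toy_groups = assign_ssrs_groups(toy_scores)
group_sizes = {g: toy_groups.count(g) for g in ("Q1", "Q2", "Q3", "Top 5%")}
```

With heavily tied integer scores such as the SSRS, the realized group sizes will generally deviate from the nominal 33/33/29/5 split, because all children sharing a score must fall into the same group.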
Genotype data from the ABCD Study were obtained from saliva or blood specimens using the Affymetrix NIDA SmokeScreen Array [31]. A subset of 5807 individuals of European descent was selected based on genetic ancestry. Quality control and imputation were conducted using PLINK v1.90 [32], the Michigan Imputation Server [33], and Eagle v2.4, resulting in 4673 samples for analysis. To construct the PRS for ASD, we used summary data from the iPSYCH-2017 dataset, which includes 18,382 ASD cases and 27,969 controls of European ancestry [34], applying a continuous shrinkage method with a global shrinkage parameter of 0.01 [35]. The first ten ancestry principal components were calculated and used as covariates in the PRS-related analyses.
The ABCD teams applied standardized preprocessing pipelines to the structural, diffusion, and resting-state functional magnetic resonance imaging (MRI) data [36], including manual quality control, reconstruction, and subcortical segmentation using FreeSurfer v5.3 (http://surfer.nmr.mgh.harvard.edu/) [37]. Quality control followed the recommended image inclusion criteria of ABCD Data Release 4.0 (see Supplement).
For structural MRI (sMRI), high-resolution T1- and T2-weighted structural MR images (1 mm isotropic, prospective motion correction) were collected and processed [36]. The regions of interest included 68 cortical thicknesses, 68 cortical areas, 68 cortical volumes, and 40 subcortical volumes defined by the Destrieux atlas [38, 39].
For diffusion tensor imaging (DTI), high angular resolution diffusion imaging data were acquired at a resolution of 1.7 mm isotropic using multiband acquisition (factor = 3), comprising 96 diffusion directions and four distinct b-values [36, 37]. The ABCD team employed AtlasTrack, an automated segmentation method based on a probabilistic atlas, to label the major white matter tracts [40]. We examined tract-average mean diffusivity (MD) and fractional anisotropy (FA) of 35 major white matter tracts.
Resting-state functional MRI (fMRI) scans were acquired using high-resolution imaging with a multi-band technique [37]. The standardized preprocessing procedures included registration, distortion correction, and normalization [37]. Subsequently, connectivity measures were extracted within and between networks from the parcellated cortical ribbon, as delineated by the Gordon atlas [41]. The fMRI data obtained on Philips scanners were excluded due to processing problems, in accordance with previous studies [42]. We specifically examined 36 network-level resting-state functional connectivity (RSFC) averages, encompassing both intra- and inter-network connectivity of task-control and related circuits that have been reported to be associated with ASD and social function [43,44,45]: the cingulo-opercular (CO), cingulo-parietal (CP), dorsal attention (DAN), fronto-parietal (FP), salience (SN), ventral attention (VAN), default mode (DMN), and auditory (AN) networks.
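The figure of 36 RSFC averages matches the number of unordered within- and between-network pairs that can be formed from the eight listed networks (8 within-network plus 28 between-network). That the 36 measures correspond to exactly this set of pairs is our reading of the text, not something stated explicitly; the short Python sketch below only verifies the count.

```python
from itertools import combinations_with_replacement

# Network abbreviations as listed in the text.
networks = ["CO", "CP", "DAN", "FP", "SN", "VAN", "DMN", "AN"]

# Unordered pairs, including each network paired with itself.
network_pairs = list(combinations_with_replacement(networks, 2))
within_pairs = [pair for pair in network_pairs if pair[0] == pair[1]]
between_pairs = [pair for pair in network_pairs if pair[0] != pair[1]]

print(len(within_pairs), len(between_pairs), len(network_pairs))
```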
The NIH Toolbox was used to assess neurocognitive performance [46]. The NIH Toolbox Cognitive Function Battery comprises seven tasks that assess different dimensions of cognitive ability. Cognitive performance was quantified using age-corrected standard scores (mean = 100, SD = 15), which reflect a comprehensive measure of intelligence quotient (IQ). These scores were further analyzed to distinguish between crystallized and fluid intelligence components [47].
Behavioral problems were evaluated using the parent/guardian-reported Child Behavior Checklist (CBCL) [48], a comprehensive tool that assesses internalizing and externalizing symptoms across 113 items. The CBCL generates three composite scores: internalizing, externalizing, and total problems [49], with higher scores indicating greater symptom severity. Additionally, exploratory analyses were conducted on the CBCL-derived syndrome scales: anxious/depressed, withdrawn/depressed, somatic complaints, social problems, thought problems, attention problems, rule-breaking behavior, and aggressive behavior.
Sleep problems were evaluated using the Sleep Disturbance Scale for Children (SDSC) [50], reported by parents or guardians based on observations over the previous six months. The SDSC is a 26-item inventory rated on a 5-point Likert-type scale, yielding six subscales: disorders of initiating and maintaining sleep, sleep breathing disorders, disorders of arousal, sleep-wake transition disorders, disorders of excessive somnolence, and sleep hyperhidrosis. The overall sum score, which ranges from 26 to 130, indicates the severity of sleep problems, with higher scores reflecting poorer sleep quality.
Psychotic-like symptoms were measured using the Prodromal Questionnaire - Brief Child version (PQ-BC) [51], which includes items querying the occurrence and distress level of psychotic experiences, rated on a 5-point Likert scale. The total PQ-BC score ranges from 0 to 105, with scores ≥ 2 standard deviations above the mean considered indicative of significant psychotic-like symptoms.
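The dichotomization rule can be sketched in Python as follows. The use of the sample standard deviation (n - 1 denominator), the computation of the mean and SD on the analytic sample itself, and the toy scores are assumptions for illustration; the text does not specify these details.

```python
from statistics import mean, stdev


def flag_high_scores(scores, n_sd=2.0):
    """Return (threshold, flags); a flag is True when score >= mean + n_sd * SD.

    Assumption: SD is the sample standard deviation of the supplied scores.
    """
    threshold = mean(scores) + n_sd * stdev(scores)
    return threshold, [score >= threshold for score in scores]


# Toy PQ-BC-like totals, invented for this example.
toy_pqbc = [0, 1, 2, 0, 3, 1, 0, 2, 1, 40]
pqbc_threshold, pqbc_flags = flag_high_scores(toy_pqbc)
```

In this toy sample only the single extreme value exceeds the threshold, which mirrors how a right-skewed score yields a small flagged group.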
We considered the following covariates as confounders: age, sex (male or female), race/ethnicity (White, Black, Hispanic, Asian, or other), and annual family income (<$35,000, $35,000 to $75,000, $75,000 to $100,000, or ≥$100,000) [52]. Body mass index (BMI; kg/m², continuous) and pubertal score were also described, given their potential association with ASD [53, 54]. The pubertal score was the average of the self- and parent-reported scores on a scale from 1 (pre-puberty) to 5 (post-puberty) [55].
To confirm the SSRS as a valid measure of autistic-like traits, we assessed how well the SSRS predicted ASD using the receiver operating characteristic (ROC) curve and reported the area under the curve (AUC). The AUC ranges from 0 to 1, where 0.5 corresponds to chance-level discrimination and a value close to 1 indicates that the SSRS is highly predictive of ASD.
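For intuition, the AUC equals the probability that a randomly chosen child with ASD has a higher score than a randomly chosen child without ASD, with ties counted as one half. The Python sketch below computes the AUC through this pairwise definition; the function name and the toy data are hypothetical, and this is not the routine used in the study.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positives (label 1) and negatives (label 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both classes must be present to compute the AUC.")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties contribute one half
    return wins / (len(positives) * len(negatives))


# Toy scores and diagnoses, invented for this example.
toy_ssrs = [2, 5, 7, 12, 20, 25]
toy_asd = [0, 0, 0, 1, 0, 1]
toy_auc = roc_auc(toy_ssrs, toy_asd)
```

The pairwise loop is quadratic in sample size, so for a cohort of this scale a rank-based implementation would be preferable; the version above favors readability.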
To investigate the association between SSRS and demographic characteristics, we applied linear, logistic, or multinomial regression, as appropriate, and tested for trend across increasing SSRS groups, with each group weighted by its median SSRS score. The trend test was performed only among children without an ASD diagnosis. We also compared the demographic characteristics of children with ASD with those of children in the lowest SSRS tertile (Q1), using the t-test or chi-squared test, as appropriate.
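One common way to implement a median-weighted trend test is to replace each child's group label with the median score of that group and enter the resulting variable as a single continuous predictor. The Python sketch below illustrates this scoring step and an unadjusted least-squares slope; the function names and toy data are hypothetical, and covariate adjustment and the significance test itself are deliberately omitted.

```python
from statistics import median


def median_trend_scores(ssrs_scores, groups):
    """Replace each group label with the median SSRS score of that group."""
    by_group = {}
    for score, group in zip(ssrs_scores, groups):
        by_group.setdefault(group, []).append(score)
    group_median = {g: median(values) for g, values in by_group.items()}
    return [group_median[g] for g in groups], group_median


def ols_slope(x, y):
    """Unadjusted least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Toy data, invented for this example.
toy_scores = [0, 1, 2, 3, 4, 5, 9, 12]
toy_labels = ["Q1", "Q1", "Q2", "Q2", "Q3", "Q3", "Top 5%", "Top 5%"]
toy_outcome = [50, 51, 52, 53, 55, 56, 60, 62]

trend_variable, medians_by_group = median_trend_scores(toy_scores, toy_labels)
trend_slope = ols_slope(trend_variable, toy_outcome)
```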
The ABCD study includes a substantial number of twins and siblings, permitting a family-based design to estimate heritability. The ABCD team identified monozygotic twins, dizygotic twins, and non-twin siblings based on genetic relatedness data [56]. To identify full siblings among non-twin siblings, we limited the sample to children for whom both recorded guardians were their biological parents. In this subsample of twins and full siblings, we calculated Spearman's rank correlation coefficients for the SSRS scores within each kinship group (monozygotic twins, dizygotic twins, and full siblings). We applied the ACE model to decompose the variance in SSRS into proportions explained by additive genetics (A), common environment (C), and unique environment (E), using structural equation modeling [57]. The ACE model assumed that the genetic relatedness was 1 between monozygotic twins and 0.5 between dizygotic twins and between full siblings [25]. The correlation of the shared environmental effect was assumed to be 1 within each pair of twins or full siblings. The unique environment component includes nonshared environmental factors as well as measurement error. In an additional analysis, we excluded full siblings for comparison. SSRS scores were log-transformed to meet normality assumptions and residualized for age, sex, and study site prior to the ACE analyses [25]. Goodness-of-fit was assessed and compared using the log-likelihood, chi-square, Comparative Fit Index, Tucker-Lewis Index, and Root Mean Square Error of Approximation. The model was fitted with the R package lavaan (version 0.6-18) [58].
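The assumptions above imply a specific covariance structure for each pair type: the total variance is A + C + E, and the within-pair covariance is r × A + C, where r is the assumed genetic relatedness (1 for monozygotic twins, 0.5 for dizygotic twins and full siblings). The Python sketch below writes out this model-implied structure; it is not the lavaan model specification used in the study, and the variance components in the example are made-up values chosen only to show the arithmetic.

```python
def ace_implied_covariance(a_var, c_var, e_var, genetic_relatedness):
    """2x2 model-implied covariance matrix for one pair of relatives.

    Diagonal: total variance A + C + E.
    Off-diagonal: genetic_relatedness * A + C (shared environment correlated 1).
    """
    total = a_var + c_var + e_var
    cross = genetic_relatedness * a_var + c_var
    return [[total, cross], [cross, total]]


def ace_implied_correlation(a_var, c_var, e_var, genetic_relatedness):
    """Model-implied within-pair correlation."""
    cov = ace_implied_covariance(a_var, c_var, e_var, genetic_relatedness)
    return cov[0][1] / cov[0][0]


# Hypothetical standardized variance components (not study estimates).
A, C, E = 0.5, 0.2, 0.3
mz_correlation = ace_implied_correlation(A, C, E, genetic_relatedness=1.0)
dz_correlation = ace_implied_correlation(A, C, E, genetic_relatedness=0.5)
```

Under these hypothetical values the implied correlations are 0.70 for monozygotic pairs and 0.45 for dizygotic or full-sibling pairs; fitting the model amounts to finding the components whose implied matrices best match the observed ones.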
We applied generalized linear mixed-effects models to estimate the associations between SSRS group and outcomes, accounting for the nested structure of the ABCD data (family nested within site, or family nested within MRI scanner for the neuroimaging analyses) [59]. Specifically, linear regression was applied to normally distributed outcomes, including the PRS for ASD, neuroimaging measures, cognitive functions, and sleep problems. Poisson regression was applied to CBCL scores, which were right-skewed and passed the test for over-dispersion. Logistic regression was applied to the dichotomized outcome, psychotic-like symptoms. The association with PRS was adjusted for the first ten principal components of the ABCD genotyping data, in both the regression models and the correlation analyses. The associations with neuroimaging measures and mental health problems were adjusted for sex, age, race, and family income [59]. In the structural MRI analyses, we additionally adjusted for child height, T1 image signal-to-noise ratio, and intracranial volume. In the DTI analyses, we additionally adjusted for mean frame-wise displacement. In the RSFC analyses, we additionally adjusted for the number of frames retained after processing. Because effect sizes for neuroimaging measures are generally small and relatively few children remained in the top 5% group after quality control, analyses within this group were likely to be underpowered. We therefore examined alterations in both the Q3 and the top 5% groups relative to the Q1 group. For significant findings in the Q3 group, we then confirmed the direction of the estimates in the top 5% group. Considering the potentially different etiology of autistic traits in males and females [8, 20], we tested the SSRS × sex interaction term and stratified the analyses by sex when the interaction was significant.
A trend test was also performed to assess the trend in risk estimates across SSRS groups.
Previous reports showed correlations between the SRS and other psychiatric conditions [51]. In a sensitivity analysis, we excluded non-ASD children who had parent-reported diagnoses of other psychiatric conditions, including attention deficit hyperactivity disorder, depression, bipolar disorder, anxiety, phobias, schizophrenia, alcohol or substance use disorder, and other psychological or psychiatric diagnoses. We performed all analyses using R version 4.2.1. Mixed models were fitted with the lme4 package (version 1.1-31) [60]. Missing values were imputed by chained equations with the mice package (version 3.15.0). In the neuroimaging analyses, false discovery rate (FDR) correction was applied to address multiple comparisons.
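The text does not name the specific FDR procedure. Assuming the widely used Benjamini-Hochberg method, the adjustment can be sketched in Python as follows; the function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values, invented for this example.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
```

In this toy example only the smallest p-value remains below 0.05 after adjustment, even though three of the four raw p-values were below that level.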