## Original Article

Exp Neurobiol 2020; 29(1): 27-37

Published online February 7, 2020

https://doi.org/10.5607/en.2020.29.1.27

© The Korean Society for Brain and Neural Sciences

## Attentional Connectivity-based Prediction of Autism Using Heterogeneous rs-fMRI Data from CC200 Atlas

Yaya Liu1, Lingyu Xu1*, Jun Li2*, Jie Yu1 and Xuan Yu1

1School of Computer Engineering and Science, Shanghai University, Shanghai 200072, 2South China Academy of Advanced Optoelectronics, South China Normal University, Guangzhou 510006, China

Correspondence to: *To whom correspondence should be addressed.
Lingyu Xu, TEL: 86-132-6266-9978, FAX: 86-21-6613-5273
e-mail: frht_sh@163.com
Jun Li, TEL: 86-180-1170-2622, FAX: 86-20-8521-3484
e-mail: jun.li@coer-scnu.org

Received: November 9, 2019; Revised: January 21, 2020; Accepted: January 28, 2020

(http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and
reproduction in any medium, provided the original work is properly cited.

### Abstract

Autism spectrum disorder (ASD) is a developmental syndrome characterized by marked deficits in sociality and communication. Accurately distinguishing individuals with ASD from typical controls (TC) is therefore of crucial significance. Previous imaging studies on ASD/TC identification have made remarkable progress in exploring objective and crucial biomarkers associated with ASD. However, most of them investigated only small, homogeneous datasets. We therefore attempted to uncover replicable and robust neural patterns of autism using a heterogeneous multi-site brain imaging dataset from ABIDE (Autism Brain Imaging Data Exchange). Experiments were carried out with an attention mechanism based on the Extra-Trees algorithm, applied to brain connectivity measured from resting-state functional magnetic resonance imaging (fMRI) data with the CC200 atlas. Under cross-validation, the proposed method achieved a mean classification accuracy of 72.2% (sensitivity=68.6%, specificity=75.4%). Compared with the most competitive reported effort, it raised the accuracy of ASD prediction by about 2% and the specificity by 3.2%. Connectivity analysis of the optimal model highlighted informative regions strongly involved in social cognition and interaction, and showed lower correlation between the anterior and posterior default mode network (DMN) in autistic individuals than in controls. These observations are concordant with previous studies, which supports the use of the proposed method to identify individuals at risk of ASD.

Keywords: Autism spectrum disorder (ASD), Classification, Autism Brain Imaging Data Exchange (ABIDE), Functional magnetic resonance imaging (fMRI), Default mode network (DMN)

### INTRODUCTION

Autism spectrum disorder (ASD) is a common neurodevelopmental disorder characterized by a range of phenotypes, which vary in the severity of sensorimotor and social deficits [1–3]. Investigating the potential biomarkers associated with ASD is a meaningful issue in the field of psychiatric neuroimaging. It holds promise for a better understanding of the underlying causes of ASD, which would have far-reaching implications for diagnostic aids as well as targeted treatment in autism.

Recent efforts to explore ASD biomarkers [4–9] have focused on connectivity analysis of regions of interest (ROIs) using resting-state functional magnetic resonance imaging (rs-fMRI). For example, Kana et al. [4] demonstrated weaker functional connectivity of the anterior cingulate cortex with the middle temporal gyrus, which might lead to poor semantic and language processing in ASD. Monk et al. [5] found that severe restricted and repetitive behaviors of autistic individuals were correlated with stronger connectivity between the posterior cingulate cortex and the right parahippocampal gyrus. Moreover, greater local connectivity in the left superior parietal lobule, precuneus and angular gyrus, and the right supramarginal gyrus was reported to be associated with deficient cognitive function in ASD [6]. In addition, a growing number of studies [7–9] have indicated atypical connectivity in the default mode network (DMN) in ASD, which might negatively affect self-referential cognitive processes. All of these observations link variations in brain connectivity to the behavioral traits and mental health conditions associated with ASD, which implies that measures of functional connectivity from rs-fMRI are feasible for identifying ASD biomarkers. However, the data in these imaging studies were acquired from small and homogeneous populations, which limits the reproducibility and generalizability of the resulting functional biomarkers.

One possible solution suggested in many recent studies [10–17] is to investigate potential biomarkers on a larger and heterogeneous dataset, such as that provided by ABIDE (Autism Brain Imaging Data Exchange). It is an open-access repository that aggregates neuroimaging data from 539 autistic individuals and 573 typical controls (TC) across 17 international imaging sites [18]. Based on the rs-fMRI data from ABIDE, previous work on connectivity measures has made considerable achievements and reported accurate prediction of ASD. For example, Kong et al. [10] achieved a high accuracy of 90.39% in an experiment on 182 subjects, and Plitt et al. [11] reported 95.19% on 178 subjects. Moreover, Chen et al. [12] classified 252 subjects with an accuracy of 91% and highlighted the prominent role of somatosensory regions in autism. Although these efforts deserve recognition, their reliance on restricted data (<300 subjects) limits their ability to provide more generalized findings. Hence, in this paper, we attempted to perform connectivity analysis on nearly all subjects from ABIDE, with the aim of extracting replicable and robust neural patterns for accurate differentiation between individuals with ASD and TC.

Among the various sets of ROIs, the CC200 atlas defined by Craddock et al. [19] is promising for measuring functional connectivity. It is generated via spatially constrained spectral clustering in the context of resting-state connectivity analyses. Compared with anatomically derived ROI atlases (e.g., Talairach and Tournoux [20], Harvard-Oxford [21], and Eickhoff-Zilles [22]), the CC200 atlas consists of ROIs with anatomic homology that could offer increased interpretability. Several studies have already performed connectivity analysis with the CC200 atlas and ABIDE data [13, 14, 17], but their prediction of ASD is not yet sufficiently accurate. For example, Dvornek et al. [13] achieved an accuracy of 68.5% using 1100 subjects from ABIDE and highlighted functional networks and regions that are known to be implicated in ASD. Afterwards, Heinsfeld et al. [14] obtained a better classification accuracy (70%) on 1035 subjects and demonstrated functional anticorrelation between anterior and posterior areas of the brain. In addition, Eslami et al. [17] obtained a mean accuracy of 70.3%, similar to Heinsfeld et al. [14], with the same number of subjects. To the best of our knowledge, this is the best differentiation result among previously reported studies using almost all ABIDE data.

Against this background, the present study aims to achieve an accurate and cogent differentiation between autism and controls based on connectivity features derived from the CC200 atlas. Specifically, we performed experiments on rs-fMRI data from ABIDE by combining Extra-Trees with a supervised machine learning method. The former concentrates attention on the features most important for the desired identification, and the latter is used to train a model for accurate classification of subjects. In addition, we explore the salient neural patterns that could contribute to the clinical diagnosis of ASD. To facilitate understanding, the results are interpreted in terms of the potential functional networks and regions in the brain. Note that, in this study, the terms “concentration” and “attention” refer only to methodological aspects associated with brain networks, except where noted.

### Image acquisition and preprocessing

To identify effective functional patterns that can aid the diagnosis of ASD, our study was carried out on the heterogeneous rs-fMRI data from ABIDE. We downloaded a dataset of 1102 subjects from the Preprocessed Connectomes Project and then excluded subjects with missing imaging data. In the end, data from 506 individuals with ASD and 548 controls were included in our study. Table 1 summarizes the key demographic information, including the distribution of ASD and TC by sex, age, and the mean Framewise Displacement (FD) quality measure. The average age of subjects was 16.59 (±8.05) and 16.86 (±7.55) years for the ASD and TC groups, respectively. It is worth noting that male subjects considerably outnumber female subjects in the available data: the ASD group consisted of 446 males and 60 females, and the TC group of 453 males and 95 females. There was no significant difference in age, sex, or FD between cases and controls (p>0.05).

The rs-fMRI data utilized in this study were preprocessed with the Configurable Pipeline for the Analysis of Connectomes (C-PAC). The preprocessing procedure covers slice timing correction, motion correction, and voxel intensity normalization. More specifically, nuisance signals were regressed out to remove fluctuations, using 24 motion parameters [23], low-frequency drifts (linear and quadratic trends), and CompCor with 5 components [24]. The functional data were band-pass filtered (0.01–0.1 Hz) and spatially normalized to a template space (MNI152) with the non-linear registration from ANTS [25]. For more details on the preprocessed data, please see http://preprocessed-connectomesproject.org/abide/.

### Construction of individual networks

Since functional connectivity is extraordinarily complex, the brain network plays an increasingly important role in the investigation of brain-based disorders including ASD [26, 27]. Here, we constructed an individual brain network from the ROIs of the CC200 atlas, denoted as Gcc. For each subject, the mean rs-fMRI signal of each ROI was extracted.

For the network Gcc, each ROI is defined as a node, and the Pearson correlation between each pair of ROIs is taken as an edge. The coefficient, ranging from −1 to 1, is an index of the correlation between two areas in the brain. Values close to 1 indicate that the signals are highly correlated, whereas values close to −1 indicate that they are anti-correlated. Finally, each subject is represented by a feature vector of dimension 200×(200−1)/2=19900.
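The construction of the per-subject feature vector can be sketched as follows. This is a minimal illustration using NumPy, not the authors' code: the function name `connectivity_features` and the synthetic input are assumptions, and the only facts taken from the text are the use of Pearson correlation between ROI time series and the 200-ROI layout.

```python
import numpy as np

def connectivity_features(roi_timeseries: np.ndarray) -> np.ndarray:
    """Return one Pearson correlation per ROI pair.

    roi_timeseries: array of shape (n_timepoints, n_rois) holding the
    mean signal of each ROI (hypothetical input layout).
    """
    # np.corrcoef treats rows as variables, so transpose to (n_rois, n_timepoints).
    corr = np.corrcoef(roi_timeseries.T)
    # Keep each ROI pair once, excluding the diagonal (self-correlation).
    rows, cols = np.triu_indices(corr.shape[0], k=1)
    return corr[rows, cols]

# Synthetic example (random numbers, not ABIDE data): 150 time points, 200 ROIs.
rng = np.random.default_rng(0)
features = connectivity_features(rng.standard_normal((150, 200)))
print(features.shape)  # (19900,)
```

Taking only the upper triangle is what yields 200×199/2 values: the correlation matrix is symmetric, so the lower triangle would only repeat the same edges.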

### Attention selection based on Extra-Trees

The resultant feature vector of functional connectivity between ROIs was used to distinguish subjects with ASD from TC. Because of the curse of dimensionality, effective classification of high-dimensional data is a well-known challenge in neuroimaging studies. Common consequences are over-fitting of the learning models and poor generalization to new data [28, 29], usually caused by the presence of uninformative or irrelevant features. To concentrate attention on the features most relevant to the classification task, this paper adopted Extra-Trees to assess feature relevance and thereby reduce dimensionality for improved generalization performance.

Extra-Trees [30] is an effective technique that measures the discriminative power of features using an ensemble of diverse randomized trees. Compared with the random forest method, it selects features in a more generalized way owing to its additional randomization. Given the features $x_i$, $i = 1, \cdots, m$, the attention vector could be expressed as:

$\{ FI(1), \cdots, FI(i), \cdots, FI(m) \}, \quad i = 1, \cdots, m$

And the importance of the i-th feature could be defined as:

$FI(i) = \frac{\sum_{K=1}^{N} FI_K(i)}{N}, \quad i = 1, \cdots, m$

In the above formula, $N$ is the number of randomized trees, and $FI_K(i)$ represents the importance of the i-th feature in the K-th randomized tree. Specifically, the values of FI were computed using Python functions in the sklearn package. Note that the greater the FI value, the more discriminative the feature is likely to be for classification.
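As a rough sketch of this averaging, the snippet below fits scikit-learn's `ExtraTreesClassifier` and takes the mean of the per-tree importances, mirroring the formula for FI(i). The text only states that sklearn was used; the helper names `attention_vector` and `top_k_features`, the number of trees, and the synthetic data are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

def attention_vector(X, y, n_trees=100, seed=0):
    """Average the per-tree importances: FI(i) = sum_K FI_K(i) / N."""
    forest = ExtraTreesClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X, y)
    # One row per randomized tree, one column per feature.
    per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
    return per_tree.mean(axis=0)

def top_k_features(fi, k):
    """Indices of the k features with the largest importance, best first."""
    return np.argsort(fi)[::-1][:k]

# Synthetic data (assumption): only feature 0 carries label information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X = rng.standard_normal((200, 20))
X[:, 0] += 3.0 * y
fi = attention_vector(X, y)
selected = top_k_features(fi, k=5)
print(selected)
```

In the setting described in the text, `X` would hold the 19900 connectivity features per subject and `k` would be the number of retained connections.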

### Experiments: prediction on heterogeneous data

The ABIDE dataset is aggregated across 17 international sites without any prior coordination. As an inevitable result, the measures used to assess the severity of autism vary somewhat across sites, so severity cannot be compared quantitatively between sites. Against that background, the identification of ASD versus TC seems to be more credible on these heterogeneous data, as several imaging studies [10–17] have already demonstrated.

Discrimination between ASD and TC subjects can be modeled as a supervised learning task. We used the resultant feature vector, derived from the individual brain network, to train a classifier for the identification of ASD. In particular, attention is concentrated on the functional connectivity that contributes to accurate prediction. We adopted the linear support vector machine (SVM) model for the sake of interpretability. The model was implemented with the built-in functions of the Classification Learner tools in Matlab.

An overview of the proposed framework for ASD/TC classification is depicted in Fig. 1. First, we constructed the individual brain network for each subject and extracted the connectivity features between ROIs of the CC200 atlas. Next, we used Extra-Trees to compute the attention vector over all features and filtered out the functional connections with negligible contribution to the identification model. Finally, the top 1935 features were selected to differentiate between individuals with ASD and TC via the SVM classifier.

To ensure the reliability of the experimental results, we utilized 10-fold cross-validation to measure the accuracy of the predicted labels. All hyper-parameters of the method were set internally by nested cross-validation. To address site-related variability, we also performed an intra-site prediction based on stratified shuffle split cross-validation. It splits participants into training and testing sets that are as homogeneous as possible, that is, it preserves the ratio of samples for each site and condition. Considering the uneven number of samples across sites, we used 80% of the fMRI data for training and the remaining 20% for testing.
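One way to obtain such a split is to stratify on the combination of site and diagnosis. The sketch below uses scikit-learn's `StratifiedShuffleSplit`; the helper name `site_stratified_split`, the label encoding, and the toy cohort are assumptions for illustration, and the source does not say which implementation was used.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

def site_stratified_split(sites, labels, test_size=0.2, seed=0):
    """Split subject indices while preserving the site-by-condition mix."""
    # Combine site and diagnosis into one stratification key, e.g. "A_1".
    strata = np.array([f"{site}_{label}" for site, label in zip(sites, labels)])
    splitter = StratifiedShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    # Only the stratification key matters here, so a dummy X is passed.
    train_idx, test_idx = next(splitter.split(np.zeros(len(strata)), strata))
    return train_idx, test_idx

# Toy cohort (assumption): two sites, each with 20 ASD (1) and 30 TC (0) subjects.
sites = np.array(["A"] * 50 + ["B"] * 50)
labels = np.array(([1] * 20 + [0] * 30) * 2)
train_idx, test_idx = site_stratified_split(sites, labels)
print(len(train_idx), len(test_idx))
```

Every site-by-condition group needs at least two subjects for this to work, since each group must contribute to both the training and the testing set.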

### Evaluation

Three indicators, namely accuracy, sensitivity, and specificity, were calculated to evaluate the performance of our proposed method on the ASD/TC classification task. They are defined as follows:

$Accuracy = \frac{TN + TP}{TN + FN + TP + FP}$

$Sensitivity = \frac{TP}{FN + TP}$

$Specificity = \frac{TN}{TN + FP}$

Here, TN, FN, TP and FP indicate the numbers of true negative, false negative, true positive and false positive subjects, respectively. To some extent, a larger accuracy indicates better classification performance. Compared with sensitivity, a higher specificity is more in line with clinical expectations [31].
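The three definitions translate directly into code. The following sketch counts the four outcomes from label lists, assuming that 1 encodes ASD (positive) and 0 encodes TC (negative); this encoding and the function name `classification_metrics` are illustrative choices, and the labels are made up.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = ASD, 0 = TC)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tn + tp) / (tn + fn + tp + fp),
        "sensitivity": tp / (fn + tp),
        "specificity": tn / (tn + fp),
    }

# Made-up labels for eight subjects.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```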

### RESULTS

We employed a 10-fold cross-validation strategy to evaluate the performance of our proposed model. To reduce the potential effect of over-fitting, all subjects were randomly divided into 10 equal subsets {S1, S2, . . ., S10}. When subset S1 is taken as the testing set, the remaining subsets {S2, . . ., S10} are pooled and further divided into 10 subsets, one of which serves as the validation set while the others are used to train the rs-fMRI-based classifier.
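A nested scheme of this kind can be sketched with scikit-learn as below. This is not the authors' setup: the text reports a Matlab SVM, whereas the sketch swaps in `SVC(kernel="linear")`, refits the Extra-Trees selection inside every training fold (the source does not state where selection was fitted), and uses an illustrative grid of C values. The function name `nested_cv_accuracy` and the synthetic data are also assumptions.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

def nested_cv_accuracy(X, y, n_keep, n_outer=10, n_inner=10, seed=0):
    """Outer folds estimate accuracy; inner folds choose the SVM penalty C."""
    # Keep the n_keep features ranked highest by Extra-Trees importance.
    selector = SelectFromModel(
        ExtraTreesClassifier(n_estimators=25, random_state=seed),
        threshold=-np.inf,
        max_features=n_keep,
    )
    model = Pipeline([("select", selector), ("svm", SVC(kernel="linear"))])
    inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    # The grid of C values is an illustrative assumption.
    search = GridSearchCV(model, {"svm__C": [0.01, 0.1, 1.0]}, cv=inner)
    # Each outer training set is split again by the inner folds inside `search`.
    return cross_val_score(search, X, y, cv=outer)

# Small synthetic problem (assumption): the first three features are informative.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=120)
X = rng.standard_normal((120, 30))
X[:, :3] += 3.0 * y[:, None]
scores = nested_cv_accuracy(X, y, n_keep=5, n_outer=5, n_inner=3)
print(scores.mean())
```

Placing the selector inside the pipeline means the test fold never influences which features are kept, which guards against optimistic accuracy estimates. The demo call uses fewer folds than the 10-by-10 scheme only to keep the runtime short.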

Based on the selected features, we conducted 10 runs of the cross-validation procedure to determine the classification accuracy, as shown in Fig. 2a. Fitting a function to the accuracy of the repeated experiments showed a relatively stable overall trend. Moreover, the corresponding residuals in Fig. 2b fluctuated only slightly, within the range [−0.2, 0.2]. This supports the feasibility and reliability of the proposed scheme. Overall, the mean accuracy, sensitivity, and specificity were 72.2%, 68.8%, and 75.4%, respectively. To the best of our knowledge, such a result has not previously been reported in the literature.

By reducing the dimensionality of the connectivity features, we achieved better differentiation between individuals with ASD and TC. This implies that the reduced features present more usable patterns, which the learning model can generalize in order to identify ASD. For the intra-site cross-validation, the mean classification accuracy was 67.7% (sensitivity=66.3%, specificity=68.9%), with accuracy in individual folds ranging from 64.0% to 71.2%. A recent study that used intra-site cross-validation to evaluate prediction on most of the ABIDE data reported 66.9% accuracy [32]. The slightly lower classification results might be attributed to the unbalanced distribution of training samples across sites, which affects the generalization of informative patterns.

### Classification performance

To demonstrate the advantage of our proposed method on the ASD/TC classification task, we compared the present experimental results with previous related studies, as illustrated in Table 2. Studies that classified a similar quantity of ABIDE data were considered first. Our proposed method achieved the best overall performance across the three evaluation indicators. Most notably, a recent imaging study that attempted to classify ASD with a deep learning method achieved an accuracy of 70.3% (sensitivity=68.3%, specificity=72.2%) [17]. To the best of our knowledge, this is the best differentiation result among previously reported studies using almost all ABIDE data. By comparison, we improved the classification accuracy by about 2% and the specificity by 3.2%. Although there was a small loss in sensitivity compared with Heinsfeld et al. [14], a higher specificity is more in line with clinical expectations than a higher sensitivity [31].

It is also worth mentioning that we achieved lower accuracy than some reported studies that attempted to identify ASD from less imaging data. In particular, such imaging studies [10–12] achieved high classification accuracies above 80% and even 90%. To assess the realistic clinical prospects of our model, we also calculated two further metrics, PPV and NPV, i.e., the positive and negative predictive values [33], which provide a more comprehensive evaluation of the generalization ability of the learning model. Their calculation depends on the relationship between sensitivity, specificity and the prevalence of ASD. Analysis of the proposed model indicated an NPV of 68.9% and a PPV of 66.4%. From a clinical point of view, most people are not autistic, so a high NPV is to be expected. The PPV means that the application of machine learning methods to brain imaging data is not driven by the purpose of diagnosis. Rather, it is a data-driven approach to inform what most likely are the neural patterns associated with the disorder [14].
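The dependence of PPV and NPV on sensitivity, specificity and prevalence follows from Bayes' rule, as the sketch below shows. The prevalence used by the authors is not stated in the text, so the example uses made-up inputs and does not attempt to reproduce the reported values; the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    # Expected fractions of the population falling in each outcome.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    ppv = tp / (tp + fp)  # probability of the condition given a positive prediction
    npv = tn / (tn + fn)  # probability of no condition given a negative prediction
    return ppv, npv

# Illustrative inputs only (assumptions, not values from the study).
ppv, npv = predictive_values(sensitivity=0.8, specificity=0.9, prevalence=0.1)
print(round(ppv, 3), round(npv, 3))
```

With a fixed classifier, lowering the prevalence lowers the PPV and raises the NPV, which is why both values must be read together with the assumed prevalence.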

In addition, a Wilcoxon signed-rank test for related groups [34] was performed on the classification results for further evaluation. Specifically, we compared the individual labels predicted by the learning model with the ground truth of the subjects. For the trained SVM classifier in this study, the results demonstrated no statistical difference between the predicted labels and the ground truth (p=0.633). In summary, the overall performance and reliability of ASD classification was improved in our study. That is, our proposed method achieved better classification of ASD than other reported studies with ABIDE data. It should be noted, however, that each study was established on a different subset with diverse clinical or imaging features, so the above comparison of classification performance between studies provides only relative results.

The accuracy of 72.2% obtained in the present study improves on the current state of the art. Several studies thus far have suggested that supervised learning methods are effective in identifying data in high-dimensional spaces [10, 11, 14]. In this paper, the main significance of attentional selection is that it effectively reduces the dimensionality of the problem to an abstracted feature space, which allows learners to represent more complex functions. This validity makes the method combining brain connectivity with attentional dimensionality reduction a reasonable way to identify autism among diverse individuals.

### Dataset heterogeneity

Skilled analytical thinking and learning should not be drowned out by the severe challenge of classifying data across multiple sites, such as the ABIDE data. This dataset was generated by sharing and aggregating independent data across 17 international sites with a consistent imaging modality (i.e., rs-fMRI). At present, a growing number of efforts [10–17] have revealed the potential of using ABIDE data for the detection and generalization of potential biomarkers associated with ASD. However, uncontrolled variation in the aggregated samples seems to be a major obstacle to accurate differentiation between individuals with ASD and TC.

Specifically, such variation is manifested in various heterogeneous factors, ranging from acquisition protocols (e.g., imaging sequence [35]) to recruitment strategies (e.g., age group, IQ range, and level of severity of clinical symptoms [32, 36]). The heterogeneity of a large dataset may well compromise the coherence of information between different sites. This would result in a less ideal identification of ASD, as demonstrated in several recent imaging studies [14–17, 32].

For this reason, many researchers [10–12, 36] were motivated to limit the number of sites or samples for an accurate identification. For example, Plitt et al. [11] achieved a high accuracy of 95.19% based on 252 subjects from only three sites (NYU, USM and UCLA_1), and Chen et al. [36] reported an accuracy of 79.17% in experiments on 240 subjects from 6 separate sites. These studies contain many valuable aspects that might inform research and clinical settings, but they inevitably limit the heterogeneity, not only in race, age, sex, and severity of clinical symptoms, but also in socioeconomic status.

Although fewer site-wise variations, or the absence of such variation in the dataset, make it easier to discriminate between individuals with ASD and TC, the variability might contribute to a better understanding of this brain-based disorder [14]. Thus, in this paper, nearly all of the rs-fMRI data from ABIDE (1054 of 1102 subjects) were utilized for the performance evaluation of our method. This keeps the heterogeneity of the dataset at the maximum level, so as to obtain a more robust identification of individuals and more general insights into the underlying causes of ASD.

Furthermore, our proposed method combines brain connectivity with attentional selection, which could accommodate such variations in the aggregated samples and yield better results than shallow methods. The improvement in classification could be explained by the potential of Extra-Trees to cope with the latent factors of intricate features and by the capacity of the SVM to encode variations in the data. This suggests that the Extra-Trees algorithm, with its randomization, could handle the complexities of large multi-site brain imaging datasets better than Random Forest and similar methods.

### Neural patterns: connectivity in the autistic brain

A connection between two brain regions can be considered informative and discriminative if it contributes to the identification of subjects with ASD and TC. To better understand the neural patterns driving the best identification, we investigated the most discriminative connections that resulted in the highest classification accuracy, as depicted in Fig. 3. Analysis of the contribution of the functional connections showed that the connections colored red and green were stronger in controls, and the blue connection was stronger in ASD patients. Moreover, statistical hypothesis tests on each of the three discriminative connections indicated significant differences between the two groups, ASD and TC (p<0.05).

To investigate the atypical connections identifying ASD, we explored the cognitive functions of the related informative ROIs with a meta-analysis tool named Neurosynth. It compares a given brain map with a collection of over 10,000 fMRI studies and assigns correlations between the map and almost 1335 terms [37]. Table 3 shows, for each of the above ROIs, the top associated Neurosynth terms for anatomy and function. Each association can be understood as a marked similarity between the functional connectivity network defined by the current seed region and the set of regions associated with a particular term in the Neurosynth database. In practice, an individual binary mask was used for each informative brain region. The descriptors show that these functional regions are of great significance for supporting social cognition and interaction, functions that have been shown to be deficient in individuals with ASD [5, 12, 38].

Several recent imaging studies [13, 39] have noted that informative somatosensory regions and the default mode network are of particular value for the precise prediction of individuals with ASD. Moreover, a growing number of efforts [7–9, 40] have revealed lower correlation between the anterior and posterior DMN in autistic individuals, e.g., between the precuneus and the medial prefrontal cortex. Our experiments and analysis of the discriminative functional connections are concordant with these reported observations. This also confirms, from another angle, that the attention mechanism in our proposed framework is reasonable for exploring potential biomarkers associated with ASD.

Based on the heterogeneous rs-fMRI data from ABIDE, we further investigated the brain regions that are deemed to co-activate with the above ROIs, as shown in Fig. 4. To reduce blurring of signals across cerebro-cerebellar and cerebro-striatal boundaries, the signals of the adjacent cerebral cortex were regressed from the cerebellum and striatum. The observed region groupings emphasize neurocognitive functions known to be affected in individuals with ASD, such as diminished social reward, impaired memory and communication skills, and a lack of theory of mind (a leading hypothesis on the social impairment of autistic people) [41].

Notably, our observations on the potential neural patterns of ASD emphasize good generalization to the larger ASD population, rather than to autistic individuals within a specific scope. Since heterogeneity in race, age, sex, severity of clinical symptoms, and socioeconomic status in this database may be associated with significant differences in brain networks, similar experiments could be performed by considering different categories (e.g., gender or age group) to evaluate the specific effects of ASD. Recent studies [36, 42] in the medical literature have borne out the effectiveness of this approach and provided meaningful insights for aiding ASD diagnosis. For example, Chen et al. [36] found a more significant correlation between social and communication deficits in adolescents with ASD than in healthy controls, and Subbaraju et al. [42] reported a clear shift in brain activity to the prefrontal cortex in male patients with ASD that was not evident in females. Hence, in future work, we will make a specific analysis of the brain networks for different categories of subjects.

### Limitations and future work

fMRI holds promise for characterizing pathophysiology and generalizing potential biomarkers for ASD, which would contribute to the targeted diagnosis of brain-based disorders. To address the limited reproducibility and generalizability of potential biomarkers, we performed an attentional connectivity analysis on multi-site rs-fMRI data from ABIDE. Specifically, an individual brain network for each enrolled subject was first constructed from the mean rs-fMRI time series of each ROI in the CC200 atlas. Then, we used the Extra-Trees algorithm to concentrate attention on the informative functional connectivity features that contribute most to the classifier model. Afterwards, a supervised learning method was applied to accomplish the ASD/TC classification by feeding the top discriminative features into the SVM model. The experimental results demonstrated that our proposed attention method is effective as a diagnostic aid for ASD. We achieved a classification accuracy of 72.2%, about 2% higher than the previous most competitive study. Connectivity analysis highlighted functional regions strongly associated with social cognition and interaction, and showed lower correlation between the anterior and posterior default mode network (DMN) in autistic individuals. These observations are concordant with previous studies, which supports the use of our proposed method to effectively identify individuals at risk of ASD.

A limitation of this study on the prediction of ASD is the use of a specific atlas, i.e., CC200. Although a rather accurate classification was achieved on the predefined CC200 atlas, the accuracy may not hold if other atlases are applied, and the significant neural patterns associated with ASD observed in this atlas may also change. Therefore, further validation is required to assess the robustness of the current approach with other parcellation schemes. Another limitation arises in the comparison between different algorithms used to evaluate the performance of our proposed method. Although all the referenced studies were based on imaging data from ABIDE, the comparison might not be appropriate because each algorithm was established on a different subset with diverse clinical or imaging features. In the future, we will attempt to extend our method to other atlases, to experimental data consistent with published studies, and even to imaging data from other brain-based disorders. In this way, the robustness and applicability of our proposed framework could be further improved.

### ACKNOWLEDGEMENTS

This work was supported by the National Program on Key Research Project (Grant No. 2016YFC1401900); National Natural Science Foundation of China (Grant No. 81771876); the Guangdong Provincial Key Laboratory of Optical Information Materials and Technology (Grant No. 2017B030301007); Guangdong Science and Technology Program (Grant No. 2017A010101023); the Innovation Project of Graduate School of South China Normal University.

### Figures

Fig. 1.

An overall flowchart of proposed ASD/TC classification method.

Fig. 2.

Results (a) and assessment (b) of 10 runs of the 10-fold cross-validation procedure. Each number (1 to 10) on the x-axis denotes the run index. The yellow line (y=0.003*x+72) in subplot (a) is a linear fit to the accuracy, and the corresponding residuals are shown in subplot (b).

Fig. 3.

The most discriminative functional connections among the predictive biomarkers for distinguishing controls from ASD patients, shown as red, blue, and green lines in descending order of importance. Red and green connections are stronger in controls, and blue connections are stronger in ASD patients.

Fig. 4.

Brain regions co-activated with ROIs of the top three functional connections. The seed region could be visually recognized as the bright white spot in each map.

### Tables

Table 1.

Demography summary

| Site | ASD Age | ASD Gender | ASD FD | TD Age | TD Gender | TD FD |
|---|---|---|---|---|---|---|
| CALTECH | 22.79±6.76 | M 5, F 2 | 0.07 | 28.29±11.15 | M 10, F 4 | 0.07 |
| CMU | 26.33±4.89 | M 5, F 1 | 0.36 | 26.38±4.31 | M 7, F 1 | 0.34 |
| KKI | 10.01±1.45 | M 18, F 4 | 0.26 | 10.16±1.26 | M 24, F 9 | 0.11 |
| LEUVEN | 17.89±2.71 | M 26, F 3 | 0.10 | 18.84±2.25 | M 29, F 4 | 0.08 |
| MAX_MUN | 26.08±14.89 | M 21, F 3 | 0.15 | 26.21±9.80 | M 29, F 4 | 0.11 |
| NYU | 14.52±6.97 | M 68, F 11 | 0.08 | 15.56±6.06 | M 78, F 25 | 0.06 |
| OHSU | 11.66±2.25 | M 13, F 0 | 0.06 | 10.06±1.08 | M 15, F 0 | 0.13 |
| OLIN | 16.70±3.42 | M 17, F 3 | 0.19 | 16.94±3.68 | M 14, F 2 | 0.17 |
| PITT | 18.93±7.20 | M 26, F 4 | 0.15 | 19.00±6.74 | M 22, F 4 | 0.16 |
| SBL | 35.29±10.76 | M 14, F 0 | 0.14 | 34.21±6.58 | M 14, F 0 | 0.16 |
| SDSU | 14.58±1.74 | M 10, F 1 | 0.11 | 13.94±1.83 | M 12, F 6 | 0.07 |
| STANFORD | 9.96±1.59 | M 16, F 4 | 0.11 | 9.95±1.60 | M 16, F 4 | 0.11 |
| TRINITY | 17.28±3.57 | M 24, F 0 | 0.13 | 17.06±3.87 | M 23, F 0 | 0.10 |
| UCLA | 12.91±2.25 | M 48, F 6 | 0.26 | 12.79±1.61 | M 38, F 6 | 0.11 |
| UM | 13.80±1.98 | M 58, F 10 | 0.19 | 15.34±3.55 | M 59, F 18 | 0.09 |
| USM | 22.65±7.73 | M 58, F 0 | 0.16 | 21.36±7.64 | M 43, F 0 | 0.10 |
| YALE | 12.63±3.04 | M 19, F 8 | 0.13 | 12.68±2.75 | M 20, F 8 | 0.09 |
| ALL Sites | 16.59±8.05 | M 446, F 60 | 0.16 | 16.86±7.55 | M 453, F 95 | 0.10 |

M, Male; F, Female. FD is a measure of subject head motion, obtained by comparing the current volume with the previous one.

Table 2.

Classification results of methods with ABIDE data

| Method | Number of Subjects | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| Kong et al. [10] | 172 | 90.4 | -- | -- |
| Plitt et al. [11] | 178 | 95.2 | 94.9 | 95.6 |
| Chen et al. [12] | 252 | 91 | 89 | 93 |
| Abraham et al. [32] | 871 | 66.9 | 53.2 | 78.3 |
| Nielsen et al. [15] | 964 | 60.0 | -- | -- |
| Dvornek et al. [13] | 1100 | 68.5 | -- | -- |
| Heinsfeld et al. [14] | 1035 | 70 | 74 | 63 |
| Eslami et al. [17] | 1035 | 70.3 | 68.3 | 72.2 |
| Ghiassian et al. [16] | 1111 | 59.2 | -- | -- |
| Proposed | 1054 | 72.2 | 68.8 | 75.4 |

Table 3.

Top Neurosynth terms on anatomy and function associated with ROIs of the discriminative functional connections

| ROI | Anatomical Terms | Functional Terms |
|---|---|---|
| | Precuneus, Posterior Cingulate, Retrosplenial Cortex, Angular Gyrus | Episodic Memory, Autobiographical Memory, Memory Retrieval, Cognitive Impairment |
| | Cortex Vmpfc, Ventromedial Prefrontal, Medial Prefrontal, Posterior Cingulate | Default mode, Autobiographical Memory, Choose, Reward |
| | Auditory Cortex, Heschl Gyrus, Superior Temporal, Planum Temporale | Sounds, Speech, Listening, Audiovisual |
| | Thalamus, Thalamic, Basal Ganglia, Caudate Nucleus | Finger Tapping, Pain, Supplementary Motor, Sensation |
| | Anterior Temporal, Temporal Pole, Temporal Lobes, Lateral Temporal | Mental States, Theory Mind, Comprehension, Mentalizing, Sentence |
| | Medial Prefrontal, Cortex Vmpfc, Dorsomedial Prefrontal, Prefrontal Cortex | Mentalizing, Mental States, Beliefs, Theory Mind |

### References

1. American Psychiatric Association (2013) Diagnostic and statistical manual of mental disorders: DSM-5. 5th ed. American Psychiatric Association Publishing, Washington, D.C.
2. Pandolfi V, Magyar CI, Dill CA (2018) Screening for autism spectrum disorder in children with Down syndrome: an evaluation of the pervasive developmental disorder in mental retardation scale. J Intellect Dev Disabil 43:61-72.
3. Jeon SJ, Gonzales EL, Mabunga DFN, Valencia ST, Kim DG, Kim Y, Adil KJL, Shin D, Park D, Shin CY (2018) Sex-specific behavioral features of rodent models of autism spectrum disorder. Exp Neurobiol 27:321-343.
4. Kana RK, Sartin EB, Stevens C Jr, Deshpande HD, Klein C, Klinger MR, Klinger LG (2017) Neural networks underlying language and social cognition during self-other processing in Autism spectrum disorders. Neuropsychologia 102:116-123.
5. Monk CS, Peltier SJ, Wiggins JL, Weng SJ, Carrasco M, Risi S, Lord C (2009) Abnormalities of intrinsic functional connectivity in autism spectrum disorders. Neuroimage 47:764-772.
6. Li H, Xue Z, Ellmore TM, Frye RE, Wong ST (2014) Network-based analysis reveals stronger local diffusion-based connectivity and different correlations with oral language skills in brains of children with high functioning autism spectrum disorders. Hum Brain Mapp 35:396-413.
7. Weng SJ, Wiggins JL, Peltier SJ, Carrasco M, Risi S, Lord C, Monk CS (2010) Alterations of resting state functional connectivity in the default network in adolescents with autism spectrum disorders. Brain Res 1313:202-214.
8. Hegarty JP 2nd, Ferguson BJ, Zamzow RM, Rohowetz LJ, Johnson JD, Christ SE, Beversdorf DQ (2017) Beta-adrenergic antagonism modulates functional connectivity in the default mode network of individuals with and without autism spectrum disorder. Brain Imaging Behav 11:1278-1289.
9. Washington SD, Gordon EM, Brar J, Warburton S, Sawyer AT, Wolfe A, Mease-Ference ER, Girton L, Hailu A, Mbwana J, Gaillard WD, Kalbfleisch ML, VanMeter JW (2014) Dysmaturation of the default mode network in autism. Hum Brain Mapp 35:1284-1296.
10. Kong Y, Gao J, Xu Y, Pan Y, Wang J, Liu J (2019) Classification of autism spectrum disorder by combining brain connectivity and deep neural network classifier. Neurocomputing 324:63-68.
11. Plitt M, Barnes KA, Martin A (2014) Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards. Neuroimage Clin 7:359-366.
12. Chen CP, Keown CL, Jahedi A, Nair A, Pflieger ME, Bailey BA, Müller RA (2015) Diagnostic classification of intrinsic functional connectivity highlights somatosensory, default mode, and visual regions in autism. Neuroimage Clin 8:238-245.
13. Dvornek NC, Ventola P, Pelphrey KA, Duncan JS (2017) Identifying autism from resting-state fMRI using long short-term memory networks. Mach Learn Med Imaging 10541:362-370.
14. Heinsfeld AS, Franco AR, Craddock RC, Buchweitz A, Meneguzzi F (2017) Identification of autism spectrum disorder using deep learning and the ABIDE dataset. Neuroimage Clin 17:16-23.
15. Nielsen JA, Zielinski BA, Fletcher PT, Alexander AL, Lange N, Bigler ED, Lainhart JE, Anderson JS (2013) Multisite functional connectivity MRI classification of autism: ABIDE results. Front Hum Neurosci 7:599.
16. Ghiassian S, Greiner R, Jin P, Brown MR (2016) Using functional or structural magnetic resonance images and personal characteristic data to identify ADHD and autism. PLoS One 11:e0166934.
17. Eslami T, Mirjalili V, Fong A, Laird AR, Saeed F (2019) ASD-DiagNet: a hybrid learning approach for detection of autism spectrum disorder using fMRI data. Front Neuroinform 13:70.
18. Craddock C, Benhajali Y, Chu C, Chouinard F, Evans A, Jakab A, Khundrakpam BS, Lewis JD, Li Q, Milham M, Yan C, Bellec P (2013) The Neuro Bureau Preprocessing Initiative: open sharing of preprocessed neuroimaging data and derivatives. [Internet] Front Neuroinform, Lausanne, Switzerland. Available from: https://doi.org/10.3389/conf.fninf.2013.09.00041.
19. Craddock RC, James GA, Holtzheimer PE 3rd, Hu XP, Mayberg HS (2012) A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum Brain Mapp 33:1914-1928.
20. Lancaster JL, Woldorff MG, Parsons LM, Liotti M, Freitas CS, Rainey L, Kochunov PV, Nickerson D, Mikiten SA, Fox PT (2000) Automated Talairach atlas labels for functional brain mapping. Hum Brain Mapp 10:120-131.
21. Desikan RS, Ségonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, Buckner RL, Dale AM, Maguire RP, Hyman BT, Albert MS, Killiany RJ (2006) An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31:968-980.
22. Eickhoff SB, Stephan KE, Mohlberg H, Grefkes C, Fink GR, Amunts K, Zilles K (2005) A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage 25:1325-1335.
23. Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, Frackowiak RSJ (1994) Statistical parametric maps in functional imaging: a general linear approach. Hum Brain Mapp 2:189-210.
24. Behzadi Y, Restom K, Liau J, Liu TT (2007) A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. Neuroimage 37:90-101.
25. Tustison NJ, Yang Y, Salerno M (2014) Advanced normalization tools for cardiac motion correction. Lect Notes Comput Sci 8896:3-12.
26. Braun U, Muldoon SF, Bassett DS (2015) On human brain networks in health and disease. pp 1-9. eLS, John Wiley & Sons, Ltd., Hoboken.
27. Liu J, Li M, Pan Y, Lan W, Zheng R, Wu FX, Wang J (2017) Complex brain network analysis and its applications to brain disorders: a survey. Complexity 2017:8362741.
28. Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157-1182.
29. Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17:491-502.
30. Pinto A, Pereira S, Rasteiro D, Silva CA (2018) Hierarchical brain tumour segmentation using extremely randomized trees. Pattern Recognit 82:105-117.
31. Li J, Qiu L, Xu L, Pedapati EV, Erickson CA, Sunar U (2016) Characterization of autism spectrum disorder with spontaneous hemodynamic activity. Biomed Opt Express 7:3871-3881.
32. Abraham A, Milham MP, Di Martino A, Craddock RC, Samaras D, Thirion B, Varoquaux G (2017) Deriving reproducible biomarkers from multi-site resting-state data: an autism-based example. Neuroimage 147:736-745.
33. Castellanos FX, Di Martino A, Craddock RC, Mehta AD, Milham MP (2013) Clinical applications of the functional connectome. Neuroimage 80:527-540.
34. Kornbrot DE (1990) The rank difference test: a new and meaningful alternative to the Wilcoxon signed ranks test for ordinal data. Br J Math Stat Psychol 43:241-264.
35. Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9:432-441.
36. Chen H, Duan X, Liu F, Lu F, Ma X, Zhang Y, Uddin LQ, Chen H (2016) Multivariate classification of autism spectrum disorder using frequency-specific resting-state functional connectivity-a multi-center study. Prog Neuropsychopharmacol Biol Psychiatry 64:1-9.
37. Yarkoni T, Poldrack RA, Nichols TE, Van Essen DC, Wager TD (2011) Large-scale automated synthesis of human functional neuroimaging data. Nat Methods 8:665-670.
38. Just MA, Cherkassky VL, Buchweitz A, Keller TA, Mitchell TM (2014) Identifying autism from neural representations of social interactions: neurocognitive markers of autism. PLoS One 9:e113879.
39. Anderson JS, Nielsen JA, Froehlich AL, DuBray MB, Druzgal TJ, Cariello AN, Cooperrider JR, Zielinski BA, Ravichandran C, Fletcher PT, Alexander AL, Bigler ED, Lange N, Lainhart JE (2011) Functional connectivity magnetic resonance imaging classification of autism. Brain 134:3742-3754.
40. Assaf M, Jagannathan K, Calhoun VD, Miller L, Stevens MC, Sahl R, O'Boyle JG, Schultz RT, Pearlson GD (2010) Abnormal functional connectivity of default mode sub-networks in autism spectrum disorder patients. Neuroimage 53:247-256.
41. Baron-Cohen S, Leslie AM, Frith U (1985) Does the autistic child have a "theory of mind"? Cognition 21:37-46.
42. Subbaraju V, Suresh MB, Sundaram S, Narasimhan S (2017) Identifying differences in brain activities and an accurate detection of autism spectrum disorder using resting state functional-magnetic resonance imaging: a spatial filtering approach. Med Image Anal 35:375-389.