Exp Neurobiol 2024; 33(5): 238-250
Published online October 31, 2024
https://doi.org/10.5607/en24020
© The Korean Society for Brain and Neural Sciences
Kwangsu Kim1,2, Jisub Bae3, JeeWon Lee4, Sun Ae Moon4, Sang-Ho Lee1,5, Won-Seok Kang1,5 and Cheil Moon1,4*
1Convergence Research Advanced Centre for Olfaction, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Korea, 2Smell and Taste Clinic, Department of Otorhinolaryngology, Technische Universität Dresden, Dresden 01307, Germany, 3Center for Cognition and Sociality, Institute for Basic Science (IBS), Daejeon 34126, Korea, 4Department of Brain Sciences, Graduate School, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Korea, 5Division of Intelligent Robot, Daegu Gyeongbuk Institute of Science and Technology (DGIST), Daegu 42988, Korea
Correspondence to: *To whom correspondence should be addressed.
TEL: 82-53-785-6110, FAX: 82-53-785-6109
e-mail: cmoon@dgist.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Although we have multiple senses, multimedia mainly targets vision and audition. To expand the senses engaged by multimedia, olfactory stimulation has been used to enhance the sense of reality. Odors are primarily matched with objects in scenes. However, it is impractical to select all odors that match all objects in a scene and offer them to viewers. As an alternative, offering a single odor in a category as representative of other odors belonging to that category has been suggested. However, it is unclear whether viewers’ responses to videos with multiple odors (e.g., rose, lavender, and lily) from a category (e.g., flowers) are comparable. Therefore, we studied whether odors belonging to a given category could be similar in behavioral congruency and in the five frequency bands (delta, theta, alpha, beta, and gamma) of electroencephalogram (EEG) data collected while viewers watched videos. We conducted questionnaires and EEG experiments to understand the effects of similar odors belonging to categories. Our results showed that similar odors in a specific odor category were more congruent with videos than those in different odor categories. In our EEG data, the delta and theta bands were mainly clustered when odors were offered to viewers in similar categories. The theta band is known to be primarily related to the neural signals of odor information. Our studies showed that choosing odors based on odor categories in multimedia can be feasible.
Keywords: EEG, Olfactory perception, Multimedia, Cluster analysis, Machine learning
In daily life, we principally detect and interpret our environment with five sensory modalities, namely visual, auditory, olfactory, tactile, and gustatory. Multimedia content predominantly targets two human senses, i.e., visual and auditory senses. In an attempt to expand sensory engagement, olfactory stimulation has been added to multimedia content [1-6]. Particularly, the addition of odors to multimedia in the film industry has been increasing each year [7-9]. Moreover, adding olfactory stimuli to multimedia content can enrich the multimedia experience of individuals. In previous multimedia studies using olfactory stimuli, these stimuli enhanced the sense of multimedia reality [10, 11].
Odors that match objects in scenes based on categories have been widely used to select olfactory stimuli for multimedia [2, 3, 10-15]. However, including all odors corresponding to every object in a scene for viewers is impractical. As an alternative, researchers have suggested reducing the number of odors by selecting representative ones based on their categories [16]. Selecting representative odors based on odor categories could reduce the number of odors offered in scenes or replace odors with cheaper and safer odors. However, similar odors in an identical category might not match identical objects. Thus, odors can still induce different viewer responses to multimedia. There is still little evidence that odors in identical categories could similarly induce user responses to multimedia.
Unlike multimedia studies with olfactory stimuli, there have been several studies on the olfaction of odors in similar categories [17-21]. Odors in similar categories are similarly perceived in human behavior tests. Neuronal signals in the piriform cortex (PC) and orbitofrontal cortex (OFC) induced by odors in similar categories are more similar than those induced by odors in different categories [17, 18]. These findings suggest that odors within comparable categories potentially elicit more similar behavioral and neuronal responses than odors in different categories.
However, when stimuli of different modalities are offered simultaneously, the response is not merely the sum of the neuronal activities elicited by each stimulus. In previous multimodal functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) studies [13, 22-25], neuronal activities were higher when stimuli were offered simultaneously than when they were offered separately. This tendency was more noticeable when the stimuli from different modalities matched. These studies implied that there is an interaction among stimuli affecting different modalities in the brain. However, it is unclear whether odors belonging to comparable categories induce similar neuronal activity when presented with videos.
Additionally, in multimedia studies, researchers have widely used EEG and the analysis of frequency bands (delta, theta, alpha, beta, and gamma) to measure neuronal responses in viewers [13, 26-32]. The emotional responses of multimedia viewers are related to each frequency band. In addition, a multimedia study with odors demonstrated greater increases in the delta, theta, and alpha bands of EEG data when the video was offered with a matching odor than when it was presented without an odor [13]. Specifically, the theta band of EEG data is associated with olfactory processing in olfactory studies [33, 34].
Therefore, our purpose was to examine whether odors belonging to similar categories are more comparable in behavioral congruency and in the five frequency bands (delta, theta, alpha, beta, and gamma) of EEG data than odors belonging to different categories when presented with a video, as a method for matching odors with objects in videos. To this end, we selected four odors and two videos based on a previous study [16]. We aimed to measure the congruency between odors and videos for matching the odors with videos. Moreover, the congruency was related to the sense of multimedia reality [10, 11]. To accomplish these objectives, we planned to conduct EEG experiments and analyze event-related spectral perturbations (ERSP) to measure brain activities in each frequency band, examining how these activities change over time during the video presentation [13].
We recruited fifty-six participants (29 men and 27 women) aged between 18 and 23 years, with a mean age of 20.79 (SD: 1.25), from the Daegu Gyeongbuk Institute of Science and Technology. Three participants were excluded due to technical issues with the olfactometer during tests, resulting in a total of 53 participants for this study. All participants were right-handed, passed the Sniffin’ Sticks test [35] to confirm no olfactory dysfunction, and provided written informed consent. The study received approval from the Institutional Review Board (IRB) of DGIST (DGIST-180524-HR-005-03), ensuring adherence to relevant guidelines and regulations [36].
Two 10-second videos from the LIRIS database were utilized [37-41]. The first depicted an adult woman and a girl with flowers (flower video, FV), while the second showed a man with a cup of coffee on a table (coffee video, CV) (Fig. 1a). Participants, seated comfortably, watched the clips on a 24-inch LED screen positioned approximately 60 cm away.
Lavender oil (CAS 8000-28-0, Lot #BCBM0576V), geraniol (CAS 106-24-1, Lot #MKBW0796V, Sigma), 2-furanmethanethiol (CAS 98-02-2, Lot #SHBH5642, Sigma), and 2-ethyl-3,5-dimethylpyrazine (CAS 27043-05-6, Lot #STBF8794V, Sigma) were used in this study (Fig. 1a). All four odorants were diluted in mineral oil (Sigma, lot # MKBZ6778V). Lavender was diluted to 0.005% (v/v), geraniol was diluted to 50%, 2-furanmethanethiol was diluted to 0.0004%, and 2-ethyl-3,5-dimethylpyrazine was diluted to 0.5%. The concentrations of the odors were chosen to ensure that participants perceived the odors at a similar intensity level (Fig. S1a, b).
The rationale for selecting these specific odors was based on previous studies. Lavender and geraniol were chosen as representative floral scents due to their frequent use in olfactory research [42, 43]. For the coffee odors, 2-furanmethanethiol and 2-ethyl-3,5-dimethylpyrazine were selected as key components contributing to the characteristic aroma of coffee [44, 45].
Odorants were delivered differently depending on the experiment. In the category survey and discrimination tests, odors were offered in bottles. In the EEG recording test, a custom-built olfactometer delivered the odorants. The olfactometer was linked to the mask (without cannula) worn by the participants. When the olfactometer offered odors, airflow speed was 5.18 m/s for lavender, geraniol, and 2-ethyl-3,5-dimethylpyrazine and 10.35 m/s for 2-furanmethanethiol. The airflow rates were selected to ensure that participants perceived the odors at a similar intensity level. The odors were delivered during the first 5 s at the beginning of the videos. Lavender and geraniol (flower odors) were named FO1 and FO2, respectively [42, 43]. 2-furanmethanethiol and 2-ethyl-3,5-dimethylpyrazine (coffee odors) were named CO1 and CO2, respectively, in this study [44, 45] (Fig. 1a). All odors were delivered to participants through separate tubes to avoid cross-contamination of odors.
All the tests were conducted in a ventilated and soundproof chamber. The participants were instructed to maintain natural respiration during the tests. Our experiments consisted of four steps: a category survey, discrimination test, EEG recording test, and congruency survey (Fig. 1b). Before the EEG recording test and congruency survey, the category survey and discrimination test for the four odors were conducted as pilot tests to determine whether the four odors could be differentiated into two categories. To avoid repetition effects in the EEG recording test and congruency survey, thirteen participants (8 men, 5 women) were recruited separately for the category survey and discrimination test.
In the category survey, four odors were offered to the participants in bottles, and they could freely smell the odors during the survey. Participants marked the categories of the odors among 19 categories (citrus, fruit, coffee, flower, nut, green, tree, root, herb, sweet, powdery, creamy, soap, spicy, musk, mint, watery, tropical fruits, and others) on a 9-point Likert scale questionnaire. The cumulative scores of the 19 categories across the four odors were calculated to examine which odor categories got the highest scores among participants.
where CQn is the cumulative sum of participant responses in question n, Np is the total number of participants, P is the participant, QRp is the participant responses (1~9 score) in question, and Codor is the cumulative score of all 19 questions in an odor condition (i.e., FO1, FO2, CO1, and CO2).
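The scoring defined above can be sketched in a few lines. This is a minimal illustration only: it assumes CQn is a plain sum of the 1 to 9 ratings over participants, and the function name and toy ratings are hypothetical, not taken from the study.

```python
def cumulative_category_scores(ratings):
    """Sum 1-9 ratings over participants, separately for each category question.

    ratings: list with one entry per participant; each entry is a list holding
    that participant's score for every category, in a fixed order.
    Returns one cumulative score per category (CQ_1 ... CQ_n).
    """
    n_categories = len(ratings[0])
    return [sum(participant[q] for participant in ratings)
            for q in range(n_categories)]


# Hypothetical toy data: 3 participants rating 4 categories for one odor.
toy_ratings = [
    [9, 1, 3, 2],
    [7, 2, 4, 1],
    [8, 1, 5, 1],
]
toy_scores = cumulative_category_scores(toy_ratings)
# Index of the category with the highest cumulative score.
top_category = max(range(len(toy_scores)), key=toy_scores.__getitem__)
```

Collecting the cumulative scores of all 19 categories for one odor in this way would give the per-odor score set described as Codor.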
Next, the discrimination test was conducted using the ‘three bottle test.’ Three odors in the bottles were randomly offered to the participants. Participants were asked to choose the most different category of odor among the three odors. Four possible combinations of odors were tested for each participant. The accuracy of the test was calculated by dividing the number of trials in which participants chose odors in different categories by the whole trial.
In the EEG recording test and congruency survey, 40 participants were divided into three groups depending on the odors to avoid repetition of viewing the same videos (Fig. 1a). In group 1, there were 13 participants (7 men and 6 women). FO1 was offered with FV, and CO1 was offered with CV. Group 2 included 14 participants (7 men, 7 women). FO2 and CO2 were applied to FV and CV, respectively. In group 3, there were 13 participants (5 men and 8 women). CO1 was offered with FV, and FO1 was offered with CV. Therefore, Groups 1 and 2 were given the flower odor with the FV and the coffee odor with the CV. However, Group 3 was given the coffee odor with the FV and the flower odor with the CV.
During the EEG recording test, the participants sat on a chair in the chamber and wore a mask linked to the olfactometer. Participants’ respiration was monitored using a respiration belt. The odors were delivered to the participants immediately after an inhalation peak for 5 s, and the video was started simultaneously and offered for 10 s. After the end of the video, there was a rest period of 30 s before the next round of testing (Fig. 1c). Two sessions were conducted for each of the two videos for each participant. All videos in the sessions were presented once, and the participants watched FV and CV once.
After the EEG recording test, participants were asked to evaluate the congruency between the videos and odors. Participants rated congruency in percentages (%) from 0 to 100 where 0% represented “not congruent” and 100% represented “congruent” [46]. In addition, participants evaluated the pleasantness and intensity of the odors on a 9-point Likert scale questionnaire.
EEG was recorded during video and odor stimulation. EEG signals were digitized using an EEG amplifier (ActiveTwo; BioSemi, Amsterdam, the Netherlands). EEG signals were recorded with Ag/AgCl scalp electrodes from 64 positions based on the international 10/20 system (Fp1, AF7, AF3, F1, F3, F5, F7, FT7, FC5, FC3, FC1, C1, C3, C5, T7 (T3), TP7, CP5, CP3, CP1, P1, P3, P5, P7, P9, PO7, PO3, O1, Iz (inion), Oz, POz, Pz, CPz, Fpz, Fp2, AF8, AF4, AFz, Fz, F2, F4, F6, F8, FT8, FC6, FC4, FC2, FCz, Cz, C2, C4, C6, T8 (T4), TP8, CP6, CP4, CP2, P2, P4, P6, P8, P10, PO8, PO4, and O2) on a BioSemi head cap (64 ch, BioSemi). Eye blinks (electrooculographic signals) were measured at approximately 2 cm above the outer canthus of the right eye. The sampling rate was 2,048 Hz, and the signals were analog-filtered via a 0.15 Hz high-pass filter and a 100 Hz low-pass filter. The 0.15 Hz high-pass filter was the default configuration of the BioSemi system, used to remove slow drifts in the signal while retaining the relevant neural activity. A conductive electrolyte gel was used for a stable connection between the scalp and electrodes. The impedance of each electrode was less than 10 kΩ. Electrophysiological activity was referenced to the common average of all channels.
EEG data were downsampled from 2,048 to 512 Hz. Then, an offline bandpass filter (0.5~50 Hz) was used to minimize the noise caused by muscle artifacts and skin potential. The bandpass filter was applied with cutoff frequencies (-6 dB points) of 0.25 and 50.25 Hz after extracting epochs, and the type of filter applied was an FIR (finite impulse response) filter. The default filter order in EEGLAB was used for the FIR filter; it is determined by internal optimization routines, which select an appropriate order based on the sampling frequency of the data and the specified cutoff frequencies [47]. The EEG data were segmented into epochs depending on the two videos. Each epoch had a 5 s pre-stimulus period and a 10 s post-stimulus period. Data from -5 s to 0 s in each epoch were used for baseline correction. To extract the five frequency bands (delta: 1~3 Hz, theta: 4~8 Hz, alpha: 8~13 Hz, beta: 13~30 Hz, gamma: 30~50 Hz) from each channel of the EEG data (0~10 s) across time, event-related spectral perturbations (ERSP) were computed for each epoch in EEGLAB. Wavelet cycles were 3 and 0.5, respectively. The number of time points was set to 400. The window size of each time point was 3,341.8 ms.
where CHn is the electrode number of the EEG, Tm is the ERSP time point, I is the start point, and I+N-1 is the endpoint of ERSP in a specific frequency band (i.e., delta, theta, alpha, beta, and gamma). For instance, ERSPCH1-T1 [1] means “first frequency point of ERSP value in the electrode channel 1 and time point 1” and ƒCH1-T1 means “ERSP value in the electrode channel 1 and time point 1”.
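One way to read these definitions is that the ERSP values at the N frequency points of a band (from I to I+N-1) are collapsed into a single value per channel and time point. The sketch below assumes this is done by averaging, which is an assumption because the original formula is not reproduced here. The band edges come from the text; the function name, the inclusive band limits, and the toy values are hypothetical.

```python
# Band limits in Hz as listed in the text (inclusive on both ends here,
# which is an assumption; 8 Hz and 13 Hz then fall into two bands).
BANDS_HZ = {
    "delta": (1, 3),
    "theta": (4, 8),
    "alpha": (8, 13),
    "beta": (13, 30),
    "gamma": (30, 50),
}


def band_mean(ersp_values, freqs_hz, band):
    """Average ERSP values of one channel and one time point over a band.

    ersp_values: ERSP value at each frequency point.
    freqs_hz: frequency (Hz) of each point, same length as ersp_values.
    """
    low, high = BANDS_HZ[band]
    in_band = [v for v, f in zip(ersp_values, freqs_hz) if low <= f <= high]
    if not in_band:
        raise ValueError(f"no frequency points fall inside the {band} band")
    return sum(in_band) / len(in_band)


# Hypothetical ERSP column for one channel and one time point.
toy_freqs = [1, 2, 3, 4, 6, 8]
toy_ersp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_delta = band_mean(toy_ersp, toy_freqs, "delta")  # points at 1, 2, 3 Hz
toy_theta = band_mean(toy_ersp, toy_freqs, "theta")  # points at 4, 6, 8 Hz
```

Repeating this for every channel and time point would yield one band-specific feature vector per participant and odor condition.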
where Odor is the odor condition (i.e., FO1, FO2, CO1, and CO2), and the frequency band is one of the frequency bands (i.e., delta, theta, alpha, beta, gamma) of the EEG data. P is the participant. For example, FFO1-theta-p1 means “ERSP data of participant 1 in theta band”.
After EEG preprocessing, we conducted clustering methods to examine which frequency bands of ERSP data were more similar depending on the categories of odors in each video. Agglomerative hierarchical clustering (AHC) and K-means clustering were used as clustering methods. AHC can easily couple the data from the closest distances and show the closest data pairs. Furthermore, AHC is faster than other machine learning methods because AHC does not need additional training steps. These reasons were why we used AHC to analyze our ERSP data. In addition, K-means clustering was also conducted to check whether our clustering results of EEG data were restricted to AHC. AHC was conducted on the ERSP data of 64 channels for all participants. The ERSP data of the five frequency bands from 0 s to 10 s were used in this analysis. This was done to examine which frequency bands of ERSP data were more similar depending on the categories of odors in each video. AHC was applied in each video: FO1 of Group 1, FO2 of Group 2, and CO1 of Group 3 in the FV; CO1 of Group 1, CO2 of Group 2, and FO1 of Group 3 in the CV. There were two shuffled conditions, including the control condition of the repetition: “within group condition” and “random sampling group condition.”
The Input table of the “within group condition” is as follows:
where
The Input table of the “random sampling group condition” is as follows:
where HCRRandom is the number of the first pair during 1,000 repetitions of the “random sampling group condition.” In the “random sampling group condition,” the ERSP data of each participant could be shuffled into other groups. The random sampling group was the control group. The number of first odor pairs in the AHC results during the 1,000 repetitions was calculated because the first pair of AHC results showed the most similarity. We examined whether ERSP data induced by videos and odors belonging to similar categories were similar to those induced by videos and odors from different categories. The cumulative moving average (CMA) during all repetitions of the AHC was drawn to check that 1,000 repetitions of AHC were sufficient for counterbalancing (Fig. S2). In CMA, the closest odor pair ratio was calculated, and the number of closest odor pairs was calculated as the number of repetition trials increased. Furthermore, K-means clustering was also conducted with the same clustering design as AHC to check whether our clustering results of EEG data were restricted to AHC. The K value was 2, and the number of odor pairs was counted instead of the first odor pair.
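The first-pair counting described above can be sketched as follows. With only three items, the first merge of agglomerative clustering is simply the pair with the smallest distance, whatever linkage is used, so no clustering library is needed. The sampling scheme is an assumption, since the input tables are not reproduced here: each repetition draws one participant vector per odor group in the "within group" case, and draws from the pooled participants in the shuffled case. Function names and toy vectors are hypothetical.

```python
import math
import random
from collections import Counter


def closest_pair(items):
    """Return the label pair with the smallest Euclidean distance.

    items: dict mapping a label to a feature vector. For three items this
    pair is the first merge of agglomerative hierarchical clustering.
    """
    labels = sorted(items)
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    return min(pairs, key=lambda p: math.dist(items[p[0]], items[p[1]]))


def count_first_pairs(groups, n_rep=1000, shuffle=False, seed=0):
    """Count which pair merges first across repetitions.

    groups: dict mapping an odor label to a list of participant vectors.
    shuffle=False: draw one participant from each group (assumed scheme).
    shuffle=True: draw from the pooled participants, ignoring group labels.
    """
    rng = random.Random(seed)
    pooled = [vec for vecs in groups.values() for vec in vecs]
    counts = Counter()
    for _ in range(n_rep):
        if shuffle:
            drawn = rng.sample(pooled, len(groups))
            items = dict(zip(groups, drawn))
        else:
            items = {label: rng.choice(vecs) for label, vecs in groups.items()}
        counts[closest_pair(items)] += 1
    return counts


# Hypothetical, well-separated toy vectors standing in for ERSP features.
toy_groups = {
    "FO1": [[0.0, 0.0], [0.1, 0.0]],
    "FO2": [[0.2, 0.1], [0.0, 0.2]],
    "CO1": [[5.0, 5.0], [5.2, 4.9]],
}
within_counts = count_first_pairs(toy_groups)
random_counts = count_first_pairs(toy_groups, shuffle=True)
```

In this toy case the two flower groups always pair first in the within-group condition, while the shuffled condition spreads the counts across pairs because labels no longer carry information.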
We conducted support vector machine (SVM) analysis to identify which frequency bands of ERSP data were most similar across odor categories within each video. SVM was applied to the ERSP data, as with the clustering methods, to evaluate patterns related to odor similarity. To optimize the model, we used grid search with five-fold cross-validation on the training data, allowing us to select the best parameters for improved accuracy on test data.
For the FV condition, FO1 and CO1 were used as training data with labels FO1=1 and CO1=2, while FO2 served as test data to determine its classification accuracy under label 1. In the CV condition, CO1 and FO1 were used as training data with labels CO1=1 and FO1=2, and CO2 was reserved as test data, with accuracy evaluated by its classification under label 1.
In the random sampling condition, participants’ data from FO1, FO2, and CO1 for FV, or CO1, CO2, and FO1 for CV, were randomly assigned to either training or test sets. The SVM model used regularized support vector classification with the best parameters identified through grid search. The highest-performing model from cross-validation and grid search was subsequently applied to test data to evaluate classification accuracy.
The results are presented as mean±SEM. Statistical significance is marked as * for p<0.05, ** for p<0.01, and *** for p<0.001. The accuracy of all participants in the odor discrimination test was compared with the chance level (33.3%) using a one-sample Wilcoxon signed rank test to check whether participants chose the odor belonging to a different category more often than at random. Congruency between videos and odors was compared across groups with a Kruskal-Wallis test for each video. Dunn’s multiple comparison test was conducted as a post-hoc test, and the correction level for Dunn’s multiple comparison test was 0.017 (0.05/3). We used Dunn’s multiple comparison test because it is a widely used method, like false discovery rate, and has the strictest standard for post hoc tests. In addition, the original degrees of freedom, H values, p values, and epsilon-squared effect sizes (ε2) are reported. The chi-square test was used to analyze the number of first pairs of AHC results and K-means clustering results in each band across 1,000 repetitions, depending on the group. The original degrees of freedom, χ2 values, and p-values are reported. Bonferroni’s test was conducted as a post-hoc test, and the correction level for the Bonferroni correction was 0.017 (0.05/3).
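The chi-square comparison of first-pair counts can be illustrated with a short sketch. It assumes a goodness-of-fit test of the three pair counts against equal expected counts (1,000/3 = 333.33 per pair), which is one reading of the description rather than a confirmed design. The function names and example counts are hypothetical. The closed-form p-value uses the fact that the chi-square survival function with 2 degrees of freedom is exp(-x/2).

```python
import math


def chi_square_uniform(counts):
    """Goodness-of-fit statistic against equal expected counts.

    Returns (statistic, degrees_of_freedom).
    """
    total = sum(counts)
    expected = total / len(counts)  # 333.33 for 1,000 repetitions, 3 pairs
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, len(counts) - 1


def p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-statistic / 2)


# Correction level for three pairwise comparisons, as stated in the text.
alpha_corrected = 0.05 / 3  # about 0.017

# Hypothetical first-pair counts for three odor pairs over 1,000 repetitions.
example_stat, example_df = chi_square_uniform([500, 300, 200])
example_p = p_value_df2(example_stat)

# Largest possible statistic: every repetition yields the same first pair.
max_stat, _ = chi_square_uniform([1000, 0, 0])
```

Under this reading, the statistic cannot exceed 2,000 for 1,000 repetitions over three pairs, which may help when sanity-checking reported χ2 values.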
Electrophysiological data were analyzed using MATLAB 2020a in conjunction with toolboxes, including EEGLAB [47]. MATLAB was used for statistical analysis.
To assess if the four odors exhibited differences in categories compared to prior studies [42, 43, 45, 48], we initially conducted a category survey, cumulatively calculating participants’ scores in each category. For FO1, the categories of herbs (84), flowers (48), and green (31) were highly scored. For FO2, the categories of herbs (67), flowers (48), and citrus (43) were highly scored. The categories of flowers and herbs were commonly scored highly in FO1 and FO2. On the other hand, for CO1, the categories of coffee (66), sweet (35), and spicy (27) were highly scored. For CO2, nut (48), coffee (28), sweet (24), and root (24) were highly scored. The coffee and sweet categories were commonly highly scored in CO1 and CO2. The results showed that the four odors were principally grouped into two groups (Fig. 2a). Furthermore, to check whether participants could differentiate odors in different categories from those in similar categories, a discrimination test was conducted (Fig. 2b). The accuracy of choosing odors in different categories among the three odors was 81.8%, which was significantly higher than the accuracy of random choice (33.3%) (W [12]=91.0, p<0.001; Wilcoxon signed rank test).
To examine whether congruencies between videos and odors differed depending on the categories of odors, we compared the congruency of odors belonging to similar categories and different categories with videos. Congruencies between the FV and odors differed depending on the categories of odors (H[2|37]=8.51, p=0.0142, ε²=0.18; Kruskal-Wallis test) (Fig. 3a). The congruency of CO1 was significantly lower than that of FO2 (p=0.02, Dunn’s multiple comparison test). The congruency of CO1 showed a tendency to be lower than that of FO1 (p=0.06, Dunn’s multiple comparison test). The congruency of FO1 was not significantly different from that of FO2. Congruencies between the CV and odors differed depending on the categories of odors (H[2|37]=19.01, p<0.001, ε²=0.46; Kruskal-Wallis test) (Fig. 3b). The congruencies of CO1 (p<0.001, Dunn’s multiple comparison test) and CO2 (Z=0.03, p=0.03, Dunn’s multiple comparison test) were significantly higher than those of FO1. The congruency of CO1 was not significantly different from that of CO2. The intensity scores of the odors in the FV (H[2|37]=0.15, p=0.9255, ε²=-0.05; Kruskal-Wallis test) (Fig. S1a) and CV (H[2|37]=1.99, p=0.3690, ε²=-0.00; Kruskal-Wallis test) (Fig. S1b) were not significantly different. The pleasantness scores of the odors in the FV (H[2|37]=3.35, p=0.1871, ε²=0.04; Kruskal-Wallis test) (Fig. S1c) were not significantly different. The pleasantness scores of the odors in the CV (H[2|37]=5.95, p=0.0511, ε²=0.11; Kruskal-Wallis test) (Fig. S1d) showed a tendency to be different.
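The reported effect sizes can be checked with simple arithmetic. The formula below is inferred rather than stated in the text: with k=3 groups and n=40 participants (so n-k=37, matching the second number in H[2|37]), the expression (H - k + 1)/(n - k) reproduces the reported ε² values after rounding, including the slightly negative ones. The function name is hypothetical.

```python
def effect_size_from_h(h, n_total=40, n_groups=3):
    """Effect size from a Kruskal-Wallis H as (H - k + 1) / (n - k).

    Inferred from the reported numbers; defaults assume 40 participants
    split into 3 groups, as described for the EEG and congruency tests.
    """
    return (h - n_groups + 1) / (n_total - n_groups)


# Reported H values paired with the reported effect sizes.
reported = [
    (8.51, 0.18),   # congruency, FV
    (19.01, 0.46),  # congruency, CV
    (0.15, -0.05),  # intensity, FV
    (3.35, 0.04),   # pleasantness, FV
    (5.95, 0.11),   # pleasantness, CV
]
recomputed = [round(effect_size_from_h(h), 2) for h, _ in reported]
```

The small negative values arise because H can fall below k-1 when group differences are smaller than expected by chance.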
To examine whether the ERSP data of the videos with odors are clustered according to the categories of odors, as in the congruency test, each frequency band of the ERSP data of three odors in each video was clustered by AHC. In the representative figure of ERSP (Fig. 4), the x-axis represents time, and the y-axis represents frequency. In the FV, the clustering number in all frequency bands was significantly different depending on the categories of odors in the “within group condition,” but not in the “random sampling group condition” (Table 1; delta: χ2[2]=312.02, p<0.0001, theta: χ2[2]=746.61, p<0.0001, alpha: χ2[2]=2000.03, p<0.0001, beta: χ2[2]=1474.62, p<0.0001, gamma: χ2[2]=1452.28, p<0.0001). In the delta band, the clustering number of the FO2–CO1 pair was above the random level (333.33) and greater than that of the FO1–FO2 pair (p<0.001, Bonferroni test) and the FO1–CO1 pair (p<0.001, Bonferroni test). The clustering number of FO1–FO2 was also above the random level (333.33) and greater than that of FO1–CO1 (p<0.001, Bonferroni test). In the theta band, the clustering number of the FO1–FO2 pair was greater than that of the FO1–CO1 pair (p<0.001, Bonferroni test) and the FO2–CO1 pair (p<0.001, Bonferroni test). In the alpha band, the clustering number of the FO1–CO1 pair was greater than that of the FO1–FO2 pair (p<0.001, Bonferroni test) and the FO2–CO1 pair (p<0.001, Bonferroni test). In the beta band, the clustering number of the FO1–CO1 pair was also greater than that of the FO1–FO2 pair (p<0.001, Bonferroni test) and the FO2–CO1 pair (p<0.001, Bonferroni test). In the gamma band, the clustering number of the FO2–CO1 pair was greater than that of the FO1–FO2 pair (p<0.001, Bonferroni test) and the FO1–CO1 pair (p<0.001, Bonferroni test).
Like the FV, in the CV, the clustering number in all frequency bands depending on the categories of odors was significantly different only in the “within group condition” (Table 2; delta: χ2[2]=51.21, p<0.0001; theta: χ2[2]=1047.92, p<0.0001; alpha: χ2[2]=167.78, p<0.0001; beta: χ2[2]=65.40, p<0.0001; gamma: χ2[2]=1882.45, p<0.0001). In the delta, theta, and alpha bands, the clustering number of the CO1–CO2 pair was greater than that of the CO1–FO1 pair (delta: p<0.001, Bonferroni test; theta: p<0.001, Bonferroni test; alpha: p<0.001, Bonferroni test) and the CO2–FO1 pair (delta: p<0.001, Bonferroni test; theta: p<0.001, Bonferroni test; alpha: p<0.001, Bonferroni test). The clustering number of the CO2–FO1 pair in the beta band was above the random level (333.33) and greater than that of the CO1–CO2 pair (p<0.001, Bonferroni test) and the CO1–FO1 pair (p=0.028, Bonferroni test). The clustering number of the CO1–FO1 pair in the beta band was also above the random level (333.33) and greater than that of the CO1–CO2 pair (p<0.001, Bonferroni test). In the gamma band, the clustering number of the CO1–FO1 pair was greater than that of the CO1–CO2 pair (p<0.001, Bonferroni test) and the CO2–FO1 pair (p<0.001, Bonferroni test). These results suggest that some frequency bands of the ERSP data could be clustered depending on the categories of odors in accordance with the results of the congruency test. In the case of FV, the ERSP data of the delta and theta bands were clustered as FO1–FO2 pairs. In the case of CV, the ERSP data of the delta, theta, and alpha bands were clustered as CO1–CO2 pairs. Similar patterns were observed in the K-means clustering results (Table S1, S2). ERSP data of delta and theta bands were clustered as FO1–FO2 pairs in FV, and ERSP data of delta, theta, and alpha bands were clustered as CO1–CO2 pairs in CV.
We also conducted an SVM analysis to check whether it could show better results than the clustering analyses (Table S3, S4). In the case of FV (Table S3), FO2 was classified as FO1 with 64.4%, 78.7%, and 70.3% accuracy in the delta, theta, and gamma bands. In the case of CV (Table S4), CO2 was classified as CO1 with 65.4%, 79.5%, 68.7%, and 96.5% accuracy in the delta, theta, alpha, and beta bands. However, the average accuracy of the model based on 5-fold cross-validation was generally near random chance levels (50%). Therefore, the results of the SVM were not reliable.
We found that the congruency between the videos and odors in similar categories was higher than that between the videos and odors in different categories. We chose lavender oil, geraniol, 2-furanmethanethiol, and 2-ethyl-3,5-dimethylpyrazine. We confirmed that the participants principally grouped these odors into two categories: flower and coffee (Fig. 2). This result is in line with those of previous odor studies [42-45]. Even though the odors within the flower category and within the coffee category shared at least two categories, they were not the same odors (Fig. 2a). However, odors in the flower category were more similar to each other than to odors in the coffee category, and vice versa (Fig. 2b). This similarity based on the categories of odors might be reflected in the congruency between the videos and odors (Fig. 3). In addition, among the 27 participants who were presented with matchable sets of stimuli, including videos and odors (FO1 and FO2 with FV, and CO1 and CO2 with CV), only one participant showed low congruency in both videos. Participants who were unable to recognize FO1 and FO2 as flowers, and CO1 and CO2 as coffee, showed low congruency between the videos and odors. In line with previous multimedia studies, we used EEG to measure viewers' neuronal responses [13, 27, 31]. ERSP analysis was used for the frequency bands. In our results, the ERSP data of odors with videos were clustered according to the similarity of the categories of odors (Table 1, 2). Consistent with our congruency result, ERSP data were clustered by AHC when odors from similar categories were matched with the videos. Especially in the delta and theta bands, the ERSP data of odors of similar categories in both videos were principally clustered. However, the alpha, beta, and gamma bands did not cluster depending on the odors of similar categories. All the results suggest that viewers’ responses to multimedia could be similarly induced even after replacing the odors with other odors in similar categories.
Based on our results and methods, we suggest a system that can find odors to serve as representatives or replacements, reducing the number of odors needed for multimedia.
Our results suggest that selecting odors based on the categories of odors is possible. Matching odors with objects based on categories has been used in multimedia and the film industry [1, 13, 49]. However, there is little evidence that viewers’ responses to multimedia presented with a given odor can be generalized to their responses to multimedia with other odors when they belong to similar categories. Previous studies that selected odors for multimedia by odor categories (Table 3) did not examine viewers’ responses to multimedia with several odors within similar categories. Furthermore, most of the studies in Table 3 measured behavioral responses. Our study is the first to show that odors can induce congruency and neuronal responses in multimedia viewers similar to those induced by other odors of similar categories. In addition, a previous study observed neuronal responses to multimedia with an odor, in which the delta, theta, and alpha bands changed [13]. These bands were also crucial in our study because delta and theta bands were clustered depending on the odor categories (Table 1, 2). The delta band could be clustered depending on the category of odors in both videos. Previous studies [13, 31] suggest that the delta band is related to discriminating emotional responses while watching videos. In addition, the theta band predominantly clustered among these bands. According to previous olfactory studies [33, 34, 50], the theta band could include information on olfactory stimulation and be generated in the PC of the human brain. The PC is known as one of the major areas that perform olfactory processing in the brain and is related to categorizing odors [17-20]. Therefore, the theta band in our results might represent the similarity of odors in similar categories.
However, some results did not align with odor categories. The alpha band in the CV clustered depending on the similarity of odor categories, unlike in the FV (Tables 1, 2). Previous studies related to the alpha band [51-54] suggest its association with the pleasantness of stimuli. In our study, there was a tendency for the pleasantness of odors to differ in the CV but not in the FV (Fig. S1d). Hence, differences in the pleasantness of odors may have influenced our alpha band results. Furthermore, in the beta and gamma bands, a large number of closest pairs were unrelated to the categories of odors in both videos. Since the clustering pattern of FO1–CO1 in the FV and CO1–FO1 in the CV differed (Tables 1, 2), odors might not be involved in the clustering patterns of the beta and gamma bands. The beta band is also known to be related to emotional responses such as arousal and stress [24, 55-58]. One explanation could be that the odors offered to viewers were different, even though they belonged to similar categories. These variations among odors could induce different emotional responses in viewers. Regarding the gamma band, a previous odor study without multimedia observed that the gamma band clustered depending on the similarity of odors [21]. One possible reason for this inconsistency is that we offered odors for 5 seconds and analyzed 10 seconds of data, whereas gamma oscillation was transient within 1 second in the previous study [21]. This temporal difference might affect the clustering results of the gamma band. Additionally, the gamma band is related to respiration [59]. Although odors were offered at the beginning of the videos, participants’ respiration might not have synchronized precisely with the onset of odors, potentially introducing variability in the gamma band in our results.
Several factors merit consideration when interpreting results related to odor categories. Our study differentiates between specific objects and categories of objects, recognizing that the criteria for a specific object and its category are not absolute but relative due to potential differences in the level of categorization. This implies that words used for object categories could also function as specific objects depending on their contextual relationship. For instance, ‘flower’ serves as the category of object in our study. However, ‘flower’ can be a specific object in relation to the broader category of ‘plant.’ The similarity among specific objects at lower levels of categorization (e.g., flowers) generally tends to be higher than that among specific objects at higher levels of categorization (e.g., plants) [60]. Despite this, clear standards for specific objects and their categories remain elusive. While our analysis revealed statistically significant EEG patterns across groups, we cannot fully exclude the potential influence of physiological noise, such as EMG, EOG, and ECG signals, due to the limitations of the preprocessing methods applied. Although independent component analysis (ICA) was considered, its application to single-trial data has been shown to be unreliable in distinguishing between brain signals and noise [47, 61]. Applying ICA could therefore remove not only artifacts but also genuine EEG signals, making it unsuitable for our data. Consequently, the presence of physiological artifacts may have influenced the observed patterns to some extent. Additionally, there may be major and minor odors within a given category. In our data, 2-furanmethanethiol exhibited the highest congruency (88%) when matched with the CV and displayed a higher congruency than CO2 (2-ethyl-3,5-dimethylpyrazine), though the difference was not statistically significant. Notably, 2-furanmethanethiol is recognized as a crucial aroma component of fresh coffee [44], which potentially influenced our congruency and neuronal response results. However, delving into major and minor odors within a category extends beyond the scope of our study and warrants further investigation.
In summary, our study primarily aimed to investigate whether odors belonging to similar categories elicit more comparable behavioral and neuronal responses than odors from different categories. We observed that viewers’ behavioral and neuronal responses to multimedia were more alike across odors in similar categories than across odors in different categories. This finding offers insights into a potential matching method for integrating odors with objects in multimedia. Based on the results and methods we used, we suggested a system that can find odors as representatives or replacers to reduce the number of odors for multimedia. Therefore, our study could help conserve effort and resources in selecting odors for multimedia.
This work was supported by the Priority Research Centers Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (2020R1A6A1A03040516).
KK, JB, and CM conceptualized and designed the work; KK and SM conducted the experiments. KK, JL, and JB analyzed and interpreted the data as well as drafted the article and figures; SM, SL, WK, and CM contributed critical revisions of the article; CM supervised all experiments and analyses.
The authors declare no competing interests.
The datasets generated or analyzed during the current study are available from the corresponding author upon reasonable request.
Hierarchical clustering of closest odor pairs in each frequency band depending on categories of odors in each video
Video | Condition | Frequency band | Closest pairs: FO1 – FO2 | Closest pairs: FO1 – CO1 | Closest pairs: FO2 – CO1 | Closest pairs: total | Cophenetic correlation | χ² value |
---|---|---|---|---|---|---|---|---|
Flower | Within group | Delta | 436 | 72 | 492 | 1,000 | 0.79 | 312.02*** |
Flower | Within group | Theta | 718 | 25 | 257 | 1,000 | 0.79 | 746.61*** |
Flower | Within group | Alpha | 0 | 1,000 | 0 | 1,000 | 0.95 | 2,000.03*** |
Flower | Within group | Beta | 9 | 904 | 87 | 1,000 | 0.82 | 1,474.62*** |
Flower | Within group | Gamma | 5 | 96 | 899 | 1,000 | 0.78 | 1,452.28*** |
Flower | Random sampling group | Delta | 311 | 357 | 332 | 1,000 | 0.82 | 3.18 (n.s.) |
Flower | Random sampling group | Theta | 349 | 304 | 347 | 1,000 | 0.81 | 3.88 (n.s.) |
Flower | Random sampling group | Alpha | 337 | 328 | 335 | 1,000 | 0.84 | 0.13 (n.s.) |
Flower | Random sampling group | Beta | 338 | 327 | 335 | 1,000 | 0.84 | 0.19 (n.s.) |
Flower | Random sampling group | Gamma | 334 | 334 | 332 | 1,000 | 0.89 | 0.01 (n.s.) |
Clustering counts of the nearest odor pairs in the FV across the five frequency bands. Hierarchical clustering was repeated 1,000 times, and the nearest odor pair was tallied in each repetition. A chi-square test was performed on the counts in this table. n.s., not significant; *p value<0.05, **p value<0.01, ***p value<0.001.
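As an illustration of how the chi-square values in the table can be read, the following minimal Python sketch computes a goodness-of-fit statistic for three rows of the FV counts. It assumes, as a reading of the table rather than something stated in the text, that the null hypothesis is an equal split of the 1,000 repetitions across the three candidate pairs (two degrees of freedom). The function names `chi_square_uniform` and `p_value_df2` are hypothetical helpers and are not part of the original analysis.

```python
import math


def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against equal expected counts.

    Assumption: the null hypothesis is a uniform split across the pairs.
    """
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def p_value_df2(chi2):
    """Upper-tail p-value of a chi-square distribution with 2 degrees of freedom."""
    # For df = 2 the survival function has the closed form exp(-x / 2).
    return math.exp(-chi2 / 2.0)


# Counts of closest odor pairs (FO1-FO2, FO1-CO1, FO2-CO1) copied from the FV table.
fv_counts = {
    "delta (within group)": (436, 72, 492),
    "theta (within group)": (718, 25, 257),
    "delta (random sampling)": (311, 357, 332),
}

results = {name: chi_square_uniform(c) for name, c in fv_counts.items()}
for name, chi2 in results.items():
    print(f"{name}: chi2 = {chi2:.2f}, p = {p_value_df2(chi2):.3g}")
```

Under this assumption, the statistics for these three rows come out within a few hundredths of the tabulated values, and the random sampling row stays above the 0.05 threshold.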
Hierarchical clustering of closest odor pairs in each frequency band depending on categories of odors in each video
Video | Condition | Frequency band | Closest pairs: CO1 – CO2 | Closest pairs: CO1 – FO1 | Closest pairs: CO2 – FO1 | Closest pairs: total | Cophenetic correlation | χ² value |
---|---|---|---|---|---|---|---|---|
Coffee | Within group | Delta | 440 | 281 | 279 | 1,000 | 0.83 | 51.21*** |
Coffee | Within group | Theta | 812 | 147 | 41 | 1,000 | 0.85 | 1,047.92*** |
Coffee | Within group | Alpha | 514 | 302 | 184 | 1,000 | 0.83 | 167.78*** |
Coffee | Within group | Beta | 216 | 368 | 416 | 1,000 | 0.84 | 65.40*** |
Coffee | Within group | Gamma | 0 | 980 | 20 | 1,000 | 0.89 | 1,882.43*** |
Coffee | Random sampling group | Delta | 312 | 332 | 356 | 1,000 | 0.82 | 0.06 (n.s.) |
Coffee | Random sampling group | Theta | 336 | 318 | 346 | 1,000 | 0.83 | 0.01 (n.s.) |
Coffee | Random sampling group | Alpha | 346 | 318 | 336 | 1,000 | 0.84 | 5.70 (n.s.) |
Coffee | Random sampling group | Beta | 331 | 326 | 343 | 1,000 | 0.83 | 0.66 (n.s.) |
Coffee | Random sampling group | Gamma | 339 | 327 | 334 | 1,000 | 0.85 | 0.70 (n.s.) |
Clustering counts of the nearest odor pairs in the CV across the five frequency bands. Hierarchical clustering was repeated 1,000 times, and the nearest odor pair was tallied in each repetition. A chi-square test was performed on the counts in this table. n.s., not significant; *p value<0.05, **p value<0.01, ***p value<0.001.
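The tallying of nearest odor pairs over repeated clustering runs can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: it treats each odor condition as a set of feature vectors, draws a random subset per repetition, averages it, and records which two conditions lie closest in Euclidean distance. With three conditions, the closest pair is the first merge that agglomerative hierarchical clustering would make. The synthetic data, the subset size, the distance metric, and the helper names (`mean_vector`, `closest_pair`, `tally_closest_pairs`, `make_condition`) are all hypothetical.

```python
import itertools
import math
import random
from collections import Counter


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def closest_pair(centroids):
    """Return the pair of labels whose centroids are nearest (Euclidean)."""
    return min(
        itertools.combinations(sorted(centroids), 2),
        key=lambda pair: math.dist(centroids[pair[0]], centroids[pair[1]]),
    )


def tally_closest_pairs(data, n_reps=1000, sample_size=10, seed=0):
    """Repeat subsampling n_reps times and count the closest pair each time."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_reps):
        centroids = {
            label: mean_vector(rng.sample(rows, sample_size))
            for label, rows in data.items()
        }
        counts[closest_pair(centroids)] += 1
    return counts


def make_condition(rng, center, n_rows=30, n_features=4, spread=1.0):
    """Synthetic stand-in for per-trial band-power features (assumption)."""
    return [
        [rng.gauss(center, spread) for _ in range(n_features)]
        for _ in range(n_rows)
    ]


gen = random.Random(42)
data = {
    "FO1": make_condition(gen, 0.0),
    "FO2": make_condition(gen, 0.2),  # deliberately close to FO1
    "CO1": make_condition(gen, 3.0),  # deliberately far from both
}
counts = tally_closest_pairs(data)
for pair, n in counts.most_common():
    print(pair, n)
```

In this synthetic setup the two conditions generated around nearby centers dominate the tally, which mirrors the kind of pattern the tables summarize for the delta and theta bands.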
Comparison between previous multimedia studies with odors and our study
Ref | Stimuli | Odor selection method | Number of odors in identical categories | Features |
---|---|---|---|---|
[14] | Traditional multimedia + olfaction | Selecting odors depending on odor categories and scenes | 1 | Behavioral response |
[10] | Traditional multimedia + olfaction | Selecting odors depending on odor categories and scenes | 1 | Behavioral response |
[2] | Traditional multimedia + olfaction | Selecting odors depending on odor categories and scenes | 1 | Behavioral response |
[13] | Traditional multimedia + olfaction | Selecting odors depending on odor categories and scenes | 1 | Behavioral response / EEG |
Our study | Traditional multimedia + olfaction | Selecting odors depending on odor categories and scenes | 2 | Behavioral response / EEG |
Previous multimedia studies with odors were compared to our study in terms of stimuli, odor selection method, number of odors in identical categories, and features.