Partners

With support from our partners

About the Challenge

Leveraging Open Data Analytics and Machine Learning to Improve Mental Health Research and Innovation (IUBDC 2024)

The University Big Data Challenge (IUBDC) successfully concluded its latest edition, bringing together national health industry leaders, policymakers, and dignitaries for a conference centered on mental health research and innovation. Participants had the opportunity to showcase their projects in a poster session, engaging with representatives from industry and academia. The event provided a rich network of industry and academic professionals, offering pathways for scholarly publication, access to curated datasets, learning materials, and workshops.

Held at the Microsoft Headquarters in Toronto on July 26, 2024, the challenge featured hands-on data science and AI research projects focused on mental health. For over seven years, IUBDC has fostered collaboration between data science and health professionals and motivated university students. This year’s projects, which spanned topics from dementia to addictions, aimed to enhance our understanding of mental health, improve patient lifestyles, and reduce disease risk. Abstracts from teams were published in the NRC Research Press STEM Fellowship Journal, with finalists’ works featured by Underline publications.

How it Works

Teams of up to 5 students are each provided with datasets, workshops, learning resources, and tools for data analysis. We recommend forming interdisciplinary teams to make the most of this experience. Teams will present their research findings in the form of scientific manuscripts at the culminating finale event, competing for monetary and academic prizes. The abstracts of all participating teams will be published in the open access, peer-reviewed NRC Research Press STEM Fellowship Journal. Finalists’ videos and manuscripts will be published with Underline. We also facilitate preprint opportunities for participants with JMIR Publications.

Finalists

Exploring the Intersection of Autism Spectrum Disorder and Mental Health Through a Mixed-Method Analysis

Edson Kenzo Takei, Aleeya Irshad, Sidney Liu, Zainab Ansari, Aboud Jalal

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that affects approximately 1 in every 54 children and youth in Canada. Individuals diagnosed with ASD may experience challenges in communication and behaviour, which may vary depending on the individual. Such challenges are often coupled with comorbidities like mental health conditions, which are more prominent in individuals on the spectrum. As such, the detection of early signs of mental health issues and the proper diagnosis of ASD are crucial. This project explores the overlap between ASD and mental health conditions by utilizing a mixed-methods analysis. By analyzing survey-based and omics data on ASD, the authors hope to conduct a comprehensive examination of Autism Spectrum Disorder from both a psychological and biological perspective. For the first part of this project, questionnaire-based datasets from Thabtah F. et al. on individuals sharing ASD traits were retrieved for analysis. The datasets consist of 10 questions related to ASD traits and demographic information across developmental stages. The concurrence of ASD and mental health conditions will be analyzed through machine learning, data analysis, and visualization. Subsequent stages will involve omics data to deepen our understanding of ASD.

Improving Prediction Adherence to Mental Health Treatment Programs Using Machine Learning Algorithms

Priyanka Kothapalli, Keerthana Amalanathan Valarmathi, Sai Kiran Pilli, Shreya Gopikrishnan Nair, Sharath Mohan

Substance treatment programs are essential for addressing substance addiction issues. Several considerations may explain why completion rates are lower than predicted. Previous studies have emphasized the impact of demographic and socioeconomic factors on treatment outcomes. However, there is insufficient research on utilizing advanced machine learning algorithms to predict adherence and discover trends in large datasets, particularly considering mental health concerns. The dataset was obtained from SAMHSA which contains extensive information on demographics, geography, education, employment, drug history, mental health status, and treatment outcomes. The study uses sophisticated machine learning algorithms to predict adherence to drug therapies, employing Random Forest and XGBoost models to identify significant characteristics influencing patient treatment completion. These models are quite accurate and could improve treatment outcomes through personalized methods. We investigated Random Forest and XGBoost models using cross-validation and several metrics such as accuracy, precision, recall, F1-score, and AUC-ROC. The Random Forest model achieved an accuracy of 85%, with a precision of 80%, recall of 82%, an F1-score of 81%, and an AUC-ROC score of 0.88. The XGBoost model performed even better, with an accuracy of 87%, precision of 82%, recall of 84%, an F1-score of 83%, and an AUC-ROC score of 0.90. Key predictors of treatment adherence included age, marital status, employment status, mental health status, and the number of previous treatment episodes. We also observed significant disparities in adherence rates among ethnic minorities and individuals with lower educational levels. The research shows that machine learning algorithms may accurately predict adherence to drug treatment programs. The identified predictors provide useful information for creating focused, individualized treatment methods. 
Future studies should use longitudinal data to evaluate post-treatment results and investigate other socioeconomic and psychological aspects. Healthcare practitioners may employ machine learning to enhance intervention efficacy, improve patient outcomes, and reduce the social impact of substance misuse.
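The F1-scores reported above can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch of that arithmetic only; the function and variable names are illustrative and are not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
reported = {
    "Random Forest": (0.80, 0.82),
    "XGBoost": (0.82, 0.84),
}

f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}

for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.2f}")
# Random Forest: F1 = 0.81
# XGBoost: F1 = 0.83
```

Both values agree with the F1-scores stated in the abstract (81% and 83%).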

Social Media: The Influencer of Mental Health

Yeonjae Oh, Milanda Luo, Ying Wa Ng

The Internet provides a platform for individuals to share their thoughts and interact with others, enabling a dynamic exchange of information. In fact, one’s online activities can reflect one’s mental state, moreover, it may even influence other’s mental state by their digital interactions. In this paper, we aim to leverage machine learning to monitor societal mental health by tracking users’ online activities. Using social media as a case study, we employ sentiment analysis to evaluate comments and assess users’ online behaviors. Our analysis focuses on the frequent words used, the tone of the comments, and their impact on the audience. The NLP model is designed to monitor societal mental health trend by examining the correlation between online speech and mental health. This enables us to intervene in the content that negatively impacts social mental health, fostering constructive discussions and solutions. Our primary objective is to see a strong connection between hate speech and mental health, demonstrating its influence on the targeted individuals and the surrounding audience. However, our model has limitations. Social media users, particularly those on platforms like YouTube, represent only a small fraction of the overall internet population. Additionally, our model is designed to analyze English-language content, which excludes non-English-speaking internet users.

Quantifying Childhood Trauma: Causal Machine Learning Approaches to Mental Health Outcomes

Christopher Ewanik, Rehman Tariq, Zaid Ahmed, Abeer Ahmed

We utilized the Behavioral Risk Factor Surveillance System (BRFSS) dataset for the following observational machine learning (ML) study. In contrast to recent literature using ML to predict specific outcomes, we used causal machine learning (CML) algorithms (LRSRegressors, DRLearner, Rlearner, Xregressor, and Tregressor) to quantify the cause-and-effect relationships between adverse childhood experiences (ACEs) and two different mental health targets (number of bad mental health days and whether the respondent had a depressive disorder). Specifically, we report each ACE’s average treatment effect (ATE) on each target. We also used Uplift Random Forest trees to calculate uplift scores. We found that growing up in a household where one parent had a depressive disorder and being forced into unwanted sex had the most significant effects on our chosen mental health targets. Growing up in a household where one parent had a depressive disorder increased the likelihood of being diagnosed with a depressive disorder by 30% and added 5.07 days of bad mental health in the last month. Repeatedly being forced to have sex added 5.77 days of poor mental health and increased the likelihood of being diagnosed with a depressive disorder by 33%. Uplift modeling indicated that individuals over 35 years of age earning under $100,000 USD annually were most susceptible to the effects of a parent diagnosed with depression by a factor of 1.1 compared to the average population. Male college graduates over 44 years of age earning under $100,000 USD were found to be most susceptible to the effects of childhood sexual abuse by a factor of 1.38. Future research should refine these models and employ more complicated algorithms to gain accurate, interpretable measurements and to understand the relationship between ACEs and mental health.
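To illustrate what an average treatment effect (ATE) estimate from a T-learner looks like, here is a minimal sketch on synthetic data. It is not the authors' pipeline: the records, the strata, and the use of per-stratum means as the "model" are all assumptions made for illustration, and a real analysis would use a proper regression model and the BRFSS variables.

```python
from collections import defaultdict
from statistics import mean

# Entirely synthetic toy records: (stratum, treated, outcome).
# "treated" stands in for exposure to one ACE; "outcome" for bad mental health days.
records = [
    ("young", 1, 9.0), ("young", 1, 11.0), ("young", 0, 4.0), ("young", 0, 6.0),
    ("older", 1, 7.0), ("older", 1, 9.0), ("older", 0, 3.0), ("older", 0, 5.0),
]


def fit_group_means(rows):
    """Toy 'model': predict the mean outcome observed in each stratum."""
    by_stratum = defaultdict(list)
    for stratum, _, outcome in rows:
        by_stratum[stratum].append(outcome)
    return {stratum: mean(values) for stratum, values in by_stratum.items()}


def t_learner_ate(rows):
    """T-learner: fit separate outcome models for treated and control units,
    predict both potential outcomes for every unit, and average the difference.
    Assumes every stratum contains both treated and control units."""
    treated_model = fit_group_means([r for r in rows if r[1] == 1])
    control_model = fit_group_means([r for r in rows if r[1] == 0])
    effects = [treated_model[s] - control_model[s] for s, _, _ in rows]
    return mean(effects)


ate = t_learner_ate(records)
print(ate)  # 4.5
```

In this toy data the estimated effect is 5 days in the "young" stratum and 4 days in the "older" stratum, so the population average is 4.5 days.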

Utilizing Deep Learning Techniques for Mental Disorder Prediction and Support Using Reddit Data

Arshdeep Kaur, Pratheepan Gunaratnam, Jungyu Lee, Alejandro Akifarry, Sirada Thoungvitayasutee

The increasing prevalence of mental health disorders necessitates innovative approaches to early detection and intervention. Social media platforms like Reddit provide a rich source of textual data that can be leveraged to identify mental health issues based on user-generated content. This study aims to classify mental health disorders by analyzing Reddit comments. The dataset used is the Reddit SuicideWatch and Mental Health Collection, which includes 54,412 posts classified into several mental health disorders. Preprocessing steps included tokenization, padding, word embeddings, and handling imbalanced data. A Custom-CNN (Convolutional Neural Network) and a Custom-RNN (Recurrent Neural Network) were then used to classify the data. The models were evaluated using accuracy, precision, recall, F1-score, and ROC AUC score. Notably, the RNN achieved a higher accuracy of 0.65 compared to 0.64 for the CNN. Both the RNN and the CNN also have high ROC AUC scores of 0.88 and 0.87, respectively. In conclusion, the results show that the models are good at differentiating one class from another; however, both models have problems accurately classifying the correct mental health disorder from the texts. This shows promise, as the models can likely be improved with better and more balanced data, and exploring LLMs (Large Language Models) could also be valuable for this study. These improvements could also be applied to intervention by integrating the models into technology such as conversational AI or wearable actuators that can predict and prevent through real-time monitoring.
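The tokenization and padding steps mentioned above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' preprocessing: the regular expression, the reserved ids, the example posts, and the maximum length are all hypothetical choices.

```python
import re

PAD, UNK = 0, 1  # reserved ids for padding and out-of-vocabulary tokens


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts):
    """Assign each distinct token an integer id, starting after the reserved ids."""
    vocab = {}
    for text in texts:
        for token in tokenize(text):
            if token not in vocab:
                vocab[token] = len(vocab) + 2
    return vocab


def encode(text, vocab, max_len):
    """Map tokens to ids, truncate to max_len, and right-pad with PAD."""
    ids = [vocab.get(token, UNK) for token in tokenize(text)][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Hypothetical example posts, not taken from the dataset.
posts = ["I can't sleep at night", "Feeling better today"]
vocab = build_vocab(posts)
encoded = [encode(post, vocab, max_len=6) for post in posts]
print(encoded)  # [[2, 3, 4, 5, 6, 0], [7, 8, 9, 0, 0, 0]]
```

Fixed-length integer sequences like these are what an embedding layer in a CNN or RNN typically consumes.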

Assessment of Depression Symptoms and Severity Levels in a Clinical Sample: A Comprehensive Analysis

Jignesh Bejjagam, Anil Sah Baniya, Sithara Sekar, Krisha Lungeli Magar, Andres Viana

Depression is a complex mental health disorder characterized by a variety of symptoms that impact an individual’s emotional and physical well-being. Accurate assessment and classification of depression states are crucial for effective treatment and management. This study aims to explore the relationship between various psychological and behavioral features and the state of depression in individuals. Methods: We utilized a dataset containing 813 entries, each recording levels of Sleep, Appetite, Interest, Fatigue, Worthlessness, Concentration, Agitation, Suicidal Ideation, Sleep Disturbance, Aggression, Panic Attacks, Hopelessness, Restlessness, and Low Energy, along with a corresponding Depression State (categorized as Mild, Moderate, Severe, or No depression). Due to missing values in the dataset, appropriate data cleaning techniques will be applied to handle NaNs. Statistical analysis and machine learning algorithms will be employed to identify patterns and predict depression states based on the given features. Techniques such as logistic regression, decision trees, and support vector machines will be considered for classification tasks. Results: Preliminary analysis indicates variability in the severity of depression symptoms among the different depression states. For instance, higher scores in Fatigue, Worthlessness, Agitation, Suicidal Ideation, Sleep Disturbance, Aggression, Panic Attacks, Hopelessness, and Restlessness are observed in more severe depression states. Conversely, lower scores in these features are associated with milder forms of depression or no depression. The final results will include the accuracy of different predictive models and the significance of each feature in determining depression states. Conclusions: This study seeks to contribute to the understanding of how specific psychological and behavioral factors correlate with varying levels of depression. 
By developing a reliable predictive model, healthcare professionals could potentially identify and categorize depression states more effectively, thereby facilitating timely and personalized treatment interventions. Further research and refinement of the models will be necessary to enhance predictive accuracy and clinical applicability.

Advanced Machine Learning Techniques for Early Detection and Classification of Depression and Suicidal Tendencies on Social Media

Guruprasad Tandlekar

Depression and suicidal thoughts are significant mental health concerns frequently expressed on social media platforms like Reddit. This study uses advanced machine learning techniques to analyze and classify posts from the Reddit dataset to identify and categorize high-risk content. By employing Latent Dirichlet Allocation (LDA) for topic modeling, we identify common themes in the posts, such as expressions of despair and cries for help [1]. We then apply clustering algorithms to group similar posts, uncovering underlying patterns and providing insights into frequent triggers and shared experiences [2]. We use classification models to accurately categorize posts and detect those indicating severe distress, including Logistic Regression, Support Vector Machines (SVM), and Neural Networks [3]. This methodology is implemented in a web application designed to provide real-time categorization of new posts, aiding mental health professionals and social media moderators in quickly identifying and responding to high-risk content. The application invites users to interact, processing their inputs via a text box. Visualization tools, such as 2D and 3D scatter plots, clearly display the data, allowing users to understand how posts are categorized and fit into broader trends [4]. This research improves our understanding of how mental health issues are expressed on social media, highlighting the importance of advanced data analytics in creating practical intervention tools. Our approach demonstrates the potential of combining psychological assessment with sophisticated computational methods to develop a comprehensive system that benefits research and individual users.

Model Based Severity Analysis of Mental Health Issues Using NLP and Sentiment Analysis On Social Media Data (SENTI-SEV)

Huu Duc Ngo, Ghizlane Ez-Zarrad, Kit Hung Woo, Muhammad Shahzaib Vohra, Surendrapalsingh Jhiout

Mental health issues are increasingly being addressed through machine learning techniques, which offer new possibilities for analyzing and understanding psychological conditions. This study focuses on detecting depression severity using advanced natural language processing (NLP) and sentiment analysis algorithms. Leveraging the capabilities of machine learning, we aim to provide a robust framework for assessing mental health by analyzing social media data. Our approach involves the use of three distinct models: EMOTION ENGLISH DISTILROBERTA-BASE for extracting emotional features, MULTIWD for assessing various dimensions of wellness, and VADER for sentiment analysis. These models generate a comprehensive feature set, which is then used to compute depression severity, categorized into five levels (Minimal, Mild, Moderate, Moderately Severe, and Severe) based on the PHQ-9 questionnaire. To evaluate the effectiveness of our approach, we compared the calculated severity classes with results from Latent Dirichlet Allocation (LDA) for topic modeling and K-Means clustering for numerical feature-based classification. Our findings indicate that K-Means clustering achieved a 76% accuracy in aligning severity classes, which surpasses the LDA accuracy of 63%. This suggests that numerical features may offer more reliable clustering for mental health severity than topic-based methods. This study highlights the potential of combining NLP and sentiment analysis techniques in mental health assessments and suggests directions for further refinement and research.
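For reference, the five severity levels named above correspond to the standard PHQ-9 total-score bands (nine items, each scored 0 to 3). The sketch below shows only that standard questionnaire mapping; the abstract does not describe how the model-derived features are converted into a score, so the function names and example responses here are illustrative assumptions.

```python
def phq9_total(item_scores):
    """Sum nine PHQ-9 item responses, each scored 0 to 3."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PHQ-9 expects nine item scores, each between 0 and 3")
    return sum(item_scores)


def phq9_severity(total_score):
    """Map a PHQ-9 total (0 to 27) to its standard severity band."""
    if not 0 <= total_score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    if total_score <= 4:
        return "Minimal"
    if total_score <= 9:
        return "Mild"
    if total_score <= 14:
        return "Moderate"
    if total_score <= 19:
        return "Moderately Severe"
    return "Severe"


# Hypothetical responses for illustration.
responses = [1, 2, 1, 2, 1, 1, 2, 1, 0]
total = phq9_total(responses)
print(total, phq9_severity(total))  # 11 Moderate
```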

Deciphering Depression: A Multivariate Analysis of Influential Factors and Their Relationships

Jyotsna Sehgal, Nguyen Quoc Phuc Nguyen, Sandip Poudel, Francis Neal Altares, Kinjal Jha

Depression, also known as major depressive disorder (MDD), is a common but serious medical illness that negatively affects how an individual feels, thinks, and acts. It can lead to many emotional and physical problems and can reduce a person’s ability to function at work and at home. The project employs a robust methodology to analyze the relationships between depression and various risk factors using the National Health and Nutrition Examination Survey (NHANES) dataset. The project involves data acquisition and preprocessing, depression score calculation using the PHQ-9 questionnaire, and technical infrastructure setup. Data aggregation, preprocessing, and exploratory data analysis (EDA) are performed using Python and its libraries. Hypotheses are formulated and tested using statistical methods, and machine learning techniques are applied for predictive modeling. Feature importance and hyperparameter tuning are used to improve the models. The project identifies key factors associated with depression, including socioeconomic features, alcohol and drug consumption, and mental health measures. Predictive models are being developed to predict depression levels based on multiple variables. A web interface with interactive dashboards is being designed to visualize key insights and model predictions. The findings have the potential to inform and shape public health policies and interventions. Successfully identifying and analyzing these complex associations will also provide better insight into mental health treatment, particularly for depression, which could ultimately improve the effectiveness of public health strategies and clinical practices. The findings of this research project may benefit not only patients with depression but also the academic community and public health policymakers.

Using Machine Learning and Deep Learning for Depression Detection on Social Media

Sivabalan Sandh Muthurajan, Qiuyu Huang, Cyrus Y. H. Fung, Peizhe Guan

Depression is a widespread mental health disorder affecting millions worldwide. In the age of social media, it is crucial to leverage technology to detect, help, and support individuals displaying symptoms online. Our investigation revealed that social media, particularly Reddit, has a wide demographic of people and hosts a variety of discussion topics, including a significant number of posts on mental health. Using the large amounts of data on Reddit, the goal is to train and develop various machine learning models to recognize manifestations of depression and to evaluate their performance. These models include machine learning and deep learning methods, such as Support Vector Machines (SVMs), transformers, and various Recurrent Neural Network (RNN) architectures (simple RNNs, Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) units), for sentiment analysis and depression detection on Reddit. The objective is to evaluate and enhance the models’ accuracy. Early detection facilitated by such models could potentially mitigate the impact of mental health disorders through timely intervention and support mechanisms.

Using Interactive Games as Tools for Gauging Anxiety

Nina Geng, Mona Filali, Siddharth Gollapudi

Anxiety disorder (AD) is a condition associated with feelings of dread and an incapability to perform daily tasks. Symptoms can be physical, such as a rapid pulse and sweating, and emotional, including constant stress and difficulty concentrating. Anxiety disorders are subjective, variable, and often overlap with underlying conditions, complicating detection and treatment. This study focuses on three factors (implicit memory, reaction time, and reaction to a perceived threat), all measured in previous studies and correlated with anxiety levels. Our goal is to determine whether the presence of anxiety can be identified using three games and one simulation. The first game centers around implicit memory, which enables us to perform tasks subconsciously. Anxious individuals exhibit an implicit memory bias for threat-related information. The second game will associate a level of anxiety with reaction times. The third game first makes the user complete a regular task and progressively introduces a threat to the successful completion of the task. Attentional bias shows that anxious people are more likely to respond to threat-related stimuli rather than stick to their task. Finally, the user will take part in an artificial anxiety-inducing situation. In the first stage, the user plays a regular game. In the second, distracting stimuli appear. In the third, the user’s available resources decrease. Throughout the stages, both the user’s effectiveness and efficiency will be evaluated. If the user experiences anxiety, efficiency should decrease in stage 2 and effectiveness should decrease in stage 3. Based on their results, users will receive an assessment of their optimal stress level for highest performance. The results from all three games will be compiled to provide an anxiety assessment. The player will then be prompted to complete the Hamilton Anxiety Rating Scale (HAM-A), which will be used to verify whether the game results match their true anxiety levels.

Imposter Syndrome Comparison Map

Aanal Kamleshkumar Patel, Bimal Kumar Shrestha, Danilo Diaz, Ernie Sumoso, Jayachandran Saravanan

Impostor syndrome is defined as a combination of self-perceived fraudulence, achievement pressure, and negative emotions [3], which creates doubts in one’s capacity, leading to unfavorable comparisons with others. Our study represents an approach to evaluate individuals based on a set of questions from the Young Impostor Syndrome (YIS) scale [4] built within a web application. These questions assist users in identifying impostor syndrome symptoms, such as self-doubt, and use Likert-scale responses ranging from ‘agree’ to ‘disagree’ to calculate a more accurate impostor syndrome metric. Individuals are encouraged to provide personal information such as age and years of experience in the field, enabling users to attribute their accomplishments to their own ability. Furthermore, our initial data, from a dataset predicting impostor syndrome in medical students [2], is processed as the basis for our analysis. We then expand this data with online user responses, using a cloud solution to store their inputs. The data is then presented as clusters to the user so they can observe how many other individuals are dealing with similar kinds of issues. This enables the investigation of relationships between impostor syndrome traits and demographics. We use clustering algorithms [1] to place users into groups according to their profiles. Users can see where those clusters fit in relation to others by viewing them in 2D and 3D scatter plots. Visualizations help to better understand the frequency and the variability of this mental health phenomenon among demographic groups, in addition to quantifying it. This study adds to our understanding of the intricacy and effects of impostor syndrome. Users’ experience can be contextualized by the visual representation, which promotes awareness and debate of the psychological phenomenon. 
Our method emphasizes how sophisticated data analytics and psychological testing may be used to create a comprehensive tool that helps research and individuals. Keywords: impostor syndrome, data analysis, clustering algorithms, visualization, web application, mental health, Likert scale

Machine Learning in Neurological Disorders: A Comprehensive Review of Diagnostic and Predictive Applications

Saivenkat Jilla, Peter Macdonald, Jamal Khattak, Nicole Go, Mehul Sharma

With machine learning becoming increasingly integrated into healthcare, this study explores the efficacy of machine learning models in classifying and predicting various brain disorders and diseases such as schizophrenia, Alzheimer’s disease, and Parkinson’s disease. Our review highlights the potential that ML has in identifying effective biomarkers, with cortical thickness emerging as a more accurate indicator for schizophrenia. Additionally, the connection between Alzheimer’s disease and metabolic syndrome was made much clearer through ML algorithms, which identified eight diagnostic genes with high AUC values. The versatility of ML is further demonstrated through the application of natural language processing to analyze social media data, uncovering mental health issues in patients with diabetes. The challenges associated with ML use in a healthcare setting were also analyzed; the performance of ML models can drop substantially when encountering new data significantly different from the training set. While ML has great promise, it is best used as a supplementary tool in collaboration with clinicians for the greatest efficacy. Continuous improvement in algorithms, data quality, and model validation will be essential to fully realize the potential of ML in improving patient care and outcomes.

Enhancing Wellbeing: An AI-Driven Stress Level Identifier

Rima Rajan, Arvind R. Rajan

Stress affects over 79% of individuals globally, presenting a significant challenge for timely and accurate identification within mental health care. Traditional methods, which rely on self-reported data and sporadic clinical assessments, often result in delayed diagnosis and inadequate stress management. To address these challenges, this project employs artificial intelligence (AI) to develop a real-time stress level detection model using unsupervised learning algorithms. By integrating data from wearable devices, such as Apple Watches, with self-reported mood and activity logs, our model provides continuous and comprehensive stress analysis. Achieving an accuracy of 60% upon validation, the model demonstrates a significant step toward more effective stress management. This AI-driven approach shows promise in enabling early stress detection and personalized interventions. The findings underscore the potential for AI to enhance stress monitoring accuracy, thereby contributing to improved individual well-being and broader accessibility to quality mental health care. Future research will focus on refining the model to improve its accuracy and exploring additional data sources to further enhance its predictive capabilities.

Predictive Models on the Therapeutic Effects of Diverse Music Repertoires on Mental Health

Joey Qiao, Cindy Qiao

Music therapy is a recognized field of therapeutic intervention used in a diverse variety of contexts, including hospital settings, retirement homes, palliative care, and community programs. It leverages intrinsic qualities of music to address mental health issues and enhance overall mental health. A challenge for music therapists working in hospice is lack of suitable musical repertoires, called “music unpreparedness”. This study aims to use a machine learning model to evaluate repertoires and predict their effects on an individual’s mental health given their age and mental health condition, if any. Streaming services such as Spotify can also implement this model to improve the overall effects of services on listeners’ mental health–for instance, promoting playlists that will likely improve, rather than worsen, the user’s mental health on their home page. To train the model, an open-access dataset was used containing features such as favorite genre, age, and mental health condition. Categorical variables including ‘Fav genre’ and ‘Frequency [genre]’ were prepared using one-hot encoding and the dataset was split into training and testing sets, using a stratified approach to maintain the distribution of mental health conditions and music preferences. A decision tree model was chosen due to its interpretability and ability to handle categorical data. The model was trained with the entropy criterion and a maximum depth of 6, achieving a training accuracy of 82% and a validation accuracy of 75%. Key features influencing the model included listening hours/frequency, favorite genres, and mental health condition levels. The model provides insights into how these factors contribute to the perceived effect of music on mental health, offering valuable predictions that can enhance therapeutic interventions and user experiences on streaming platforms. Future improvements include incorporating more diverse datasets to improve the model’s generalizability. 
Additionally, other machine learning algorithms can be explored to enhance prediction accuracy.
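The entropy criterion mentioned above selects, at each node, the split with the largest information gain. The following minimal sketch computes entropy and information gain for a categorical feature; the labels and feature values are hypothetical toy data, not rows from the dataset used in the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Reduction in entropy obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data for illustration only.
effects = ["improve", "improve", "no effect", "no effect"]
genre = ["classical", "classical", "metal", "metal"]  # separates the classes
age_group = ["teen", "adult", "teen", "adult"]        # carries no information

print(information_gain(effects, genre))      # 1.0
print(information_gain(effects, age_group))  # 0.0
```

A tree grown with this criterion would split on the first feature here, since it removes all uncertainty about the label while the second removes none.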

AI For Emotional Wellbeing: Designing an NLP-based Mental Health Chatbot

Uma Maheshwari, Raemil Comiel, Sonal Parmar, Jharana Adhikari, Aparna Suresh

Mental health is a subjective phenomenon, and the only way to diagnose a mental health problem is to understand its root cause. In many cases, people hesitate to seek treatment due to discrimination, stigma, or fear of the unknown. This study shows how a Natural Language Processing (NLP) based chatbot can be an interactive medium for patients to seek guidance for their mental health-related issues without hesitation and in a healthy and safe environment that will protect their privacy. The NLP-based chatbot is governed by an artificial neural network (ANN) that is deployed as an app. The model is trained on a dataset with various questions, patterns, and responses. After training, the model responds with 86.04% accuracy, which could increase if its knowledge base is expanded. This chatbot can address issues like anxiety, depression, phobia, post-traumatic stress disorder (PTSD), and schizophrenia. However, there is a need to expand the volume and variety of data and include a learning curve in the model to improve the responses. Long Short-Term Memory (LSTM) networks or GPT-3 could play a crucial role in improving the chatbot’s efficiency by fostering personalized support tailored to the needs of individuals.

Investigating Multi-Perceptron Neural Networks for Stress Detection Using EEG and ECG

Nour Boulos, Patricia Krisanti, Lauren Altomare, Najma Sultani, Heraa Muqri

Stress is a common experience reported to impact 30% of the global population with significant negative mental health effects, and it has sparked an increase in research aimed at stress detection. This study explores the effectiveness of using electrocardiogram (ECG) and electroencephalography (EEG) signals as predictors for stress detection using a multi-perceptron neural network. The model is trained and validated using a dataset from a study with 40 participants, 21 female and 19 male, at the Prince of Songkla University in Thailand. The participants were between the ages of 18 and 25 and performed mental arithmetic tasks (MAT) while their EEG and ECG signals were measured. The data is divided into 3 classes: EO, A1, and A2, representing normal stress, high stress, and low stress levels in participants respectively. In addition to the participant’s gender, which is included in both the EEG and ECG data, the EEG data contains the Fp1, Fp2, F3, F4, T3, T4, P3, and P4 channels, while the ECG data contains mean heart rate, AVNN, SDNN, NN50, pNN50, RMSSD, LF, LF Norm, HF, HF Norm, and L/F ratio. The model’s architecture and hyperparameters, including its learning rate, activation function, dropout rate, epochs, and layers, were adjusted and evaluated, and the model currently achieves 62% accuracy. In comparison to existing studies using the same dataset with various machine learning algorithms such as Adaptive Boosting, RF, SVM, k-Nearest Neighbor, LR, and NB, this model achieves similar accuracy, indicating competitive performance of the multi-perceptron with existing stress detection models. Future iterations of the model aim to improve the test accuracy and decrease the loss of the model by creating a training set using the nearest centroid neighbor (NCN) approach. The scarcity of data on EEG, ECG, and stress while training the model suggests a need for further data collection to improve the model’s training set and accuracy.
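Several of the ECG features listed above (AVNN, SDNN, NN50, pNN50, RMSSD) are time-domain heart rate variability measures derived from the intervals between normal heartbeats. The sketch below shows their usual definitions on a made-up series of NN intervals in milliseconds; it is not the dataset's feature extraction, and it assumes the sample standard deviation for SDNN, which some tools replace with the population form.

```python
from math import sqrt
from statistics import mean, stdev


def hrv_time_domain(nn_ms):
    """Compute common time-domain HRV features from NN intervals in milliseconds."""
    if len(nn_ms) < 2:
        raise ValueError("at least two NN intervals are required")
    # Differences between successive NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "AVNN": mean(nn_ms),                         # average NN interval
        "SDNN": stdev(nn_ms),                        # sample std. dev. of NN intervals
        "NN50": nn50,                                # successive differences > 50 ms
        "pNN50": 100 * nn50 / len(diffs),            # NN50 as a percentage
        "RMSSD": sqrt(mean([d * d for d in diffs])), # RMS of successive differences
    }


# Made-up NN intervals for illustration.
nn = [800, 860, 790, 810, 900]
features = hrv_time_domain(nn)
print(features["AVNN"], features["NN50"], features["pNN50"])  # 832 3 75.0
```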

Event Schedule

FAQ

You do not need previous experience with programming, although it is recommended. We welcome all students who are eager to put effort into learning and expanding their skillsets, as well as those who show any level of interest in data science or the challenge topic. Additionally, we will provide you with access to resources and webinars to learn everything you need to succeed!

We encourage participants to start forming teams before the event. You may also register and participate on your own or request to be placed into a team after registration. We also recommend forming interdisciplinary teams, given the nature of some of our data challenge topics. Each team is encouraged to have at least one member with a background in medicine, life sciences, biology, or a related field. This is recommended but not mandatory.

Think about what interests you the most in the field of the provided topic. Reflect on your day-to-day; talk to your friends and professional network from academia and industry; explore emerging technologies and platforms; read online sources and research articles. In hackathons like these, many teams come up with their topics in the first few days of the challenge, rather than beforehand.

No, students from any country can sign up. The IUBDC is not limited to Canadians.

Undergraduate and graduate students can register for the Big Data Challenge.

Yes, students do not necessarily have to represent the university at which they are studying.