A Novel Interpretative Deep Neural Network with Grad-CAM’s Heatmap for The Early Diagnosis of Alzheimer’s Disease

Student: Annabelle Yao
Table: MED8
Experimentation location: School, Home
Regulated Research (Form 1c): No
Project continuation (Form 7): No




[1] Alzheimer’s disease facts and figures 2023. 19(4):1598–1695, Mar 2023.

[2] Sabbagh, M. N., Lue, L. F., Fayard, D., & Shi, J. (2017). Increasing Precision of Clinical Diagnosis of Alzheimer's Disease Using a Combined Algorithm Incorporating Clinical and Novel Biomarker Data. Neurology and therapy, 6(Suppl 1), 83–95. https://doi.org/10.1007/s40120-017-0069-5

[3] Xin Yang, Lequan Yu, Shengli Li, Huaxuan Wen, Dandan Luo, Cheng Bian, Jing Qin, Dong Ni, and Pheng-Ann Heng. Towards automated semantic segmentation in prenatal volumetric ultrasound. IEEE transactions on medical imaging, 38(1):180–193, 2018.

[4] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

[5] ADNI | Alzheimer’s Disease Neuroimaging Initiative. (2024). Usc.edu. https://adni.loni.usc.edu/

[6] Srikanth Tammina. Transfer learning using vgg-16 with deep convolutional neural network for classifying images. International Journal of Scientific and Research Publications (IJSRP), 9(10):143–150, 2019

[7] Shihong Yue, Ping Li, and Peiyi Hao. Svm classification: Its contents and challenges. Applied Mathematics-A Journal of Chinese Universities, 18:332–342, 2003.

[8] Aniruddha Bhandari. Guide to auc roc curve in machine learning: What is specificity?, Jun 2020.

Additional Project Information

Project website: https://s3.s100.vip:4794/
Presentation files:
Research paper:
Additional Resources:
  • GitHub repository with code
Project files:
Project files

Research Plan:

Rationale:   In recent years, Alzheimer’s disease, which impairs patients’ cognitive abilities and causes dementia, has become increasingly prevalent among the older population. Clinical practice today diagnoses the disease through MRI scans and cognitive tests, relying on each doctor’s personal knowledge and experience. This strong reliance on experiential diagnosis poses a serious challenge at the early stage of Alzheimer’s disease, when symptoms are subtle and hard for clinicians to discern. [1] Early-stage diagnosis is also correct only about 77% of the time. [2] Furthermore, in developing areas, advanced healthcare for early diagnosis is extremely rare. As a result, many elderly people today are diagnosed with Alzheimer’s only after the disease has progressed to a later, much more obvious stage, significantly reducing the patients’ chance of receiving effective treatment.

In addition, neural networks, the models at the core of deep learning within Machine Learning (ML), are known for recognizing and modeling complex, nonlinear relationships in data. Loosely analogous to the human brain, a neural network, once trained on the input-output relationships of existing datasets, can form relatively accurate predictions (the output) for new and even more complicated data. However, despite this powerful capability for data-backed prediction, neural networks operate as “black box” models and explain little to users about the approach and reasoning behind their predictions. As a result, neural networks are often viewed with caution in critical domains like disease diagnosis and are not commonly used. [3-4]

Our study will provide a solution that addresses both the medical diagnostic challenges and the ML challenges, advancing the interdisciplinary field of healthcare and technology research.  

Research Questions: Hypothesis, Engineering Goals, Expected Outcomes: In our work, we want to create a novel method from an emerging field and apply it to Alzheimer’s disease diagnosis, addressing both of the critical issues above while achieving highly accurate results. We will follow a process similar to current diagnosis by taking and analyzing MRI scans, but we will conduct the analysis with computers and the interpretability heatmaps they produce. We will analyze how the areas of attention differ between machine learning models, using the heatmaps to observe how each model makes its decisions and comparing the resulting accuracy and precision. We hypothesize that machine learning models with a more global attention weight distribution will have a higher accuracy because of their contextual awareness of other regions of the MRI. This can mimic a doctor’s diagnosis, as doctors look for an overall indication of disease rather than only one specific region that indicates it. We also hypothesize that the heatmap will highlight areas of the MRI with major brain shrinkage, as that is one of the most commonly seen signs across patients with Alzheimer’s disease.

Procedures: The figure below shows the overall flowchart for the process of the stages in the present study.

For this research, we conduct a classification task and make predictions on an Alzheimer's dataset using machine learning. Alzheimer's is a progressive disease that targets the brain and causes dementia, destroying cognitive abilities and resulting in memory loss. We obtain patient data from Kaggle and from public repositories (ADNI) that are available upon request and application. [5] Our dataset includes a total of 6400 pre-processed MRI (Magnetic Resonance Imaging) images converted from NIfTI to JPG and split into four classes: Very Mild Demented, Mild Demented, Moderate Demented, and Non-Demented. All images are resized to 128 x 128 pixels. The precise data statistics can be found in the table below (Table 1).


The dataset will then be divided into an 80:20 train-test split for training and evaluating the machine learning models, and the image orientation will be randomized for optimal learning. We will then tune currently popular ML models to serve as baselines for comparison with our custom transformer model. These baseline models include SVM, XGBoost, MLP, AlexNet, and VGG. [6-7] We will then create our custom ViT (Vision Transformer), with the overall structure of the neural network shown below.
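As a rough illustration of the 80:20 split described above, the sketch below divides image indices per class so that each class keeps about the same share in both sets. The stratification, the fixed seed, the function name `stratified_split`, and the toy class counts are all assumptions made for this example. They are not taken from the study's Table 1 or its code, and the random orientation step is not shown.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize which images land in the test set
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels with hypothetical counts (100 images in total).
labels = (["NonDemented"] * 50 + ["VeryMildDemented"] * 30
          + ["MildDemented"] * 15 + ["ModerateDemented"] * 5)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting per class is one way to keep a rare class, such as the moderate stage, represented in the test set. A plain random split would also satisfy the 80:20 ratio.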

This structure should improve the vision transformer's performance through its positional encodings and its encoder-decoder structure, which transform the input image into vectors with self-attention weights that carry contextual awareness. The precise structure of the encoder-decoder is shown below.
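To make the idea of patch vectors, positional encodings, and self-attention weights more concrete, here is a minimal NumPy sketch of a single attention head applied to one 128 x 128 image. It is not the project's model. The patch size of 16, the embedding width of 32, and the random matrices standing in for learned parameters are assumptions chosen only for illustration.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Cut a 2D image into non-overlapping square patches, one row per patch."""
    h, w = image.shape
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size)
    patches = patches.transpose(0, 2, 1, 3)  # (rows, cols, patch, patch)
    return patches.reshape(rows * cols, patch_size * patch_size)


def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
image = rng.random((128, 128))                       # stand-in for one MRI slice
patches = image_to_patches(image, 16)                # (64, 256)

dim = 32                                             # assumed embedding width
w_embed = rng.normal(scale=0.02, size=(256, dim))    # stand-in for learned weights
pos_embed = rng.normal(scale=0.02, size=(64, dim))   # stand-in positional encoding
tokens = patches @ w_embed + pos_embed

w_q, w_k, w_v = (rng.normal(scale=0.1, size=(dim, dim)) for _ in range(3))
attended, weights = self_attention(tokens, w_q, w_k, w_v)
print(attended.shape, weights.shape)
```

Each row of `weights` describes how strongly one patch attends to every other patch, which is the sense in which the resulting vectors carry context from the whole image.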


Then, we create the Grad-CAM structure. First, we back-propagate through our machine learning model to find the gradient of the output class’s probability score with respect to the self-attention layer’s query embeddings. Then, we take the mean over the self-attention heads in the layer and compute the weighted sum of the individual embeddings. This emphasizes the most important query in the layer (the vector with the most attention weight). The result then passes through the Softmax activation function so that every contribution is non-negative (standard Grad-CAM uses ReLU at this step to retain only the positive contributions). The activations are then converted into a heatmap by upsampling with bilinear interpolation. This technique estimates pixel values at non-integer coordinates from the values of neighboring pixels and resizes the heatmap to the desired size while keeping it smooth. The heatmap is then overlaid onto the original MRI scan and visualized as an intensity interpretability heatmap for the ML model.
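The heatmap steps can be sketched in NumPy as follows. This is a simplified, standard Grad-CAM-style computation, not the project's custom module: it weights each embedding channel by its mean gradient, applies ReLU (the standard Grad-CAM choice), and upsamples an assumed 8 x 8 token grid to 128 x 128 with corner-aligned bilinear interpolation. The random arrays stand in for real activations and gradients, and the function names are hypothetical.

```python
import numpy as np


def bilinear_resize(grid, out_h, out_w):
    """Upsample a 2D array with corner-aligned bilinear interpolation."""
    in_h, in_w = grid.shape
    ys = np.linspace(0, in_h - 1, out_h)   # output rows in input coordinates
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                # vertical blend weights
    wx = (xs - x0)[None, :]                # horizontal blend weights
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bottom = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def grad_cam(activations, gradients, grid_shape, out_shape):
    """activations, gradients: (num_tokens, dim) arrays from one layer."""
    channel_weights = gradients.mean(axis=0)                # one weight per channel
    cam = np.maximum(activations @ channel_weights, 0.0)    # ReLU keeps positives
    cam = cam.reshape(grid_shape)
    if cam.max() > 0:
        cam = cam / cam.max()                               # scale into [0, 1]
    return bilinear_resize(cam, *out_shape)


def overlay(image, heatmap, alpha=0.4):
    """Blend a heatmap onto a grayscale image with values in [0, 1]."""
    return (1 - alpha) * image + alpha * heatmap


rng = np.random.default_rng(0)
activations = rng.normal(size=(64, 32))   # stand-in for layer outputs
gradients = rng.normal(size=(64, 32))     # stand-in for d(class score)/d(outputs)
heatmap = grad_cam(activations, gradients, (8, 8), (128, 128))
blended = overlay(rng.random((128, 128)), heatmap)
print(heatmap.shape, blended.shape)
```

In a real pipeline the two input arrays would come from a forward and backward pass through the trained model, for example via hooks on the chosen layer.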

Lastly, we will bridge the healthcare divide by creating an open-access website that showcases our project and allows users to upload their MRI scans for a result. We will first design the website's UX/UI structure in Figma and build the front end with HTML, CSS, and JavaScript. Then, we will connect to a server running Ubuntu Linux and lay the foundational code to get it running. We will then set up the server so that machine learning code can run on it. This will all be done using Flask, the terminal, Python, and PyTorch. Then, we will connect our ML model and interpretability features to the front end so that any MRI image a user uploads can be run through our model. We will then add resources and treatment methods to our website so that areas with less advanced healthcare can still have the knowledge necessary to mitigate the disease.

Risk and Safety: This research has no significant risk and safety concerns.

Data analysis: All data analysis will use standard metrics for comparing machine learning models: Precision, Recall, Macro F1 Score, Weighted F1 Score, and ROC AUC. [8] We will also analyze the heatmaps by visually inspecting the highlighted areas of interest.
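For reference, the difference between the macro and weighted F1 scores can be shown with a short pure-Python sketch. The function names and the six toy predictions are made up for illustration and are not results from the study. ROC AUC is left out because it needs predicted probabilities rather than hard labels.

```python
def per_class_scores(y_true, y_pred, classes):
    """Precision, recall, F1, and support for each class (one-vs-rest)."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": tp + fn}
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(s["f1"] for s in scores.values()) / len(scores)


def weighted_f1(scores):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    total = sum(s["support"] for s in scores.values())
    return sum(s["f1"] * s["support"] for s in scores.values()) / total


# Toy example with two hypothetical classes.
y_true = ["Non", "Non", "Non", "Non", "Mild", "Mild"]
y_pred = ["Non", "Non", "Non", "Mild", "Mild", "Non"]
scores = per_class_scores(y_true, y_pred, ["Non", "Mild"])
print(macro_f1(scores), weighted_f1(scores))
```

On imbalanced data the macro score drops more sharply when a small class is predicted poorly, which is why reporting both can be informative.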

Questions and Answers

1. What was the major objective of your project, and what was your plan to achieve it?


The major objectives of my project were to create an accurate and trustworthy method of diagnosing the early stages of Alzheimer’s disease, solve the black box problem commonly seen in machine learning that prevented its implementation in the medical field, and increase medical resource accessibility, bridging the healthcare gap between developed and developing areas. My plan was to create a custom image recognition machine learning neural network model, combined with a saliency heatmap, to increase human understanding and interpretability of the predicted results. I then created a website that could allow users to use my model by uploading their own MRI scans for an early diagnosis result and heatmap, as well as access the other resources and information on Alzheimer’s disease on the website. 


  a. Was that goal the result of any specific situation, experience, or problem you encountered?


Last year, my grandfather tragically died of late-stage Alzheimer's disease. Dealing with grief, I questioned whether things would’ve been different if our family had realized his situation earlier. This is what initially inspired me to research Alzheimer’s disease. As I learned about the problems and challenges of diagnosing early-stage Alzheimer’s disease, I was motivated to find a method to prevent our family’s situation from happening to others. I decided to apply my technological and computing skills to this field, using machine learning to diagnose Alzheimer's disease early from MRI scans.


  b. Were you trying to solve a problem, answer a question, or test a hypothesis?


I was trying to solve the problem of low accuracy and high variability in early-stage Alzheimer’s disease diagnosis, and to increase the level of healthcare accessible to people in areas with less advanced medical knowledge.


2. What were the major tasks you had to perform in order to complete your project?


Before conducting the project itself, I had to apply for access to government health databases for the MRI brain scans, in addition to strengthening my knowledge of machine learning and of interpretability measures implemented in the past. I learned advanced Python programming, JavaScript, machine learning, and medical knowledge of Alzheimer’s disease.


During the research project, I created my own custom machine learning model (a custom Vision Transformer) that could draw on both global context and region-specific information when determining the diagnosis, so that the prediction could be as accurate as possible. I also adapted five traditional machine learning models as baselines for performance comparisons.


Once I had the machine learning model working, I created a custom interpretability module (a custom Grad-CAM) that generates the model’s heatmap, offering users a direct, visible explanation of the model’s results.


Finally, I created an open-access website using my code and deployed it on an Ubuntu server. With this website, my scientific research turned into a practical web-based application, allowing ordinary patients with no computer science or medical knowledge not only to obtain early predictions of Alzheimer's disease through a simple upload-and-run but also to learn more about the disease itself and how to take care of patients with symptoms.


  a. For teams, describe what each member worked on.



3. What is new or novel about your project?


My largest contribution is that my custom ViT significantly improves the accuracy of Alzheimer's Disease (AD) diagnosis at its early stages by combining cutting-edge machine learning models with traditional disease diagnosis. My ViT model with the Grad-CAM module raises diagnostic accuracy and precision from roughly 77% for current manual diagnosis to 99%, an improvement of about three standard deviations. Because the lack of explanation for machine learning results, known as the “black box problem,” could prevent its use in the medical field, I created a custom Grad-CAM interpretability module that produces a heatmap of the regions the model considered most essential to the diagnosis, giving doctors and patients a direct, visual reference.


Furthermore, I improved the methodology of using machine learning models in medical research. Firstly, current research tends to train machine learning models on different datasets and with different evaluation metrics. I unified the comparison metrics and compared six machine learning models, including my ViT, making the research results more convincing. Secondly, existing research mostly uses traditional CNNs and RNNs for disease diagnosis. My project uses an interpretable vision transformer neural network (ViT and Grad-CAM) for disease diagnosis through MRIs, which is very innovative in the Alzheimer’s disease diagnosis field. Finally, my research also analyzes the differences between a transformer neural network (ViT) and a traditional VGG neural network and explores the reasons for them.


Last but not least, my research findings and machine learning model have become a practical web-based application through my website. Users can upload their MRI scans with a simple upload-and-run click to get their predictive results. I am among the first few to offer an open-access website for diagnosing early-stage Alzheimer’s disease through MRIs.


  a. Is there some aspect of your project's objective, or how you achieved it, that you haven't done before?


I have never done an entire research project before. Before this research, I knew little about MRI scans or the biological composition of the brain, let alone Alzheimer’s Disease. To conduct research in the early diagnosis field, I had to read hundreds of pages of medical papers to build my knowledge in this field. 


Also, before this project, I had never created a machine learning model. My prior computer science knowledge was mainly limited to coding competitions, and I had never tried to use my coding skills in real scientific research. When I started this project, I taught myself AI and machine learning through online courses and videos and proactively sought expert help. This was not just a great learning experience but also a period of significant personal growth. I grew more curious, persistent, and motivated.


  b. Is your project's objective, or the way you implemented it, different from anything you have seen?


My project’s objective is very different from much of what I’ve seen in the past. Previously, all diagnoses of Alzheimer’s disease that I knew of were done manually by doctors. My project utilizes machine learning to diagnose AD. Furthermore, my project uses machine learning with the transformer architecture. In the past research I reviewed, the machine learning models for disease diagnosis all used RNNs, CNNs, and other more traditional architectures. Most importantly, my project has an interpretability feature that addresses the black box problem for all the code I used. This hasn’t been done much before, as all the previous research I saw used machine learning models with traditional structures and didn’t implement similar interpretability features or a website.


  c. If you believe your work to be unique in some way, what research have you done to confirm that it is?


All the points mentioned above set my work apart from past research. When preparing for my project, I combed through many past research papers and articles. I confirmed that my method of implementing a custom vision transformer with a custom interpretability heatmap for doctor referencing for the early diagnosis of Alzheimer’s disease hadn’t been done before. 


Furthermore, I created an open-access website using my code and deployed it on an Ubuntu server. My website allows ordinary patients with no computer science or medical knowledge not only to obtain early predictions of Alzheimer's disease through a simple upload-and-run but also to learn more about the disease itself and how to take care of patients with symptoms.



4. What was the most challenging part of completing your project?


The most challenging part of the project was writing my own vision transformer machine learning model. Before I worked on the research, I only had preliminary knowledge of machine learning algorithms and model structures. As a result, when starting the project, I relearned most of machine learning from scratch through Google courses, research, articles, videos, etc. It also took a long time to write my transformer and interpretability features.


  a. What problems did you encounter, and how did you overcome them?


When comparing past machine learning models, I realized that each was created to fit a specific training data format. For instance, one model’s default training format would be JPEGs of a certain size, and another’s would be something different. This caused serious compatibility issues as I attempted to feed one dataset into all six machine learning models in order to compare their accuracy. I overcame the difficulty by adapting the structure of each model to fit the one dataset.


When writing the model’s backpropagation and gradient-update code, I overlooked one concept and confused zeroing the optimizer’s gradients before the step with zeroing them after it. This mistake crashed the model completely and cost me several weeks of work. I had to dedicate more time to reviewing the concept, investigating the details, and rewriting the code more meticulously.


  b. What did you learn from overcoming these problems?


After overcoming the above challenges, I realized that in order to conduct successful scientific research, I have to clear away prior assumptions, set aside my ego, and build up my research by exercising extreme academic rigor. I admit that there are always things I don’t know. Still, they can all be learned if I am modest and patient.


I also learned that every detail matters in research. Sometimes, I grew a little discouraged after seeing how slow the progress was, but I learned that I needed to be more patient in learning and nibble at the details rather than swallow a chunk of information. Even if doing so might delay my progress, it would build a solid foundation for the future. 


I have since changed my learning approach: I review every concept, even if I think I already know it well, and revisit all the traditional code until I am fully comfortable with it.


5. If you were going to do this project again, are there any things you would do differently the next time?


If I had more time, I would add more modalities so that I could process different types of MRI scans. Currently, the input images are all preprocessed from NIfTI files into 2D JPGs. If I added features that allowed my machine learning model to process all forms of MRI images, including the 3D volumes that come directly from the scanner, it could potentially reveal other areas of concern in the brain and help establish relationships between Alzheimer’s disease and other mental diseases.


6. Did working on this project give you any ideas for other projects?


After working on this project, I realized that there are many ways that machine learning and recent technological advances could impact traditional fields such as disease diagnosis, protein identification, or cancer research. In the future, I’d love to investigate the evolution of these fields a bit more, for example, modeling through machine learning the abnormal buildup of proteins and their protein-protein interactions that lead to Alzheimer’s disease in the brain. 


7. How did COVID-19 affect the completion of your project?


COVID-19 didn’t affect the completion of my project.