3 minute read


This work was awarded ‘Best Technical Contribution’ at the Michigan AI Symposium 2023. Link to paper

Context: Diagnosis and surveillance of thoracic aortic aneurysm (TAA) involves measuring the aortic diameter at various locations along the length of the aorta, often using computed tomography angiography (CTA) scans. Rapid aorta growth indicates cause for concern and possible surgical interventions.

Challenge: Aortic measurements performed by expert human raters require 15 to 45 min of focused effort that can be prone to be mistakes.

Objective: Develop a computer vision pipeline to automate accurate aortic diameter measurements.


Developed a fully automated aortic diameter measurement pipeline using deep learning and classical computer vision techniques. We demonstrated low errors that are comparable in magnitude to those from manual measurement. Our pipeline offers significant time savings compared to manual measurements that take 15 to 40 minutes.

My Contributions

  1. Data preparation: Developed a robust ETL pipeline in Python to clean and organize 720+ new 3D CTA volumes into a standardized format. Final dataset is about 700 GB.
  2. Training optimization: Improved CNN training speed by a factor of 500 by minimizing data movement, implementing half precision training, utilizing parallel GPU processing on a SLURM cluster, and addressing other bottlenecks.
  3. Prediction optimization: Improved prediction accuracy by utilizing a sliding window inference function. Improved prediction time by a factor of 10 by addressing pipeline bottlenecks like unnecessary data movement.
  4. Performance Validation: Implemented cross-fold validation and ran statistical paired t-tests to compare different network architectures and training approaches. Found that multitask training improved accuracy compared to traditional single task training approach.
  5. Guard Rails: Implemented guardrails for when this software is deployed in a clinical setting. Specifically, the pipeline not only outputs diameter measurements, but displays visuals of those measurements that can be easily verified by a physician.



U-Net is a deep neural network architecture named for its shape. It features an encoder-decoder structure with skip connections. U-Nets are commonly used in medical image analysis due to their ability to combine high-level features learned via the encode-decoder structure with precise detail maintained with skip connections. We train a 3D U-Net to segment the aorta from a 3D CTA volume and to localize six key anatomical landmarks.

Our multitask U-Net architecture.

U-Net Performance

Our problem has two tasks, segmentation and landmark localization. We trained two separate “single task” U-Nets to perform these functions and compared their performance to a single “multitask” U-Net trained to perform both functions. Using 5-fold cross validation and paired t-tests, we found that the multitask network performed better at segmentation \((p<0.0002)\) while the single tasks network approach led to better landmark localization \((p<0.0001)\).

Since accurate segmentation is more important for diameter measurement, we opted to use the multitask network in our full diameter measurement pipeline.

Automated Diameter Measurement Performance

Like many real world tasks, “ground truth” is not easily accessible. In our case, the actual diameter of the aorta is difficult to determine from a CTA scan, even when done manually by an expert. In fact, measurements performed by two experts on the same scan vary by 2-5 mm on average and similar variability is observed when the same expert measures the same scan on two different days. For our evaluation, we use the average of three expert measurements as the target or ‘ground truth’ size of the aortic diameter.

Measurements are performed at the nine standard clinical locations. The above Bland-Altman plot shows that the automated diameter measurement pipeline is accurate across the nine measurements locations and is accurate across a variety of aortic diameter sizes.

We also compared our automated pipeline to the standard deviation of the three manual measurements and demonstrate that the automated solution has comparable variability.

Overall, our automated pipeline achieved less than 1 mm mean absolute error (MAE) in six of the nine locations. MAE is 1.48 mm to 2.18 mm in the three other locations where manual measurement variance is also highest. Overall, the measurement error is within acceptable limits and comparable to human measurement variability.