Real-Time Autocomplete
Powered by an advanced vision-language model, ViTAL provides autocomplete suggestions as radiologists write chest X-ray reports
Saliency Map Visualization
Highlights the regions of the chest X-ray image that correspond to each autocomplete suggestion (see the sketch after this feature list)
Save Reporting Time
Cut reporting time and give radiologists back their hours for key diagnoses and patient care
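Below is a minimal sketch of how such a saliency overlay can be produced. It assumes patch-level cross-attention weights have already been extracted from the decoder while it generated a suggestion; the attn grid and the overlay_saliency helper are illustrative stand-ins, not part of any released ViTAL API.

import numpy as np
import matplotlib.pyplot as plt

def overlay_saliency(image, attn, alpha=0.4):
    """Overlay a patch-level attention grid on a grayscale X-ray image."""
    h, w = image.shape
    gh, gw = attn.shape
    # Nearest-neighbor upsample of the patch grid to the image resolution
    # (assumes the image size is a multiple of the grid size).
    heat = np.kron(attn, np.ones((h // gh, w // gw)))
    plt.imshow(image, cmap="gray")
    plt.imshow(heat, cmap="jet", alpha=alpha)  # translucent heatmap on top
    plt.axis("off")
    plt.show()

# Random data standing in for a real image and real attention weights:
overlay_saliency(np.random.rand(224, 224), np.random.rand(7, 7))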
Our Mission
Provide radiologists with a supplementary tool that jumpstarts report generation by annotating chest X-ray images and highlighting key observations of the radiology study, using our advanced Vision Text Informed Autocomplete Language (ViTAL) model.
Transform how radiologists work
Too many reports to generate, not enough radiologists
More than 70 million chest X-rays are performed each year in the U.S.
This backlog leads to long turnaround times
The U.S. national average cost of a chest X-ray is $420
Long turnaround times contribute to higher costs
An AI system that provides X-ray-informed sentence completion suggestions as radiologists type their reports
Real-time autocompletion
ViTAL provides autocompletion based on the patient's chest X-ray images
Takes the patient's medical history into account
ViTAL takes the patient's current and prior chest X-ray images, along with the doctor's annotations, as input
Our team fine-tuned a pretrained Vision Transformer and a large language model to annotate chest X-ray images
Swin Vision Transformer pretrained on the ImageNet dataset + GPT-2 architecture
This architecture has the Swin ViT produce embeddings from the chest X-ray images; these, together with text embeddings from the radiologist's input, are fed to a large language model (GPT-2) to generate annotations for the chest X-ray images
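As a hedged illustration of this wiring, here is how such an encoder-decoder can be assembled with the Hugging Face transformers library; the specific checkpoints below are generic ImageNet/WebText stand-ins, not the exact weights behind ViTAL.

from transformers import (AutoImageProcessor, AutoTokenizer,
                          VisionEncoderDecoderModel)

# Swin encoder (ImageNet-pretrained) + GPT-2 decoder; the cross-attention
# layers linking them are newly initialized and learned during training.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/swin-base-patch4-window7-224", "gpt2"
)
processor = AutoImageProcessor.from_pretrained(
    "microsoft/swin-base-patch4-window7-224"
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 ships without a pad token; reuse EOS so batched training works.
tokenizer.pad_token = tokenizer.eos_token
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id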
Pretrain ViTAL on the MS COCO dataset, then fine-tune on MIMIC-IV chest X-ray images
We use the large MS COCO image-caption dataset to pretrain ViTAL so that the ViT and the LLM speak the same language; we then fine-tune ViTAL on MIMIC-IV chest X-ray images with ground-truth radiologist annotations so that it produces high-quality reports from chest X-ray images
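A hedged sketch of that two-stage recipe follows. The data loaders (coco_loader, mimic_loader) are hypothetical placeholders yielding batched pixel_values and tokenized report labels, and the model is one assembled as in the sketch above.

import torch

def train_stage(model, dataloader, epochs=1, lr=5e-5, device="cuda"):
    """One stage of captioning training: images in, report tokens as labels."""
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(epochs):
        for batch in dataloader:
            outputs = model(
                pixel_values=batch["pixel_values"].to(device),
                labels=batch["labels"].to(device),  # tokenized caption/report
            )
            outputs.loss.backward()  # standard next-token cross-entropy
            optimizer.step()
            optimizer.zero_grad()

# Stage 1: align the ViT and GPT-2 on generic captions (MS COCO).
# train_stage(model, coco_loader, epochs=3)
# Stage 2: specialize on chest X-rays and radiologist reports (MIMIC-IV).
# train_stage(model, mimic_loader, epochs=5)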
ViTAL In Action
Watch a 5-minute demo of how ViTAL assists in composing a chest X-ray report
Try it out yourself
Research
The Vision Text Informed Autocomplete Language (ViTAL) model links a pretrained Swin Vision Transformer with a GPT-2 LLM,
forming a vision-plus-language-to-language model that generates a descriptive narrative of X-ray findings from the given chest X-ray images.
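As a hedged end-to-end sketch of that autocomplete behavior: the image is encoded once, the radiologist's partial sentence primes the GPT-2 decoder, and beam search proposes a continuation. The checkpoint pairing and the image path below are illustrative assumptions; in practice a model fine-tuned as described above would be loaded.

import torch
from PIL import Image
from transformers import (AutoImageProcessor, AutoTokenizer,
                          VisionEncoderDecoderModel)

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/swin-base-patch4-window7-224", "gpt2"
)
processor = AutoImageProcessor.from_pretrained(
    "microsoft/swin-base-patch4-window7-224"
)
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.eos_token_id

image = Image.open("chest_xray.png").convert("RGB")  # hypothetical path
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# The radiologist's partial sentence primes the decoder.
prefix_ids = tokenizer("The cardiac silhouette is", return_tensors="pt").input_ids

with torch.no_grad():
    output_ids = model.generate(
        pixel_values,
        decoder_input_ids=prefix_ids,
        max_new_tokens=20,
        num_beams=4,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))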
Our Team
We are researchers from the UC Berkeley Master of Information and Data Science (MIDS) program
who aim to leverage advanced AI models to solve real-world problems