Visual AI in Healthcare 2025
I spoke at the Visual AI in Healthcare virtual meetup hosted by Voxel51. I like using their open-source tool, FiftyOne, to curate my vision datasets.
Visual AI in Healthcare - 2025
The event pages for each day now serve as recaps of the talk. Check them out below:
My talk - Continuous patient monitoring with AI
We recently published a peer-reviewed paper in Frontiers in Imaging. We think it is the first peer-reviewed work out of industry on vision AI in hospitals. I was invited to speak on this work.
Here is some of the content generated from the talk.
Figures courtesy of LookDeep Health. Video provided by Voxel51.
Notes on other talks
Here are my short notes for each talk.
Day 1
The speaker, Daniel Gural, has been an ML Engineer at Voxel51 for quite some time and has given a number of talks already. Recently, their talks have focused on medical image segmentation. This time, three models were discussed (see title), all nicely integrated into the Voxel51 tool for curating image datasets. They left early to give another talk with NVIDIA, whose Omniverse they also recently integrated for synthetic data generation.
The speaker, Kate Kondrateva, used today's top LLM tools (ChatGPT, Gemini, MedLLaMA, etc.) to create a "medical digital twin" from their own health records. They probed where each model delivered and where it fell short. They also analyzed cost and data retention from a privacy perspective. In short, using these tools can help you better prepare for participating in your own care, but user beware.
Also, they recently started a new role at Atelic.ai!
The speaker, Somaieh Amraee, is a post-doc at Northeastern University, working at the Institute for Experiential AI. Their research focuses on analyzing the behavior of individuals with ASD. They gave an overview of using pose estimation and multi-object tracking as a way to monitor movement behaviors of individuals with ASD. They also discussed the challenges in building clinically meaningful tools (very relatable).
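As a rough illustration of the idea (not the speaker's actual pipeline), a pose-estimation-plus-tracking setup might aggregate per-frame keypoints by track ID and summarize how much each individual moves. Everything below, including the synthetic keypoints and the 17-keypoint skeleton, is a hypothetical sketch:

```python
import numpy as np

def movement_per_track(frames):
    """frames: list of dicts mapping track_id -> (K, 2) keypoint array.

    Returns the mean per-frame keypoint displacement for each track,
    a crude proxy for movement behavior.
    """
    prev, totals, counts = {}, {}, {}
    for kp_by_id in frames:
        for tid, kps in kp_by_id.items():
            if tid in prev:
                # Average Euclidean displacement across keypoints.
                disp = np.linalg.norm(kps - prev[tid], axis=1).mean()
                totals[tid] = totals.get(tid, 0.0) + disp
                counts[tid] = counts.get(tid, 0) + 1
            prev[tid] = kps
    return {tid: totals[tid] / counts[tid] for tid in totals}

# Two synthetic frames: track 1 shifts every keypoint 1 unit in x.
f0 = {1: np.zeros((17, 2))}
f1 = {1: np.zeros((17, 2)) + np.array([1.0, 0.0])}
print(movement_per_track([f0, f1]))  # track 1 moved ~1 unit per frame
```

A clinical tool would of course need far more than raw displacement, which is exactly the gap between a demo and something clinically meaningful.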
The speaker, Amar Kumar, gave a sneak preview of their upcoming oral presentation at MIDL 2025. They presented their new framework, PRISM. From their project page:
PRISM can generate high-fidelity counterfactual medical images with precise control over specific attributes.
Day 2
The speaker, Brandon Konkel, is a senior machine learning engineer at Booz Allen Hamilton. They talked about how they provide multi-modal medical imaging datasets for their partners using FiftyOne.
The speaker, Jeffrey Gao, is a PhD student at Caltech. They are working on a real-time ultrasound app that uses AI to automatically detect and assess key heart parameters. The demo uses an iPhone and the Butterfly ultrasound probe.
Read the paper for more information.
The speaker, Asba Tasneem, is the program chair at NCER. They talked about the business of AI in healthcare from an oncology perspective. Their reference project was about bringing AI-enabled portable X-ray machines to the Himalayas.
Day 3
The speaker, Maximilian Rokuss, is a PhD student at the DKFZ (German Cancer Research Center). They presented the LesionLocator model, which can find lesions and track tumors in 3D imaging. I was impressed with the advanced techniques applied to reach close-to-human performance.
Check out the GitHub repo for more details.
The speakers, Ashwin Kumar and Maya Varma, are PhD students at Stanford University. They presented MedVAE, a family of six variational autoencoders (VAEs). These models encode medical images, preserving their information while using far less space than the full-resolution originals. What is most impressive is how the two demonstrated that these models work across several medical imaging domains.
Check out the GitHub repo for more details.
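To make the storage argument concrete, here is a back-of-envelope sketch. The shapes below are my own illustration, not MedVAE's actual architecture or compression factors: a single-channel 512x512 image encoded to a hypothetical 4-channel latent at 1/8 spatial resolution.

```python
import numpy as np

# Hypothetical shapes for illustration only.
image = np.random.rand(1, 512, 512).astype(np.float32)        # full resolution
latent = np.random.rand(4, 512 // 8, 512 // 8).astype(np.float32)  # encoded

ratio = image.nbytes / latent.nbytes
print(f"compression factor: {ratio:.0f}x")  # 16x for these toy shapes
```

The point is simply that a well-trained encoder lets downstream models and storage systems work on the small latent instead of the full image.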
The speaker, Heather Couture, is a consultant, research scientist, and podcast host working at PixelScientia. They discussed, in detail, what the world is starting to look like as foundation models are used to diagnose medical conditions. The four uses for foundation models they outlined were 1) gigapixel images and weak supervision; 2) limited labeled data; 3) robustness to site-specific data; 4) rapid prototyping.
You can learn more by listening to their podcast.
The speaker, Gaurav K Gupta, is a software developer at the Lake County Health Department in Illinois. They talked about their experience using LLMs over the past year, discussing two studies they ran comparing different LLMs. They also motivated using agentic AI systems for orchestrating clinical workflows.