Synthetic Healthcare Data - Part 2

Nov 27, 2024

Let's Talk About Building a Healthcare Data Generator - Part 2

Hey there! Last time we covered the basics of our healthcare data generator. Today, let's dig into how we're making this data feel real and clinically meaningful.

Making Hospital Stays Feel Real

Think about the last time you or someone you know was in the hospital. The length of stay probably depended on why they were there, right? That's exactly what we're modeling here. For planned surgeries (we call these "elective" admissions), patients usually stay for a predictable time. But emergency cases? That's where it gets interesting.

Here's how we're handling it:

import numpy as np

# `spell` (one admission record) and `conditions` (the patient's diagnoses)
# are assumed to be defined earlier in the generator.
if spell['admission_method'] == 'Day Case':
    los = 0  # In and out the same day
elif spell['admission_method'] == 'Elective':
    los = np.random.geometric(0.3)  # Mean of about 3.3 days, fairly tight
else:  # Emergency
    los = np.random.geometric(0.2)  # Mean of 5 days, with a longer tail
    # Serious conditions such as heart failure or pneumonia extend the stay
    if any(c in ('Heart Failure', 'Pneumonia') for c in conditions):
        los += np.random.geometric(0.15)  # Adds about 6.7 days on average
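
If you want to see what those parameters imply, here is a minimal sketch that wraps the same rules in a function and averages a few thousand simulated stays. The function names (sample_los, mean_los), the seed and the sample size are my own choices for illustration, not part of the generator itself. NumPy's geometric distribution starts at 1 and has a mean of 1/p, so we'd expect roughly 3.3 days for elective stays, 5 for emergencies, and about 11.7 for emergencies with heart failure or pneumonia.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded so runs are repeatable


def sample_los(admission_method, conditions=()):
    """Draw one length of stay in days, using the same rules as the snippet above."""
    if admission_method == 'Day Case':
        return 0
    if admission_method == 'Elective':
        return int(rng.geometric(0.3))
    # Anything else is treated as an emergency admission
    los = int(rng.geometric(0.2))
    if any(c in ('Heart Failure', 'Pneumonia') for c in conditions):
        los += int(rng.geometric(0.15))
    return los


def mean_los(admission_method, conditions=(), n=20_000):
    """Average length of stay over n simulated spells."""
    draws = [sample_los(admission_method, conditions) for _ in range(n)]
    return float(np.mean(draws))


elective_mean = mean_los('Elective')
emergency_mean = mean_los('Emergency')
emergency_hf_mean = mean_los('Emergency', ['Heart Failure'])

print(f"Elective: {elective_mean:.2f} days")
print(f"Emergency: {emergency_mean:.2f} days")
print(f"Emergency with heart failure: {emergency_hf_mean:.2f} days")
```

One thing this makes obvious: a geometric draw is never zero, so every elective or emergency patient stays at least one day. That fits, since same-day stays are handled separately as day cases.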

A&E (The Emergency Department)

We all know A&E gets busier at certain times. Our generator reflects this - more people show up after work hours, fewer at 3 AM. We also model how long people wait to be seen (about an hour on average) and how long their treatment takes (roughly two hours).
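
Here's a rough sketch of how that could look in code. To be clear about what's assumed: the hourly weights are made-up numbers that just follow the shape described above (quiet overnight, busiest in the early evening), and I've picked exponential distributions for the waiting and treatment times because they're the simplest way to hit a mean of 60 and 120 minutes. The function name generate_ae_attendance and the field names are hypothetical too.

```python
import numpy as np

rng = np.random.default_rng(7)  # seeded so runs are repeatable

# Hypothetical relative arrival weights for each hour of the day (0-23):
# quietest in the small hours, busiest in the early evening.
hourly_weights = np.array([
    2, 2, 1, 1, 1, 2,     # 00:00-05:59
    3, 4, 5, 6, 6, 6,     # 06:00-11:59
    6, 6, 6, 7, 8, 9,     # 12:00-17:59
    10, 10, 9, 7, 5, 3,   # 18:00-23:59
], dtype=float)
hourly_probs = hourly_weights / hourly_weights.sum()


def generate_ae_attendance():
    """Generate one A&E attendance with an arrival hour and two durations."""
    arrival_hour = int(rng.choice(24, p=hourly_probs))
    # Assumed distributions: exponential with means of 60 and 120 minutes
    wait_minutes = float(rng.exponential(60))
    treatment_minutes = float(rng.exponential(120))
    return {
        'arrival_hour': arrival_hour,
        'wait_minutes': wait_minutes,
        'treatment_minutes': treatment_minutes,
    }


attendances = [generate_ae_attendance() for _ in range(10_000)]

mean_wait = np.mean([a['wait_minutes'] for a in attendances])
mean_treatment = np.mean([a['treatment_minutes'] for a in attendances])
print(f"Mean wait: {mean_wait:.0f} min, mean treatment: {mean_treatment:.0f} min")
```

Swapping the exponential for something like a gamma or log-normal distribution would give fewer near-zero waits, which may be closer to real life. The weights table is the easy place to tune the daily pattern.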

Outpatient Visits

If you've ever had a chronic condition, you know the drill - regular check-ups. That's why our diabetic patients tend to have about 4 appointments a year, while others might only need 1 or 2. And yes, we've included those frustrating appointment no-shows (DNA - Did Not Attend) that keep hospital administrators up at night!
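
A minimal sketch of that idea, with a few assumptions spelled out: I'm drawing the yearly appointment count from a Poisson distribution (mean 4 for diabetic patients, 1.5 for everyone else to sit in that "1 or 2" range), and the 8% DNA rate is a placeholder I picked for illustration, not a figure from the generator. The names generate_outpatient_year and DNA_RATE are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(11)  # seeded so runs are repeatable

DNA_RATE = 0.08  # hypothetical no-show probability, chosen for illustration


def generate_outpatient_year(has_diabetes):
    """Return one patient's outpatient appointment outcomes for a year."""
    # Assumed means: about 4 a year for diabetic patients, 1.5 otherwise
    mean_appointments = 4.0 if has_diabetes else 1.5
    n_appointments = int(rng.poisson(mean_appointments))
    # Each appointment is independently missed with probability DNA_RATE
    attended = rng.random(n_appointments) >= DNA_RATE
    return ['Attended' if a else 'DNA' for a in attended]


diabetic_years = [generate_outpatient_year(True) for _ in range(5_000)]
other_years = [generate_outpatient_year(False) for _ in range(5_000)]

diabetic_mean = np.mean([len(year) for year in diabetic_years])
other_mean = np.mean([len(year) for year in other_years])
print(f"Diabetic: {diabetic_mean:.2f} appointments/year, others: {other_mean:.2f}")
```

Because the count is random, some patients will have zero appointments in a given year and a few will have many more than the average, which is usually what you want from synthetic data.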

What's Next?

While our generator is pretty good at creating believable hospital data, there's still more we could add.

Remember, good synthetic data isn't just about getting the numbers right - it's about telling believable patient stories. Each generated record represents a journey through the healthcare system, and we want that journey to make clinical sense.

What aspects of healthcare data would you like to see me model next?

Ben Terry