A modern hospital generates more imaging in a year than every radiograph it produced in its first three decades. A single FASTQ run from a sequencing lab is 100–300 GB before alignment. An orthopaedic-implant manufacturer carries a 25-year follow-up obligation against every device shipped. A clinical-trial dataset is retained for the lifetime of the regulatory submission plus another decade.
Medical estates retain everything, forever, by regulation. The cost curve is unforgiving, and the conventional answer (“move it to the cheap cold tier”) breaks the day someone asks a clinical question that needs the data hot again. Storage costs compound. Imaging archives migrate from one PACS vendor to another every 5–7 years and lose provenance in transit. Sequencing labs throw away results because the storage bill is bigger than the cost of the experiment.
On top of that, the data is structurally siloed. Imaging sits in PACS. EHR sits in the hospital information system. Genomics sits in a lab LIMS. Device-tracking logs sit in the manufacturer's warranty database. Clinical-trial assays sit with the CRO. Every cross-system question (“how did this implant's outcome correlate with this patient's imaging history?”) costs a quarter and three analysts.
Public LLM tools are non-starters. Patient data crosses regulatory, ethical, and contractual boundaries the moment it leaves the hospital perimeter. The medical estates that most need this capability are the very ones structurally blocked from the off-the-shelf product.