Systematic review maps clinical impact of large language models in medicine
A large-scale review assisted by a large language model finds thousands of clinical medicine papers on generative models published since 2022, but only a small minority use real-world patient data or randomized trials. The study highlights overreliance on exam-style benchmarks, closed-source systems, and small samples, and proposes a tiered roadmap for more rigorous clinical evaluation.