
LLMs turn human reports into predictive power for crises
Large language models are turning free‑form reports and old news coverage into structured data for flood, disaster and health forecasting, and Google’s new Groundsource project is a high‑stakes test case.
Google is quietly turning decades of human-written flood reports into something computers can reason over, and in the process showing how large language models could convert messy narrative text into life‑saving prediction systems across disasters and public health.
In a new project called Groundsource, Google used its Gemini model to analyze decades of public reports and news coverage, extracting more than 2.6 million historical flood events spanning over 150 countries, then feeding that structured dataset into its Flood Hub forecasting platform for urban flash floods. The result is a model that can now warn at‑risk communities about sudden, small‑basin floods that were previously far harder to predict in real time, according to a Google crisis response post shared this week.
From anecdotes to actionable features
The Groundsource work extends Google’s earlier AI flood forecasting, which already uses physics models and satellite data to forecast riverine floods up to seven days in advance for over 80 countries, covering roughly 2 billion people, as detailed in a 2022 paper in Nature and an accompanying Google blog post on global flood forecasting (Google). Those models struggled most with urban flash floods, where local drainage, informal infrastructure and incomplete gauges make traditional hydrological data sparse.
By systematically mining narrative descriptions in everything from local news to government situation reports, Groundsource creates a far denser map of where floods have actually occurred — down to specific neighborhoods and streets — and under what conditions. Google says the approach transforms “public information into a high-quality record of historical disaster data,” which can be fused with weather and terrain models to generate finer‑grained flash‑flood risk scores exposed via Flood Hub and Google’s alerting tools (Google).
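In outline, the structuring step looks something like the sketch below: a prompt asks a model to return one JSON record per flood event mentioned in a report, and the response is parsed and lightly filtered before it can be joined with weather and terrain features. This is a minimal illustration under assumed names; the schema fields, the `call_llm` placeholder and the sample report are not Google’s actual pipeline or data.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Illustrative schema only; Google's internal Groundsource schema is not public.
@dataclass
class FloodEvent:
    location: str          # neighborhood or district named in the report
    country: str
    date: str              # ISO date if the report gives one, else ""
    severity: str          # "minor" | "moderate" | "severe"
    reported_cause: Optional[str] = None

EXTRACTION_PROMPT = """You are extracting flood events from a news report.
Return ONLY a JSON array with one object per distinct flood event, using keys:
location, country, date (ISO 8601 or empty string),
severity (minor|moderate|severe), reported_cause.

Report:
{report_text}
"""

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (Gemini, GPT, etc.).
    Returns a canned response so the sketch runs end to end."""
    return json.dumps([{
        "location": "Riverside district",
        "country": "Kenya",
        "date": "2018-04-23",
        "severity": "severe",
        "reported_cause": "overnight cloudburst overwhelming storm drains",
    }])

def extract_events(report_text: str) -> list[FloodEvent]:
    raw = call_llm(EXTRACTION_PROMPT.format(report_text=report_text))
    events = []
    for item in json.loads(raw):
        # Drop records missing the fields a forecasting model would need.
        if item.get("location") and item.get("country"):
            events.append(FloodEvent(**item))
    return events

if __name__ == "__main__":
    report = ("Heavy overnight rain flooded the Riverside district on "
              "23 April 2018, submerging roads and dozens of homes.")
    for event in extract_events(report):
        print(event)
```

Forcing the model into a small, fixed schema is what makes decades of prose comparable across countries and sources; records that miss required fields are dropped or routed back for review rather than passed downstream.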
Academic work is converging on the same pattern: use large language models as a structuring layer between human text and prediction algorithms. A recent study in npj Digital Medicine found that fine‑tuned LLMs can convert free‑text radiology reports into structured labels that significantly improve downstream models for clinical research and patient outcomes (Nature). Another team built an LLM‑driven framework to transform humanitarian situation reports into machine‑readable event databases, enabling faster crisis dashboards across 13 disasters using more than 1,100 documents (arXiv).
A new data supply for forecasting crises
Disaster researchers are beginning to treat news archives and social media as a kind of global sensor network. A 2024 review in Natural Hazards catalogued how natural language processing has been used to extract impact signals from weather reports and online posts to improve early warnings for extreme rainfall events (Springer). More recently, a study in the International Journal of Data Science and Analytics demonstrated an LLM‑based pipeline that turns local news headlines into a structured repository of disaster events, then feeds that data into a web app that predicts future hazards and suggests precautions by region (Springer).
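The “repository to prediction” step can start out very simply: once events are structured, historical frequency by region and month already yields a baseline seasonal risk signal. The toy sketch below assumes records shaped like the extraction example above; the scoring rule is illustrative and not the method of the cited study.

```python
from collections import Counter, defaultdict

# Toy structured repository: (region, month) pairs taken from extracted events.
events = [
    ("Riverside district, Kenya", 4),
    ("Riverside district, Kenya", 4),
    ("Riverside district, Kenya", 11),
    ("Old Town, Kenya", 11),
]

def seasonal_risk(events, region, month):
    """Fraction of a region's recorded floods that fell in the given month.
    A crude frequency-based baseline, not the cited study's model."""
    by_region = defaultdict(Counter)
    for r, m in events:
        by_region[r][m] += 1
    total = sum(by_region[region].values())
    return by_region[region][month] / total if total else 0.0

print(seasonal_risk(events, "Riverside district, Kenya", 4))   # ~0.67
print(seasonal_risk(events, "Riverside district, Kenya", 7))   # 0.0
```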
Public health is following a similar trajectory. Research on automated disease surveillance now routinely taps transformer‑based models, such as BERT‑ and GPT‑style systems, to classify news reports, social posts and clinical notes into structured outbreak indicators, boosting the timeliness of flu and COVID‑19 trend detection compared with lab data alone (Journal of AI and Data Mining). Clinicians are also experimenting with LLMs to mine electronic health records for risk factors that can power sepsis and cancer prediction models that were previously limited by coding gaps and inconsistent note‑taking (JMIR).
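The surveillance version of the pattern is the same loop with a classifier in the middle: label each incoming report, then aggregate the labels into a time series that epidemiologists can track. In the hedged sketch below, a keyword rule stands in for a real BERT‑ or GPT‑style classifier, and the label set is an assumption for illustration.

```python
from collections import Counter
from datetime import date

def classify(text: str) -> str:
    """Placeholder for a transformer-based classifier.
    A keyword rule stands in so the sketch runs; labels are illustrative."""
    lowered = text.lower()
    if "confirmed" in lowered:
        return "confirmed_case"
    if "outbreak" in lowered or "cluster" in lowered:
        return "suspected_outbreak"
    return "unrelated"

# (publication date, free-text snippet) pairs from news or clinical notes.
reports = [
    (date(2024, 1, 3), "Hospital notes a cluster of flu-like illness."),
    (date(2024, 1, 4), "Lab confirmed influenza A in three patients."),
    (date(2024, 1, 5), "City council debates road repairs."),
]

# Roll classified reports into a weekly surveillance signal.
weekly = Counter()
for published, text in reports:
    label = classify(text)
    if label != "unrelated":
        weekly[(published.isocalendar().week, label)] += 1

print(dict(weekly))
```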
The reliability and governance gap
The pivot from anecdotes to datasets comes with sharp risks. Generative models can hallucinate or misclassify events, especially when source text is ambiguous, biased or translated, and several recent studies on LLM‑driven knowledge graph construction for earthquakes and other hazards note persistent issues with inconsistency and precision that require human verification and careful prompt design (Taylor & Francis). A 2024 ecology paper evaluating LLMs for extracting species data from unstructured field reports likewise warned that higher recall often came at the expense of subtle factual errors that could skew downstream analyses (Elsevier).
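One common mitigation is to bolt cheap consistency checks onto the model’s output before anything reaches a forecast: does the extracted place name actually appear in the source text, does the date parse, is the label drawn from the allowed vocabulary? The sketch below shows the kind of checks involved; the specific rules and field names are illustrative assumptions, not a published system.

```python
from datetime import datetime

def needs_human_review(source_text: str, extracted: dict) -> list[str]:
    """Cheap consistency checks on one extracted record.
    Returns reasons to route the record to a human reviewer;
    the rules here are illustrative, not any specific pipeline."""
    problems = []

    # 1. Grounding: the extracted place name should appear in the source.
    location = extracted.get("location", "")
    if location and location.lower() not in source_text.lower():
        problems.append(f"location '{location}' not found in source text")

    # 2. Well-formedness: dates must parse and not lie in the future.
    raw_date = extracted.get("date", "")
    if raw_date:
        try:
            if datetime.fromisoformat(raw_date) > datetime.now():
                problems.append(f"date {raw_date} is in the future")
        except ValueError:
            problems.append(f"date '{raw_date}' is not ISO 8601")

    # 3. Closed vocabulary: severity must come from the allowed label set.
    if extracted.get("severity") not in {"minor", "moderate", "severe"}:
        problems.append("severity outside allowed labels")

    return problems

record = {"location": "Riverside district", "date": "2018-04-32",
          "severity": "catastrophic"}
text = "Heavy rain flooded the Riverside district on 23 April 2018."
print(needs_human_review(text, record))
```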
Governments and NGOs now face a dual challenge: embracing LLM‑based structuring to unlock predictive power from the world’s narrative archives, while building validation pipelines, transparency standards and privacy protections robust enough for life‑or‑death use cases. A recent survey of LLMs in disaster management, published in Findings of ACL, argued that models must be tightly integrated with domain experts, uncertainty estimates and traditional physics‑based systems to avoid over‑reliance on any single AI signal (ACL Anthology).
If Groundsource and its academic cousins prove reliable at scale, they point to a future where the raw material of reports — from a nurse’s note to a local newspaper’s flood story — does not just describe crises after the fact but actively helps forecast them. The question is not whether LLMs can turn words into numbers, but how quickly institutions can turn that new data supply into trusted, accountable decisions when the water is already rising.