Evergreen | May 8, 2026

GDELT and Alternative Data in Commodity Markets: How News Flow Becomes a Mineral Volatility Signal

Cobalt, Nickel, Lithium, Copper
Volterra processes 96 GDELT GKG files daily; mean AUC 0.815

Traditional commodity volatility models rely on price history, term structure, and positioning data. These inputs capture what markets have already priced. The informational edge in mineral volatility forecasting lies in capturing what markets have not yet priced: shifts in geopolitical narrative, supply disruption reports, and regulatory announcements that can precede realized volatility spikes by days or weeks. The GDELT Global Knowledge Graph is one of the largest open-access event and news datasets in the world, and it provides the raw material for converting unstructured global news into quantitative trading signals.

What GDELT Captures and Why It Matters for Minerals

The GDELT project monitors broadcast, print, and online news in over 100 languages, updating every 15 minutes. Its Global Knowledge Graph (GKG) layer extracts entities, themes, tones, and geographic references from each article, producing structured records that can be aggregated and filtered programmatically. GDELT processes over 300,000 articles per day across its full pipeline. For commodity applications, the relevant signal is not individual article content but rather shifts in volume, tone distribution, and geographic clustering of coverage around specific supply chain nodes.
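The 15-minute cadence is what yields 96 update slots per day (24 hours times 4). A minimal sketch of enumerating those slots follows. The function name `gkg_slots` is hypothetical, and the `YYYYMMDDHHMMSS` key format is an assumption about how update files are timestamped that should be checked against the GDELT documentation before use.

```python
from datetime import datetime, timedelta


def gkg_slots(day: datetime) -> list[str]:
    """Return the 96 fifteen-minute slot keys for one calendar day.

    The YYYYMMDDHHMMSS key format is an assumption for illustration.
    """
    start = datetime(day.year, day.month, day.day)  # midnight of that day
    return [
        (start + timedelta(minutes=15 * i)).strftime("%Y%m%d%H%M%S")
        for i in range(24 * 4)
    ]


slots = gkg_slots(datetime(2026, 5, 8))
print(len(slots), slots[0], slots[-1])
# prints: 96 20260508000000 20260508234500
```

A systematic pipeline would typically iterate over these keys, fetch each file, and record which slots were missing so that gaps in coverage are not mistaken for quiet news days.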

Consider cobalt. When GDELT registers a spike in negative-tone articles geolocated to the DRC and tagged with mining or export themes, that pattern often precedes realized volatility increases on LME cobalt contracts. The signal is not sentiment in the naive NLP sense; it is narrative velocity, the rate at which a geopolitical or supply chain topic is accumulating global media attention relative to its baseline.

From Raw News to Model Features: The Engineering Problem

Ingesting GDELT at scale presents engineering challenges that limit its adoption among discretionary traders but make it well suited for systematic pipelines. The GDELT GKG produces 96 update files per day, each containing thousands of records with nested fields for themes, locations, persons, organizations, and tone dimensions. Extracting mineral-relevant signals requires entity resolution (mapping article themes to specific commodities), geographic filtering (isolating coverage of producing regions), and temporal aggregation (computing rolling z-scores of volume and tone).
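The three steps named above can be sketched on simplified records. This is an illustration, not Volterra's pipeline: the record layout, the theme labels, the region label `DRC`, and the `COMMODITY_RULES` mapping are all hypothetical, and real GKG rows have delimiter-packed nested fields that must be parsed before anything like this applies.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-parsed records (one per article).
records = [
    {"date": "2026-05-01", "themes": {"MINING"}, "regions": {"DRC"}, "tone": -4.0},
    {"date": "2026-05-01", "themes": {"EXPORT", "TAX"}, "regions": {"DRC"}, "tone": -2.0},
    {"date": "2026-05-01", "themes": {"MINING"}, "regions": {"CHL"}, "tone": 1.0},
    {"date": "2026-05-02", "themes": {"ELECTION"}, "regions": {"DRC"}, "tone": -1.0},
    {"date": "2026-05-02", "themes": {"MINING"}, "regions": {"DRC", "CHN"}, "tone": -6.0},
]

# Entity resolution, reduced to a lookup: which themes and producing
# regions count as "about" a commodity. Purely illustrative values.
COMMODITY_RULES = {
    "cobalt": {"themes": {"MINING", "EXPORT"}, "regions": {"DRC"}},
}


def daily_features(rows, rule):
    """Filter by theme and geography, then aggregate volume and tone per day."""
    tones_by_day = defaultdict(list)
    for row in rows:
        theme_hit = row["themes"] & rule["themes"]      # any shared theme
        region_hit = row["regions"] & rule["regions"]   # any shared region
        if theme_hit and region_hit:
            tones_by_day[row["date"]].append(row["tone"])
    return {
        day: {"volume": len(tones), "mean_tone": mean(tones)}
        for day, tones in sorted(tones_by_day.items())
    }


cobalt_daily = daily_features(records, COMMODITY_RULES["cobalt"])
print(cobalt_daily)
```

The daily series produced this way is the input to the rolling z-score step; keeping filtering and aggregation separate from scoring makes it easier to audit which articles drove a given reading.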

The Volterra model processes all 96 GDELT GKG files daily, extracting features that capture news intensity, tone dispersion, and geographic concentration of coverage for each of the 12 minerals in its coverage universe. These features enter an XGBoost classifier alongside supply chain concentration metrics such as the Herfindahl-Hirschman Index (HHI) and market microstructure variables from LME, COMEX, NYMEX, and SGX. Under walk-forward cross-validation, the model achieves a mean AUC of 0.815 across all minerals and horizons.
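Walk-forward validation means every test window lies strictly after the data used to fit the model. The sketch below generates expanding-window splits over time-ordered row indices. The function name, the fold sizing, and the `gap` parameter are assumptions for illustration; the source does not describe Volterra's actual fold layout. A gap is one common way to stop multi-day labels near the boundary from overlapping the test window.

```python
from typing import Iterator


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int, gap: int = 0
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges with training always before testing.

    Rows are assumed to be sorted by time. `gap` rows are dropped between
    the end of training and the start of testing.
    """
    if gap < 0 or min_train <= gap:
        raise ValueError("min_train must exceed a non-negative gap")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        test_start = min_train + k * test_size
        yield range(0, test_start - gap), range(test_start, test_start + test_size)


folds = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=60, gap=5))
for train_idx, test_idx in folds:
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

A mean AUC in this setting would be the average of the per-fold AUC scores, each computed only on rows the model had not seen during fitting.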

Why Narrative Velocity Outperforms Simple Sentiment

Most alternative data vendors sell sentiment scores: a single number representing whether coverage of an asset is positive or negative. For equities, this approach has some predictive value for short-horizon returns. For commodities, it largely fails: simple sentiment polarity is a weak predictor of mineral price volatility, because mineral markets respond to supply disruption risk, not to whether journalists describe price moves favorably. A mine collapse and a capacity expansion announcement can both carry negative tone scores, but their volatility implications differ in both direction and magnitude.

What works instead is measuring the rate of change in thematic news coverage relative to a rolling baseline. Volterra's GDELT features capture this narrative velocity rather than polarity. When articles tagged with "export ban" themes and geolocated to Indonesia increase from 3 per day to 45 per day, the absolute tone score matters far less than the volume acceleration itself, which is why narrative velocity is a stronger predictor of mineral volatility than sentiment polarity alone. This approach is consistent with the way volatility signals differ from price forecasts: the goal is not to predict direction but to estimate the probability that realized dispersion will exceed a given threshold.
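One way to make "volume relative to a rolling baseline" concrete is a z-score of each day's article count against the preceding days. The sketch below uses made-up counts that echo the 3-to-45 jump in the example. The function name, the 7-day window, and the floor of 1.0 on the standard deviation are illustrative assumptions, not Volterra's actual feature definition.

```python
from statistics import mean, pstdev


def narrative_velocity(counts, window=7, sigma_floor=1.0):
    """Z-score of each day's count against the preceding `window` days.

    The first `window` days have no full baseline and are returned as None.
    `sigma_floor` keeps a very quiet baseline from inflating the score.
    """
    scores = []
    for i, count in enumerate(counts):
        if i < window:
            scores.append(None)
            continue
        baseline = counts[i - window:i]          # excludes the current day
        sigma = max(pstdev(baseline), sigma_floor)
        scores.append((count - mean(baseline)) / sigma)
    return scores


# Hypothetical daily counts: about 3 articles per day, then a jump to 45.
daily_counts = [3, 2, 4, 3, 3, 2, 4, 3, 45]
velocity = narrative_velocity(daily_counts)
print(velocity[-2], velocity[-1])
# prints: 0.0 42.0
```

Note that tone never enters the calculation: the score reacts to the acceleration in coverage alone, which is the distinction the paragraph above draws between velocity and polarity.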

Combining News Flow with Structural Supply Data

GDELT features alone are insufficient for robust mineral volatility prediction. News data is noisy, episodic, and subject to media attention cycles that do not always correspond to fundamental supply-demand shifts. The Volterra pipeline therefore combines GDELT-derived news features with geographic supply concentration data and exchange-specific market signals to generate its 7-day, 14-day, and 30-day probability forecasts across five risk levels. The combination produces more robust volatility forecasts than any single data source. The interaction between narrative velocity and structural concentration is where the strongest signals emerge: a news spike about a mineral with an HHI above 0.25 carries different risk implications than the same spike for a mineral with diversified global production.
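The HHI is the sum of squared market shares, so on a 0-to-1 scale a single producer scores 1.0 and five equal producers score 0.2. The sketch below computes it from production volumes and pairs it with a news velocity score. The production figures, the function names, and the explicit product feature are hypothetical; a tree-based model can also learn such an interaction from the two inputs on its own.

```python
def hhi(production: dict[str, float]) -> float:
    """Herfindahl-Hirschman Index on a 0-1 scale from production volumes."""
    total = sum(production.values())
    if total <= 0:
        raise ValueError("total production must be positive")
    return sum((volume / total) ** 2 for volume in production.values())


def interaction_features(velocity_z: float, hhi_value: float, threshold: float = 0.25) -> dict:
    """Pair a news velocity score with supply concentration (illustrative only)."""
    return {
        "velocity_z": velocity_z,
        "hhi": hhi_value,
        "concentrated": hhi_value > threshold,
        "velocity_x_hhi": velocity_z * hhi_value,
    }


# Hypothetical production volumes by producer country.
concentrated = hhi({"A": 70, "B": 10, "C": 10, "D": 10})           # about 0.52
diversified = hhi({"A": 20, "B": 20, "C": 20, "D": 20, "E": 20})   # about 0.20

same_spike = 5.0  # identical news velocity for both minerals
print(interaction_features(same_spike, concentrated))
print(interaction_features(same_spike, diversified))
```

With the same velocity reading, the concentrated mineral is flagged and receives a larger interaction value, which mirrors the point that identical news spikes imply different risk depending on how concentrated supply is.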

Figures from the Volterra daily pipeline. Full historical backfill available on AWS Data Exchange.

The Volterra dataset delivers these composite signals daily for all 12 covered minerals. For systematic traders building alpha overlays or risk managers calibrating VaR thresholds, the combination of news-derived features with structural supply data offers a differentiated input that pure price-based models cannot replicate. The informational content of GDELT for commodity applications is not in any single article but in the aggregate statistical behavior of global news flow around supply chain nodes, and that behavior is measurable, forecastable, and tradeable.
