# gdelt-discover-and-drill

> Canonical GDELT Cloud workflow: summarize → search → drill, with over-filter recovery and graph traversal recipes

## Overview

The `gdelt-discover-and-drill` skill is the canonical GDELT Cloud workflow. Use it for any question about geopolitics, conflict, security, supply chain, sanctions, infrastructure, location risk, or narrative around named actors.

It teaches the three-stage pattern that separates aggregate metrics from citable evidence, plus the over-filter recovery sequence and validated graph traversals across Events, Stories, and Entities.

## Three-stage pattern

```
SUMMARIZE → SEARCH → DRILL
   ↓           ↓        ↓
  shape    candidates   evidence
```

* **Stage 1: Summarize for shape and metrics.** `summarize_events` or `summarize_stories` returns baseline volume, geographic and category clustering, and aggregate structured metrics per bucket (`significance`, `magnitude`, `systemic_importance`, `propagation_potential`, `market_sensitivity`, `confidence`). This stage is cheap and fast, and it gives you the quantitative picture before drilldown.
* **Stage 2: Search for citable candidates.** `search_stories` (narrative) or `search_events` (incidents) with a focused semantic `search` returns the GDELT Cloud public URLs needed for citation.
* **Stage 3: Drill for evidence.** Use `get_story_articles(story_id)` for the full article list, `get_entity(id)` to expand the entity graph, and `EXTRACT_WEB_PAGES` (via `web_research_tool_call`) for source text when direct quotes are needed.

Most citable analytical briefs need all three stages.
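The three stages can be sketched as one function. This is a minimal illustration, not the skill's actual implementation: `call_tool(name, **params)` is a hypothetical dispatcher standing in for however your agent invokes GDELT Cloud tools. It is also an assumption that `summarize_stories` accepts a `search` parameter and that each search result is a dict with a `story_id` key.

```python
from typing import Any, Callable


def summarize_search_drill(
    call_tool: Callable[..., Any],  # hypothetical: call_tool(name, **params)
    query: str,
    max_stories: int = 3,
) -> dict:
    """Run summarize -> search -> drill and keep each stage's output separate."""
    # Stage 1: aggregate shape and metrics (parameter names are assumptions).
    shape = call_tool("summarize_stories", search=query, group_by="category")

    # Stage 2: citable candidates from a focused semantic search.
    candidates = call_tool("search_stories", search=query)

    # Stage 3: drill into the first few candidates for article-level evidence.
    evidence = {}
    for story in candidates[:max_stories]:
        story_id = story["story_id"]  # assumed result field
        evidence[story_id] = call_tool("get_story_articles", story_id=story_id)

    return {"shape": shape, "candidates": candidates, "evidence": evidence}
```

Keeping `shape`, `candidates`, and `evidence` in separate keys mirrors the split between aggregate metrics and citable evidence.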

## The over-filter trap

Stacking `subcategory`, `country`, and a semantic `search` in a single query often returns sparse or empty results, even on well-covered topics. Relax the query in this order:

1. Drop `subcategory`, keep `category`.
2. Drop `country`; Stories often live at global scope even when the actor is national.
3. Switch axis: if you started on Events, try Stories.
4. Run `summarize_stories(group_by=category)` to see where the volume actually clusters.
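The recovery sequence can be expressed as a small planner that turns an over-filtered call into an ordered list of relaxed retries. This is a sketch under assumptions: `recovery_plan` is a hypothetical helper, and it assumes the filters are passed to the tools as keyword parameters named `subcategory`, `country`, `category`, and `search`.

```python
def recovery_plan(tool: str, params: dict) -> list[tuple[str, dict]]:
    """Return (tool, params) retries, most specific first."""
    plan = []
    current = dict(params)  # copy so the caller's dict is not mutated

    # 1. Drop subcategory, keep category.
    if "subcategory" in current:
        current.pop("subcategory")
        plan.append((tool, dict(current)))

    # 2. Drop country.
    if "country" in current:
        current.pop("country")
        plan.append((tool, dict(current)))

    # 3. Switch axis: Events -> Stories.
    if tool == "search_events":
        plan.append(("search_stories", dict(current)))

    # 4. Look at where the volume actually clusters.
    plan.append(("summarize_stories", {"group_by": "category"}))
    return plan
```

An agent would try each step in turn and stop at the first one that returns enough results.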

## Graph traversal recipes

GDELT Cloud data is a graph: Entities ↔ Stories ↔ Events.

* **Topic → who's involved**: `search_stories(search=…)` → harvest `entity_refs` → `get_entity(id)`.
* **Actor → what they did**: `search_entities(search=…)` → `get_entity(id)` → walk linked Events/Stories.
* **Story → primary evidence**: `search_stories` → `get_story_articles` → `EXTRACT_WEB_PAGES`.
* **Incident → market reaction**: `search_events` → take Event date/actor → pivot to `macro_finance.TIME_SERIES_DAILY_ADJUSTED`.
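As an illustration, the "Topic → who's involved" recipe might look like the sketch below. It assumes a hypothetical `call_tool(name, **params)` dispatcher, that `search_stories` returns a list of dicts, and that each story's `entity_refs` is a list of entity ids. None of these response shapes are confirmed here.

```python
from typing import Any, Callable


def who_is_involved(call_tool: Callable[..., Any], topic: str) -> dict:
    """Topic -> stories -> entity_refs -> expanded entities."""
    stories = call_tool("search_stories", search=topic)

    # Harvest entity ids in first-seen order, skipping duplicates.
    entity_ids = []
    for story in stories:
        for ref in story.get("entity_refs", []):  # assumed field shape
            if ref not in entity_ids:
                entity_ids.append(ref)

    # Expand each entity once.
    return {eid: call_tool("get_entity", id=eid) for eid in entity_ids}
```

Deduplicating before calling `get_entity` keeps the number of drill calls proportional to distinct entities rather than to story count.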

## Structured metrics

Quote the structured metrics in the output. A line such as "Significance 0.55, propagation\_potential 0.35, confidence 0.98" is the analyst's value-add over a generic agent with web search.

| Metric                  | Meaning                                                            |
| ----------------------- | ------------------------------------------------------------------ |
| `significance`          | Canonical importance (0–1).                                        |
| `magnitude`             | Event size on the structured scale.                                |
| `systemic_importance`   | Structural / systemic weight.                                      |
| `propagation_potential` | Likelihood of follow-on effects.                                   |
| `market_sensitivity`    | Likely market relevance.                                           |
| `goldstein_scale`       | Cooperation ↔ conflict valence (null for descriptive event types). |
| `confidence`            | LLM-coded reliability.                                             |
| `fatalities`            | Realized severity.                                                 |
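One way to produce a quotable metrics line from a record is sketched below. `format_metrics` is a hypothetical helper, and it assumes the record is a plain dict keyed by the metric names in the table, with `None` for metrics that are null or absent.

```python
METRIC_ORDER = [
    "significance",
    "magnitude",
    "systemic_importance",
    "propagation_potential",
    "market_sensitivity",
    "goldstein_scale",
    "confidence",
    "fatalities",
]


def format_metrics(record: dict) -> str:
    """Render the non-null metrics of a record as one quotable line."""
    parts = []
    for name in METRIC_ORDER:
        value = record.get(name)
        if value is None:
            continue  # e.g. goldstein_scale is null for descriptive event types
        label = name.capitalize() if not parts else name
        parts.append(f"{label} {value}")
    return ", ".join(parts)


print(format_metrics({
    "significance": 0.55,
    "propagation_potential": 0.35,
    "goldstein_scale": None,
    "confidence": 0.98,
}))
# prints: Significance 0.55, propagation_potential 0.35, confidence 0.98
```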

## Anchor for multi-surface tasks

When `macro-finance`, `prediction-markets`, or `multi-surface-synthesis` are also loaded, this skill produces the data they consume: dates, actors, locations, entity Wikipedia URLs, and significance-ranked Stories. Run GDELT first, harvest those fields, and hand them downstream.
