
# GDELT Discover-and-Drill

> The canonical three-stage workflow for GDELT Cloud analytical tasks

## When to use

Use this workflow for any question about geopolitics, conflict, security, supply chains, sanctions, infrastructure, location risk, or the narrative around named actors. It is the default GDELT Cloud workflow.

## The three stages

```
SUMMARIZE → SEARCH → DRILL
   ↓           ↓        ↓
  shape    candidates   evidence
```

1. **Summarize** with `summarize_events` or `summarize_stories` to see baseline volume, geographic and category clustering, and aggregate metrics. Cheap and fast.
2. **Search** with `search_events` or `search_stories` (with focused semantic `search`) to get the citable Story or Event records.
3. **Drill** with `get_story_articles`, `get_entity`, and `EXTRACT_WEB_PAGES` for the underlying article evidence and second-degree network.

## REST equivalent

```bash
# Stage 1 — shape
GET /api/v2/stories/summary?country=Lebanon&group_by=date&date_start=2026-04-06&date_end=2026-05-06

# Stage 2 — citable candidates
GET /api/v2/stories?country=Lebanon&search=energy%20infrastructure&sort=significance&limit=10

# Stage 3 — evidence
GET /api/v2/stories/{story_id}/articles?limit=25
GET /api/v2/entities/{entity_id}
```
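
The request lines above can be assembled programmatically. The following is a minimal sketch that only builds the URLs; `BASE_URL` and the `build_url` helper are hypothetical names introduced for illustration, the host is a placeholder, and authentication is not covered here.

```python
from urllib.parse import quote, urlencode

# Assumption: placeholder host. Replace with your actual GDELT Cloud API host.
BASE_URL = "https://api.example.invalid"


def build_url(path: str, **params) -> str:
    """Join BASE_URL and path, URL-encoding any non-None query parameters."""
    present = {key: value for key, value in params.items() if value is not None}
    # quote_via=quote encodes spaces as %20, matching the requests shown above.
    query = urlencode(present, quote_via=quote)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# Stage 1: shape
summary_url = build_url(
    "/api/v2/stories/summary",
    country="Lebanon",
    group_by="date",
    date_start="2026-04-06",
    date_end="2026-05-06",
)

# Stage 2: citable candidates
search_url = build_url(
    "/api/v2/stories",
    country="Lebanon",
    search="energy infrastructure",
    sort="significance",
    limit=10,
)
```

Keeping URL construction separate from the HTTP call makes it easy to log or inspect the exact query before spending a request on it.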

## MCP equivalent

```python
gdelt_cloud_tool_call(
    tool_name="summarize_stories",
    tool_arguments={"country": "Lebanon", "group_by": "date", "days": 30}
)

gdelt_cloud_tool_call(
    tool_name="search_stories",
    tool_arguments={
        "country": "Lebanon",
        "search": "energy infrastructure",
        "sort": "significance",
        "limit": 10,
    }
)

gdelt_cloud_tool_call(
    tool_name="get_story_articles",
    tool_arguments={"story_id": "{STORY_ID}", "limit": 25}
)
```

## The over-filter trap

Combining `subcategory` + `country` + semantic `search` often returns sparse or empty results, even on well-covered topics. If a query comes back empty, relax it in this order:

1. Drop `subcategory`, keep `category`.
2. Drop `country`: Stories often live globally even when the actor is national.
3. Switch axis: if you started on Events, try Stories.
4. Run `summarize_stories` with `group_by` set to `category` to see where the volume actually clusters.
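
The first two steps above can be automated. The sketch below assumes a hypothetical `search` callable that wraps `search_stories` or `search_events`, accepts filters as keyword arguments, and returns a list; `search_with_relaxation` and `RELAX_ORDER` are illustrative names, not part of GDELT Cloud. Switching axis and re-summarizing are left as manual steps.

```python
from typing import Any, Callable

# Filters to drop, in the order the list above suggests.
RELAX_ORDER = ("subcategory", "country")


def search_with_relaxation(
    search: Callable[..., list],
    filters: dict[str, Any],
) -> tuple[list, dict[str, Any]]:
    """Retry `search` with progressively fewer filters until it returns results.

    Returns the results together with the filters that produced them, so the
    caller can report which constraints had to be dropped.
    """
    current = dict(filters)  # copy so the caller's dict is not mutated
    results = search(**current)
    for key in RELAX_ORDER:
        if results:
            break
        if key in current:
            del current[key]
            results = search(**current)
    return results, current
```

Returning the surviving filters alongside the results matters for citation: an answer built on a query without `country` should say so.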

## Graph traversal

GDELT Cloud is a graph: Entities ↔ Stories ↔ Events. Most non-trivial questions need 2–3 hops:

* **Topic → who's involved**: `search_stories` → harvest `entity_refs` → `get_entity`.
* **Actor → what they did**: `search_entities` → `get_entity` → walk linked Events/Stories.
* **Story → primary evidence**: `search_stories` → `get_story_articles` → `EXTRACT_WEB_PAGES`.
* **Incident → market reaction**: `search_events` → take date/actor → pivot to `macro_finance.TIME_SERIES_DAILY_ADJUSTED`.
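
As an illustration of the first pattern above (Topic → who's involved), here is a minimal sketch of the three hops. It takes a `call_tool(tool_name, tool_arguments)` callable so it can wrap whatever MCP client you use. The response shapes are assumptions, not confirmed by this page: `search_stories` is assumed to return a list of dicts whose `entity_refs` entries carry an `id`, and `get_entity` is assumed to take an `entity_id` argument.

```python
from typing import Any, Callable


def entities_for_topic(
    call_tool: Callable[[str, dict], Any],
    query: str,
    limit: int = 10,
) -> dict[str, Any]:
    """Hop 1: search_stories. Hop 2: harvest entity_refs. Hop 3: get_entity."""
    stories = call_tool(
        "search_stories",
        {"search": query, "sort": "significance", "limit": limit},
    )

    # Assumed shape: each story has an `entity_refs` list of {"id": ...} dicts.
    entity_ids: list[str] = []
    for story in stories:
        for ref in story.get("entity_refs", []):
            if ref["id"] not in entity_ids:  # de-duplicate, keep first-seen order
                entity_ids.append(ref["id"])

    # Assumed argument name: `entity_id`.
    return {eid: call_tool("get_entity", {"entity_id": eid}) for eid in entity_ids}
```

De-duplicating before the third hop keeps the number of `get_entity` calls proportional to distinct actors, not to story count.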

## Cite the structured metric

Every Event and Story carries scores: `significance`, `magnitude`, `systemic_importance`, `propagation_potential`, `market_sensitivity`, `confidence`. Quote them in output — *"significance 0.55, propagation\_potential 0.35, confidence 0.98"* — instead of *"this seemed important."* The numbers are the analyst's value-add.
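
A small helper can make this habit automatic. The sketch below assumes the scores arrive as numeric top-level keys on an Event or Story dict; `cite_scores` and `SCORE_FIELDS` are illustrative names, and the two-decimal formatting is a presentation choice.

```python
# Score names as listed above, in the order they will be cited.
SCORE_FIELDS = (
    "significance",
    "magnitude",
    "systemic_importance",
    "propagation_potential",
    "market_sensitivity",
    "confidence",
)


def cite_scores(record: dict) -> str:
    """Render whichever score fields are present as 'name value, name value'."""
    parts = [
        f"{name} {record[name]:.2f}"
        for name in SCORE_FIELDS
        if record.get(name) is not None  # skip missing or null scores
    ]
    return ", ".join(parts)


print(cite_scores({"significance": 0.55, "propagation_potential": 0.35, "confidence": 0.98}))
# prints: significance 0.55, propagation_potential 0.35, confidence 0.98
```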
