Skip to main content

Overview

GDELT Cloud provides access to 10 GDELT database tables covering structured events, raw article metadata, entity extractions, and historical tracking. Each table is optimized for specific query patterns and use cases.
Access schemas programmatically using the get_resource tool or via MCP resource URIs (gdelt://schema/*).

Table selection guide

Choose tables based on your question type:

Events

gdelt_eventsWHO-WHAT-WHERE-WHEN structured eventsSpeed: < 1s

GKG

gdelt_gkgRaw article metadata (use extracted tables instead)Speed: SLOW

Mentions

gdelt_mentionsEvent-to-article bridge tableCoverage analysis

Stats (View)

mv_event_mention_statsPre-aggregated trendingSpeed: < 0.5s (fastest)

Themes

gdelt_gkg_themes_extractedTopic/theme filteringENV_, ECON_, etc.

Persons

gdelt_gkg_persons_extractedPerson mentions in articles30-day window

Organizations

gdelt_gkg_organizations_extractedOrganization mentions30-day window

Locations

gdelt_gkg_locations_extractedGeographic mentionsWith coordinates

Persons Master

gdelt_persons_masterHistorical person trackingAll-time data

Orgs Master

gdelt_organizations_masterHistorical org trackingAll-time data

Core tables

gdelt_events

The primary table for structured event data with WHO-WHAT-WHERE-WHEN information.
Primary keys:
  • event_id (UInt64) - Unique event identifier
  • day (Date) - Event date for partitioning
Actor fields:
  • actor1_name, actor2_name (String) - Actor names
  • actor1_country_code, actor2_country_code (FixedString(3)) - ISO-3 country codes
  • actor*_type1_code, actor*_type2_code, actor*_type3_code (FixedString(3)) - CAMEO type codes
  • actor*_known_group_code (String) - Known organization codes
  • actor*_ethnic_code, actor*_religion1_code (String) - Identity codes
Event fields:
  • event_code (FixedString(4)) - Full CAMEO event code
  • event_base_code (FixedString(3)) - Base event code
  • event_root_code (FixedString(2)) - Root category (01-20)
  • goldstein_scale (Float32) - Event intensity (-10 to +10)
  • quad_class (Int8) - Quad classification (1-4)
  • num_mentions (UInt32) - Number of source articles
  • num_sources (UInt32) - Number of unique sources
  • avg_tone (Float32) - Average sentiment tone
Location fields:
  • action_geo_country_code, actor1_geo_country_code, actor2_geo_country_code (FixedString(2)) - FIPS country codes
  • action_geo_latitude, action_geo_longitude (Float32) - Geographic coordinates
  • action_geo_full_name (String) - Full location name
Metadata:
  • date_added (DateTime) - When event was added
  • source_url (String) - Source article URL
Performance: Very fast (< 1s for 7-30 day queries)Resource URI: gdelt://schema/events
Use cases:
  • WHO: Track specific actors or countries (military, government, organizations)
  • WHAT: Filter by event types (protests, violence, cooperation, statements)
  • WHERE: Geographic analysis by location or country
  • WHEN: Time series analysis of event patterns
  • HOW INTENSE: Filter by Goldstein scale for conflict/cooperation levels
Example query:
SELECT 
    day, 
    actor1_name, 
    actor2_name, 
    event_code,
    goldstein_scale,
    source_url
FROM gdelt_events
WHERE day >= today() - INTERVAL 7 DAY
    AND actor1_country_code = 'USA'
    AND event_root_code = '14'
ORDER BY day DESC
LIMIT 100

gdelt_mentions

Bridge table connecting events to GKG articles. One event can have many mentions across different sources.
Primary keys:
  • event_id (UInt64) - Links to gdelt_events
  • mention_identifier (String) - Unique mention identifier
  • day (Date) - Mention date for partitioning
Mention fields:
  • mention_time_date (DateTime) - When article was published
  • mention_type (UInt8) - Type of mention (1=web, 2=citation, etc.)
  • mention_source_name (String) - Source publication name
  • mention_doc_len (UInt32) - Document length in characters
  • mention_doc_tone (Float32) - Document sentiment tone
  • confidence (UInt8) - Confidence score (0-100)
Metadata:
  • event_date (DateTime) - Original event date from gdelt_events
Performance: Fast for specific event lookups, slower for broad date scansResource URI: gdelt://schema/mentions
Use cases:
  • Coverage analysis: Count mentions per event
  • Source tracking: Find which outlets covered an event
  • Temporal tracking: Track when coverage peaked
  • Join events to GKG: Bridge to themes, entities, locations
Example query:
SELECT 
    e.event_id,
    e.day,
    e.actor1_name,
    COUNT(m.mention_identifier) as mention_count,
    COUNT(DISTINCT m.mention_source_name) as source_count
FROM gdelt_events e
LEFT JOIN gdelt_mentions m 
    ON e.event_id = m.event_id
WHERE e.day >= today() - INTERVAL 7 DAY
    AND e.event_root_code = '14'
GROUP BY e.event_id, e.day, e.actor1_name
ORDER BY mention_count DESC
LIMIT 100

gdelt_gkg

The raw Global Knowledge Graph (GKG) table containing article-level metadata including themes, entities, locations, tone, and more. This table is SLOW - use the extracted tables instead for most queries.
The gdelt_gkg table contains semicolon-delimited strings that require parsing. For 10-100x faster queries, use the extracted tables: gdelt_gkg_themes_extracted, gdelt_gkg_persons_extracted, gdelt_gkg_organizations_extracted, or gdelt_gkg_locations_extracted.
Primary keys:
  • gkg_record_id (String) - Unique GKG record identifier
  • date (DateTime) - Article publication date and time
  • document_identifier (String) - Source article URL/identifier
Source fields:
  • source_collection_identifier (UInt8) - Collection ID (1=web, 2=print)
  • source_common_name (String) - Source publication name
Entity fields (semicolon-delimited strings):
  • v1_themes, v2_themes (String) - Theme codes (ENV_, ECON_, etc.)
  • v1_persons, v2_persons (String) - Person mentions
  • v1_organizations, v2_organizations (String) - Organization mentions
  • v1_locations, v2_locations (String) - Location mentions with coordinates
Tone field (comma-delimited):
  • v1_5_tone (String) - Format: tone,positive,negative,polarity,activity,self_ref,word_count
Extended fields:
  • v1_counts, v2_counts (String) - Count codes (KILL#150#CIVILIAN#Syria)
  • v2_1_enhanced_dates (String) - Temporal references
  • v2_gcam (String) - Global Content Analysis Measures
  • v2_1_quotations (String) - Direct quotes with attribution
  • v2_1_sharing_image (String) - Primary sharing image URL
  • v2_extras_xml (String) - Extended metadata in XML
Metadata:
  • ingested_at (DateTime) - Ingestion timestamp
Performance: SLOW (requires parsing delimited strings), use extracted tables insteadResource URI: gdelt://schema/gkg
When to use gdelt_gkg:
  • ✅ Tone/sentiment analysis (simple field parsing)
  • ✅ Source comparison (indexed field)
  • ✅ Access to extended fields (v2_gcam, quotations, etc.)
  • ❌ Theme filtering (use gdelt_gkg_themes_extracted instead)
  • ❌ Person/org/location filtering (use extracted tables instead)
Example query (tone analysis):
SELECT 
    toDate(date) AS day,
    source_common_name,
    splitByChar(',', v1_5_tone)[1] AS tone_score,
    splitByChar(',', v1_5_tone)[7] AS word_count,
    document_identifier
FROM gdelt_gkg
WHERE date >= today() - INTERVAL 7 DAY
    AND source_common_name IN ('bbc.com', 'cnn.com')
ORDER BY day DESC
LIMIT 100
Example query (source comparison):
SELECT 
    source_common_name,
    COUNT() AS article_count,
    AVG(toFloat64(splitByChar(',', v1_5_tone)[1])) AS avg_tone
FROM gdelt_gkg
WHERE date >= today() - INTERVAL 7 DAY
    AND source_common_name IN ('bbc.com', 'aljazeera.com')
GROUP BY source_common_name
ORDER BY article_count DESC
The gdelt_gkg table links to gdelt_events via the gdelt_mentions bridge table using the document_identifier field.

Materialized views

mv_event_mention_stats

Pre-aggregated materialized view with coverage metrics per event. This is the fastest table for trending analysis.
Primary keys:
  • event_id (UInt64) - Unique event identifier
  • day (Date) - Event date
Aggregated metrics:
  • event_date (DateTime) - Event timestamp
  • mention_count (UInt32) - Total mentions
  • source_count (UInt32) - Unique source count
  • avg_tone (Float32) - Average sentiment across mentions
  • min_tone, max_tone (Float32) - Tone range
  • confidence_avg (Float32) - Average confidence score
Performance: Fastest (< 0.5s for most queries), pre-aggregatedResource URI: gdelt://schema/event-mention-stats
Use cases:
  • Trending/viral event detection: Find events with high mention counts
  • Breaking news: Identify rapidly growing stories
  • Source diversity: Filter by number of unique sources
  • Sentiment analysis: Find highly negative or positive events
Example query:
SELECT 
    event_id,
    event_date,
    mention_count,
    source_count,
    avg_tone
FROM mv_event_mention_stats
WHERE day = today() - 1
    AND mention_count >= 50
ORDER BY mention_count DESC
LIMIT 100

GKG extraction tables

gdelt_gkg_themes_extracted

Theme codes extracted from GKG (Global Knowledge Graph) articles. Use for topic-based filtering.
Primary keys:
  • document_identifier (String) - Unique article identifier
  • day (Date) - Article date for partitioning
Theme fields:
  • theme (String) - Theme code (ENV_, ECON_, CONFLICT_*, etc.)
  • char_offset (Int32) - Character position in article
Article fields:
  • date_time (DateTime) - Article publication time
  • source_common_name (String) - Source publication
  • tone (Float32) - Article sentiment tone
  • location_country_code (FixedString(2)) - Primary location country
Performance: Fast (1-2s for theme-specific queries)Resource URI: gdelt://schema/themes-extracted
Use cases:
  • Topic filtering: Find articles about specific themes
  • Theme trending: Track theme mentions over time
  • Sentiment by topic: Analyze tone for specific themes
  • Cross-reference: Join with events via mentions table
Available theme categories:
  • ENV_* - Environmental themes (climate, disasters, etc.)
  • ECON_* - Economic themes (trade, inflation, markets, etc.)
  • CONFLICT_* - Conflict themes (military, terrorism, etc.)
  • GOV_* - Governance themes (elections, policy, corruption, etc.)
  • HEALTH_* - Health themes (pandemic, policy, etc.)
  • WB_* - World Bank indicator themes
  • UNGP_* - UN Global Pulse themes
  • CRISISLEX_* - Crisis/disaster terms
Example query:
SELECT 
    day,
    theme,
    COUNT(*) as article_count,
    AVG(tone) as avg_tone
FROM gdelt_gkg_themes_extracted
WHERE day >= today() - INTERVAL 30 DAY
    AND theme LIKE 'ENV_CLIMATE%'
GROUP BY day, theme
ORDER BY day DESC, article_count DESC
LIMIT 500

gdelt_gkg_persons_extracted

Person names mentioned in articles. 30-day rolling window.
Primary keys:
  • document_identifier (String) - Article identifier
  • day (Date) - Article date
Person fields:
  • person_name (String) - Person’s name as mentioned
  • char_offset (Int32) - Position in article
Article fields:
  • date_time (DateTime) - Article publication time
  • source_common_name (String) - Source publication
  • tone (Float32) - Article sentiment
Performance: Moderate (2-4s for common names, faster for rare names)Resource URI: gdelt://schema/persons-extracted
Use cases:
  • Person tracking: Monitor mentions of specific individuals
  • Co-mentions: Find who is mentioned together
  • Sentiment tracking: Analyze tone of coverage for individuals
  • Geographic tracking: Where is a person being discussed
Example query:
SELECT 
    day,
    person_name,
    COUNT(*) as mention_count,
    AVG(tone) as avg_tone
FROM gdelt_gkg_persons_extracted
WHERE day >= today() - INTERVAL 7 DAY
    AND person_name IN ('Joe Biden', 'Xi Jinping', 'Vladimir Putin')
GROUP BY day, person_name
ORDER BY day DESC, mention_count DESC
LIMIT 200

gdelt_gkg_organizations_extracted

Organization names mentioned in articles. 30-day rolling window.
Primary keys:
  • document_identifier (String) - Article identifier
  • day (Date) - Article date
Organization fields:
  • organization_name (String) - Organization name as mentioned
  • char_offset (Int32) - Position in article
Article fields:
  • date_time (DateTime) - Article publication time
  • source_common_name (String) - Source publication
  • tone (Float32) - Article sentiment
Performance: Moderate (2-4s for common names)Resource URI: gdelt://schema/organizations-extracted
Use cases:
  • Organization tracking: Monitor companies, NGOs, institutions
  • Industry analysis: Track sector-specific organizations
  • Crisis monitoring: Identify organizations in crisis situations
  • Sentiment tracking: Analyze coverage tone for organizations
Example query:
SELECT 
    day,
    organization_name,
    COUNT(*) as mention_count,
    AVG(tone) as avg_tone
FROM gdelt_gkg_organizations_extracted
WHERE day >= today() - INTERVAL 7 DAY
    AND organization_name LIKE '%Tesla%'
GROUP BY day, organization_name
ORDER BY day DESC, mention_count DESC
LIMIT 100

gdelt_gkg_locations_extracted

Geographic locations mentioned in articles with coordinates. 30-day rolling window.
Primary keys:
  • document_identifier (String) - Article identifier
  • day (Date) - Article date
Location fields:
  • location_name (String) - Location name as mentioned
  • country_code (FixedString(2)) - FIPS country code
  • latitude, longitude (Float32) - Geographic coordinates
  • feature_id (String) - Geonames feature ID
  • char_offset (Int32) - Position in article
Article fields:
  • date_time (DateTime) - Article publication time
  • source_common_name (String) - Source publication
  • tone (Float32) - Article sentiment
Performance: Moderate (2-4s depending on specificity)Resource URI: gdelt://schema/locations-extracted
Use cases:
  • Geographic analysis: Map where events are being discussed
  • Country-level filtering: Focus on specific countries
  • Proximity analysis: Find articles about nearby locations
  • Regional trends: Track coverage by region
Example query:
SELECT 
    day,
    location_name,
    country_code,
    latitude,
    longitude,
    COUNT(*) as mention_count
FROM gdelt_gkg_locations_extracted
WHERE day >= today() - INTERVAL 7 DAY
    AND country_code = 'UK'
GROUP BY day, location_name, country_code, latitude, longitude
ORDER BY day DESC, mention_count DESC
LIMIT 200

Historical master tables

gdelt_persons_master

Historical person tracking across all time (no 30-day limit). Aggregated view of person mentions.
Primary key:
  • person_name (String) - Person’s name
Tracking fields:
  • first_seen (Date) - First mention date
  • last_seen (Date) - Most recent mention date
  • mention_count (UInt64) - Total lifetime mentions
  • avg_tone (Float32) - Average sentiment across all mentions
  • source_count (UInt32) - Unique sources mentioning
Performance: Fast for specific names, slower for broad scansResource URI: gdelt://schema/persons-master
Use cases:
  • Historical analysis: Track individuals over months/years
  • Comparative analysis: Compare coverage patterns across people
  • Trend identification: Find when someone entered the news
  • Lifetime metrics: Total coverage and sentiment
Example query:
SELECT 
    person_name,
    first_seen,
    last_seen,
    mention_count,
    avg_tone,
    source_count
FROM gdelt_persons_master
WHERE mention_count >= 1000
ORDER BY mention_count DESC
LIMIT 100

gdelt_organizations_master

Historical organization tracking across all time (no 30-day limit). Aggregated view of organization mentions.
Primary key:
  • organization_name (String) - Organization name
Tracking fields:
  • first_seen (Date) - First mention date
  • last_seen (Date) - Most recent mention date
  • mention_count (UInt64) - Total lifetime mentions
  • avg_tone (Float32) - Average sentiment across all mentions
  • source_count (UInt32) - Unique sources mentioning
Performance: Fast for specific names, slower for broad scansResource URI: gdelt://schema/organizations-master
Use cases:
  • Historical analysis: Track organizations over time
  • Crisis analysis: Identify coverage spikes
  • Reputation tracking: Monitor long-term sentiment
  • Industry analysis: Compare organizations
Example query:
SELECT 
    organization_name,
    first_seen,
    last_seen,
    mention_count,
    avg_tone,
    source_count
FROM gdelt_organizations_master
WHERE organization_name LIKE '%Bank%'
    AND mention_count >= 100
ORDER BY mention_count DESC
LIMIT 100

JOIN patterns

Events → Mentions → GKG

Connect events to articles and their extracted entities:
SELECT 
    e.event_id,
    e.day,
    e.actor1_name,
    m.mention_identifier,
    t.theme,
    p.person_name
FROM gdelt_events e
JOIN gdelt_mentions m 
    ON e.event_id = m.event_id
JOIN gdelt_gkg_themes_extracted t
    ON m.mention_identifier = t.document_identifier
    AND m.day = t.day
LEFT JOIN gdelt_gkg_persons_extracted p
    ON t.document_identifier = p.document_identifier
    AND t.day = p.day
WHERE e.day >= today() - INTERVAL 7 DAY
LIMIT 500

Events → Stats

Use pre-aggregated stats for faster trending analysis:
SELECT 
    e.event_id,
    e.day,
    e.actor1_name,
    e.event_code,
    s.mention_count,
    s.source_count,
    s.avg_tone
FROM gdelt_events e
JOIN mv_event_mention_stats s
    ON e.event_id = s.event_id
WHERE e.day >= today() - INTERVAL 1 DAY
    AND s.mention_count >= 50
ORDER BY s.mention_count DESC
LIMIT 100

Performance comparison

TableSpeedBest forDate limit
mv_event_mention_stats< 0.5sTrending events30 days
gdelt_events< 1sStructured queries30 days
gdelt_gkg_themes_extracted1-2sTopic filtering30 days
gdelt_mentions2-3sCoverage analysis30 days
gdelt_gkg_*_extracted2-4sEntity tracking30 days
gdelt_*_master2-5sHistorical analysisAll time
Start with the fastest table that meets your needs. Use mv_event_mention_stats for trending, gdelt_events for structured queries, and master tables for historical analysis.

Next steps