
Overview

The gdelt_system_prompt is the comprehensive system prompt that provides AI assistants with everything they need to effectively query GDELT data. It includes table overviews, code references, query rules, and workflow guidance.
This prompt requires no parameters and returns a complete system message ready for use with any LLM.

When to use

Use gdelt_system_prompt when:
  • Initializing an AI agent that will query GDELT data
  • Building a research assistant with GDELT capabilities
  • Creating a chatbot that needs GDELT expertise
  • Setting up automated GDELT analysis workflows

Prompt structure

The system prompt includes seven comprehensive sections:

1. Table guide & speeds

Overview of all 9 GDELT tables with:
  • Performance characteristics (< 0.5s to 2-5s)
  • Use case descriptions
  • Data coverage (30-day vs all-time)
  • When to use each table
Helps agents select the fastest table for each question type.

2. Table selection by question

Decision tree mapping question types to optimal tables:
  • WHO-WHAT-WHERE-WHEN → gdelt_events
  • Trending/viral → mv_event_mention_stats
  • Topic filtering → gdelt_gkg_themes_extracted
  • Entity tracking → gdelt_gkg_*_extracted tables
  • Historical analysis → gdelt_*_master tables
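
As a rough illustration of the decision tree above, the mapping can be written as a plain lookup. The category names (`"trending"`, `"topic"`, and so on) and the `select_table` helper are hypothetical labels chosen for this sketch; only the table names come from the prompt.

```python
# Hypothetical lookup mirroring the decision tree; the category keys are
# illustrative and are not defined by GDELT Cloud.
TABLE_BY_QUESTION = {
    "who_what_where_when": "gdelt_events",
    "trending": "mv_event_mention_stats",
    "topic": "gdelt_gkg_themes_extracted",
    "person": "gdelt_gkg_persons_extracted",
    "organization": "gdelt_gkg_organizations_extracted",
    "location": "gdelt_gkg_locations_extracted",
    "person_history": "gdelt_persons_master",
    "organization_history": "gdelt_organizations_master",
}


def select_table(question_type: str) -> str:
    """Return the table suggested for a question category."""
    try:
        return TABLE_BY_QUESTION[question_type]
    except KeyError:
        raise ValueError(f"unknown question type: {question_type!r}") from None
```

For example, `select_table("trending")` returns `"mv_event_mention_stats"`.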

3. Critical query rules

Non-negotiable requirements for all queries:
  • Date filter mandatory (prevents full table scans)
  • LIMIT clause mandatory (max 1000 rows)
  • Source URL column mandatory (enables citations)
  • Country code distinction (FIPS vs CAMEO)
These rules are enforced by validation. Non-compliant queries will fail.
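
A client-side pre-check can catch the first three rules before a query is sent. The sketch below is a regex heuristic written for illustration only: `check_query` is a hypothetical helper, it is not the backend validator, and it assumes the date columns are named `day` or `date_time` and the URL columns `source_url` or `document_identifier`, as in the examples on this page.

```python
import re

MAX_LIMIT = 1000  # maximum row count stated in the rules above

# A date column followed by a comparison, e.g. "day >=" or "e.day BETWEEN".
DATE_FILTER = re.compile(
    r"\b(?:\w+\.)?(?:day|date_time)\s*(?:>=|<=|<>|>|<|=|between\b)"
)
URL_COLUMN = re.compile(r"\b(?:source_url|document_identifier)\b")


def check_query(sql: str) -> list[str]:
    """Return rule violations found by a rough text scan (empty list = pass)."""
    text = sql.lower()
    problems = []

    # Rule 1: the WHERE clause must filter on a date column.
    where = re.search(r"\bwhere\b(.*)", text, re.S)
    if not where or not DATE_FILTER.search(where.group(1)):
        problems.append("missing date filter")

    # Rule 2: a LIMIT clause is required and may not exceed MAX_LIMIT.
    limit = re.search(r"\blimit\s+(\d+)\b", text)
    if not limit:
        problems.append("missing LIMIT clause")
    elif int(limit.group(1)) > MAX_LIMIT:
        problems.append(f"LIMIT exceeds {MAX_LIMIT}")

    # Rule 3: the SELECT list must include a source URL column.
    select = re.search(r"\bselect\b(.*?)\bfrom\b", text, re.S)
    if not select or not URL_COLUMN.search(select.group(1)):
        problems.append("missing source URL column in SELECT")

    return problems
```

A scan like this cannot replace server-side validation (it does not parse SQL, so subqueries and comments can fool it), but it gives an agent early feedback without a round trip.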

4. Code types summary

Quick reference to all 10 code systems:
  • CAMEO country (ISO-3) for actors
  • FIPS country (2-char) for locations
  • Actor types, known groups, ethnic, religious codes
  • Event codes (01-20 root taxonomy)
  • Goldstein scale (-10 to +10)
  • Theme taxonomy (ENV_, ECON_, etc.)
  • CrisisLex terms
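
The CAMEO/FIPS split is easy to get wrong, so a small helper that picks the code system from the column name may be useful. This is a sketch under assumptions: `country_filter` is hypothetical, and the mapping holds only the three pairs named in the prompt (USA/US, CHN/CH, GBR/UK); a real table would be loaded from the official code lists.

```python
# Only the three pairs named in the prompt; extend from the official code lists.
CAMEO_TO_FIPS = {"USA": "US", "CHN": "CH", "GBR": "UK"}

# Actor columns take CAMEO (ISO-3) codes; location columns take FIPS codes.
ACTOR_COLUMNS = {"actor1_country_code", "actor2_country_code"}


def country_filter(column: str, cameo_code: str) -> str:
    """Build a WHERE fragment using the code system that matches the column."""
    if cameo_code not in CAMEO_TO_FIPS:
        raise ValueError(f"no mapping for {cameo_code!r}")
    if column in ACTOR_COLUMNS:
        code = cameo_code  # WHO: keep the 3-letter CAMEO code
    elif column.endswith("geo_country_code") or column == "country_code":
        code = CAMEO_TO_FIPS[cameo_code]  # WHERE: switch to the 2-letter FIPS code
    else:
        raise ValueError(f"not a country code column: {column!r}")
    return f"{column} = '{code}'"
```

Note that China is `CHN` as an actor but `CH` as a location, which is exactly the kind of mismatch the distinction is meant to prevent.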

5. Goldstein scale natural language

Natural language mapping for event intensity:
  • -10 to -7: Violent conflict
  • -6 to -4: Non-violent conflict
  • -3 to -1: Mild conflict
  • 0: Neutral
  • +1 to +2: Mild cooperation
  • +3 to +6: Material cooperation
  • +7 to +10: Strong cooperation
Enables intuitive filtering like “find violent events” without memorizing numbers.
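
The bands above can be expressed as a small lookup function. `goldstein_label` is a hypothetical helper, and one assumption is made beyond the source: the bands are listed at integer boundaries, so fractional scores (for example -6.5) are assigned here to the more extreme neighbouring band.

```python
def goldstein_label(score: float) -> str:
    """Map a Goldstein score to the natural-language band listed above.

    Assumption: fractional values between two bands fall into the more
    extreme band (e.g. -6.5 counts as "violent conflict").
    """
    if not -10 <= score <= 10:
        raise ValueError("Goldstein scores range from -10 to +10")
    if score < -6:
        return "violent conflict"
    if score < -3:
        return "non-violent conflict"
    if score < 0:
        return "mild conflict"
    if score == 0:
        return "neutral"
    if score <= 2:
        return "mild cooperation"
    if score <= 6:
        return "material cooperation"
    return "strong cooperation"
```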

6. JOIN patterns

Standard patterns for multi-table queries:
  • Events → Mentions → GKG
  • Events → Stats (for trending)
  • Themes → Persons (for co-analysis)
Includes proper ON clauses and date alignment.

7. Query workflow

Three-step workflow for all query tasks:
  1. prepare_gdelt_query - Get schemas and codes
  2. execute_query OR present_sql - Run or validate
  3. Iterate based on results
This workflow ensures agents always have the context they need before querying.
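
The control flow of this workflow can be sketched with the tools passed in as plain callables. Everything here is hypothetical scaffolding: `run_gdelt_workflow`, the `build_query` callback, and the `{"ok": ..., "error": ...}` result shape are assumptions made for the sketch and are not the actual tool signatures or responses.

```python
from typing import Any, Callable, Optional


def run_gdelt_workflow(
    prepare: Callable[[list], Any],
    execute: Callable[[str], dict],
    tables: list,
    build_query: Callable[[Any, Optional[str]], str],
    max_attempts: int = 3,
) -> dict:
    """Prepare once, then execute and refine until a query succeeds."""
    # Step 1: fetch schemas and codes before writing any SQL.
    context = prepare(tables)

    feedback = None
    for _ in range(max_attempts):
        # Step 2: build a query from the context (and any earlier error), then run it.
        sql = build_query(context, feedback)
        result = execute(sql)
        if result.get("ok"):
            return result
        # Step 3: keep the validation error so the next attempt can correct it.
        feedback = result.get("error")

    raise RuntimeError(f"query still failing after {max_attempts} attempts: {feedback}")
```

In an agent, `build_query` would typically be the LLM call, while `prepare` and `execute` would wrap the corresponding MCP tools.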

Usage examples

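from langchain.agents import create_agent
from shared.mcp_config import get_mcp_client

# Run inside an async function. `llm`, `gdelt_tools`, and
# `your_additional_context` are assumed to be defined elsewhere.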

# Initialize MCP client
mcp_client = get_mcp_client(api_key="gdelt_sk_...")

# Fetch the system prompt
messages = await mcp_client.get_prompt(
    "gdelt-cloud",
    "gdelt_system_prompt"
)

# Messages is a list of Message objects
system_prompt_text = "\n\n".join(msg.content for msg in messages)

# Use in agent initialization
agent = create_agent(
    model=llm,
    system_prompt=system_prompt_text + "\n\n" + your_additional_context,
    tools=gdelt_tools
)

Prompt contents

1. Table overview

GDELT Cloud - 9 Tables Available

FASTEST TABLES (Query these first when possible):
• mv_event_mention_stats (< 0.5s) - Pre-aggregated viral/trending events
• gdelt_events (< 1s) - WHO-WHAT-WHERE-WHEN structured events

FAST TABLES:
• gdelt_gkg_themes_extracted (1-2s) - Topic/theme filtering
• gdelt_mentions (2-3s) - Event-to-article bridge

MODERATE TABLES (30-day window):
• gdelt_gkg_persons_extracted (2-4s) - Person mentions
• gdelt_gkg_organizations_extracted (2-4s) - Org mentions
• gdelt_gkg_locations_extracted (2-4s) - Location mentions w/ coords

HISTORICAL TABLES (All-time data, slower):
• gdelt_persons_master (2-5s) - Historical person tracking
• gdelt_organizations_master (2-5s) - Historical org tracking

2. Table selection logic

Question type to Table selection:

"What did [country/actor] do?" → gdelt_events
"Show me [event type] events" → gdelt_events
"Find protests/violence in [location]" → gdelt_events

"What's trending today?" → mv_event_mention_stats
"Show me viral events" → mv_event_mention_stats
"What has the most coverage?" → mv_event_mention_stats

"Find articles about [topic/theme]" → gdelt_gkg_themes_extracted
"Show me climate change news" → gdelt_gkg_themes_extracted

"Track mentions of [person]" → gdelt_gkg_persons_extracted
"Who is being discussed?" → gdelt_gkg_persons_extracted

"Find coverage of [company/org]" → gdelt_gkg_organizations_extracted

"Where is [topic] being discussed?" → gdelt_gkg_locations_extracted

"Historical analysis of [person]" → gdelt_persons_master
"Lifetime coverage of [organization]" → gdelt_organizations_master

3. Query rules (enforced)

MANDATORY REQUIREMENTS:

1. DATE FILTER - Always required
   ✓ WHERE day >= today() - INTERVAL 7 DAY
   ✓ WHERE date_time >= now() - INTERVAL 24 HOUR
   ✓ WHERE day BETWEEN '2025-01-01' AND '2025-01-31'

2. LIMIT CLAUSE - Always required (max 1000)
   ✓ LIMIT 100
   ✓ LIMIT 500
   ✓ LIMIT 1000

3. SOURCE URL - Always required in SELECT
   ✓ SELECT ..., source_url FROM gdelt_events
   ✓ SELECT ..., document_identifier FROM gdelt_gkg_*

CRITICAL DISTINCTION:
• CAMEO country codes (ISO-3): USA, CHN, GBR → for WHO (actors)
  Use with: actor1_country_code, actor2_country_code
  
• FIPS country codes (2-char): US, CH, UK → for WHERE (locations)
  Use with: action_geo_country_code, *_geo_country_code, country_code

NEVER MIX THESE UP!

4. Code reference summary

10 Code Types Available:

1. cameo_country_codes - ISO-3 for actor filtering (USA, CHN, GBR)
2. fips_country_codes - FIPS 2-char for location filtering (US, CH, UK)
3. cameo_type_codes - Actor types (GOV, MIL, COP, EDU, etc.)
4. cameo_known_groups - Organizations (NATO, UN, EU, etc.)
5. cameo_ethnic_codes - Ethnic groups (sparse, secondary filter)
6. cameo_religion_codes - Religious affiliation (sparse, secondary filter)
7. cameo_event_codes - Event taxonomy (01-20 root codes)
8. goldstein_scale - Event intensity (-10 to +10)
9. theme_gdelt_taxonomy - GKG themes (ENV_*, ECON_*, GOV_*, etc.)
10. theme_crisislex - Crisis/disaster terms

Get codes via prepare_gdelt_query flags or get_resource tool.

5. Goldstein scale mapping

Goldstein Scale Natural Language Mapping:

Conflict (negative):
• -10 to -7: "violent conflict" (fight, assault, mass violence)
• -6 to -4: "non-violent conflict" (threaten, coerce, sanction)
• -3 to -1: "mild conflict" (disapprove, reject, protest)

Neutral:
• 0: "neutral" (investigate, make statement)

Cooperation (positive):
• +1 to +2: "mild cooperation" (consult, appeal)
• +3 to +6: "material cooperation" (cooperate, provide aid)
• +7 to +10: "strong cooperation" (major agreements, alliances)

Query examples:
• "Find violent events" → WHERE goldstein_scale < -6
• "Show cooperation" → WHERE goldstein_scale >= 3
• "Neutral statements" → WHERE goldstein_scale BETWEEN -1 AND 1

6. JOIN patterns

Standard JOIN Patterns:

Events → Mentions → GKG Entities:
SELECT e.event_id, e.actor1_name, t.theme, p.person_name
FROM gdelt_events e
JOIN gdelt_mentions m ON e.event_id = m.event_id
JOIN gdelt_gkg_themes_extracted t 
    ON m.mention_identifier = t.document_identifier 
    AND m.day = t.day
LEFT JOIN gdelt_gkg_persons_extracted p
    ON t.document_identifier = p.document_identifier
    AND t.day = p.day
WHERE e.day >= today() - INTERVAL 7 DAY

Events → Stats (trending):
SELECT e.event_id, e.actor1_name, s.mention_count
FROM gdelt_events e
JOIN mv_event_mention_stats s ON e.event_id = s.event_id
WHERE e.day >= today() - INTERVAL 1 DAY

Always align ON day columns for performance!

7. Query workflow

ALWAYS FOLLOW THIS WORKFLOW:

Step 1: PREPARE
Call prepare_gdelt_query with:
- tables: List of tables you'll query
- code flags: Only the codes you need (don't request all)

Step 2: EXECUTE or PRESENT
For research/analysis:
  → execute_query(query="SELECT...", table_rationale="...", filter_rationale="...")

For alerts/saved queries:
  → present_sql(query="SELECT...", description="...", table_rationale="...", filter_rationale="...")

Step 3: ITERATE
- Review results/validation
- Refine query if needed
- Re-execute

NEVER skip prepare_gdelt_query - it prevents errors!

Integration patterns

LangChain agent

import os

from langchain_openai import ChatOpenAI
from langchain.agents import create_agent
from shared.mcp_config import get_mcp_client

# Fetch MCP tools and prompt
mcp_client = get_mcp_client(api_key=os.getenv("GDELT_API_KEY"))
gdelt_tools = await mcp_client.get_tools(server_name="gdelt-cloud")
gdelt_prompt_messages = await mcp_client.get_prompt("gdelt-cloud", "gdelt_system_prompt")
gdelt_core_prompt = "\n\n".join(msg.content for msg in gdelt_prompt_messages)

# Build system prompt
system_prompt = f"""
You are a GDELT research assistant.

{gdelt_core_prompt}

Additional instructions:
- Always cite sources using source_url
- Explain your table selection reasoning
- Provide context for results
"""

# Create agent
agent = create_agent(
    model=ChatOpenAI(model="gpt-4"),
    system_prompt=system_prompt,
    tools=gdelt_tools
)

Custom application

import anthropic

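# Fetch system prompt (run inside an async function; FastMCPClient must be
# imported from your MCP client library, which this example does not show)
async with FastMCPClient() as client: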
    await client.connect_http("https://gdelt-cloud-mcp.fastmcp.app/mcp", headers={"Authorization": "Bearer gdelt_sk_..."})
    prompt_result = await client.get_prompt("gdelt_system_prompt")
    system_text = "\n\n".join(msg.content for msg in prompt_result.messages)

# Use with Claude
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=4096,
    system=system_text,
    messages=[
        {"role": "user", "content": "Find protests in France last week"}
    ]
)

Best practices

  • Combine with your own context: The system prompt provides GDELT expertise. Add your own instructions for task-specific behavior (tone, output format, citations, etc.).
  • Update periodically: As GDELT Cloud evolves, the system prompt may be updated. Fetch it at agent initialization rather than hard-coding it.
  • Verify tool availability: Ensure agents have access to all 4 GDELT tools (prepare_gdelt_query, execute_query, present_sql, get_resource) along with the prompt.
  • Don’t override critical rules: The date filter, LIMIT, and source URL requirements are enforced by the backend. Don’t tell agents to skip them.
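
Tool availability can be checked with a set difference at startup. `missing_gdelt_tools` is a hypothetical helper; it assumes you can obtain the tool names as strings from whatever tool objects your MCP client returns.

```python
from typing import Iterable

# The four tool names listed in the best practices above.
REQUIRED_TOOLS = {"prepare_gdelt_query", "execute_query", "present_sql", "get_resource"}


def missing_gdelt_tools(tool_names: Iterable[str]) -> set:
    """Return the required GDELT tools that are absent from tool_names."""
    return REQUIRED_TOOLS - set(tool_names)
```

An empty result means all four tools are present; anything else is worth failing fast on before the agent starts.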

Prompt parameters

The gdelt_system_prompt requires no parameters. Simply call it to get the complete system message.

Response format

Returns a Message object (or list of Message objects) with:
  • role (string, required): Message role, typically “assistant” or “system”
  • content (string, required): The complete system prompt text
Most MCP clients return a list of Message objects. Concatenate the content fields with newlines to get the full prompt text.
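
Because the result may be a single Message or a list, a small normalizing helper keeps calling code simple. The `Message` dataclass below is a stand-in for whatever type your MCP client returns; it assumes only the `role` and `content` string fields described above, and `prompt_text` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Stand-in for the client's message type (role and content as strings)."""
    role: str
    content: str


def prompt_text(result) -> str:
    """Accept one Message or a list of them and return the joined prompt text."""
    messages = result if isinstance(result, (list, tuple)) else [result]
    return "\n\n".join(m.content for m in messages)
```

The blank-line separator matches the `"\n\n".join(...)` used in the usage examples on this page.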

Advanced usage

Multi-agent systems

Use the system prompt to create specialized agents:
# gdelt_system_prompt holds the prompt text fetched as shown in Usage examples

# Research agent
research_agent_prompt = gdelt_system_prompt + """
Focus on: Comprehensive analysis, citing multiple sources, identifying patterns.
"""

# Alert agent
alert_agent_prompt = gdelt_system_prompt + """
Focus on: Building precise queries for alerts, explaining filter choices, validating requirements.
"""

# Exploration agent
exploration_agent_prompt = gdelt_system_prompt + """
Focus on: Iterative queries, testing different tables, optimizing for speed.
"""

Augmenting with examples

custom_prompt = gdelt_system_prompt + """

EXAMPLE WORKFLOWS:

User: "Find protests in France"
1. prepare_gdelt_query(tables=["gdelt_events"], include_cameo_event_codes=True, include_fips_country_codes=True)
2. execute_query(query="SELECT day, actor1_name, event_code, source_url FROM gdelt_events WHERE day >= today() - INTERVAL 7 DAY AND event_root_code = '14' AND action_geo_country_code = 'FR' LIMIT 100")
3. Present results with source citations

User: "Create alert for climate articles"
1. prepare_gdelt_query(tables=["gdelt_gkg_themes_extracted"], include_theme_gdelt_taxonomy=True)
2. present_sql(query="SELECT day, theme, document_identifier FROM gdelt_gkg_themes_extracted WHERE day >= today() - INTERVAL 1 DAY AND theme LIKE 'ENV_CLIMATE%' LIMIT 500", description="Daily climate change articles")
3. Confirm with user before creating alert
"""

Versioning

The system prompt is maintained by GDELT Cloud and may be updated to reflect:
  • New tables or columns
  • Updated performance characteristics
  • Additional best practices
  • New code systems or themes
Always fetch the latest version at agent initialization rather than caching indefinitely.
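
For long-running processes, one way to avoid indefinite caching is a time-bounded cache that re-fetches after a fixed interval. This is a sketch under assumptions: `PromptCache` is hypothetical, the one-hour default is arbitrary, and `fetch` is taken to be a synchronous callable (an async client would need an async variant).

```python
import time
from typing import Callable, Optional


class PromptCache:
    """Re-fetch the prompt once the cached copy is older than ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float = 3600.0,  # arbitrary default, tune to your needs
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._text: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached prompt, refreshing it when missing or expired."""
        now = self._clock()
        if self._text is None or now - self._fetched_at >= self._ttl:
            self._text = self._fetch()
            self._fetched_at = now
        return self._text
```

The injectable `clock` is there so the expiry logic can be tested without sleeping.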
