The Rise of Central Bank Talk:
Essays in Central Bank Communication and Independence

Thesis Defense

Lauren Leek

7 October, 2025

Thesis Committee

Prof. Simon Hix – European University Institute (supervisor)

Prof. Waltraud Schelkle – European University Institute (co-supervisor)

Prof. Kenneth Benoit – Singapore Management University

Prof. Ana Carolina Garriga – University of Essex

In the headlines

FT headline on Bitcoin/digital currency; FT headline on CBDC; FT headline on climate and central banks; Economist headline on energy; Economist headline on mission; Bloomberg headline on saving

& beyond the headlines

Frankfurt skyline

Puzzle

Topic prevalence over time figure

How do central banks reconcile their independent status with extensive public communication that goes beyond their narrow mandates?

Argument

Independence ≠ one‑off grant; requires continuous responsiveness and coordination.
Standard view
Communication = neutral instrument for information transmission and accountability (e.g., Woodford, Geraats, etc.).
Thesis view
Communication is dynamically shaped by (ch2) & deployed to address perceived challenges to independence (ch3 & ch4).
Addition: The argument applies not only to central banks embedded in advanced democracies. (Some central banks in advanced economies seem to be experiencing problems typical of emerging-economy central banks.)

Methods & data

Global coverage map Text as data overview

Overview of Chapters & Methods

Chapter | Data | Empirical Method | Text-as-Data
2 Independence → Communication | All speeches (1997–2023) | Staggered DiD; Instrumental Variable | Gemini; Sentence cls.
3 NCB Agenda Influence | NCB + ECB speeches (1997–2023) | Sequence / Markov; Cross-Sectional Timeseries | BERTopic; Topics & diffusion
4 Policy Linkages | All speeches (1997–mid‑2023) | Descriptive | GPT‑3.5; Regime cls.; Validation

How Central Bank Independence Shapes Monetary Policy Communication: A Large Language Model Application

* Published as: Leek, L., & Bischl, S. (2025). How Central Bank Independence Shapes Monetary Policy Communication: A Large Language Model Application. European Journal of Political Economy, 87, 102668. doi.org/10.1016/j.ejpoleco.2025.102668
Theory
Main channels diagram

Main

H1: More independence → less inflation talk.

Main

H2: More independence → more financial pressure talk.

Mechanism

H3: Independence shifts which pressures are addressed in communication.

Scope Condition

H4: Effects strongest in democracies and advanced economies.
Methods
  • LLM: Google Gemini sentence classification
  • DiD: Staggered Difference-in-Differences on 100 CBs (1997–2023), plus an augmented event study with subgroup interactions
  • IV: Inverse distance-weighted lagged world CBI
Workflow diagram
Findings: strongest in advanced democracies
Dominance combined results
Free from government ≠ free from markets
GFC

Who Sets the Agenda? The Role of National Central Banks in the Eurosystem*

* R&R at Journal of European Public Policy.

Hypotheses

Eurosystem structure diagram (20 NCBs + ECB): ECB (European Central Bank) with NCBs DE, FR, IT, ES, NL, BE, AT, IE, FI, PT, GR, HR, SK, SI, LU, LT, LV, EE, CY, MT. Legend: ECB; top 5 NCBs; other 15 NCBs; node size = GDP.
H1 First Movers
NCBs introduce novel topics first
H2 Public Salience
ECB uptake ↑ with national salience
H3 Economic Pressure
Responsiveness ↑ with asymmetric pressures

ECB not extra responsive under pressure

Interaction Grid

Scope conditions

🔴 Crisis ↑ responsiveness
👥 Coalitions ↑ stronger

📍 Post‑Sintra ↓ reduced

Alternative Explanations

❌ Issue Linkage
✓ Within‑Topic Variance

Key Insight

ECB responsiveness is conditional, and a topic's audience dictates the desired degree of unity.

Introducing Textual Measures of Central Bank Policy Linkages Using GPT‑3.5

Policy Regimes

💪 Monetary Dominance

🏛 Fiscal Dominance

💼 Financial Dominance

🤝 Monetary–Fiscal Coordination

🤝 Monetary–Financial Coordination

Classification Pipeline

Three-stage classification pipeline schematic

Validation & Prompt Engineering

Manual Validation

1,000 sentences / 3 coders (IRR 76–84%). Accuracy 62–85%; macro F1 0.36–0.83.

Prompt Choices

Temp=0; 10 sentences/prompt; context window for disambiguation.
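
The prompt choices above can be illustrated with a small batching sketch. It is not the thesis pipeline: the function name `build_prompts`, the instruction wording, and the one-sentence context on each side are hypothetical. Only the batch size of 10 and the temperature of 0 come from the slide.

```python
from typing import Dict, List

BATCH_SIZE = 10   # sentences per prompt, as stated on the slide
TEMPERATURE = 0   # deterministic decoding, as stated on the slide


def build_prompts(sentences: List[str], batch_size: int = BATCH_SIZE,
                  context: int = 1) -> List[Dict[str, object]]:
    """Group sentences into numbered batches with neighbouring context.

    `context` (an assumption) is the number of sentences before and after
    the batch that are attached for disambiguation only, not classified.
    """
    prompts = []
    for start in range(0, len(sentences), batch_size):
        end = min(start + batch_size, len(sentences))
        before = sentences[max(0, start - context):start]
        after = sentences[end:end + context]
        numbered = [f"{i + 1}. {s}" for i, s in enumerate(sentences[start:end])]
        text = "\n".join(
            ["Context before: " + " ".join(before),
             "Classify each numbered sentence:"]
            + numbered
            + ["Context after: " + " ".join(after)]
        )
        prompts.append({"temperature": TEMPERATURE,
                        "n_sentences": end - start,
                        "text": text})
    return prompts


demo = [f"Sentence {i}." for i in range(23)]
print([p["n_sentences"] for p in build_prompts(demo)])
```

Each returned dictionary would then be sent to whichever model client is used; that call is deliberately left out because it depends on the provider.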

Compute / Tokens

Efficient token usage; GPT‑4 & Gemini benchmarking checks.

Dominance / Coordination Over Time

Trends in monetary dominance, fiscal/financial coordination and dominance over time

Cross‑Country Variation

Advanced vs Emerging

Democracy vs Autocracy

Crisis Effects

Theoretical contributions

  • Ch 2 Financial instability as a persistent challenge for central bank independence - beyond macroprudential turn literature
  • Ch 3 Even/especially the most independent central banks are responsive - beyond Cukierman
  • Ch 4 Fiscal and financial pressures through either coordination or dominance (policy‑linkages) - beyond Sargent and Wallace
  • Note: not a test of the intentions of governors behind communicating

Methods & Data Contributions

  • Ch 2 & Ch 4 Novel LLM-based methods for detecting policy pressures in central bank communication
  • Ch 2–4 Comprehensive dataset with enriched meta-data (retrieved via LLMs)
  • Ch 3 New measure of responsiveness for central banks
  • Ch 2–4 Interactive website with dashboards and insights

Four Broader Implications

1. Dynamic Independence

CBI is an ongoing negotiation despite statutes.

2. Geographic Heterogeneity

Different mechanisms in emerging vs. advanced economies and currency areas/pegged currencies - context matters.

3. Government Coordination

Macroeconomic policy coordination - link with actual policies (fiscal & financial).

4. LLM usage

LLM Reproducibility (bias, environmental costs)

Appendix: Introduction

Analytical framework

Core argument schematic

CBI Map

World Avg CBI

Topic Diversity

Speaker Position

Audiences Over Time

Appendix: Chapter 2

Distribution of Speeches

Classification Results

Face validation regimes over time

Confusion Matrices

Dataset comparison map

Gemini fine-tune settings
Final Model Configurations
Parameter | Final Value
Optimization Settings |
Epochs | 7
Learning rate | 0.0005
Batch size | 2
Prompt Engineering |
Sentences per prompt | 5
Temperature | 0
Format instructions | yes
Dataset Composition |
Synthetic sentences | no
Upsample factor | 0
Randomise epochs | yes
LLM Validation: Gemini Fine-Tune Performance

Validation Against Human Ground Truth

Metric GPT-3.5 GPT-4 Fine-tune
Accuracy 0.64 0.79 0.81
F1 (weighted) 0.69 0.78 0.79
F1 (macro) 0.35 0.40 0.47
Precision (macro) 0.33 0.48 0.49
Recall (macro) 0.43 0.40 0.45
Best Performance
Fine-tuned Gemini achieves highest scores across all metrics. Macro F1 (0.47) particularly strong for minority categories.

Confusion Matrix Insights

Both models struggle with:
• Financial & fiscal dominance
• Differentiating dominance from coordination

Gemini advantages:
• More conservative assignments
• Higher precision (fewer false positives)
• Closer alignment with human coders

Model Comparison

GPT-3.5: Over-assigns dominance/coordination
Gemini fine-tune: More "none" assignments, fewer bad mistakes

Validation Details

  • Ground truth: 3 human coders, 1,000 sentences
  • Training: 300 sentences
  • Holdout: 700 sentences for testing
  • Same scheme as Leek & Bischl (2024)
Bottom Line: Fine-tuned Gemini outperforms all tested models. Higher precision means fewer erroneous dominance assignments - critical for reliable policy regime measurement.

External validation regimes

Events Map
Empirical Analysis: Staggered Difference-in-Differences
Main Event Study Specification $$\psi^m_{ict} = \sum_{k=-5}^{k = -2} \beta_k D^k_{ct} + \sum_{k=0}^{k = 12} \beta_k D^k_{ct} + \mu_c + \theta_t + \epsilon_{ict}$$
ψ^m_{ict}: Dominance indicator
D^k_{ct}: Event indicator
β_k: Treatment effects (ATT)
μ_c, θ_t: Fixed effects
Endpoint Binning Strategy
$$D^{k}_{ct} = \begin{cases} \sum_{j=-\infty}^{-5} d_{c(t-j)} & \text{if } k = -5 \\ d_{c(t-k)} & \text{if } -5 < k < 12 \\ \sum_{j=12}^{\infty} d_{c(t-j)} & \text{if } k = 12 \end{cases}$$
Design Features
Pre-treatment coefficients test parallel trends. Treatment effects relative to β₋₁ = 0. Standard errors clustered at country level.
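
The endpoint binning defined above can be sketched for a single country-year. This is an illustration under the assumption of one event per country (the largest CBI change); the function name `event_dummies` is hypothetical.

```python
from typing import Dict, Optional

K_MIN, K_MAX = -5, 12  # event window from the specification above


def event_dummies(year: int, event_year: Optional[int]) -> Dict[int, int]:
    """Return {k: D^k_ct} for one country-year with binned endpoints.

    Assumes a single event per country; never-treated countries
    (event_year=None) get zeros for every k.
    """
    dummies = {k: 0 for k in range(K_MIN, K_MAX + 1)}
    if event_year is None:
        return dummies
    rel = year - event_year               # event time of this observation
    binned = max(K_MIN, min(K_MAX, rel))  # pool periods beyond the window
    dummies[binned] = 1
    return dummies


print(event_dummies(2015, 2000)[12])   # 15 years after the event, binned at k = 12
print(event_dummies(1990, 2000)[-5])   # 10 years before the event, binned at k = -5
```

In the regression itself the k = -1 dummy is dropped, so all coefficients are read relative to the year before the event.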

Specification Details

  • Treatment: Binary (d_{ct} ∈ {0,1}), minimum CBI increase of 0.05
  • Events: Largest change per country only
  • Robustness: Gardner (2024) two-stage estimator
  • Tests: Wald tests for leveling & pre-trends
  • Window: 1985-2023, staggered CBI adoption
Subgroup analysis
Augmented Event Study with Interactions $$\psi^m_{ict} = \sum_{k=-5}^{k = -2} \beta_k D^k_{ct} + \sum_{k=0}^{k = 12} \beta_k D^k_{ct} + \sum_{j=2}^{j=J} \sum_{k=0}^{k = 12} \delta_{jk} D^k_{ct} S^j_{ct} + \mu_c + \theta_t +\epsilon_{ict}$$
S^j_{ct}: Subgroup dummies
δ_{jk}: Interaction coefficients
Effect Aggregation
Use observation-weighted average of post-treatment coefficients to avoid TWFE bias from heterogeneous treatment effects.
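
The aggregation step can be written out in a few lines. The sketch below assumes the event-study coefficients and the number of observations behind each event time are already available; the numbers in the example are made up for illustration and are not thesis results.

```python
from typing import Dict


def weighted_att(coefs: Dict[int, float], n_obs: Dict[int, int]) -> float:
    """Observation-weighted mean of post-treatment coefficients (k >= 0)."""
    post = [k for k in coefs if k >= 0]
    total = sum(n_obs[k] for k in post)
    return sum(coefs[k] * n_obs[k] for k in post) / total


# Hypothetical coefficients by event time k and their observation counts.
coefs = {-2: 0.01, 0: -0.10, 1: -0.20}
n_obs = {-2: 50, 0: 300, 1: 100}
print(round(weighted_att(coefs, n_obs), 4))
```

Pre-treatment coefficients are excluded from the average, so event times with more speeches contribute proportionally more to the summary effect.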

Key Findings by Subgroups

Political Regime
Democracies: Larger, more significant effects
Autocracies: Constrained communication freedom
Economic Development
Advanced: Strong financial dominance increase
Emerging: Weaker effects, high baseline dominance
Heterogeneity Results: Effect Magnitudes by Subgroups
Monetary Dominance Financial Dominance
Baseline
Full sample -0.1607*** 0.0548***
Supervision Capabilities
Low -0.1532** 0.0642***
Medium -0.1373*** 0.0446*
High -0.2504*** 0.0279
Political System
Autocracy -0.1161** 0.0495**
Democracy -0.1747*** 0.0595***
Monetary Sovereignty
Full sovereignty -0.1417*** 0.0556***
Union or peg -0.2268*** 0.0244
Economic Development
Emerging & Developing -0.0644* 0.0021
Advanced -0.2229*** 0.0774***
Mandates
Non-conflicting with price stability -0.1479*** 0.0542***
Conflicting objectives -0.2440*** 0.0588***
Note: Standard errors in parentheses. * p < 0.1; ** p < 0.05; *** p < 0.01. Effects strongest in democracies and advanced economies, confirming Hypothesis 4.
Instrumental Variable Approach

Addressing Dynamic Endogeneity

IV approach circumvents biases from time-varying omitted variables and reverse causality (CB communication influencing independence). Uses 2SLS with diffusion-based instruments.

Second Stage (2SLS) $$\psi^m_{ict} = \rho\, \widetilde{\psi}^m_{ict} + \beta_1\,\textrm{CBI}_{ct} + \beta_2\, \Delta \pi_{ct} + \beta_3\, \Delta u_{ct} + \theta_t + \mu_c + \epsilon_{ict}$$
ψ̃^m_{ict}: Lagged DV (avg. 25 speeches)
Δπ_{ct}, Δu_{ct}: Inflation & unemployment changes
Three Instruments
1. Inverse distance-weighted world CBI average
2. Electoral democracy in 10 nearest neighbors
3. Domestic judicial independence
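
The first instrument can be illustrated with a small sketch. It assumes that a country's own CBI is excluded and that weights are 1/distance in kilometres; both choices, and the function name `idw_world_cbi`, are assumptions made for illustration rather than details confirmed by the slides.

```python
from typing import Dict


def idw_world_cbi(country: str, cbi_lagged: Dict[str, float],
                  distance_km: Dict[str, float]) -> float:
    """Inverse distance-weighted average of other countries' lagged CBI.

    `distance_km` maps every other country to its distance from `country`.
    """
    num = 0.0
    den = 0.0
    for other, cbi in cbi_lagged.items():
        if other == country:
            continue  # own CBI excluded so the measure reflects foreign diffusion
        weight = 1.0 / distance_km[other]
        num += weight * cbi
        den += weight
    return num / den


# Hypothetical lagged CBI scores and distances from country "A".
cbi = {"A": 0.9, "B": 0.8, "C": 0.4}
dist_from_a = {"B": 500.0, "C": 1000.0}
print(round(idw_world_cbi("A", cbi, dist_from_a), 4))
```

Nearby countries receive larger weights, so reforms among neighbours move the instrument more than reforms far away.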

Heterogeneity Results

Democracies: Larger, more significant effects
Advanced economies: Strong financial dominance ↑
Emerging economies: High baseline dominance limits effects

Methodological Notes

Lagged DV: 25-speech average maximizes log-likelihood
Controls: Inflation/unemployment changes from Fed literature
Identification: Diffusion processes drive CBI adoption
Caveat: Exclusion restriction potentially violated by bidirectional diffusion
Instrumental Variable Results: Complete 2SLS Estimation
Variables | First Stage | First Stage | 2SLS Effect on Dominance | 2SLS Effect on Dominance
Dependent variable | CBI | CBI | Monetary | Financial
Model | (1) | (2) | (3) | (4)
CBI (instrumented) | | | -0.7524* (0.4365) | 0.6921** (0.2969)
Monetary dominance, 25 prior speeches | 0.0078 (0.0129) | | 0.4952*** (0.0727) |
Financial dominance, 25 prior speeches | | -0.0009 (0.0376) | | 0.3247*** (0.0567)
Δ Unemployment rate | 0.0001 | 0.0001 | -0.0016 | 0.0022
Neighbour's electoral democracy₋₁ | 0.2376 | 0.2448 | |
Independence judiciary | 0.0056** | 0.0053* | |
Fixed Effects
Country
Year
Fit Statistics
0.97593 0.97591 0.21513 0.11198
Observations 12,205 12,205 12,205 12,205
Instrument Validity Tests

1. Instrument Relevance

F-statistics:
• Financial dominance: 318.5
• Monetary dominance: 332.3
Result: Well above rule-of-thumb (F > 10)

2. Over-identification Tests

Sargan test p-values:
• Financial dominance: 0.08
• Monetary dominance: 0.06
Result: Fail to reject exogeneity (p > 0.05)
Caveat: LATE vs ATE
Over-identification assumes instruments identify same parameter. Different instruments may yield different LATEs due to heterogeneous treatment effects.

3. Single Instrument Robustness

Using only world CBI diffusion:
• Monetary: -0.7995* (vs -0.7524*)
• Financial: 0.7325** (vs 0.6921**)
Result: Estimates very similar

4. Instrument Correlation

Democracy vs Judicial Independence:
R² = 0.062 (correlation ≈ 25%)
Result: Instruments not redundant
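2SLS Matrix Form $$\hat\beta_{\rm 2SLS} = (X'P_Z X)^{-1}X'P_Z y, \qquad P_Z = Z(Z'Z)^{-1}Z'$$ In the just-identified case (as many instruments as endogenous regressors) this reduces to $$\hat\beta_{\rm IV} = (Z'X)^{-1}Z'y$$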
Controls W included in both X and Z (self-instrument)
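
For the single-instrument robustness check, the matrix form collapses to something very simple. With one endogenous regressor, one instrument, and only a constant as control, the slope in (Z'X)^{-1}Z'y equals cov(z, y) / cov(z, x). The sketch below shows that special case on toy data; it is not the thesis estimation, which includes fixed effects and further controls.

```python
from statistics import mean
from typing import Sequence


def iv_single(y: Sequence[float], x: Sequence[float], z: Sequence[float]) -> float:
    """Just-identified IV slope: cov(z, y) / cov(z, x), constant partialled out."""
    zb, xb, yb = mean(z), mean(x), mean(y)
    cov_zy = sum((zi - zb) * (yi - yb) for zi, yi in zip(z, y))
    cov_zx = sum((zi - zb) * (xi - xb) for zi, xi in zip(z, x))
    return cov_zy / cov_zx


# Toy data: x = 1 + 2z and y = 3x, so the IV slope recovers 3.
z = [0, 1, 2, 3]
x = [1, 3, 5, 7]
y = [3, 9, 15, 21]
print(iv_single(y, x, z))
```

A weak instrument shows up here as a denominator close to zero, which is why the first-stage F-statistics above matter.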
Validity Assessment: Strong instruments (F > 300), exogeneity not rejected, robust to single instrument, low instrument correlation. Used as additional check for DiD analysis.
Mechanisms
Mechanism 1
Mechanism 2
Robustness Checks

1. Parallel Trends Violations

  • Linear country-specific trends
  • Conditional parallel trends with controls
  • Result: Quantitatively similar estimates

2. Treatment Definition Flexibility

  • Multiple treatments per country allowed
  • Continuous intensity & direction variation
  • Result: Patterns remain unchanged

3. Heterogeneity-Robust Estimators

  • Borusyak et al. (2024) imputation
  • Sun & Abraham (2021) cohort effects
  • Cengiz et al. (2019) stacked DiD
  • Result: Main findings confirmed

4. Sample Construction

  • Full event window requirements
  • One-year anticipation pre-dating
  • CBI sub-dimension definitions
  • Result: Robust across variations

5. Alternative CBI Measures

  • LVAU indicator (Garriga 2025)
  • Cukierman et al. (1992) dimensions
  • Result: Strong correlation, comparable estimates
6. Falsification Test
Randomization procedure: effects lie in extreme tails of placebo distributions (<1% probability under null).
Bottom Line: Results robust across estimators, sample definitions, CBI measures, and treatment specifications. Placebo tests confirm effects are not artifacts.

Event Study Robustness

Dominance Combined TWFE Robustness

Alternative Event Study

Dominance Combined Additional Event Study

Continuous Event Study

Dominance Combined Continuous

Event study CBI disaggregated

Coef Plot DID Combined

Placebo Randomization

Placebo Aggregate

Placebo Event studies
Alternative Explanations

1. Global Financial Crisis Effects

Concern: Effects driven by post-2008 global trends
Test: Cohort-specific effects by independence year
Result: Effects positive & significant pre-crisis

2. Euro Area Confounding

Concern: Euro adoption (1998) & SSM (2014)
Test: Exclude euro area / add ECB as control
Result: Robust when euro area excluded

3. Independence Precedent

Concern: Epistemic community effects from first increase
Test: Use only first independence change per country
Result: Similar patterns observed

4. Audience Targeting

Concern: Shift toward financial market audiences
Test: Event studies with audience indicators
Result: Audiences change little; effects hold within audiences
5. Supervision Mandate Changes
Placebo test: Replace CBI with banking supervision changes. No significant association with financial dominance found.

Data Sources

  • Masciandaro & Romelli (2018) supervision data
  • CBIE index policy dimension III (Romelli 2024)
  • Randomization tests following Miller et al. (2021)
Conclusion: Effects are not artifacts of global trends, euro area dynamics, audience shifts, or supervisory mandate changes.

Alt explanations I

Sample Variations I

Alt explanations II

Sample Variations II

Effect on audiences
Placebo Test: Supervision Mandate Changes
Question: Are effects driven by changes in supervisory powers rather than CBI?
Approach: Replace CBI events with banking supervision changes
Supervision Change | Two-way Fixed Effects | Gardner et al. (2024)
Increase in Banking Supervision | |
Masciandaro & Romelli (2018) | -0.0297 (0.0188) | -0.0122 (0.0145)
CBI Policy Q3 (Romelli, 2024) | 0.0093 (0.0159) | 0.0017 (0.0126)
Decrease in Banking Supervision | |
Masciandaro & Romelli (2018) | -0.0083 (0.0338) | 0.0195 (0.0272)
CBI Policy Q3 (Romelli, 2024) | 0.0026 (0.0326) | 0.0323 (0.0261)
Placebo Test Result
No significant association between banking supervision changes and financial dominance communication. Effects are due to CBI changes, not supervisory mandate shifts.
Heterogeneous Effects Regarding CBI Level
Subgroup | Monetary Dominance | Financial Dominance
Baseline | |
Full sample | -0.1607*** (0.0554) | 0.0548*** (0.0200)
CBI level before independence | |
Low | -0.1464** (0.0606) | 0.0474* (0.0243)
Medium | -0.1728** (0.0759) | 0.0741*** (0.0266)
High | -0.1082* (0.0654) | 0.0371 (0.0253)

Methodological Notes

Stratification: Countries divided by pre-independence CBI index: Low (CBI < 0.5), Medium (0.5 ≤ CBI ≤ 0.6), High (CBI > 0.6)
Sample distribution: 12 high-CBI, 24 medium-CBI, 36 low-CBI independence events. Never-treated: 7 low-CBI, 4 medium-CBI, 6 high-CBI countries
Key finding: Effects strongest for medium-CBI countries, weakest for already high-CBI countries

CBI level

Appendix: Chapter 3

Workflow BERT

Topics validation

Methods Overview
BERTopic Model Pipeline

Core Pipeline Steps

1. Document Embeddings: all-MiniLM-L6-v2 generates 384-dimensional vectors
2. Text Vectorization: CountVectorizer with n-grams (2–4), stopword filtering
3. Dimensionality Reduction: UMAP (5 components, cosine distance)
4. Topic Clustering: BERTopic with c-TF-IDF refinement

Quality Features

Efficiency: Distilled BERT model
Noise reduction: Multi-word filtering
Coherence: Two-step outlier removal
Analysis: Temporal trend extraction
Markov Chain Analysis: Measuring Responsiveness

First-Mover Identification

Threshold criteria:
• Topic makes up ≥25% of a speech
• ≥3 other CBs discuss topic in subsequent year
Purpose: Identify main topics with follow-up (not noise)
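
The two threshold criteria can be sketched as a simple filter. The sketch makes two assumptions that the slide does not state: "subsequent year" is read as the following 365 days, and follower speeches must also devote at least 25% to the topic. The `Speech` type and `first_mover` function are hypothetical names.

```python
from datetime import date, timedelta
from typing import Dict, List, NamedTuple, Optional


class Speech(NamedTuple):
    day: date
    bank: str
    shares: Dict[str, float]  # topic -> share of the speech


def first_mover(speeches: List[Speech], topic: str,
                min_share: float = 0.25, min_followers: int = 3) -> Optional[str]:
    """Return the first bank whose speech on `topic` is followed by enough others."""
    ordered = sorted(speeches, key=lambda s: s.day)
    for i, s in enumerate(ordered):
        if s.shares.get(topic, 0.0) < min_share:
            continue  # topic too marginal in this speech
        horizon = s.day + timedelta(days=365)
        followers = {o.bank for o in ordered[i + 1:]
                     if o.day <= horizon and o.bank != s.bank
                     and o.shares.get(topic, 0.0) >= min_share}
        if len(followers) >= min_followers:
            return s.bank
    return None


demo = [
    Speech(date(2020, 1, 10), "DE", {"climate": 0.30}),
    Speech(date(2020, 3, 1), "FR", {"climate": 0.40}),
    Speech(date(2020, 5, 1), "NL", {"climate": 0.26}),
    Speech(date(2020, 9, 1), "ECB", {"climate": 0.50}),
]
print(first_mover(demo, "climate"))
```

The follower requirement is what separates a first mover from an isolated speech that nobody picks up.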
Transition Count $$m_{ab \mid k} = \mathrm{count}\{t : CB_t = a, CB_{t+1} = b, T_t = k\}$$
How often bank a is followed by bank b for topic k
Conditional Transition Probability $$P_{ab \mid k} = \frac{m_{ab \mid k}}{\displaystyle \sum_{c \in \mathcal{B}} m_{ac \mid k}}$$
Row-stochastic matrix: each row sums to 1
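
The transition counts and conditional probabilities above translate directly into code. The sketch assumes the input is a chronologically ordered list of (bank, main topic) pairs, one per speech; the function name `transition_probs` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def transition_probs(sequence: List[Tuple[str, str]]) -> Dict[str, Dict[str, Dict[str, float]]]:
    """Return {topic k: {bank a: {bank b: P_ab|k}}} from an ordered speech sequence."""
    counts = defaultdict(Counter)  # (topic, a) -> counts of the next speaker b
    for (a, k), (b, _) in zip(sequence, sequence[1:]):
        counts[(k, a)][b] += 1     # m_ab|k: a speaks on k, b speaks next
    probs: Dict[str, Dict[str, Dict[str, float]]] = {}
    for (k, a), row in counts.items():
        total = sum(row.values())  # row total, so each row sums to 1
        probs.setdefault(k, {})[a] = {b: n / total for b, n in row.items()}
    return probs


seq = [("DE", "climate"), ("ECB", "climate"), ("DE", "climate"),
       ("FR", "climate"), ("ECB", "inflation")]
print(transition_probs(seq)["climate"]["DE"])
```

Restricting `sequence` to the speeches inside a date window before calling the function gives the sliding-window version.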

Time-Varying Implementation

30-day moving window to:
• Smooth short-term fluctuations
• Capture communication cycles
• Ensure sufficient observations
• Exclude non-stationary effects
Sliding Window Probabilities $$m_{ab \mid k}^{(w)} = \mathrm{count}\{t \in \text{window } w : CB_t = a, CB_{t+1} = b, T_t = k\}$$ $$P_{ab \mid k}^{(w)} = \frac{m_{ab \mid k}^{(w)}}{\displaystyle \sum_{c \in \mathcal{B}} m_{ac \mid k}^{(w)}}$$
Markov Property
Next state depends only on present state. Enables measurement of direct responsiveness between central banks.
Robustness: Alternative windows tested (7, 14, 60, 90 days). Uses highest topic probability from BERTopic. Discrete time implementation for simplicity.
Regression Models: ECB Responsiveness to NCBs
Model A: Baseline Responsiveness $$EB_{t,p} = \alpha_p + EB_{t-1,p} + \beta_1 NCB_{t-1,p} + \epsilon_{t,p}$$
Model B: Topic Heterogeneity $$EB_{t,p} = \alpha_p + EB_{t-1,p} + \beta_1 NCB_{t-1,p} + \sum_{j=2}^{T} \beta_{2,j} (D_{p,j} \times NCB_{t-1,p}) + \epsilon_{t,p}$$
Model C: Pressure Interactions $$\begin{align} EB_{t,p} = & \alpha_p + EB_{t-1,p} + \beta_1 NCB_{t-1,p} + \gamma C_t \\ & + \sum_{j=2}^{T} \beta_{2,j} (D_{p,j} \times NCB_{t-1,p}) \\ & + \sum_{j=2}^{T} \beta_{3,j} (C_t \times NCB_{t-1,p} \times D_{p,j}) \\ & + \sum_{j=2}^{T} \beta_{4,j} (C_t \times D_{p,j}) + \epsilon_{t,p} \end{align}$$
Three-way interactions test pressure effects (H2, H3)

Pressure Proxies (Ct)

Economic: Real GDP growth, inflation
Public: ECB trust (Eurobarometer), Google Trends
Salience: "ECB" search interest (0-100 scale)
Index Construction $$I_{b,q,t} = \frac{1}{|S_{b,q}|} \sum_{s \in S_{b,q}} p_{s,t}$$
Average topic proportion across all speeches in quarter q
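
The index is a plain average, which the following sketch makes explicit. It assumes each speech comes with a bank, a quarter label, and a dictionary of topic proportions, and that a topic absent from a speech counts as zero; the function name `topic_index` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

SpeechRow = Tuple[str, str, Dict[str, float]]  # (bank, quarter, topic proportions)


def topic_index(speeches: Iterable[SpeechRow]) -> Dict[Tuple[str, str, str], float]:
    """Mean topic proportion per (bank, quarter, topic) over all speeches."""
    n_speeches = defaultdict(int)   # |S_{b,q}|
    totals = defaultdict(float)     # sum of p_{s,t} over speeches in S_{b,q}
    for bank, quarter, props in speeches:
        n_speeches[(bank, quarter)] += 1
        for topic, p in props.items():
            totals[(bank, quarter, topic)] += p
    return {key: total / n_speeches[key[:2]] for key, total in totals.items()}


rows = [("ECB", "2020Q1", {"climate": 0.2, "monetary": 0.8}),
        ("ECB", "2020Q1", {"monetary": 1.0})]
print(topic_index(rows))
```

Dividing by the number of speeches rather than the number of speeches mentioning the topic keeps the index comparable across quarters with different speech volumes.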

Technical Details

  • Controls: GDP growth, inflation (Fed literature)
  • Standard errors: Panel-corrected (Beck & Katz 2006)
  • Counterfactuals: Contemporaneous, no correlation, reverse causality
  • Robustness: Lead specifications test reverse causality
Progressive Testing: Model A → general responsiveness | Model B → topic differences | Model C → pressure conditions. Three counterfactuals examined.

NCBs first movers

First Movers

ECB direct responsive

Topic Transitions

NCBs lead 3 month lags

Coefficient Plot

Permutations
Permutation-Based Validation of Transition Patterns

Research Question

"If topic sequence was completely random, how often would we see patterns at least as extreme as observed?"

Permutation Procedure

1. Dichotomization: Speeches >25% topic share → Low/High intensity (median cutoff)
2. Transition matrices: Four first-order Markov probabilities
3. Null distribution: 100 permutations shuffling topic intensities
4. Preservation: Marginal distributions, speech counts, seasonality
Global Test Result
χ² test rejects randomness at p < 0.001. Sequence far from "noise machine."
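
The logic of the permutation procedure can be sketched in a simplified form. Note the differences from the slide: the test statistic here is the share of within-band continuations rather than a χ² statistic, and shuffling preserves only the Low/High marginal counts, not seasonality. Function names are hypothetical.

```python
import random
from statistics import median
from typing import List, Sequence, Tuple


def within_band_share(states: Sequence[str]) -> float:
    """Share of consecutive pairs that stay in the same band (L->L or H->H)."""
    pairs = list(zip(states, states[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)


def permutation_pvalue(intensity: List[float], n_perm: int = 100,
                       seed: int = 0) -> Tuple[float, float]:
    """Observed within-band share and its one-sided permutation p-value."""
    cut = median(intensity)                          # median cutoff
    states = ["H" if v > cut else "L" for v in intensity]
    observed = within_band_share(states)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = states[:]
        rng.shuffle(shuffled)                        # keeps Low/High counts fixed
        hits += within_band_share(shuffled) >= observed
    return observed, (hits + 1) / (n_perm + 1)


# Toy series: ten low-intensity speeches followed by ten high-intensity ones.
obs, p = permutation_pvalue([0.3] * 10 + [0.9] * 10)
print(round(obs, 3), p)
```

A small p-value means that the observed persistence is rarely matched when the order of speeches is scrambled, which is the sense in which the sequence is not noise.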

Observed vs Expected Patterns

Within-band continuations: Markedly higher than null
Cross-band jumps: Suppressed below expectation

Topic-Specific Patterns

Core macro topics: Strongest excess probabilities (monetary policy, economic indicators, financial markets)
Digital/supervision: Mixed patterns, spikes in high-pressure moments
National economy: Aligns with null (ECB avoids domestic debates)
Conclusion: Transition structure statistically and substantively distinct from random sequencing. Intensity conditioned by strategic topic salience.

Topic Regrouping

Coefficient Plot Crisis

Coefficient Plot Leads

Coefficient Plot Half Yearly

Crisis Topic Transitions Panel

Speeches frequency

Topic word clouds I

Figure A2 Part 1

Topic word clouds II

Figure A2 Part 2

Pre and post crisis coefficient plots

Figure A3

Figure Climate Transition

Figure Crisis Transition

Transition Matrix Robustness
Impact of NCB Topic × Sintra Interaction on ECB Responsiveness
Topic | FR | IT | ES | DE | NL
Monetary Policy | -0.336 (0.280) | -0.332 (0.238) | 0.039 (0.209) | 0.188 (0.212) | 0.045 (0.250)
Economic Indicators | -0.221 (0.149) | 0.344 (0.860) | -1.534 (0.959) | -0.275 (0.199) | -0.203 (0.292)
Financial Markets | -0.143 (0.091) | -0.742** (0.310) | 0.144 (0.480) | -0.355** (0.147) | 0.270 (0.196)
Banking Supervision | 0.200 (0.248) | -1.183*** (0.404) | -0.039 (0.115) | 0.030 (0.161) | 0.055 (0.074)
Digital Finance | 0.029 (0.227) | -0.098 (0.142) | -0.183 (0.129) | -0.001 (0.163) | -0.013 (0.064)
International Economics | -0.016 (0.183) | -0.266 (0.284) | 0.160 (0.584) | -0.635** (0.297) | 0.129 (0.230)
Crisis Management | 0.045 (0.096) | -0.128 (0.263) | -0.655 (1.322) | 0.127 (0.173) | -0.297 (0.380)
Climate | -0.078 (0.403) | 0.040* (0.023) | -0.230* (0.120) | 0.451 (0.522) | -0.032 (0.096)
Payment Systems | -0.053 (0.076) | 0.320 (0.270) | 1.121** (0.525) | 0.061 (0.064) | -0.158 (0.386)
National Economy | -0.512** (0.240) | 0.053 (0.035) | 0.008 (0.036) | -0.012 (0.044) | -0.692** (0.310)
Note: Coefficients and clustered standard errors for the interaction term NCB Topic × Sintra. Standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.

Coalition

Topic Transitions

Coalition Topic Transition

Topic Transitions

Markov Topic transitions

Topic Transitions

Topic unity Climate

Topic Transitions

Topic unity Monetary

Topic Transitions

Appendix: Chapter 4

Classification Examples
Monetary Dominance
"...monetary policy...without prejudice to our primary mandate of safeguarding price stability." (ECB, 2021)
Key: Price stability mandate above other priorities.
Fiscal Dominance
"...funds could be spend toward the direct purchase of debt..." (Argentina CB, 2008)
Key: Direct debt purchase over price stability.
Financial Dominance
"...flexible provision of liquidity contained market participants' concerns..." (BoJ, 2002)
Key: Accommodating financial markets.
Monetary-fiscal Coordination
"...new work stream on monetary-fiscal interactions..." (ECB, 2020)
Key: Direct monetary-fiscal interactions.
Monetary-financial Coordination
"...market participants...work together...for ongoing financial stability." (Fed, 2017)
Key: Coordination for financial stability.

Confusion Matrices

Crisis

Democracy

GDP PPP

Inflation

Normative by Group

Normative by Topic

Polarization

Accuracy Tokens L1 L2

Sentence Count L1 L2 R

Accuracy Tokens L3

Sentence Count L3 R

Spreads

Stability and Uncertainty

Stability R

Temperature Accuracy L1 L2

Temperature L1 L2 R

Temperature L3

Temperature L3 R

Topics
Addressing Class Imbalance in Classifier Performance

Class Imbalance Challenge

Problem: High error rates in low-frequency categories
Solution: Focus on minority categories of interest (fiscal/financial dominance)

Validation Metrics

Five standard metrics used:
• Accuracy
• F1 macro
• F1 weighted
• Precision
• Recall
Key focus: Macro average F1 score
Performance Formulas $$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$ $$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$ $$\text{F1} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
Why Macro F1?
Unweighted average of per-class F1 scores → sensitive to performance in minority categories (fiscal/financial dominance).
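
The formulas above can be checked by hand with a dependency-free sketch. The slides state that scikit-learn was used for the actual metrics; the pure-Python version below is only an illustration, and it assumes that a class with no predictions or no true cases scores 0. The toy labels are made up.

```python
from typing import List


def f1_for(y_true: List[str], y_pred: List[str], label: str) -> float:
    """F1 for one class from true positives, false positives, false negatives."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for(y_true, y_pred, lab) for lab in labels) / len(labels)


# Imbalanced toy example: one of two "fiscal" sentences is missed.
y_true = ["none"] * 8 + ["fiscal"] * 2
y_pred = ["none"] * 8 + ["fiscal", "none"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, round(macro_f1(y_true, y_pred), 3))
```

In this example accuracy stays at 0.9 while macro F1 is visibly lower, because the missed minority-class sentence counts as much as the whole majority class.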

Target Categories

Fiscal dominance: Relatively infrequent
Financial dominance: Relatively infrequent
Priority: Accurate detection of these minority classes

Implementation

  • Scikit-learn package for metric calculation
  • Macro F1 used for prompt selection
  • Not interested in 'none' category performance
  • Focus on actionable policy regime detection
Model Architectures for Central Bank Communication Classification

🔍 Encoder-Only Models

Examples: BERT, RoBERTa, FinBERT
Architecture: Bidirectional attention for contextual understanding
✅ Advantages

  • Superior classification accuracy
  • Fast inference for regime categorization
  • Domain adaptation (FinBERT for finance)
  • Computational efficiency

❌ Disadvantages

  • No text generation capability
  • Fixed input length constraints
  • Single-task optimization only

📝 Decoder-Only Models

Examples: GPT-4, Claude, Gemini
Architecture: Autoregressive generation with causal attention
✅ Advantages

  • Flexible output + explanations
  • Few-shot learning capability
  • Long context handling
  • Reasoning explanations

❌ Disadvantages

  • High computational cost
  • Inconsistent classification outputs
  • Hallucination risks
  • Complex fine-tuning requirements

🔄 Encoder-Decoder Models

Examples: T5, BART, Flan-T5
Architecture: Bidirectional encoder + autoregressive decoder
✅ Advantages

  • Best of both worlds
  • Structured output formatting
  • Task versatility
  • Controllable generation

❌ Disadvantages

  • Higher model complexity
  • Memory intensive training
  • Sophisticated training required
  • May be overkill for simple tasks
💡 Recommendation
Encoder-only models (especially FinBERT) optimal for pure classification tasks. Use decoder-only when explanations needed. Encoder-decoder offers flexibility but may be unnecessary for regime categorization.
Classification Performance: Human vs GPT-3.5
Classification Validation (Humans) Validation (GPT-3.5) Full Dataset
NA 0.0% 0.0% 0.1%
Financial dominance 3.8% 10.7% 9.7%
Fiscal dominance 1.2% 2.6% 2.9%
Monetary dominance 6.1% 10.1% 10.1%
Monetary-financial coordination 10.9% 14.7% 14.8%
Monetary-fiscal coordination 2.5% 4.5% 3.6%
None 75.5% 57.4% 58.9%
GPT-3.5 "Yea-Sayer" Bias
Over-assigns dominance/coordination labels at expense of "none" category. Gemini model (Chapter 2) shows reduced bias.

Macro F1 Advantage

Sensitive to changes in minority categories we care about most (fiscal/financial dominance).

Guess the Regime

A. Monetary Independence
B. Monetary → Fiscal Coordination
C. Fiscal Dominance
D. Financial Coordination
E. Financial Dominance
F. Other / Neutral

Website & Replication