Financial Immunology: Dynamics and Contagion in the NASDAQ Market

Abstract

Financial markets, much like biological populations, are susceptible to contagion. Localized distress in one asset can rapidly propagate through hidden dependency channels, leading to widespread systemic failure. Why does this matter? Traditional econometric models often fail to capture the dynamic and directional nature of shock propagation during extreme events. This project applies epidemiological frameworks to financial time series to understand not just that markets crash, but how the infection spreads.

We analyze the NASDAQ market across three major crises: the Dot-Com Bubble, the Subprime Mortgage Crisis, and the COVID-19 Crash. By building dynamic Granger-Causality networks and defining specific health states (Healthy vs. Sick) for every stock, we identify the "Patient Zeros" that trigger instability and the "Super-spreaders" that amplify it. Our ultimate goal is to validate a "Pandemic Potential Index" (PPI), a novel metric designed to quantify the virulence of specific assets before a full-scale meltdown occurs. For each crisis, the same analysis is repeated at the industry-sector level.

Research Questions

Our project moves beyond simple correlation analysis to answer four specific questions about the market's systemic health:

  1. Can financial contagion be modeled as a biological epidemic?

    • Hypothesis: Assigning epidemiological states (Susceptible, Infected/Shocked, Recovered) to stocks based on return thresholds highlights propagation patterns invisible to standard time-series analysis.
  2. Who are the "Patient Zeros" and "Super-spreaders" of these historical crashes?

    • Goal: Identify the specific assets that initiated the cascade in 2000, 2008, and 2020. Are they always the largest-cap stocks, or do peripheral assets trigger the fall?
  3. Does the network topology serve as a leading indicator of distress?

    • Goal: Analyze how the density and structure of Granger-Causality networks shift before and during a crash. We look for "densification" of the network as an early warning of a collapse in daily returns.
  4. Can we define and validate a "Pandemic Potential Index" (PPI)?

    • Goal: Construct a robust metric combining network centrality (influence) and transmission probability (severity) to quantify each stock's systemic risk contribution.

Data Description & Enrichment

We use a comprehensive dataset of commonly available financial data, enriched with sector classifications to enable cross-sector analysis.

  • Primary Dataset: We process daily Open, High, Low, Close, and Volume data for 50+ major high-cap NASDAQ tickers covering the period from 1990 to the present. The raw data is sourced from the Stock Market Dataset on Kaggle, from which we kept only the stock data.

  • Sector Classification Enrichment: We assign each stock to an industry sector using the Industry Classification Benchmark (ICB) standard. This enrichment enables us to perform the analyses at the sector level and study "cross-immunity" (e.g., how a tech crash propagates to the banking sector). Note that this dataset was generated for the purpose of this project and is not taken from an otherwise publicly available source.

  • Preprocessing: Data is adjusted for stock splits and dividends. We compute log-returns to ensure stationarity of our signals and apply rolling windows (e.g., a 30-day lookback) to capture dynamic dependencies rather than static snapshots, as sketched below.
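
Concretely, in pandas (the file name and column layout here are hypothetical stand-ins for our loaders in src/data/):

    import numpy as np
    import pandas as pd

    # Load split/dividend-adjusted daily closes; "prices.csv" (dates x tickers)
    # is a hypothetical layout used only for this sketch.
    prices = pd.read_csv("prices.csv", index_col="Date", parse_dates=True)

    # Log-returns log(P_t / P_{t-1}), computed for (approximate) stationarity.
    log_returns = np.log(prices / prices.shift(1)).dropna()

    # 30-day rolling lookback, e.g. per-ticker rolling volatility of returns.
    rolling_vol = log_returns.rolling(window=30).std()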

Methods

Our approach integrates rigorous financial econometrics with sophisticated network science. The analysis pipeline is modularized in our src/ directory for reproducibility.

1. Market Regime Segmentation

Instead of arbitrarily picking dates, we use Dynamic Programming to mathematically segment market history.

  • Algorithm: We minimize the Sum of Squared Errors (SSE) of the mean market return signal to find optimal "changepoints" (sketched after this list).

  • Result: This automatically classifies the timeline into regimes: Bull/Calm, Bull/Volatile, Bear/Calm, and Bear/Stress.
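
The following is a self-contained illustration of this SSE-minimizing dynamic program, demonstrated on a synthetic mean-shifted series; it is a simplified stand-in for the implementation in src/models/:

    import numpy as np

    def sse_segmentation(x, n_segments):
        """Split x into n_segments pieces minimizing total within-segment SSE."""
        n = len(x)
        s1 = np.concatenate(([0.0], np.cumsum(x)))       # prefix sums
        s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))  # prefix sums of squares

        def cost(i, j):  # SSE of x[i:j] around its own mean
            return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

        dp = np.full((n_segments + 1, n + 1), np.inf)  # dp[k, j]: best cost of k segments over x[:j]
        back = np.zeros((n_segments + 1, n + 1), dtype=int)
        dp[0, 0] = 0.0
        for k in range(1, n_segments + 1):
            for j in range(k, n + 1):
                for i in range(k - 1, j):
                    c = dp[k - 1, i] + cost(i, j)
                    if c < dp[k, j]:
                        dp[k, j], back[k, j] = c, i

        # Walk the backpointers to recover the interior changepoint indices.
        cps, j = [], n
        for k in range(n_segments, 1, -1):
            j = back[k, j]
            cps.append(j)
        return sorted(cps)

    # Example: a synthetic return series with two mean shifts (three regimes).
    rng = np.random.default_rng(0)
    signal = np.concatenate([rng.normal(m, 0.002, 100) for m in (0.001, -0.003, 0.002)])
    print(sse_segmentation(signal, n_segments=3))  # changepoints near [100, 200]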

2. Causality Network Construction

We move beyond simple Pearson correlations (which imply symmetric relationships) to directional causality.

  • Granger Causality: For every pair of stocks $(X, Y)$, we test whether past returns of $X$ statistically predict $Y$'s returns better than $Y$'s own history alone.

  • Dynamic Matrix: This computation is repeated over sliding windows, resulting in a time-varying adjacency matrix $A_t$ where $A_{ij} = 1$ indicates an "infection pathway" from $i$ to $j$ (see the sketch below).
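
One window's network can be built with statsmodels as in the following hedged sketch; the lag order and 5% significance threshold are illustrative choices rather than our final settings:

    import pandas as pd
    from statsmodels.tsa.stattools import grangercausalitytests

    def granger_adjacency(window_returns, max_lag=2, alpha=0.05):
        """Directed adjacency for one window: A[i, j] = 1 if i Granger-causes j."""
        tickers = list(window_returns.columns)
        A = pd.DataFrame(0, index=tickers, columns=tickers)
        for x in tickers:
            for y in tickers:
                if x == y:
                    continue
                # statsmodels tests whether the SECOND column Granger-causes the
                # first, so [[y, x]] asks: do x's past returns help predict y?
                res = grangercausalitytests(window_returns[[y, x]], maxlag=max_lag)
                p_value = min(res[lag][0]["ssr_ftest"][1] for lag in res)
                if p_value < alpha:
                    A.loc[x, y] = 1  # infection pathway x -> y
        return A

Repeating this over sliding windows yields the time-varying matrix $A_t$.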

3. Epidemiological State Modeling

We implement a bespoke compartment model (SIR-like) tailored for finance:

  • State Definition:
    • Healthy: Daily return $> 30^{th}$ percentile of the window.
    • Sick (Infected): Daily return $\le 30^{th}$ percentile (significant downside shock). See the sketch after this list.
  • R0 Calculation: We compute an "Effective $R_0$" for every stock by measuring the volatility spillover to its susceptible neighbors in the network, weighted by edge strength and distance (max hops = 2).
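
A minimal sketch of the Healthy/Sick labeling, assuming a DataFrame of log-returns indexed by date with one column per ticker:

    import pandas as pd

    def health_states(log_returns, window=30, q=0.30):
        """Label each stock-day Healthy or Sick via a rolling percentile rule."""
        # Rolling 30th percentile of each stock's own returns over the lookback.
        threshold = log_returns.rolling(window).quantile(q)
        states = pd.DataFrame("Healthy", index=log_returns.index,
                              columns=log_returns.columns)
        states[log_returns <= threshold] = "Sick"
        states[threshold.isna()] = None  # first window-1 days: no threshold yet
        return states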

4. Systemic Risk Metrics (PPI)

The Pandemic Potential Index (PPI) is calculated as a composite score: $$PPI_i = \text{Centrality}_i \times \text{TransmissionProb}_i \times \text{Severity}_i$$ This metric highlights stocks that are central in the causal network AND are currently experiencing severe distress.
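
As an illustration, such a composite could be assembled with networkx as below; out-degree centrality stands in for the influence term, and transmission_prob and severity are hypothetical per-stock Series assumed to be computed upstream:

    import networkx as nx
    import pandas as pd

    def pandemic_potential_index(adjacency, transmission_prob, severity):
        """Rank stocks by PPI_i = Centrality_i * TransmissionProb_i * Severity_i."""
        G = nx.from_pandas_adjacency(adjacency, create_using=nx.DiGraph)
        # Out-degree centrality as a simple proxy for causal influence in A_t.
        centrality = pd.Series(nx.out_degree_centrality(G))
        ppi = centrality * transmission_prob * severity
        return ppi.sort_values(ascending=False)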

Organization within the Team

  • Nazar, Luca, Benajmen: Worked on prototyping the logic for the analyses and constructing the pipelines. Developed and designed the data story website.

  • Samuel, Ahmad, Luca: Worked on refining the prototyped pipelines and re-writing the codebase to maximize reutilization across its components. Responsible for the structure and content of the top-level results.ipynb notebook.

Repository Structure

The directory structure of the project is organized as follows:

├── src/                        <- Source code modules
│   ├── data/                   <- Data loading and preprocessing logic
│   ├── models/                 <- Core modeling (Segmentation, Networks, Epidemiology)
│   ├── utils/                  <- Visualization and helper functions
│   ├── scripts/                <- Execution pipelines
│   └── configuration.py        <- Global settings
│
├── results.ipynb               <- Main analysis notebook (The Data Story)
├── pip_requirements.txt        <- Python dependencies
├── tests/                      <- Unit tests
└── README.md                   <- Project documentation

How to Execute the Code

To reproduce the analysis and results, follow these steps:

  1. Clone the repository

    git clone <project_link>
    cd <project_repo>
  2. Environment Setup: It is recommended to use a virtual environment to manage dependencies:

    # Create a virtual environment
    python -m venv venv
    
    # Activate the virtual environment
    # On Windows:
    .\venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate
  3. Install Dependencies: Install the required Python packages:

    pip install -r pip_requirements.txt
  4. Run the Analysis: Open the main notebook, results.ipynb, to view the data story and results.

    ⚠️ Please note that the notebook is lengthy and some cells may take tens of minutes to run. The notebook is also memory-intensive; make sure you have enough RAM to run it. ⚠️

Data Story

The data story website can be found here.
