Pair Trade Screener
Overview
This skill identifies and analyzes statistical arbitrage opportunities through pair trading. Pair trading is a market-neutral strategy that profits from the relative price movements of two correlated securities, regardless of overall market direction. The skill uses rigorous statistical methods including correlation analysis and cointegration testing to find robust trading pairs.
Core Methodology:
- Identify pairs of stocks with high correlation and similar sector/industry exposure
- Test for cointegration (long-term statistical relationship)
- Calculate spread z-scores to identify mean-reversion opportunities
- Generate entry/exit signals based on statistical thresholds
- Provide position sizing for market-neutral exposure
Key Advantages:
- Market-neutral: Profits in up, down, or sideways markets
- Risk management: Limited exposure to broad market movements
- Statistical foundation: Data-driven, not discretionary
- Diversification: Uncorrelated to traditional long-only strategies
When to Use This Skill
Use this skill when:
- User asks for "pair trading opportunities"
- User wants "market-neutral strategies"
- User requests "statistical arbitrage screening"
- User asks "which stocks move together?"
- User wants to hedge sector exposure
- User requests mean-reversion trade ideas
- User asks about relative value trading
Example user requests:
- "Find pair trading opportunities in the tech sector"
- "Which stocks are cointegrated?"
- "Screen for statistical arbitrage opportunities"
- "Find mean-reversion pairs"
- "What are good market-neutral trades right now?"
Analysis Workflow
Step 1: Define Pair Universe
Objective: Establish the pool of stocks to analyze for pair relationships.
Option A: Sector-Based Screening (Recommended)
Select a specific sector to screen:
- Technology
- Financials
- Healthcare
- Consumer Discretionary
- Industrials
- Energy
- Materials
- Consumer Staples
- Utilities
- Real Estate
- Communication Services
Option B: Custom Stock List
User provides specific tickers to analyze:
Example: [AAPL, MSFT, GOOGL, META, NVDA]
Option C: Industry-Specific
Narrow focus to specific industry within sector:
- Example: "Software" within Technology sector
- Example: "Regional Banks" within Financials
Filtering Criteria:
- Minimum market cap: $2B (mid-cap and above)
- Minimum average volume: 1M shares/day (liquidity requirement)
- Active trading: No delisted or inactive stocks
- Same exchange preference: Avoid cross-exchange complications
Step 2: Retrieve Historical Price Data
Objective: Fetch price history for correlation and cointegration analysis.
Data Requirements:
- Timeframe: 2 years of history (roughly 504 trading days); require at least 252 trading days per symbol
- Frequency: Daily closing prices
- Adjustments: Adjusted for splits and dividends
- Clean data: No gaps or missing values
FMP API Endpoint:
    GET /v3/historical-price-full/{symbol}?apikey=YOUR_API_KEY
Data Validation:
- Verify consistent date ranges across all symbols
- Remove stocks with >10% missing data
- Fill minor gaps with forward-fill method
- Log data quality issues
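The validation rules above could be sketched roughly as follows with pandas. The function name `clean_price_panel`, the `max_missing` parameter, and the input layout (a DataFrame of adjusted closes, indexed by date, one column per symbol) are illustrative assumptions, not part of the skill's scripts.

```python
import numpy as np
import pandas as pd


def clean_price_panel(prices: pd.DataFrame, max_missing: float = 0.10):
    """Drop sparse symbols and forward-fill small gaps.

    `prices` is assumed to hold adjusted closes, indexed by date,
    with one column per ticker (a hypothetical layout).
    """
    missing_share = prices.isna().mean()  # fraction of missing rows per symbol
    dropped = missing_share[missing_share > max_missing].index.tolist()
    kept = prices.drop(columns=dropped)
    # Forward-fill minor gaps, then drop any leading rows that are still NaN
    # so every remaining symbol covers the same date range.
    cleaned = kept.ffill().dropna(how="any")
    return cleaned, dropped


# Synthetic demo: AAA has one gap, BBB is missing half of its history.
dates = pd.bdate_range("2024-01-01", periods=10)
demo = pd.DataFrame(
    {
        "AAA": [10.0, 10.1, np.nan, 10.3, 10.2, 10.4, 10.5, 10.6, 10.4, 10.7],
        "BBB": [np.nan] * 5 + [20.0, 20.1, 20.2, 20.3, 20.4],
    },
    index=dates,
)
cleaned, dropped = clean_price_panel(demo)
print("Dropped symbols:", dropped)
```

Returning the dropped symbols alongside the cleaned panel makes it easy to log data quality issues, as the last rule asks.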
Script Execution:
    python scripts/fetch_price_data.py --sector Technology --lookback 730
Step 3: Calculate Correlation and Beta
Objective: Identify candidate pairs with strong linear relationships.
Correlation Analysis:
For each pair of stocks (i, j) in the universe:
1. Calculate Pearson correlation coefficient (ρ)
2. Calculate rolling correlation (90-day window) for stability check
3. Filter pairs with ρ >= 0.70 (strong positive correlation)
Correlation Interpretation:
- ρ >= 0.90: Very strong correlation (best candidates)
- 0.70 <= ρ < 0.90: Strong correlation (good candidates)
- 0.50 <= ρ < 0.70: Moderate correlation (marginal)
- ρ < 0.50: Weak correlation (exclude)
Beta Calculation:
For each candidate pair (Stock A, Stock B):
    Beta = Covariance(A, B) / Variance(B)
Beta indicates the hedge ratio:
- Beta = 1.0: Equal dollar amounts
- Beta = 1.5: $1.50 of B for every $1.00 of A
- Beta = 0.8: $0.80 of B for every $1.00 of A
Correlation Stability Check:
- Calculate correlation over multiple periods (6mo, 1yr, 2yr)
- Require correlation to be stable (not deteriorating)
- Flag pairs where recent correlation < historical correlation by >0.15
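A minimal sketch of this screening step, assuming a DataFrame with one column per symbol. The text does not say whether correlation and beta are computed on prices or on returns; this sketch uses price levels so that beta matches the spread formula used later. The function name `screen_pairs`, the output column names, and the 126-row "recent" window (about 6 months) are assumptions for illustration.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def screen_pairs(prices: pd.DataFrame, min_corr: float = 0.70) -> pd.DataFrame:
    """List pairs with full-sample correlation >= min_corr and their beta."""
    corr = prices.corr()  # Pearson correlation matrix
    rows = []
    for a, b in combinations(prices.columns, 2):
        rho = corr.loc[a, b]
        if not rho >= min_corr:  # also skips NaN correlations
            continue
        # Hedge ratio: Beta = Cov(A, B) / Var(B)
        beta = prices[a].cov(prices[b]) / prices[b].var()
        # 90-day rolling correlation; its minimum is a crude stability gauge
        rolling = prices[a].rolling(90).corr(prices[b])
        # Compare the last ~6 months (126 rows) with the full sample
        recent = prices[a].tail(126).corr(prices[b].tail(126))
        rows.append({
            "stock_a": a,
            "stock_b": b,
            "correlation": rho,
            "beta": beta,
            "min_rolling_corr": rolling.min(),
            "deteriorating": (rho - recent) > 0.15,
        })
    return pd.DataFrame(rows)


# Synthetic demo: A tracks 2 x B plus noise, C is unrelated noise.
rng = np.random.default_rng(0)
b = 100 + rng.normal(0, 1, 300).cumsum()
demo = pd.DataFrame({
    "A": 2 * b + rng.normal(0, 0.5, 300),
    "B": b,
    "C": rng.normal(50, 1, 300),
})
pairs = screen_pairs(demo)
print(pairs)
```

The pairwise loop is quadratic in the number of symbols, which is usually acceptable for a single sector but may need batching for larger universes.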
Step 4: Cointegration Testing
Objective: Statistically validate long-term equilibrium relationship.
Why Cointegration Matters:
- Correlation measures short-term co-movement
- Cointegration tests for a long-term equilibrium relationship
- Cointegrated pairs tend to revert toward that equilibrium
- Non-cointegrated pairs may diverge permanently
Augmented Dickey-Fuller (ADF) Test:
For each correlated pair:
1. Calculate spread: Spread = Price_A - (Beta × Price_B)
2. Run ADF test on spread series
3. Check p-value: p < 0.05 indicates cointegration (reject null hypothesis of unit root)
4. Extract ADF statistic for strength ranking
Cointegration Interpretation:
- p-value < 0.01: Very strong cointegration (★★★)
- p-value 0.01-0.05: Moderate cointegration (★★)
- p-value >= 0.05: No cointegration (exclude)
Half-Life Calculation:
Estimate mean-reversion speed:
    Half-Life = -log(2) / log(mean-reversion coefficient)
- Half-life < 30 days: Fast mean-reversion (good for short-term trading)
- Half-life 30-60 days: Moderate speed (standard)
- Half-life > 60 days: Slow mean-reversion (long holding periods)
Python Implementation:
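    from statsmodels.tsa.stattools import adfuller

    # Calculate the spread (price_a, price_b: aligned price series; beta: hedge ratio from Step 3)
    spread = price_a - (beta * price_b)

    # ADF test on the spread
    result = adfuller(spread)
    adf_stat = result[0]
    p_value = result[1]

    # Interpretation: reject the unit-root null at the 5% level
    is_cointegrated = p_value < 0.05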
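The half-life can be estimated from an AR(1) fit of the spread, s_t = c + φ·s_{t-1} + ε_t, with half-life = -ln(2) / ln(φ). The sketch below is one way to do this with plain NumPy; the function name `half_life` and the choice to return infinity when φ falls outside (0, 1) are assumptions rather than the skill's own implementation.

```python
import numpy as np


def half_life(spread) -> float:
    """Half-life in periods from an AR(1) fit: s_t = c + phi * s_{t-1} + e_t."""
    s = np.asarray(spread, dtype=float)
    lagged, current = s[:-1], s[1:]
    # np.polyfit with degree 1 returns [slope, intercept]; the slope is phi
    phi, _intercept = np.polyfit(lagged, current, 1)
    if not 0.0 < phi < 1.0:
        return float("inf")  # no measurable mean reversion
    return -np.log(2.0) / np.log(phi)


# Synthetic demo: simulate an AR(1) spread with phi = 0.9.
rng = np.random.default_rng(42)
phi_true = 0.9
sim = np.zeros(5000)
for t in range(1, len(sim)):
    sim[t] = phi_true * sim[t - 1] + rng.normal()

estimate = half_life(sim)
theoretical = -np.log(2.0) / np.log(phi_true)  # about 6.6 periods
print(f"estimated={estimate:.1f}, theoretical={theoretical:.1f}")
```

With daily data the result is in trading days, so it can be compared directly against the 30-day and 60-day bands above.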
Step 5: Spread Analysis and Z-Score Calculation
Objective: Quantify current spread deviation from equilibrium.
Spread Calculation:
Two common methods:
Method 1: Price Difference (Additive)
Spread = Price_A - (Beta × Price_B)
Best for: Stocks with similar price levels
Method 2: Price Ratio (Multiplicative)
Spread = Price_A / Price_B
Best for: Stocks with different price levels, easier interpretation
Z-Score Calculation:
Measures how many standard deviations spread is from its mean:
    Z-Score = (Current Spread - Mean Spread) / Std Dev of Spread
Z-Score Interpretation:
- Z > +2.0: Stock A expensive relative to B (short A, long B)
- +1.5 < Z <= +2.0: Moderately expensive (watch for entry)
- -1.5 <= Z <= +1.5: Normal range (no trade)
- -2.0 <= Z < -1.5: Moderately cheap (watch for entry)
- Z < -2.0: Stock A cheap relative to B (long A, short B)
Historical Spread Analysis:
- Calculate mean and std dev over 90-day rolling window
- Plot historical z-score distribution
- Identify maximum historical z-score deviations
- Check for structural breaks (spread regime change)
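A rolling z-score along these lines could look as follows. This is a sketch that assumes the spread is already available as a pandas Series; the function name `rolling_zscore` is illustrative, and the demo uses a 5-row window only to keep the numbers easy to check by hand.

```python
import pandas as pd


def rolling_zscore(spread: pd.Series, window: int = 90) -> pd.Series:
    """Z-score of the spread against its rolling mean and standard deviation."""
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std()  # sample standard deviation (ddof=1)
    return (spread - mean) / std


# Tiny demo with a 5-row window.
demo_spread = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
z = rolling_zscore(demo_spread, window=5)
print(z)
```

The first `window - 1` values are NaN because the rolling statistics are undefined until the window fills, so signals should only be read from the first complete window onward.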
Step 6: Generate Entry/Exit Recommendations
Objective: Provide actionable trading signals with clear rules.
Entry Conditions:
Conservative Approach (|Z| > 2.0):
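Long signal:
- Z-score < -2.0 (spread more than 2 standard deviations below its mean)
- Spread is mean-reverting (cointegration p < 0.05)
- Half-life < 60 days
→ Action: Buy Stock A, short Stock B (hedge ratio = beta)
Short signal:
- Z-score > +2.0 (spread more than 2 standard deviations above its mean)
- Spread is mean-reverting (cointegration p < 0.05)
- Half-life < 60 days
→ Action: Short Stock A, buy Stock B (hedge ratio = beta)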
Aggressive Approach (|Z| > 1.5):
- Lower threshold for more frequent trades
- Higher win rate but smaller avg profit per trade
- Requires tighter risk management
Exit Conditions:
Primary Exit: Mean Reversion (Z = 0)
Exit when the spread reverts to its mean (z-score crosses 0)
→ Close both legs simultaneously
Secondary Exit: Partial Profit Take
Exit 50% of the position when the z-score reaches ±1.0
Exit the remaining 50% when the z-score reaches 0
Stop Loss:
Exit if the z-score moves beyond ±3.0 (extreme divergence)
Time-Based Exit:
CODEBLOCK13
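The z-score rules in this step can be expressed as two small functions. This is a sketch under the thresholds used in this document (entry at |Z| > 2.0 with cointegration p < 0.05 and half-life < 60 days, partial exit at |Z| = 1.0, full exit at Z = 0, stop at |Z| > 3.0). The function names and return labels are hypothetical, and the time-based exit is not modeled here.

```python
def entry_signal(z, p_value, half_life_days, entry_z=2.0):
    """Return 'LONG_A_SHORT_B', 'SHORT_A_LONG_B', or 'NONE'."""
    # Only trade spreads that pass the cointegration and half-life filters.
    if p_value >= 0.05 or half_life_days >= 60:
        return "NONE"
    if z < -entry_z:
        return "LONG_A_SHORT_B"   # spread unusually low: A cheap relative to B
    if z > entry_z:
        return "SHORT_A_LONG_B"   # spread unusually high: A expensive relative to B
    return "NONE"


def exit_action(z, side, partial_z=1.0, stop_z=3.0):
    """Return 'STOP', 'CLOSE_ALL', 'CLOSE_HALF', or 'HOLD' for an open pair."""
    # Orient z so that a positive value means the spread has not yet reverted.
    remaining = -z if side == "LONG_A_SHORT_B" else z
    if remaining > stop_z:
        return "STOP"         # extreme divergence beyond the stop level
    if remaining <= 0:
        return "CLOSE_ALL"    # z-score has crossed zero
    if remaining <= partial_z:
        return "CLOSE_HALF"   # caller must track whether half is already closed
    return "HOLD"


print(entry_signal(-2.4, p_value=0.01, half_life_days=25))
print(exit_action(-0.8, "LONG_A_SHORT_B"))
```

Keeping entry and exit logic separate makes it straightforward to swap in the aggressive 1.5 threshold by passing `entry_z=1.5`.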
Step 7: Position Sizing and Risk Management
Objective: Determine dollar amounts for market-neutral exposure.
Market Neutral Sizing:
For a pair (Stock A, Stock B) with beta = β:
Equal Dollar Exposure:
CODEBLOCK14
Position Sizing Considerations:
- Total pair allocation: 10-20% of portfolio per pair
- Maximum pairs: 5-8 active pairs for diversification
- Correlation across pairs: Avoid highly correlated pairs
Risk Metrics:
- Maximum loss per pair: 2-3% of total portfolio
- Stop loss trigger: |Z-score| > 3.0 or a 5% loss on the spread position
- Portfolio-level risk: Sum of all pair risks ≤ 10%
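As a rough illustration of sizing, the sketch below splits a pair's capital using the dollar convention from Step 3 (beta dollars of Stock B for every dollar of Stock A). This is an assumption about how the hedge ratio is applied, not a reproduction of the skill's own sizing rules; the function name `size_pair` and its parameters are hypothetical.

```python
import math


def size_pair(portfolio_value, price_a, price_b, beta, allocation=0.10):
    """Split a pair's capital so that dollars_b = beta * dollars_a."""
    pair_capital = portfolio_value * allocation   # e.g. 10% of the portfolio
    dollars_a = pair_capital / (1.0 + beta)
    dollars_b = beta * dollars_a
    return {
        "dollars_a": dollars_a,
        "dollars_b": dollars_b,
        # Round down to whole shares so the allocation is never exceeded
        "shares_a": math.floor(dollars_a / price_a),
        "shares_b": math.floor(dollars_b / price_b),
    }


# Example: $100,000 portfolio, 10% allocated to the pair, beta = 1.5
sizes = size_pair(100_000, price_a=50.0, price_b=120.0, beta=1.5)
print(sizes)
```

One leg is bought and the other sold short according to the signal direction; the share counts are the same in either case.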
Step 8: Generate Pair Analysis Report
Objective: Create structured markdown report with findings and recommendations.
Report Sections:
1. Executive Summary
   - Total pairs analyzed
   - Number of cointegrated pairs found
   - Top 5 opportunities ranked by statistical strength
2. Cointegrated Pairs Table
   - Pair name (Stock A / Stock B)
   - Correlation coefficient
   - Cointegration p-value
   - Current z-score
   - Trade signal (Long/Short/None)
   - Half-life
3. Detailed Analysis (Top 10 Pairs)
   - Pair description
   - Statistical metrics
   - Current spread position
   - Entry/exit recommendations
   - Position sizing
   - Risk assessment
4. Spread Charts (Text-Based)
   - Historical z-score plot (ASCII art)
   - Entry/exit levels marked
   - Current position indicator
5. Risk Warnings
   - Pairs with deteriorating correlation
   - Structural breaks detected
   - Low liquidity warnings
File Naming Convention:
CODEBLOCK15
Example: INLINECODE1
Quality Standards
Statistical Rigor
Minimum Requirements for Valid Pair:
- ✓ Correlation ≥ 0.70 over 2-year period
- ✓ Cointegration p-value < 0.05 (ADF test)
- ✓ Spread stationarity confirmed
- ✓ Half-life < 90 days
- ✓ No structural breaks in recent 6 months
Red Flags (Exclude Pair):
- Correlation dropped >0.20 in recent 6 months
- Cointegration p-value > 0.05
- Half-life increasing over time (mean-reversion weakening)
- Significant corporate events (merger, spin-off, bankruptcy risk)
- Liquidity concerns (avg volume < 500K shares/day)
Practical Considerations
Transaction Costs:
- Assume a cost of 0.1% per trade on each leg
- Total cost per pair = 0.4% (2 legs × entry and exit × 0.1%)
- The expected profit from reversion at the entry z-score should exceed this total cost
Short Selling:
- Verify stock is shortable (not hard-to-borrow)
- Factor in short borrow costs (stock loan fees)
- Monitor short squeeze risk
Execution:
- Enter/exit both legs simultaneously (avoid leg risk)
- Use limit orders to control slippage
- Pre-locate shorts before entry
Available Scripts
scripts/find_pairs.py
Purpose: Screen for cointegrated pairs within a sector or custom list.
Usage:
CODEBLOCK16
Parameters:
- --sector: Sector name (Technology, Financials, etc.)
- INLINECODE3: Comma-separated list of tickers (alternative to sector)
- INLINECODE4: Minimum correlation threshold (default: 0.70)
- INLINECODE5: Minimum market cap filter (default: $2B)
- INLINECODE6: Historical data period (default: 730 days)
- INLINECODE7: Output JSON file (default: stdout)
- INLINECODE8: FMP API key (or set FMP_API_KEY env var)
Output:
CODEBLOCK17
scripts/analyze_spread.py
Purpose: Analyze a specific pair's spread behavior and generate trading signals.
Usage:
CODEBLOCK18
Parameters:
- --stock-a: First stock ticker
- INLINECODE10: Second stock ticker
- INLINECODE11: Analysis period (default: 365)
- INLINECODE12: Z-score threshold for entry (default: 2.0)
- INLINECODE13: Z-score threshold for exit (default: 0.0)
- INLINECODE14: FMP API key
Output:
- Current spread analysis
- Z-score calculation
- Entry/exit recommendations
- Position sizing
- Historical z-score chart (text)
Reference Documentation
references/methodology.md
Comprehensive guide to statistical arbitrage and pair trading:
- Pair Selection Criteria: How to identify good pair candidates
- Statistical Tests: Correlation, cointegration, stationarity
- Spread Construction: Price difference vs price ratio approaches
- Mean Reversion: Half-life calculation and interpretation
- Risk Management: Position sizing, stop losses, diversification
- Common Pitfalls: Survivorship bias, look-ahead bias, overfitting
references/cointegration_guide.md
Deep dive into cointegration testing:
- What is Cointegration?: Intuitive explanation
- ADF Test: Step-by-step procedure
- P-Value Interpretation: Statistical significance thresholds
- Half-Life Estimation: AR(1) model approach
- Structural Breaks: Testing for regime changes
- Practical Examples: Case studies with real pairs
Integration with Other Skills
Sector Analyst Integration:
- Use Sector Analyst to identify sectors in rotation
- Screen for pairs within outperforming sectors
- Pairs in leading sectors may have stronger trends
Technical Analyst Integration:
- Confirm pair entry/exit with individual stock technicals
- Check support/resistance levels before entry
- Validate trend direction aligns with spread signal
Backtest Expert Integration:
- Feed pair candidates to Backtest Expert for validation
- Test historical z-score entry/exit rules
- Optimize threshold parameters (entry z-score, stop loss)
- Walk-forward analysis for robustness
Market Environment Analysis Integration:
- Avoid pair trading during extreme volatility (VIX > 30)
- Correlations break down in crisis periods
- Prefer pair trading in sideways/range-bound markets
Portfolio Manager Integration:
- Track multiple pair positions
- Monitor overall market-neutral exposure
- Calculate portfolio-level pair trading P/L
- Rebalance hedge ratios periodically
Important Notes
- All analysis and output in English
- Statistical foundation: No discretionary interpretation
- Market neutral focus: Minimize directional beta exposure
- Data quality critical: Garbage in, garbage out
- Requires FMP API key: Free tier sufficient for basic screening
- Python dependencies: pandas, numpy, scipy, statsmodels
Common Use Cases
Use Case 1: Technology Sector Pairs
CODEBLOCK19
Use Case 2: Specific Pair Analysis
CODEBLOCK20
Use Case 3: Regional Bank Pairs
CODEBLOCK21
Troubleshooting
Problem: No cointegrated pairs found
Solutions:
- Expand universe (lower market cap threshold)
- Relax cointegration p-value to 0.10
- Try different sectors (Utilities often cointegrate well)
- Increase lookback period to 3 years
Problem: All z-scores near zero (no trade signals)
Solutions:
- Normal market condition (pairs in equilibrium)
- Check back later or expand universe
- Lower entry threshold to ±1.5 instead of ±2.0
Problem: Pair correlation broke down
Solutions:
- Check for corporate events (earnings, guidance changes)
- Verify no M&A activity or restructuring
- Remove pair from watchlist if structural break confirmed
- Monitor for 30 days before re-entering
API Requirements
- Required: FMP API key (free tier sufficient)
- Rate Limits: ~250 requests/day on free tier
- Data Usage: ~2 requests per symbol for 2-year history
- Upgrade: Professional plan ($29/mo) recommended for frequent screening
Resources
- FMP Historical Price API: https://site.financialmodelingprep.com/developer/docs/historical-price-full
- Stock Screener API: https://site.financialmodelingprep.com/developer/docs/stock-screener-api
- Statsmodels Documentation: https://www.statsmodels.org/stable/index.html
- Cointegration Paper: Engle & Granger (1987), "Co-Integration and Error Correction: Representation, Estimation, and Testing", Econometrica
Version: 1.0
Last Updated: 2025-11-08
Dependencies: Python 3.8+, pandas, numpy, scipy, statsmodels, requests
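Pair Trade Screener
Overview
This skill identifies and analyzes statistical arbitrage opportunities through pair trading. Pair trading is a market-neutral strategy that profits from the relative price movements of two correlated securities, regardless of overall market direction. The skill uses rigorous statistical methods, including correlation analysis and cointegration testing, to find robust trading pairs.
Core Methodology:
- Identify pairs of stocks with high correlation and similar sector/industry exposure
- Test for cointegration (long-term statistical relationship)
- Calculate spread z-scores to identify mean-reversion opportunities
- Generate entry/exit signals based on statistical thresholds
- Provide position sizing for market-neutral exposure
Key Advantages:
- Market-neutral: Profits in up, down, or sideways markets
- Risk management: Limited exposure to broad market movements
- Statistical foundation: Data-driven, not discretionary
- Diversification: Uncorrelated to traditional long-only strategies
When to Use This Skill
Use this skill when:
- User asks for pair trading opportunities
- User wants market-neutral strategies
- User requests statistical arbitrage screening
- User asks which stocks move together
- User wants to hedge sector exposure
- User requests mean-reversion trade ideas
- User asks about relative value trading
Example user requests:
- Find pair trading opportunities in the tech sector
- Which stocks are cointegrated?
- Screen for statistical arbitrage opportunities
- Find mean-reversion pairs
- What are good market-neutral trades right now?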
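Analysis Workflow
Step 1: Define Pair Universe
Objective: Establish the pool of stocks to analyze for pair relationships.
Option A: Sector-Based Screening (Recommended)
Select a specific sector to screen:
- Technology
- Financials
- Healthcare
- Consumer Discretionary
- Industrials
- Energy
- Materials
- Consumer Staples
- Utilities
- Real Estate
- Communication Services
Option B: Custom Stock List
User provides specific tickers to analyze:
Example: [AAPL, MSFT, GOOGL, META, NVDA]
Option C: Industry-Specific
Narrow focus to a specific industry within a sector:
- Example: Software within the Technology sector
- Example: Regional Banks within Financials
Filtering Criteria:
- Minimum market cap: $2B (mid-cap and above)
- Minimum average volume: 1M shares/day (liquidity requirement)
- Active trading: No delisted or suspended stocks
- Same exchange preference: Avoid cross-exchange complications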
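Step 2: Retrieve Historical Price Data
Objective: Fetch price history for correlation and cointegration analysis.
Data Requirements:
- Timeframe: 2 years (minimum 252 trading days)
- Frequency: Daily closing prices
- Adjustments: Adjusted for splits and dividends
- Clean data: No gaps or missing values
FMP API Endpoint:
    GET /v3/historical-price-full/{symbol}?apikey=YOUR_API_KEY
Data Validation:
- Verify consistent date ranges across all symbols
- Remove stocks with more than 10% missing data
- Fill minor gaps with the forward-fill method
- Log data quality issues
Script Execution:
    python scripts/fetch_price_data.py --sector Technology --lookback 730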
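Step 3: Calculate Correlation and Beta
Objective: Identify candidate pairs with strong linear relationships.
Correlation Analysis:
For each pair of stocks (i, j) in the universe:
1. Calculate Pearson correlation coefficient (ρ)
2. Calculate rolling correlation (90-day window) for stability check
3. Filter pairs with ρ >= 0.70 (strong positive correlation)
Correlation Interpretation:
- ρ >= 0.90: Very strong correlation (best candidates)
- 0.70 <= ρ < 0.90: Strong correlation (good candidates)
- 0.50 <= ρ < 0.70: Moderate correlation (marginal)
- ρ < 0.50: Weak correlation (exclude)
Beta Calculation:
For each candidate pair (Stock A, Stock B):
    Beta = Covariance(A, B) / Variance(B)
Beta indicates the hedge ratio:
- Beta = 1.0: Equal dollar amounts
- Beta = 1.5: $1.50 of B for every $1.00 of A
- Beta = 0.8: $0.80 of B for every $1.00 of A
Correlation Stability Check:
- Calculate correlation over multiple periods (6 months, 1 year, 2 years)
- Require correlation to be stable (not deteriorating)
- Flag pairs where recent correlation is below historical correlation by more than 0.15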
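Step 4: Cointegration Testing
Objective: Statistically validate the long-term equilibrium relationship.
Why Cointegration Matters:
- Correlation measures short-term co-movement
- Cointegration tests for a long-term equilibrium relationship
- Cointegrated pairs tend to revert toward that equilibrium
- Non-cointegrated pairs may diverge permanently
Augmented Dickey-Fuller (ADF) Test:
For each correlated pair:
1. Calculate spread: Spread = Price_A - (Beta × Price_B)
2. Run ADF test on spread series
3. Check p-value: p < 0.05 indicates cointegration (reject null hypothesis of unit root)
4. Extract ADF statistic for strength ranking
Cointegration Interpretation:
- p-value < 0.01: Very strong cointegration (★★★)
- p-value 0.01-0.05: Moderate cointegration (★★)
- p-value >= 0.05: No cointegration (exclude)
Half-Life Calculation:
Estimate mean-reversion speed:
    Half-Life = -log(2) / log(mean-reversion coefficient)
- Half-life < 30 days: Fast mean-reversion (good for short-term trading)
- Half-life 30-60 days: Moderate speed (standard)
- Half-life > 60 days: Slow mean-reversion (long holding periods)
Python Implementation:
    from statsmodels.tsa.stattools import adfuller

    # Calculate the spread (price_a, price_b: aligned price series; beta: hedge ratio from Step 3)
    spread = price_a - (beta * price_b)

    # ADF test on the spread
    result = adfuller(spread)
    adf_stat = result[0]
    p_value = result[1]

    # Interpretation: reject the unit-root null at the 5% level
    is_cointegrated = p_value < 0.05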
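Step 5: Spread Analysis and Z-Score Calculation
Objective: Quantify the current spread deviation from equilibrium.
Spread Calculation:
Two common methods:
Method 1: Price Difference (Additive)
Spread = Price_A - (Beta × Price_B)
Best for: Stocks with similar price levels
Method 2: Price Ratio (Multiplicative)
Spread = Price_A / Price_B
Best for: Stocks with different price levels, easier interpretation
Z-Score Calculation:
Measures how many standard deviations the spread is from its mean:
    Z-Score = (Current Spread - Mean Spread) / Std Dev of Spread
Z-Score Interpretation:
- Z > +2.0: Stock A expensive relative to B (short A, long B)
- +1.5 < Z <= +2.0: Moderately expensive (watch for entry)
- -1.5 <= Z <= +1.5: Normal range (no trade)
- -2.0 <= Z < -1.5: Moderately cheap (watch for entry)
- Z < -2.0: Stock A cheap relative to B (long A, short B)
Historical Spread Analysis:
- Calculate mean and std dev over 90-day rolling window
- Plot historical z-score distribution
- Identify maximum historical z-score deviations
- Check for structural breaks (spread regime change)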
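Step 6: Generate Entry/Exit Recommendations
Objective: Provide actionable trading signals with clear rules.
Entry Conditions:
Conservative Approach (|Z| > 2.0):
Long signal:
- Z-score < -2.0 (spread more than 2 standard deviations below its mean)
- Spread is mean-reverting (cointegration p < 0.05)
- Half-life < 60 days
→ Action: Buy Stock A, short Stock B (hedge ratio = beta)
Short signal:
- Z-score > +2.0 (spread more than 2 standard deviations above its mean)
- Spread is mean-reverting (cointegration p < 0.05)
- Half-life < 60 days
→ Action: Short Stock A, buy Stock B (hedge ratio = beta)
Aggressive Approach (|Z| > 1.5):
- Lower threshold for more frequent trades
- Higher win rate but smaller average profit per trade
- Requires tighter risk management
Exit Conditions:
Primary Exit: Mean Reversion (Z = 0)
Exit when the spread reverts to its mean (z-score crosses 0)
→ Close both legs simultaneously
Secondary Exit: Partial Profit Take
Exit 50% of the position when the z-score reaches ±1.0
Exit the remaining 50% when the z-score reaches 0
Stop Loss:
Exit if the z-score moves beyond ±3.0 (extreme divergence)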