A Hyperliquid perp trading bot. Mostly trend on majors with a mean-reverting sleeve and a daily mining loop that promotes new buckets to live. Written up so the curious — quants, traders, builders — can tell me where the edge is fragile.
Last updated 2026-04-27 · Reading time ~12 min · By @samlogic
00 Read this first
Frame
This is a working bot, not a finished one. It runs live on a small VPS and writes both shadow and live trades to a single SQLite ledger. I am sharing the thinking, not the keys.
What I want from anyone reading: a sparring read. The "Open questions" section (10) is where I think the edge is most fragile. I would rather you tear those up than be polite about anything.
Out of scope: code style, repo layout, infra. In scope: edge, signal, friction, sizing, regime, sample size, anything you would test first.
01 Edge thesis
Hyperliquid is a perps DEX with thin retail flow on tail listings, fast funding-rate swings, and a maker rebate that most of the on-chain crowd ignores. The bot bets that directional setups (momentum after consolidation, mean reversion after exhaustion) clear realistic friction (4.5 bps taker, 5 bps slip, hourly funding) more often on this venue than on a CEX of comparable depth, because the marginal participant is less informed and less hedged. That is the prior. The job of every iteration is to test it harder.
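To make the friction figures concrete, here is a minimal Python sketch of the round-trip cost they imply. The function names and the `funding_bps_per_hour` input are illustrative assumptions, not taken from the bot's code, and the sketch assumes both the entry and the exit pay the taker fee and the fixed slippage.

```python
# Friction figures quoted in the text; everything else here is illustrative.
TAKER_FEE_BPS = 4.5  # charged on each fill
SLIPPAGE_BPS = 5.0   # fixed slippage assumed on each fill


def round_trip_friction_bps(hours_held: float, funding_bps_per_hour: float) -> float:
    """Cost in bps of notional for one entry and one exit, plus hourly funding.

    funding_bps_per_hour is positive when the position pays funding and
    negative when it receives it (a hypothetical sign convention).
    """
    per_fill = TAKER_FEE_BPS + SLIPPAGE_BPS
    return 2 * per_fill + hours_held * funding_bps_per_hour


def net_move_bps(gross_move_bps: float, hours_held: float, funding_bps_per_hour: float) -> float:
    """Gross favourable price move minus friction, in bps of notional."""
    return gross_move_bps - round_trip_friction_bps(hours_held, funding_bps_per_hour)


if __name__ == "__main__":
    print(round_trip_friction_bps(0, 0.0))
    print(net_move_bps(50.0, 8, 0.125))
```

With zero funding the round trip costs 2 × (4.5 + 5) = 19 bps, so a setup has to clear roughly that much gross movement before any of it counts as edge.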
02 What's running
One Rust process, hl-bot, on a Hetzner VPS. Tick interval: 30s. Single SQLite ledger for shadow and live decisions.
Two coded strategies (momentum, mean_reversion) plus dynamically promoted buckets from the alpha-mining loop (ema_pullback, trend_alpha).
Live and shadow paths run together every tick. Shadow uses looser thresholds and feeds the daily training loop. Live uses strict thresholds and only fires if the candidate also matches a promoted bucket in live_candidates.json.
Read-only chain-of-thought HTTP server on 127.0.0.1:8931, exposing /decisions, /positions, /performance, /health. The "Live data" section below renders these once the tunnel is wired.
Daily cron at 03:00 UTC promotes new (strategy, side, session, regime, vol_bucket) tuples to live based on rolling-Sharpe. A second cron at 03:15 auto-implements queued ideas via Claude.
03 The loop, one tick
1. BTC regime check        → Long | Short | Neutral (single asset)
2. Account state           → equity, drawdown, daily PnL
   ├─ drawdown kill switch?  yes → flatten + halt
   └─ daily loss halt?       yes → no new entries
3. Manage open positions   → stop / target / partial-TP@1.5R
                           → BE move + trail (v4 only)
                           → liquidation check
4. Market snapshot         → funding rates + mark prices
5. Per-coin scan (every coin, every tick)
   ├─ momentum signal?        record shadow if new candle
   │                          score × regime multiplier
   ├─ mean_reversion signal?  record shadow if new candle
   │                          score × regime multiplier
   └─ best signal wins
6. Live gate               → score ≥ threshold AND
                             matches live_candidates.json bucket
                             AND not shadow_only
7. Execute                 → IOC limit, size = compute_size(risk_fraction)
                             reduce-only stop posted as backup order
Two things worth flagging. One: regime is a scoring input, not a hard gate. Mean reversion still runs in trend regimes, just with a 0.6 multiplier on score. Two: shadow inserts always happen, even when live is gated off, because the training loop runs on shadow data.
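Step 6 of the loop can be sketched as a three-part predicate. This is a Python illustration, not the Rust code in the bot: the `Candidate` type, the field names, and the threshold value are hypothetical, and the set of tuples stands in for whatever schema live_candidates.json actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One scored signal for one coin on one tick (hypothetical shape)."""
    strategy: str
    side: str
    session: str
    regime: str
    vol_bucket: str
    score: float
    shadow_only: bool = False

    def bucket(self) -> tuple:
        # The five-field tuple the mining loop promotes.
        return (self.strategy, self.side, self.session, self.regime, self.vol_bucket)


def passes_live_gate(c: Candidate, promoted: set, threshold: float) -> bool:
    """Live fires only if all three conditions from step 6 hold."""
    return c.score >= threshold and c.bucket() in promoted and not c.shadow_only


if __name__ == "__main__":
    promoted = {("momentum", "long", "US_MAIN", "Long", "low")}
    c = Candidate("momentum", "long", "US_MAIN", "Long", "low", score=80.0)
    print(passes_live_gate(c, promoted, threshold=70.0))
```

The shadow insert would happen before this check, which is why a candidate that fails the gate still ends up in the training data.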
04 Strategies
momentum
strategy.rs
EMA8/21 cross on 15m, confirmed by 20-bar breakout, RSI zone, volume spike, ATR floor, ADX ≥ 25 with rising slope, funding filter.
atr_stop: 1.5×
atr_target: 3.0×
rr: 1:2
adx_min: 25
vol_mult: 1.5
rsi_long: 40-80 (loose), with directional confirmation
Shadow: Same logic, ADX ≥ 15, vol ≥ 1.0×, RSI 40-80 LONG / 20-60 SHORT. Wider net for ML training data.
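As a worked example of the atr_stop and atr_target parameters, the sketch below turns an entry price and an ATR reading into stop and target levels. The function name and the "long"/"short" strings are assumptions for illustration; only the 1.5× and 3.0× multipliers come from the parameters above.

```python
def atr_levels(entry: float, atr: float, side: str,
               stop_mult: float = 1.5, target_mult: float = 3.0) -> tuple:
    """Return (stop, target) prices at fixed ATR multiples from entry."""
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    sign = 1.0 if side == "long" else -1.0
    stop = entry - sign * stop_mult * atr      # adverse side of entry
    target = entry + sign * target_mult * atr  # favourable side of entry
    return stop, target


if __name__ == "__main__":
    print(atr_levels(100.0, 2.0, "long"))
    print(atr_levels(100.0, 2.0, "short"))
```

With entry 100 and ATR 2, a long risks 3 to make 6, which is where the 1:2 rr figure comes from.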
mean_reversion
strategy.rs
RSI extreme + price outside Bollinger Band + tight ATR stop. Skipped when bands are expanding (BB width > 5%, that is a trend, not a reversion).
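The band-width skip can be illustrated as follows. This is a sketch under assumptions: width is taken as (upper − lower) / middle, with a 20-bar window and 2 standard deviations, which are conventional Bollinger defaults rather than values confirmed by the text. Only the 5% cutoff comes from the description above.

```python
from statistics import mean, pstdev


def bollinger_width(closes: list, window: int = 20, num_std: float = 2.0) -> float:
    """Band width as a fraction of the middle band over the last `window` closes."""
    if len(closes) < window:
        raise ValueError("not enough closes for the window")
    recent = closes[-window:]
    mid = mean(recent)
    half_band = num_std * pstdev(recent)
    return (2 * half_band) / mid


def skip_mean_reversion(closes: list, max_width: float = 0.05) -> bool:
    """True when the bands are wide enough to treat the move as a trend."""
    return bollinger_width(closes) > max_width


if __name__ == "__main__":
    print(skip_mean_reversion([100.0] * 20))
    print(skip_mean_reversion([95.0, 105.0] * 10))
```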
ema_pullback
live_candidates.json
Renamed from ema_cross when entry semantics changed (commit 95aa680). Now requires a pullback to the EMA, not a raw cross. Promoted via the alpha-mining loop; lives in live_candidates.json, gated by (session, regime, vol_bucket).
context: Currently active in EU_OPEN and US_MAIN sessions, low-vol bucket (vol_ratio < 0.6).
Shadow: Same.
trend_alpha
live_candidates.json
4H trend-following bucket on ETH / SOL / AVAX. Promoted from the mining loop, not coded as a first-class strategy in strategy.rs.
context: Live on the 4H timeframe basket. Recent commit history shows iteration on trail config (3R → 2R → revert).
Shadow: Same.
Real-vs-shadow split is the most important architectural choice in here. Shadow casts a wide net so the training loop has signal density. Live tightens every threshold. It is the cheapest way I have found to train on more data than I trade on without contaminating live PnL.
05 Regime and scoring
Regime is computed off BTC alone (1H bias plus a longer-window classifier in strategy::btc_regime) and applied as a scalar multiplier on the score of every candidate signal across the basket. For example, mean reversion in a trend regime is scored at 0.6×.
This is a soft preference, not a filter. Mean reversion can still beat momentum on score in a trend regime if the setup is clean enough. The regime multiplier just makes that uphill. The honest critique I expect is that single-asset BTC regime is a coarse proxy, and that alts do their own thing during liq cascades. I have left this on the open-questions list.
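The soft-preference behaviour can be sketched in a few lines of Python. Only the 0.6 multiplier for mean reversion in a trend regime comes from the text; treating Long and Short as the "trend" regimes, and using 1.0 everywhere else, are placeholder assumptions, as are the function names.

```python
def regime_multiplier(strategy: str, regime: str) -> float:
    """Scalar applied to a raw score given the BTC regime."""
    if strategy == "mean_reversion" and regime in ("Long", "Short"):
        return 0.6  # from the text: mean reversion is penalised in trend regimes
    return 1.0      # placeholder: the other multipliers are not given in the text


def best_signal(candidates: list, regime: str) -> tuple:
    """Pick the (strategy, adjusted_score) with the highest adjusted score.

    candidates is a list of (strategy, raw_score) pairs.
    """
    scored = [(name, raw * regime_multiplier(name, regime)) for name, raw in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    signals = [("momentum", 70.0), ("mean_reversion", 100.0)]
    print(best_signal(signals, "Long"))
    print(best_signal(signals, "Neutral"))
```

In this toy case a raw mean-reversion score of 100 loses to momentum at 70 in a Long regime, but a sufficiently clean setup (say a raw 130) would still win, which is the "uphill, not blocked" behaviour described above.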
06 Friction model evolution
Every shadow row is tagged with a sim_model_version string so I can re-run analyses against any single friction model and never accidentally mix models. The current default is v4, mirroring the live engine. Pre-v4 paper stats are not directly comparable.
Each entry below lists tag · added · sim_model_version where known, then what changed, why, and what I learned.

legacy · pre-2026-04 · legacy_pre_friction_v2
What: First Rust port of the Python prototype. Toy execution model, no realistic costs.
Why: Get a Rust loop running end-to-end and recording shadow trades to a single SQLite file.
Learned: Shadow PnL was meaningfully better than what live trading produced. Friction was the lurking variable.

v2 · 2026-04 · friction_v2_fee4p5_slip5_funding_hourly
What: Taker fee 4.5 bps, fixed slippage 5 bps, hourly funding cost. Sim entries and exits use these.
Why: Stop calling shadow PnL "edge" until it survives realistic execution costs.
Learned: Most legacy "winners" were costs in disguise. Edge survival rate dropped to roughly a third.

v3
What: Setup-key dedup. One economic setup, even if multiple strategy tags fire on the same candle, becomes one row in training data.
Why: Earlier records double-counted: momentum and ema_cross would both fire on the same breakout, and the trainer treated them as two independent observations.
Learned: Effective sample size shrank, which made some "high-conf" buckets clearly overfit on rebased data.

v4
What: Liquidation tracking, partial-TP at 1.5R for 50% of size, stop moves to breakeven on partial, trail kicks in further out.
Why: Live engine had partial-TP and trail; shadow did not. Paper and live diverged on every winning trade. v4 mirrors live, so paper and live now share the same friction model.
Learned: Pre-v4 paper stats are not comparable. Last two weeks of commits are still iterating on partial vs no-partial (Option A revert in 4e08a53).
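The setup-key dedup step is easy to picture in code. In this Python sketch the key is assumed to be (coin, side, candle timestamp) and the first row seen is kept; the real key composition and tie-break rule are not stated in the text, so treat both as hypothetical.

```python
def dedup_setups(rows: list) -> list:
    """Collapse rows that describe the same economic setup into one row.

    Each row is a dict with at least 'coin', 'side', and 'candle_ts'.
    The first row seen for a key is kept; later duplicates are dropped.
    """
    seen = {}
    for row in rows:
        key = (row["coin"], row["side"], row["candle_ts"])  # hypothetical setup key
        seen.setdefault(key, row)
    return list(seen.values())


if __name__ == "__main__":
    rows = [
        {"coin": "ETH", "side": "long", "candle_ts": 1000, "strategy": "momentum"},
        {"coin": "ETH", "side": "long", "candle_ts": 1000, "strategy": "ema_cross"},
        {"coin": "SOL", "side": "long", "candle_ts": 1000, "strategy": "momentum"},
    ]
    for row in dedup_setups(rows):
        print(row["coin"], row["strategy"])
```

Two tags firing on the same ETH breakout count once, so the trainer sees two observations here rather than three.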
07 Risk and exit management
Sizing. R-based: each trade risks risk_fraction of equity, where R is the entry-to-stop distance. See compute_size in execution.rs.
Drawdown kill switch. Beyond a configured floor, the bot flattens and stops opening new positions until manual reset.
Daily loss halt. Soft halt: existing positions still managed, no new entries until UTC rollover.
Concurrent positions. Capped by max_positions. Currently no correlation-aware sizing penalty (open question 5).
Partial-TP and trail (v4). At 1.5R profit, take 50% off, move stop to breakeven on the runner. Trailing kicks in at a further R threshold (currently 2R, recently revisited). All gated on sim_model_version == friction_v4 so paper stays consistent with live.
Backup stop on the exchange. Live runs post a reduce-only stop order on the book in addition to the in-process check. Belt and braces, after one incident where a process restart left a position unguarded.
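R-based sizing, as described in the first item above, reduces to one division. The sketch below is a Python illustration of that idea, not the compute_size function in execution.rs; the function name and the example numbers are hypothetical, and it ignores fees, leverage limits, and lot-size rounding.

```python
def size_for_risk(equity: float, risk_fraction: float, entry: float, stop: float) -> float:
    """Position size in units such that hitting the stop loses risk_fraction of equity."""
    stop_distance = abs(entry - stop)  # this is 1R in price terms
    if stop_distance == 0:
        raise ValueError("entry and stop must differ")
    risk_budget = equity * risk_fraction
    return risk_budget / stop_distance


if __name__ == "__main__":
    size = size_for_risk(equity=10_000.0, risk_fraction=0.01, entry=100.0, stop=97.0)
    print(size, size * (100.0 - 97.0))
```

A tighter stop produces a larger position for the same risk budget, which is why the stop multiple and the sizing rule cannot be tuned independently.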
08 Alpha mining loop
Every day at 03:00 UTC a Python loop on the same VPS reads the rolling shadow ledger, groups closed trades by (strategy, side, session, regime, vol_bucket), computes rolling-Sharpe over a window, and writes the surviving tuples to live_candidates.json. The bot reloads that file at startup. A second cron at 03:15 UTC takes any auto-implementable ideas from a queue and turns them into PRs.
The promotion gate is the part I am least confident in. Sample sizes per bucket are small and parameter counts are not. I have a hand-wavy "Sharpe over rolling window" rule rather than an adversarial one, and that is open question 8.
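A minimal sketch of the grouping and gating described above, assuming closed shadow trades are available as dicts with an `r_multiple` field and that "Sharpe" means mean over sample standard deviation of per-trade R, not annualised. The field names, `min_trades`, and `min_sharpe` are illustrative assumptions; the real loop's window and thresholds are not given in the text.

```python
from collections import defaultdict
from statistics import mean, stdev


def promote(trades: list, min_trades: int, min_sharpe: float) -> list:
    """Return the sorted bucket tuples whose per-trade Sharpe clears the gate."""
    by_bucket = defaultdict(list)
    for t in trades:
        key = (t["strategy"], t["side"], t["session"], t["regime"], t["vol_bucket"])
        by_bucket[key].append(t["r_multiple"])

    promoted = []
    for key, rs in by_bucket.items():
        if len(rs) < max(min_trades, 2):  # stdev needs at least two points
            continue
        sd = stdev(rs)
        if sd == 0:
            continue  # zero dispersion makes the ratio undefined; skip the bucket
        if mean(rs) / sd >= min_sharpe:
            promoted.append(key)
    return sorted(promoted)


if __name__ == "__main__":
    base = {"side": "long", "session": "US_MAIN", "regime": "Long", "vol_bucket": "low"}
    trades = [dict(base, strategy="momentum", r_multiple=r) for r in (1.0, 2.0, 1.0, 2.0)]
    trades += [dict(base, strategy="mean_reversion", r_multiple=r) for r in (1.0, -1.0, 1.0, -1.0)]
    print(promote(trades, min_trades=4, min_sharpe=0.5))
```

With five grouping fields, the number of buckets grows quickly relative to the number of trades, which is the sample-size concern raised in the paragraph above.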
09 Live data
Pulls from https://hl-cot.samlogic.org on page load. Auth-gated by Cloudflare Access; if a panel says "auth required", open https://hl-cot.samlogic.org/health once to sign in, then refresh.
Panels: Open positions · 7-day performance · Last 5 closed trades · Latest decisions (open + closed).
Each closed-trade row answers three questions: which strategy fired, what the bot saw at decision time, and what happened. The strategy field is the link from "this trade" to "this paragraph above", so you can always trace a winner or a loser back to the rules that produced it.
No free-text rationale, but (strategy, regime, score, features) is sufficient to reproduce the "why" deterministically. If you see a trade that looks wrong, the strategy tag points you to which gate let it through.
10 Open questions
Things I think are weak. Ranked roughly by my own confidence that they matter, descending. Tear up freely.
01 Single-asset regime classifier on BTC
Regime::Long/Short/Neutral is computed from BTC alone, then applied as a scoring multiplier across the basket. Alts decorrelate during liq cascades. Does this hold up on SOL/DOGE during BTC chop, or is the multiplier numerically right and economically wrong?
02 Partial-TP at 1.5R, then BE+trail
Recent commits flip-flop: Option A disabled partial entirely with a looser 3R/1.5R trail, then reverted to partial=0.5 at 2R trail. Is partial structurally negative EV in trend regimes (you cap the right tail), or just noisy on the current sample? Should partial be regime-conditional?
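The right-tail trade-off can be made concrete with a per-trade payoff in R. This Python sketch ignores friction and the trailing stop entirely and assumes the runner's stop sits exactly at breakeven after the partial, so it only illustrates the shape of the payoff; the function name and inputs are hypothetical.

```python
def realised_r(peak_r: float, exit_r: float,
               partial_fraction: float = 0.5, partial_at_r: float = 1.5) -> float:
    """Realised R-multiple for one trade, ignoring friction and the trail.

    peak_r: best unrealised R the trade reached.
    exit_r: R at which the remaining position finally exits (-1.0 at the initial stop).
    partial_fraction=0.0 models running the trade with no partial.
    """
    if partial_fraction == 0 or peak_r < partial_at_r:
        return exit_r  # the partial never triggered: full size exits together
    runner_r = max(exit_r, 0.0)  # stop moved to breakeven on the runner
    return partial_fraction * partial_at_r + (1 - partial_fraction) * runner_r


if __name__ == "__main__":
    for peak, final in [(1.0, -1.0), (1.6, -1.0), (4.0, 3.0)]:
        print(peak, final, realised_r(peak, final), realised_r(peak, final, partial_fraction=0.0))
```

Under these assumptions the partial turns a round trip from 1.6R back to the stop into +0.75R instead of -1R, but turns a 3R exit into 2.25R. Which effect dominates depends on how often trades that reach 1.5R go on to run, so it is an empirical question about the current sample.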
03 Slippage as a fixed 5 bps
Same slip applied to BTC ($M depth at 5bps) and DOGE ($k depth at 5bps). Probably too generous on tail names, too punitive on majors. Worth a per-coin or per-depth model? Or is constant-slip honest enough that the basket averages out?
04 ADX rising-over-3-bars filter
Adds 45 minutes of latency on the 15m before momentum can fire. Walk-forward shows it filters losers, but is the filter cutting fresh trends with the bad ones?
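One plausible reading of "rising over 3 bars" is three consecutive increases ending at or above adx_min, which is what the Python sketch below checks. The exact definition used in strategy.rs is not given in the text, so this interpretation and the function name are assumptions.

```python
def adx_rising(adx: list, bars: int = 3, adx_min: float = 25.0) -> bool:
    """True if ADX rose on each of the last `bars` closes and ends at or above adx_min."""
    if len(adx) < bars + 1:
        return False  # need bars + 1 readings to observe `bars` increases
    recent = adx[-(bars + 1):]
    rising = all(later > earlier for earlier, later in zip(recent, recent[1:]))
    return rising and recent[-1] >= adx_min


if __name__ == "__main__":
    print(adx_rising([20.0, 22.0, 24.0, 26.0]))
    print(adx_rising([20.0, 26.0, 25.0, 27.0]))
```

On 15m candles, three required increases mean the earliest possible signal is three closes after ADX turns up, which is the 45 minutes of latency in question.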
05 Concurrent correlated trades
Momentum on ETH + SOL + AVAX in a trend regime is functionally one trade with three legs. risk_fraction is per-position, not per-cluster. Should sizing penalize implied correlation, or is the basket a feature?
06 Funding filter thresholds
Headwind cap at 0.05%/8h, tailwind bonus floor at 0.01%/8h with bonus capped at 20 pts. The bonus floor is small enough to be noise on most pairs. Is funding a real signal or a costs-only term?
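The thresholds above might combine along these lines. The three constants marked "from the text" are the stated figures; the linear scaling of the bonus, the `FULL_BONUS_AT` level, and the reading of the headwind cap as a hard block on entry are all assumptions made for illustration, as is the function name.

```python
HEADWIND_CAP = 0.05    # %/8h, from the text
TAILWIND_FLOOR = 0.01  # %/8h, from the text
MAX_BONUS_PTS = 20.0   # score points, from the text
FULL_BONUS_AT = 0.05   # %/8h tailwind earning the full bonus (hypothetical)


def funding_adjustment(rate_pct_8h: float, side: str) -> tuple:
    """Return (allowed, bonus_points) for a candidate given the funding rate."""
    if side not in ("long", "short"):
        raise ValueError("side must be 'long' or 'short'")
    # Longs pay when funding is positive; shorts pay when it is negative.
    cost = rate_pct_8h if side == "long" else -rate_pct_8h
    if cost > HEADWIND_CAP:
        return False, 0.0  # assumed: too expensive to hold, so skip the entry
    tailwind = -cost
    if tailwind < TAILWIND_FLOOR:
        return True, 0.0   # no headwind block, but no bonus either
    return True, min(MAX_BONUS_PTS, MAX_BONUS_PTS * tailwind / FULL_BONUS_AT)


if __name__ == "__main__":
    for rate, side in [(0.06, "long"), (0.06, "short"), (0.005, "short")]:
        print(rate, side, funding_adjustment(rate, side))
```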
07 No book-side signals in the live path
book.rs and mm.rs exist but I have not yet wired their output into scoring. Closed-candle features are most of the signal. A ranked list of which microstructure features are worth the engineering cost is more useful than a yes/no.
08 Promotion gate from shadow to live
Daily mining loop promotes (strategy, side, session, regime, vol_bucket) tuples to live_candidates.json. The gate is rolling-Sharpe over a window. Sample size and alpha threshold are not yet adversarial. What is the right minimum n and t-stat before any bucket touches real money?
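For reference, a minimum-n plus t-stat gate could look like the Python sketch below. It uses the one-sample t-statistic of per-trade returns against zero, which equals the per-trade Sharpe times √n. The `min_n` and `min_t` values in the usage lines are placeholders, not recommendations, and the test assumes trades are independent, which the dedup history above suggests is not guaranteed.

```python
from math import sqrt
from statistics import mean, stdev


def t_stat(returns: list) -> float:
    """One-sample t-statistic of the mean return against zero."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    sd = stdev(returns)
    if sd == 0:
        raise ValueError("zero dispersion: t-stat undefined")
    return mean(returns) / (sd / sqrt(n))


def clears_gate(returns: list, min_n: int, min_t: float) -> bool:
    """True only if the bucket has enough trades and a large enough t-stat."""
    return len(returns) >= min_n and t_stat(returns) >= min_t


if __name__ == "__main__":
    small = [1.0, 2.0, 1.0, 2.0]
    print(t_stat(small), clears_gate(small, min_n=30, min_t=2.0))
    print(clears_gate(small * 10, min_n=30, min_t=2.0))
```

The first bucket has a high t-stat on four trades and still fails on min_n, which is the case a Sharpe-only rule would wave through.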
11 What I'd value from you
Read the open questions and tell me which are real and which are paper tigers.
Anywhere you see structural overfitting (parameter counts vs effective sample size), flag it.
If you had two weeks on this, what would you test first and why?
Anything I am not asking that I should be?
Notes via @samlogic on X are easiest. I keep this page updated, so a sharp comment one week can change a section the next.