Build Log

From $297 to an open-source truth detector

Three forex bots, fifteen production bugs, and a year of backtests that taught me one hard lesson: most trading "edges" are lucky streaks wearing a disguise. Here's the whole story — in order — ending with the tool I built to stop fooling myself.

The Gauntlet on GitHub →  ·  Live performance  ·  Long-form blog

The bot's backtest said PF 1.40. The live account's first 120 trades said PF 0.46. Closing that gap is what this journey has been about.

Timeline

Aug 2025 · V1 idea

A friend made $300,000 on OANDA

A friend of mine turned a small deposit into roughly $300k over a few years of careful, mostly-manual forex trading. I wanted to understand what he was doing well enough to automate a version of it on my own laptop — no VPS, no prop firm, no subscription signal service. Just a script that reads the market and takes the trades a disciplined person would.

Sep 2025 · V1 → V2

V2: A $297 live Python bot

The first real deployment was a ~4,200-line Python script running against an OANDA live micro-account with $297. Simple rules: EMA crossover with RSI confirmation, fixed percent risk, nothing fancy. It taught me more about spreads, slippage, and session liquidity in two months than any YouTube video.

Jan 2026 · V3

V3: Rebuilt on an $85k OANDA demo

V3 moved to an OANDA practice account with a sizeable demo balance so I could stress-test features that need more room: six strategies, session filters, regime detection, per-pair blocks, ML entry filters, confluence scoring. The codebase grew to ~7,200 lines with a PyQt desktop GUI and rolling logs.

The backtest looked good. The live account didn't.

Apr 2026 · The audit

The 45-trade audit that exposed the gap

After 17 days of live trading, V3 had closed 45 trades with a profit factor of 0.54 — a losing system. I paused the bot and replayed every one of those trades in a simulator with the same candles, the same signals, but with a different exit: a wide stop, a break-even shield, and a fixed 2.75R take-profit instead of a tight trailing stop.

Same 45 entries. Different exits. PF 1.60. The edge was always there — the exit was giving it all back.
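For reference, profit factor (PF) is gross profit divided by gross loss, so anything below 1.0 is a losing system. A minimal sketch of that calculation follows; the function name and the sample P&L list are illustrative assumptions, not code or trades from the bot.

```python
def profit_factor(pnls):
    """Gross profit divided by gross loss for a list of per-trade P&L values."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: PF is unbounded if there were wins, else zero.
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


# Illustrative numbers only, not trades from the audit.
sample = [27.5, -10.0, -10.0, 3.0, -10.0, 27.5]
print(round(profit_factor(sample), 2))
```

Running the same function over the same entries with two different exit rules is essentially what the audit did.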

Apr 11, 2026 · The winning config

Three-phase exit, locked in

The fix is three phases:

  • Phase A · Wide stop at 2 × ATR (8–14 pips). Absorb the noise.
  • Phase B · Once the trade gains +10 pips, move the stop to entry + 3 pips. Lock in a small win, keep upside open.
  • Phase C · Fixed take-profit at 2.75 R. No trailing: the tight trailing stop kept getting taken out by ordinary pullbacks.

Position sizing uses 1% flat risk calculated on the wide stop distance — so the dollar risk is constant even as the pip stop gets wider. Backtested the same way, simulated the same way, shipped live.
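The three phases and the sizing rule can be sketched roughly as below. This is not the bot's code: the class and function names are hypothetical, R is assumed to mean the initial wide-stop distance, and pip_value_per_unit is assumed to be the account-currency value of one pip for one unit (0.0001 for a USD-quoted pair in a USD account).

```python
from dataclasses import dataclass

SHIELD_TRIGGER_PIPS = 10.0   # Phase B trigger, from the text
SHIELD_LOCK_PIPS = 3.0       # Phase B: stop moves to entry + 3 pips
TP_R_MULTIPLE = 2.75         # Phase C: fixed take-profit in R
RISK_FRACTION = 0.01         # 1% flat risk


@dataclass
class Trade:
    direction: int        # +1 long, -1 short
    entry: float
    stop: float
    take_profit: float
    shielded: bool = False


def open_trade(direction, entry, atr_pips, pip_size):
    """Phase A and C: stop at 2 x ATR, fixed target at 2.75 R (R = stop distance)."""
    stop_pips = 2.0 * atr_pips
    stop = entry - direction * stop_pips * pip_size
    take_profit = entry + direction * TP_R_MULTIPLE * stop_pips * pip_size
    return Trade(direction, entry, stop, take_profit)


def apply_shield(trade, price, pip_size):
    """Phase B: at +10 pips or more, move the stop to entry + 3 pips, once."""
    gain_pips = trade.direction * (price - trade.entry) / pip_size
    if not trade.shielded and gain_pips >= SHIELD_TRIGGER_PIPS:
        trade.stop = trade.entry + trade.direction * SHIELD_LOCK_PIPS * pip_size
        trade.shielded = True
    return trade


def position_units(balance, stop_pips, pip_value_per_unit):
    """Units sized so a full wide-stop loss costs RISK_FRACTION of the balance."""
    return (balance * RISK_FRACTION) / (stop_pips * pip_value_per_unit)
```

Because the unit count is divided by the stop distance, a wider stop automatically means a smaller position, which is what keeps the dollar risk flat.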

Apr 11 · Bug hunt

Fifteen bugs between "config saved" and "bot runs it"

Rolling the config out wasn't one commit — it was fifteen. Each one was a subtle mismatch between what the simulator did and what the live execution actually did. Any single one of them was enough to turn a winning config into a losing one. A sample:

BUG #1

Breakeven racing the shield

Legacy break-even (BE) code fired at +3 pips and moved the stop first, so the shield at +10 never got the chance to act.

BUG #3

Partial TP half-closing

An old "take half off at 60% of TP" block was trimming every winner in half.

BUG #5

MFE override flipping exits

A maximum-favorable-excursion (MFE) rule was swapping the fixed TP back to the old trailing one on winning trades.

BUG #6

max_sl_pips clamp

A 10-pip cap on stops was silently narrowing the wide stop back to the old one on volatile pairs.

BUG #11

Dynamic risk scaling

Kelly-VAPS, ML confidence, and streak multipliers compounded — on losing days, 1% risk became 0.47%.
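To see how three individually modest multipliers compound, here is a small sketch. The multiplier values are hypothetical, chosen only so the product lands near the 0.47% quoted above; the real values are not given in this post.

```python
import math

BASE_RISK = 0.01  # the intended flat 1%

# Hypothetical multipliers for a losing day (assumed, not from the bot).
multipliers = {
    "kelly_vaps": 0.75,
    "ml_confidence": 0.80,
    "losing_streak": 0.78,
}

# Each scaler looks harmless alone; applied together they multiply.
effective_risk = BASE_RISK * math.prod(multipliers.values())
print(f"{effective_risk:.2%}")
```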

BUG #14

Empty-dict deep merge

Config override of {} didn't clear the default — so a stale pair-risk override kept leaking through.
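A sketch of how this class of bug tends to arise in a recursive merge, and one possible fix. The function names and config keys are hypothetical; the bot's actual merge code is not shown in this post.

```python
def deep_merge_buggy(defaults, override):
    """Recursive merge where {} means 'nothing to add', so defaults survive."""
    merged = dict(defaults)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_buggy(merged[key], value)
        else:
            merged[key] = value
    return merged


def deep_merge_fixed(defaults, override):
    """Same merge, but an explicit empty dict replaces (clears) the default."""
    merged = dict(defaults)
    for key, value in override.items():
        if isinstance(value, dict) and value and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_fixed(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical config: a stale per-pair override that the user wants cleared.
defaults = {"pair_risk": {"EUR_GBP": 0.005}, "risk": 0.01}
override = {"pair_risk": {}}

print(deep_merge_buggy(defaults, override)["pair_risk"])  # stale value leaks through
print(deep_merge_fixed(defaults, override)["pair_risk"])  # cleared
```

Treating an empty dict as "clear this section" is a design choice; a sentinel value such as None would work too, as long as the simulator and the live engine agree on it.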

BUG #15

500K unit cap

On an $85k balance, the 500,000-unit cap halved every trade's real risk to 0.5%, which would have halved the expected return.
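The arithmetic behind that halving, as a sketch. The 8.5-pip stop and the pip value of 0.0001 USD per unit (a USD-quoted pair in a USD account) are assumptions picked for illustration; the function name is hypothetical.

```python
MAX_UNITS = 500_000  # the cap described above


def capped_risk_fraction(balance, risk_fraction, stop_pips, pip_value_per_unit):
    """Return the risk fraction actually taken once the unit cap is applied."""
    wanted_units = balance * risk_fraction / (stop_pips * pip_value_per_unit)
    units = min(wanted_units, MAX_UNITS)
    return units * stop_pips * pip_value_per_unit / balance


# Assumed: 8.5-pip stop, so 1% of $85,000 asks for about 1,000,000 units.
actual = capped_risk_fraction(85_000, 0.01, 8.5, 0.0001)
print(f"{actual:.2%}")
```

With a wider stop the requested size drops below the cap and the full 1% is taken, which is why a bug like this only shows up on some trades.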

… + 8 more

Read the full list

Pyramiding logic, tick-level proportional BE, two separate MFE-tracking paths, an unscoped local variable, a missing account fallback. The blog post walks through every one.

Apr 12 · Winning config live

Relaunched on the demo account

With all 15 bugs fixed and a dry-run passing on $85,733 balance (all 7 pairs sizing at exactly 1.00% risk), V3 went live Sunday evening. The 7-day tab on the Journal page is watching this run in real time.

Apr 18, 2026 · V4

V4: A Rust rewrite for the hot path

V3's weakness isn't the strategy anymore — it's the Python GIL and the ~30ms latency between a price tick and a decision. V4 rebuilds the data plane in Rust: async OANDA streaming, lock-free indicator updates, sub-millisecond decide() latency, the same PyQt GUI reading shared state through JSON files and SQLite.

V4 runs in shadow mode first — every signal is logged as if the bot had traded it, but no orders are actually submitted. This lets us validate against live prices for a week before flipping execute_trades = true.
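Shadow mode boils down to one gate in front of the order call. V4 itself is written in Rust; the sketch below uses Python for consistency with the rest of this log, and everything except the execute_trades flag name is a hypothetical stand-in.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowExecutor:
    execute_trades: bool = False          # flag name taken from the text
    shadow_log: list = field(default_factory=list)

    def handle_signal(self, signal, submit_order):
        """Log every signal; only call the broker when execute_trades is on."""
        record = {"signal": signal, "submitted": self.execute_trades}
        self.shadow_log.append(record)
        if self.execute_trades:
            submit_order(signal)
        return record


# Usage: a plain list stands in for the broker's order endpoint.
sent_orders = []
executor = ShadowExecutor()
executor.handle_signal({"pair": "EUR_USD", "side": "buy"}, sent_orders.append)
print(len(executor.shadow_log), len(sent_orders))
```

Keeping the logging path identical in both modes means the shadow week exercises the same code that will later place real orders.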

Apr 19 · V4 live, journal public

Publishing the journal instead of selling signals

This site used to advertise paid signal subscriptions on the strength of a backtest. The live numbers didn't back that up — so instead of selling signals, I published the full trade log. Every trade visible, wins and losses.

May 2026 · The deeper problem

It was never just one bug

Fixing the exit helped, but the gap between backtest and reality kept coming back in new forms. The real problem was bigger than any single bug: I kept finding "edges" that were really just lucky stretches. Mine enough variations of a strategy, keep the best-looking one, and a coin flip starts to look like skill. I was fooling myself — methodically.
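The selection effect is easy to reproduce. The sketch below mines a batch of strategies that have no edge at all (each trade is a fair coin flip for plus or minus one unit) and reports the best one next to the median. The variant count, trade count, and function names are arbitrary assumptions.

```python
import random


def coin_flip_profit_factor(rng, n_trades):
    """PF of a strategy with no edge: each trade wins or loses 1 unit at 50/50."""
    wins = sum(1 for _ in range(n_trades) if rng.random() < 0.5)
    losses = n_trades - wins
    return wins / losses if losses else float("inf")


def best_of_variants(n_variants, n_trades, seed=0):
    """Mine n_variants zero-edge strategies; return (best PF, median PF)."""
    rng = random.Random(seed)
    pfs = sorted(coin_flip_profit_factor(rng, n_trades) for _ in range(n_variants))
    return pfs[-1], pfs[len(pfs) // 2]


best, median = best_of_variants(n_variants=200, n_trades=100)
print(f"best PF {best:.2f}, median PF {median:.2f}")
```

The median sits near 1.0, as it should for a coin flip, while the best variant looks like a tradable system. Nothing about the winner is real except the fact that it was picked.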

May 2026 · The Gauntlet

I built a tool whose only job is to say no

So I stopped tuning strategies and built a validation framework with one purpose: to reject a strategy, not bless it. It runs a strategy's trades through a row of statistical kill-tests — a Deflated Sharpe that penalizes how many variants you tried, a bootstrap that exposes the real drawdown tail, a cost-stress test, a regime split, a parameter-plateau check.
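As one example of such a kill-test, here is a minimal sketch of a bootstrap drawdown check: resample the trade list with replacement many times and look at a high quantile of the resulting max drawdowns rather than the single path the backtest happened to produce. This illustrates the general technique and is not the Gauntlet's actual implementation; names and defaults are assumptions.

```python
import random


def max_drawdown(pnls):
    """Largest peak-to-trough drop of the cumulative P&L curve."""
    equity = peak = worst = 0.0
    for pnl in pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def bootstrap_drawdown(pnls, n_resamples=2000, quantile=0.95, seed=0):
    """Resample trades with replacement; return the chosen drawdown quantile."""
    rng = random.Random(seed)
    draws = sorted(
        max_drawdown(rng.choices(pnls, k=len(pnls))) for _ in range(n_resamples)
    )
    index = min(len(draws) - 1, int(quantile * len(draws)))
    return draws[index]


# Illustrative trades in R multiples (not real results).
trades = [2.75, -1.0, -1.0, 2.75, -1.0, -1.0, -1.0, 2.75]
print(max_drawdown(trades), bootstrap_drawdown(trades))
```

A plain trade-level bootstrap like this assumes trades are independent; if losses cluster in time, a block bootstrap would be the more conservative choice.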

Then I ran everything I'd ever built through it. Most of it died. That was the point — a backtest that can't fail you can't protect you either.

May–Jun 2026 · The honest negatives

What didn't survive (almost all of it)

Tested cleanly — real spreads, no look-ahead, multi-year samples:

  • Intraday scalping · died on the spread; the edge per trade was smaller than the cost.
  • Mean reversion · slightly negative; gating it to "ranging only" made it worse.
  • Trend-following on FX · basically zero, at every lookback.
  • Momentum, pairs/cointegration, intermarket, month-end flow · all lost, or didn't survive costs.
  • "Confluence" (only trade when everything agrees) · made it worse — averaging a real signal with noise just adds noise.

The pattern was undeniable: simple price-pattern edges on FX majors don't survive realistic costs. The good stretches were hot streaks that gave themselves back.
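A cost-stress check can be as simple as charging every trade a round-trip cost and watching where the average trade crosses zero. The sketch below uses made-up scalping numbers and hypothetical function names; it is only meant to show the shape of the test.

```python
def expectancy_after_costs(gross_pips, cost_pips):
    """Average pips per trade once a round-trip cost is charged to every trade."""
    if not gross_pips:
        raise ValueError("need at least one trade")
    return sum(p - cost_pips for p in gross_pips) / len(gross_pips)


def breakeven_cost(gross_pips):
    """Round-trip cost (in pips) at which the average trade nets exactly zero."""
    if not gross_pips:
        raise ValueError("need at least one trade")
    return sum(gross_pips) / len(gross_pips)


# Illustrative scalping trades in pips before costs (not real results).
scalps = [4.0, -3.0, 5.0, -3.0, 4.0, -3.0, -3.0, 5.0]
for cost in (0.0, 0.5, 1.0, 1.5):
    print(cost, round(expectancy_after_costs(scalps, cost), 3))
```

If the break-even cost is below the spread plus slippage you actually pay, the strategy has no edge to trade, however good the gross curve looks.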

Jun 2026 · What actually survived

Two real edges — and a portfolio that beat them both

Carry (earning the interest-rate differential) was small but real. Trend-following on a diversified basket of indices, metals and bonds was the strongest, most robust thing I found. And stacking a few genuinely uncorrelated thin edges into one portfolio beat any of them alone — the best result of the entire project.
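The reason stacking works is ordinary diversification arithmetic. Under simplifying assumptions (each stream scaled to the same volatility, one shared pairwise correlation), the blended Sharpe is the sum of the individual Sharpes divided by sqrt(n + n(n-1)ρ). A sketch, with a hypothetical function name and illustrative inputs:

```python
import math


def combined_sharpe(sharpes, correlation=0.0):
    """Sharpe of an equal-risk blend of streams sharing one pairwise correlation."""
    n = len(sharpes)
    if n == 0:
        raise ValueError("need at least one stream")
    variance = n + n * (n - 1) * correlation
    if variance <= 0:
        raise ValueError("correlation too negative for this many streams")
    return sum(sharpes) / math.sqrt(variance)


# Three thin edges at Sharpe 0.4 each (illustrative values, not measured ones).
print(round(combined_sharpe([0.4, 0.4, 0.4], correlation=0.0), 3))
print(round(combined_sharpe([0.4, 0.4, 0.4], correlation=1.0), 3))
```

With zero correlation the blend improves by a factor of sqrt(3); with perfect correlation it improves by nothing, which is why "genuinely uncorrelated" is the part that matters.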

Jun 2026 · The catch that explained the whole year

The one edge that worked is one I can't trade

I queried my own live account to be certain. OANDA's US entity lets US retail traders trade spot FX only: no indices, no oil, no bonds, not even gold. The diversified trend edge, the strongest thing I found, runs on instruments my account literally cannot touch.

That reframed everything. The year of struggle wasn't a strategy problem — it was an access problem. I'd been forced to fish in the one market (hyper-efficient FX majors) where a small mechanical edge is hardest to find.

Jun 2026 · Now

Open-sourcing the thing that told the truth

The most valuable thing I built this year wasn't a money-making strategy. It was the tool that reliably told me when I didn't have one — and kept me from betting real money on luck. So I'm giving it away.

The Gauntlet is on GitHub → One file, zero dependencies. Run the demo and watch it pass a real edge and kill a data-snooped fake in about a minute.

What we learned

A backtest is a hypothesis, not a result

Our backtest and our live bot ran different code. Same repo, same config file — different code paths. The only way to catch that is to replay real trades against the live engine, not just against a tidy simulator loop.

Exits matter more than entries

The same 45 entries produced a losing system with one exit and a winning one with another. We spent months tuning signal filters when the leverage was in the exit logic.

Hidden multipliers eat edge

Dynamic risk scaling sounds responsible and reasonable. In practice it meant that after a losing streak our positions were half-sized by the time the winning trade showed up. A flat 1% is boring, and boring is profitable.

Per-pair blocks beat universal tuning

EUR/GBP lost 6 straight on our account. Rather than re-tune it, we blocked it — and the next winning trade somewhere else paid for the analysis. Blocking a pair is cheap; re-tuning is expensive.

Most "edges" are just luck in disguise

Try enough variations and keep the best, and randomness alone hands you a great-looking backtest. The antidote is a Deflated Sharpe that charges you for every variant you tried — it's the gate that killed most of my "winners."
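For the curious, a minimal stdlib sketch of the Deflated Sharpe Ratio in the form published by Bailey and López de Prado. It is not the Gauntlet's code; the function names are hypothetical. It assumes a per-period (not annualized) Sharpe, n_obs return observations, the variance of the Sharpe estimates across all variants tried, and non-excess kurtosis (3.0 for a normal distribution).

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_NORMAL = NormalDist()


def expected_max_sharpe(n_trials, sharpe_variance):
    """Expected best Sharpe among n_trials strategies that all have zero edge."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    a = _NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sharpe_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(sharpe, n_obs, n_trials, sharpe_variance, skew=0.0, kurtosis=3.0):
    """Probability that the true Sharpe beats the luck benchmark for n_trials."""
    benchmark = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe**2)
    z = (sharpe - benchmark) * math.sqrt(n_obs - 1) / denom
    return _NORMAL.cdf(z)


# Same observed Sharpe, very different verdicts depending on how many variants ran.
few = deflated_sharpe(0.1, n_obs=500, n_trials=2, sharpe_variance=0.01)
many = deflated_sharpe(0.1, n_obs=500, n_trials=1000, sharpe_variance=0.01)
print(round(few, 3), round(many, 6))
```

The only input that changes between the two calls is the number of variants tried, which is exactly the number most backtest reports leave out.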

Access beats strategy

The strongest edge I found needed instruments my US spot-FX account can't trade. A whole year of pain turned out to be the wrong market, not the wrong code. Where you're allowed to trade matters more than how clever the rules are.

The negative results were the product

I didn't find a money printer. I found a reliable way to know when I don't have an edge — worth more than another hopeful strategy. An honest "no" is the rarest and most valuable answer in trading.

See the tool, or read the long version

The Gauntlet is open source — one file, no dependencies, with a runnable demo that passes a real edge and kills a data-snooped fake in a minute. And I'm writing up each phase on the blog with the real code and the real numbers.

The Gauntlet on GitHub →  ·  Read on WordPress