Show HN: March Madness Bracket Challenge for AI Agents Only
by bwade818
I built a March Madness bracket challenge for AI agents, not humans. The human prompts their agent with the URL, and the agent reads the API docs, registers itself, picks all 63 games, and submits a bracket autonomously. A leaderboard tracks which AI picks the best bracket through the tournament.
The interesting design problem was building for an agent-first user. I came up with a solution where agents that hit the homepage receive plain-text API instructions and humans get the normal visual site. Early on I found most agents were trying to use Playwright to browse the site instead of just reading the docs, so I made some changes to detect HeadlessChrome and serve specific HTML that agents can read. This forced me to think about agent UX even more. I think there are some really cool ideas to pull on.
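A minimal sketch of the kind of check described above, written in Python for brevity rather than the site's actual TypeScript. Only the HeadlessChrome marker comes from the post; the other markers, the `Accept` header rule, and the function name `wants_agent_view` are assumptions for illustration.

```python
# Hypothetical marker list: the post only confirms that HeadlessChrome is detected.
AGENT_MARKERS = ("headlesschrome", "curl/", "python-requests")


def wants_agent_view(user_agent: str, accept: str = "") -> bool:
    """Return True when the request looks like it comes from an agent or script."""
    ua = user_agent.lower()
    if any(marker in ua for marker in AGENT_MARKERS):
        return True
    # Assumption: a client that sends an Accept header without text/html
    # is not a regular browser, so it gets the plain-text instructions.
    return bool(accept) and "text/html" not in accept.lower()


browser_ua = "Mozilla/5.0 (Macintosh) AppleWebKit/537.36 Chrome/120.0 Safari/537.36"
headless_ua = "Mozilla/5.0 (X11; Linux x86_64) HeadlessChrome/120.0 Safari/537.36"

print(wants_agent_view(browser_ua, "text/html,application/xhtml+xml"))
print(wants_agent_view(headless_ua, "text/html"))
```

Header sniffing like this is easy to spoof in either direction, so it works as a routing hint for UX, not as access control.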
The timeline introduced an interesting dynamic. I had to launch the challenge shortly after the brackets were announced on Sunday afternoon to start getting users by the Thursday morning deadline. While I could test on the 2025 bracket, I wouldn't be able to get feedback on my MVP. So I used AI to create user personas and agents as test users to run through the signup and management process. It gave me valuable reps to feel confident launching.
The stack is Next.js 16, TypeScript, Supabase, Tailwind v4, Vercel, Resend, and finally Claude Code for ~95% of the build.
Works with any model that can call an API — Claude, GPT, Gemini, open source, whatever. Brackets are due Thursday morning before the First Round tips off.
Bracketmadness.ai
By the way, for anyone wondering/unfamiliar, the scale of this tournament is such that it's not realistically possible to enumerate all possible outcomes, let alone submit them to a site like this. With 63 games, there are 2^63 possible brackets, and it takes 63 bits to encode each possibility.
2^63 brackets * 8 bytes/bracket ~= 74 exabytes - just to list all possible combinations!
There are many combinations that are completely unlikely, but even if you could reduce this by 90% (I doubt it) it's still infeasible to even list all the combinations.
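The arithmetic in the comment above can be checked in a few lines. This is only a sanity check of the quoted figures; it assumes each bracket is stored in one 8-byte word (63 bits rounded up) and uses decimal exabytes (10^18 bytes).

```python
GAMES = 63
BYTES_PER_BRACKET = 8  # 63 bits rounded up to one 64-bit word

brackets = 2 ** GAMES
total_bytes = brackets * BYTES_PER_BRACKET
exabytes = total_bytes / 10 ** 18

# Even after discarding 90% of the brackets as implausible,
# a tenth of the list would remain.
reduced_exabytes = exabytes * 0.1

print(f"{brackets:,} possible brackets")
print(f"about {exabytes:.1f} EB to list them all")
print(f"about {reduced_exabytes:.1f} EB after a 90% reduction")
```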
Someday, in another 20-30 years, this might be achievable. Somehow I feel like it will be a sad day when that happens. Of course the tournament will probably have expanded to 128 by then making it safely out of reach of computation.
Apparently there has never been a perfect submitted bracket. According to the NCAA, the furthest anyone has gone is 49 games, in 2019.
https://www.ncaa.com/news/basketball-men/bracketiq/2026-02-2...
It would be interesting to have a couple of "control" brackets, like one that simply picks a random winner for each game and one that always picks the highest seed as the winner for each game.
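The two control brackets suggested above could be generated roughly like this. It is a sketch under assumptions: teams are reduced to `(region, seed)` pairs, "highest seed" is read as the better-seeded team (the lower seed number), ties between equal seeds go to the first team listed, and the order in which regions meet in the Final Four is arbitrary here. None of this reflects the site's actual data format.

```python
import random
from typing import Callable, List, Tuple

Team = Tuple[str, int]  # (region, seed); a lower seed number is a better seed

# Standard first-round pairing order within one 16-team region.
SEED_ORDER = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]


def build_field(regions: List[str]) -> List[Team]:
    """List all 64 teams in bracket order, region by region."""
    return [(region, seed) for region in regions for seed in SEED_ORDER]


def fill_bracket(field: List[Team], pick: Callable[[Team, Team], Team]) -> List[Team]:
    """Play out a single-elimination bracket and return the pick for every game."""
    winners: List[Team] = []
    current = list(field)
    while len(current) > 1:
        next_round = [pick(current[i], current[i + 1]) for i in range(0, len(current), 2)]
        winners.extend(next_round)
        current = next_round
    return winners


def chalk(a: Team, b: Team) -> Team:
    # Better (numerically lower) seed advances; equal seeds go to the first team.
    return a if a[1] <= b[1] else b


def coin_flip(a: Team, b: Team) -> Team:
    return random.choice([a, b])


field = build_field(["East", "West", "South", "Midwest"])
chalk_picks = fill_bracket(field, chalk)
random_picks = fill_bracket(field, coin_flip)
print(len(chalk_picks), "picks; chalk champion:", chalk_picks[-1])
```

Scoring both against the real results would give a floor (coin flips) and a simple baseline (chalk) to compare the agents' brackets against.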
It would actually be neat to have human-picked brackets in here too, or at least import a few expert-picked brackets from various sources for comparison.
I wonder if the edge here is not going to come down to which model you choose, but which sources of information you give it. You'll want stats on every team and player, injuries, and expert analysis, because none of this season is going to be in the training sets.
the edge is going to come down to variance just as God intended
lol
good idea!
Love it! Just this morning I asked my claw to fill out a bracket on ESPN and invited it to join a group with me. It was a bit clunky (Disney's signup within an iframe was tricky and navigating the bracket to make picks with JS took a few repeated tries) but felt pretty science-fiction when it actually worked.
I thought about using claw but felt like overkill and wonder if an AI browser (atlas etc) would do the trick.
For sure it was overkill/not the most efficient approach - really I was more just curious if it would work. The answer was "kind of", but even that is pretty amazing. I can't imagine telling myself 5 years ago that I could text a computer and have it fill out its own bracket on a commercial site like ESPN.
It's going to be cool when you put on your to-do list in the morning that you need to fill out your ESPN bracket, and by lunch your agent has 3 different versions ready for your review.
Designing interactions for autonomous agents is tricky — you can’t assume a human will click through a UI. I’ve been experimenting with autonomous scientific agents: a lightweight Python system that uses sparse regression to derive physical laws from raw data. It was able to estimate the Sun’s ~27‑day rotation period to within 93 % accuracy and even found a temperature ∝ v^3.40 power law in solar‑wind measurements. Experiences like yours building an API‑first bracket challenge mirror the same need: build clear machine‑readable interfaces so agents can focus on analysis, not wrestling with front‑end logic.
That's really cool what you are doing, although I won't pretend to understand it lol
The higher-end models and agents seem to get it, but even my plain-English API instructions trip up browser-based AI like ChatGPT and Gemini.
Thanks! The key insight: don't fight the model's limitations, design around them.
Our agents never touch retrieval or search — that's all deterministic code (FTS, sparse regression, power-law fitting). The LLM only comes in at the end to synthesize results it can verify against the data.
The "plain English instructions trip up browser AI" problem mostly comes from those models trying to do too many things at once.
Narrow the scope, nail the output format, and even mid-tier models get reliable.
amen
There isn't an LLM inside of my code. The agents need to submit perfectly structured JSON, and then the code verifies it.
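A deterministic check like the one described could look roughly like this. The payload shape (a `picks` list of objects with `game` and `winner` fields) is a made-up schema for illustration, not the site's documented API; only the count of 63 games comes from the post.

```python
import json
from typing import List

EXPECTED_GAMES = 63


def validate_bracket(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the payload passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]
    picks = data.get("picks")
    if not isinstance(picks, list):
        return ["'picks' must be a list"]

    errors: List[str] = []
    if len(picks) != EXPECTED_GAMES:
        errors.append(f"expected {EXPECTED_GAMES} picks, got {len(picks)}")

    seen = set()
    for index, pick in enumerate(picks):
        if not isinstance(pick, dict):
            errors.append(f"pick {index} must be an object")
            continue
        game, winner = pick.get("game"), pick.get("winner")
        # bool is a subclass of int in Python, so reject it explicitly.
        valid_game = isinstance(game, int) and not isinstance(game, bool)
        if not valid_game or not 1 <= game <= EXPECTED_GAMES:
            errors.append(f"pick {index} has an invalid game number")
        elif game in seen:
            errors.append(f"game {game} is picked more than once")
        else:
            seen.add(game)
        if not isinstance(winner, str) or not winner.strip():
            errors.append(f"pick {index} is missing a winner")
    return errors


good = json.dumps({"picks": [{"game": g, "winner": f"Team {g}"} for g in range(1, 64)]})
print(validate_bracket(good))
print(validate_bracket('{"picks": []}'))
```

Returning every problem at once, rather than failing on the first, gives an agent enough feedback to fix its payload in a single retry.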
Yeah exactly — that “no LLM in the loop” constraint forces everything to be explicit and verifiable.
I put together a few experiments where the system rediscovers known laws directly from raw data (solar wind, exoplanets, etc).
Happy to share if you’re curious — still very early but interesting to see what emerges.
Interesting idea! Funny enough, I tasked Opus 4.6 this morning with doing online research and creating a bracket for me. I submitted it to my friends, telling them I had AI make it. It mostly picked top-seeded teams but chose 7 strategic upsets based on whatever research it did. I literally just kicked it off, continued with my real job, and punched in the results when it was done, so I'm not sure exactly what its stated methodology was.
I'm usually pretty opinionated on using AI for reasons I generally view as productive - for example, not moltbook - however this is actually really neat and doesn't require a ton of token usage assuming you don't instruct your agent to do multiple turns of analysis on the stats :)
It'll be interesting to see what strategies agents choose to implement & whether there are any meaningful trends.
Thanks - me too! We'll see what strategies rise to the top. But you can also do weird things like pick the team by tallest center, which your agent can figure out in a few minutes! Or alphabetical order in each match up.
Really cool idea. My son is using different LLMs to fill out brackets for his 4th grade science experiment, and then we are going to compare them to the experts. I like your idea of Strategy/Inspiration prompting, we had to tell them that "upsets happen" because all the favorites were picked on first pass.
Tangentially, I wonder if we are going to see AI predictions impact point spreads.
I know multiple people that are building arbitrage models with their agents. i bet it makes the markets pretty efficient
it's interesting because even expert analysts are consistently wrong, upsets happen all the time, there are 'playoff teams' that are trash in regular season but 'overperform' in playoffs. wonder how most people will approach this without the agents just guessing.
Tried it out, works flawlessly. The basic process cycle is clean and easy to follow. Kept it to CC Haiku with a bit of discussion on approach.
Only thing that wasn't 100% clear was the locking mode. Do I have to lock before games start or will it just auto-lock whatever I have? Claude assumed it would auto-lock.
it does auto lock - i dont know why i included that feature.
thanks for the feedback!
yeah - right now you need to build the scaffolding but i assume we are weeks/days away from the labs fixing this
my only stumble was there already being an api key in the skill repo… was that intentional?
It was supposed to be an example API key, but then i think it got reset in the backend when i updated the bracket for 2026 Sunday night. I'll take a look.
what did you think about the /skill feature? that was a UI side quest, but i want to explore this UX further
If you – like me – have no idea what this is about, it is about some basketball competition.
HA true
What makes agents different from the traditional stats-backed moneylines at sportsbooks?
very cool and well done.
i wonder if we will see a materially larger number of brackets filled this year than the recent trajectory would indicate (as a very coarse indicator of agent-filled brackets).
Not to be that guy, but your 'solution where Agents who hit the homepage receive plain-text API instructions and Humans get the normal visual site' is defeated by curl -L
curl bracketmadness.ai -L
# AI Agent Bracket Challenge
Welcome! You're an AI agent invited to compete in the March Madness Bracket Challenge.
## Fastest Way to Play (Claude Code & Codex)
If you're running in Claude Code or OpenAI Codex, clone our skills repo and open it as your working directory:
(cont) ...
I like the idea of presenting different information to agents vs humans. I just don't think this is bulletproof, which is fine for most applications. Keeping something 'agent-only' does not seem to be one of them.
Fair point!
I was trying to balance having UX for humans and having the data easily available for agents. But yes, you could technically navigate the API calls yourself.
such a fun idea! I like your solution for detecting when agents are reading your page vs. humans.
oh I love this, MoltFire about to wax that ass! What's first get? $100 in Claude tokens?
bragging rights!
Very cool. I was trying to do something similar (not for March Madness brackets), but ran into a problem with chatbots: they wouldn't follow URLs that weren't provided directly by the user (Claude would, but only for whitelisted sites), so I couldn't get them to do actual POSTs etc. for authentication. Claude.ai would instead create a React app (fragments). I eventually built a remote MCP for it, but a HATEOAS-styled REST API would be far preferable.
Any tips?
No - I think we are months (weeks/days?) away from chatbots being able to interact with APIs, so that's why I limited it to just agents that have the ability to call APIs.
I tried to get it so that people could paste chatbot written json into a submission form but that is less elegant. So now i have a zoom call set up with my dad so he can install CC lol
OK so the same issue I ran into. I ended up creating a remote MCP (basically like a REST API) that does oauth 2.1. If interested you can check it out here: https://github.com/pairshaped/gleam-mcp-todo
Thats cool!
somebody use mirofish and see if it really predicts everything