How I built The Water Files — build notes

June 12, 2026

Technical notes on aliensgov.fyi: a hand-curated index of the government's own UAP records, plus a live dashboard of who is talking about them. One static page, three Python scripts, no framework, no database, no login.

The premise

The U.S. government is publishing its UAP archive — the PURSUE releases on war.gov. Everyone reports the releases. Nobody reads the metadata.

I read the metadata.

The site makes one narrow claim and proves it: of all the records the government has published, this many put the object at a body of water. Not "near the coast." Not "vibes." The government's own incident location or its own description has to say it: a sea, a gulf, a lake, "over water," "sea skim mode," "in and out of water." Their files, their servers, their words. My count.

Rule one: the count must survive being checked

Everything else follows from this. A claim like "49 of 294 records put the object at the water" is only worth something if a hostile reader can verify it and fail to break it. So:

Curation is by hand, in code, with receipts. The generator (`tools/build_water_page.py`) holds three explicit ID sets (`STRICT_IDS`, `SHORE_IDS`, `REJECTED_IDS`), and every ID carries a comment saying why it's there. A regex over the official descriptions only suggests candidates for review. It never auto-includes anything.
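A minimal sketch of that split between suggesting and deciding. Only the three set names come from the generator; the record IDs, the `location` and `description` field names, the pattern, and `suggest_candidates` are placeholders for illustration.

```python
import re

# Hypothetical entries. In the real generator every ID carries its reason.
STRICT_IDS = {
    "EX-001",  # incident location names a gulf
}
SHORE_IDS = {
    "EX-002",  # Coast Guard platform, object itself not placed at water
}
REJECTED_IDS = {
    "EX-003",  # "lake" appears only inside a facility name
}

# Illustrative pattern only; the real suggestion regex is not shown in these notes.
WATER_HINT = re.compile(
    r"\b(sea|gulf|lake|strait|over water|in and out of water)\b",
    re.IGNORECASE,
)


def suggest_candidates(records):
    """Return IDs that match the water pattern and have not been reviewed yet.

    This only produces a to-do list for a human. It never touches the sets.
    """
    reviewed = STRICT_IDS | SHORE_IDS | REJECTED_IDS
    suggestions = []
    for record in records:
        text = f'{record["location"]} {record["description"]}'
        if record["id"] not in reviewed and WATER_HINT.search(text):
            suggestions.append(record["id"])
    return suggestions
```

The function returns a list and nothing else. Moving an ID into a tier is still a manual edit with a comment attached.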

A Shoreline tier holds the temptations. Coast Guard platforms, "East Coast" incidents, a debriefing that happened aboard a recovery ship in the North Atlantic while the actual sightings happened in orbit. All of it is close. None of it is counted. The page says so out loud.

The build fails loudly rather than publish quietly wrong. Integrity guards abort the build on: a curated ID that no longer exists in the data (typo, or the government renamed a record), an ID sitting in two tiers at once, a video record with no embed ID, a release date the code doesn't know yet. A wrong count never ships because the script never finishes.
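The four guards can be sketched roughly as below. The record fields (`id`, `type`, `embed_id`, `release_date`), the `BuildError` class, and the function name are assumptions for illustration, not the generator's actual code.

```python
class BuildError(Exception):
    """Raised to abort the build before any page is written."""


def check_integrity(records, strict_ids, shore_ids, rejected_ids, known_release_dates):
    """Raise BuildError on the first broken invariant; return None if all hold."""
    data_ids = {r["id"] for r in records}
    tiers = {"strict": strict_ids, "shore": shore_ids, "rejected": rejected_ids}

    # Guard 1: every curated ID must still exist in the source data.
    for name, ids in tiers.items():
        missing = sorted(ids - data_ids)
        if missing:
            raise BuildError(f"{name} tier has IDs missing from the data: {missing}")

    # Guard 2: no ID may sit in two tiers at once.
    names = list(tiers)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            overlap = sorted(tiers[first] & tiers[second])
            if overlap:
                raise BuildError(f"IDs in both {first} and {second}: {overlap}")

    for r in records:
        # Guard 3: a video record needs an embed ID to render.
        if r.get("type") == "video" and not r.get("embed_id"):
            raise BuildError(f"video record {r['id']} has no embed ID")
        # Guard 4: an unknown release date means the code is behind the data.
        if r["release_date"] not in known_release_dates:
            raise BuildError(f"record {r['id']} has unknown release date {r['release_date']}")
```

Calling this before any output is written is what makes the guarantee hold: an exception here means no page, no card, no deploy.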

Numbers appear in exactly one place. The social-card generator imports its counts from the page generator. Early on I had "47/294" hardcoded in two files; they drifted within a day. Never again.

The recount story is the best advertisement for the rule. When I widened the suggestion regex (it didn't know the word "strait"), it instantly surfaced two Mission Reports with incident location Strait of Hormuz that had been sitting in Release 1 the whole time. The count went from 47 to 49 — up, honestly, with the reasoning in a code comment. An inflated count can only ever be corrected downward, in shame. An honest one gets to go up.

Getting the data when the source blocks you

war.gov returns 403 to anything that smells like a script. The index behind the records page is a CSV (`uap-data.csv`), refreshed in place with each release, and on release day no archive has it yet.

What worked: drive a real browser. Playwright launches an actual Chrome, the page loads like a normal visit, and then a `fetch()` from inside the page pulls the CSV with the site's own cookies and fingerprint. Twenty seconds, 364 KB, done. The Wayback Machine's raw captures (`id_` appended to the timestamp in the snapshot URL) are the fallback for between-release refreshes.
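A hedged sketch of both routes. The function names and the choice of `headless=False` are assumptions, and the page and CSV URLs are left as parameters because the exact paths are not given here. The Playwright calls are from its sync Python API.

```python
def wayback_raw_url(timestamp: str, original_url: str) -> str:
    """Build a Wayback Machine URL that returns a capture's raw bytes.

    The `id_` modifier after the timestamp (e.g. YYYYMMDDhhmmss) asks for
    the original file without the archive's rewritten links or banner.
    """
    return f"https://web.archive.org/web/{timestamp}id_/{original_url}"


def fetch_csv_via_browser(records_page_url: str, csv_url: str) -> str:
    """Load the records page in real Chrome, then fetch the CSV from inside it."""
    # Imported here so the URL helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # channel="chrome" uses the installed Chrome instead of bundled Chromium.
        browser = p.chromium.launch(channel="chrome", headless=False)
        try:
            page = browser.new_page()
            page.goto(records_page_url, wait_until="domcontentloaded")
            # The fetch runs in the page's own context, so it carries the
            # site's cookies and the browser's normal request headers.
            return page.evaluate(
                """async (url) => {
                    const resp = await fetch(url);
                    if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
                    return await resp.text();
                }""",
                csv_url,
            )
        finally:
            browser.close()
```

The returned string is the CSV body, ready to write to disk. If the in-page fetch throws, `page.evaluate` raises in Python, so a 403 surfaces as an exception rather than as an empty file.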

Release 3 dropped on June 12 — 72 new records. The press preview said the tranche would be USO-focused. The actual metadata said: mostly historical CIA and FBI paper, and exactly one new water record — FBI-UAP-PR003, "Orbs Over the Pond," where the FBI's own description reads "hovered just above the water and did not appear consistent with a surface reflection." The site carried the corrected story the same afternoon. Read the data, not the coverage.

The dashboard problem: every public timeline is a fossil

The second half of the site is THE SIGNAL — the top recent X posts on the subject, ranked by live engagement, with follower counts. Building that without an API key turned into an archaeology dig. For the record, as of June 2026:

| Route | Status |
|---|---|
| X syndication timelines (`syndication.twitter.com/srv/timeline-profile/…`) | Frozen cache. Serves a snapshot that stops in Aug 2025. Looks alive, isn't. |
| Guest-token GraphQL `UserTweets` | Frozen cache. Activates fine, returns 100 tweets… newest from Jan 2026. |
| Nitter instances | Dead, 503, or empty shells. |
| Reddit JSON API | 403 for non-browser clients. |
| X per-tweet endpoint (`cdn.syndication.twimg.com/tweet-result?id=…`) | LIVE. Text, like count, reply count, timestamp, author, all current to the minute. |
| Guest-token GraphQL `UserByScreenName` | LIVE. Follower counts, current. |
| Reddit RSS (`/r/<sub>/hot.rss`) | LIVE. Patient clients only: it rate-limits fast. |

So: every way to list tweets is dead, but every way to look up one tweet is alive. That inverts the problem. I don't need a search engine — I need a source of tweet IDs worth looking up.

The crowd is the search engine. The UFO communities on Reddit — r/UFOs, r/aliens, r/UAP, r/HighStrangeness — spend all day finding the viral posts and linking them. Their RSS feeds are public. Extract every `x.com/…/status/…` link from the hot pages, hydrate each ID against the live per-tweet endpoint, attach live follower counts (cached seven days — followers don't move fast), filter to the subject and the last ten days, rank by likes + replies, keep the top twelve. Three stages, three scripts' worth of patience, zero credentials.

The ranking rule is printed on the page itself, same as the water count's methodology: likes + replies. The numbers are theirs, not mine; I only decide what counts as the subject. A dashboard that hides its ranking rule is an opinion wearing a lab coat.

The pipeline is fail-closed at every stage: feeds unreachable → skip them; zero qualifying tweets → keep yesterday's data; integrity guard trips → no deploy. The failure mode is "slightly stale," never "wrong" and never "blank."
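The "keep yesterday's data" branch is the easiest one to get wrong, so here is one way it could look. `write_signal` is a hypothetical name; only `signal.json` as the output file comes from the stack described in these notes.

```python
import json
from pathlib import Path


def write_signal(tweets, path) -> bool:
    """Write the ranked tweets, or leave the existing file alone if there are none.

    Returns True when the file was replaced, False when the old data was kept.
    """
    path = Path(path)
    if not tweets:
        # Zero qualifying tweets: stale beats blank, so do not touch the file.
        return False
    # Write to a sibling temp file first so a crash cannot leave half a file.
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(tweets, indent=2), encoding="utf-8")
    tmp.replace(path)
    return True
```

The boolean lets the caller log whether today's run refreshed the data or fell back to the previous file.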

And yes — the dashboard always carries my own latest posts on the subject, green-bordered, labeled RECOGNITION LAYER. They are selected by recency, not engagement, because the dashboard's job is to carry the current beat. The curator does not pretend to be neutral about the curator. Everyone else gets ranked by the numbers; I get ranked by showing up today.

Lessons that generalize

1. Verify the freshness of any "working" endpoint. Three routes returned HTTP 200 with plausible, well-formed, months-old data. A status code is not a pulse. Check the dates before you build on it.
2. When lists are blocked, find IDs elsewhere and hydrate. Discovery and lookup fail independently. Communities, feeds, news coverage: anything that emits IDs turns a dead search API into a live dataset.
3. Hand-curate the load-bearing claim; automate everything around it. The regex suggests, the human-readable ID lists decide, the guards enforce. Automation that adds to a count without review is how counts die.
4. Fail loudly or stay stale, but never publish quietly wrong. Cheap to build, priceless the first time a guard trips.
5. A real browser is the universal key. If a server wants a human, send it something indistinguishable from one and ask politely from inside the page.
6. Put your methodology where your numbers are. The invitation to check the count is the brand. It is the difference between me and the people who made you look up.
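Lesson 1 reduces to a few lines. This is a sketch, assuming the timestamps have already been parsed out of a response; the function name and the two-day default are placeholders, not values the site uses.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(timestamps, now: datetime, max_age: timedelta = timedelta(days=2)) -> bool:
    """True only if the newest item in a response is recent enough to trust.

    An empty response counts as stale: a 200 with no items proves nothing.
    """
    if not timestamps:
        return False
    return now - max(timestamps) <= max_age
```

Run against a route whose newest item dates from Aug 2025, this returns False in June 2026 no matter how healthy the status code looks.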

Stack

Three Python scripts and a static deploy: `build_water_page.py` (curation + page), `build_og.py` (social cards, Pillow, counts imported), `fetch_signal.py` (discovery → hydration → ranking → `signal.json`). One HTML file out, inline CSS, no JS framework, CRT-phosphor aesthetic. Vercel for hosting; a deploy is six seconds. The whole site rebuilds from two data files and three ID sets, and every number on it can be recomputed by anyone with the patience to read a CSV.

The ocean kept the spacecraft. The metadata kept the score.