2 yrs running · 20.5K PHP LOC · 5,649 Python LOC · 5 automation modules · 0 editorial committee

A working Vikings beat that ships every day. The frontend is hand-written PHP with mobile-first CSS Grid; the backend is four cron-driven Python agents that scout RSS, draft articles through Claude with a pre-publish lint gate, refresh cap-state context, and ingest social embeds for review. The site is live, indexed in Google News, and accountable: drift in personnel facts is surfaced in an admin dashboard so unpublished articles can be requalified before they ever face a reader.

Tech scope

  • Hand-written PHP per page (index.php, article.php, roster.php, scores.php, draft.php, cap.php, schedule.php, ...) plus 19 admin views. No framework; templates are PHP includes.
  • SEO surfaces are first-class: JSON-LD per article, dynamic XML sitemap with Google News tags, RSS, an internal-link injector (inject_internal_links()), corrections page, author byline pages.
  • Custom search overlay bound to ⌘K / Ctrl+K / / with a debounced live-results modal.
  • Schema is the single source of truth: column-name gotchas (excerpt not summary, body_html not body) are documented in .claude/rules/schema.md and consulted before any SQL change.
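To illustrate the per-article JSON-LD surface, here is a minimal sketch in Python (the site's templates emit this from PHP; the field set and values are assumptions, not the site's actual markup):

```python
import json

def article_json_ld(headline, author, published_iso, url):
    """Build a schema.org NewsArticle JSON-LD payload (illustrative fields only)."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published_iso,
        "mainEntityOfPage": url,
    }, indent=2)

print(article_json_ld(
    "Vikings trim roster to 53",      # hypothetical headline
    "Staff",
    "2026-04-26T09:00:00-05:00",
    "https://example.com/article.php?id=1",
))
```

The payload would be wrapped in a `<script type="application/ld+json">` tag inside the article template.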

Automation

Four agents run on a 30-minute cron from automation/automation_engine.py:

  • Scout (Agent A) — 7 RSS feeds (4 Vikings-dedicated + 3 NFL keyword-filtered) write into a scraped_sources queue.
  • Writer (Agent B) — one unprocessed source per 48h gets drafted through the Anthropic Claude API into an article (is_published=0) and then must clear the pre-publish lint gate.
  • CapScraper (Agent C) — refreshes cap-state context into site_settings['cap_intel'].
  • SocialScout (Agent D) — ingests Reddit JSON, YouTube channel RSS, and RSS bridges into a social_embeds review queue.
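The four agents above can be sequenced by a single cron entry. A minimal sketch of what a dispatcher like automation_engine.py might look like (the class names and run() interface are assumptions; only one agent is stubbed here):

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

class Agent:
    name = "base"
    def run(self):
        raise NotImplementedError

class Scout(Agent):
    """Agent A: poll RSS feeds into the scraped_sources queue (stubbed)."""
    name = "scout"
    def run(self):
        logging.info("scout: polling RSS feeds into scraped_sources")

def main():
    # In the real engine this list would also hold Writer, CapScraper,
    # and SocialScout instances, run in order on every 30-minute tick.
    agents = [Scout()]
    failures = 0
    for agent in agents:
        try:
            agent.run()
        except Exception:
            # Fail loudly: log the full traceback, but let later agents run.
            logging.error("agent %s failed:\n%s", agent.name, traceback.format_exc())
            failures += 1
    return failures

if __name__ == "__main__":
    raise SystemExit(main())
```

Returning the failure count as the exit code lets cron's mail-on-nonzero behavior act as the alerting channel, which matches the "fail loudly" posture described under Numbers.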

An accuracy dashboard piggybacks on the same 30-minute tick. A self-hosted RSS hub on the MasterAgent box runs every 15 minutes, FTPs RSS XML and a manifest JSON to the site, and SocialScout reads from there.
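One way the manifest handshake between the hub and SocialScout could look (the file name, field names, and staleness window are assumptions, a sketch of the pattern rather than the actual format):

```python
import json
import pathlib
import time

def load_fresh_manifest(path, max_age_sec=30 * 60):
    """Read the hub's manifest JSON and reject stale drops, so the reader
    never ingests a feed snapshot older than one cron tick."""
    manifest = json.loads(pathlib.Path(path).read_text())
    age = time.time() - manifest["generated_at"]   # hypothetical epoch field
    if age > max_age_sec:
        raise RuntimeError(f"manifest is {age:.0f}s old; skipping ingest")
    return manifest["feeds"]  # e.g. [{"name": "...", "file": "..."}]
```

Gating on manifest age (rather than file mtime, which FTP transfers can reset) keeps a stalled hub from silently feeding old embeds into the review queue.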

Editorial gates

One lint function enforces canonical-fact validation across every surface that publishes — the publish gate, the drift dashboard, and the truth engine all call the same code path. When a player’s status changes (released, traded, retired), articles that reference the stale fact are unpublished automatically until an editor inserts the right qualifier ("former", "ex", "then-").
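The stale-fact gate can be sketched as a single predicate shared by the publish path and the drift dashboard, so both always agree (the qualifier list and window size are assumptions; the real lint function is richer):

```python
QUALIFIERS = ("former", "ex-", "then-")

def mentions_are_qualified(body_text, player_name, window_chars=20):
    """True only if every mention of a departed player is preceded by a
    qualifier within a short window. Called from both the publish gate
    and the drift dashboard so the two surfaces cannot disagree."""
    text = body_text.lower()
    name = player_name.lower()
    start = 0
    while (idx := text.find(name, start)) != -1:
        window = text[max(0, idx - window_chars):idx]
        if not any(q in window for q in QUALIFIERS):
            return False  # unqualified stale mention -> article stays unpublished
        start = idx + len(name)
    return True
```

When a roster change lands, the engine would re-run this check over every article that names the player and flip is_published to 0 on any failure, surfacing the hit in the admin dashboard for an editor to requalify.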

[Pipeline diagram: upstream sources → draft-scraper automation → rss-hub → PHP frontend → CDN]

Lines of code as of 2026-04-26: PHP 20,560 · HTML 9,014 · Python 5,649 · CSS 3,665
Top-level dirs: public (PHP+HTML) 28,178 · design_handoff 4,920 · automation draft-scraper 1,518 · rss-hub 1,168

Numbers

The numbers reflect a one-operator beat. Two languages (PHP for the frontend, Python for the automation), one cadence (the NFL calendar), and a publishing rhythm tied to game weeks rather than a uniform schedule. The automation modules are scheduled, idempotent, and fail loudly — missing a week is preferable to publishing a hallucinated stat. The PHP-heavy line count above is the public site, not application logic; most of the actual decisions live in the smaller Python automation tree.

Constraints

No third-party comment system, no analytics beyond server logs, no signed-in surface. Automation is allowed to summarize box scores and roster moves but never to invent quotes or interpret intent. When LLM rewrites are used (sparingly), they're constrained to copy-edit passes against text the operator has approved — not generation from prompts alone. The ‘0 editorial committee’ plate is structural: the entire publishing decision belongs to one person, which is what makes the cadence sustainable in the first place.
