01 The scorecard
Measured over HTTP, June 2026, against each project's documentation page
(e.g. sorbet.org/docs, docs.avohq.io). Each column is a checkable signal of LLM
discoverability: ✓ good, ✗ missing.
“Crawlable” fetches each page as the Common Crawl bot (CCBot) to catch Cloudflare/WAF blocks. The last column shows
Common Crawl coverage as pages found / sitemap total (— = not
sampled).
| Resource (docs) | robots allows AI | crawlable (no WAF) | sitemap | llms.txt | content negotiation | .md routes | Common Crawl |
|---|---|---|---|---|---|---|---|
| Core (Ruby Central / Rails Foundation / community-run) | |||||||
| Ruby (language) | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 4,004/— |
| Rails Guides | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 165/— |
| Rails API | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 4,968/— |
| RubyGems Guides | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 23/— |
| Bundler | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 160/1,440 |
| RubyDoc.info | ✗ blocks ccbot, gptbot, claudebot, google-extended, applebot-extended | ✓ | ✗ | ✗ | ✗ | ✗ | 2/— |
| Frontend & View | |||||||
| Hotwire | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 6/— |
| Turbo | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 6/— |
| Stimulus | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 6/— |
| ViewComponent | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 9/— |
| Phlex | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 9/40 |
| Ruby UI | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 62/63 |
| Inertia Rails | ✓ | ✓ | ✗ | ✓ | ✗ | ✓ | 19/— |
| Lookbook | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 17/— |
| Vite Ruby | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 6/— |
| Web Frameworks | |||||||
| Roda | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 31/— |
| Sinatra | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 22/— |
| Hanami | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 28/— |
| Bridgetown | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 34/— |
| Jekyll | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 30/210 |
| Rage | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 0/33 |
| Data & ORM | |||||||
| Sequel | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 181/— |
| ROM | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 16/— |
| dry-rb | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 46/— |
| AI | |||||||
| RubyLLM | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 11/23 |
| Background, Realtime & Deploy | |||||||
| Sidekiq | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 2/— |
| AnyCable | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 42/— |
| Kamal | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 81/— |
| Falcon | ✓ | 404 | ✗ | ✓ | ✗ | ✗ | 2/— |
| Karafka | ✗ blocks ccbot, gptbot, google-extended | ✓ | ✓ | ✓ | ✗ | ✓ | 1/— |
| Heroku | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Fly.io | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 30/— |
| Render | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Railway | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| DigitalOcean | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | 1/— |
| AWS Elastic Beanstalk | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 66/— |
| PlanetScale | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Supabase | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Neon | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Tooling & Types | |||||||
| Sorbet | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 15/86 |
| RuboCop | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 360/1,583 |
| RSpec | ✗ blocks gptbot | ✓ | ✗ | ✗ | ✗ | ✗ | 223/— |
| TestProf | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 35/— |
| Pry | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 1/— |
| Brakeman | ✓ | 000 | ✗ | ✗ | ✗ | ✗ | 1/— |
| Standard | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 1/— |
| Sentry | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 142/— |
| AppSignal | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 15/— |
| New Relic | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 19/— |
| Datadog | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 1/— |
| Honeybadger | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 16/— |
| Rollbar | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 1/— |
| Bugsnag | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 8/— |
| Scout APM | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 1/— |
| Skylight | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | 69/— |
| Better Stack | ✓ | ✓ | ✓ | ✗ | ✗ | ✓ | 1/— |
| Papertrail | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 1/— |
| GitHub Actions | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ | 1/— |
| Libraries | |||||||
| GraphQL Ruby | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 1,595/— |
| Rodauth | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 54/— |
| Action Policy | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 4/— |
| Shrine | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 4/124 |
| Avo | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 71/— |
| ActiveAdmin | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 10/— |
| Ransack | ✓ | 404 | ✗ | ✓ | ✗ | ✗ | 2/— |
| Pagy | ✓ | 404 | ✗ | ✓ | ✗ | ✗ | 2/— |
| Nokogiri | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 14/— |
| Faraday | ✓ | 404 | ✗ | ✓ | ✗ | ✗ | 1/— |
| Capistrano | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 16/— |
| Trailblazer | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 83/— |
| imgproxy | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | 168/529 |
| Flipper | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 35/70 |
| Devise | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 2/— |
| Pundit | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 1/— |
| CanCanCan | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | 1/— |
| Community & Resources | |||||||
| GoRails | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 1,516/— |
| Drifting Ruby | ✗ blocks gptbot | ✗ WAF block | ✓ | ✗ | ✗ | ✗ | 2/— |
| RubyEvents | ✗ blocks ccbot, gptbot, claudebot, google-extended, applebot-extended | ✗ WAF block | ✓ | ✗ | ✗ | ✗ | 1/15,775 |
| Rails at Scale (Shopify) | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 28/114 |
| Ruby Weekly | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 281/— |
| Short Ruby | ✓ | ✓ | ✓ | ✓ | ✗ | ✗ | 211/293 |
| This Week in Rails | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | 26/— |
| Evil Martians | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 161/— |
| Hotwire Weekly | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 1/— |
| Write Software Well | ✗ blocks ccbot, gptbot, claudebot, google-extended, applebot-extended | ✓ | ✗ | ✗ | ✗ | ✗ | 2/— |
| Thoughtbot | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 1,050/— |
| AppSignal Blog | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 333/1,063 |
| Riding Rails | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 58/— |
| Joe Masilotti | ✗ blocks ccbot, gptbot, claudebot, google-extended, applebot-extended | ✓ | ✓ | ✗ | ✗ | ✗ | 2/134 |
| Code with Jason | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 138/532 |
| Maintainable | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | 74/241 |
| Ruby News | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | 153/— |
02 What will move the needle
Four levers, ordered by depth, each acting on the same number. Rails is plural by design (omakase defaults and swappable adapters), so the job is to strengthen the default and agree on shared conventions. Each layer shows its goal as a live gauge; together they feed the final boss below.
Layer 0: get into the corpus at all (ship now)
- Unblock AI crawlers (CCBot, GPTBot, ClaudeBot, Google-Extended) in robots.txt and at the WAF. The RubyEvents one-line fix alone unlocks ~15,775 pages of talks.
- Add sitemaps, server-render, link internally, earn high-authority backlinks.
- CC-license and transcribe conference video; Google trains Gemini on YouTube transcripts and CC-licensed talks flow into open corpora.
Layer 1: win retrieval and publish comparisons (content; ship now)
Win retrieval
Why: an agent fetching a Rails or gem doc at request time should get current Markdown it can read, instead of HTML it has to scrape.
- Serve Markdown via content negotiation and .md routes (Mime::Type.register "text/markdown", :md). Content negotiation is a real HTTP standard that agents already use, which makes it the durable bet. Ship llms.txt too, cheaply.
- Make rdoc emit Markdown + content negotiation by default; this is the keystone that lifts every gem at once.
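As a rough illustration of the two retrieval paths named above (an explicit .md route and an Accept: text/markdown request), here is a minimal Rack-style sketch in plain Ruby. It is not how Rails or rdoc implement this; in a Rails app the registered MIME type and respond_to would normally do the branching. The DOCS store, the /docs/install route, wants_markdown?, render_html, and DocsApp are all hypothetical names chosen for the example.

```ruby
# Hypothetical in-memory docs store: route (without extension) => Markdown source.
DOCS = {
  "/docs/install" => "# Install\n\nRun `bundle add example`.\n"
}.freeze

# True when the Accept header lists text/markdown with a non-zero q-value.
def wants_markdown?(accept_header)
  accept_header.to_s.split(",").any? do |entry|
    media_type, *params = entry.split(";").map(&:strip)
    next false if media_type.nil?

    q = params.find { |param| param.start_with?("q=") }
    media_type.casecmp?("text/markdown") && (q.nil? || q.delete_prefix("q=").to_f > 0)
  end
end

# Placeholder HTML rendering (assumption): escape the Markdown and wrap it in <pre>.
def render_html(markdown)
  escaped = markdown.gsub(/[&<>]/, { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" })
  "<pre>#{escaped}</pre>\n"
end

# Rack-compatible app: any object that responds to #call(env).
DocsApp = lambda do |env|
  path = env["PATH_INFO"].to_s
  explicit_md = path.end_with?(".md")
  source = DOCS[explicit_md ? path.delete_suffix(".md") : path]
  return [404, { "content-type" => "text/plain" }, ["Not found\n"]] if source.nil?

  if explicit_md || wants_markdown?(env["HTTP_ACCEPT"])
    # Same resource, Markdown representation.
    [200, { "content-type" => "text/markdown; charset=utf-8", "vary" => "Accept" }, [source]]
  else
    # Default representation for browsers.
    [200, { "content-type" => "text/html; charset=utf-8", "vary" => "Accept" }, [render_html(source)]]
  end
end
```

Sending Vary: Accept on both branches tells caches that the two representations share a URL, which matters once a CDN sits in front of the docs.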
Publish the missing comparisons
Why: a model reaches for what the corpus argues for, and today almost nothing argues, with numbers, that Rails is the better build for these product shapes. 0 of 12 comparisons are solid (current, task-specific, with real numbers); the rest are generic framework takes or absent.
| Build … in Rails | vs JS fullstack (Next.js / Node) | vs Python (FastAPI / Django) |
|---|---|---|
| B2B SaaS: multi-tenant, team invites, billing | missing | missing |
| Online store: catalog, cart, checkout, admin | generic [1] | generic [1] |
| Team inbox: shared inbox, collaborative replies | missing | missing |
| Issue tracker: projects, issues, statuses, comments | generic [2] | missing |
| Approval workflow: upload, route for sign-off, track | missing | missing |
| Internal admin: staff-login CRUD dashboard | generic [2] | generic [3] |
Sources: [1] monterail.com · [2] kunalganglani.com · [3] goudeketting.nl
Snapshot 2026-06, by web search per task × stack, then judged. Legend: solid = current + task-specific + numbers · generic = framework pros/cons or boilerplates · missing = nothing credible. It's a web search, refreshed each pass.
Layer 2: make agents fluent in the gems (tools; ship now)
- Agree a convention so any gem maintainer ships agent-discoverable tooling, an MCP endpoint or a skill, the way they already ship a README.
- Converge a Rails MCP server: let agents introspect the app (gems, versions, schema, routes) and pull current per-gem docs on demand.
- Agree a shared Agent Skills convention so skill packs interoperate.
- Copy Laravel Boost (official MCP, version-pinned guidelines, on-demand skills, tools). Rails has the parts (fast-mcp, Tidewave, rails-mcp-server) on the official Ruby MCP SDK.
Why: standard Rails is in the training set; the gems, and anything past the cutoff, are where agents guess. A maintainer convention is what scales the fix across that long tail.
Layer 3: change the training default (long game)
- Contribute real Rails repos to Multi-SWE-bench (the repo-level agentic benchmark, which takes open contributions) and publish an open idiomatic-Rails eval. Ruby is in MultiPL-E's HumanEval/MBPP puzzles but absent from the agentic benchmarks, where modern coding ability is measured; adding a language to an eval measurably improves models on it (MultiPL-T, Bridge-Coder).
- Publish an open, idiomatic-Rails instruction dataset; contribute permissively-licensed Ruby content to open corpora like Common Corpus.
- Keep the public whichlang benchmark as the scoreboard for the final boss below, and re-run it on each new model.
★ The final boss
Frontier models reach for Ruby on their own. The single metric every layer above serves, measured by the public whichlang benchmark: given a free choice of language across 13 models, Ruby was picked 0 times in 1,267 generated solutions (the defaults are Python, JavaScript, and Go). Win condition: that zero starts climbing, model after model.
03 Methodology
All indicators probed over HTTP, June 2026, against each project's documentation URL:
robots.txt parsed for AI user-agents (CCBot, GPTBot, ClaudeBot, Google-Extended) with
Disallow: /; crawlability tested by fetching as CCBot (to catch Cloudflare/WAF blocks); content
negotiation via Accept: text/markdown; .md routes and llms.txt checked for a 200.
The language-choice figure is from the open whichlang benchmark (13 models, 1,267 classified solutions, 0
Ruby; github.com/chad/whichlang). That the same models write
Rails competently when instructed is our own informal observation.
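The robots.txt check described above can be approximated with a few lines of Ruby. This is a simplified sketch, not the script used for the scorecard: it groups consecutive User-agent lines, falls back to the * group when an agent has no group of its own, and flags an agent only on an exact Disallow: / rule (Allow overrides and path wildcards are ignored). The names AI_AGENTS, parse_robots, and blocked_ai_agents are hypothetical.

```ruby
# User-agents named in the methodology.
AI_AGENTS = %w[CCBot GPTBot ClaudeBot Google-Extended].freeze

# Parse robots.txt into groups: consecutive User-agent lines share the rules that follow.
def parse_robots(text)
  groups = []
  current = nil
  last_was_agent = false

  text.each_line do |raw|
    line = raw.sub(/#.*/, "").strip # drop comments and surrounding whitespace
    next if line.empty?

    field, value = line.split(":", 2).map(&:strip)
    next if value.nil?

    case field.downcase
    when "user-agent"
      unless last_was_agent
        current = { agents: [], disallow: [] }
        groups << current
      end
      current[:agents] << value.downcase
      last_was_agent = true
    when "disallow"
      current[:disallow] << value if current
      last_was_agent = false
    else
      last_was_agent = false
    end
  end

  groups
end

# Return the agents whose applicable group(s) contain "Disallow: /".
def blocked_ai_agents(text, agents = AI_AGENTS)
  groups = parse_robots(text)
  agents.select do |agent|
    specific = groups.select { |group| group[:agents].include?(agent.downcase) }
    applicable = specific.empty? ? groups.select { |group| group[:agents].include?("*") } : specific
    applicable.any? { |group| group[:disallow].include?("/") }
  end
end

sample = <<~ROBOTS
  User-agent: GPTBot
  User-agent: CCBot
  Disallow: /

  User-agent: *
  Disallow: /admin
ROBOTS

p blocked_ai_agents(sample) # => ["CCBot", "GPTBot"]
```

A clean robots.txt does not prove a site is reachable, which is why the methodology also fetches each page as CCBot: a WAF can still refuse the request.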
Why Common Crawl? It's the open web crawl that seeds most LLM pretraining corpora (C4, RefinedWeb, FineWeb, behind GPT, Llama, and others) and feeds many retrieval indexes. A project's CC coverage is a proxy for whether a model has seen its docs at all, a separate question from whether the live site is crawlable today. It's the one column you cannot fix this quarter, since it reflects crawls already taken, which is why getting in (sitemaps, unblocking bots, backlinks) is Layer 0. The usual reason a page is absent is a missing sitemap: with no manifest to discover from, the crawler never reaches it.
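For readers who want to reproduce a "pages found / sitemap total" cell, one plausible calculation is sketched below. How the scorecard itself counts pages is not documented here, so the details are assumptions: found pages are deduplicated by host and path, the sitemap total counts distinct loc entries in a single sitemap file (sitemap indexes are not followed), and a missing sitemap yields "—". The function names are hypothetical, and the list of URLs found in Common Crawl is taken as input rather than fetched.

```ruby
require "uri"

# Extract every <loc> entry from a sitemap XML string (regex scan, no XML parser).
def sitemap_urls(xml)
  xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten
end

# Reduce a URL to host + path so scheme, host case, and a trailing slash do not
# create duplicates. This normalization rule is an assumption of the sketch.
def normalize_url(url)
  uri = URI.parse(url.strip)
  "#{uri.host.to_s.downcase}#{uri.path.to_s.chomp('/')}"
end

# Format a non-negative integer with thousands separators, as the scorecard does.
def with_commas(number)
  number.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
end

# Build a "found/total" cell. Without a sitemap the total is shown as "—".
def coverage_cell(found_urls, sitemap_xml = nil)
  found = found_urls.map { |url| normalize_url(url) }.uniq.size
  return "#{with_commas(found)}/—" if sitemap_xml.nil?

  total = sitemap_urls(sitemap_xml).map { |url| normalize_url(url) }.uniq.size
  "#{with_commas(found)}/#{with_commas(total)}"
end
```

A low ratio against a large sitemap points to a discovery or blocking problem, while a count with no total says only that the site publishes no sitemap to compare against.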