RankedAGI
AI models ranked by latest benchmarks
- 1 commit
Refreshed benchmark data.
- 14 commits
Shipped the /engine page explaining the simulated-data scoring methodology and wired the hybrid v1+v2 estimator into production, backed by a cross-validation harness and a v2 factor-model estimator; refreshed model data with Terminal-Bench 2.1.
- 5 commits
Added Opus 4.8 and Qwen 3.7 Max benchmark data and switched dev to portless run --name for worktree-friendly URLs.
- 2 commits
Added branding assets and updated the DeepSeek price.
- 1 commit
Added Gemini 3.5 Flash.
- 1 commit
Added Composer 1 and optimized the icon.
- 18 commits
Built out AI discoverability: build-time Markdown companions served via content negotiation, an llms.txt overview, site structured data and JSON-LD, and a permissive crawler signal. Expanded the sources methodology page with the composite score formula, a data-access FAQ, and simulated-evidence controls.
- 14 commits
Added global-rank reveal: rank on hover plus tap-to-reveal on mobile, with a faster forward transition. Surfaced model size next to name and org across the homepage and admin lists, linked admin model names to their edit pages, and refreshed the model dataset.
- 13 commits
Reworked the simulated-data UI: per-profile scoring with simulated data, soft-orange accent, simplified controls, a hover rank style with tooltip, and an icon-only columns button. Added an admin sort dropdown and collapsible sidebar, plus a benchmark dataset refresh.
- 34 commits
Major scoring work: added v4 and v6 pairwise-Elo backends, wired v4 through admin previews and the leaderboard, and refined the v5 frontier simulation with controlled extrapolation and sparse-data handling. Redesigned the lazy-fetched simulated-data toggle with smooth transitions and a sparkle-marked cell, and added a Visualizations lab of score-decomposition variants.
- 2 commits
Reorganized the admin Models routes and refreshed the sidebar, dashboard thresholds, and model data.
- 2 commits
Refreshed model data and added Grok 4.3.
- 2 commits
Added Mistral Medium 3.5 and minor UI tweaks.
- 1 commit
Added more GPT 5.5 benchmark data.
- 1 commit
Added Mimo open-source model data and updated the dev setup.
- 4 commits
Refreshed benchmark data with a DeepSeek price update and asset cleanup.
- 15 commits
Drove composite-column styling from a shared category config, routing the matrix palette and admin list through it and tinting headers by category. Added per-profile display mode (percentage or rank) with a smooth hover swap, refreshed the logo and OG image, and skipped search autofocus on touch devices.
- 35 commits
Built the admin benchmark matrix: moved it to a top-level route, added an open-source license filter, and redesigned the dashboard around weakest-coverage models with a live contribution panel. Solved a thorny SPA-nav picker focus bug, and hardened the data layer with rolling backups and a per-file write mutex.
- 15 commits
Refined the benchmark sheet's composite chips: a Lucide star icon for proper sizing, letter-only collapse when squeezed, and tint that follows category. Synced the dot palette to the sheet, drove profile visibility from data, unhid the Overall column publicly, and added GPT 5.5 plus Toolathlon and DesignArena Code.
- 1 commit
A small data refresh.
- 1 commit
Tuned the inset shadow on table surfaces.
- 6 commits
Pinned the site footer to the viewport bottom, extracted it to its own component, and made it sticky only on the home page while tuning the table height. Fixed the model detail page width, and added Qwen 3.6 Max and Kimi K2.6 plus a data refresh.
- 21 commits
Rebuilt the benchmark sheet as a spreadsheet-style viewport that saves without reloading, with sticky thead/tfoot separators via inset shadows, sort that respects saved patches, click-sort, and editable model-meta columns. Stopped the dev server reloading when the lab writes data files, and added a portless setup.
- 21 commits
Built the model-form lab through many modes (Command Center, Benchmark Grid, Leaderboard Editor, Inspector, Compare, Dial, Command) with a shared live-scoring helper, then scaffolded the fill-one-benchmark-across-models sheet with sort, fill, and paste-match. Fixed the Save-all 404 by repointing forms, refined benchmark inputs, and added Muse Spark with new benchmarks.
- 12 commits
Restructured the admin lab into tabbed sub-routes with a polished, grouped, collapsible sidebar. Added the RAGI Agentic composite and removed the legacy computedFrom system, applied the squircle style across admin buttons and the search bar, fixed a11y warnings, renamed assets to rankedagi, and added GPT 5.4 Pro.
- 6 commits
Promoted the v3 scoring to production with its gamma curve, coverage knobs, and lab page, then sorted the public site by the configured RAGI profile. Fixed a silent delete button in the model edit dialog, rearranged benchmarks, and added Opus 4.7.
- 3 commits
Added windowed rendering to the admin models table for better performance, and sped up admin navigation by caching JSON reads keyed on file mtime and size and by preloading on hover.
- 21 commits
Made the v2 split-pane editor the canonical model editor, retiring the old edit route, with snapshot reset and forced remount on model switch. Converted the admin top nav to a collapsible left sidebar, added a read-only aesthetic preview page, and fixed RAGI benchmark visibility and a model-edit form leak.
- 16 commits
Rebuilt RAGI Studio with live preview, a Lean scoring mode, and unified controls, redesigning the source table with ChipSelect controls and aligning its design language with the rankings preview. Standardized benchmark keys to lowercase-with-dashes, unified admin model form styling, and handled currency negatives.
- 23 commits
Added a dedicated RAGI Studio with scoring controls, added the Vending Bench 2 benchmark with currency-format support and negative values, and polished the model edit form. Refined the mobile layout: a brain icon for thinking, compact dates, tighter padding, and a hidden version badge.
- 33 commits
Reworked the admin: redesigned the benchmarks page as a table, added delete-model and unsaved-changes warnings, persisted table sort, and replaced license/type/version dropdowns with segmented toggles plus a ThinkingEffort field. Settled column drag-and-drop after several reverts, and refined mobile table layout.
- 23 commits
Migrated the rankings grid from an HTML table to CSS Grid with subgrid, with conditional frozen-pane and header shadows on scroll. Added drag-and-drop column reordering with persistent visibility, fixed lint and svelte-check errors across files, and improved the model edit form.
- 46 commits
Fixed an O(N^2) DataTable bottleneck and replaced $effect anti-patterns with idiomatic Svelte 5, removed 8 unused shadcn families, and dropped date-fns for a native relative-time helper. Added benchmark archiving, model filters, and a multi-link array, unified benchmark category sort order, and centralized admin styling.
- 6 commits
Refactored JSON data access for SvelteKit, fixed benchmark form save behavior, sorted admin models by release date while keeping edits in place, and added the GLM 5.1 model.
- 1 commit
Made admin updates.
- 10 commits
Added model table header sorting and a delete dialog, refactored the admin layout and styling for responsiveness and dark mode, set a 404 fallback for unknown URLs on static hosts, and sourced benchmark columns from benchmarks.json.
- 13 commits
Dropped Supabase for local data functions, adding model helpers and a pre-sorted models.json. Adopted a new font and color stack with dark mode, moved filters to bindable state with $derived.by, and migrated to $app/state and resolve()-based navigation.
- 1 commit
Updated the app shell.
- 1 commit
Updated npm dependencies.
- 1 commit
Updated the leaderboard page.
- 1 commit
Formatting pass.
- 1 commit
Focused the search bar on load.
- 6 commits
Added sticky columns to the leaderboard table with a capped height and refined header styling.
- 1 commit
Refactored table state to Svelte 5 runes.
- 2 commits
Added the SvelteBench benchmark with a column label and visibility control.
- 2 commits
Added an About link to the footer and fixed the heading color in dark mode.
- 1 commit
Improved DataTable spacing for layout consistency.
- 1 commit
Refined FilterDropdown and TableHeader styling.
- 5 commits
Refined the table header and filter component styling and clarified the LiveBench column.
- 1 commit
Changed the Preview badge color from yellow to fuchsia.
- 10 commits
Reworked benchmark columns: added Humanity, swapped LiveCodeBench versions, and tuned column visibility and labels.
- 1 commit
Removed the deprecated Aider column.
- 1 commit
Switched the table header to a columnVisibility store for visibility control.
- 10 commits
Reworked model filters into grouped, checkbox-based accordions with responsive mobile rendering, grouped the column selector into toggleable categories, and added a shared ChevronIcon.
- 1 commit
Clarified the LiveBench code-column version subtitle.
- 1 commit
Clarified the LiveBench code-column version subtitle.
- 2 commits
Reworked the LiveBench column keys and tooltips for 2024 and 2025.
- 4 commits
Standardized benchmark column keys to lowercase-with-underscores and added the ragi_code column.
- 2 commits
Updated the Open Graph and Twitter images to the new OG image with better shadows.
- 1 commit
Added MMMU and tooltip descriptions for the MMLU columns.
- 3 commits
Moved the logo config to its own file and conditionally showed organization logos on the model page.
- 1 commit
Clarified the About page's ownership and mission and improved the contact info.
- 1 commit
Set an explicit icon height for display.
- 5 commits
Rebranded from RankedAI to RankedAGI across the project and README and added the NYT Connections Extended column.
- 3 commits
Refined model link styling for visibility in light and dark mode and added svelte-render-scan.
- 3 commits
Updated the navigation icons.
- 12 commits
Improved the leaderboard: better model contrast and underlines, search-bar refinements, the leaderboard in the nav, and new vendor logos.
- 5 commits
Added categories and subcategories, logos in the Model column, and header polish.
- 7 commits
Added nav to the homepage, a cleaner filter, and search-bar updates.
- 2 commits
Added search inside the DataTable.
- 2 commits
Reworded copy from "we" to "I" and polished the UI.
- 4 commits
Expanded the README with data sources and cleaned up.
- 4 commits
Updated the benchmark card for light mode.
- 6 commits
Added a Sources page wired into the sitemap, plus metadata fixes.
- 4 commits
Updated the card design for dark mode.
- 3 commits
Added LiveCodeBench and a relative last-updated time.
- 3 commits
Added a new benchmark and reordered GPQA and IFEval.
- 2 commits
Refined transitions and icons.
- 6 commits
Reworked navigation and the search bar, fixed model ordering, and cleaned up components.
- 10 commits
Made entire cells clickable, made the sort dropdown responsive, and did Svelte 5 cleanup and UI polish.
- 26 commits
Upgraded to Tailwind 4, made the leaderboard responsive, added model logos and an improved model page, and refined the filters and color palette.
- 4 commits
Got filters working and prevented the browser back-gesture while scrolling.
- 2 commits
Updated the page title.
- 1 commit
Fixed a Safari z-index issue.
- 1 commit
Added IndexNow.
- 1 commit
Added AIME 2025.
- 6 commits
Added smooth transitions and aligned sorting on the Models page.
- 13 commits
Set up a single source of truth for site data with per-page OG and a dynamic sitemap, fixed the canonical link, and refined the card defaults and UI.
- 8 commits
Added ranks for dates and context, reworked the cards, and polished styles.
- 6 commits
Added a dynamic sitemap and robots, improved color contrast, and removed Ahrefs analytics.
- 16 commits
Added more benchmarks, built out the model page, fixed multiple-H1 issues, and refreshed the footer and UI.
- 3 commits
Added subheadings and reduced the font size.
- 1 commit
Added AIME.
- 1 commit
Removed the Artificial Analysis score.
- 2 commits
Added cache-hit cost and switched the Type column to License.
- 10 commits
Upgraded search with exclusion operators and multi-term exclusion, added verified-commit signing, and improved the dark-mode badge.
- 6 commits
Improved sorting, trimmed duplicate H1s on the Models page, and tightened the meta descriptions.
- 10 commits
Added WebDevArena with style control, added ranks, and tightened the UI and accessibility.
- 2 commits
Tried a hex logo, then reverted.
- 6 commits
Reworked the Models page and added robots.txt and Ahrefs verification.
- 1 commit
Updated the README.
- 10 commits
Connected Supabase as the data source, cleaned up the Airtable data, and made size sortable.
- 21 commits
Added Cmd+K search with Windows support and refactored the search bar, added AidanBench, lazy-loaded logos, and refreshed the OG and X card metadata.
- 3 commits
Added the SWEBench, Codeforces, and AiderPoly benchmarks and improved the wordmark in dark mode.
- 3 commits
Added Open Graph and Twitter Card metadata.
- 1 commit
Formatted code.
- 6 commits
Rebranded, reordered the Hallucination column, and added a logo hover effect and a Microsoft icon.
- 1 commit
Added more benchmarks.
- 1 commit
Split the logo wordmark into its own component.
- 1 commit
Updated the version badge.
- 4 commits
Consolidated code and fixed a unique-column bug.
- 1 commit
Updated styles.
- 1 commit
Fixed font linking.
- 4 commits
Added Amazon logos, five ranks, and clear-search on Esc.
- 2 commits
Switched numbers to a mono font and moved all logos to SVG.
- 1 commit
Marked the version as Beta.
- 1 commit
Updated the icon.
- 5 commits
Updated analytics and security headers.
- 7 commits
Migrated from Svelte 4 to 5, added tooltips and analytics, and refreshed the UI and icons.
- 4 commits
Restructured the page and removed search autofocus.
- 5 commits
Added a release-type column with conditional styling and UI improvements.
- 1 commit
Updated the leaderboard page.
- 2 commits
Updated the env example and UI.
- 23 commits
Built a git-commit-based versioning system after several iterations, applied production-only security headers, and refreshed the UI and rebranding.
- 4 commits
Polished UI symmetry.
- 17 commits
Optimized ranking, separated the formatters, added release dates and logos, and cleaned up.
- 28 commits
Built out the leaderboard with dark mode, a responsive layout, logos, a favicon, and SEO.
- 3 commits
Started the project with an initial commit.

