RankedAGI

AI models ranked by latest benchmarks

  1. 1 commit

    Refreshed benchmark data.

  2. 14 commits

    Shipped the /engine page explaining the simulated-data scoring methodology and wired the hybrid v1+v2 estimator into production, backed by a cross-validation harness and a v2 factor-model estimator. Refreshed model data with Terminal-Bench 2.1.

  3. 5 commits

    Added Opus 4.8 and Qwen 3.7 Max benchmark data and switched dev to portless run --name for worktree-friendly URLs.

  4. 2 commits

    Added branding assets and updated the DeepSeek price.

  5. 1 commit

    Added Gemini 3.5 Flash.

  6. 1 commit

    Added Composer 1 and optimized the icon.

  7. 18 commits

    Built out AI discoverability: build-time Markdown companions served via content negotiation, an llms.txt overview, site structured data and JSON-LD, and a permissive crawler signal. Expanded the sources methodology page with the composite score formula, a data-access FAQ, and simulated-evidence controls.

  8. 14 commits

    Added global-rank reveal: rank on hover plus tap-to-reveal on mobile, with a faster forward transition. Surfaced model size next to name and org across the homepage and admin lists, linked admin model names to their edit pages, and refreshed the model dataset.

  9. 13 commits

    Reworked the simulated-data UI: per-profile scoring with simulated data, soft-orange accent, simplified controls, a hover rank style with tooltip, and an icon-only columns button. Added an admin sort dropdown and collapsible sidebar, plus a benchmark dataset refresh.

  10. 34 commits

    Major scoring work: added v4 and v6 pairwise-Elo backends, wired v4 through admin previews and the leaderboard, and refined the v5 frontier simulation with controlled extrapolation and sparse-data handling. Redesigned the lazy-fetched simulated-data toggle with smooth transitions and a sparkle-marked cell, and added a Visualizations lab of score-decomposition variants.

  11. 2 commits

    Reorganized the admin Models routes and refreshed the sidebar, dashboard thresholds, and model data.

  12. 2 commits

    Refreshed model data and added Grok 4.3.

  13. 2 commits

    Added Mistral Medium 3.5 and minor UI tweaks.

  14. 1 commit

    Added more GPT 5.5 benchmark data.

  15. 1 commit

    Added Mimo open-source model data and updated the dev setup.

  16. 4 commits

    Refreshed benchmark data with a DeepSeek price update and asset cleanup.

  17. 15 commits

    Drove composite-column styling from a shared category config, routing the matrix palette and admin list through it and tinting headers by category. Added per-profile display mode (percentage or rank) with a smooth hover swap, refreshed the logo and OG image, and skipped search autofocus on touch devices.

  18. 35 commits

    Built the admin benchmark matrix: moved it to a top-level route, added an open-source license filter, and redesigned the dashboard around weakest-coverage models with a live contribution panel. Solved a thorny SPA-nav picker focus bug, and hardened the data layer with rolling backups and a per-file write mutex.

  19. 15 commits

    Refined the benchmark sheet's composite chips: a Lucide star icon for proper sizing, letter-only collapse when squeezed, and tint that follows category. Synced the dot palette to the sheet, drove profile visibility from data, unhid the Overall column publicly, and added GPT 5.5 plus Toolathlon and DesignArena Code.

  20. 1 commit

    A small data refresh.

  21. 1 commit

    Tuned the inset shadow on table surfaces.

  22. 6 commits

    Pinned the site footer to the viewport bottom, extracted it to its own component, and made it sticky only on the home page while tuning the table height. Fixed the model detail page width, and added Qwen 3.6 Max and Kimi K2.6 plus a data refresh.

  23. 21 commits

    Rebuilt the benchmark sheet as a spreadsheet-style viewport that saves without reloading, with sticky thead/tfoot separators via inset shadows, sort that respects saved patches, click-sort, and editable model-meta columns. Stopped the dev server reloading when the lab writes data files, and added a portless setup.

  24. 21 commits

    Built the model-form lab through many modes (Command Center, Benchmark Grid, Leaderboard Editor, Inspector, Compare, Dial, Command) with a shared live-scoring helper, then scaffolded the fill-one-benchmark-across-models sheet with sort, fill, and paste-match. Fixed the Save-all 404 by repointing forms, refined benchmark inputs, and added Muse Spark with new benchmarks.

  25. 12 commits

    Restructured the admin lab into tabbed sub-routes with a polished, grouped, collapsible sidebar. Added the RAGI Agentic composite and removed the legacy computedFrom system, applied the squircle style across admin buttons and the search bar, fixed a11y warnings, renamed assets to rankedagi, and added GPT 5.4 Pro.

  26. 6 commits

    Promoted the v3 scoring to production with its gamma curve, coverage knobs, and lab page, then sorted the public site by the configured RAGI profile. Fixed a silent delete button in the model edit dialog, rearranged benchmarks, and added Opus 4.7.

  27. 3 commits

    Added windowed rendering to the admin models table for better performance, and sped up admin navigation by caching JSON reads on mtime and size with hover preloading.

  28. 21 commits

    Made the v2 split-pane editor the canonical model editor, retiring the old edit route, with snapshot reset and forced remount on model switch. Converted the admin top nav to a collapsible left sidebar, added a read-only aesthetic preview page, and fixed RAGI benchmark visibility and a model-edit form leak.

  29. 16 commits

    Rebuilt RAGI Studio with live preview, a Lean scoring mode, and unified controls, redesigning the source table with ChipSelect controls and aligning its design language with the rankings preview. Standardized benchmark keys to lowercase-with-dashes, unified admin model form styling, and handled currency negatives.

  30. 23 commits

    Added a dedicated RAGI studio with scoring controls, added the Vending Bench 2 benchmark with currency-format support and negative values, and polished the model edit form. Refined the mobile layout: a brain icon for thinking, compact dates, tighter padding, and a hidden version badge.

  31. 33 commits

    Reworked the admin: redesigned the benchmarks page as a table, added delete-model and unsaved-changes warnings, persisted table sort, and replaced license/type/version dropdowns with segmented toggles plus a ThinkingEffort field. Settled column drag-and-drop after several reverts, and refined mobile table layout.

  32. 23 commits

    Migrated the rankings grid from an HTML table to CSS Grid with subgrid, with conditional frozen-pane and header shadows on scroll. Added drag-and-drop column reordering with persistent visibility, fixed lint and svelte-check errors across files, and improved the model edit form.

  33. 46 commits

    Fixed an O(N^2) DataTable bottleneck and replaced $effect anti-patterns with idiomatic Svelte 5, removed 8 unused shadcn families, and dropped date-fns for a native relative-time helper. Added benchmark archiving, model filters, and a multi-link array, unified benchmark category sort order, and centralized admin styling.

  34. 6 commits

    Refactored JSON data access for SvelteKit, fixed benchmark form save behavior, sorted admin models by release date while keeping edits in place, and added the GLM 5.1 model.

  35. 1 commit

    Made admin updates.

  36. 10 commits

    Added model table header sorting and a delete dialog, refactored the admin layout and styling for responsiveness and dark mode, set a 404 fallback for unknown URLs on static hosts, and sourced benchmark columns from benchmarks.json.

  37. 13 commits

    Dropped Supabase for local data functions, adding model helpers and a pre-sorted models.json. Adopted a new font and color stack with dark mode, moved filters to bindable state with $derived.by, and migrated to $app/state and resolve()-based navigation.

  38. 1 commit

    Updated the app shell.

  39. 1 commit

    Updated npm dependencies.

  40. 1 commit

    Updated the leaderboard page.

  41. 1 commit

    Formatting pass.

  42. 1 commit

    Focused the search bar on load.

  43. 6 commits

    Added sticky columns to the leaderboard table with a capped height and refined header styling.

  44. 1 commit

    Refactored table state to Svelte 5 runes.

  45. 2 commits

    Added the SvelteBench benchmark with a column label and visibility control.

  46. 2 commits

    Added an About link to the footer and fixed the heading color in dark mode.

  47. 1 commit

    Improved DataTable spacing for layout consistency.

  48. 1 commit

    Refined FilterDropdown and TableHeader styling.

  49. 5 commits

    Refined the table header and filter component styling and clarified the LiveBench column.

  50. 1 commit

    Changed the Preview badge color from yellow to fuchsia.

  51. 10 commits

    Reworked benchmark columns: added Humanity, swapped LiveCodeBench versions, and tuned column visibility and labels.

  52. 1 commit

    Removed the deprecated Aider column.

  53. 1 commit

    Switched the table header to a columnVisibility store for visibility control.

  54. 10 commits

    Reworked model filters into grouped, checkbox-based accordions with responsive mobile rendering, grouped the column selector into toggleable categories, and added a shared ChevronIcon.

  55. 1 commit

    Clarified the LiveBench code-column version subtitle.

  56. 1 commit

    Clarified the LiveBench code-column version subtitle.

  57. 2 commits

    Reworked the LiveBench column keys and tooltips for 2024 and 2025.

  58. 4 commits

    Standardized benchmark column keys to lowercase-with-underscores and added the ragi_code column.

  59. 2 commits

    Updated the Open Graph and Twitter images to the new OG image with better shadows.

  60. 1 commit

    Added MMMU and tooltip descriptions for the MMLU columns.

  61. 3 commits

    Moved the logo config to its own file and conditionally showed organization logos on the model page.

  62. 1 commit

    Clarified the About page's ownership and mission and improved the contact info.

  63. 1 commit

    Set an explicit icon height for display.

  64. 5 commits

    Rebranded from RankedAI to RankedAGI across the project and README and added the NYT Connections Extended column.

  65. 3 commits

    Refined model link styling for visibility in light and dark mode and added svelte-render-scan.

  66. 3 commits

    Updated the navigation icons.

  67. 12 commits

    Improved the leaderboard: better model contrast and underlines, search-bar refinements, the leaderboard in the nav, and new vendor logos.

  68. 5 commits

    Added categories and subcategories, logos in the Model column, and header polish.

  69. 7 commits

    Added nav to the homepage, a cleaner filter, and search-bar updates.

  70. 2 commits

    Added search inside the DataTable.

  71. 2 commits

    Reworded copy from "we" to "I" and polished the UI.

  72. 4 commits

    Expanded the README with data sources and cleaned up.

  73. 4 commits

    Updated the benchmark card for light mode.

  74. 6 commits

    Added a Sources page wired into the sitemap, plus metadata fixes.

  75. 4 commits

    Updated the card design for dark mode.

  76. 3 commits

    Added LiveCodeBench and a relative last-updated time.

  77. 3 commits

    Added a new benchmark and reordered GPQA and IFEval.

  78. 2 commits

    Refined transitions and icons.

  79. 6 commits

    Reworked navigation and the search bar, fixed model ordering, and cleaned up components.

  80. 10 commits

    Made entire cells clickable, made the sort dropdown responsive, and did Svelte 5 cleanup and UI polish.

  81. 26 commits

    Upgraded to Tailwind 4, made the leaderboard responsive, added model logos and an improved model page, and refined the filters and color palette.

  82. 4 commits

    Got filters working and prevented the browser back-gesture while scrolling.

  83. 2 commits

    Updated the page title.

  84. 1 commit

    Fixed a Safari z-index issue.

  85. 1 commit

    Added IndexNow.

  86. 1 commit

    Added AIME 2025.

  87. 6 commits

    Added smooth transitions and aligned sorting on the Models page.

  88. 13 commits

    Set up a single source of truth for site data with per-page OG and a dynamic sitemap, fixed the canonical link, and refined the card defaults and UI.

  89. 8 commits

    Added ranks for dates and context, reworked the cards, and polished styles.

  90. 6 commits

    Added a dynamic sitemap and robots, improved color contrast, and removed Ahrefs analytics.

  91. 16 commits

    Added more benchmarks, built out the model page, fixed multiple-H1 issues, and refreshed the footer and UI.

  92. 3 commits

    Added subheadings and reduced the font size.

  93. 1 commit

    Added AIME.

  94. 1 commit

    Removed the Artificial Analysis score.

  95. 2 commits

    Added cache-hit cost and switched the Type column to License.

  96. 10 commits

    Upgraded search with exclusion operators and multi-term exclusion, added verified-commit signing, and improved the dark-mode badge.

  97. 6 commits

    Improved sorting, trimmed duplicate H1s on the Models page, and tightened the meta descriptions.

  98. 10 commits

    Added WebDevArena with style control, added ranks, and tightened the UI and accessibility.

  99. 2 commits

    Tried a hex logo, then reverted.

  100. 6 commits

    Reworked the Models page and added robots.txt and Ahrefs verification.

  101. 1 commit

    Updated the README.

  102. 10 commits

    Connected Supabase as the data source, cleaned up the Airtable data, and made size sortable.

  103. 21 commits

    Added Cmd+K search with Windows support and refactored the search bar, added AidanBench, lazy-loaded logos, and refreshed the OG and X card metadata.

  104. 3 commits

    Added the SWEBench, Codeforces, and AiderPoly benchmarks and improved the wordmark in dark mode.

  105. 3 commits

    Added Open Graph and Twitter Card metadata.

  106. 1 commit

    Formatted code.

  107. 6 commits

    Rebranded, reordered the Hallucination column, and added a logo hover effect and a Microsoft icon.

  108. 1 commit

    Added more benchmarks.

  109. 1 commit

    Split the logo wordmark into its own component.

  110. 1 commit

    Updated the version badge.

  111. 4 commits

    Consolidated code and fixed a unique-column bug.

  112. 1 commit

    Updated styles.

  113. 1 commit

    Fixed font linking.

  114. 4 commits

    Added Amazon logos, five ranks, and clear-search on Esc.

  115. 2 commits

    Switched numbers to a mono font and moved all logos to SVG.

  116. 1 commit

    Marked the version as Beta.

  117. 1 commit

    Updated the icon.

  118. 5 commits

    Updated analytics and security headers.

  119. 7 commits

    Migrated from Svelte 4 to 5, added tooltips and analytics, and refreshed the UI and icons.

  120. 4 commits

    Restructured the page and removed search autofocus.

  121. 5 commits

    Added a release-type column with conditional styling and UI improvements.

  122. 1 commit

    Updated the leaderboard page.

  123. 2 commits

    Updated the env example and UI.

  124. 23 commits

    Built a git-commit-based versioning system after several iterations, applied production-only security headers, and refreshed the UI and branding.

  125. 4 commits

    Polished UI symmetry.

  126. 17 commits

    Optimized ranking, separated the formatters, added release dates and logos, and cleaned up.

  127. 28 commits

    Built out the leaderboard with dark mode, a responsive layout, logos, a favicon, and SEO.

  128. 3 commits

    Started the project with an initial commit.