What fix should you pick for your n+1 queries problem?
When you are trying to solve an n+1 problem, hopefully you will find different ways to solve it. But what is the one that you should use?
You can search the internet, but opinions can be overwhelming and confusing, because… “Window functions are great option here"… or "Window functions will work, but they can be horrible on performance"… "I’d go with caching (in a column)"… "You should use Russian-doll caching"… "I’d write custom SQL"…
Talking specifically about the fixes that I share to you to solve the "latest comment” n+1 queries problem… I think you can use the first one, “the association with default order”, unless your metrics are telling you that you need something different.
But any way, I think that been aware of the performance implications of your code and understanding the tradeoffs is a good thing…
That’s why I want to share with you some benchmarks to the 5 fixes in different scenarios, to help you understand which solution is better in each case, and which solution would be enough for your case.
Get it here:
Benchmarks for the fixes to the latest-comment n+1 problem
You will have results for this scenarios…
Each scenario will include the memory and the “iterations per second” benchmarks.
With this “reports"…
sorted_comments
- Default order for the "has many” associationlatest_comment
- A “has one” association for the latest commentfeed looping
- Looping through the latest comment for each postcached feed
- A version of the looping where we cache feed in memoryrussian_doll
- Russian doll-caching (with a simpler cache mechanism)cached_comment
- Caching the latest comment in a columnAll the reports where tested in a MacBook Pro (13-inch 2020) with:
This is the code used for the benchmark…
require "./config"
require "./seeds"
require "benchmark"
require "benchmark/memory"
require "benchmark/ips"
class Post < ActiveRecord::Base
has_many :comments
has_many :sorted_comments, -> { order(:id) }, class_name: "Comment"
has_one :latest_comment, -> { Comment.latest_comments_for_posts }, class_name: "Comment"
belongs_to :cached_comment, class_name: "Comment"
end
class Comment < ActiveRecord::Base
belongs_to :post
after_commit :update_cached_comment
def self.latest_comments_for_posts
where(id: select("max(id)").group(:post_id))
end
private
def update_cached_comment
post.update(cached_comment_id: id)
end
end
class Feed
def posts
@posts ||= fetch_posts
end
def fetch_posts
posts = Post.all
comments = Comment.latest_comments_for_posts.group_by(&:post_id)
posts.map { |post| FeedPost.new(post, comments[post.id]&.first) }
end
end
class FeedPost
attr_reader :latest_comment
def initialize(post, latest_comment)
@post = post
@latest_comment = latest_comment
end
end
def cache(post)
@cache ||= {}
@cache[post.id] ||= yield
end
def clear_cache!
@cache = {}
end
Seeds.run(posts_count: 20, comments_count: 1000)
n = 1
puts ""
puts "Memory Benchmark"
puts "---------"
Benchmark.memory do |benchmark|
benchmark.report("sorted_comments") do
n.times do
Post.includes(:sorted_comments).all.each do |post|
post.sorted_comments.last.body
end
end
end
benchmark.report("latest_comment") do
n.times do
Post.includes(:latest_comment).all.each do |post|
post.latest_comment.body
end
end
end
benchmark.report("feed looping") do
n.times do
Feed.new.posts.each do |post|
post.latest_comment.body
end
end
end
feed = Feed.new
benchmark.report("cached feed") do
n.times do
feed.posts.each do |post|
post.latest_comment.body
end
end
end
benchmark.report("cached_comment") do
n.times do
Post.includes(:cached_comment).all.each do |post|
post.cached_comment.body
end
end
end
benchmark.report("russian doll") do
n.times do
Post.all.each do |post|
cache(post) do
post.latest_comment.body
end
end
end
end
clear_cache!
benchmark.compare!
end
puts ""
puts "Benchmark ips"
puts "---------"
Benchmark.ips do |benchmark|
benchmark.config(:time => 5, :warmup => 1)
benchmark.report("sorted_comments") do
Post.includes(:sorted_comments).all.each do |post|
post.sorted_comments.last.body
end
end
benchmark.report("latest_comment") do
Post.includes(:latest_comment).all.each do |post|
post.latest_comment.body
end
end
benchmark.report("feed looping") do
Feed.new.posts.each do |post|
post.latest_comment.body
end
end
feed = Feed.new
benchmark.report("cached feed") do
n.times do
feed.posts.each do |post|
post.latest_comment.body
end
end
end
benchmark.report("cached_comment") do
Post.includes(:cached_comment).all.each do |post|
post.cached_comment.body
end
end
benchmark.report("russian doll") do
Post.all.each do |post|
cache(post) do
post.latest_comment.body
end
end
end
benchmark.compare!
end
You can see the code on github at bhserna/latest_comment_n-1_fixes_benchmark
Now here you have the reports…
Memory Benchmark
---------
Calculating -------------------------------------
sorted_comments 484.348k memsize ( 15.950k retained)
5.630k objects ( 146.000 retained)
50.000 strings ( 50.000 retained)
latest_comment 116.855k memsize ( 2.065k retained)
1.426k objects ( 16.000 retained)
50.000 strings ( 15.000 retained)
feed looping 55.176k memsize ( 0.000 retained)
702.000 objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
cached feed 55.136k memsize ( 35.840k retained)
701.000 objects ( 390.000 retained)
50.000 strings ( 45.000 retained)
cached_comment 70.032k memsize ( 40.000 retained)
883.000 objects ( 1.000 retained)
50.000 strings ( 1.000 retained)
russian doll 488.744k memsize ( 1.304k retained)
7.808k objects ( 24.000 retained)
50.000 strings ( 20.000 retained)
Comparison:
cached feed: 55136 allocated
feed looping: 55176 allocated - 1.00x more
cached_comment: 70032 allocated - 1.27x more
latest_comment: 116855 allocated - 2.12x more
sorted_comments: 484348 allocated - 8.78x more
russian doll: 488744 allocated - 8.86x more
Benchmark ips
---------
Warming up --------------------------------------
sorted_comments 29.000 i/100ms
latest_comment 58.000 i/100ms
feed looping 92.000 i/100ms
cached feed 14.814k i/100ms
cached_comment 76.000 i/100ms
russian doll 324.000 i/100ms
Calculating -------------------------------------
sorted_comments 317.872 (± 4.4%) i/s - 1.595k in 5.026951s
latest_comment 606.529 (± 4.1%) i/s - 3.074k in 5.077027s
feed looping 1.020k (± 3.4%) i/s - 5.152k in 5.056148s
cached feed 153.459k (± 2.5%) i/s - 770.328k in 5.022960s
cached_comment 803.337 (± 2.7%) i/s - 4.028k in 5.017891s
russian doll 3.322k (± 2.1%) i/s - 16.848k in 5.073899s
Comparison:
cached feed: 153459.3 i/s
russian doll: 3322.0 i/s - 46.19x (± 0.00) slower
feed looping: 1020.1 i/s - 150.43x (± 0.00) slower
cached_comment: 803.3 i/s - 191.03x (± 0.00) slower
latest_comment: 606.5 i/s - 253.01x (± 0.00) slower
sorted_comments: 317.9 i/s - 482.77x (± 0.00) slower
Memory Benchmark
---------
Calculating -------------------------------------
sorted_comments 586.444k memsize ( 15.950k retained)
6.930k objects ( 146.000 retained)
50.000 strings ( 50.000 retained)
latest_comment 116.855k memsize ( 2.065k retained)
1.426k objects ( 16.000 retained)
50.000 strings ( 15.000 retained)
feed looping 55.176k memsize ( 0.000 retained)
702.000 objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
cached feed 55.136k memsize ( 35.840k retained)
701.000 objects ( 390.000 retained)
50.000 strings ( 45.000 retained)
cached_comment 70.032k memsize ( 40.000 retained)
883.000 objects ( 1.000 retained)
50.000 strings ( 1.000 retained)
russian doll 488.744k memsize ( 1.304k retained)
7.808k objects ( 24.000 retained)
50.000 strings ( 20.000 retained)
Comparison:
cached feed: 55136 allocated
feed looping: 55176 allocated - 1.00x more
cached_comment: 70032 allocated - 1.27x more
latest_comment: 116855 allocated - 2.12x more
russian doll: 488744 allocated - 8.86x more
sorted_comments: 586444 allocated - 10.64x more
Benchmark ips
---------
Warming up --------------------------------------
sorted_comments 25.000 i/100ms
latest_comment 52.000 i/100ms
feed looping 95.000 i/100ms
cached feed 15.271k i/100ms
cached_comment 78.000 i/100ms
russian doll 325.000 i/100ms
Calculating -------------------------------------
sorted_comments 240.019 (± 5.0%) i/s - 1.200k in 5.011152s
latest_comment 532.712 (± 4.1%) i/s - 2.704k in 5.085499s
feed looping 987.484 (± 2.8%) i/s - 4.940k in 5.006529s
cached feed 156.682k (± 1.7%) i/s - 794.092k in 5.069703s
cached_comment 744.650 (± 7.4%) i/s - 3.744k in 5.059516s
russian doll 3.135k (± 4.9%) i/s - 15.925k in 5.092878s
Comparison:
cached feed: 156681.7 i/s
russian doll: 3134.7 i/s - 49.98x (± 0.00) slower
feed looping: 987.5 i/s - 158.67x (± 0.00) slower
cached_comment: 744.7 i/s - 210.41x (± 0.00) slower
latest_comment: 532.7 i/s - 294.12x (± 0.00) slower
sorted_comments: 240.0 i/s - 652.79x (± 0.00) slower
Memory Benchmark
---------
Calculating -------------------------------------
sorted_comments 1.408M memsize ( 15.950k retained)
17.330k objects ( 146.000 retained)
50.000 strings ( 50.000 retained)
latest_comment 116.855k memsize ( 2.065k retained)
1.426k objects ( 16.000 retained)
50.000 strings ( 15.000 retained)
feed looping 55.176k memsize ( 0.000 retained)
702.000 objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
cached feed 55.136k memsize ( 35.840k retained)
701.000 objects ( 390.000 retained)
50.000 strings ( 45.000 retained)
cached_comment 70.032k memsize ( 40.000 retained)
883.000 objects ( 1.000 retained)
50.000 strings ( 1.000 retained)
russian doll 488.744k memsize ( 1.304k retained)
7.808k objects ( 24.000 retained)
50.000 strings ( 20.000 retained)
Comparison:
cached feed: 55136 allocated
feed looping: 55176 allocated - 1.00x more
cached_comment: 70032 allocated - 1.27x more
latest_comment: 116855 allocated - 2.12x more
russian doll: 488744 allocated - 8.86x more
sorted_comments: 1408084 allocated - 25.54x more
Benchmark ips
---------
Warming up --------------------------------------
sorted_comments 8.000 i/100ms
latest_comment 29.000 i/100ms
feed looping 92.000 i/100ms
cached feed 15.385k i/100ms
cached_comment 76.000 i/100ms
russian doll 343.000 i/100ms
Calculating -------------------------------------
sorted_comments 87.033 (± 3.4%) i/s - 440.000 in 5.060406s
latest_comment 287.003 (± 3.8%) i/s - 1.450k in 5.060856s
feed looping 877.671 (± 3.0%) i/s - 4.416k in 5.036340s
cached feed 152.899k (± 2.3%) i/s - 769.250k in 5.033712s
cached_comment 719.988 (± 5.6%) i/s - 3.648k in 5.084290s
russian doll 3.284k (± 6.0%) i/s - 16.464k in 5.035281s
Comparison:
cached feed: 152899.4 i/s
russian doll: 3284.4 i/s - 46.55x (± 0.00) slower
feed looping: 877.7 i/s - 174.21x (± 0.00) slower
cached_comment: 720.0 i/s - 212.36x (± 0.00) slower
latest_comment: 287.0 i/s - 532.75x (± 0.00) slower
sorted_comments: 87.0 i/s - 1756.80x (± 0.00) slower
Memory Benchmark
---------
Calculating -------------------------------------
sorted_comments 2.434M memsize ( 15.950k retained)
30.330k objects ( 146.000 retained)
50.000 strings ( 50.000 retained)
latest_comment 116.855k memsize ( 2.065k retained)
1.426k objects ( 16.000 retained)
50.000 strings ( 15.000 retained)
feed looping 55.176k memsize ( 0.000 retained)
702.000 objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
cached feed 55.136k memsize ( 35.840k retained)
701.000 objects ( 390.000 retained)
50.000 strings ( 45.000 retained)
cached_comment 70.032k memsize ( 40.000 retained)
883.000 objects ( 1.000 retained)
50.000 strings ( 1.000 retained)
russian doll 488.744k memsize ( 1.304k retained)
7.808k objects ( 24.000 retained)
50.000 strings ( 20.000 retained)
Comparison:
cached feed: 55136 allocated
feed looping: 55176 allocated - 1.00x more
cached_comment: 70032 allocated - 1.27x more
latest_comment: 116855 allocated - 2.12x more
russian doll: 488744 allocated - 8.86x more
sorted_comments: 2433508 allocated - 44.14x more
Benchmark ips
---------
Warming up --------------------------------------
sorted_comments 4.000 i/100ms
latest_comment 19.000 i/100ms
feed looping 79.000 i/100ms
cached feed 15.535k i/100ms
cached_comment 85.000 i/100ms
russian doll 353.000 i/100ms
Calculating -------------------------------------
sorted_comments 47.354 (± 4.2%) i/s - 240.000 in 5.075750s
latest_comment 185.682 (± 2.2%) i/s - 931.000 in 5.016461s
feed looping 754.015 (± 1.7%) i/s - 3.792k in 5.030455s
cached feed 154.707k (± 1.1%) i/s - 776.750k in 5.021351s
cached_comment 819.179 (± 2.0%) i/s - 4.165k in 5.086336s
russian doll 3.390k (± 2.6%) i/s - 17.297k in 5.104940s
Comparison:
cached feed: 154707.3 i/s
russian doll: 3390.5 i/s - 45.63x (± 0.00) slower
cached_comment: 819.2 i/s - 188.86x (± 0.00) slower
feed looping: 754.0 i/s - 205.18x (± 0.00) slower
latest_comment: 185.7 i/s - 833.18x (± 0.00) slower
sorted_comments: 47.4 i/s - 3267.07x (± 0.00) slower
Memory Benchmark
---------
Calculating -------------------------------------
sorted_comments 20.977M memsize ( 15.950k retained)
264.330k objects ( 146.000 retained)
50.000 strings ( 50.000 retained)
latest_comment 116.855k memsize ( 2.065k retained)
1.426k objects ( 16.000 retained)
50.000 strings ( 15.000 retained)
feed looping 55.176k memsize ( 0.000 retained)
702.000 objects ( 0.000 retained)
50.000 strings ( 0.000 retained)
cached feed 55.136k memsize ( 35.840k retained)
701.000 objects ( 390.000 retained)
50.000 strings ( 45.000 retained)
cached_comment 70.032k memsize ( 40.000 retained)
883.000 objects ( 1.000 retained)
50.000 strings ( 1.000 retained)
russian doll 488.744k memsize ( 1.304k retained)
7.808k objects ( 24.000 retained)
50.000 strings ( 20.000 retained)
Comparison:
cached feed: 55136 allocated
feed looping: 55176 allocated - 1.00x more
cached_comment: 70032 allocated - 1.27x more
latest_comment: 116855 allocated - 2.12x more
russian doll: 488744 allocated - 8.86x more
sorted_comments: 20977100 allocated - 380.46x more
Benchmark ips
---------
Warming up --------------------------------------
sorted_comments 1.000 i/100ms
latest_comment 22.000 i/100ms
feed looping 25.000 i/100ms
cached feed 14.818k i/100ms
cached_comment 57.000 i/100ms
russian doll 288.000 i/100ms
Calculating -------------------------------------
sorted_comments 4.665 (± 0.0%) i/s - 24.000 in 5.177711s
latest_comment 206.056 (± 7.3%) i/s - 1.034k in 5.050651s
feed looping 239.214 (± 4.2%) i/s - 1.200k in 5.026543s
cached feed 146.296k (± 5.1%) i/s - 740.900k in 5.079435s
cached_comment 706.969 (±10.8%) i/s - 3.534k in 5.064015s
russian doll 3.235k (± 4.9%) i/s - 16.416k in 5.087702s
Comparison:
cached feed: 146296.1 i/s
russian doll: 3235.0 i/s - 45.22x (± 0.00) slower
cached_comment: 707.0 i/s - 206.93x (± 0.00) slower
feed looping: 239.2 i/s - 611.57x (± 0.00) slower
latest_comment: 206.1 i/s - 709.98x (± 0.00) slower
sorted_comments: 4.7 i/s - 31362.27x (± 0.00) slower
I think that you can use the simplest solution, “the association with default order”, maybe until you are in the 50 comments per post, after that maybe you could try other options like “fragment caching” or “looping through the latest comment for each post”.
You can also download a cheatsheet to help you decide which fix to pick in different contexts, based mostly on my interpretation of the benchmarks and the complexity of each solution.
Get it here: Cheatsheet for the “latest-comment” n+1 queries problem
Learn just enough fundamentals to be fluent preloading associations with ActiveRecord, and start helping your team to avoid n+1 queries on production.