Do you have a method that works fine with a single record, but when you use it on a list causes N+1 queries?
One example of this problem is trying to get the latest comment on a list of posts, but there are others, like the last review in a list of products, or the cheapest price, etc…
But let’s explain the problem with…
Imagine, that you have a model like this…
class Posts
has_many :comments
def latest_comment
comments.order(:created_at).last
end
end
When you are fetching a single post there is no problem…
post = Post.find(id)
puts post.latest_comment
But when you try to fetch a list of posts, the method seems to ignore the
includes
and runs a query for each post to get the latest_comment
for each post!
Post.includes(:comments).each do |post|
puts post.latest_comment
end
You can write a different method to get the latest_comment
with ruby (not
ActiveRecord), but it feels inelegant, and maybe it can be confusing later…
def self.latest_comment(post)
post.comments.sort_by(&:created_at).last
end
The good news is that there are other solutions =)… And I will share with you some of them. So, you can select the one that is better for the current state of your app.
Instead of sorting the comments on the latest_comment
method, you can sort
the comments at the association level and just ask for the last
sorted
comment.
class Post
has_many :sorted_comments, -> { order(:created_at) }, class_name: "Comment"
def latest_comment
sorted_comments.last
end
end
Post.includes(:sorted_comments).each do |post|
puts post.latest_comment
end
Instead of loading all the comments for all the posts, you can use a has_one
association for the latest_comment
, providing a scope with just the
latest_comment
for each post
.
class Post < ActiveRecord::Base
has_many :comments
has_one :latest_comment, -> { Comment.latest_comments_for_posts }, class_name: "Comment"
end
class Comment < ActiveRecord::Base
belongs_to :post
def self.latest_comments_for_posts
latest_comments_ids = select("max(id)").group(:post_id)
where(id: latest_comments_ids)
end
end
Post.includes(:latest_comment).all.each do |post|
puts post.latest_comment.body
end
You can query the database for a list with the latest_comment
for each post
,
group them by post_id
and then pick the right comment
for each post
.
You can also use some specific objects the task, If you want to avoid polluting
you Post
and Comment
models.
class Post < ActiveRecord::Base
has_many :comments
end
class Comment < ActiveRecord::Base
belongs_to :post
def self.latest_comments_for_posts
latest_comments_ids = select("max(id)").group(:post_id)
where(id: latest_comments_ids)
end
end
class Feed
def posts
posts = Post.all
comments = Comment.latest_comments_for_posts.group_by(&:post_id)
posts.map { |post| FeedPost.new(post, comments[post.id]&.first) }
end
end
class FeedPost
attr_reader :latest_comment
def initialize(post, latest_comment)
@post = post
@latest_comment = latest_comment
end
end
feed = Feed.new
feed.posts.map do |post|
puts post.latest_comment.body
end
If you are already using fragment cache on your app, you can embrace the “Russian doll caching” and think of n+1 as a feature, and let the n+1 queries run just the first time and use the cache after that.
class Post < ActiveRecord::Base
has_many :comments
def latest_comment
comments.order(:id).last
end
end
class Comment < ActiveRecord::Base
belongs_to :post
end
In your view…
<% #posts/index.html.erb %>
<%= render partial: 'posts/post', collection: @posts, cached: true %>
<% #posts/_post.html.erb %>
<% cache post %>
<%= post.title %>
<%= post.latest_comment.body %>
<% end %>
You can also add a new column for the last_comment_id
and a belongs_to
association, and keep the last comment in sync “manually” (maybe with an
after_commit
callback).
ActiveRecord::Schema.define do
create_table :posts do |t|
t.column :title, :string
t.column :body, :text
t.column :latest_comment_id, :integer
end
create_table :comments do |t|
t.column :body, :text
t.column :post_id, :integer
end
end
class Post < ActiveRecord::Base
has_many :comments
belongs_to :latest_comment, class_name: "Comment"
end
class Comment < ActiveRecord::Base
belongs_to :post
after_commit :update_latest_comment
private
def update_latest_comment
post.update(latest_comment_id: id)
end
end
Post.includes(:latest_comment).all.each do |post|
puts post.latest_comment.body
end
You can try other options for the query. I used one that worked on postgres and sqlite3, but maybe you want to try other depending on your database.
This work on sqlite3 and postgres (used in the examples)…
def self.latest_comments_for_posts
latest_comments_ids = select("max(id)").group(:post_id)
where(id: latest_comments_ids)
end
This works on postgres…
def self.latest_comments_for_posts
from(order(:post_id, id: :desc).arel.distinct_on(arel_table[:post_id]).as("comments"))
end
And this works on sqlite3…
def self.latest_comments_for_posts
from(order(id: :desc), "comments").group(:post_id)
end
If you want to see how this fixes behave in different scenarios, you can take a look to the benchmarks for this 5 fixes.
Or play with the code for those benchmarks. You can see the code on github at bhserna/latest_comment_n-1_fixes_benchmark
You can see the code on github at bhserna/latest_comment_n-1_examples.
You can use it to run the examples and play with the code and seed values, to pick the right solution for you current case.
Now you know 5 difference ways to solve the problem, and you can pick the one that better fits your app =)
If you are not sure, I think you can use the first one, “the association with default order”, unless your metrics are telling you that you need something different.
You can also download a cheatsheet to help you decide which fix to pick in different contexts, based mostly on my interpretation of the benchmarks and the complexity of each solution.
Get it here: Cheatsheet for the “latest-comment” n+1 queries problem
Learn just enough fundamentals to be fluent preloading associations with ActiveRecord, and start helping your team to avoid n+1 queries on production.