Do you have a method that works fine with a single record, but when you use it on a list causes N+1 queries?
Imagine that you have a model like this…
class Posts
has_many :comments
def latest_comment
comments.order(:created_at).last
end
end
When you are fetching a single post there is no problem…
post = Post.find(id)
puts post.latest_comment
But when you try to fetch a list of posts, the method seems to ignore the includes
, and runs a query for each post to get the latest_comment
for each post!
Post.includes(:comments).each do |post|
puts post.latest_comment
end
Why???
I think that the first step, to fix this kind of problems, is to understand how the method chaining in Active Record works.
Active Record implements “method chaining” which allow us tu use multiple Active Record methods together.
You can chain methods in a statement when the previous method called returns an ActiveRecord::Relation
, like all
, where
, includes
, joins
and order
.
You can’t chain Active Record methods, after a call to a method that does not return an ActiveRecord::Relation
, like to_a
, find
or last
. You need to put those methods at the end of the statement.
When you are fetching a single post there is no problem, because all the methods before the call to last
return an ActiveRecord::Relation
post = Post.find(id)
puts post.latest_comment
If you try each link of the chain, you will see that just the call to last
does not return an ActiveRecord::Relation.
relation = ActiveRecord::Relation
comments = Post.find(id).comments
puts comments.is_a?(relation) #=> true
puts comments.order(:id).is_a?(relation) #=> true
puts comments.order(:id).last.is_a?(relation) #=> false
But when you try to fetch the list…
Post.includes(:comments).each do |post|
puts post.latest_comment
end
At the moment you call each
on the ActiveRecord::Relation
, it will execute the query, and in your logs you will see something like this…
Post Load (0.2ms) SELECT "posts".* FROM "posts"
Comment Load (0.4ms) SELECT "comments".* FROM "comments" WHERE "comments"."post_id" IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10) ORDER BY "comments"."id" ASC [["post_id", 2001], ["post_id", 2002], ["post_id", 2003], ["post_id", 2004], ["post_id", 2005], ["post_id", 2006], ["post_id", 2007], ["post_id", 2008], ["post_id", 2009], ["post_id", 2010]]
But then for each post it will execute a new query, because although you already have the comments loaded, with order(:created_at).last
you are building a new query to fetch
the latest comment with a different order.
That is the reason why when you try to fetch a list of posts, the method seems to ignore the includes, and runs a query for each post to get the latest_comment
for each post.
Maybe your problem is not exactly like the example, but if you are also trying to fetch the latest “X” in a list of records I think that you can use an association with a default order.
Something like this…
class Post
has_many :comments, -> { order(:created_at) }
def latest_comment
comments.last
end
end
Post.includes(:comments).each do |post|
puts post.latest_comment
end
It won’t execute n+1 queries, because you are fetching the comments already on the right order, and you don’t need to ask the database to sort them again.
If you want to see other options you can see this post with some other ways to solve the problem
Now that you understand when the query will be execute, maybe you will want to do a little excercise to practice this ability and try to make it automatic.
You can try this little excercise with 10 examples to help you learn when your ActiveRecord query will be executed.
Learn just enough fundamentals to be fluent preloading associations with ActiveRecord, and start helping your team to avoid n+1 queries on production.