Benito Serna
Ruby/Rails, TDD, Software...

5 ways to fix the latest-comment n+1 problem

Do you have a method that works fine with a single record, but when you use it on a list causes N+1 queries?

One example of this problem is trying to get the latest comment on a list of posts, but there are others, like the last review in a list of products, or the cheapest price, etc…

But let’s explain the problem with…

The latest comment on a list of posts

Imagine, that you have a model like this…

class Posts
  has_many :comments

  def latest_comment
    comments.order(:created_at).last
  end
end

When you are fetching a single post there is no problem…

post = Post.find(id)
puts post.latest_comment

But when you try to fetch a list of posts, the method seems to ignore the includes, and runs a query for each post to get the latest_comment for each post!

Post.includes(:comments).each do |post|
  puts post.latest_comment
end

You can write a different method to get the latest_comment with ruby (not ActiveRecord), but it feels inelegant, and maybe it can be confusing later…

def self.latest_comment(post)
  post.comments.sort_by(&:created_at).last
end

The good news is that there are other solutions =)… And I will share with you some of them. So, you can select the one that is better for the current state of your app.

1. A default order for the “has_many” association

Instead of sorting the comments on the latest_comment method, you can sort the comments at the association level and just ask for the last sorted comment.

class Post
  has_many :sorted_comments, -> { order(:created_at) }

  def latest_comment
    sorted_comments.last
  end
end
Post.includes(:sorted_comments).each do |post|
  puts post.latest_comment
end

2. A “has_one” association for the latest comment

Instead of loading all the comments for all the posts, you can use a has_one association for the latest_comment, providing a scope with just the latest_comment for each post.

class Post < ActiveRecord::Base
  has_many :comments
  has_one :latest_comment, -> { Comment.latest_comments_for_posts }, class_name: "Comment"
end
class Comment < ActiveRecord::Base
  belongs_to :post

  def self.latest_comments_for_posts
    latest_comments_ids = select("max(id)").group(:post_id)
    where(id: latest_comments_ids)
  end
end
Post.includes(:latest_comment).all.each do |post|
  puts post.latest_comment.body
end

3. Looping through the latest comment for each post

You can query the database for a list with the latest_comment for each post, group them by post_id and then pick the right comment for each post.

You can also use some specific objects the task, If you want to avoid polluting you Post and Comment models.

class Post < ActiveRecord::Base
  has_many :comments
end
class Comment < ActiveRecord::Base
  belongs_to :post

  def self.latest_comments_for_posts
    latest_comments_ids = select("max(id)").group(:post_id)
    where(id: latest_comments_ids)
  end
end
class Feed
  def posts
    posts = Post.all
    comments = Comment.latest_comments_for_posts.group_by(&:post_id)
    posts.map { |post| FeedPost.new(post, comments[post.id]&.first) }
  end
end
class FeedPost
  attr_reader :latest_comment

  def initialize(post, latest_comment)
    @post = post
    @latest_comment = latest_comment
  end
end
feed = Feed.new
feed.posts.map do |post|
  puts post.latest_comment.body
end

4. Russian doll-caching

If you are already using fragment cache on your app, you can embrace the “Russian doll caching” and think of n+1 as a feature, and let the n+1 queries run just the first time and use the cache after that.

class Post < ActiveRecord::Base
  has_many :comments

  def latest_comment
    comments.order(:id).last
  end
end
class Comment < ActiveRecord::Base
  belongs_to :post
end

In your view…

<% #posts/index.html.erb %>
<%= render partial: 'posts/post', collection: @posts, cached: true %>
<% #posts/_post.html.erb %>
<% cache post %>
  <%= post.title %>
  <%= post.latest_comment.body %>
<% end %>

5. Caching the latest comment

You can also add a new column for the last_comment_id and a belongs_to association, and keep the last comment in sync “manually” (maybe with an after_commit callback).

ActiveRecord::Schema.define do
  create_table :posts do |t|
    t.column :title, :string
    t.column :body, :text
    t.column :latest_comment_id, :integer
  end

  create_table :comments do |t|
    t.column :body, :text
    t.column :post_id, :integer
  end
end
class Post < ActiveRecord::Base
  has_many :comments
  belongs_to :latest_comment, class_name: "Comment"
end
class Comment < ActiveRecord::Base
  belongs_to :post
  after_commit :update_latest_comment

  private

  def update_latest_comment
    post.update(latest_comment_id: id)
  end
end
Post.includes(:latest_comment).all.each do |post|
  puts post.latest_comment.body
end

Other options for the “latest comments for posts” query

You can try other options for the query. I used one that worked on postgres and sqlite3, but maybe you want to try other depending on your database.

This work on sqlite3 and postgres (used in the examples)…

def self.latest_comments_for_posts
  latest_comments_ids = select("max(id)").group(:post_id)
  where(id: latest_comments_ids)
end

This works on postgres…

def self.latest_comments_for_posts
  from(order(:post_id, id: :desc).arel.distinct_on(arel_table[:post_id]).as("comments"))
end

And this works on sqlite3…

def self.latest_comments_for_posts
  from(order(id: :desc), "comments").group(:post_id)
end

Do you want to see some benchmarks?

If you want to see how this fixes behave in different scenarios, you can take a look to the benchmarks for this 5 fixes.

Or play with the code for those benchmarks. You can see the code on github at bhserna/latest_comment_n-1_fixes_benchmark

Do you want to run the examples?

You can see the code on github at bhserna/latest_comment_n-1_examples.

You can use it to run the examples and play with the code and seed values, to pick the right solution for you current case.

Now you can pick one

Now you know 5 difference ways to solve the problem, and you can pick the one that better fits your app =)

If you are not sure, I think you can use the first one, “the association with default order”, unless your metrics are telling you that you need something different.

Download a cheatsheet

You can also download a cheatsheet to help you decide which fix to pick in different contexts, based mostly on my interpretation of the benchmarks and the complexity of each solution.

Get it here: Cheatsheet for the “latest-comment” n+1 queries problem