Benito Serna Tips and tools for Ruby on Rails developers

Using an objet to represent a complex preload

August 23, 2021

Sometimes just using preload or includes is not possible… or maybe it is, but you just can’t figure it out how to do it.

Maybe you want to preload some records matching two keys, or preload a grouped relation.

Maybe you know how to represent the association with a has_many :through, but you need something faster, or to use less memory.

In that kind of situations, one thing that you can do is to write a custom object to represent that preload.

Here I will try to explain how and when you can do it with an example.

Index

Example

For example, given the next model:

class Post < ApplicationRecord
  belongs_to :author, class_name: "User"
  has_many :comments
end

class Comment < ApplicationRecord
  belongs_to :post
  belongs_to :user
  has_many :votes, class_name: "CommentVote"
end

class CommentVote < ApplicationRecord
  belongs_to :voter, class_name: "User"
  belongs_to :comment
end

class User < ApplicationRecord
  has_many :comment_votes, foreign_key: :voter_id
end

Imagine that you want to fetch the people that have a vote on a comment to a post, the “comment voters”.

To implement it, you can have a method in the post to do that…

class Post < ApplicationRecord
  belongs_to :author, class_name: "User"
  has_many :comments

  def comment_voters
    comments.preload(votes: :voter)
      .flat_map(&:votes)
      .flat_map(&:voter)
      .uniq
  end
end

And it will correctly preload the votes and voters to compute the result.

post = Post.first
post.comment_voters

But if you try to do it for a list of post, it will execute n+1 queries… What could you do?

Preload with has_many :through

In this particular case you can try to re-shape your associations to solve it with a has_many :through.

You can add a has_many :voters to Comment and a has_many :comment_voters to Post.

Like this:

class Post < ApplicationRecord
  belongs_to :author, class_name: "User"
  has_many :comments
  has_many :comment_voters, -> { distinct }, through: :comments, source: :voters
end

class Comment < ApplicationRecord
  belongs_to :post
  belongs_to :user
  has_many :votes, class_name: "CommentVote"
  has_many :voters, through: :votes
end

class CommentVote < ApplicationRecord
  belongs_to :voter, class_name: "User"
  belongs_to :comment
end

class User < ApplicationRecord
end

This will correctly compute the data in just one query

post = Post.first
post.comment_voters

And, you will also be able to preload the comment_voters just as any other association.

Post.preload(:comment_voters).all.each do |post|
  puts post.comment_voters.to_a
end

But…

If you see the executed queries, it could be not as performant as you need…

irb(main):055:0> Post.preload(:comment_voters).to_a.count

Post Load (0.7ms)  SELECT "posts".* FROM "posts"

Comment Load (3.5ms)  SELECT "comments".* FROM "comments" WHERE
"comments"."post_id" IN (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?, ?)  [["post_id", 1], ["post_id", 2], ["post_id", 3], ["post_id", 4],
["post_id", 5], ["post_id", 6], ["post_id", 7], ["post_id", 8], ["post_id", 9],
["post_id", 10], ["post_id", 11], ["post_id", 12], ["post_id", 13], ["post_id",
14], ["post_id", 15], ["post_id", 16]...

CommentVote Load (18.3ms)  SELECT "comment_votes".* FROM "comment_votes" WHERE
"comment_votes"."comment_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54,
55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94,
95, 96, 97, 98, 99, 100...

User Load (0.2ms)  SELECT DISTINCT "users".* FROM "users" WHERE "users"."id" IN
(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)  [["id", 8], ["id", 5], ["id", 1], ["id", 6],
["id", 4], ["id", 7], ["id", 3], ["id", 10], ["id", 2], ["id", 9]]

Maybe you can try to eager_load, but maybe it could not be what you need either…

irb(main):059:0> posts = Post.eager_load(:comment_voters).to_a

SQL (47.0ms)  SELECT "posts"."id" AS t0_r0, "posts"."author_id" AS t0_r1,
"posts"."title" AS t0_r2, "posts"."body" AS t0_r3, "posts"."created_at" AS
t0_r4, "posts"."updated_at" AS t0_r5, "users"."id" AS t1_r0, "users"."name" AS
t1_r1, "users"."created_at" AS t1_r2, "users"."updated_at" AS t1_r3 FROM
"posts" LEFT OUTER JOIN "comments" ON "comments"."post_id" = "posts"."id" LEFT
OUTER JOIN "comment_votes" ON "comment_votes"."comment_id" = "comments"."id"
LEFT OUTER JOIN "users" ON "users"."id" = "comment_votes"."voter_id"

Trying a custom query

Maybe what you want is to write a custom query, because you don’t want to fetch data that you don’t need.

Let’s imagine that what you need is to fetch just the users with voted comments on a collection of posts. Without preloading anything else, like this:

posts = Post.limit(30)
comment_voters = User
  .select("users.*, comments.post_id") # selects the user attributes and the post_id of the comment.
  .joins(comment_votes: [:comment])
  .distinct
  .where(comments: { post_id: posts })
  .to_a
User Load (16.5ms)  SELECT DISTINCT users.*, comments.post_id FROM "users"
INNER JOIN "comment_votes" ON "comment_votes"."voter_id" = "users"."id" INNER
JOIN "comments" ON "comments"."id" = "comment_votes"."comment_id" WHERE
"comments"."post_id" IN (SELECT "posts"."id" FROM "posts")

You can then group this voters by post_id, and then in the view you can pick the voters for each post.

posts = Post.limit(30)
comment_voters = User
  .select("users.*, comments.post_id") # selects the user attributes and the post_id of the comment.
  .joins(comment_votes: [:comment])
  .where(comments: { post_id: posts })
  .distinct
  .group_by(&:post_id)

posts.each do |post|
  comment_voters[post.id].map(&:name)
end

That’s ok, but…

What if you don’t want to expose that code in your controller or view?… What if instead we move that to a model object?… Where should you put that code?

A preload object

There are many options, but one of them, is to create a new object to help you with two things:

For this example you can call the object Post::CommentVotersPreload and use it like this:

posts = Post.limit(30)
comment_voters = Post::CommentVotersPreload.new(posts)

posts.each do |post|
  comment_voters.for_post(post).map(&:name)
end

And the implementation could look something like this:

class Post::CommentVotersPreload
  def initialize(posts)
    @posts = posts
  end

  def for_post(post)
    comment_voters[post.id]
  end

  private

  def comment_voters
    @comment_voters ||= fetch_records.group_by(&:post_id)
  end

  def fetch_records
    User.select("users.*, comments.post_id")
      .joins(comment_votes: [:comment])
      .where(comments: { post_id: @posts })
      .distinct
  end
end

You can use this kind of objects in many different problems.

What you will need is:

Other examples

If you need more examples to understand the idea, you can check this other examples:

When should you use it?

The answer to this question is not that simple (at least for me).

What I could recommend you is to first try to solve the problem in a “railsy” way… Try to use the mechanisms, provided by rails. Normally those mechanisms will compose better with other parts of the framework.

If you don’t know how to do it, because you don’t know enough about rails, try to study the rails guides, specially the associations guide, and the query interface guide. You can also study my guide for preloading associations.

But if you already know what rails provides, and you can’t find a way to do what you need to do… or you need something more efficient… or you just think that the final code is better…

Then go for it and use a “preload object”, just try to have a reason.

Related articles

No more… “Why active record is ignoring my includes?”

Get for free the first part of the ebook Fix n+1 queries on Rails that will help you:

  • Explain what is an n+1 queries problem
  • Identify when ActiveRecord will execute a query
  • Solve the latest comment example
  • Detect n+1 queries by watching the logs
  • Learn the tools to detect n+1 queries
Get the first part of the ebook for free