Benito Serna Tips and tools for Ruby on Rails developers

Fetching the top n per group with a lateral join with rails

August 16, 2022

Have you ever needed to get the most recent N posts for each user in rails, but didn’t know how to do it?

Or maybe something similar like:

Sometimes it could be ok to just fetch all elements and filter with ruby, but sometimes it is not possible. Also it can cause n+1 queries if your are not careful.

Here I want to show you how you can solve this problem using a lateral join.

An example problem

Imagine you have a User and a Post record and a User has many Posts, like this:

class User < ActiveRecord::Base
  has_many :posts
end

class Post < ActiveRecord::Base
  belongs_to :user
end

And you want to fetch a list of users with their last 3 posts, but for performance reasons you don’t want to fetch all posts for all users in the list, you just want to ask the database for the 3 posts per user that you want to use.

Building the SQL query

We need to select from users and build a lateral join to a subquery that will select from posts but filtering the posts where the posts.user_id is the users.id and then order those posts to be able to limit the selection to just the 3 posts that we want to fetch.

SELECT selected_posts.* FROM users JOIN LATERAL (
  SELECT * FROM posts
  WHERE user_id = users.id
  ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE

This will return as selected_posts a list with just the last three posts for each user.

Lateral join with rails with a fixed n

Now we need to pass this query to rails in a way that it could build Post objects from this information.

This is how I made it work.

class Post < ActiveRecord::Base
  belongs_to :user

  scope :last_n_per_user, -> {
    sql = <<-SQL
      JOIN LATERAL (
        SELECT * FROM posts
        WHERE user_id = users.id
        ORDER BY id DESC LIMIT 3
      ) AS selected_posts ON TRUE
    SQL

    selected_posts = User
      .select("selected_posts.*")
      .joins(sql)

    from(selected_posts, "posts")
  }
end

We start the query from the User model and then select the selected_posts. Then comes the lateral join.

Then we use Post.from to tell rails to evaluate the sql as a subquery and with it build the Post objects.

At the end it builds this sql:

SELECT "posts".* FROM (SELECT selected_posts.* FROM "users"
JOIN LATERAL (
  SELECT * FROM posts
  WHERE user_id = users.id
  ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE) posts

Lateral join with rails with a variable n

If you want to make the n variable, you can use the sanitize the sql like this:

class Post < ActiveRecord::Base
  belongs_to :user

  scope :last_n_per_user, ->(n) {
    sql = <<-SQL
      JOIN LATERAL (
        SELECT * FROM posts
        WHERE user_id = users.id
        ORDER BY id DESC LIMIT :limit
      ) AS selected_posts ON TRUE
    SQL

    selected_posts = User
      .select("selected_posts.*")
      .joins(User.sanitize_sql([sql, limit: n]))

    from(selected_posts, "posts")
  }
end

Put it in a has_many association

Now that you have build the scope, you can build a has many association passing that scope, like this:

class User < ActiveRecord::Base
  has_many :posts
  has_many :last_posts, -> { last_n_per_user(3) },
    class_name: "Post"
end

And now you will be able to preload just the last_posts like this:

users = User.preload(:last_posts).limit(5)

users.each do |user|
  puts user.last_posts.map(&:id).inspect
end

Code to run the example

Here is the full code that you can use to run with the Active Record Playground Runner.

schema do
  create_table :users do |t|
    t.string :name
  end

  create_table :posts do |t|
    t.integer :user_id
    t.string :title
  end

  add_index :posts, :user_id
end

seeds do
  users = create_list(User, count: 10) do
    { name: FFaker::Name.name }
  end

  create_list_for_each_record(Post, records: users, count: 100) do |user|
    { user_id: user.id, title: FFaker::CheesyLingo.title }
  end
end

models do
  class User < ActiveRecord::Base
    has_many :posts
  end

  class Post < ActiveRecord::Base
    belongs_to :user
  end
end

example "SQL query" do
  sql = <<-SQL
    SELECT selected_posts.* FROM users JOIN LATERAL (
      SELECT * FROM posts
      WHERE user_id = users.id
      ORDER BY id DESC LIMIT 3
    ) AS selected_posts ON TRUE
  SQL

  result = ActiveRecord::Base.connection.execute(sql)
  puts result.to_a.inspect
end

example "Lateral join with rails with a fixed n" do
  class Post < ActiveRecord::Base
    scope :last_n_per_user, -> {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT 3
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(sql)

      from(selected_posts, "posts")
    }
  end

  posts = Post.last_n_per_user.preload(:user)
  pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end

example "Lateral join with rails with a variable n" do
  class Post < ActiveRecord::Base
    scope :last_n_per_user, ->(n) {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT :limit
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(User.sanitize_sql([sql, limit: n]))

      from(selected_posts, "posts")
    }
  end

  posts = Post.last_n_per_user(3).preload(:user)
  pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end

example "In a has many association" do
  class User < ActiveRecord::Base
    has_many :posts
    has_many :last_posts, -> { last_n_per_user(3) }, class_name: "Post"
  end

  class Post < ActiveRecord::Base
    scope :last_n_per_user, ->(n) {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT :limit
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(User.sanitize_sql([sql, limit: n]))

      from(selected_posts, "posts")
    }
  end

  users = User.preload(:last_posts).limit(5)
  pp users.map { |user| [user.name, user.last_posts.map(&:id)] }
end

You can find the code also on: github.com/bhserna/last_n_per_user_lateral_join

Related articles

Free ebook "N+1 queries on Rails basics"

Just enough fundamentals to be fluent preloading associations with ActiveRecord, and start helping your team to avoid n+1 queries on production.

Read more