Benito Serna Tips and tools for Ruby on Rails developers

Fetching the top n per group with a lateral join with rails

Have you ever needed to get the most recent N posts for each user in rails, but didn’t know how to do it?

Or maybe something similar like:

Sometimes it could be ok to just fetch all elements and filter with ruby, but sometimes it is not possible. Also it can cause n+1 queries if your are not careful.

Here I want to show you how you can solve this problem using a lateral join.

An example problem

Imagine you have a User and a Post record and a User has many Posts, like this:

class User < ActiveRecord::Base
  has_many :posts
end

class Post < ActiveRecord::Base
  belongs_to :user
end

And you want to fetch a list of users with their last 3 posts, but for performance reasons you don’t want to fetch all posts for all users in the list, you just want to ask the database for the 3 posts per user that you want to use.

Building the SQL query

We need to select from users and build a lateral join to a subquery that will select from posts but filtering the posts where the posts.user_id is the users.id and then order those posts to be able to limit the selection to just the 3 posts that we want to fetch.

SELECT selected_posts.* FROM users JOIN LATERAL (
  SELECT * FROM posts
  WHERE user_id = users.id
  ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE

This will return as selected_posts a list with just the last three posts for each user.

Lateral join with rails with a fixed n

Now we need to pass this query to rails in a way that it could build Post objects from this information.

This is how I made it work.

class Post < ActiveRecord::Base
  belongs_to :user

  scope :last_n_per_user, -> {
    sql = <<-SQL
      JOIN LATERAL (
        SELECT * FROM posts
        WHERE user_id = users.id
        ORDER BY id DESC LIMIT 3
      ) AS selected_posts ON TRUE
    SQL

    selected_posts = User
      .select("selected_posts.*")
      .joins(sql)

    from(selected_posts, "posts")
  }
end

We start the query from the User model and then select the selected_posts. Then comes the lateral join.

Then we use Post.from to tell rails to evaluate the sql as a subquery and with it build the Post objects.

At the end it builds this sql:

SELECT "posts".* FROM (SELECT selected_posts.* FROM "users"
JOIN LATERAL (
  SELECT * FROM posts
  WHERE user_id = users.id
  ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE) posts

Lateral join with rails with a variable n

If you want to make the n variable, you can use the sanitize the sql like this:

class Post < ActiveRecord::Base
  belongs_to :user

  scope :last_n_per_user, ->(n) {
    sql = <<-SQL
      JOIN LATERAL (
        SELECT * FROM posts
        WHERE user_id = users.id
        ORDER BY id DESC LIMIT :limit
      ) AS selected_posts ON TRUE
    SQL

    selected_posts = User
      .select("selected_posts.*")
      .joins(User.sanitize_sql([sql, limit: n]))

    from(selected_posts, "posts")
  }
end

Put it in a has_many association

Now that you have build the scope, you can build a has many association passing that scope, like this:

class User < ActiveRecord::Base
  has_many :posts
  has_many :last_posts, -> { last_n_per_user(3) },
    class_name: "Post"
end

And now you will be able to preload just the last_posts like this:

users = User.preload(:last_posts).limit(5)

users.each do |user|
  puts user.last_posts.map(&:id).inspect
end

Code to run the example

Here is the full code that you can use to run with the Active Record Playground Runner.

schema do
  create_table :users do |t|
    t.string :name
  end

  create_table :posts do |t|
    t.integer :user_id
    t.string :title
  end

  add_index :posts, :user_id
end

seeds do
  users = create_list(User, count: 10) do
    { name: FFaker::Name.name }
  end

  create_list_for_each_record(Post, records: users, count: 100) do |user|
    { user_id: user.id, title: FFaker::CheesyLingo.title }
  end
end

models do
  class User < ActiveRecord::Base
    has_many :posts
  end

  class Post < ActiveRecord::Base
    belongs_to :user
  end
end

example "SQL query" do
  sql = <<-SQL
    SELECT selected_posts.* FROM users JOIN LATERAL (
      SELECT * FROM posts
      WHERE user_id = users.id
      ORDER BY id DESC LIMIT 3
    ) AS selected_posts ON TRUE
  SQL

  result = ActiveRecord::Base.connection.execute(sql)
  puts result.to_a.inspect
end

example "Lateral join with rails with a fixed n" do
  class Post < ActiveRecord::Base
    scope :last_n_per_user, -> {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT 3
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(sql)

      from(selected_posts, "posts")
    }
  end

  posts = Post.last_n_per_user.preload(:user)
  pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end

example "Lateral join with rails with a variable n" do
  class Post < ActiveRecord::Base
    scope :last_n_per_user, ->(n) {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT :limit
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(User.sanitize_sql([sql, limit: n]))

      from(selected_posts, "posts")
    }
  end

  posts = Post.last_n_per_user(3).preload(:user)
  pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end

example "In a has many association" do
  class User < ActiveRecord::Base
    has_many :posts
    has_many :last_posts, -> { last_n_per_user(3) }, class_name: "Post"
  end

  class Post < ActiveRecord::Base
    scope :last_n_per_user, ->(n) {
      sql = <<-SQL
        JOIN LATERAL (
          SELECT * FROM posts
          WHERE user_id = users.id
          ORDER BY id DESC LIMIT :limit
        ) AS selected_posts ON TRUE
      SQL

      selected_posts = User
        .select("selected_posts.*")
        .joins(User.sanitize_sql([sql, limit: n]))

      from(selected_posts, "posts")
    }
  end

  users = User.preload(:last_posts).limit(5)
  pp users.map { |user| [user.name, user.last_posts.map(&:id)] }
end

You can find the code also on: github.com/bhserna/last_n_per_user_lateral_join

Related articles

Do you feel bad for asking for help to fix an n+1 queries problem?

Making things work isn't enough for you any more? Now you need to consider performance and scalability?

... But you normally have troubles fixing n+1 queries and trying to find why active record is ignoring your "includes"?

Are you are worried because you feel unqualified to tackle tasks with complex data models?

Sign up to learn how to fix n+1 queries on Rails