Have you ever needed to get the most recent N posts for each user in rails, but didn’t know how to do it?
Or maybe something similar like:
Sometimes it could be ok to just fetch all elements and filter with ruby, but sometimes it is not possible. Also it can cause n+1 queries if your are not careful.
Here I want to show you how you can solve this problem using a lateral join.
Imagine you have a User
and a Post
record and a User
has many Posts
, like this:
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
And you want to fetch a list of users
with their last 3 posts
, but for performance reasons you don’t want to fetch all posts
for all users
in the list, you just want to ask the database for the 3 posts
per user
that you want to use.
We need to select from users
and build a lateral join to a subquery that will select from posts
but filtering the posts
where the posts.user_id
is the users.id
and then order those posts
to be able to limit the selection to just the 3 posts
that we want to fetch.
SELECT selected_posts.* FROM users JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE
This will return as selected_posts
a list with just the last three posts
for each user
.
Now we need to pass this query to rails in a way that it could build Post
objects from this information.
This is how I made it work.
class Post < ActiveRecord::Base
belongs_to :user
scope :last_n_per_user, -> {
sql = <<-SQL
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE
SQL
selected_posts = User
.select("selected_posts.*")
.joins(sql)
from(selected_posts, "posts")
}
end
We start the query from the User
model and then select the selected_posts
. Then comes the lateral join.
Then we use Post.from
to tell rails to evaluate the sql as a subquery and with it build the Post
objects.
At the end it builds this sql:
SELECT "posts".* FROM (SELECT selected_posts.* FROM "users"
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE) posts
If you want to make the n
variable, you can use the sanitize the sql like this:
class Post < ActiveRecord::Base
belongs_to :user
scope :last_n_per_user, ->(n) {
sql = <<-SQL
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT :limit
) AS selected_posts ON TRUE
SQL
selected_posts = User
.select("selected_posts.*")
.joins(User.sanitize_sql([sql, limit: n]))
from(selected_posts, "posts")
}
end
Now that you have build the scope, you can build a has many association passing that scope, like this:
class User < ActiveRecord::Base
has_many :posts
has_many :last_posts, -> { last_n_per_user(3) },
class_name: "Post"
end
And now you will be able to preload just the last_posts
like this:
users = User.preload(:last_posts).limit(5)
users.each do |user|
puts user.last_posts.map(&:id).inspect
end
Here is the full code that you can use to run with the Active Record Playground Runner.
schema do
create_table :users do |t|
t.string :name
end
create_table :posts do |t|
t.integer :user_id
t.string :title
end
add_index :posts, :user_id
end
seeds do
users = create_list(User, count: 10) do
{ name: FFaker::Name.name }
end
create_list_for_each_record(Post, records: users, count: 100) do |user|
{ user_id: user.id, title: FFaker::CheesyLingo.title }
end
end
models do
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
end
example "SQL query" do
sql = <<-SQL
SELECT selected_posts.* FROM users JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE
SQL
result = ActiveRecord::Base.connection.execute(sql)
puts result.to_a.inspect
end
example "Lateral join with rails with a fixed n" do
class Post < ActiveRecord::Base
scope :last_n_per_user, -> {
sql = <<-SQL
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 3
) AS selected_posts ON TRUE
SQL
selected_posts = User
.select("selected_posts.*")
.joins(sql)
from(selected_posts, "posts")
}
end
posts = Post.last_n_per_user.preload(:user)
pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end
example "Lateral join with rails with a variable n" do
class Post < ActiveRecord::Base
scope :last_n_per_user, ->(n) {
sql = <<-SQL
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT :limit
) AS selected_posts ON TRUE
SQL
selected_posts = User
.select("selected_posts.*")
.joins(User.sanitize_sql([sql, limit: n]))
from(selected_posts, "posts")
}
end
posts = Post.last_n_per_user(3).preload(:user)
pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end
example "In a has many association" do
class User < ActiveRecord::Base
has_many :posts
has_many :last_posts, -> { last_n_per_user(3) }, class_name: "Post"
end
class Post < ActiveRecord::Base
scope :last_n_per_user, ->(n) {
sql = <<-SQL
JOIN LATERAL (
SELECT * FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT :limit
) AS selected_posts ON TRUE
SQL
selected_posts = User
.select("selected_posts.*")
.joins(User.sanitize_sql([sql, limit: n]))
from(selected_posts, "posts")
}
end
users = User.preload(:last_posts).limit(5)
pp users.map { |user| [user.name, user.last_posts.map(&:id)] }
end
You can find the code also on: github.com/bhserna/last_n_per_user_lateral_join
Making things work isn't enough for you any more? Now you need to consider performance and scalability?
... But you normally have troubles fixing n+1 queries and trying to find why active record is ignoring your "includes"?
Are you are worried because you feel unqualified to tackle tasks with complex data models?
Sign up to learn how to fix n+1 queries on Rails