Have you ever needed to get the most recent N posts for each user in rails, but didn’t know how to do it without using map?
Or maybe something similar like:
Sometimes could be ok to just fetch all elements and filter with ruby, but sometimes it is not possible. Also it can cause n+1 queries if your are not careful.
Here I want to show you how you can solve this problem using window functions.
Imagine you have a User
and a Post
record and a User
has many Posts
, like this:
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
And you want to fetch a list of users
with their last 3 posts
, but for performance reasons you don’t want to fetch all posts
for all users
in the list, you just want to ask the database for the 3 posts
per user
that you want to use.
First we need to order the posts
and produce a rank for each user.
SELECT posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
FROM posts
And then use that query as a subquery to now select just the posts
with rank less than or equals to 3.
SELECT * FROM (
SELECT posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
FROM posts
) AS ranked_posts
WHERE posts_rank <= 3
This will return a list with just the last three posts
for each user
.
Now we need to pass this query to rails in a way that it could build Post
objects from this information.
This is how I made it work.
class Post < ActiveRecord::Base
scope :last_n_per_user, -> {
select_sql = <<-SQL
posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
SQL
ranked_posts = select(select_sql)
from(ranked_posts, "posts").where("posts_rank <= 3")
}
end
From the sql we need the part that selects the posts
with the rank, to then use it in the rails select
method.
Then we use from
to use the ranked_posts
relation as subquery and then use where
to define the post_rank
that we want to filter.
The post_rank
is what defines the “n”. The number of posts per user that we want to fetch.
At the end the sql it builds is very similar to our original sql:
SELECT posts.*
FROM (
SELECT posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
FROM posts
) posts
WHERE posts_rank <= 3
If you want to make the “n” variable, you can pass it as a parameter to the where
method:
class Post < ActiveRecord::Base
scope :last_n_per_user, ->(n) {
select_sql = <<-SQL
posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
SQL
ranked_posts = select(select_sql)
from(ranked_posts, "posts").where("posts_rank <= ?", n)
}
end
Now that you have build the scope, you can build a has_many association passing that scope, like this:
class User < ActiveRecord::Base
has_many :posts
has_many :last_posts, -> { last_n_per_user(3) },
class_name: "Post"
end
And now you will be able to preload just the last_posts like this:
users = User.preload(:last_posts).limit(5)
users.each do |user|
puts user.last_posts.map(&:id).inspect
end
Here is the full code that you can use to run with the Active Record Playground Runner.
schema do
create_table :users do |t|
t.string :name
end
create_table :posts do |t|
t.integer :user_id
t.string :title
end
add_index :posts, :user_id
end
seeds do
users = create_list(User, count: 10) do
{ name: FFaker::Name.name }
end
create_list_for_each_record(Post, records: users, count: 100) do |user|
{ user_id: user.id, title: FFaker::CheesyLingo.title }
end
end
models do
class User < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :user
end
end
example "SQL query" do
sql = <<-SQL
SELECT * FROM (
SELECT posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
FROM posts
) AS ranked_posts
WHERE posts_rank <= 3
SQL
result = ActiveRecord::Base.connection.execute(sql)
puts result.to_a.inspect
end
example "With a fixed n" do
class Post < ActiveRecord::Base
scope :last_n_per_user, -> {
select_sql = <<-SQL
posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
SQL
ranked_posts = select(select_sql)
from(ranked_posts, "posts").where("posts_rank <= 3")
}
end
posts = Post.last_n_per_user.preload(:user)
pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end
example "With a variable n" do
class Post < ActiveRecord::Base
scope :last_n_per_user, ->(n) {
select_sql = <<-SQL
posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
SQL
ranked_posts = select(select_sql)
from(ranked_posts, "posts").where("posts_rank <= ?", n)
}
end
posts = Post.last_n_per_user(3).preload(:user)
pp posts.group_by(&:user).map { |user, posts| [user.name, posts.map(&:id)] }
end
example "In a has many association" do
class User < ActiveRecord::Base
has_many :posts
has_many :last_posts, -> { last_n_per_user(3) }, class_name: "Post"
end
class Post < ActiveRecord::Base
scope :last_n_per_user, ->(n) {
select_sql = <<-SQL
posts.*, dense_rank() OVER (
PARTITION BY posts.user_id
ORDER BY posts.id DESC
) AS posts_rank
SQL
ranked_posts = select(select_sql)
from(ranked_posts, "posts").where("posts_rank <= ?", n)
}
end
users = User.preload(:last_posts).limit(5)
pp users.map { |user| [user.name, user.last_posts.map(&:id)] }
end
You can find the code also on: github.com/bhserna/last_n_per_user_window_functions
Learn just enough fundamentals to be fluent preloading associations with ActiveRecord, and start helping your team to avoid n+1 queries on production.