Ruby on Rails

Do not use map or pluck to pre-select records


I recently learned about an optimization for Rails queries that’s quite subtle. I was implementing Rubocop and got warnings from this new check they added.

At first I didn’t really understand what the problem was, take this example:

Post.where(user_id: User.active.pluck(:id))

This would translate into these queries similar to these:

SELECT "user"."id" FROM "users" WHERE "users"."active" = $1 [["active", true]]

SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN ($1, $2, $3, $4) LIMIT $5 [["user_id", 1], ["user_id", 2], ["user_id", 3], ["user_id", 4], ["LIMIT", 10]]

Turns out there are a couple of potential problems with this:

  • It always requires two queries (one to grab the ids and one to grab the posts).
  • When you have a lot of users the second query gets very large, as every id is passed in as a separate argument.
  • It doesn’t use ActiveRecord’s power to construct the best query.

The better solution would be to just pass in an ActiveRecord::Relation object, which gets turned into a subquery automatically.

Post.where(user: User.active)

Turns into:

SELECT "posts".* FROM "posts" WHERE "posts"."user_id" IN (SELECT "user"."id" FROM "users" WHERE "users"."active" = $1) [["active", true]]

I wasn’t aware of this simple optimization done by Active Record, and happy to have learned about it from running Rubocop.