Jeremy Smith on November 12, 2021

Full Text Search in Postgres with View-Backed Models in Rails

I often work on projects where, in the near-term, experimentation and rapid iteration are more important than stability and performance. This is often the case for small organizations, startups, prototypes, and even side projects.

When time is of the essence, it can be difficult to know what corners to cut and what choices will come back to haunt you. Over the years, I’ve come to believe that surfacing and clearly expressing your application’s domain concepts is essential and not worth sacrificing. Here’s an example…

A few months back, I was working on a greenfield Rails project where I needed to allow for faceted, full text search of user profiles.

The searchable data was spread across multiple models (e.g. name on User, skills and location on Profile, bio on ActionText::RichText). And I wanted to be able to filter searches based on various attributes (e.g. status, roles, etc.) across those models.

I was using PostgreSQL as my main data store and I wanted to avoid introducing a new service, like Elasticsearch. Adding additional services increases development and operational complexity and creates lock-in that is difficult to break free of later.

I decided to take advantage of PostgreSQL’s built-in full text search, using the PgSearch gem. PgSearch supports two different approaches to search:

The first technique is multi-search, in which records of many different Active Record classes can be mixed together into one global search index across your entire application. Most sites that want to support a generic search page will want to use this feature.

The other technique is search scopes, which allow you to do more advanced searching against only one Active Record class. This is more useful for building things like autocompleters or filtering a list of items in a faceted search.

While I did need to search across several Active Record classes, I didn’t want a search feature that would return separate results for User records, Profile records, and so on. I only wanted to search and return user profiles. But to do that, I needed data available on other records through the profile associations. So I chose the second technique, search scopes.

Now, at this point, I could have followed the documentation and inserted the search functionality directly into my Profile model. But my deliberation over which search technique to use was hinting at an important underlying domain issue: the Profile Active Record model was different than the thing I wanted to be able to query and return results for. This other thing wanted to be it’s own domain concept. It didn’t matter whether or not the data was persisted in the same database (or even tables or columns), the concerns and considerations were distinct enough to warrant a separate concept.

This led me to implement a view-backed ProfileSearch model, using the Scenic gem for view management. Here’s what the Scenic documentation has to say:

Using view-backed models can help promote concepts hidden in your relational data to first-class domain objects and can clean up complex ActiveRecord or ARel queries. As far as ActiveRecord is concerned, a view is no different than a table.

First, I generated the Scenic view below and ran rails db:migrate:

SELECT
  profiles.id AS id,
  profiles.user_id AS user_id,
  profiles.status AS status,
  profiles.description AS description,
  profiles.skills AS skills,
  profiles.industries AS industries,
  profiles.location AS location,
  users.first_name AS first_name,
  users.last_name AS last_name,
  action_text_rich_texts.body AS bio
FROM profiles
INNER JOIN users ON profiles.user_id = users.id
LEFT OUTER JOIN action_text_rich_texts ON profiles.id = action_text_rich_texts.record_id AND action_text_rich_texts.record_type = 'Profile' AND action_text_rich_texts.name = 'bio';

Then I created a new Active Record model to use the new view:

class ProfileSearch < ApplicationRecord
  self.primary_key = :id

  include PgSearch::Model
  pg_search_scope(
    :profile_search,
    against: {
      description: "A",
      skills: "A",
      bio: "B",
      industries: "C",
      location: "C",
      first_name: "C",
      last_name: "C"
    },
    using: {
      tsearch: { prefix: true, any_word: true, dictionary: "english" }
    }
  )

  belongs_to :user
  has_and_belongs_to_many :roles, join_table: :profiles_roles, foreign_key: :profile_id

  enum status: { hidden: 0, private_access: 1, public_access: 2 }

  def readonly?
    true
  end
end

The Scenic docs explain why setting self.primary_key is necessary:

ActiveRecord’s find method expects to query based on your model’s primary key, but views do not have primary keys. Additionally, the first and last methods will produce queries that attempt to sort based on the primary key.

You can get around these issues by setting the primary key column on your Rails model.

The ProfileSearch model needs some of the details found on the Profile model. Specifically, the associations and status enum. Some of these could be collected into a concern and included in both. Though the has_and_belongs_to_many must declare the proper join table and foreign key to use, since they can’t be inferred from the model name.

Next, I implemented the search in a controller (using the Pagy for pagination):

class ProfilesController < ApplicationController
  include Pagy::Backend
  skip_before_action :authenticate_user!

  def index
    if params[:q].present?
      scoped = ProfileSearch.profile_search(params[:q])
    else
      scoped = ProfileSearch.all
    end

    if params[:role].present?
      scoped = scoped.where(roles: { id: params[:role].to_i })
                     .left_joins(:roles)
    end

    @pagy, @profiles = pagy(
      scoped.where(status: visible_statuses)
            .includes({ user: { avatar_attachment: :blob } }, :roles)
    )
  end

  private

  def visible_statuses
    user_signed_in? ? [:private_access, :public_access] : [:public_access]
  end
end

And finally, I implemented the view. Here’s a simplified version:

<%= simple_form_for :search, url: profiles_path, method: :get do |f| -%>
  <%= f.input :role, label: false, collection: Role.all, include_blank: "Role", selected: params[:role], input_html: { name: :role, id: :role } %>
  <%= f.input :q, label: false, placeholder: "Keywords", input_html: { name: :q, id: :q, value: params[:q] } %>
  <%= f.button :submit, "Filter" %>
<% end %>

<% if @pagy.count > 0 %>
  <%= render @profiles %>
<% else %>
  <h1>
    Sorry, no results found.
  </h1>
<% end %>

<%== render partial: 'pagy/nav', locals: { pagy: @pagy } if @pagy.pages > 1 %>

Taking this approach allowed me to elevate a domain concept (the ProfileSearch) in the codebase, while also shipping a feature quickly. It also set a path for progressive improvement.

For example, the big downside of this current approach is that the search query is not very performant. In development, searching across 1000 profile records (with associations) took around 80-100ms.

As Leigh Halliday explains in Full Text Search in Milliseconds with Rails and PostgreSQL, “the slowness is in re-parsing and normalizing the document into a tsvector data type every single time the query is executed.” He suggests adding a pre-calculated ts_vector column that is automatically written on each INSERT/UPDATE, adding an index to use the new column, and then configuring pg_search_scope to set the proper tsvector_column.

This approach is the clear next step to improve performance. And the ProfileSearch model doesn’t change much, as it relies on the view’s interface to the underlying data.


Need help building or maintaining a Rails app?

Jeremy is currently booked until mid-2023, but always happy to chat.

Email Jeremy