Why avoid nested STIs | ActiveRecord, Rails 6

Nested single table inheritance does not work well. Here's what you need to know to make it work or to work around it.

Some Context for the Illustration

I recently came across the following scenario.

Initial specifications: a project owner creates a project, and donors can contribute any amount of money to this project.

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
  # ...
end

class User::ProjectOwner < User
  # ...
end

class User::Donor < User
  # ...
end

class Project < ApplicationRecord
  # ...
end

class Contribution < ApplicationRecord
  # ...
end

Later, a small change is made to the specifications: a donor can be either a natural person (a human individual) or a legal entity (a company or any other type of legal entity).

Since both are donors and will share a significant amount of logic, it seems obvious that they are both specializations of User::Donor, so:

class User::Donor::Natural < User::Donor
  # ...
end

class User::Donor::Legal < User::Donor
  # ...
end

So far, it's classic OOP, and we rely on ActiveRecord's STI mechanism to do its magic (.find type inference and so forth).

Spoiler alert: it doesn't work.

STI Doesn't Play Well With Lazy Code Loading

This part is not specific to (nested) STI or to ActiveRecord, but it is useful to know.

Given a database without any records (I am working on a new project):

User.count
# => 0

User.descendants
# => []

That's unexpected. I thought User.descendants would give me an array of all the subclasses of User ([User::ProjectOwner, User::Donor, User::Donor::Natural, User::Donor::Legal]), but I get none of that. Why?

You don't expect a constant to exist if it hasn't been defined, right? Well, unless you load the file that defines it, it won't exist.

Here's basically how it goes:

Me: …start a rails console
Me: User.descendants
Me: #=> []

Me: puts "Did you know: you can clap for this article up to 50 times ;)" if User::Donor < User
Code loader: Oh, this `User::Donor` const does not exist yet, let me infer which file is supposed to define it and try to load it for you.
Code loader: Ok I found it and loaded it, you can proceed
Me: #=> "Did you know: you can clap for this article up to 50 times ;)"

Me: User.descendants
Me: #=> [User::Donor]

Me: puts "Another Brick In The Wall" if User::Pink < User
Code loader: Oh, this `User::Pink` const does not exist yet, let me infer which file is supposed to define it and try to load it for you.
Code loader: Sorry, this `User::Pink` is nowhere to be found, I hope you know how to rescue from NameError.
Me: #=> NameError (uninitialized constant #<Class:0x00007fb42cb92ef8>::Pink)

Now you understand why lazy loading does not play well with Single Table Inheritance: unless you have already referenced the constant name of each of your STI subclasses so that they get loaded, your application will not know about them.
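The dialogue above can be reproduced without Rails. What follows is only a minimal plain-Ruby sketch, not the real code loader: LazyDemo, DEFINITIONS and known_subclasses are names made up for this illustration, const_missing stands in for the autoloader, and the inherited hook stands in for the bookkeeping behind descendants.

```ruby
# Plain-Ruby sketch (no Rails). DEFINITIONS is a hypothetical stand-in for
# the files that a code loader would find on disk.
module LazyDemo
  class User
    DEFINITIONS = {
      Donor: -> { User.const_set(:Donor, Class.new(User)) }
    }.freeze

    @known_subclasses = []

    class << self
      attr_reader :known_subclasses

      # Ruby calls this hook each time a subclass is created.
      def inherited(subclass)
        super
        User.known_subclasses << subclass
      end

      # Ruby calls this hook when User::Something is referenced but undefined.
      def const_missing(name)
        definition = DEFINITIONS[name]
        definition ? definition.call : super
      end
    end
  end
end

before = LazyDemo::User.known_subclasses.dup
donor  = LazyDemo::User::Donor # first reference: const_missing defines the class
after  = LazyDemo::User.known_subclasses.dup

puts before.inspect # []
puts after.inspect  # [LazyDemo::User::Donor]
```

Referencing a name that has no definition, such as LazyDemo::User::Pink, falls through to super and raises a NameError, just like in the dialogue.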

It's not that STI doesn't work at all, it's just a bit frustrating because we often need to list the STI hierarchy and there's no easy, out-of-the-box way to do that.

The Ruby on Rails guide mentions this problem and offers an (incomplete) solution: https://guides.rubyonrails.org/autoloading_and_reloading_constants.html#single-table-inheritance

TL;DR: use a concern that collects all the types stored in the inheritance_column and forcibly preloads them.

Why it's incomplete: a subtype that doesn't have a record yet won't be preloaded, which means there are things you won't be able to do. For example, you can't rely on introspection to generate select options, because types without records won't be listed in your options.

Another solution (really not recommended) would be to pre-load all the classes in your application. It's like killing a fly with a hammer.

My solution is based on the one suggested by the Rails guide, but instead of collecting types from the inheritance_column, I use an array that lists all the STI subclasses. That way, I can use introspection at will. I agree that it's not 100% SOLID-compliant, but it is a compromise that I am prepared to make.

That being said, let's talk about the main topic of this article.

STI + lazy loading + nested models = unpredictable behavior

Single Table Inheritance is designed for one base class and as many subclasses as you want, as long as they all inherit directly from the base class.

Take a look at the following two examples. The former works perfectly well, while the latter will give you headaches.

# Working example

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects
end

class User::Donor < User
  has_many :contributions
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end

# Not working example

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects
end

class User::Donor < User
  has_many :contributions
end

class User::Donor::Natural < User::Donor
end

class User::Donor::Legal < User::Donor
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end

Why does the first one work in a predictable way and not the second? Find out for yourself by paying attention to the SQL queries:

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects
end

class User::Donor < User
  has_many :contributions
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end

# ...open a rails console...

project_owner = User::ProjectOwner.create
# => User::ProjectOwner(id: 1)

project = Project.create(project_owner: project_owner)
# => Project(id: 1, project_owner_id: 1)

donor = User::Donor.create
# => User::Donor(id: 1)

contribution = Contribution.create(donor: donor, project: project, amount: 100)
# => Contribution(id: 1, user_id: 1, project_id: 1, amount: 100)

# ...CLOSE the current rails console...

# ...OPEN a NEW rails console...

Contribution.last.donor
  Contribution Load (0.5ms)  SELECT "contributions".* FROM "contributions" ORDER BY "contributions"."id" DESC LIMIT $1  [["LIMIT", 1]]
  User::Donor Load (0.3ms)  SELECT "users".* FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3  [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
# => User::Donor(id: 1)

Now with a nested STI (base class, middle level subclass, and leaf level subclasses):

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
end

class User::ProjectOwner < User
  has_many :projects
end

class User::Donor < User
  has_many :contributions
end

class User::Donor::Natural < User::Donor
end

class User::Donor::Legal < User::Donor
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User::Donor', foreign_key: 'user_id'
end

# ...open a rails console...

project_owner = User::ProjectOwner.create
# => User::ProjectOwner(id: 1)

project = Project.create(project_owner: project_owner)
# => Project(id: 1, project_owner_id: 1)

donor = User::Donor::Natural.create
# => User::Donor::Natural(id: 1)

contribution = Contribution.create(donor: donor, project: project, amount: 100)
# => Contribution(id: 1, user_id: 1, project_id: 1, amount: 100)

# ...CLOSE the current rails console...

# ...OPEN a NEW rails console...

Contribution.last.donor
  Contribution Load (0.5ms)  SELECT "contributions".* FROM "contributions" ORDER BY "contributions"."id" DESC LIMIT $1  [["LIMIT", 1]]
  User::Donor Load (0.3ms)  SELECT "users".* FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3  [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
# => nil

Do you see? The SQL query that looks up the donor associated with the contribution filters on the type User::Donor. Since my donor is a User::Donor::Natural, the record was not found. ActiveRecord does not know that User::Donor::Natural is a subclass of User::Donor in the context of STI unless I load it first.

irb(main):001:0> User.all.pluck :id
   (0.9ms)  SELECT "users"."id" FROM "users"
=> [2, 1]
irb(main):002:0> User.exists?(1)
  User Exists? (0.3ms)  SELECT 1 AS one FROM "users" WHERE "users"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
=> true
irb(main):003:0> User::Donor.exists?(1)
  User::Donor Exists? (0.7ms)  SELECT 1 AS one FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3  [["type", "User::Donor"], ["id", 1], ["LIMIT", 1]]
=> false
irb(main):004:0> User::Donor::Natural.exists?(1)
  User::Donor::Natural Exists? (1.3ms)  SELECT 1 AS one FROM "users" WHERE "users"."type" = $1 AND "users"."id" = $2 LIMIT $3  [["type", "User::Donor::Natural"], ["id", 1], ["LIMIT", 1]]
=> true
irb(main):005:0> User::Donor.exists?(1)
  User::Donor Exists? (2.1ms)  SELECT 1 AS one FROM "users" WHERE "users"."type" IN ($1, $2) AND "users"."id" = $3 LIMIT $4  [["type", "User::Donor"], ["type", "User::Donor::Natural"], ["id", 1], ["LIMIT", 1]]
=> true

This does not suit me. I prefer not to take the risk of choosing an architecture whose behavior is uncertain because it depends on which code has been preloaded.

ActiveRecord could have been designed to produce the following SQL statement:

SELECT * FROM users WHERE ("users"."type" = 'User::Donor' OR "users"."type" LIKE 'User::Donor::%') AND "users"."id" = 1

This would allow me to:

  • Request User.all and retrieve the records of type: User, User::ProjectOwner, User::Donor, User::Donor::Natural, User::Donor::Legal
  • Request User::Donor.all and retrieve the records of type: User::Donor, User::Donor::Natural, User::Donor::Legal, without code preloading
  • Request User::Donor::Natural.all and retrieve the records of type: User::Donor::Natural
  • Request User::Donor::Legal.all and retrieve the records of type: User::Donor::Legal
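To make the matching rule of that hypothetical LIKE-based condition concrete, here is a small plain-Ruby sketch. It is not ActiveRecord code: type_matches? is a helper invented for this illustration, and it assumes that every subclass is namespaced under its parent, as in my examples.

```ruby
# Hypothetical helper: true when stored_type is class_name itself or is
# nested under it, mirroring `type = 'X' OR type LIKE 'X::%'`.
def type_matches?(stored_type, class_name)
  stored_type == class_name || stored_type.start_with?("#{class_name}::")
end

stored_types = %w[
  User
  User::ProjectOwner
  User::Donor
  User::Donor::Natural
  User::Donor::Legal
]

donor_types = stored_types.select { |type| type_matches?(type, "User::Donor") }

puts donor_types.inspect
# ["User::Donor", "User::Donor::Natural", "User::Donor::Legal"]
```

Note that the rule only looks at names: a subclass of User::Donor that lives outside the User::Donor namespace would not match, which is one more reason to treat this as a thought experiment.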

But it behaves differently:

SELECT * FROM users WHERE "users"."type" = 'User::Donor' AND "users"."id" = 1

It is only once I have preloaded the subclasses of User::Donor that it lets me request User::Donor.all and retrieve the records of type: User::Donor, User::Donor::Natural, User::Donor::Legal.

SELECT * FROM users WHERE "users"."type" IN ($1, $2, $3) AND "users"."id" = 1 [["type", "User::Donor"], ["type", "User::Donor::Natural"], ["type", "User::Donor::Legal"]]

You can blame it on lazy code loading, but I don't. Even if I agree that introspection and lazy code loading cannot work hand in hand as things stand, since we can't get predictable, stable behavior from a mid-level model, it would be best if the ActiveRecord documentation clearly discouraged nested STI.

I'd rather not have a feature than one I can't rely on.

Why does it work well starting from the base class of a regular STI and not from a mid-level STI?

The answer can be found in the ActiveRecord source code.

When building a relation, ActiveRecord adds a type condition if necessary:

# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/core.rb#L306

def relation
  relation = Relation.create(self)

  if finder_needs_type_condition? && !ignore_default_scope?
    relation.where!(type_condition)
    relation.create_with!(inheritance_column.to_s => sti_name)
  else
    relation
  end
end

To determine whether the type condition is needed, it checks the distance between the current class and ActiveRecord::Base, as well as the presence of an inheritance column.

# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/inheritance.rb#L74

# Returns +true+ if this does not need STI type condition. Returns
# +false+ if STI type condition needs to be applied.
def descends_from_active_record?
  if self == Base
    false
  elsif superclass.abstract_class?
    superclass.descends_from_active_record?
  else
    superclass == Base || !columns_hash.include?(inheritance_column)
  end
end

def finder_needs_type_condition? #:nodoc:
  # This is like this because benchmarking justifies the strange :false stuff
  :true == (@finder_needs_type_condition ||= descends_from_active_record? ? :false : :true)
end

The type condition is constructed as follows:

# https://github.com/rails/rails/blob/6bc7c478ba469ad4b033125d6798d48f36d6be3e/activerecord/lib/active_record/inheritance.rb#L262

def type_condition(table = arel_table)
  sti_column = arel_attribute(inheritance_column, table)
  sti_names  = ([self] + descendants).map(&:sti_name)

  predicate_builder.build(sti_column, sti_names)
end

To summarize:

  • When querying from the base class (in my example: User), no type condition is added. Since the query lists all the records in the table, it gives access to every record whose class is User or inherits from it. Perfect.
  • When querying from a leaf subclass, the exact type must match for the record to be found. Logical.
  • When querying from a mid-level subclass such as User::Donor (neither the base class User nor a leaf such as User::Donor::Natural), it depends. As expected, records of type User::Donor are loaded. On the other hand, records whose class inherits from User::Donor are only selected if their class has already been loaded.
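The mid-level case can be simulated in plain Ruby. This is a sketch under assumptions, not ActiveRecord itself: StiSketch, Record, direct_subclasses and type_names are hypothetical names, and type_names only mimics the ([self] + descendants) expression used by type_condition, with class names in place of sti_name.

```ruby
# Plain-Ruby sketch (no Rails). Record is a hypothetical base class that
# only knows about the subclasses that have been defined so far.
module StiSketch
  class Record
    class << self
      # Ruby calls this hook on the parent each time a subclass is created.
      def inherited(subclass)
        super
        direct_subclasses << subclass
      end

      def direct_subclasses
        @direct_subclasses ||= []
      end

      # Every class defined so far that inherits from the receiver.
      def descendants
        direct_subclasses.flat_map { |klass| [klass] + klass.descendants }
      end

      # Same shape as ([self] + descendants).map(&:sti_name).
      def type_names
        ([self] + descendants).map(&:name)
      end
    end
  end

  class User < Record; end
  class User::Donor < User; end
end

before = StiSketch::User::Donor.type_names

# Simulates the moment the code loader finally defines the leaf class.
module StiSketch
  class User::Donor::Natural < User::Donor; end
end

after = StiSketch::User::Donor.type_names

puts before.inspect # ["StiSketch::User::Donor"]
puts after.inspect  # ["StiSketch::User::Donor", "StiSketch::User::Donor::Natural"]
```

The same call returns a different list of types depending on what has been loaded in between, which is exactly what makes the generated WHERE clause unstable.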

Is there a workaround?

There is always one.

We might consider changing ActiveRecord so that it uses LIKE in the SQL query as an alternative condition to the strict string comparison. Problem: I did not run a benchmark, but it would most likely slow down database reads. While this solution would work, it's inefficient, requires a lot of work to patch ActiveRecord, and, frankly, we're not even sure the Rails core team would accept such a patch.

Another solution would be to override the default scope of User::Donor so that it uses a LIKE condition as described above. I'm not a big fan of default scopes, because there always comes a day when you have to use .unscope and, lo and behold, it doesn't work anymore. It is not a sustainable solution.

Another solution could be to preload the subclasses, for example with the solution discussed earlier. I suppose that's an acceptable solution.

Another solution is to go back to a simpler architecture that leaves no room for behavior changes: no mid-level subclasses, no preloading required. How do I avoid repeating myself for the code shared by User::Donor::Natural and User::Donor::Legal, you ask?

Use concerns.

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class User < ApplicationRecord
  scope :donors, -> { where(type: ['User::DonorNatural', 'User::DonorLegal']) }
  scope :project_owners, -> { where(type: 'User::ProjectOwner') }
end

class User::ProjectOwner < User
end

class User::DonorNatural < User
  include User::DonorConcern
end

class User::DonorLegal < User
  include User::DonorConcern
end

module User::DonorConcern
  extend ActiveSupport::Concern

  included do
    has_many :contributions, foreign_key: 'user_id', inverse_of: :donor
  end
end

class Project < ApplicationRecord
  belongs_to :project_owner, class_name: 'User::ProjectOwner', foreign_key: 'user_id'
end

class Contribution < ApplicationRecord
  belongs_to :project
  belongs_to :donor, class_name: 'User', foreign_key: 'user_id', inverse_of: :contributions
end

There is still room for improvement: this code is deliberately oversimplified (no validations, for instance) to keep the article easy to read. My goal is to give you the essential information so that you can make an informed choice of your own favorite solution.

My favorite solutions

When possible, I prefer to have a simpler architecture (no middle layers). The less complex it is, the fewer headaches I have.

When I need to have this middle layer, I preload all the subclasses of my STI to avoid any random behavior. And I mean all the subclasses in my STI, not just the ones that have records in the database.

module UserStiPreloadConcern
  unless Rails.application.config.eager_load
    extend ActiveSupport::Concern

    included do
      cattr_accessor :preloaded, instance_accessor: false
    end

    class_methods do
      def descendants
        preload_sti unless preloaded
        super
      end

      def preload_sti
        user_subclasses = [
          "User::ProjectOwner",
          "User::Donor",
          "User::Donor::Natural",
          "User::Donor::Legal"
        ]

        user_subclasses.each do |type|
          type.constantize
        end

        self.preloaded = true
      end
    end
  end
end

Thanks for reading!