Nick Kallen's blog
Ten Things I Hate about Proxy Objects, Part I
Has Many Through Has Many Through Has Many Through ...
In which the author relates several things he hates about Rails' association Proxies, along with workarounds to fix them. Part I of VIII
From time to time, you want to join through more than one join table. Consider the following example:
class Essay < ActiveRecord::Base
  has_many :chapters
  has_many :pages, :through => :chapters
  has_many :paragraphs, :through => :pages
  has_many :words, :through => :paragraphs
end
This kind of scenario has arisen in many of the Rails Apps that I've worked on. In the current version of ActiveRecord, has_many :through cannot go through another has_many :through. There are various ways to work around this, like mapping through the associations, but most workarounds are either inefficient or are difficult to extend with pagination and such.
In theory, it's easy to make ActiveRecord support this; we need merely to walk down a has_many :through chain, joining tables as we go. This took me a long time to implement though, as I had to decipher the opaque Reflection object model, and that's where all the magic happens.
What is a Reflection?
Every particular association declaration, like has_many :chapters, etc., is represented as an Object by means of the Reflection class. From an instance of a Reflection, you can access all of the details of the has_many declaration as well as information about the database schema that ActiveRecord was able to infer. Reflections look different for each kind of association, and (unsurprisingly) the has_many :through Reflection is the scariest. Consider this example:
has_many :words, :through => :paragraphs
The Reflection representing this declaration is composed of two sub-reflections. The first, called the :source, represents the Word class. From it we know all about the words table and its foreign keys. The second, called the :through, represents the Paragraph class. Now let's dive into some code.
Mucking through ActiveRecord
So our goal is to join a bunch of tables together. For convenience ActiveRecord joins using the INNER JOIN ... ON ... form rather than the more traditional FROM ... WHERE ... form. The method that currently does the work is:
class ActiveRecord::Associations::HasManyThroughAssociation
def construct_joins(custom_joins = nil)
This method handles a number of complex cases dealing with the directionality of the foreign key relation and polymorphic relations. In the simplest case, the source code looks like this (I've added comments indicating the values of the perplexing expressions):
reflection_primary_key = @reflection.source_reflection.primary_key_name # paragraph_id
source_primary_key = @reflection.klass.primary_key # id
"INNER JOIN %s ON %s.%s = %s.%s" % [
  @reflection.through_reflection.table_name, # paragraphs
  @reflection.table_name, reflection_primary_key, # words, paragraph_id
  @reflection.through_reflection.table_name, source_primary_key, # paragraphs, id
]
The current algorithm is hard-coded to deal with exactly one join. Note the use of @reflection! Imagine taking the same source code, but parameterizing it to deal with an arbitrary reflection. Let's call this new function construct_one_join:
def construct_one_join(reflection)
  reflection_primary_key = reflection.source_reflection.primary_key_name
  ...
As you can see, all we really need to do is remove the @ characters and we're good to go. Next we need to walk the line, iterating down the through chain until the end. Let's overwrite the old function to do this:
def construct_joins(custom_joins = nil)
  reflection = @reflection
  joins = []
  while reflection.through_reflection
    joins << construct_one_join(reflection)
    reflection = reflection.through_reflection
  end
  "#{joins.join(' ')} #{custom_joins}"
end
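To see the walk in isolation, here is a minimal plain-Ruby sketch that does not touch ActiveRecord at all. FakeReflection and its fields are hypothetical stand-ins for the real Reflection objects, and the foreign key names are assumptions based on the usual Rails naming conventions; only the loop mirrors the code above.

```ruby
# Hypothetical stand-in for a Reflection: just enough data to build one JOIN.
#   table_name:         the table this association reads from (e.g. "words")
#   foreign_key:        column on table_name that points at the through table
#   through_reflection: the next link in the chain, or nil at the end
FakeReflection = Struct.new(:table_name, :foreign_key, :through_reflection)

# Build the JOIN between a reflection's table and its through table.
def construct_one_join(reflection)
  through = reflection.through_reflection
  "INNER JOIN %s ON %s.%s = %s.id" % [
    through.table_name,
    reflection.table_name, reflection.foreign_key,
    through.table_name
  ]
end

# Walk down the through chain, collecting one JOIN per link.
def construct_joins(reflection)
  joins = []
  while reflection.through_reflection
    joins << construct_one_join(reflection)
    reflection = reflection.through_reflection
  end
  joins.join(' ')
end

chapters   = FakeReflection.new("chapters",   "essay_id",     nil)
pages      = FakeReflection.new("pages",      "chapter_id",   chapters)
paragraphs = FakeReflection.new("paragraphs", "page_id",      pages)
words      = FakeReflection.new("words",      "paragraph_id", paragraphs)

puts construct_joins(words)
# INNER JOIN paragraphs ON words.paragraph_id = paragraphs.id INNER JOIN pages ON paragraphs.page_id = pages.id INNER JOIN chapters ON pages.chapter_id = chapters.id
```

Note that the chain stops at chapters without joining to essays; tying the query to one particular essay is a separate step.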
There's still a little more work to do. Ultimately all these joins need to terminate at some particular record's primary key. That is, when we say:
my_essay.words
the generated query should contain a clause like:
AND essay_id = '#{my_essay.id}'
The old code to effect this is:
def construct_conditions
  table_name = @reflection.through_reflection.table_name
  conditions = construct_quoted_owner_attributes(@reflection.through_reflection).map do |attr, value|
    "#{table_name}.#{attr} = #{value}" # paragraphs.essay_id = #{my_essay.id}
  end
  conditions << sql_conditions if sql_conditions
  "(" + conditions.join(') AND (') + ")"
end
As you can see from the comment in the above code, this is incorrect since paragraphs doesn't have an essay_id column; rather, chapters does. We really want to say chapters.essay_id = #{my_essay.id}
If we had a function that could get us chapters (i.e., the last through reflection):
def last_through_reflection
  reflection = @reflection
  while reflection.through_reflection
    reflection = reflection.through_reflection
  end
  reflection
end
Then we could replace construct_conditions with the following:
def construct_conditions
  table_name = last_through_reflection.table_name
  conditions = construct_quoted_owner_attributes(last_through_reflection).map do |attr, value|
    "#{table_name}.#{attr} = #{value}"
  end
  conditions << sql_conditions if sql_conditions
  "(" + conditions.join(') AND (') + ")"
end
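The same idea can be tried out without Rails. In this rough sketch, ChainLink is a hypothetical stand-in for a Reflection, and the owner's id is passed in directly instead of going through construct_quoted_owner_attributes, so treat it as an illustration of where the condition lands rather than as the patched method itself.

```ruby
# Hypothetical stand-in for a Reflection (not the real ActiveRecord class).
ChainLink = Struct.new(:table_name, :foreign_key, :through_reflection)

# Follow through_reflection until the chain runs out.
def last_through_reflection(reflection)
  reflection = reflection.through_reflection while reflection.through_reflection
  reflection
end

# Anchor the query on the owner's id, using the LAST table in the chain.
# Integer() is an assumption of this sketch: it rejects non-numeric ids
# instead of quoting them the way ActiveRecord would.
def construct_conditions(reflection, owner_id, sql_conditions = nil)
  last = last_through_reflection(reflection)
  conditions = ["#{last.table_name}.#{last.foreign_key} = #{Integer(owner_id)}"]
  conditions << sql_conditions if sql_conditions
  "(" + conditions.join(') AND (') + ")"
end

chapters   = ChainLink.new("chapters",   "essay_id",     nil)
pages      = ChainLink.new("pages",      "chapter_id",   chapters)
paragraphs = ChainLink.new("paragraphs", "page_id",      pages)
words      = ChainLink.new("words",      "paragraph_id", paragraphs)

puts construct_conditions(words, 7)
# (chapters.essay_id = 7)
puts construct_conditions(words, 7, "words.text != ''")
# (chapters.essay_id = 7) AND (words.text != '')
```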
That's almost it. There's just a little more work to handle the :conditions on queries in the chain. See the attached source code for this final detail.
Whew. The idea of the algorithm is almost trivial: walk down a through chain, joining tables as we go. The implementation is complex only because of the impenetrable interface to Reflection. But it took very few modifications to the Rails source to make this happen... And voila! Now we can has_many :through a has_many :through.
I'll conclude this article with an exercise for the reader (I'd do it myself but I'm a bit lazy). The clumsy iteration patterns I've used (while reflection = reflection.through_reflection) would look much nicer if we implemented Enumerable on Reflection. Then, without any lack of clarity we can rewrite #construct_joins using #inject; similarly last_through_reflection becomes a trivial call to #last. Anyone up for it?
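For anyone tempted by the exercise, here is one possible shape for it in plain Ruby. EnumerableReflection is a hypothetical stand-in, not a patch to the real Reflection class, and the foreign key names are assumed. One wrinkle: core Enumerable does not provide #last, so this sketch goes through #to_a.

```ruby
# Hypothetical stand-in for a Reflection that can enumerate its own chain.
class EnumerableReflection
  include Enumerable

  attr_reader :table_name, :foreign_key, :through_reflection

  def initialize(table_name, foreign_key, through_reflection = nil)
    @table_name = table_name
    @foreign_key = foreign_key
    @through_reflection = through_reflection
  end

  # Yield this reflection, then each through reflection down the chain.
  def each
    return enum_for(:each) unless block_given?
    reflection = self
    while reflection
      yield reflection
      reflection = reflection.through_reflection
    end
  end
end

# construct_joins as a fold over adjacent pairs in the chain.
def joins_via_inject(reflection)
  reflection.each_cons(2).inject([]) do |joins, (from, to)|
    joins << "INNER JOIN #{to.table_name} ON " \
             "#{from.table_name}.#{from.foreign_key} = #{to.table_name}.id"
  end.join(' ')
end

# Core Enumerable has no #last, so materialize the chain first.
def last_through(reflection)
  reflection.to_a.last
end

chapters   = EnumerableReflection.new("chapters",   "essay_id")
pages      = EnumerableReflection.new("pages",      "chapter_id", chapters)
paragraphs = EnumerableReflection.new("paragraphs", "page_id",    pages)

puts joins_via_inject(paragraphs)
# INNER JOIN pages ON paragraphs.page_id = pages.id INNER JOIN chapters ON pages.chapter_id = chapters.id
puts last_through(paragraphs).table_name
# chapters
```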
Download all of the source code: http://www.pivotalblabs.com/files/associations_on_steroids2.rb
How I Learned to Stop Hating and Love Action Mailer
My biggest gripe with ActionMailer is how difficult it is to generate URL's. It's common enough when sending an email that it includes a link. But ActionMailer, by default, gives you no access to url_for and named routes. Ugh!
Even if you're clever enough to do something like:
class ActionMailer::Base
  include ActionController::UrlWriter
end
You're still screwed, as you need to know the host, port, protocol, etc. to generate links. These data can be set globally, but by far the easiest and most flexible way to get them is from the request object.
Passing around the host, port, etc.
The request object is (of course) only available to Controllers. So the data need to be passed from the Controller to the mailer, like:
class UsersController < ApplicationController
  def create
    ...
    if @user.save
      MyMailer.deliver_foo(..., request.host, ...)
    end
  end
end
Of course, if you've read my previous post you know I HATE polluting my Controllers with business logic like this. I strongly prefer pushing this "triggered action" into the model:
class User
  after_create :send_email
end
The cost of this is that I now need to pass the host, etc. down into my model so it can pass it on to the mailer! The Java Programmers over here just laugh at me for not having a real Dependency Injection framework, which they say would solve this handily. Some stupid Java framework solving this problem better than Rails?! This makes me MAD AS HELL!
Enter the Global Variable
Screw Dependency Injection. I'm going to use a Global Variable like every other God Fearing Rails programmer. His Excellency DHH said let there be cattr_accessor and it was Good.
Step one, set the around-filter in your ApplicationController:
around_filter :inhibit_retardase
Step two,
THERE IS NO STEP TWO
You don't need to pass around host. You can generate URL's in ActionMailer no problem. Let's look at how this is done:
module UrlWriterRetardaseInhibitor
  module ActionController
    def self.included(ac)
      ac.send(:include, InstanceMethods)
    end
    module InstanceMethods
      def inhibit_retardase
        begin
          request = self.request
          ::ActionController::UrlWriter.module_eval do
            @old_default_url_options = default_url_options.clone
            default_url_options[:host] = request.host
            default_url_options[:port] = request.port unless request.port == 80
            protocol = /(.*):\/\//.match(request.protocol)[1] if request.protocol.ends_with?("://")
            default_url_options[:protocol] = protocol
          end
          yield
        ensure
          ::ActionController::UrlWriter.module_eval do
            default_url_options[:host] = @old_default_url_options[:host]
            default_url_options[:port] = @old_default_url_options[:port]
            default_url_options[:protocol] = @old_default_url_options[:protocol]
          end
        end
      end
    end
  end
  module ActionMailer
    def self.included(am)
      am.send(:include, ::ActionController::UrlWriter)
      ::ActionController::UrlWriter.module_eval do
        default_url_options[:host] = 'localhost'
        default_url_options[:port] = 3000
        default_url_options[:protocol] = 'http'
      end
    end
  end
end
ActionController::Base.send(:include, UrlWriterRetardaseInhibitor::ActionController)
ActionMailer::Base.send(:include, UrlWriterRetardaseInhibitor::ActionMailer)
(from url_writer_retardase_inhibitor.rb)
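Stripped of the Rails plumbing, the trick is just "save the global, override it, yield, restore it in an ensure". Here is a small plain-Ruby sketch of that pattern; UrlDefaults, with_overrides and link_to_user are names made up for illustration and are not part of Rails.

```ruby
# Hypothetical process-wide defaults, standing in for default_url_options.
module UrlDefaults
  @options = { :host => 'localhost', :port => 3000, :protocol => 'http' }

  class << self
    attr_reader :options

    # Temporarily override the shared defaults for the duration of the block,
    # restoring the previous values even if the block raises.
    def with_overrides(overrides)
      saved = @options.dup
      @options.merge!(overrides)
      yield
    ensure
      @options.replace(saved)
    end
  end
end

# Anything called inside the block sees the overridden defaults,
# without having the host passed to it explicitly.
def link_to_user(id)
  o = UrlDefaults.options
  "#{o[:protocol]}://#{o[:host]}:#{o[:port]}/users/#{id}"
end

puts UrlDefaults.with_overrides(:host => 'example.com', :port => 8080) { link_to_user(1) }
# http://example.com:8080/users/1
puts link_to_user(1)
# http://localhost:3000/users/1
```

The sketch shares one hash across the whole process, so it assumes only one request is being handled at a time.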
"Oh no!," you exclaim, "Class Variables!! Get behind me, Satan!".
Well, get over it. How do you think with_scope works? How do you think you can call a Finder like User.find or User.new wherever you feel like? It's called Global Variables, man. Embrace it. I call this liberation theology.
Advanced Proxy Usage, Part I
One of the more underutilized features of ActiveRecord is the Association Proxy. But Proxies are also one of the most powerful weapons in the ActiveRecord armory, and Rails apps that take advantage of them are better organized and easier to maintain.
What is a Proxy?
When in an ActiveRecord you declare an Association:
class Hand < ActiveRecord::Base
  has_many :fingers
end
Instances of Hand now have a fingers method. Contrary to appearances, and contrary to the LIE told to you by hand.fingers.class, the fingers method does not return an Array of Fingers. Rather it returns a Proxy object, one that smells and tastes like an Array of fingers but actually has a rich creamy behavior all its own.
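To make the "LIE" concrete, here is a toy proxy in plain Ruby. It is only a sketch of the general trick (forward everything you don't define to the real collection); CollectionProxySketch is a made-up name, and this is not a copy of how Rails builds its own Proxies.

```ruby
# A toy proxy: it defines a couple of methods of its own and forwards
# everything else, including #class, to the collection it wraps.
class CollectionProxySketch < BasicObject
  def initialize(owner, &loader)
    @owner  = owner
    @loader = loader
  end

  # The object the collection belongs to.
  def proxy_owner
    @owner
  end

  # The real collection, loaded lazily on first use.
  def proxy_target
    @target ||= @loader.call
  end

  # BasicObject defines almost nothing, so nearly every call lands here.
  def method_missing(name, *args, &block)
    proxy_target.__send__(name, *args, &block)
  end
end

fingers = CollectionProxySketch.new(:left_hand) { ["thumb", "index", "middle"] }

puts fingers.class        # Array  (the proxy is lying)
puts fingers.size         # 3
puts fingers.proxy_owner  # left_hand  (behavior all its own)
```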
Scoped Access
The most basic use of Proxies is to "scope" the reading and writing of your ActiveRecords. For example, if you have a controller that allows CRUD on a User's Assets, you can read and write to the collection of Assets as in the following examples:
@asset = current_user.assets.find(params[:id])
@asset = current_user.assets.create(params[:asset])
@asset = current_user.assets.build(params[:asset]) # equivalent to 'new'-ing an object rather than 'create'-ing it.
There are other ways of doing this, of course:
@asset = Asset.create({:user => current_user}.merge(params[:asset]))
But the Proxy Code is much better: not only is the Proxy code terse, but it meaningfully expresses the relationship between objects in your domain: Users have many assets; this Asset is created in the context of this User.
Special Queries (or Custom Finders)
The various Association declarations--has_many, belongs_to, etc.--allow you to express much more than a simple Foreign Key relation. We can richly express in the Proxy Declarations concepts like 'Assets that belong to a User' and 'Assets that don't belong to a User':
current_user.my_assets
current_user.other_assets
simply by declaring:
class User
  has_many :my_assets, :class_name => 'Asset', :conditions => 'user_id = #{id}'
  has_many :other_assets, :class_name => 'Asset', :conditions => 'user_id != #{id}'
end
Notice, in this last example, something peculiar: the use of single quotes ('') with variable substitution (#{...}). This is not a typo: double quotes would perform variable interpolation when the has_many declaration is invoked, which happens in the class context, i.e., before there is any instance. Rails always calls eval with a Binding of self when a call to one of the Proxy methods is performed, ensuring that this all comes together.
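The deferred interpolation is easy to reproduce in plain Ruby. The sketch below uses instance_eval on a hypothetical InterpolatingUser class; it illustrates the idea of evaluating the template against an instance, and is not the exact code Rails runs.

```ruby
class InterpolatingUser
  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Evaluate a single-quoted template as if it were a double-quoted string
  # written inside this instance, so that #{id} resolves against self.
  # This sketch assumes a trusted template containing no double quotes.
  def interpolate(template)
    instance_eval('"' + template + '"')
  end
end

# Single quotes: nothing is interpolated when this line runs.
CONDITIONS = 'user_id = #{id}'

puts CONDITIONS                                     # user_id = #{id}
puts InterpolatingUser.new(42).interpolate(CONDITIONS)  # user_id = 42
puts InterpolatingUser.new(7).interpolate(CONDITIONS)   # user_id = 7
```

The same template yields a different condition for each instance, which is the behavior the single quotes buy you.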
Let's consider an alternative to this approach: declaring finders as instance methods.
class User
  def other_assets
    Asset.find(:all, :conditions => ["user_id != ?", id])
  end
end
What's wrong with this approach? Well, if you want to use this query in anything non-trivial--such as selecting the first ten of a User's Assets--you have to write fancy code:
def other_assets(options = {})
  Asset.find(:all, {:conditions => ...}.merge(options))
end
But good luck using this strategy to do pagination. You need to define my_assets and my_assets_count, too--have fun keeping your code DRY. With a proxy, we can just do something like:
current_user.my_assets.count
current_user.my_assets.sum(:price)
current_user.my_assets.average(:price)
In fact, all the richness of ActiveRecord class methods (and any other class methods of the Target type) is available to you here. Want to find all of a User's assets that are in State pending?
current_user.my_assets.find_by_state(State[:pending])
Another example:
class Asset
  def self.find_portrait_assets
    find(:all, :conditions => 'height > width')
  end
end
Then,
current_user.my_assets.find_portrait_assets
returns only those portrait assets owned by a user.
Proxy Options
Proxy declarations accept a number of interesting parameters. There are even "lifecycle" callbacks, like after_add and before_remove, just as a normal ActiveRecord has before_create and so forth. You can hook into these by using an option.
class User
  has_many :assets, :after_add => [:send_email] do
  end
  def send_email(r)
  end
end
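To show what such a hook amounts to, here is a plain-Ruby sketch of a collection that calls back into its owner after each addition. CallbackCollection and MailingUser are hypothetical names, and the "email" is just a string appended to an array; real association callbacks are wired up by has_many itself.

```ruby
# A toy collection that notifies its owner after every addition.
class CallbackCollection
  def initialize(owner, after_add = [])
    @owner = owner
    @after_add = after_add
    @records = []
  end

  # Add the record, then invoke each named callback on the owner.
  def <<(record)
    @records << record
    @after_add.each { |name| @owner.send(name, record) }
    self
  end

  def to_a
    @records.dup
  end
end

class MailingUser
  attr_reader :sent

  def initialize
    @sent = []
  end

  # The relationship, not the Asset, decides that adding triggers an email.
  def assets
    @assets ||= CallbackCollection.new(self, [:send_email])
  end

  def send_email(asset)
    @sent << "New asset: #{asset}"
  end
end

user = MailingUser.new
user.assets << "logo.png" << "photo.jpg"
puts user.sent.inspect
# ["New asset: logo.png", "New asset: photo.jpg"]
```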
This after_add could be defined in the Asset class. But suppose Assets had a Polymorphic association. Both Users and Articles have many Assets. Our Business Rule is only to send email when a User adds an Asset, not an Article. We could write:
class Asset
  def after_create
    case owner
    when User
      # send email
    when Article
      # do nothing
    end
  end
end
But this is clumsy! When we have logic to express about the relationship between things, the Proxy is the right place for it. Anywhere else is just smearing logic throughout your code.
Proxy Extensions
Consider the following example:
has_many :assets do
  def to_s
    self.join(',')
  end
end
You can actually extend your Proxy Objects with an Anonymous module! When you have logic that applies to a Collection of ActiveRecords, your has_many Proxy is probably the proper place for it. For example:
class Table
  has_many :cells do
    def to_matrix
      # convert from list to matrix form.
    end
  end
end
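The anonymous-module mechanism can be imitated in a few lines of plain Ruby. In this sketch, collection_with_extension is a made-up helper that plays the role of has_many with a block, and an Array of numbers stands in for the cells; the two-column matrix layout is an arbitrary choice for the example.

```ruby
# Hypothetical helper: returns a copy of the records, extended with whatever
# methods the block defines. Module.new evaluates the block as a module body,
# so a `def` inside it becomes a method of the anonymous module.
def collection_with_extension(records, &extension)
  collection = records.dup
  collection.extend(Module.new(&extension)) if extension
  collection
end

cells = collection_with_extension([1, 2, 3, 4]) do
  # Convert from list to matrix form, two cells per row.
  def to_matrix
    each_slice(2).to_a
  end
end

puts cells.to_matrix.inspect                # [[1, 2], [3, 4]]
puts [1, 2, 3, 4].respond_to?(:to_matrix)   # false
```

Only this one collection gains to_matrix; other arrays are untouched, which is what keeps the extension local to the association.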
Another example of this technique is the following. Suppose an Asset has many Versions, such as small, medium, etc. We'd prefer a shorter way of finding the proper version of an Asset than saying asset.versions.find_by_name('thumbnail'); we'd like to just say asset.versions[:thumbnail]. Just define the brackets ([]) operator on the Proxy:
class Asset
  has_many :versions, :class_name => 'Asset', :foreign_key => :parent_id do
    def [](version_name)
      find_by_name(version_name)
    end
  end
end
Suppose we want to go one step further. If a particular version doesn't exist, it shall be created on-the-fly:
def [](version_name)
  if version = find_by_name(version_name)
    version
  else
    # create a new version here.
  end
end
Advanced Extensions
In some cases, we want to write generic Extensions--these should work regardless of the particular classes involved. In the context of a Proxy there are three methods you should be aware of: proxy_owner, proxy_target, and proxy_reflection.
Suppose we want to implement something like the build method, but one that doesn't have the side effect of adding it to the owner in memory:
has_many :foo do
  def new(options = {})
    proxy_reflection.klass.new({proxy_reflection.primary_key_name => proxy_owner.id}.merge(options))
  end
end
Extensions are so useful--it just requires a little imagination--that I'm going to give one more example, this one apropos of Access Control:
class User
  has_many :draft_articles do
    def readable_by?(user)
      user == proxy_owner
    end
  end
end
Some Miscellany
The has_one and belongs_to Proxies behave a bit oddly: here, cyclops.build_eye is used rather than the more obvious cyclops.eye.build.
In general, has_one and belongs_to will shadow methods on the Target. Don't name any database columns target or owner, for instance. This is one of the biggest complaints against the current implementation of Proxies!
Both build and create will work even if the Proxy Owner is new. For example, u = User.new; u.assets.build; u.save. In this example, both objects will be saved with the Foreign Key set correctly.
Both build and create can take an array of attributes hashes. For example: u.assets.build([{...}, {...}]). This will build two assets at once. (This is quite nice where in a Controller you have a form that allows the upload of multiple Assets at once. The Controller code looks identical (in simple cases) regardless of whether the form allows a single or multiple upload!)
That's the basic idea. In part II of this Article (to be released in the coming weeks), I will discuss 'static' Proxy methods and I will release version 0.1 of a new plugin that builds upon a lot of exciting work in this area. In the meantime, check this out.







