CF9 ORM: Using the Secondary Cache to optimize for expensive columns

Saturday, July 17, 2010

This is an experiment. It most likely speaks to my poor SQL skills and willingness to jump for a quicker code solution than to figure out how to write faster queries. So if you can tell me a faster way to do this in SQL, I’ll listen.

Still, this is mostly about using a combination of simple property mappings and the secondary cache for an interesting, perhaps unusual solution to a problem.

This is my story

I have a table. Its name is “N_Organizations”. It’s prefixed with “N_” because this is a blog post and I am experimenting. Like many organizations, an Org can be a parent of another Org: “Paint Shop” is a child of “Operations” is a child of “The Penthouse Suite”. In an untold number of places, in both the web application and in reports that are not web-driven, the “tree” for any given child organization has a simple requirement: Show the names of the parents as a single string:

"The Penthouse Suite –> Operations –> Paint Shop”

Ideally, The creation of this string would happen in one place, and then any place that needed it would not have to do the requisite looping to construct the string itself. If different formatting were needed, it’d be better to have the reformatter be a simple string reformatter than a parent org climber: simply replace the arrows with colons, for example.

So, when thinking of a typical query, you’d get something like this:

id | Name                  | ParentID | ParentName
1    The Penthouse Suite     null       
2    Operations              1          The Penthouse Suite
3    Paint Shop              2          The Penthouse Suite --> Operations

And so on.

To fetch this stuff, I created a view that uses a Common Table Expression (CTE) to recursively loop over the parents in order to derive the ParentName. It does some other foolishness with “level”, but ignore that. CTEs are kind of like recursing but without functions:

CREATE view [dbo].[v_organizations]   as

WITH cte_tree AS
                        SELECT id, organizationname, OrgNode.GetLevel() as Level, CAST('' AS varchar(500)) AS ParentName
                        FROM dbo.n_organizations
                        WHERE organizationtypecode = 'D'

                        UNION all

                        SELECT, parent.organizationname, OrgNode.GetLevel() as Level,
                                cte_tree.parentName +
                                case when cte_tree.parentname = '' then '' else ': ' end
                                +  cte_tree.organizationname as varchar(500)
                        FROM dbo.n_organizations parent
                        JOIN cte_tree ON parent.parentorganizationid =
              , o.OrganizationName, OrganizationTypeCode,
                                ParentOrganizationID, CreatedOn, ModifiedOn, 
                                case when Level <=3 then else o.ParentOrganizationID end as GroupLevel,
                FROM cte_tree
                JOIN dbo.n_organizations o ON =


There’s a big problem with this: it’s slow. The tables are as indexed as they can be, but you can’t index views with CTEs. And what’s worse, the query is slow regardless of what kind of criteria you add to it. 

Select * from v_Organizations runs about 125 milliseconds. Select * from v_Organizations where id = 215 runs in about 90 ms. The reason is that it still has to do all that recursion. Thus, this is probably a terrible way to solve this problem.

And yet, we continue….

This is my Component


component persistent="true" accessors="true" table="N_Organizations" {
        property name="id" fieldtype="id" generator="native";
        property name="organizationName" type="string";
        property name="parentName" column="parentName" readonly="true" type="string"  table="v_organizations" joincolumn="id" cacheuse="read-only";

Notice how the parentName property points to “v_organizations” as the table? Pretty cool, huh?  So now when I load this component, when I access the parentName property, it’ll pull it from v_organizations and all will be well. Hibernate will generate a query that simply joins table N_Organizations on v_Organizations where id = id. Yeah.

Except, that’s slow, too. Remember, as I said, querying v_organizations adds about 100ms no matter how many records I’m pulling back. Notice that “cacheuse=’read-only” attribute? Yeah… that doesn’t work. No matter how many times I ask for the parentName, it’s going to run that same query.

What If?

What if I just cached the Organization.cfc in Secondary cache? Wouldn’t that rock? No… most of the data in this component will change too frequently, so I can’t cache the entire component (I’m leaving off a bunch of other properties, obviously).

But… what if I could cache just that parentName property? That’s the slow one, after all. And that never changes. Or, it changes like 2x a year, and in that case, we just clear the cache and off we go.

How could you possibly achieve such awesomeness?

You need two components: 1) Your original Organization.cfc and 2) a new ParentName.cfc


component persistent="true" accessors="true" table="N_Organizations"{
        property name="id" fieldtype="id" generator="native";
        property name="organizationName" type="string";
        property name="parentName" readonly="true" type="string" fieldtype="one-to-one" cfc="ParentName" constrained="true" cacheuse="read-only" ;
        //make it easy to get the parentName property
        function getParentName(){
                try { return parentName.getParentName(); }
                catch(Any e) { return ""; }



component accessors="true" persistent="true" readonly="true" table="v_Organizations" cacheuse="read-only" cachename="ParentName"{
        property name="id" fieldtype="id";
        property name="parentName" type="string" ;

So what does this do, exactly?

First, what we’ve done is simply rolled that parentName field from v_organizations into a 4-line component. Notice that the “table” attribute on the component is v_organizations. Notice it specifies cacheuse=”read-only”. Notice that read-only is true.

Second, in Organization.cfc, we replaced the original parentName property with a new one that is a simple one-to-one relationship with this new ParentName property

Third,we add a function, getParentName(),  so that we could simply call organization.getParentName() instead of organization.getParentName().getParentName(). It’s wrapped in try/catch because this will fail for the very top-level organization which has no parent organization.

The result: The first time getParentName() is called for any organization, the view will run (slowly, as usual). Subsequently, CF will fetch that property from the cache until that cache is cleared or the server restarts or whatever. So you do take an initial one-time hit of 90ms or so for each organization, but after that, you’re gold.

But wait, there’s more!

I mentioned this experiment to my boss, and his first reaction was, Could you just do a one-time load of all those ParentName organization instances, right on app start, so that you’d never see that 90-ms / per object penalty?

So I tried it, and this is where it gets even better. None of this is unsurprising, by the way. It all makes complete sense.

Simple math:

Doing it the ‘pay-as-you-go’ way: You have 2500 records. You hit the view one time for each record. 2500 * 90ms = 225000 ms = 225 seconds = almost 4 minutes total to load each of those records were you to do them one at a time over the lifespan of the application between server restarts.

However, if you load them all at once, remember, SQL-wise, it’s only taking about 125ms to load. So if you created all 2500 instances of ParentName.cfc, you’ve got the 125ms for the SQL and whatever time it takes to create those 2500 component instances. In my case, it’s about a second, total. To achieve such a performance improvement, you run this when your app starts:




Since ParentName is a cached component, it’ll get stuck in whatever cache you have configured in Application.cfc, and then all subsequent calls to organization.parentName() will take however long it would take to fetch a simple string from the cache. In my experience this is anywhere from 1/3 to 1 ms doing unscientific averaging.

If you need to clear the cache, simply use:



1) I was amazed at how easy it was to work with Hibernate’s second-level cache with ColdFusion ORM. It just worked.

2) Sometimes, when you’re not Joe Celko and you don’t want to spend time time figuring out how to write a better view, you say “you know what, I think I’ll see if there are fun and interesting bits in CF ORM that could solve this problem for me”. And then you go play, and you come up with some nifty things.

3) I have no idea whether I’ll ever use this, but perhaps you’ll find a use for such a solution and, when you do, thank the CF team for doing such a fine job of integrating Hibernate’s L2 cache into CF.

Good luck!




Sam Farmer said...

I've never really looked into the secondary cache at all but now I will. Great post.

Chris Blackwell said...

A pre-ordered tree is the way to go here.

If the nodes need to appear at multiple positions in the tree you can create a link table with the left/right values to allow this.

Its ideal for data that changes infrequently as managing the left/right values on the fly is a pita, i tend to keep a parentId column as well and rebuild the left/right values as necessary.