CF9 ORM: Using the Secondary Cache to optimize for expensive columns

Saturday, July 17, 2010

This is an experiment. It most likely speaks to my poor SQL skills and willingness to jump for a quicker code solution than to figure out how to write faster queries. So if you can tell me a faster way to do this in SQL, I’ll listen.

Still, this is mostly about using a combination of simple property mappings and the secondary cache for an interesting, perhaps unusual solution to a problem.

This is my story

I have a table. Its name is “N_Organizations”. It’s prefixed with “N_” because this is a blog post and I am experimenting. Like many organizations, an Org can be a parent of another Org: “Paint Shop” is a child of “Operations” is a child of “The Penthouse Suite”. In an untold number of places, in both the web application and in reports that are not web-driven, the “tree” for any given child organization has a simple requirement: Show the names of the parents as a single string:

"The Penthouse Suite –> Operations –> Paint Shop”

Ideally, The creation of this string would happen in one place, and then any place that needed it would not have to do the requisite looping to construct the string itself. If different formatting were needed, it’d be better to have the reformatter be a simple string reformatter than a parent org climber: simply replace the arrows with colons, for example.

So, when thinking of a typical query, you’d get something like this:

id | Name                  | ParentID | ParentName
1    The Penthouse Suite     null       
2    Operations              1          The Penthouse Suite
3    Paint Shop              2          The Penthouse Suite --> Operations

And so on.

To fetch this stuff, I created a view that uses a Common Table Expression (CTE) to recursively loop over the parents in order to derive the ParentName. It does some other foolishness with “level”, but ignore that. CTEs are kind of like recursing but without functions:

CREATE view [dbo].[v_organizations]   as

WITH cte_tree AS
		(
			SELECT id, organizationname, OrgNode.GetLevel() as Level, CAST('' AS varchar(500)) AS ParentName
			FROM dbo.n_organizations
			WHERE organizationtypecode = 'D'

			UNION all

			SELECT parent.id, parent.organizationname, OrgNode.GetLevel() as Level,
			CAST(
				cte_tree.parentName +
				case when cte_tree.parentname = '' then '' else ': ' end
				+  cte_tree.organizationname as varchar(500)
			)
			FROM dbo.n_organizations parent
			JOIN cte_tree ON parent.parentorganizationid = cte_tree.id
		)
		SELECT
			o.id, o.OrganizationName, OrganizationTypeCode,
				ParentOrganizationID, CreatedOn, ModifiedOn, 
				Level,
				case when Level <=3 then o.id else o.ParentOrganizationID end as GroupLevel,
			cte_tree.ParentName
		FROM cte_tree
		JOIN dbo.n_organizations o ON cte_tree.id = o.id

GO

There’s a big problem with this: it’s slow. The tables are as indexed as they can be, but you can’t index views with CTEs. And what’s worse, the query is slow regardless of what kind of criteria you add to it. 

Select * from v_Organizations runs about 125 milliseconds. Select * from v_Organizations where id = 215 runs in about 90 ms. The reason is that it still has to do all that recursion. Thus, this is probably a terrible way to solve this problem.

And yet, we continue….

This is my Component

Organization.cfc

component persistent="true" accessors="true" table="N_Organizations" {
	property name="id" fieldtype="id" generator="native";
	property name="organizationName" type="string";
	property name="parentName" column="parentName" readonly="true" type="string"  table="v_organizations" joincolumn="id" cacheuse="read-only";
}

Notice how the parentName property points to “v_organizations” as the table? Pretty cool, huh?  So now when I load this component, when I access the parentName property, it’ll pull it from v_organizations and all will be well. Hibernate will generate a query that simply joins table N_Organizations on v_Organizations where id = id. Yeah.

Except, that’s slow, too. Remember, as I said, querying v_organizations adds about 100ms no matter how many records I’m pulling back. Notice that “cacheuse=’read-only” attribute? Yeah… that doesn’t work. No matter how many times I ask for the parentName, it’s going to run that same query.

What If?

What if I just cached the Organization.cfc in Secondary cache? Wouldn’t that rock? No… most of the data in this component will change too frequently, so I can’t cache the entire component (I’m leaving off a bunch of other properties, obviously).

But… what if I could cache just that parentName property? That’s the slow one, after all. And that never changes. Or, it changes like 2x a year, and in that case, we just clear the cache and off we go.

How could you possibly achieve such awesomeness?

You need two components: 1) Your original Organization.cfc and 2) a new ParentName.cfc

Organization.cfc:

component persistent="true" accessors="true" table="N_Organizations"{
	property name="id" fieldtype="id" generator="native";
	property name="organizationName" type="string";
	property name="parentName" readonly="true" type="string" fieldtype="one-to-one" cfc="ParentName" constrained="true" cacheuse="read-only" ;
	
	//make it easy to get the parentName property
	function getParentName(){
		try { return parentName.getParentName(); }
        	catch(Any e) { return ""; }
	}

}

ParentName.cfc

component accessors="true" persistent="true" readonly="true" table="v_Organizations" cacheuse="read-only" cachename="ParentName"{
	property name="id" fieldtype="id";
	property name="parentName" type="string" ;
}

So what does this do, exactly?

First, what we’ve done is simply rolled that parentName field from v_organizations into a 4-line component. Notice that the “table” attribute on the component is v_organizations. Notice it specifies cacheuse=”read-only”. Notice that read-only is true.

Second, in Organization.cfc, we replaced the original parentName property with a new one that is a simple one-to-one relationship with this new ParentName property

Third,we add a function, getParentName(),  so that we could simply call organization.getParentName() instead of organization.getParentName().getParentName(). It’s wrapped in try/catch because this will fail for the very top-level organization which has no parent organization.

The result: The first time getParentName() is called for any organization, the view will run (slowly, as usual). Subsequently, CF will fetch that property from the cache until that cache is cleared or the server restarts or whatever. So you do take an initial one-time hit of 90ms or so for each organization, but after that, you’re gold.

But wait, there’s more!

I mentioned this experiment to my boss, and his first reaction was, Could you just do a one-time load of all those ParentName organization instances, right on app start, so that you’d never see that 90-ms / per object penalty?

So I tried it, and this is where it gets even better. None of this is unsurprising, by the way. It all makes complete sense.

Simple math:

Doing it the ‘pay-as-you-go’ way: You have 2500 records. You hit the view one time for each record. 2500 * 90ms = 225000 ms = 225 seconds = almost 4 minutes total to load each of those records were you to do them one at a time over the lifespan of the application between server restarts.

However, if you load them all at once, remember, SQL-wise, it’s only taking about 125ms to load. So if you created all 2500 instances of ParentName.cfc, you’ve got the 125ms for the SQL and whatever time it takes to create those 2500 component instances. In my case, it’s about a second, total. To achieve such a performance improvement, you run this when your app starts:

 

EntityLoad(“ParentName”);

 

Since ParentName is a cached component, it’ll get stuck in whatever cache you have configured in Application.cfc, and then all subsequent calls to organization.parentName() will take however long it would take to fetch a simple string from the cache. In my experience this is anywhere from 1/3 to 1 ms doing unscientific averaging.

If you need to clear the cache, simply use:

ormEvictEntity("ParentName");

Conclusion

1) I was amazed at how easy it was to work with Hibernate’s second-level cache with ColdFusion ORM. It just worked.

2) Sometimes, when you’re not Joe Celko and you don’t want to spend time time figuring out how to write a better view, you say “you know what, I think I’ll see if there are fun and interesting bits in CF ORM that could solve this problem for me”. And then you go play, and you come up with some nifty things.

3) I have no idea whether I’ll ever use this, but perhaps you’ll find a use for such a solution and, when you do, thank the CF team for doing such a fine job of integrating Hibernate’s L2 cache into CF.

Good luck!

 

--Marc

How to set up a new Eclipse environment

Sunday, July 11, 2010

Have you ever wanted to start using a new version of Eclipse but couldn’t justify it because of all the time you’ve spent getting your preferences and project settings just so? And you don’t want to type in all those Update site URLs again to get all your plugins? While there is currently no one-click easy way to completely migrate an existing Eclipse install to a new one, I do have some timesavers for you.

Always start with a new Workspace

I’ve seen too many problems with trying to reuse an existing workspace in hopes of saving some time moving from an old version of Eclipse to a new one. Don’t do it. When you fire up the new Eclipse, and it asks you for the workspace, give it a new location. I keep a directory named “Eclipse Workspaces”, and in it lay the corpses of a dozen or more old workspaces. I name my workspaces with this general pattern:

[Product]_[MajorVersionName]_[RCNumber]_[Purpose]

For example, for my work projects, I use ColdFusion Builder on Galileo. My workspace is Bolt_Galileo_Work. When I was fiddling with some of the Helios (Eclipse 3.6) RCs, I was using Bolt_Helios_RC2_Work.

For Eclipse Plugin development I use CFEclipse_Helios_PluginDevelopment.

I always reserve the right to break that pattern.

Getting all your plugins

I’ve seen some people suggest copying them from the old install, and perhaps that works just fine. Here’s how I do it: I Export the update sites I want to install into my new Eclipse, and then Import them in the new one. Then, I install the plugins using the tried-and-true software installer. Here’s how:

In your old Eclipse

  1. Help – Install New Software
  2. Click the “Available Software Sites” link
  3. On the Available Software Sites screen, select the plugins that you want to install into your new Eclipse
  4. Click the “Export” button, and go through the dialog of saving the XML file someplace sensible, like your Desktop

That screen looks like this:

updatesites

In your new Eclipse

  1. Following the same steps as above, get to the Available Software Sites screen
  2. Click the “Import” button
  3. Navigate to that XML file you saved previously and follow all steps
  4. Now you have those update sites available to you. From there, get back to the Install software screen, select “All Software Sites” from the dropdown, and let Eclipse find and install all those plugins you know and love

 

Getting some of your Preferences

You can export some of your preferences from the old install to the new one, though you do have to beware that you’re going to import some preferences you don’t actually want. More on that in a bit.

In your old Eclipse

  1. File – Export
  2. Select “Preferences”
  3. Follow the rest of the screens to save your prefs as an .epf file

In your new Eclipse

  1. File – Import
  2. Select “Preferences”
  3. Follow the rest of the screens

As I said, it’s going to copy some preferences that you don’t want. In particular for me, I always get bitten by the “ANT Runtime” preferences. In that case, I let it do its thing, and then I go back into Window – Preferences and point the ANT runtime to the one that comes bundled with the new Eclipse. Even with that little annoyance, it’s worth it to export/import preferences because it brings over so much.

 

Getting all your projects

Perhaps the most time-consuming part of moving to a new Eclipse installation is getting all your projects back in there. Fortunately, this can also be the least time-consuming. Here’s what to do in your New Eclipse installation:

In CFEclipse

  1. File – Import
  2. Select “Existing Projects into Workspace”. Click Next
  3. In the “Select Root Directory” field, type in the directory where you keep most of your projects. Hit Enter
  4. Once it finds all the projects in that directory, choose the ones you want, and hit “Finish”

In ColdFusion Builder

Follow the same steps as above, but at Step #2, choose “ColdFusion – Import existing Projects”

NOTE: This will not copy All of the existing properties for the projects, such as the MXUnit cfcpath and remote facade URL. For ColdFusion Builder, it will carry over the server settings since they are stored in a file that lives inside that project and not outside of it like most project properties in Eclipse

 

Some Preferences I’m always sure to confirm

Once I’m finished with the above, I have a few preferences I always double-check. In Window – Preferences:

  • General – Show Heap
  • Quick Diff – Pristine SVN Copy
  • Local History – at least 180 days
  • Key Bindings – Alt-A for ANT view, Alt-S for CFEclipse Snip Tree View, and Alt-Y for Mylyn Task List view
  • Appearance – Colors and Fonts – Basic Text --  Consolas 12pt
  • CFEclipse / ColdFusion Builder Snippets path (I point it to a dropbox location)

 

If you have any quick tips for making migrating from one Eclipse installation to another less time-consuming, please share them.

MXUnit 2.0 Released

Friday, July 9, 2010

Once again, Los Hombres at MXUnit.org are delivering a new batch of the crack cocaine of unit testing for ColdFusion--MXUnit 2.0.  Yes, it's true.  Once tasted, a life of uncontrollable and desperate automated testing will surely ensue. This particular batch has some especially addictive features:
  1. New Open Source License: MIT. Changing from GPL to MIT effectively allows you to integrate MXUnit into any product with little restriction (other than honoring the No Whining clause).
  2. New built-in mocking framework: Simply call the new mock() method to create a new mocked object and inject that into your component under test.
  3. New integrated HTML view: This replaces the existing HTML and ExtJS view and is backward compatible.
  4. New beforeTests() and afterTests() for component-level set up and tear down.
  5. New and improved documentation and bug tracking at http://wiki.mxunit.org and http://jira.mxunit.org/secure/IssueNavigator.jspa?reset=true&mode=hide&pid=10000
  6. Ant task improvement: now you can send arbitrary URL parameters to your custom runners.
  7. New Ant build.xml examples and Hudson-friendly examples.
  8. More Eclipse snippets!
  9. More Dataprovider improvements.
  10. Eclipse plugin data compare tool.
  11. Multiple Github fork(s) (http://github.com/virtix/mxunit).
  12. Office Hours every other Monday at 12:00pm US Eastern Time. Join in via Adobe Connect and bring any testing or other technical questions or topics: http://bit.ly/MXUnitOfficeHours.
MXUnit couldn't deliver the kind of product it is without the generous help of many talented and dedicated programmers. This particular release saw significant contributions from these chefs:

Tim Farrar
Peter Farrell
Patrick McElhaney
Randy Merrill
Bob Silverberg
John Whish

Cop the latest and greatest at http://mxunit.org/download.cfm and get involved http://groups.google.com/group/mxunit
Test and be Happy!
theguys at mxunit.org