After reading this very smart piece, it became apparent that its application points to and addresses a bigger issue: complex conditional logic. When I look at code that has complex conditional logic I get a queasy feeling in the pit of my stomach, thinking about how that logic could be broken and about the complexity of the associated tests. For example, consider this example code:
function createLogin(username, password){
if( len(username) lt 8 or
len(username) gt 12 or
refind('[:punct:]', username) or
refind('[:space:]', username) or
refind('[:cntrl:]', username) ){
_throw('FormatException','Invalid Username format','Username needs to follow these rules: http://www...');
}
//Also need to do something similar for the password. Omitted for simplicity.
//After checks have passed, store credentials. Intentionally omitted.
}
You can see that this implements username requirements. The username needs to be between 8 and 12 characters, and contain only alpha-numeric characters. Yes, there are more elegant regular expressions, but this is intended to be easy to read ;-) There are some issues with this. Functionally, it may fit the specification just fine, but if something fails, how do we know which of the 5 conditions failed and what caused it to fail? Also, a developer writing a test for this code, and naturally wanting that test to cover as much as possible, would need to consider all the possible combinations. Essentially, all of the above conditions need to be false in order for the statement to be true Take a look at the following truth table which illustrates all the possible combinations needed to completely test this expression. P or Q or X or Y or Z all represent a T/F value returned by each of the conditional tests in our expression respectively:
Truth table built using Truth Table Constructor What the truth table says, and what our validation also says, is that
all expressions must be
false in order for the if statement (the expression) to be
true. If
any of the conditions are
true, the statement will be
false and throw the
FormatException. A test just for the
username rule might look like this. This assumes that the
createLogin method returns a boolean value. Another issue, is that all the subsequent logic after the checks will be invoked, too, which may generate unwanted side effects.
function testCreateLogin(){
var user = createObject('component','user');
var password = 'Pa$w00rD';
var badUsernames = 'few,MoreThan12characters,hasPunc$%, ,has space,#chr(10)#,user#chr(13)#Name,user$name';
var i = 1;
var tempUser = '';
for (i; i lte listLen(badUsernames); i=i+1){
tempUser = listGetAt(badUsernames,i);
try{
//Should fail
actual = user.createLogin(tempUser, password);
if(actual) {
fail('#tempUser# passed and should not.');
}
}
catch(usernameexception e){
debug(tempUser);
}
}
}
Note that the above
does not test all rows of the truth table; what's missing? Hint: consider row #2 of the truth table, the username should be 8-12 charcters, contain no spaces or punctuation AND contain a control character; e.g., user#char(13)#Name. That one is covered. Eek! Again we are testing the conditional logic and possibly dealing with side effects of the subsequent code. We could extract everything into a separate method, which is better. But that still leaves us with a long conditional statement, it just exists in another object. What about something like this:
function createLogin(username, password){
validator
.checkForLength(username, 'username')
.checkForSpaces(username, 'username')
.checkForPunctuation(username, 'username')
.checkForControls(username, 'username')
.validate();
//Also need to do something similar for the password. Omitted for simplicity.
//After checks have passed, create login logic. Intentionally omitted.
}
I suspect one of several responses to this: 1. What's that dot stuff? 2. Cool. Show me the code, or, maybe, you're like, 'yeah, yeah ... like, show me something new, dude'. All are welcome! :-) For those in the 1st category. That's method chaining. Check out Martin Fowler's piece on the topic. It could just as easily been written in one line:
validator.checkForLength(username, 'username').checkForSpaces(username,'username').checkForPunctuation(username,'username').checkForControls(username, 'username').validate();
Or without chaining as:
validator.checkForLength(username, 'username')
validator.checkForSpaces(username, 'username')
validator.checkForPunctuation(username, 'username')
validator.checkForControls(username, 'username')
validator.validate();
But it reads much better broken down, IMO. This has been coined "Fluency" and is useful for situations such as this; specifically,
expression building. I would not recommend implementing chaining just because you can; make sure it provides some benefit. How it's done: Essentially, each validation method in the Validator object returns an instance of itself; i.e.,
this. This allows for the method chaining. Pretty simple, really.
The Validator This is a rather simple component, as all components should be. Maybe look at it as a paradigm and write your own. You can use it out of the box to perform simple validations or tests. It has one validation method,
assert(expression, source), where expression is any boolean expression and source is a name you'd like to associate with the expression. In practice, this would likely be the name of an argument or parameter or it could be used to validate a post-condition prior to a method's return statement. The only other Validator methods you need to worry about are
collect() and
validate(). When you exercise a validation, e.g.
assert(...), and there is a failure, the
Validator captures the exception information but does not throw the exception until
validate() is called. This allows for multiple checks and builds details about where the failure occurred (@see 3rd paragraph). Example:
validator
.assert(1 eq 1, 'param1')
.assert(true , 'param2')
.assert( isValid('struct', {foo='bar'}), 'param3')
.validate();
Rarely are any kind of conditionals and validation like the above so simple! So, the
Validator concept is intended to be extended. There are three things to know about extending it:
- Make sure : component extends="path.to.Validator"
- Each validation method should return THIS so chaining can be accomplished.
- When your validation fails, call the collect(...) method in the parent; e.g., collect( 'YourFailureType', value, source);
Example:
<cfcomponent displayname="MyValidator" output="false" extends="Validator">
<cfscript>
this.MIN = createObject('java','java.lang.Integer').MIN_VALUE;
this.MAX = createObject('java','java.lang.Integer').MAX_VALUE;
function isInRange(value,src){
if( !isValid("range", value, this.MIN, this.MAX )){
collect("NumberRangeException", value, src); }
return this;
}
function shouldtBe5AlphaCharsExactly(val,src){
if( !isValid('regex', val , '[a-zA-Z|a-zA-Z]{5,5}' ) ){
collect('StringFailureException',val,src);
}
return this;
}
</cfscript>
</cfcomponent>
Basic Usage:
function myMethod(someNumber,someId){
validator
.isInRange(someNumber,'someNumber')
.shouldtBe5AlphaCharsExactly(someId,'someId')
.validate();
//...
}
When a client calls
myMethod(), the
Validator is invoked and checks to see if
someNumber is in the acceptable range and that
someId is exactly five alpha characters long. If not, it collects the exception information and proceeds until
validate() is called.
validate() iterates over its exception collection and throws a general failure with details of each exception:
ValidationException - One or more parameters failed validation. Parameter 'someNumber' with value '-1.2312312312312312E23' threw NumberRangeException ; Parameter 'someId' with value 'morethan5' threw StringFailureException ; There are a number of clear benefits to this:
- The logic becomes a lot easier to test.
- Re-use possibilities emerge.
- Specifying the implementation of the conditional rules becomes easier.
- The code becomes clearer to read and to maintain.
- It may even lead to better security.
Example Tests:
<cfcomponent output="false" extends="mxunit.framework.TestCase">
<cfscript>
function outOfRangeParametersShouldFail(){
try{
validator
.isInRange(1,"param1") //ok
.isInRange(-99,"param2") //ok
.isInRange(createObject("java","java.lang.Integer").MIN_VALUE-100,"param3") //capture this
.isInRange(createObject("java","java.lang.Integer").MAX_VALUE+100,"param4") //capture this
.validate();
}
catch(ValidationException e){}
assertEquals(2, arrayLen(validator.getExceptions()),"Should be only 1 exception, because it should fail at first validate().");
}
function charLengthShouldBeOk(){
var exceptions = validator
.shouldNotBeNoMoreThan5Chars('asdfg','param1')
.validate()
.getExceptions();
assertEquals(0, arrayLen(exceptions) ,'Oops. Exceptions are present in array.');
}
</cfscript>
</cfcomponent>
*Source Code Link*:
http://mxunit.org/examples/validator.zip Summary: I used a parameter validation application as the example, but you can see than
anywhere complex conditional logic is implemented, you could apply this.
Next: This and some other things really got me thinking. Should the User object be responsible for validating? Should
any object assume that responsibility? This question points to Inversion of Control and the Aspect Oriented Programming techniques of ColdSpring, Spring, and AspectJ. So, on the slate for next time will be a light-weight AOP/cross-cutting approach to this. Test and be happy ... Namasté. -bill
15 comments:
Makes me want to go and immediately rewrite a ton of my code. This looks very elegant. Thanks for a great post.
@Ilya ...I, too, now have an additional 10K SLOC to refactor ;-) thanks! bill
I see how this refactoring makes for a much better code and is way easier to read, but the part that I'm missing is that I don't see how the use of the validator object will cut down on the test cases you need to check. The different constraints are still there and your test code still needs to test each one of those conditions (and their combinations). So if the idea is to reduce the testing I'm not really seeing how that is achieved. Can you please clarify that?
@Oscar, good point. In general, I feel that there is no good solution to reducing the "amount" of tests here or elsewhere. We can, however, create tests that are clearer, more targeted, and easier to maintain. This approach really is aimed at breaking down the complexity of conditional logic into smaller more manageable pieces. A real-world value, I see, is when the production code is executed and fails, you will know exactly what failed and why. You could then display that info back to the user.
thanks,
bill
Thinking out loud here, and maybe this is just stupid, but when I think of truth table logic, I'll often write tests that use.... tables.
I create a private function in the test, and within that, i'll use cfquerysim to generate my table.
then, the test itself will loop over that test, run my functionUnderTest with each of the inputs corresponding to a single row in the table.
If something happens I don't expect to happen, then I'll append that to a structure. I do this so that the test does NOT fail fast... I want every scenario to be run.
Then, at the end of the test, I'll write an assertion on that resultant structure.
Here's a quickie example of what I'm talking about. This is thoroughly untested!
[cffunction name="usernameShouldFailForSpecifiedReason"]
[cfset var table = createValidatorTruthTable()]
[cfset results = StructNew()]
[cfloop query="table"]
[cfif table.ExpectedException eq ""]
[cfset tmp = obj.createLogin(table.username,table.password)]
[cfif NOT tmp.success]
[cfset results["#username#_#password#_ShouldNotFailButDid"]]
[/cfif]
[cfelse]
[cftry]
[cfset tmp = obj.createLogin(table.username,table.password)]
[cfthrow message="#ExpectedException#_ShouldFailButDidNot"]
[cfcatch type="#ExpectedException#"][!--- do nothing... that's what we want ---][/cfcatch]
[cfcatch]
[cfset results["#username#_#password#_#cfcatch.message#"]]
[/cfcatch]
[/cftry]
[/cfif]
[/cfloop]
[cfset debug(results)]
[cfset assertEquals(0,ArrayLen(results),"The following combinations failed: #StructKeyList(results)#")]
[/cffunction]
[cffunction name="createValidatorTruthTable" access="private"]
[cfset var q = ""]
[cfoutput]
[cf_querysim]
q
username|password|ExpectedException
mesher|easy|TooShortException
mesher|l33thx04||
mesher|nopunct|nopunctuation
........
[/cf_querysim]
[/cfoutput]
[cfreturn q]
[/cffunction]
@bill, I see that now. Good stuff!
@marc, that's smart testing! It's also one of the TestNG approaches I want us to build into mxunit - dataprovider.
@marc, thinking again about covering the truth table ... there should be a recursive algorithm that, given a set of values intended for boolean expressions, traverses or visits each combination. I suspect that we could generate a minimal structure of valid and invalid values and then visit each combination. I'll bang this out after cup #2 ;-), and after the adobe articles, and after the AOP widget, and after this that and the other ...
bill, are you talking about auto-generating inputs for a dataprovider annotation? or just providing the dataprovider annotation and the mechanism for running it:
[cffunction name=myFunctionSHouldDoThis mxunit:dataprovider=someFunctionNameThatReturnsAQuery mxunit:failfast=false]
it sounds like you're talking about generating meaningful inputs... that seems awfully sketchy to me. but then, i'm no big thinker.
i've only read the docs on testng's dataprovider, but from what i've read, i'm thinking mxunit will be able to go even further with the concept, largely because you can so easily do dynamic invocations. you could have pre and post functions to run before and after each row of the dataprovider is executed, and then a post-dataprovider function that acted on the results of the full loop. this owuld just be gravy if you needed it; otherwise, with no functions being passed in, mxunit would simply loop over the test function itself once for each row in the dataprovider, execute the test, and behave accordingly based on the failfast attribute.
great stuff here. i am looking forward to the day where test writing becomes more about thinking about inputs and less thinking about boilerplate junk
@marc, re:dataprovider, I wasn't thinking too much about generating inputs, but I think you got the picture. Imagine a test signature like this:
[cffunction name="testThatLoopsOverAQuery" dataprovider="someQuery"]
[cfargument name="rowName1"]
[cfargument name="rowName2"]
assert (rowName1 gt rowName2, 'or whatever - #someQuery.currentRow#');
[/cffunction]
Note parameters. This test would be executed for each row of the query. We should also be able to pass in arrays and structs.
re:covering the truth table - I didn't think about the connection between the dataprovider and truth table coverage until you brought it up. I think I like that idea :-). Note to self: As per @marc esher, see how we can leverage a dataprovider annotation and automatic truth table generation.
bill
This is great. Thanks. I downloaded the .zip file and placed it in my wwwroot, but I am getting a CFML compiler. I am running CFMX v8 and MXUnit. When I run http://localhost//validator/ValidatorTest.cfc?method=runTestRemote
I get this error:
The CFML compiler was processing:
* A script statement beginning with args on line 88, column 4.
* A script statement beginning with function on line 86, column 3.
* A cfscript tag beginning on line 2, column 2.
The error occurred in D:\WebSites\validator\ValidatorTest.cfc: line 88
86 : function setUp(){
87 : validator = createObject("component","validator.Validator");
88 : args = {args = [1,2,'asd',now(),{foo=['bar',-123.09932,[123,'00']]}]};
89 : }
90 :
@manfred, thanks! It looks like, for some reason, your CF install does not like my implicit struct notation. I just re-downloaded it myself and ran it ok on my windows xp machine (CF8.1 multi-server). To make the tests pass, you might want to change line 88: args = {...}; to something like:
args = structNew();
args.foo = 'bar';
args.mydate = now();
a = arrayNew(1);
a[1] = 123;
args.theArray = a;
...
or some such. The args data is pretty much arbitrary, but it's used to test complex data.
Try that and see if that works.
best,
bill
Billy, thanks for the quick response. Yes, that was the problem. I was trying to figure out what your code was trying to create on line 88, as I have never tried/seen that syntax.
FYI, I am running CFMX v8,0,0,176276 Enterprise. I will try it on CFMX v8.01 server to see what happens.
All of the tests in /validator/ValidatorTest.cfc?testMethod pass except resetGoodieShouldClearExceptionArray(). It indicates the variable PRINT is undefined (line 5 of ValidatorTest.cfc). Is that a function that you use? Or is that in MXunit? I was just trying to figure out today if I have the latest version of MXUnit.
Thanks,
Manfred
@manfred, doh! my bad. the print() method is a hack i threw into the framework - i'm always adding stuff i think should be in coldfusion anyway. i'll send up another zip file with the correction. just replace that with debug() and you'll be good to go. thanks for point it out!
bill
Formulate the arguments symbolically and test its validity using the truth table. Also mention critical row (or rows).
If I study hard or I get rich, then I get A’s.
I get A’s.
If I don’t study hard, then I get rich.
Where p = I study hard,
q = I get A’s,
r = I get rich.
Post a Comment