Wednesday: CFExecute is timing out, and I don’t know why
I get an IM from a coworker, Alan. One of the applications I wrote a while back isn’t working on our Dev server. Specifically, this application uses CFExecute to invoke Beyond Compare from the command line on Windows. The app would spin, and eventually, Alan would get a “Timeout period expired without completion of C:\Program Files\Beyond Compare 2\bc2.exe” error. Windows Task Manager was littered with BC2.exe processes.
I fixed this with little trouble and sent out an email to the dev group with instructions for fixing it (explanation below).
Thursday: CFExecute is timing out – again, but on a different machine – and I don’t know why
The next day, after the app was running successfully again on the Dev server, Alan emails me. The app won’t run on his machine now, and it’s giving the same error message. Me, well… I get pissy. “C’mon, dude, I just sent an email telling how to fix it”. It was early, I was cranky, and the “angels of my better nature” weren’t in the mood. In short, I was an a-hole, assuming Alan was being lazy.
He wasn’t. He tried everything I had suggested as a fix. It still wasn’t working.
Sitting at his desk trying to figure it out, I was stumped. I rarely get this stumped. I *never* get stumped enough to email a company for help (in this case, Scooter, the makers of Beyond Compare). And there I was, writing an email to Scooter.
But you know how it is. You’re a geek. You like puzzles. You never give up. So… one more trip over to Alan’s desk. “Let’s try a few more things…”
This is a post about debugging. And it’s a post about assumptions.
What we did to debug
Wednesday night: The Easy Fix
Debugging the hanging bc2.exe on the Dev server took about 10 minutes. After piddling around a bit (selecting different options in the app, restarting CF – ruling out the really easy stuff), I simply took the exact command CFExecute was running and ran it myself at the command line. This is CFExecute Debugging 101. In my experience, this almost always yields the answer. When I wrote the app, I had it write its entire cfexecute command to a log file, so it was a cut-n-paste job to run it from the command line.
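Here is the log-the-command habit as a small sketch. It is Python rather than the CFML the app was written in, and `run_logged` and its parameters are names made up for illustration. The only point is that the exact command line lands in the log before the process starts, so it can be pasted into a shell later.

```python
import logging
import subprocess
import sys


def run_logged(args, timeout_seconds, logger):
    """Log a copy-pasteable command line, then run it with a timeout."""
    # list2cmdline joins the arguments using Windows quoting rules,
    # so the logged string can be pasted into a command prompt.
    command_line = subprocess.list2cmdline(args)
    logger.info("Running: %s", command_line)

    # Raises subprocess.TimeoutExpired if the process outlives the timeout.
    completed = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout_seconds
    )
    logger.info("Exit code: %d", completed.returncode)
    return command_line, completed


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in command; the real app would pass the external tool and its arguments.
    run_logged([sys.executable, "-c", "print('ok')"], 30, logging.getLogger("exec"))
```

`list2cmdline` fits the Windows setup in this story; on other platforms `shlex.join` would be the closer equivalent.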
When I did so, I saw the problem right away: the license had expired for Beyond Compare on the Dev server. Instead of running the command I had given it, Beyond Compare was throwing up a popup asking for a new serial number. All those BC2.exe processes in Task Manager were those popups, swallowed by CF, but never completing. Fixing the license problem fixed the CFExecute problem.
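Those leftover processes are what you get when the caller gives up waiting but the child keeps sitting on its popup. As a sketch of the alternative (again Python, not CFML, with `run_or_kill` as a hypothetical helper name), the caller can kill the child itself when the timeout fires:

```python
import subprocess
import sys


def run_or_kill(args, timeout_seconds):
    """Run a command; if it outlives the timeout, kill it.

    Returns (finished, returncode). finished is False when the process
    had to be killed.
    """
    process = subprocess.Popen(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        process.wait(timeout=timeout_seconds)
        return True, process.returncode
    except subprocess.TimeoutExpired:
        # Stop the hung child rather than leaving it behind.
        process.kill()
        process.wait()  # collect the exit status of the killed process
        return False, process.returncode


if __name__ == "__main__":
    # Stand-in for a program blocked on a dialog: it just sleeps.
    print(run_or_kill([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

This only kills the direct child, not anything that child launched, and it treats the symptom only. The popup still has to be dealt with.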
Thursday: The not-so-easy Fix
After confirming that the problem on Alan’s machine was not license-related, we tried a whole bunch of things, mostly centered around answering the question: “Why does it work on one machine and not the other? What are the differences between these two machines?”
We suspected a permissions issue from the start. The BC command was looking at different network locations, so the first thing we did was change the script to look only at directories on Alan’s machine, removing anything network-related. No dice.
We always run CF under a special account (we’ll call it “cfservice”), and we thought “maybe the account got borked”. We weren’t taking that for granted as it’s happened before. So we tried running CF under the local system account. Nada.
We added additional logging into the BC command to get more info… it wouldn’t write the logs. But it would write the logs if we ran the command directly on CMD. WTF?
We went into Beyond Compare’s options and fiddled with a few important-sounding checkboxes. Still nothing.
So we’re at a point where, when running the command from CMD, it worked fine. But when run from within CFExecute, it was hanging.
What’s the difference between ColdFusion running it, and us running it from CMD? We were stumped.
At this point, we were certain that for some reason, Beyond Compare was throwing up a popup when run from within CF, but we couldn’t “see” that popup because, obviously, CF was swallowing it. We tried hooking a debugger up to the BC2.exe process, but we got nowhere fast.
Aside: while all this is happening, another coworker is talking about some other thing, “this app (unrelated) worked a few days ago. Now it’s busted. What’s going on?”. This was an app I had fiddled with “a few days ago”. I knew exactly what the problem was. But I didn’t step in yet. When you’re debugging, you stay focused. You ignore other problems even if you know the answer because when you’re debugging, you stay in the zone. There are times when you break from the problem and let your subconscious chew. For me, those times are toward the end of the day, when I’m spent, and I can let my mind wrestle with it on my commute home. But this was 10:00 AM. This, for me, is prime focus time, and a big no-no for me is pulling off of a problem during my peak mental energy. When you commit, you commit, even if it means leaving someone else hanging for a little bit. Good teams understand this, and God willing you’re patient with each other and recognize when your coworkers need to focus. It’s a high form of respect among geeks. If you see one of your colleagues in the zone, and they sort of mindlessly blow you off, it’s not necessarily an affront. Maybe it’s that they’re running at full pace and stopping would put them at the back of the proverbial pack. Be patient. Do not assume the worst.
I was about to give up, so I went back to my desk to do a little research. I drafted an email to Scooter. I held off, then went back to Alan’s desk.
Alan was still pretty sure it was permissions related, and at this point I was up for anything. But we had tried that route already. We ran it under cfservice. We ran it under local system. What else could it be? What’s the *delta* among running it from cfservice, running it from local system, and running it from CMD logged into the machine normally (under Alan’s account)?
OK, let’s try this: let’s take a cue from Caddyshack. “Be the ball.” So, let’s “Be ColdFusion”.
We logged out, logged back into his machine under our cfservice account (which, remember, CF runs under on our machines), and attempted to run the app from the command line. Bam. Popup… The damn license popup. How? What? Huh?
It turns out that Beyond Compare stores a license for each user on the machine, which we confirmed by poking through the registry. That explained why it worked when we ran it from CMD logged in normally, and why BC2.exe hung under cfservice. Presumably it also explains why switching CF to the local system account didn’t help: that account had no license either. We logged back in under Alan’s account, then did a shift “Run as” on Beyond Compare as cfservice, and sure enough we got the popup.
To fix, we simply entered the license for the application while running it from the same account as ColdFusion, which cured our ills.
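One cheap habit that would have shortened the hunt: log who the process is running as, right next to the command. Below is a minimal Python sketch of that idea. `describe_run_context` is a made-up name, and this is not what the app in this story does.

```python
import getpass
import os


def describe_run_context():
    """Collect the per-user details that differ between accounts."""
    try:
        user = getpass.getuser()
    except Exception:
        # Some service environments have no resolvable user name.
        user = "<unknown>"
    return {
        "user": user,
        "cwd": os.getcwd(),
        "home": os.path.expanduser("~"),
    }


if __name__ == "__main__":
    context = describe_run_context()
    print("Running as {user} (home: {home}, cwd: {cwd})".format(**context))
```

Comparing that line from the service's log with the same line from your own shell makes the "delta" between accounts visible before you have to go looking for it.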
In retrospect, it makes perfect sense. While you’re debugging, it never makes sense. This is learning, and it’s why I go to work every day.
- If your application appears to hang, try running it on the command line first, outside of CF, and see if you get any popups
- If your application appears to hang, but it works when you run it from the command line yourself, try running it under the same user account that CF runs under
- Your coworkers can sometimes *appear* to be helpless, or at least appear to not be taking the initiative you wish they would take. Before you assume that’s the case, try instead to assume the best. Assume they tried; assume they tried hard. When you think you’re not getting the best out of people, ask yourself: Are you giving them your best, too? Not your best tech. Your best you.