Tuesday, September 18, 2012

It's judgment time!


I came to wonder about oracles in testing. Or more likely got bewildered. You see, I read Cem Kaner's article about this subject and was totally lost when I hit the paragraph "Oracles are Heuristics".

Oracles and heuristics started to blur! I started this post as an attempt to cure my bewilderment via a deeper analysis of the article and the thinking behind it, but then shifted my focus to something much more interesting, namely passing judgment on people.

It's a dangerous path to take, but I took it. Let's see how it went...

Define: Heuristics

Let's start by defining heuristics. I consider them to be specific means to try to solve a problem and learn: rules of thumb, educated guesses, applied common sense or even ways to find bugs, if you will. Oh look, Wikipedia thinks the same! :) But by specific I don't mean algorithms. That's why I emphasize "to try". I have a massive library of heuristics: a mindmap of almost 2000 nodes, each node meaning a specific try, an idea to find bugs. It's based on James Bach's HTSM master and updated with everything I've got. A lot of it uses NDA-bound material as examples, so unfortunately I cannot share it. But it's basically quite generic; nearly everything can be applied everywhere. And I do just that by selecting the ideas that fit the occasion and context. Adaptation is the key.

Define: Oracles

Ok, then the oracles. Quotes from Kaner's article:
"A software testing oracle is a tool that helps you decide whether the program passed your test."
"Imagine running a test. The program misbehaves. The tester notices the behavior and recognizes that something is wrong. What is it that makes the tester decide this is wrong behavior?"
[Image: Successful failure]
I can't come up with a better description of oracles in testing than those two quotes. In practice they are the requirements/expectations/wants of someone who matters, or something that matters. They can be explicit or implicit. In the comment section of my previous blog post I explained these. In short: explicit requirements/expectations/wants are those that can be put into words, perhaps even into a formal requirement that's managed in a serious fashion, while implicit ones are those that cannot. Implicit requirements/expectations/wants are what make software development so hard: they might be based on whims, sudden innovations, secrets or drunken late night calls from a customer, and that can be quite a handful.

Back to Kaner's article:
"(Doug) Hoffman argued that no oracle can fully specify the postcondition state of the system under test and therefore no oracle is complete. Given that an oracle is incomplete, you might use the oracle and incorrectly conclude that the program failed the test when it didn’t or passed the test when it didn’t. Either way, reliance on an oracle can lead you to the wrong conclusion.
A decision rule that is useful but not always correct is called a heuristic."
This is where I got lost. And I was lost in the way that, for example, an ISTQB Foundation level tester might be: I focused on terms and explanations more than the thinking behind them. I made the rookie mistake: I didn't test the rule itself and pass judgment on it.
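To make Hoffman's point a bit more tangible for myself, here's a minimal sketch in Python. Everything in it is made up for illustration: `buggy_sort` is a hypothetical function under test, and `is_ordered` and `is_ordered_permutation` are two hypothetical oracles. It only tries to show how an incomplete oracle can say "pass" when the program actually failed.

```python
def buggy_sort(items):
    """Hypothetical function under test: sorts, but silently drops duplicates."""
    return sorted(set(items))


def is_ordered(result):
    """Partial oracle: checks only that the output is in ascending order."""
    return all(a <= b for a, b in zip(result, result[1:]))


def is_ordered_permutation(original, result):
    """Stronger (but still incomplete) oracle: ordered and same elements."""
    return is_ordered(result) and sorted(original) == sorted(result)


data = [3, 1, 2, 3]
result = buggy_sort(data)

print(is_ordered(result))                    # the weak oracle says "pass"
print(is_ordered_permutation(data, result))  # the stronger one says "fail"
```

Even the stronger oracle says nothing about speed, memory use or side effects, so it too is useful but not always correct.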

I wouldn't mix oracles and heuristics together. For me oracles are the flavour of heuristics. Sometimes doing something based on a heuristic might reveal an oracle. Actually that happens quite often. Even more often oracles reveal the need for certain heuristics. But even though they live in harmony, I wouldn't mix them together. Examining oracles and their dynamics is a completely different ball game.

Oracle is, heuristic does.

Oracle helps you to pass judgment.
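As a rough sketch of how I see that split, here's a toy example in Python. All the names are my own inventions for illustration: `boundary_values` plays the heuristic (it does something, namely suggests inputs worth trying), `within_range` plays the oracle (it helps judge what came out), and `clamp` is a hypothetical function under test.

```python
def boundary_values(low, high):
    """Heuristic: a rule of thumb that suggests inputs worth trying."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]


def clamp(value, low, high):
    """Hypothetical function under test."""
    return max(low, min(value, high))


def within_range(result, low, high):
    """Oracle: helps judge whether an observed result is acceptable."""
    return low <= result <= high


low, high = 0, 10
for value in boundary_values(low, high):
    result = clamp(value, low, high)
    verdict = "ok" if within_range(result, low, high) else "suspicious"
    print(value, result, verdict)
```

The heuristic never tells me whether anything is wrong, and the oracle never tells me what to try next.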

Passing judgment on... people?

I'm getting nowhere by contemplating the differences between oracles and heuristics, so I'll leave it there. The passing judgment part instead sounds intriguing. As testers, more specifically sapient testers, we pass judgment on the software we test. Either it works or it doesn't. It's not always that easy, so we seek more information to ease this. One crucial source of this information is the people around us.


What about passing judgment on them?

Software is all about people. It's about people making it, setting expectations for it, using it and depending on it. And people are imperfect. "Flaws" in people's behaviour are all that matters when it comes to hunting bugs. People are the root cause of everything in software testing. Good and bad.

So why not pass judgment on them?

I'm walking a fine line here, so let's be careful...

A few lines earlier I wrote about implicit requirements/expectations/wants. Let's expand that concept to oracles: indirect, vague and potential means to pass judgment on something. When it comes to software, our goal is often to come up with explicit oracles that leave no room for interpretation. Formal requirements management aims for just this. This is however an unreasonably ambitious goal. That's why we've invented the concept of implicit oracles, which are basically everything that hasn't been or cannot be quantified. Where explicit oracles would be the bricks, implicit ones would be the mortar.

Note: When an implicit oracle is quantified, it becomes explicit. Add this to your process. ;)

When dealing with people it's all about unwritten, unquantified, indirect, vague, [add mystic adjective]. Many people make the mistake of dealing with people in the same way they would with software. It's all about social skills, even when you're not directly in contact with another person.

Let me elaborate.

The story of imperfect people

Now, have you ever been tired at work? Have you ever been under the weather? Have you had so much to drink on the weekend that you forgot to wear shoes to work? Well, so have the developers, architects, project managers and pretty much everyone responsible for the product/service whose quality you're questioning. Even at full health people aren't perfect. Nothing ever is. The same applies to environments, prevalent mood, weather, alignment of planets and whatnot. And all that affects quality.

Can we recognize patterns from this mess?

Can we come up with means to pass judgment on people?

Let's hit the brakes here. You should never pass judgment on people! You're never in a position to do so. However, people's actions can be judged. The good. The bad. And of course the ugly ones, as you'll see next.

Let's look at toilet brushes and people not using them. That's a very tangible example. So very many times I've been disgusted by someone's "unfinished business" at the toilet. But when connected to a developer it becomes an interesting idea: does this person leave other things unfinished too? Very much so! You wouldn't imagine how often these kinds of signals can be identified as the original root cause of problems.

[Image: Heads up!]
An even more tangible example was this guy who was a bit aggressive at work and often drifted into quarrels with other people. He was also a drinker, which wasn't that hard to tell. On the other hand he wasn't that big or the kind of guy who could handle himself in a tight spot. So I came up with this preconception that this guy gets into trouble on the weekends and might not show up on Mondays, or even Tuesdays. And I was right on the money! On several occasions he got beaten up on the weekend (I didn't do it! :) and had to lick his wounds for some days. This created a problem with workload estimations and we had to seek a replacement. He wasn't that good a developer either. And he didn't use a toilet brush! :)

Attitude problems, lacking competence, poor guidance, etc. have more prominence in this matter, but I'm more fascinated by subtle signals like a developer not using a toilet brush. Now that's a revolting thing to begin with, leaving places like that behind you, but it acts as quite a solid oracle: this guy is sloppy, which can cause a number of problems. Ok, that sloppiness can manifest in a number of ways, but The Toilet Brush oracle heuristic has never failed me. By every measure, people who don't use a toilet brush produce more bugs (not just defects, but things that bug people). Based on my observations, of course. I'm more aware when testing something they've made. I emphasize even the positive scenarios, because they will fail.

I'm more alert.

Of course there are many other signals that can raise a red flag, but this one and its cosmic connection to the effects leapt to mind. I'm sorry for such a sh**ty example. :D

So if you're not using a toilet brush and I'm next, know that your software is next under the spotlight. ;)

Disclaimer: Be very careful when walking this path. What if you're wrong? What if you're interpreting the subtle signals wrongly? Two tips:

  • Start with the more prominent signals. Be sure.
  • Notice a pattern before making a judgment!

Those apply quite nicely to software too. If you manage to do this with people, doing it with software is a cakewalk. I'd however advise you to start the other way around. So, third point:

  • Software first, then people.

One more:

  • You're not perfect either. ;)

And just to remind you: do not pass judgment on people, but on their actions! Where you can and indeed should pass judgment on the software itself, the idea of it and everything it does, you cannot roam with such freedom with people. Take care of and responsibility for your actions and - whatta you know! - you start to do things better.

Ok, I've written a heap of text that spans several days, has been edited several times and might miss its point badly. It's slightly insane and very dangerously interpretable, so I'm contemplating whether I should even publish it. If you see it, you know I've made a decision... ;)

Quote time! One of my favourite comic characters hits the big screen on the 21st of September, so let's make room for the king of badasses. He does pass judgment on people themselves, but life's different in a post-apocalyptic world... ;)
"It's judgment time!" -Judge Dredd
Yours truly,

Sami "Judge Sir Blom" Söderblom


  1. Glad you published, funny and thought provoking!

    So a couple of thoughts that were provoked:

    On the toilet brush, different people can be sloppy in different ways:
    - The guy who leaves his business in the toilet might be a germaphobe and thus unable to use other people's toilet brush (beware of time spent on refactoring 10 times after each code edit).
    - Some people who are really "neat" might not want to work with other "messy" people (beware of integration issues).

    "You're not perfect either." Couldn't agree more. It is really easy as a tester to fall into the trap of know-it-all smartass. As we kind of deal in the business of hindsight (=jälkiviisaus).

    "Have you had so much to drink on the weekend that you forgot to wear shoes to work"
    Mondays aren't really working days, right?

    Yours truly,
    Sloppy, toomuchontheweekenddrinking, but still toilet brush using (c'mon, we live in a society!) tester

  2. Thanks Anssi! I really struggled if I should post it or not. I'm glad I did. :)

    Root cause is a funny thing. We see an effect, which is often a software bug. We start to wonder why this happened, so we track the problem to code, design, architecture, whatnot. Social suspicion takes it a little further: Why did the design fail? Could it be that the designer didn't bring his A-game? Why is that? A hangover? General sloppiness? What? And what is the reason for that? And the reason for the reason? And then?

    I see a gradient forming. A gradient from computer science to behavioral psychology. Advanced root cause analysis might be just that, and I'm beyond fascinated by where it takes the industry that is software testing.

    Exciting times ahead!

  3. I suggest:

    A heuristic is a fallible method of solving a problem or making a decision.

    An oracle is a heuristic for recognizing a problem once it has manifested.

    These are simple, usable definitions, at least for me. Personally, I find the Kaner/Hoffman view on oracles to be a bit simplistic and narrow and pandering to less intelligent people, but not terribly wrong.