User:DavidHouse/Bharat on testing

This is a log of a conversation between myself and bharat, a developer from the Gallery2 project, about testing software in general, with an eye toward setting up unit testing for WP. It took place in #wordpress on 30 December 2005.
<bharat> hallo :-)
<davidhouse> bharat, what do you use for it? phpunit?
<bharat> we use a hacked up version of one of the forks of phpunit
<bharat> when we started unit testing ~3 years ago there weren't any mature phpunit frameworks yet
<davidhouse> SteamedPenguin, snap.
<davidhouse> hmm
<davidhouse> i tried out phpunit, but it seemed too complex.
<davidhouse> all this messing around with test harnesses and enough classes and reflection to sink the titanic.
<bharat> davidhouse: is your goal to do unit testing or regression testing?
<davidhouse> err, we haven't decided yet. this is very informal discussion.
<davidhouse> we just want it to Not Suck.
<bharat> unit testing and regression testing are very different beasts
<bharat> they help you at different levels of the product development cycle
<davidhouse> what's regression testing?
<bharat> http://www.devbistro.com/articles/Testing/Testing-Terminology-Glossary
<bharat> davidhouse: in that glossary, I'm referring to a "system test"
<skippy> "The objective of system test is to measure the effectiveness and efficiency of the system in the "real-world" environment." 
<davidhouse> right, that makes sense
<bharat> system tests are not particularly useful in helping you to refactor the code
<davidhouse> you use them at the release-candidate stage?
<bharat> we don't have many system tests
<bharat> we have ~1800 unit tests
<bharat> I need to go purchase a clover license to figure out what our code coverage is
<skippy> how many developers have commit access?
<bharat> but our goal is to have unit tests for all the model/controller code
<skippy> davidhouse: I meant in gallery2
<bharat> let's see
<skippy> is there a split between developer and tester, or test writer and test executor?
<bharat> roughly 10
<bharat> give or take
<bharat> no, the developer writes the unit test as he writes the code it's testing
<skippy> ok
<davidhouse> how does it compare in size to wordpress?
<bharat> http://codex.gallery2.org/index.php/Main_Page#id3006022
<davidhouse> i really should be a lot more familiar with gallery2 than i actually am
<bharat> http://fisheye.gallery2.org/viewrep/gallery
<bharat> this is better: http://codex.gallery2.org/index.php/Gallery2:Developers
<h0bbel> The codebase of G2 is way bigger than WP's
<davidhouse> okay.
<davidhouse> so the number of unit tests for WP shouldn't be out of control.
<davidhouse> like, 1500 at max?
<bharat> I'm currently refactoring our main data representation, which is resulting in roughly 400 files changed. It's helpful to make a fundamental change then run all 1800 unit tests
<davidhouse> i don't really know
<bharat> roughly how many lines of code is wp?
<bharat> expect to have about the same amount of lines of unit test code
<bharat> a 50/50 ratio is reasonable
<davidhouse> yeah
<bharat> it'll probably be a lot more though unless you've got suitable abstractions or another way to create a seam
<bharat> since in order to effectively unit test you need to be able to mock up the code you're not covering in a particular test
<bharat> that's usually challenging if you're starting writing the tests after the code/design is more or less complete
<bharat> this is a useful intro to mock objects: http://mockobjects.com/Faq.html
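A minimal sketch of the mocking idea, written in Python purely for illustration (the projects discussed here are PHP). The names `render_post_title` and `fetch_post` are hypothetical, not WordPress or Gallery2 APIs. The database object is passed in as a parameter, which is one way to create the seam that lets a test substitute a mock:

```python
from unittest.mock import Mock


def render_post_title(db, post_id):
    """Hypothetical unit under test: fetch a title and format it."""
    row = db.fetch_post(post_id)  # collaborator call we do not want to run for real
    return row["title"].strip().title()


def test_render_post_title():
    # Stand-in for the real database layer; no connection is opened.
    db = Mock()
    db.fetch_post.return_value = {"title": "  hello world "}

    assert render_post_title(db, 42) == "Hello World"
    # The mock also records how it was called.
    db.fetch_post.assert_called_once_with(42)


test_render_post_title()
```

If the function created its own database connection internally, the test would have no place to inject the mock, which is the difficulty bharat describes with code written before its tests.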
<bharat> davidhouse: there's a lot of really good material out there to help you figure out how to start
<davidhouse> yeah, i get that impression :)
<bharat> but from what I know of the WP situation, it sounds like you should start with some characterization tests
<bharat> that will help you establish your initial invariants so that you can develop unit tests and begin refactoring as necessary
<davidhouse> could you sum up characterization tests quickly?
<davidhouse> i should really get to know this side of software development in more detail.
<bharat> sure. they are simple functional tests (see glossary link above) that let you measure what your current codebase does
<bharat> so for example, if you have some code that generates a menu, you could write a characterization test that exercises the menu, captures the output and compares it to a "golden file" 
<bharat> when you write the test, you capture the initial output and save it as the golden file
<davidhouse> right
<bharat> now you've characterized your code. if you make a change, your test may fail because you've introduced a difference from the golden file, so you can examine the difference and determine whether or not this is an expected change
<bharat> and update the golden file
<bharat> this lets you know for sure if you're changing your behavior
<bharat> which gives you the freedom to go in and hack things without worrying about weird side-effects
<davidhouse> right
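The golden-file workflow described above can be sketched roughly as follows. This is an illustrative Python example, not code from either project; `render_menu` and `check_against_golden` are hypothetical names, and a temporary directory stands in for wherever golden files would really be kept:

```python
import tempfile
from pathlib import Path


def render_menu(items):
    """Hypothetical legacy code whose current behaviour we want to pin down."""
    return "<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>"


def check_against_golden(output, golden_path):
    """Return True if output matches the golden file.

    On the first run no golden file exists, so the current output is
    recorded as the baseline.
    """
    golden_path = Path(golden_path)
    if not golden_path.exists():
        golden_path.write_text(output, encoding="utf-8")
        return True
    return golden_path.read_text(encoding="utf-8") == output


with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "menu.golden"
    # First run: characterize the code by saving its output.
    assert check_against_golden(render_menu(["Home", "About"]), golden)
    # Unchanged behaviour still matches the golden file.
    assert check_against_golden(render_menu(["Home", "About"]), golden)
    # A behaviour change shows up as a mismatch to inspect.
    assert not check_against_golden(render_menu(["Home", "Contact"]), golden)
```

When a mismatch turns out to be an intended change, the golden file is deleted or overwritten so that the new output becomes the baseline.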
<bharat> that's the upside. the downside is that they typically are brittle because they're testing many levels of functionality. unit testing each individual level will give you more stability
<bharat> so usually when I go into a legacy codebase and want to introduce testing I start with a characterization test
<bharat> then I start refactoring the code to introduce abstraction so that I can write unit tests
<bharat> once I have good unit tests and have achieved the coverage level I want I may delete the characterization test because it's no longer necessary
<davidhouse> bharat, right, so you write the characterization tests to make sure you're not breaking anything in a major way whilst writing the unit tests?
<bharat> davidhouse: right
<davidhouse> bharat, is unit testing the only testing you do?
<bharat> davidhouse: we have a lot of consumers of our nightlies so we get a lot of manual testing
<davidhouse> yes, that's true.
<bharat> usually a 2 week interval on an alpha/beta/release-candidate is enough to shake out most of the issues
<davidhouse> we need to expand our manual test userbase.
<bharat> I'd estimate that we have 300-400 people using CVS and nightlies
<davidhouse> we only have a limited set of people covering a limited set of functionality
<bharat> our issues are almost always rendering problems because the rest of the code is covered pretty well by the tests
<bharat> those are usually very easily fixed
<davidhouse> what kind of level do you unit test at? individual functions?
<bharat> yes
<bharat> back when I started doing this I didn't know enough so I didn't mock out the database
<davidhouse> right.
<bharat> which is unfortunate because it means that our tests use the db which means they run slower than I'd like
<davidhouse> you mean you didn't abstract to a db access layer you could swap at will? ;)
<bharat> heh
<bharat> we did, actually
<bharat> right now we have enough abstractions that we support mysql, postgres and oracle, with db2, firebird and sqlite in the works
<davidhouse> and the API stays the same
<bharat> but for our testing we don't swap out the db layer.
<bharat> it's on my list though.
<davidhouse> yeah. :)
<bharat> because we have an abstraction at the right level it's probably on the order of a week or so of work
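As a rough illustration of swapping the database layer out for tests, here is a Python sketch under stated assumptions: `InMemoryStorage` and `AlbumService` are hypothetical names, not Gallery2 classes. The model code talks only to a small storage interface, so a test can hand it an in-memory fake instead of a real database:

```python
class InMemoryStorage:
    """Hypothetical fake offering the same methods as a real storage layer."""

    def __init__(self):
        self._rows = {}

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class AlbumService:
    """Hypothetical model code that only talks to the storage interface."""

    def __init__(self, storage):
        self._storage = storage

    def rename(self, album_id, new_name):
        album = self._storage.load(album_id)
        if album is None:
            raise KeyError(album_id)
        self._storage.save(album_id, {**album, "name": new_name})


# The test never touches a real database, so it runs quickly.
storage = InMemoryStorage()
storage.save(1, {"name": "Holiday"})
AlbumService(storage).rename(1, "Holiday 2005")
assert storage.load(1) == {"name": "Holiday 2005"}
```

A fake like this trades fidelity for speed: it will not catch SQL or driver-specific problems, so some tests against a real database would presumably still be wanted.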
<bharat> there are many things to like about unit testing (and some things that many don't like) but one thing that I find is that it drives the right level of abstraction
<davidhouse> bharat, this has been very valuable advice.
<davidhouse> along with 'get the damn php debugger working', i think 'learn how to test right' is going to be a new years resolution.
<bharat> davidhouse: I'm always around. we've been doing this for a couple of years now (and I do a lot of test driven design at work) so come on by #gallery any time you want to talk
<davidhouse> thanks. i appreciate it :)