Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Let's say I'm a newly joined senior Java developer, assigned a code base that's a couple of years old and fairly sprawling. Test coverage is ~12% on average across the five repos I work in. I see a clear need to add more unit test coverage, but here's the deal: we can't do it all, and even if we could, I need to prioritize the most important code first.

Tools already at my disposal:
  • Gerrit (hosts the repos)
  • Jenkins
  • JaCoCo
  • SonarQube (which can visualize/track coverage over time, as well as code complexity)
Are there any tools that can help me prioritize which code is most important to cover first?
Also, any advice on metrics to help with that prioritization?

My thinking right now is that two metrics will help me:
  • Complexity of the unit. Trivial code doesn't need unit tests as urgently as that behemoth which Sonar tells me has a cyclomatic complexity of 37.
  • Rate of change. Even complex or large code units don't need regression tests that urgently if nobody has changed them since 2020.
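Concretely, I was imagining multiplying the two into a crude "hotspot" score. A toy sketch with invented numbers (in practice the complexity would come from SonarQube and the churn from git history):

```java
import java.util.Comparator;
import java.util.List;

// Toy "hotspot" ranking: complexity times churn. All inputs invented;
// real numbers would come from the SonarQube API and from git log.
record Unit(String path, int cyclomaticComplexity, int commitsLastTwoYears) {
    long score() { return (long) cyclomaticComplexity * commitsLastTwoYears; }
}

class HotspotRanking {
    public static void main(String[] args) {
        List<Unit> units = List.of(
                new Unit("OrderService.java", 37, 48),  // complex AND churning: test first
                new Unit("StringUtils.java", 4, 90),    // churning but trivial
                new Unit("ReportWriter.java", 25, 2));  // complex but frozen since 2020
        units.stream()
                .sorted(Comparator.comparingLong(Unit::score).reversed())
                .forEach(u -> System.out.println(u.score() + "\t" + u.path()));
    }
}
```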
What did I miss?

And what tools can help me find the most-changed files? (Git blame is useful, but only reports on a single file at a time. I'd like something that gives me a breakdown for the whole repo, at least.)
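The closest I've hacked together myself is counting commits per file -- the moral equivalent of `git log --format= --name-only | sort | uniq -c | sort -rn`, wrapped in Java so I can post-process it. A throwaway sketch (the date and the top-20 cutoff are arbitrary):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

// Quick-and-dirty churn report: how many commits touched each file.
public class ChurnReport {
    public static void main(String[] args) throws Exception {
        // --format= suppresses commit headers; --name-only lists changed files.
        Process git = new ProcessBuilder(
                "git", "log", "--since=2020-01-01", "--format=", "--name-only")
                .start();
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(git.getInputStream()))) {
            String line;
            while ((line = out.readLine()) != null) {
                if (!line.isBlank()) {
                    counts.merge(line, 1, Integer::sum);
                }
            }
        }
        // Print the 20 most-changed files, highest churn first.
        counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(20)
                .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}
```

But surely there's a proper tool for this?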


cynic
Jan 19, 2004



Hippie Hedgehog posted:

My thinking right now is that two metrics will help me:
  • Complexity of the unit. Trivial code doesn't need unit tests as urgently as that behemoth which Sonar tells me has a cyclomatic complexity of 37.
  • Rate of change. Even complex or large code units don't need regression tests that urgently if nobody has changed them since 2020.

Metrics aren't really your best friend for sprawling untestable code bases. I'd say up front you should do a few things:

  • Locate easy wins: boilerplate and duplicated code (Sonar will flag this). Write common tests for it and you can bump your coverage massively in a week, while identifying the stuff you no longer have to worry about. In one code base I inherited, I wrote a pile of 'testGenericEntityGetterSetter' helpers so I could autogenerate `testGenericHashGetterSetter('variablename', $hash)` type calls (a sketch of the idea follows this list). That bumped a 12% codebase to 25% instantly: thousands of new assertions with minimal effort. Make sure everyone uses them, and encourage best practices by pointing out you doubled test coverage in a week.
  • Locate the truly fucky code by asking people who've worked with it. Find out what has fucked the codebase over in the last year. Find out what developers are scared of in the code, the stuff they triple-check before deploying. Locate the things that always come up in incident reports. Someone in the company knows where you need to look for these. It's going to be shitty writing tests here, but worth it.
  • Set a quality gate in Sonar at 80% on new code and fucking enforce it. All new code has tests. If it's not testable, you fucked up. If it's not testable because other code is fucked, then fix that code, and write tests around it first so you know you didn't break it. Coverage will creep upwards if you create an environment where coverage on new code is enforced.
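For the generic getter/setter tests, something like this JUnit 5 sketch (the helper and the entity class are invented here; point it at your real beans):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import org.junit.jupiter.api.Test;

// One reflective helper, reused against every boring entity class:
// set each writable property, read it back, assert it round-trips.
class GenericBeanTest {

    // Hypothetical entity standing in for the real boilerplate classes.
    static class Customer {
        private String name;
        private int age;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    static void assertGettersAndSettersRoundTrip(Object bean) throws Exception {
        for (PropertyDescriptor pd : Introspector
                .getBeanInfo(bean.getClass(), Object.class)
                .getPropertyDescriptors()) {
            if (pd.getReadMethod() == null || pd.getWriteMethod() == null) {
                continue; // skip read-only or write-only properties
            }
            Object sample = sampleValue(pd.getPropertyType());
            if (sample == null) {
                continue; // no sample value registered for this type
            }
            pd.getWriteMethod().invoke(bean, sample);
            assertEquals(sample, pd.getReadMethod().invoke(bean),
                    pd.getName() + " did not round-trip");
        }
    }

    // Minimal sample values; extend as your entities demand.
    static Object sampleValue(Class<?> type) {
        if (type == String.class) return "x";
        if (type == int.class || type == Integer.class) return 42;
        if (type == long.class || type == Long.class) return 42L;
        if (type == boolean.class || type == Boolean.class) return Boolean.TRUE;
        return null;
    }

    @Test
    void customerRoundTrips() throws Exception {
        assertGettersAndSettersRoundTrip(new Customer());
    }
}
```

One test class per entity is then a couple of lines, which is what makes it easy to autogenerate.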

Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Thanks for these pieces of advice!

Will definitely ask around about which code is "scary" to the devs. I have some ideas of my own after looking at it, too.

One good thing is this particular code base is not generally hard to write tests for. It's just that people have not taken the time for it.

Anyway, duplication doesn't seem like a huge problem right now, at least as far as Sonar indicates. I'm mostly concerned with some really complex methods that are nearly impossible to break up, because people mindlessly added special cases over a few years; those need new abstractions.


beauty queen breakdown
Dec 21, 2010

partially cromulent posting.
"2021's worst kept secret"


Hippie Hedgehog posted:

One good thing is this particular code base is not generally hard to write tests for. It's just that people have not taken the time for it.

If I had a nickel for every time I'd seen that one at a job, I'd have six nickels, etc. And that's not counting the number of times I've heard it, or something similar, from people in the industry I don't work with directly.

I would strongly second cynic's "quality gate" strategy -- a lot of this comes down to your build system. At a minimum: tests run on every pull request; code that fails its tests can't be merged, even by an administrator or in spooky emergency scenarios; and new code has to meet a minimum coverage percentage.

Combining that with the easy-win stuff should highlight areas for improvement -- often what I find while writing tests is not merely duplicated code but also code that's worth removing or refactoring as part of building out the test suite.

Obviously all of this is super easy to say. I'm rooting for you. :unsmith:

-- Last thing: obviously I'm a big proponent of Maven in these scenarios, but I notice you don't include that part of your build toolchain here. Anything in that space we could improve as part of the test push?

Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Right -- Maven is the build tool, I just didn't list it. Might as well have been Gradle or something else, though; I don't think Maven by itself will help me.

I think we're set up well for gating new code with SonarQube's coverage gates, so I'm not worried about that part. The new code is the easy part; what I'm looking for is the right place to start covering the legacy code. =)


beauty queen breakdown
Dec 21, 2010

partially cromulent posting.
"2021's worst kept secret"


The way Maven can help you here is by enforcing your code coverage and (with an artifact repository) helping you extract functionality at a dependency level via a release process, which may improve the architecture along the way. If you don't have an artifact repository hanging around (Nexus, GitHub Packages, Artifactory, etc.), feel free to ignore this portion of advice.

A solid if verbose example is the parent->child architecture with JaCoCo enforcement; cf. https://github.com/infrastructurebuilder/ibparent/blob/master/pom.xml#L1465 and a downstream project https://github.com/infrastructurebuilder/audit-reporting-maven-plugin/blob/develop/pom.xml#L31. Note how the use of a released parent lets you enforce this configuration consistently on downstream projects and track that the covered percentage goes up.

Additionally, this lets you pull things away from the legacy architecture with its boatload of bespoke cases. One thing I would look at is drilling into the 'complex methods' you mentioned: can they be pulled away as a dependency? If not, why not? Getting that code away from the main code base may give you architecture/refactoring ideas, and then you can write the tests covering such things in isolation. Of course, writing testable code isn't always the original goal of such things, so... yeah, daunting.
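To make "pull it away as a dependency" concrete, a toy sketch (all names invented here): put an interface in front of the behemoth, move the implementation into its own module, and test it there.

```java
import java.math.BigDecimal;

// Hypothetical domain type, a stand-in for whatever the real code passes around.
record Order(int quantity, BigDecimal unitPrice) {}

// Step 1: callers depend on this interface, not on the legacy class.
interface PriceCalculator {
    BigDecimal priceFor(Order order);
}

// Step 2: the concrete implementation moves into its own Maven module
// (its own released artifact), where it can be unit-tested in isolation.
// At first it's just the old method body, moved verbatim.
final class LegacyPriceCalculator implements PriceCalculator {
    @Override
    public BigDecimal priceFor(Order order) {
        // ...the 37-branch special-case logic lands here unchanged...
        return order.unitPrice().multiply(BigDecimal.valueOf(order.quantity()));
    }
}
```

Once it's a separate module, the JaCoCo gate inherited from the parent POM applies to it like any other artifact.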

Again I'm armchair quarterbacking here so don't take my advice as law, ymmv, etc.
