Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Let's say I'm a newly joined senior Java developer, assigned a code base that's a couple of years old and fairly sprawling. Test coverage is ~12% on average across the five repos I work in. I see a clear need to add more unit test coverage, but here's the deal: we can't do it all, and even if we could, I need to prioritize the most important code first.

Tools already at my disposal:
  • Gerrit (hosts the repos)
  • Jenkins
  • JaCoCo
  • SonarQube (which can visualize/track coverage over time, as well as code complexity)
Are there any tools that can help me prioritize which code is most important to cover first?
Also, any advice on metrics to help with that prioritization?

My thinking right now is that two metrics will help me:
  • Complexity of the unit. Trivial code doesn't need unit tests as urgently as that behemoth which Sonar tells me has a cyclomatic complexity of 37.
  • Rate of change. Even complex or large code units don't need regression tests that urgently if nobody has changed them since 2020.
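Concretely, I was imagining multiplying the two into a crude "hotspot" score. A toy sketch with invented numbers (in practice the complexity would come from SonarQube and the churn from git history):

```java
import java.util.Comparator;
import java.util.List;

// Toy "hotspot" ranking: complexity times churn. All inputs invented;
// real numbers would come from the SonarQube API and from git log.
record Unit(String path, int cyclomaticComplexity, int commitsLastTwoYears) {
    long score() { return (long) cyclomaticComplexity * commitsLastTwoYears; }
}

class HotspotRanking {
    public static void main(String[] args) {
        List<Unit> units = List.of(
                new Unit("OrderService.java", 37, 48),  // complex AND churning: test first
                new Unit("StringUtils.java", 4, 90),    // churning but trivial
                new Unit("ReportWriter.java", 25, 2));  // complex but frozen since 2020
        units.stream()
                .sorted(Comparator.comparingLong(Unit::score).reversed())
                .forEach(u -> System.out.println(u.score() + "\t" + u.path()));
    }
}
```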
What did I miss?

And what tools can help me find the most-changed files? (Git blame is useful, but only reports on a single file at a time. I'd like something that gives me a breakdown for the whole repo, at least.)
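The closest I've hacked together myself is counting commits per file -- the moral equivalent of `git log --format= --name-only | sort | uniq -c | sort -rn`, wrapped in Java so I can post-process it. A throwaway sketch (the date and the top-20 cutoff are arbitrary):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Map;

// Quick-and-dirty churn report: how many commits touched each file.
public class ChurnReport {
    public static void main(String[] args) throws Exception {
        // --format= suppresses commit headers; --name-only lists changed files.
        Process git = new ProcessBuilder(
                "git", "log", "--since=2020-01-01", "--format=", "--name-only")
                .start();
        Map<String, Integer> counts = new HashMap<>();
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(git.getInputStream()))) {
            String line;
            while ((line = out.readLine()) != null) {
                if (!line.isBlank()) {
                    counts.merge(line, 1, Integer::sum);
                }
            }
        }
        // Print the 20 most-changed files, highest churn first.
        counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(20)
                .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
    }
}
```

But surely there's a proper tool for this?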


cynic
Jan 19, 2004



Hippie Hedgehog posted:

My thinking right now is that two metrics will help me:
  • Complexity of the unit. Trivial code doesn't need unit tests as urgently as that behemoth which Sonar tells me has a cyclomatic complexity of 37.
  • Rate of change. Even complex or large code units don't need regression tests that urgently if nobody has changed them since 2020.

Metrics aren't really your best friend for sprawling untestable code bases. I'd say up front you should do a few things:

  • Locate easy wins: boilerplate and duplicated code (Sonar will flag this). Write common tests for it and you can bump your coverage massively in a week, while identifying the stuff you no longer have to worry about. In one code base I inherited, I wrote a pile of 'testGenericEntityGetterSetter' helpers so I could autogenerate `testGenericHashGetterSetter('variablename', $hash)` type calls (a sketch of the idea follows this list). That bumped a 12% codebase to 25% instantly: thousands of new assertions with minimal effort. Make sure everyone uses them, and encourage best practices by pointing out you doubled test coverage in a week.
  • Locate the truly fucky code by asking people who've worked with it. Find out what has fucked the codebase over in the last year. Find out what developers are scared of in the code, the stuff they triple-check before deploying. Locate the things that always come up in incident reports. Someone in the company knows where you need to look for these. It's going to be shitty writing tests here, but worth it.
  • Set a quality gate in Sonar at 80% on new code and fucking enforce it. All new code has tests. If it's not testable, you fucked up. If it's not testable because other code is fucked, then fix that code, and write tests around it first so you know you didn't break it. Coverage will creep upwards if you create an environment where coverage on new code is enforced.
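For the generic getter/setter tests, something like this JUnit 5 sketch (the helper and the entity class are invented here; point it at your real beans):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import org.junit.jupiter.api.Test;

// One reflective helper, reused against every boring entity class:
// set each writable property, read it back, assert it round-trips.
class GenericBeanTest {

    // Hypothetical entity standing in for the real boilerplate classes.
    static class Customer {
        private String name;
        private int age;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    static void assertGettersAndSettersRoundTrip(Object bean) throws Exception {
        for (PropertyDescriptor pd : Introspector
                .getBeanInfo(bean.getClass(), Object.class)
                .getPropertyDescriptors()) {
            if (pd.getReadMethod() == null || pd.getWriteMethod() == null) {
                continue; // skip read-only or write-only properties
            }
            Object sample = sampleValue(pd.getPropertyType());
            if (sample == null) {
                continue; // no sample value registered for this type
            }
            pd.getWriteMethod().invoke(bean, sample);
            assertEquals(sample, pd.getReadMethod().invoke(bean),
                    pd.getName() + " did not round-trip");
        }
    }

    // Minimal sample values; extend as your entities demand.
    static Object sampleValue(Class<?> type) {
        if (type == String.class) return "x";
        if (type == int.class || type == Integer.class) return 42;
        if (type == long.class || type == Long.class) return 42L;
        if (type == boolean.class || type == Boolean.class) return Boolean.TRUE;
        return null;
    }

    @Test
    void customerRoundTrips() throws Exception {
        assertGettersAndSettersRoundTrip(new Customer());
    }
}
```

One test class per entity is then a couple of lines, which is what makes it easy to autogenerate.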

Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Thanks for these pieces of advice!

Will definitely ask around about which code is "scary" to the devs. I have some ideas of my own after looking at it, too.

One good thing is this particular code base is not generally hard to write tests for. It's just that people have not taken the time for it.

Anyway, duplication doesn't seem like a huge problem right now, at least as far as Sonar indicates. I'm mostly concerned with some really complex methods that are nearly impossible to break up, because people mindlessly added special cases over a few years; those need new abstractions.


beauty queen breakdown
Dec 21, 2010

partially cromulent posting.
"2021's worst kept secret"


Hippie Hedgehog posted:

One good thing is this particular code base is not generally hard to write tests for. It's just that people have not taken the time for it.

If I had a nickel for every time I'd seen that one at a job, I'd have six nickels, etc. And that's not counting the number of times I've heard it, or something similar, from people in the industry I don't work with directly.

I would strongly second cynic's "quality gate" strategy -- a lot of this comes down to your build system. At a minimum: tests run on every pull request; code that fails its tests can't be merged, even by an administrator or in spooky emergency scenarios; and new code has to meet a minimum coverage percentage.

Combining that with the easy-win stuff should highlight areas for improvement -- often what I find while writing tests is not merely duplicated code but also code that's worth removing or refactoring as part of building out the test suite.

Obviously all of this is super easy to say. I'm rooting for you. :unsmith:

-- Last thing: obviously I'm a big proponent of Maven in these scenarios, but I notice you don't include that part of your build toolchain here. Anything in that space we could improve as part of the test push?

Hippie Hedgehog
Feb 19, 2007

Ever cuddled a hedgehog?
Right -- Maven is the build tool, I just didn't list it. Might as well have been Gradle or something else, though; I don't think Maven by itself will help me.

I think we're set up well for gating new code with SonarQube's coverage gates, so I'm not worried about that part. The new code is the easy part; what I'm looking for is the right place to start covering the legacy code. =)


beauty queen breakdown
Dec 21, 2010

partially cromulent posting.
"2021's worst kept secret"


The way Maven can help you here is by enforcing your code coverage and (with an artifact repository) helping you extract functionality at a dependency level via a release process, which may improve the architecture along the way. If you don't have an artifact repository hanging around (Nexus, GitHub Packages, Artifactory, etc.), feel free to ignore this portion of advice.

A solid if verbose example is the parent->child architecture with JaCoCo enforcement; cf. https://github.com/infrastructurebuilder/ibparent/blob/master/pom.xml#L1465 and a downstream project https://github.com/infrastructurebuilder/audit-reporting-maven-plugin/blob/develop/pom.xml#L31. Note how the use of a released parent lets you enforce this configuration consistently on downstream projects and track that the covered percentage goes up.

Additionally, this lets you pull things away from the legacy architecture with its boatload of bespoke cases. One thing I would look at is drilling into the 'complex methods' you mentioned: can they be pulled away as a dependency? If not, why not? Getting that code away from the main code base may give you architecture/refactoring ideas, and then you can write the tests covering such things in isolation. Of course, writing testable code isn't always the original goal of such things, so... yeah, daunting.
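To make "pull it away as a dependency" concrete, a toy sketch (all names invented here): put an interface in front of the behemoth, move the implementation into its own module, and test it there.

```java
import java.math.BigDecimal;

// Hypothetical domain type, a stand-in for whatever the real code passes around.
record Order(int quantity, BigDecimal unitPrice) {}

// Step 1: callers depend on this interface, not on the legacy class.
interface PriceCalculator {
    BigDecimal priceFor(Order order);
}

// Step 2: the concrete implementation moves into its own Maven module
// (its own released artifact), where it can be unit-tested in isolation.
// At first it's just the old method body, moved verbatim.
final class LegacyPriceCalculator implements PriceCalculator {
    @Override
    public BigDecimal priceFor(Order order) {
        // ...the 37-branch special-case logic lands here unchanged...
        return order.unitPrice().multiply(BigDecimal.valueOf(order.quantity()));
    }
}
```

Once it's a separate module, the JaCoCo gate inherited from the parent POM applies to it like any other artifact.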

Again I'm armchair quarterbacking here so don't take my advice as law, ymmv, etc.
