1st February 2021

Code that "just works" is not enough

Micro benchmarking

Colin Cachia

Software Engineering Team Lead

In the second instalment of Payments IQ, our Software Engineering Team Lead Colin Cachia explains how his team micro-benchmarks code performance to deliver industry-leading reliability and uptime for our clients. If you know a bit about Java, throughput and benchmarking, this one is for you!

Relying on decades of industry and engineering experience, Ixaris’ engineering teams build and maintain our payment solution technologies in-house. The code we produce is, first and foremost, functional to our clients’ requirements, but having code that “just works” is not enough.

We use several industry-standard practices to ensure that our software not only exceeds customers’ expectations but that it’s developed at a pace that reflects the speed at which the payments industry moves, and that it comfortably handles more load than client traffic demands.

At Ixaris, we implement the best standards in software design and use the latest tools available to ensure scalability and performance. Ixaris’ systems need to be able to handle whatever is thrown at them, all the time, such as large loads of traffic at peak times. Not only is downtime unacceptable for our clients, but lost traffic means lost business for us.

To measure our system performance, we primarily use Apache JMeter and Gatling to load test our applications. Load testing is the process of stressing the system with real-life production load conditions and measuring the ways that the system might respond to such stresses. Our whole system is tested from user-facing interfaces, such as API endpoints. Think of it like Twitter engineers testing the number of new tweets per second during the US Presidential Election, or Google engineers making sure that their systems can handle a surge in searches during a pandemic outbreak. The result of this testing is a macro benchmark.

Tools like Apache JMeter and Gatling enable engineers to understand how a whole system responds to different loads. However, they do little to show engineers how a specific unit of the codebase handles scale. To dive deeper into this, we use profiling tools like New Relic and VisualVM, which reveal where system bottlenecks are. Although profiling may give answers about a particular class or method, it is mainly intended to diagnose issues within the system at runtime. It’s not intended to ensure that no new performance issues are introduced over time (such as in a Continuous Integration (CI) environment).

Java Microbenchmark Harness (JMH) to the rescue

To tackle these challenges we use the Java Microbenchmark Harness (JMH). This tool performs benchmarks on a piece of code while taking care of typical scenarios in a JVM (such as warm-up time and code optimization paths) that might skew the benchmark results. It is developed as part of the OpenJDK project, is added to a build like any other library dependency, and makes benchmarking much simpler. If you have a somewhat technical background (and if you’re this far in, you probably do!) you can think of this tool as having the same simplicity as writing a JUnit test. In fact, its setup is largely annotation-based, with some annotations similar in concept to what JUnit offers, like @Setup, @Param and @Benchmark.

We configure JMH for several benchmark testing scenarios like throughput, average time (latency), sample time and even single-shot time using the @BenchmarkMode annotation. We can even configure the number of threads to be used when benchmarking with @Threads.

Enough with the talk, show us the code!

Let’s say we want to benchmark the code required to replace some text in a String, based on a regex. We could compare two approaches that both achieve this: directly via the String#replaceAll method, or via Matcher#replaceAll on a precompiled Pattern.

The benchmark runs four iterations, preceded by one warm-up phase that gives the JIT (Just In Time) compiler the chance to optimise the code before measurement starts. The code takes the “dogcatdog” string and replaces the term “cat”, along with whatever comes after it, with empty text. We’ll benchmark the throughput that both approaches give us.
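
As a minimal sketch, the two approaches being compared could look like this in plain Java. The regex cat.*, the class name and the method names are assumptions made for illustration. In a JMH benchmark class, each of the two methods would carry the @Benchmark annotation and the class would be configured for throughput mode with @BenchmarkMode.

```java
import java.util.regex.Pattern;

// Hypothetical class and method names, used only for illustration.
class ReplaceApproaches {
    // Assumed input and regex, inferred from the description above.
    static final String INPUT = "dogcatdog";
    static final String REGEX = "cat.*";

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    static final Pattern PATTERN = Pattern.compile(REGEX);

    // Approach 1: String#replaceAll compiles REGEX on every call.
    // (In a JMH class this method would be annotated with @Benchmark.)
    static String viaString(String input) {
        return input.replaceAll(REGEX, "");
    }

    // Approach 2: reuse the precompiled Pattern, so each call only
    // creates a Matcher. (Also a @Benchmark method under JMH.)
    static String viaMatcher(String input) {
        return PATTERN.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(viaString(INPUT));  // prints "dog"
        System.out.println(viaMatcher(INPUT)); // prints "dog"
    }
}
```

Both methods return the same text, so any difference measured between them comes from how the regex is prepared, not from the replacement itself.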


The result: the Matcher approach delivers more than double the throughput of the String approach (8,549,559 operations per second vs. 3,554,781 operations per second). This is no surprise. Every time we use String#replaceAll, a new regex pattern is compiled. With the Matcher approach, we avoid this extra computation by pre-compiling the regex pattern once and reusing it.

This is just scratching the surface of what JMH can do, since JMH is adaptable to even more stringent scenarios. The outcome? Engineers can get instant benchmarks on a particular area of the system or a particular algorithm with a simple setup, as easy as writing a simple unit test. As it should be.
