Performance

Tackler-CLI can parse and process one million (1E6) transactions in under 20 seconds, at a rate of about 56 000 transactions per second. The Server API can process the same 1E6 transaction set in about 1.3 seconds when it is cached in memory.

Tackler is performance tested to validate the sanity of the algorithms used and the overall performance of the system.

These are ballpark figures and should be treated as such. The test method contains serious inaccuracies, especially with the smaller test sets (JVM load time, JIT warm-up, etc.). However, it is good enough to validate the selected algorithms and the overall memory usage with an increasing workload.

Performance test data is generated with the generator tool.

Test results

The test sets contain 1E3, 1E4, 1E5, or 1E6 transactions. Tests are run for all report types and all output formats. If the version under test supports filtering, all of the above are also run with filtering enabled.

The Git Storage backend is performance tested with the same data sets as the filesystem-based storage. The Git Storage backend has the same or better performance than the filesystem backend, but with lower CPU utilization.

Results for HW00: Quad core system

This is a low-end server system with a quad-core CPU (four cores, eight threads). All journal data can be cached in memory.

Balance report, with CPU utilization around 540%:

  • 1E3 txns: 2 sec, 255MB

  • 1E4 txns: 3 sec, 500MB

  • 1E5 txns: 5 sec, 2.3GB

  • 1E6 txns: 18 sec, 4.2GB, 56000 txn/s

Results for HW01: Dual core system

This is a normal-ish laptop system with a dual-core CPU (two cores, four threads). All journal data can be cached in memory.

Balance report, with CPU utilization around 322%:

  • 1E3 txns: 3 sec, 220MB

  • 1E4 txns: 4 sec, 470MB

  • 1E5 txns: 8 sec, 2.3GB

  • 1E6 txns: 32 sec, 4.3GB, 31000 txn/s

Test data

Performance testing is done with artificial transaction data which mimics a real journal. The test data is generated with the generator tool, which can also generate journal formats compatible with ledger-like tools.

Tests are done with all report and format types. Account validation is turned on (i.e. accounts.strict=true). This means that for every transaction it is checked that all of the txn's accounts are listed in the Chart of Accounts.
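
Conceptually, the strict check works as sketched below. This is illustrative Scala with a hypothetical, heavily simplified txn model, not Tackler's actual implementation:

// Sketch of strict account validation: every account used by a
// txn's postings must be listed in the Chart of Accounts.
// Posting/Txn are hypothetical, simplified models for illustration.
final case class Posting(account: String)
final case class Txn(postings: Seq[Posting])

def validateAccounts(coa: Set[String], txns: Seq[Txn]): Either[String, Unit] = {
  val unknown = txns.iterator
    .flatMap(_.postings)
    .map(_.account)
    .filterNot(coa.contains)
    .toSet
  if (unknown.isEmpty) Right(())
  else Left("Accounts not in Chart of Accounts: " + unknown.mkString(", "))
}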

Each transaction is located in its own file, and the files are sharded based on txn dates (e.g. perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn, where idx is the index of the txn).
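
For illustration, the shard path above could be derived from a txn's timestamp and index as sketched below; the function name is hypothetical and not part of Tackler:

import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Sketch: build the shard path perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn
def shardPath(setName: String, ts: LocalDateTime, idx: Long): String = {
  val dir = ts.format(DateTimeFormatter.ofPattern("yyyy/MM/dd"))
  val stamp = ts.format(DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss"))
  s"$setName/$dir/$stamp-$idx.txn"
}

// shardPath("perf-1E6", LocalDateTime.of(2016, 1, 1, 0, 0, 0), 1)
//   => "perf-1E6/2016/01/01/20160101T000000-1.txn"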

The test sets (from small to big) are:

  • 1E3 (1 000) transactions

  • 1E4 (10 000) transactions

  • 1E5 (100 000) transactions

  • 1E6 (1 000 000) transactions

Chart of Accounts for perf tests

The Chart of Accounts has 378 entries, and it is generated based on the txns' dates:

For "assets" following structure is used a:ay<year>:am<month>.

This yields 12 different accounts:

...
"a:ay2016:am01",
"a:ay2016:am02",
"a:ay2016:am03",
...

For "expenses" following structure is used e:ey<year>:em<month>:ed<day>.

This yields 366 different accounts:

...
"e:ey2016:em01:ed01",
"e:ey2016:em01:ed02",
"e:ey2016:em01:ed03",
...
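
The structure above (12 asset accounts plus 366 expense accounts for the leap year 2016, 378 entries in total) can be reproduced with a short sketch; this is illustrative Scala based on the rules above, not the actual generator:

import java.time.LocalDate

// Sketch: generate the 378-entry Chart of Accounts for 2016
// (12 "assets" accounts + 366 "expenses" accounts; 2016 is a leap year)
val year = 2016
val assetAccounts = (1 to 12).map(m => f"a:ay$year:am$m%02d")
val expenseAccounts = Iterator
  .iterate(LocalDate.of(year, 1, 1))(_.plusDays(1))
  .takeWhile(_.getYear == year)
  .map(d => f"e:ey$year:em${d.getMonthValue}%02d:ed${d.getDayOfMonth}%02d")
  .toSeq

assert(assetAccounts.size + expenseAccounts.size == 378)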

Single file or single input String

With a single file, Tackler has basically the same processing time as with sharded txn data.

However, with 1E6 txns in a single input file or in one single string (about 73MB), Tackler has non-optimal memory usage behaviour, and its memory usage peaks at around 6-7GB.

To avoid this, do not put one million (1E6) transactions into a single journal. Use a sharded data set instead, for example a collection of ten files with one hundred thousand transactions in each (see the sketch below).
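
Such sharding could look like the sketch below, assuming the txns are already available as formatted journal strings; the helper is hypothetical and not part of Tackler:

import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Sketch: write formatted txn strings into ten shard files
// (e.g. 1E6 txns => ten files of 1E5 txns each)
def writeShards(txns: Seq[String], dir: String, shards: Int = 10): Unit = {
  Files.createDirectories(Paths.get(dir))
  txns.grouped(txns.size / shards).zipWithIndex.foreach { case (chunk, i) =>
    val path = Paths.get(dir, f"shard-$i%02d.journal")
    Files.write(path, chunk.mkString("\n\n").getBytes(StandardCharsets.UTF_8))
  }
}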

A sharded data set could also be structured so that there is only one transaction per file. This is the basic mode for the performance tests, and this setup has basically the same processing time as putting all transactions into one big journal file (with a fast SSD and a good disk cache).

Single file performance

Tackler's processing time with a single file (1E6 txns) is around 30 sec, and memory usage peaks at around 6-7GB. Single file performance testing is not part of routine performance testing.