Performance

Processing a journal of one thousand (1_000) transactions takes about 20 ms (0.02 seconds), and processing ten thousand (10_000) transactions takes about 0.1 seconds.

Tackler-NG can parse and process one million (1_000_000) transactions in around 6 to 12 seconds, with a processing speed of 120 000 to 250 000 transactions per second on a normal laptop computer.

Test results

Tackler is performance tested to validate the sanity of the algorithms used and the overall performance of the system. These are ballpark figures and should be treated as such.

Artificial data is used for the performance tests. The test sets contain 1E3, 1E4, 1E5 and 1E6 transactions. Tests are run for all report types, both with and without transaction filtering.

Both filesystem and Git based storage are tested.

The Git storage backend is performance tested with the same data sets as the filesystem based storage. The Git storage backend performs as well as or better than the filesystem backend.

Each report type is run five times. The fastest and slowest runs are removed from the results, and the remaining three values are averaged.
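The averaging rule for the five runs can be sketched as follows. This is a minimal illustration, not the project's actual test harness: the function name `trimmed_average` and the sample timings are hypothetical.

```python
def trimmed_average(run_times):
    """Average of the runs left after dropping the fastest and the slowest."""
    if len(run_times) < 3:
        raise ValueError("need at least three runs")
    # Sorting puts the fastest run first and the slowest last.
    kept = sorted(run_times)[1:-1]
    return sum(kept) / len(kept)


# Hypothetical timings (seconds) for five runs of one report type.
runs = [11.4, 10.9, 11.0, 12.3, 11.1]
average = trimmed_average(runs)  # mean of 11.0, 11.1 and 11.4
```

Dropping the extremes keeps a single unusually slow or fast run (for example a cold cache) from skewing the reported figure.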

The test setup and full results are described in the Performance Readme document.

Results for HW02: Quad core system

This system is a laptop with a quad-core CPU, which has four cores and eight threads. All journal data can be cached in memory.

The initial Tackler-NG implementation currently runs on a single thread and a single core.

Balance report, CPU utilization is around 100%

  • 1E3 txns: 0.02 sec, 35 MB

  • 1E4 txns: 0.11 sec, 44 MB

  • 1E5 txns: 3.28 sec, 124 MB

  • 1E6 txns: 11 sec, 955 MB

Full results: Perf HW02
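As a rough cross-check, the throughput implied by the balance report timings listed above can be computed directly. This is plain arithmetic on those four figures only; the helper name `throughput` is hypothetical, and other report types or hardware may give different rates.

```python
# Balance report timings from the HW02 list: transaction count -> seconds.
balance_results = {
    1_000: 0.02,
    10_000: 0.11,
    100_000: 3.28,
    1_000_000: 11.0,
}


def throughput(txn_count, seconds):
    """Transactions processed per second."""
    return txn_count / seconds


for count, secs in balance_results.items():
    print(f"{count:>9} txns: {throughput(count, secs):>10.0f} txn/s")
```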

Test data

Performance testing is done with artificial transaction data which mimics a real journal. The test data is generated with a generator, which can also generate ledger-like journal formats.

Tests are done with all report and format types. Account validation is turned on (strict.mode=true), so every account is checked to be defined in the Chart of Accounts.

Each transaction is stored in its own file, and the transaction journals are sharded by transaction date (for example, one transaction would be perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn, where idx is the index of the txn).
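The path layout can be illustrated with a short sketch. It is not the actual test data generator: the function name `shard_path` is hypothetical, and the index is written without zero padding, which is an assumption because the source does not specify the format of idx.

```python
from datetime import datetime
from pathlib import PurePosixPath


def shard_path(root, timestamp, idx):
    """Build root/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn for one transaction."""
    day_dir = timestamp.strftime("%Y/%m/%d")
    stamp = timestamp.strftime("%Y%m%dT%H%M%S")
    # Assumption: idx is used as-is, without zero padding.
    return PurePosixPath(root) / day_dir / f"{stamp}-{idx}.txn"


path = shard_path("perf-1E6", datetime(2016, 1, 2, 3, 4, 5), 42)
print(path)
```

With this layout, all transactions for one day end up in the same directory, so no single directory has to hold the whole set.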

The test sets, from small to big, are:

  • 1E3 (1 000) transactions

  • 1E4 (10 000) transactions

  • 1E5 (100 000) transactions

  • 1E6 (1 000 000) transactions

Chart of Accounts for perf tests

The Chart of Accounts has 378 entries and is generated from the transaction dates:

For "assets", the following structure is used: a:ay<year>:am<month>.

This yields 12 different accounts:

...
"a:ay2016:am01",
"a:ay2016:am02",
"a:ay2016:am03",
...

For "expenses", the following structure is used: e:ey<year>:em<month>:ed<day>.

This yields 366 different accounts:

...
"e:ey2016:em01:ed01",
"e:ey2016:em01:ed02",
"e:ey2016:em01:ed03",
...
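The two naming patterns can be reproduced with a short sketch that also confirms the total of 378 entries. It assumes all test transactions fall within 2016 (a leap year, as in the examples above) and that there is one expense account per calendar day. The function name `chart_of_accounts` is hypothetical and this is not the project's generator.

```python
import calendar


def chart_of_accounts(year):
    """One asset account per month and one expense account per day."""
    assets = [f"a:ay{year}:am{month:02d}" for month in range(1, 13)]
    expenses = [
        f"e:ey{year}:em{month:02d}:ed{day:02d}"
        for month in range(1, 13)
        # monthrange returns (weekday of first day, number of days in month).
        for day in range(1, calendar.monthrange(year, month)[1] + 1)
    ]
    return assets + expenses


accounts = chart_of_accounts(2016)
print(len(accounts))  # 12 asset accounts + 366 expense accounts
```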

Single file or single input String

Tackler has the same processing speed regardless of how the input is structured: the processing time is the same for one big journal file as it is for sharded transaction directories.

However, to optimize Tackler's memory usage, transaction sets larger than one hundred thousand (1E5) transactions should be split over multiple files and use the FS or Git storage system with a sharded directory structure. A common way to split transaction journals is by date or by producer (each producer has its own subtree of transactions or journal files).