Performance
Processing a journal of one thousand transactions takes about 50 ms (0.05 seconds), and ten thousand transactions take about half a second.
Tackler-NG can parse and process one million (1E6) transactions in around 15 to 30 seconds, with a processing speed of 40 000 to 100 000 transactions per second on a normal laptop computer.
Test results
Tackler is performance tested to validate the sanity of the algorithms used and the overall performance of the system. These are ballpark figures and should be treated as such.
Artificial data is used for the performance tests. The test sets contain 1E3, 1E4, 1E5 and 1E6 transactions. Tests are run for all report types, both with and without transaction filtering.
Both filesystem and Git based storage are tested. The Git storage backend is performance tested with the same data sets as the filesystem based storage, and it has the same or better performance than the filesystem backend.
There are five runs for each report type; the fastest and slowest runs are dropped from the results, and the average is calculated from the remaining three values.
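The averaging itself is simple. As a minimal sketch of the idea (this is not Tackler's actual benchmark harness, and the run times below are made up):

```rust
// Drop the fastest and slowest of five run times, average the remaining three.
fn trimmed_average(mut runs: [f64; 5]) -> f64 {
    runs.sort_by(|a, b| a.partial_cmp(b).expect("run times must be comparable"));
    // After sorting, indices 1..4 hold the three middle values.
    runs[1..4].iter().sum::<f64>() / 3.0
}

fn main() {
    // Hypothetical run times in seconds for one report type.
    let runs = [0.051, 0.048, 0.050, 0.049, 0.062];
    println!("average: {:.3} s", trimmed_average(runs));
}
```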
The test setup and full results are described in the Performance Readme document.
Results for HW02: Quad core system
This system is a laptop with a quad-core CPU: four cores and eight threads. All journal data can be cached in memory.
The initial Tackler-NG implementation is single-threaded and uses a single core at the moment.
Balance report, CPU utilization is around 100%:
- 1E3 txns: 0.04 sec, 12 MB
- 1E4 txns: 0.32 sec, 22 MB
- 1E5 txns: 3.28 sec, 100 MB
- 1E6 txns: 33 sec, 800 MB
Full results: Perf HW02
Test data
Performance testing is done with artificial transaction data which mimics a real journal. The test data is created with a generator, which can also produce ledger-like compatible journal formats.
Tests are done with all report and format types. Account validation is turned on (e.g. strict.mode=true), so every account is checked to be defined in the Chart of Accounts.
Each transaction is located in its own file, and the sharding of transaction journals is based on txn dates (e.g. one transaction would be perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn, where idx is the index of the txn).
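For illustration, a minimal sketch of building such a date-based shard path (the function name and arguments are hypothetical, not part of Tackler's API or of the test data generator):

```rust
// Build a shard path like perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn
// from a timestamp and a transaction index.
fn shard_path(year: u32, month: u32, day: u32, hms: &str, idx: u64) -> String {
    format!(
        "perf-1E6/{year:04}/{month:02}/{day:02}/{year:04}{month:02}{day:02}T{hms}-{idx}.txn"
    )
}

fn main() {
    // e.g. perf-1E6/2016/03/14/20160314T120000-42.txn
    println!("{}", shard_path(2016, 3, 14, "120000", 42));
}
```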
The test sets used (small and big) are:
- 1E3 (1 000) transactions
- 1E4 (10 000) transactions
- 1E5 (100 000) transactions
- 1E6 (1 000 000) transactions
Chart of Accounts for perf tests
The Chart of Accounts has 378 entries and it is generated based on the txns' dates:
For "assets" following structure is used a:ay<year>:am<month>
.
This yields to 12 different accounts:
... "a:ay2016:am01", "a:ay2016:am02", "a:ay2016:am03", ...
For "expenses" following structure is used e:ey<year>:em<month>:ed<day>
.
This yields to 366 different accounts:
... "e:ey2016:em01:ed01", "e:ey2016:em01:ed02", "e:ey2016:em01:ed03", ...
Single file or single input String
Tackler has the same processing speed regardless of how the input is structured, e.g. the processing time is the same for one big single journal file as it is for sharded transaction directories.
However, to optimize Tackler's memory usage, bigger transaction sets of over one hundred thousand (1E5) transactions should be split over multiple files, using the FS or Git storage system with a shard directory structure. One common way to split transaction journals is by dates, or by producers (each having their own subtree of transactions or journal files).
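For illustration, a date-sharded journal tree following the scheme above might look like this (directory root, file names and counts are made up):

```
journal/
└── 2016/
    ├── 01/
    │   ├── 01/
    │   │   ├── 20160101T090000-1.txn
    │   │   └── 20160101T153000-2.txn
    │   └── 02/
    │       └── 20160102T101500-3.txn
    └── 02/
        └── ...
```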