Performance
Tackler-NG has excellent performance and it can process from 250_000 to 700_000 txns per second on modern laptop.
What does it mean to process 250_000 txns per sec?
- 0.03 seconds (30 milliseconds)
-
Journal with 3 transactions, in every day, for one year - that’s journal with one thousand (1 000) transactions.
- 0.15 seconds
-
Journal with 30 transactions, in every day, for one year - that’s journal with ten thousands (10 000) transactions.
- 2 seconds
-
Journal with a transaction in every minute, for one year - that’s journal with about 500 000 transactions.
These times are real wall clock values.
At the same time tackler can easily process journals with over one million transactions.
A million transaction journal is comparable to 3 000 transactions per day for one year. Or 100 transactions in every hour, every day, for one year.
Performance benchmarks
Typically tackler is 1.5-2x faster than ledger (C++), and 15-20x faster than hledger (Haskell) and 30-40x times faster than beancount (C++ and Python). At the same time, it has modest memory footprint, which is 2-4x smaller than memory usage of those other tools.
Set Size | Macbook Pro M3 | i5-8250U @ 1.60GHz | ||
---|---|---|---|---|
1e3 |
0.03 sec |
5 MB |
0.02 sec |
12 MB |
1e4 |
0.05 sec |
18 MB |
0.15 sec |
24 MB |
1e5 |
0.17 sec |
128 MB |
0.40 sec |
135 MB |
1e6 |
1.41 sec |
1450 MB |
3.95 sec |
1470 MB |
Performance |
+700_000 txn / s |
+250_000 txn / s |
You can easily check the performance on your own system with pta-generator tool.
Test results
Tackler is performance tested to validate sanity of used algorithms and overall performance of system. These are ballpark figures and should be treated as such.
Artificial test data is used for performance test data. Used test sets are sets of 1E3, 1E4, 1E5 and 1E6 transactions. Tests are run for all report types, and also with transaction filtering and without.
Also Filesystem and Git based storage are tested.
Git Storage backend is performance tested with same data sets than filesystem based storage. Git Storage backend has same or better performance as filesystem backend.
There are five runs for each report type, the fastest and slowest run are removed from results, and then the remaining three values are used to calculate average.
Test setup and full results are described on Performance Readme document.
Test data
Performance testing is done with artificial transaction data which mimics real journal. Test data is generated with pta-generator, which can also generate ledger and beancount compatible journal formats.
Tests are done with all report and format types. Account validation is turned on, (e.g. strict.mode=true
),
so all accounts are checked that they are defined in Chart of Accounts.
Each transaction is located on own file, and sharding of transaction journals is based on txn dates
(e.g. one transaction would be perf-1E6/YYYY/MM/DD/YYYYMMDDTHHMMSS-idx.txn
, where idx
is index of txn).
Used test sets (small and big) are:
-
1E3 (1 000) transactions
-
1E4 (10 000) transactions
-
1E5 (100 000) transactions
-
1E6 (1 000 000) transactions
Chart of Accounts for perf tests
Chart of Accounts has 378 entries and it is generated based on txn’s dates:
For "assets" following structure is used a:ay<year>:am<month>
.
This yields to 12 different accounts:
... "a:ay2016:am01", "a:ay2016:am02", "a:ay2016:am03", ...
For "expenses" following structure is used e:ey<year>:em<month>:ed<day>
.
This yields to 366 different accounts:
... "e:ey2016:em01:ed01", "e:ey2016:em01:ed02", "e:ey2016:em01:ed03", ...
Single file or single input String
Tackler has same processing speed regardless how input is structured, e.g. the processing time is same for one big single journal file as it is for sharded transaction directories.
However, to optimize memory usage of tackler, bigger transactions sets over one hundred thousands (1E5) transactions should split over multiple files, and use FS or GIT storage system with shard directory structure. The one common way to split transaction journals is based on dates, or by producers (each have they own subtree of transactions or journal files which to use).