I use a helper script to run a sequence of tests. For all tests there is either one table shared by all clients or a separate table per client. Some storage engines suffer more from concurrency within a table. For MySQL each table has a primary key index up to 3 secondary indexes. I configure the tests to use 3 secondary indexes. The inserts during the load are in PK order but random for the secondary indexes. InnoDB benefits from the change buffer. MyRocks and MongoRocks benefit from read-free secondary index maintenance. MongoRocks and WiredTiger suffer from the hidden index used by MongoDB for engines that use clustered indexes, but the overhead is less with the insert benchmark than for Linkbench. The tests are:
- load - this is the first test and uses N clients to load into the test table(s).
- q1000 - this is the second test and uses N query clients and N writer clients. Each writer client is rate limited to 1000 inserts/second. With some storage engines the writer clients are unable to sustain that rate. With N writer clients the global write rate should be N*1000 inserts/second. The query clients perform range scans as fast as possible. The usage of tables is the same as
- q100 - this is the third test and uses N query clients and N writer clients. Each writer client is rate limited to 100 inserts/second. With N writer clients the global write rate should be N*100 inserts/second. The query clients perform range scans as fast as possible.
bash iq.sh innodb "" ~/b/orig801/bin/mysql /data/m/my/data nvme0n1 1 1 no no no 0 no 500000000
When it finishes there will be three directories named l, q1000 and q100. In each directory there is a file named o.res.$something that has performance and efficiency metrics. The l directory has results from the load step. The q1000 and q100 directories have results from the write+query step where each writer is rate limited to 1000/second and 100/second. An example o.res.$something file for a test with 500M rows and 1 client is here for the load test and here for the q1000 test. Each file has a section with efficiency metrics normalized by the insert rate and then efficiency metrics normalized by the query rate. For the load test only the first section is interesting. For the q1000 test both sections are interesting. For the q100 test only the second section is interesting because the insert rates are too low.
The first section starts with the text iostat, vmstat normalized by insert rate and the second section starts with the text iostat, vmstat normalized by query rate. They both have similar data -- rates from iostat and vmstat normalized by the insert and query rate. An example is:
iostat, vmstat normalized by insert rate
samp r/s rkb/s wkb/s r/q rkb/q wkb/q ips spi
501 3406.8 54508 91964 3.410 54.563 92.057 999.0 0.100100
samp cs/s cpu/c cs/q cpu/q
525 16228 17.3 16.244 0.017276
iostat, vmstat normalized by query rate
samp r/s rkb/s wkb/s r/q rkb/q wkb/q qps spq
501 3406.8 54508 91964 4.023 64.370 108.602 846.8 0.118092
samp cs/s cpu/c cs/q cpu/q
525 16228 17.3 19.164 0.02038
The metric names are:
- samp - number of iostat or vmstat samples collected during the test
- r/s, rkb/s, wkb/s - average values for iostat r/s, rKB/s and wKB/s during the test
- r/q - iostat r/s divided by the insert or query rate
- rKB/q, wKB/q - iostat rKB/s and wKB/s divided by the insert or query rate
- ips, qps - average insert and query rate
- spi, spq - seconds per insert, seconds per query - average response time for inserts and queries
- cs/s - average vmstat cs/s rate (cs is context switch)
- cpu/c - average CPU utilization from vmstat us & sy (user & system)
- cs/q - context switches per insert or per query
- cpu/q - CPU utilization divided by insert or query rate then multiplied by a constant. This is only comparable between servers with the same CPU count.