GIMP's tile cache suffers from a few minor flaws with major practical implications:
In addition to fixes offered for these implementation flaws, I've also constructed patches to add additional tile-cache profiling output and internal tile-cache consistency checking. Automated test scripts are also provided to collect profiling data.
The idle swapper relies on gtk idle processing to determine when to work. Unfortunately, an idle UI does not necessarily imply that GIMP is also idle. The idle processor will also wake up and work during periods of heavy computation and cache usage. When the cache is under heavy pressure (performing large numbers of flushes and swaps), idle swapper activity simply adds to the disk load as it tosses its own disk requests into the queue. It will also usually be writing or accessing non-sequential data, that is, generating more than its share of seek delays.
Perhaps for this reason, the idle swapper's performance is utterly strangled. It will attempt to flush at most only four tiles a second, that is, 64kB/s for an RGBA image. This figure is so low that any performance enahnacement provided by the idle swapper is lost in the collection noise (assuming that the idle swapper is not, in fact, causing a net performance penalty!)
GIMP currently uses a tile cache flush strategy that will never flush a 'dirty' tile [that is, a tile with changes waiting to be written to disk] if the cache contains at least a single 'clean' tile. Unfortunately, the cache will only flush a dirty tile to disk or remove it from the cache in one of three cases:
The idle swapper cannot keep up with the rate at which dirty tiles can be generated either by interactive drawing or monolithic transforms. Maximum undo depth may be shallower than the cache depth, but does not account for temporary working space. In practice, it will not reliably provide any breathing room.
The last case means that, in practice, the tile cache becomes completely full of 'dirty' destination tiles [because they are never flushed] and a single tile cache slot, if any, is used for 'source' tiles read in fresh from disk. Because it will be the only clean tile in the cache, it will be flushed as soon as the next clean tile is read. Once the cache fills, nearly all source tiles will be read from the swap-file every time they are needed. The profiling data below supports this scenario:
This data was collected using the script 'tilecache-test1' from the test scripts below. We can see that once the tile-cache is no longer large enough to hold the entirety of the source image, the destination image and all intermediate tile state, performance decays rapidly with the bulk of tile-cache activity consisting of repeatedly reading the same tiles in from disk over and over again. For example, the profiling output from the first iteration, using an 8M tile cache:
Peak Tile usage: 1090 Tile structs Total tiles swapped out to disk: 7079 Unique tiles swapped out to disk: 6070 Total tiles swapped in from disk: 2440622 Unique tiles swapped in from disk: 4049 Tiles swapped out by idle swapper: 86 Total seeks during swapping: 1836013 Total time spent in swap: 32.284776 seconds Total zorched tiles: 2447682 Total zorched tiles swapped out: 6993 Total zorched tiles swapped back in: 2440622 Total zorched tiles wasted after swapping out: 2504 Total interactive swap/cache delay: 36.798448 seconds
These patches are intended to be applied in the following order.
patch: 0001-Minor-change-to-TILE_DATA_POINTER-that-restricts-TIL.patch
This is a very minor 'sneak-it-in' sort of performance tweak, broken out as a seperate patch so that it's more visible. It eliminates two integer divisions (well-- it eliminates the still needlessly complex assembly resulting from optimizing out the two integer divisions) during tile data pointer lookups. It's a few-percent of 'free' performance for some transforms (such as in the new resampler that's in progress).
It has one caveat: it restricts choice of TILE_WIDTH and TILE_HEIGHT to powers of two.
patch: 0002-Add-additional-profiling-to-tile-usage-in-order-to-a.patch
This patch adds profiling collection and output to the tile cache, as well as fixing up the bitrotted tile profiling code that already existed but no longer compiles. To use it, either add '-DTILE_PROFILING' to the environment CFLAGS before running configure, or uncomment the #define at the top of app/base/tile-private.h
patch: 0003-Replace-two-list-flush-clean-first-cache-strategy-wi.patch
Replaces the 'flush clean first' strategy with a simple 'flush least recently used' strategy, that is, a simple flush FIFO that always flushes from the tail, and places any newly released tile at the head. Although LRU is not a particularly clever caching strategy, it is significantly better than 'clean-first'. The measured cache performance improvemnt [when the cache is under heavy pressure] was over 7x for the interactive line-drawing test and greater than 100x for the monolithic transform test.
Profiling output from the patched tile cache [same script test as quoted above] shows that the read starvation is corrected:
Peak Tile usage: 1089 Tile structs Total tiles swapped out to disk: 7626 Unique tiles swapped out to disk: 6088 Total tiles swapped in from disk: 10540 Unique tiles swapped in from disk: 4353 Tiles swapped out by idle swapper: 1452 Total seeks during swapping: 9679 Total time spent in swap: 0.246506 seconds Total zorched tiles: 17084 Total zorched tiles swapped out: 6174 Total zorched tiles swapped back in: 10540 Total zorched tiles wasted after swapping out: 1505 Total interactive swap/cache delay: 0.273391 seconds
0004-Correct-startup-flaw-in-idle-swapper-start-Don-t-wat.patch
This patch makes three changes to the idle swapper:
test scripts: gimp-tilecache-scripts.tgz
browse contents: gimp-tests/
I've written two automated test scripts to profile the tile cache changes. The first script runs three monolithic transforms in order (scale to 1500x1500, rotate, then perspective) with 8M, 16M, 32M, 64M and 128M title cache sizes. The second draws 1000 random lines with the paintbrush, ten connected lines at a time, in a growing pattern for each of the same tile cache sizes.
The scripts require the batch mode fix to work, a reasonably modern perl install, and a gimp compiled with TILE_PROFILING defined. Simply unpack the tests and run, eg:
gimp-tests/tilecache-test1 /path/to/gimp base-name-for-logfile
...and results of the test run will be put in gimp-tests/results.
Happy hacking!