mmap on slow I/O devices can block unrelated thread creation for several seconds
Josh Radel
02/17/2010 1:08 PM
post47551
When a thread calls mmap() on a file that resides on a slow I/O device, procnto can block other, unrelated threads in
the same process for the duration of the mmap() call. On a slow medium such as NOR flash, that can be several seconds if
the flash filesystem happens to be doing a reclaim. I have not looked at procnto’s code, but my guess is that the process
address map is locked for the duration of the mmap() call, so anything else that needs to modify the address map (such
as creating a new thread) is blocked until the mmap() completes.
Attached is a simple test case that demonstrates the problem. "mmap_test.zip" contains a simple program (mmap_test) that
spawns two continuously working threads: a low-priority (5) write thread that repeatedly opens a file, mmaps, modifies,
msyncs, and munmaps it, then closes it; and a high-priority (40) thread that repeatedly spawns and joins a temporary
thread (which exits as soon as it is created). The high-priority thread also sleeps for a few milliseconds every few
loops so that it doesn't starve everything else.
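For readers without the attachment, the two loop bodies described above might look roughly like the sketch below. This is not the code from mmap_test.zip: the function names, the 4096-byte mapping size, the SCHED_RR policy, and the "2 ms every 8 loops" pause are all assumptions made for illustration.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define MAP_LEN 4096 /* assumed mapping size; the original test's size is not stated */

/* One writer iteration: open, mmap, modify, msync, munmap, close.
 * Returns 0 on success, -1 on any failure. */
int mmap_write_once(const char *path, unsigned char fill)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, MAP_LEN) == -1) {      /* make sure the file backs the whole mapping */
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memset(p, fill, MAP_LEN);                /* the "modify" step */
    int rc = msync(p, MAP_LEN, MS_SYNC);     /* push the change to the device */
    if (munmap(p, MAP_LEN) == -1)
        rc = -1;
    if (close(fd) == -1)
        rc = -1;
    return rc;
}

/* Temporary thread: exits as soon as it starts. */
void *temp_thread(void *arg) { return arg; }

/* One spawner iteration: create the temporary thread and join it. */
int create_join_once(void)
{
    pthread_t t;
    if (pthread_create(&t, NULL, temp_thread, NULL) != 0)
        return -1;
    return pthread_join(t, NULL) == 0 ? 0 : -1;
}

/* Spawner loop: pause 2 ms every 8 iterations (both values assumed). */
int spawn_loop(int iterations)
{
    const struct timespec pause = { 0, 2000000L };
    for (int i = 0; i < iterations; i++) {
        if (create_join_once() != 0)
            return -1;
        if (i % 8 == 7)
            nanosleep(&pause, NULL);
    }
    return 0;
}

/* Build thread attributes with an explicit priority (5 for the writer,
 * 40 for the spawner). SCHED_RR is an assumption, not taken from the test. */
int make_prio_attr(pthread_attr_t *attr, int prio)
{
    struct sched_param sp;
    memset(&sp, 0, sizeof sp);
    sp.sched_priority = prio;
    if (pthread_attr_init(attr) != 0)
        return -1;
    if (pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED) != 0 ||
        pthread_attr_setschedpolicy(attr, SCHED_RR) != 0 ||
        pthread_attr_setschedparam(attr, &sp) != 0) {
        pthread_attr_destroy(attr);
        return -1;
    }
    return 0;
}
```

A driver would start one thread that calls mmap_write_once() in a loop using the attributes from make_prio_attr(&attr, 5), and a second thread that runs spawn_loop() using make_prio_attr(&attr, 40).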
Inside the zip file is an eight-second trace of this running on a PPC SMP platform, with the file being written
residing on a NOR flash device. In the trace, mmap_test’s create_destroy_thr is the high-priority thread that should be
creating and joining the temporary thread that immediately exits (Thread 4); instead, create_destroy_thr gets stuck in
the WaitThread state for a couple of seconds while devf-generic is presumably doing a reclaim. I've also reproduced this
on a single-CPU x86 device where the persistent file write went to an IDE CompactFlash device (though there the
pthread_create delays are on the order of 90 ms, not 1.5 seconds).
The mmap_test program takes two options: -f <filename> (required) and -w (optional; do a plain open/write/close instead
of using mmap). With plain open/write/close, the problem does not occur.
(I originally found this when my thread pool resource manager sometimes failed to handle incoming messages for a couple
of seconds. One of the threads in my server was doing an mmap() on a NOR flash device; if the flash device happened to
be in a reclaim, the thread pool could not create a new thread until the other thread’s mmap() returned.)
I’ve also filed a support ticket against this (CaseID00100888).