runtests.py: add -j/--parallel option for parallel test execution

Add parallel test execution using concurrent.futures. With -j8 the
test suite completes in ~4s vs ~29s sequential (~7x speedup).
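
For reference, the general pattern looks roughly like the sketch below.
It is an illustration only, not the code in runtests.py: the names
run_one and run_all are hypothetical, and using a thread pool (rather
than a process pool) is an assumption that works because each worker
just waits on a child process.

```python
import concurrent.futures
import subprocess
import sys


def run_one(name, cmd):
    """Run one test command and return (name, exit status)."""
    # Capture output so concurrent tests do not interleave on the terminal.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return name, proc.returncode


def run_all(tests, jobs):
    """Run every (name, cmd) pair with at most `jobs` tests in flight."""
    results = {}
    # Assumption: threads suffice, since each worker only blocks on a child.
    with concurrent.futures.ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(run_one, name, cmd) for name, cmd in tests]
        # Collect results in completion order, not submission order.
        for fut in concurrent.futures.as_completed(futures):
            name, status = fut.result()
            results[name] = status
    return results


if __name__ == "__main__":
    # Stand-in "tests": tiny Python child processes with fixed exit codes.
    demo = [
        ("pass", [sys.executable, "-c", "raise SystemExit(0)"]),
        ("fail", [sys.executable, "-c", "raise SystemExit(1)"]),
    ]
    print(run_all(demo, jobs=2))
```

With jobs=1 the same code degrades to sequential execution, which keeps
a single code path for both modes.
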
Also fix two issues that caused failures under parallel execution:
- rsync_ls_lR now prunes testtmp/ so parallel tests don't see each
other's temp files when scanning the source tree
- clean-fname-underflow.test now uses $scratchdir instead of /tmp

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>