tree-diff: reuse base str(buf) memory on sub-tree recursion

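Instead of allocating the base path buffer anew for every subtree in
ll_diff_tree_sha1, let's allocate it once in diff_tree_sha1 and have
all callees reuse it in a stack-like fashion: each level appends its
path component and restores the previous length on return, so
recursing into a subtree needs no allocation of its own.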
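
To illustrate the stacking idea, here is a minimal standalone sketch. It
is not the actual tree-diff code: struct pathbuf, struct node, walk() and
walk_tree() are hypothetical names made up for this example, and pathbuf
is only a simplified stand-in for a growable string buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, a simplified stand-in for a strbuf. */
struct pathbuf {
	char *buf;
	size_t len, alloc;
};

/* Append a NUL-terminated string, growing the buffer only when needed. */
static void pathbuf_add(struct pathbuf *pb, const char *s)
{
	size_t n = strlen(s);

	if (pb->len + n + 1 > pb->alloc) {
		size_t want = (pb->len + n + 1) * 2;
		char *p = realloc(pb->buf, want);
		if (!p)
			abort();
		pb->buf = p;
		pb->alloc = want;
	}
	memcpy(pb->buf + pb->len, s, n + 1);	/* copies the NUL too */
	pb->len += n;
}

/* Truncate back to an earlier length; the memory itself is kept. */
static void pathbuf_setlen(struct pathbuf *pb, size_t len)
{
	assert(len <= pb->len);
	pb->len = len;
	if (pb->buf)
		pb->buf[len] = '\0';
}

/* Hypothetical toy tree: children == NULL marks a file. */
struct node {
	const char *name;
	const struct node *children;
	size_t nr;
};

/*
 * Recursive worker: pushes its own path component onto the shared
 * buffer, recurses, then pops back to the length it was given.
 * Returns the number of files visited.
 */
static int walk(const struct node *n, struct pathbuf *base)
{
	size_t old_len = base->len;
	int count = 0;
	size_t i;

	pathbuf_add(base, n->name);
	if (!n->children) {
		printf("%s\n", base->buf);
		count = 1;
	} else {
		pathbuf_add(base, "/");
		for (i = 0; i < n->nr; i++)
			count += walk(&n->children[i], base);
	}
	pathbuf_setlen(base, old_len);	/* restore the caller's view */
	return count;
}

/* Entry point: the buffer is set up once here and freed once. */
static int walk_tree(const struct node *root)
{
	struct pathbuf base = { NULL, 0, 0 };
	int count = walk(root, &base);

	free(base.buf);
	return count;
}
```

In this sketch the buffer only reallocates when a path is longer than
anything seen so far, so most recursion steps just move the length
forward and back.
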
This should be faster, and for me this change gives the following
slight speedups for