dnsname: Optimize parsing of uncompressed labels
This change stops allocating and copying once per label when parsing
DNSNames from the wire format. As long as we do not encounter a
compression pointer, we now allocate and copy only once for as many
consecutive labels as possible.
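
The idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the code in this change: `parse_name`, `MAX_HOPS` and the error handling are hypothetical names and simplifications. The loop only advances an offset over uncompressed labels and copies the whole run in one slice when it reaches a compression pointer or the root label.

```python
MAX_HOPS = 128  # assumption: arbitrary cap on pointer hops, to stop loops


def parse_name(packet: bytes, offset: int = 0) -> bytes:
    """Return the name at `offset` as uncompressed wire-format bytes."""
    out = bytearray()
    pos = offset
    run_start = pos  # start of the current run of uncompressed labels
    hops = 0
    while True:
        if pos >= len(packet):
            raise ValueError("truncated name")
        length = packet[pos]
        if length == 0:
            # Root label: copy the pending run, terminator included, at once.
            out += packet[run_start:pos + 1]
            break
        if length & 0xC0 == 0xC0:
            # Compression pointer: flush the run seen so far, then jump.
            if pos + 1 >= len(packet):
                raise ValueError("truncated compression pointer")
            out += packet[run_start:pos]
            hops += 1
            if hops > MAX_HOPS:
                raise ValueError("too many compression pointers")
            pos = run_start = ((length & 0x3F) << 8) | packet[pos + 1]
            continue
        if length & 0xC0:
            raise ValueError("unsupported label type")
        if pos + 1 + length > len(packet):
            raise ValueError("truncated label")
        # Uncompressed label: just advance, no copy yet.
        pos += 1 + length
    if len(out) > 255:
        raise ValueError("name too long")
    return bytes(out)
```

For a name without compression pointers, the loop performs a single copy regardless of the number of labels, which matches the shape of the results below.
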
This has a noticeable impact in some of our speedtest results:

| Test | Before | After |
| --- | --- | --- |
| 'parse 'empty-query'' | 7282032.6 runs/s, 0.14 us/run | 13519722.8 runs/s, 0.07 us/run |
| 'parse 'empty-query' bare' | 7512588.4 runs/s, 0.13 us/run | 14421770.5 runs/s, 0.07 us/run |
| 'parse 'typical-referral' bare' | 917539.2 runs/s, 1.09 us/run | 1151581.7 runs/s, 0.87 us/run |
| 'parse 'typical-referral'' | 626927.3 runs/s, 1.60 us/run | 711754.3 runs/s, 1.40 us/run |

The improvement is quite clear when the number of labels increases:

| Number of labels | Before | After |
| --- | --- | --- |
| 1 | 16280173.9 runs/s, 0.06 us/run | 15798338.6 runs/s, 0.06 us/run |
| 2 | 11591389.8 runs/s, 0.09 us/run | 15677266.9 runs/s, 0.06 us/run |
| 3 | 9008087.9 runs/s, 0.11 us/run | 14705491.1 runs/s, 0.07 us/run |
| 4 | 7391707.9 runs/s, 0.14 us/run | 14368828.1 runs/s, 0.07 us/run |
| 5 | 6172025.9 runs/s, 0.16 us/run | 14326900.3 runs/s, 0.07 us/run |
| 6 | 5396152.4 runs/s, 0.19 us/run | 13585892.7 runs/s, 0.07 us/run |
| 7 | 4763488.4 runs/s, 0.21 us/run | 12824105.9 runs/s, 0.08 us/run |
| 8 | 4323804.8 runs/s, 0.23 us/run | 12494736.6 runs/s, 0.08 us/run |
| 9 | 3877356.8 runs/s, 0.26 us/run | 12308737.6 runs/s, 0.08 us/run |
| ... | ... | ... |
| 127 | 360564.0 runs/s, 2.77 us/run | 2782692.4 runs/s, 0.36 us/run |
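
For context on why the table stops at 127: a wire-format name is limited to 255 octets, so a name with 127 labels can only consist of one-octet labels (127 * 2 octets plus the root label). The snippet below is a hypothetical helper, not part of the speedtest code, showing how such inputs could be built; the label content `a` is an assumption.

```python
MAX_NAME_LENGTH = 255  # RFC 1035 limit on the wire-format length of a name


def make_wire_name(label_count: int) -> bytes:
    """Build an uncompressed wire-format name made of one-octet labels."""
    # Each label is a length octet (1) followed by a single data octet,
    # and the name ends with the zero-length root label.
    name = b"\x01a" * label_count + b"\x00"
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("name would exceed 255 octets")
    return name
```

`make_wire_name(127)` is exactly 255 octets long, and asking for 128 labels raises an error.
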