Apply various ICU patches between ICU 4.4.1 and 57.1

Parts of ICU 3.8 and 4.4.1 are included in bora/lib/unicode.
ICU is now at version 57.1 (which would have been 5.7.1 in
its old version numbering scheme) and has received a number
of security-related patches since then.

Directly updating the ICU bits in bora/lib/unicode with the
corresponding bits from ICU 57.1 no longer seems feasible
because the ICU code has made increasing use of C++
(including C++11), which we'd like to avoid in widely shared
libraries like lib/unicode and lib/misc.

Instead, picked out a handful of fixes (particularly
security-relevant changes) that are applicable to our forked
copy:
* r28300: ticket:7783: error checking in U16_GET()
* r29214: ticket:8238: Implement max subpart policy for UTF7
toUnicode, don't consume valid bytes after err
* r30175: ticket:8569: Terminate the UTF-7 byte buffer with
MINUS when flushing
* r30326: ticket:8265: Fix race (=> U_FILE_ACCESS_ERROR) when
loading full set of ICU data
* r31914: ticket:8235: do not call memcpy()/memmove()/...
with a NULL/invalid source pointer
* r31948: ticket:9340: Use bit mask instead of cast to avoid
buffer overflow
* r32021: ticket:9340: Fix potential out of bound error in
ICU4C ISCII converter
* r32041: ticket:9432: fix value of UDATA_FILE_ACCESS_COUNT
* r32242: ticket:9481: handled segmentation fault issue with
uenum_next
* r32529: ticket:9601: from-UTF-8 m:n conversion: properly
revert to pivoting for m:n matching
* r32574: ticket:9398: avoid use of utf8_countTrailBytes[],
rewrite/optimize U8_COUNT_TRAIL_BYTES() &
U8_NEXT_UNSAFE(), test _UNSAFE macros only with
(mostly) well-formed UTF-8 text
* r32907: ticket:9687: Propagate the ambiguous alias warning
when opening converter
* r37670: ticket:11776: Thread safety fixes in data loading.
* r37889: ticket:11765: fix utrans_stripRules() source
overruns from a comment or an escape at the end of
the source string; make U8_SET_CP_LIMIT() work with
index after NUL terminator, consistent with
U16_SET_CP_LIMIT(), although strictly speaking this
behavior is undefined
* r38086: ticket:11979: Fix max char size for iso-2022-kr in
icu4c
* r38185: ticket:12015: Update the array size to avoid buffer
overflow
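
To illustrate the class of bug that r31914 addresses, here is a
minimal sketch of guarding a copy so that memcpy() is never
called with a NULL pointer. The helper name copy_bytes_checked
is hypothetical and is not taken from the ICU change, which
patches individual call sites rather than adding a wrapper.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from ICU): copy n bytes only when there is
 * something to copy and both pointers are non-NULL. Passing NULL to
 * memcpy() is undefined behavior in C even when n is 0.
 * Returns the number of bytes actually copied.
 */
static size_t copy_bytes_checked(void *dst, const void *src, size_t n)
{
    if (dst == NULL || src == NULL || n == 0) {
        return 0;
    }
    memcpy(dst, src, n);
    return n;
}
```
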
Diffs for the ICU changes (with associated links to their
tickets) can be found at:
where REVISION is the corresponding numeric value.

Notes:
* r32907 makes a slight change to the status that ucnv_open
reports (it can now be a warning rather than U_ZERO_ERROR).
It isn't strictly necessary, but is included as a matter
of correctness and because we would eventually need to
handle the new behavior anyway. Changed call sites that
compared the status directly against U_ZERO_ERROR to use
U_SUCCESS/U_FAILURE instead.
* Included r30326 and r37670 (which both fix race
conditions when loading ICU data), even though we do not
execute those code paths.
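
The r32907 note above can be sketched as follows. This is a
minimal illustration, not code from this change: the enum and
macros are stand-ins that mirror the definitions in ICU's
unicode/utypes.h so that the sketch builds without ICU, and the
function names opened_ok_strict and opened_ok are hypothetical.
The point is that ICU warnings are negative status values, so an
equality test against U_ZERO_ERROR rejects a converter that was
opened successfully with U_AMBIGUOUS_ALIAS_WARNING.

```c
/*
 * Stand-ins mirroring ICU's unicode/utypes.h (assumption: values
 * match ICU; what matters here is that warnings are negative and
 * errors are positive). Real code should include the ICU header.
 */
typedef enum {
    U_AMBIGUOUS_ALIAS_WARNING = -122, /* warning: still a success */
    U_ZERO_ERROR              = 0,    /* no error, no warning */
    U_FILE_ACCESS_ERROR       = 4     /* error */
} UErrorCode;

#define U_SUCCESS(x) ((x) <= U_ZERO_ERROR)
#define U_FAILURE(x) ((x) > U_ZERO_ERROR)

/* Old-style check: wrongly treats any warning as a failure. */
static int opened_ok_strict(UErrorCode status)
{
    return status == U_ZERO_ERROR;
}

/* New-style check: accepts U_ZERO_ERROR and all warnings. */
static int opened_ok(UErrorCode status)
{
    return U_SUCCESS(status);
}
```

With these definitions, opened_ok_strict() rejects
U_AMBIGUOUS_ALIAS_WARNING while opened_ok() accepts it; both
reject U_FILE_ACCESS_ERROR.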