git.ipfire.org Git - thirdparty/sqlite.git/commit
Up until now the fts4 "unicode61" tokenizer has treated all private use codepoints...
author dan <dan@noemail.net>
Wed, 5 Jun 2013 16:17:21 +0000 (16:17 +0000)
committer dan <dan@noemail.net>
Wed, 5 Jun 2013 16:17:21 +0000 (16:17 +0000)
commit f2c9229f73e7ceb6899f246e19c10dfd644b58a5
tree 938b5489337afd28f60b8b9dc329f91b12b48fd4
parent f5ad80397d44c088a3974e2838446011f2430b97
Up until now the fts4 "unicode61" tokenizer has treated all private use codepoints except the first and last of each of the three ranges as alphanumeric (eligible to be part of tokens). This commit fixes this so that all private use codepoints are considered alphanumeric. In other words, it fixes the handling of codepoints 0xE000, 0xF8FF, 0xF0000, 0xFFFFD, 0x100000 and 0x10FFFD.

FossilOrigin-Name: 6cfd9af5250029c0d275be027b4208c48954a8a1
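
The boundary behavior described in the commit message can be illustrated with a small inclusive range check. This is only a sketch: the names `cp_range`, `private_use_ranges` and `is_private_use` are hypothetical and are not taken from ext/fts3/fts3_unicode2.c, whose lookup tables are generated by mkunicode.tcl and are not reproduced here. The range bounds are the six codepoints listed in the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive bounds of one codepoint range. */
typedef struct {
  uint32_t first;
  uint32_t last;
} cp_range;

/* The three Unicode private use ranges. The endpoints are the six
** codepoints named in the commit message. */
static const cp_range private_use_ranges[] = {
  { 0xE000,   0xF8FF   },  /* private use area in the BMP */
  { 0xF0000,  0xFFFFD  },  /* supplementary private use area A */
  { 0x100000, 0x10FFFD },  /* supplementary private use area B */
};

/* Hypothetical helper (not SQLite code): return true if codepoint c lies
** in any private use range. Both comparisons are inclusive, so the first
** and last codepoint of each range are accepted as well. */
static bool is_private_use(uint32_t c){
  size_t n = sizeof(private_use_ranges) / sizeof(private_use_ranges[0]);
  size_t i;
  for(i=0; i<n; i++){
    if( c>=private_use_ranges[i].first && c<=private_use_ranges[i].last ){
      return true;
    }
  }
  return false;
}
```

With a check of this shape, `is_private_use(0xE000)` and `is_private_use(0x10FFFD)` both return true, while the neighboring codepoints just outside each range, such as 0xDFFF and 0xFFFFE, return false.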
ext/fts3/fts3_unicode2.c
ext/fts3/unicode/mkunicode.tcl
manifest
manifest.uuid
test/fts4unicode.test