shess [Thu, 5 Oct 2006 21:48:56 +0000 (21:48 +0000)]
Fix incorrect doclist initialization in term_select_all().
docListRestrictColumn() generates a DL_POSITIONS doclist, which means
that after the first doclist is processed, the second doclist is
initialized as DL_POSITIONS, but with DL_POSITIONS_OFFSETS data.
(Note that DL_DEFAULT is now DL_POSITIONS, which masks this bug.) (CVS 3467)
drh [Tue, 3 Oct 2006 19:05:18 +0000 (19:05 +0000)]
Report the error SQLITE_CORRUPT instead of SQLITE_IOERR if unable
to rollback a hot journal that was damaged (for example) by filesystem
corruption following a power failure. (CVS 3460)
drh [Sun, 1 Oct 2006 18:58:31 +0000 (18:58 +0000)]
Remove one non-working test case from the Porter stemmer tests and add
an acknowledgement for the source of the test data (Martin Porter himself.) (CVS 3453)
Be sure to ignore PRAGMA encoding statements if the encoding has already
been set for a database. Ticket #1987. This patch also includes some cleanup
of the schema parser and initialization logic. (CVS 3436)
We handle an UPDATE to a row by performing an UPDATE on the content table and by building new position lists for each term which appears in either the old or new versions of the row. We write these position lists all at once; this is presumably more efficient than a delete followed by an insert (which would first write empty position lists, then new position lists). (CVS 3434)
When gathering a doclist for querying, don't discard empty position lists until the end; this allows empty position lists to override non-empty lists encountered later in the gathering process. This fixes #1982, which was caused by the fact that for all-column queries we weren't discarding empty position lists at all. (CVS 3433)
Convert all names to lower case before sending them to the xFindFunction
method of a virtual table. In FTS1, use strcmp instead of strcasecmp.
Ticket #1981. (CVS 3429)
Convert all names to lower case before sending them to the xFindFunction
method of a virtual table. In FTS1, use strcmp instead of strcasecmp.
Ticket #1981. (CVS 3428)
Modify FTS1 so that the "magic" column has the same name as the virtual
table. Offsets are retrieved using a special "offsets" function whose
first argument is the magic column. Snippets will ultimately be retrieved
in the same way. (CVS 3427)
Add support for extended result codes - additional result information
carried in the higher bits of the integer return codes. This must be
enabled using the sqlite3_extended_result_codes() API. Only a few extra
result codes are currently defined. (CVS 3422)
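A minimal sketch of how a caller might split such a code into its two parts. It assumes the primary code sits in the low 8 bits, which is how current sqlite3.h defines SQLITE_IOERR_READ; the SKETCH_ names and both helper functions are made up for illustration and are not part of SQLite.

```c
#include <assert.h>

/* Values copied from current sqlite3.h so that the sketch compiles without
** the SQLite headers.  The SKETCH_ names are hypothetical. */
#define SKETCH_IOERR       10                        /* SQLITE_IOERR */
#define SKETCH_IOERR_READ  (SKETCH_IOERR | (1<<8))   /* SQLITE_IOERR_READ */

/* Return the primary result code held in the low 8 bits of rc. */
static int primary_result_code(int rc){
  return rc & 0xff;
}

/* Return the additional information carried above the low 8 bits. */
static int extended_detail(int rc){
  assert( rc>=0 );   /* result codes are non-negative */
  return rc >> 8;
}
```

Code that only cares about the broad error class can keep comparing primary_result_code(rc) against the primary codes, whether or not extended codes are enabled.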
The FTS1 tables have a new automatic column named "offset" that returns
a string containing byte offset information for all matching terms.
Also added a large test case based on SQLite mailing list entries. (CVS 3417)
Module spec parser enhancements for FTS1. Now able to cope with column
names in the spec that are SQL keywords or have special characters, etc.
Also added support for additional control lines. Column names can be
followed by a type specifier (which is ignored.) (CVS 3410)
Allow virtual tables to contain multiple full-text-indexed columns. Added a magic column "_all" which can be used for querying all columns in a table at once.
For now, each posting list stores position/offset information for multiple columns. We may implement separate posting lists for separate columns at some future point. (CVS 3408)
Re-use deleted rowids for new segments. This has a somewhat
surprising impact on performance, I believe because it keeps the index
smaller (by keeping rowids smaller), and also because it improves
locality in the table (deleting a row means we've already touched the
pages leading to that rowid). (CVS 3405)
Add a rudimentary tokenizer and parser to FTS1 for parsing the module
arguments during initialization. Recognized arguments include a
tokenizer selector and a list of virtual table columns. (CVS 3403)
Add pzErr parameters to the xConnect and xCreate methods of virtual tables
in order to provide better error reporting. This is an interface change
for virtual tables. Prior virtual table implementations will need to be
modified and recompiled. (CVS 3402)
Add a new zErrMsg field to the sqlite3_vtab structure to support returning
error messages from virtual table constructors. This change means that
virtual table implementations compiled as loadable extensions for version
3.3.7 will need to be recompiled for version 3.3.8 and will not be usable
by both versions at once. The virtual table mechanism is still considered
experimental so we feel justified in breaking backwards compatibility
in this way. Additional interface changes might occur in the future. (CVS 3401)
Write doclists using a segmented technique to amortize costs better.
New items for a term are merged with the term's segment 0 doclist,
until that doclist exceeds CHUNK_MAX. Then the segments are merged in
exponential fashion, so that segment 1 contains approximately
2*CHUNK_MAX data, segment 2 4*CHUNK_MAX, and so on. (CVS 3398)
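The merge policy described above can be sketched by tracking segment sizes only. This is an illustration under assumptions, not the fts1.c code: the CHUNK_MAX value, the N_SEGMENTS cap, the SegmentSizes type and segment_add() are all hypothetical, and the rule "fold segment i into segment i+1 once it exceeds CHUNK_MAX<<i" is one plausible reading of the description.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_MAX   256   /* hypothetical threshold, in bytes */
#define N_SEGMENTS  8     /* hypothetical cap on the number of levels */

/* Sizes in bytes of one term's doclist segments; 0 means "absent". */
typedef struct SegmentSizes {
  size_t n[N_SEGMENTS];
} SegmentSizes;

/* Add nBytes of new doclist data for a term.  New data always lands in
** segment 0.  Whenever segment i grows past CHUNK_MAX<<i it is folded
** into segment i+1 and emptied.  The last level simply keeps growing. */
static void segment_add(SegmentSizes *p, size_t nBytes){
  int i;
  assert( p!=0 );
  p->n[0] += nBytes;
  for(i=0; i<N_SEGMENTS-1 && p->n[i] > ((size_t)CHUNK_MAX << i); i++){
    p->n[i+1] += p->n[i];
    p->n[i] = 0;
  }
}
```

Under this reading most updates rewrite only the small segment 0, and the larger segments are rewritten progressively less often, which is where the amortization comes from.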
Add HAVE_GMTIME_R and HAVE_LOCALTIME_R flags and use them if defined.
Unable to modify the configure script to test for gmtime_r and
localtime_r, however, because on my SuSE 10.2 system, autoconf generates
a configure script that does not work. Bummer. Ticket #1906 (CVS 3397)
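A sketch of how such a flag is typically consumed, assuming HAVE_GMTIME_R is supplied on the compiler command line (for example -DHAVE_GMTIME_R=1) since configure does not detect it. The function name utc_fields() is hypothetical and this is not the code from the check-in.

```c
#include <assert.h>
#include <time.h>

/* Break a time_t into UTC fields.  Returns 0 on success, 1 on failure.
** With HAVE_GMTIME_R defined the reentrant gmtime_r() is used; otherwise
** the portable gmtime() is used and its result copied out right away. */
static int utc_fields(time_t t, struct tm *pOut){
  assert( pOut!=0 );
#ifdef HAVE_GMTIME_R
  return gmtime_r(&t, pOut)==0;
#else
  {
    struct tm *p = gmtime(&t);   /* result lives in static storage */
    if( p==0 ) return 1;
    *pOut = *p;
    return 0;
  }
#endif
}
```

The same pattern applies to HAVE_LOCALTIME_R with localtime_r() and localtime().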
Bug fix in date/time computations. Ticket #1964.
Some unrelated comment typos were also fixed and accidentally
checked in at the same time. (CVS 3396)
Do not call the xDisconnect method on a virtual table while xUpdate is
pending. Instead, defer the xDisconnect until after xUpdate completes. (CVS 3387)
Test for busted TCL builds that do not support 64-bit integers and print
a warning message to users that test failures may be a result of the bad
TCL build and not some problem with SQLite. Ticket #1953. (CVS 3386)
Automatically compute the sqlite3.def and tclsqlite3.def files when
building windows DLLs. This will (hopefully) keep the .def files in
perfect synchronization with the DLLs. Ticket #1951. (CVS 3381)
Make fts1.c not rely on nul-terminated strings. Mostly a matter of
making sure we always pass around ptr/len, but there were a few places
where we actually relied on nul-termination.
An earlier change had additionally changed appropriate
sqlite3_bind_text() calls to sqlite3_bind_blob(). I've found that
this changes what's actually stored in the database, so backed those
changes out. Also (and this is weird), I found that I could no longer
do straightforward = queries against %_term.term at a command-line. (CVS 3379)
Make tokenizer not rely on nul-terminated text. Instead of using
strcspn() and a nul-terminated delimiter list, I just flagged
delimiters in an array and wrote things inline. Submitting this for
review separately because it's pretty standalone. (CVS 3378)
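The delimiter-array approach might look roughly like the sketch below. The names delim_table_init() and next_token() are hypothetical and this is not the tokenizer from the check-in. Note that only the delimiter list, a compile-time constant here, is nul-terminated; the input text is handled strictly as pointer plus length.

```c
#include <string.h>

/* Flag every byte of the nul-terminated list zDelim in a 256-entry table. */
static void delim_table_init(unsigned char aDelim[256], const char *zDelim){
  memset(aDelim, 0, 256);
  for(; *zDelim; zDelim++) aDelim[(unsigned char)*zDelim] = 1;
}

/* Find the next token in z[0..n-1], starting the scan at *piPos.
** On success store the token's start offset and length, advance *piPos
** past the token and return 1.  Return 0 when no token remains. */
static int next_token(
  const unsigned char aDelim[256],  /* Table from delim_table_init() */
  const char *z, int n,             /* Input text; need not end in nul */
  int *piPos,                       /* IN/OUT: current scan position */
  int *piStart, int *pnToken        /* OUT: token offset and length */
){
  int i = *piPos;
  while( i<n && aDelim[(unsigned char)z[i]] ) i++;    /* skip delimiters */
  if( i>=n ){ *piPos = n; return 0; }
  *piStart = i;
  while( i<n && !aDelim[(unsigned char)z[i]] ) i++;   /* consume token */
  *pnToken = i - *piStart;
  *piPos = i;
  return 1;
}
```

Every read is guarded by i<n, so the scanner never looks at a byte beyond the supplied length.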
drh [Thu, 31 Aug 2006 15:07:14 +0000 (15:07 +0000)]
Refactor the FTS1 module so that its name is "fts1" instead of "fulltext",
so that all symbols with external linkage begin with "sqlite3Fts1", and
so that all filenames begin with "fts1". (CVS 3377)
shess [Wed, 30 Aug 2006 21:40:30 +0000 (21:40 +0000)]
Just don't run tolower() on hi-bit characters. This shouldn't cause
us to break any UTF-8 code points, unless they were already broken in
the input. (CVS 3376)
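A small sketch of the idea, assuming the text is passed as pointer plus length. The function name ascii_lower() is hypothetical and this is not the code from the check-in.

```c
#include <assert.h>
#include <ctype.h>

/* Lower-case the ASCII letters of z[0..n-1] in place.  Bytes with the
** high bit set are left untouched, so multi-byte UTF-8 sequences pass
** through unchanged. */
static void ascii_lower(char *z, int n){
  int i;
  assert( n>=0 );
  for(i=0; i<n; i++){
    unsigned char c = (unsigned char)z[i];
    if( c<0x80 ) z[i] = (char)tolower(c);
  }
}
```

Skipping hi-bit bytes also sidesteps locale-dependent tolower() behaviour for values above 0x7f.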
drh [Tue, 29 Aug 2006 13:08:37 +0000 (13:08 +0000)]
Document the fact that SQLite allows NULL values in PRIMARY KEY columns
and the fact that we might decide to change this in the future.
Ticket #518. (CVS 3373)