drh [Thu, 21 Dec 2006 01:29:22 +0000 (01:29 +0000)]
Move the shared-library loading routines into the OS portability layer,
thus enabling the os_win.c code to handle the character encoding
confusion of win95/nt/ce. Ticket #2023. (CVS 3541)
drh [Wed, 20 Dec 2006 03:24:19 +0000 (03:24 +0000)]
The query optimizer does a better job of optimizing out ORDER BY clauses
that contain the rowid or which use indices that contain the rowid.
Ticket #2116. (CVS 3536)
drh [Sat, 16 Dec 2006 16:25:15 +0000 (16:25 +0000)]
Query optimizer enhancement: In "FROM a,b,c left join d" allow the C table
to be reordered with A and B. This used to be the case but the capability
was removed by (3203) and (3052) in response to ticket #1652. This change
restores the capability. (CVS 3529)
shess [Wed, 29 Nov 2006 21:03:00 +0000 (21:03 +0000)]
Test that terms longer than interior nodes work correctly. A bug
prior to fts2.c r1.10 meant that such large terms caused an eventual
stack overflow. (CVS 3523)
shess [Wed, 29 Nov 2006 05:17:28 +0000 (05:17 +0000)]
http://www.sqlite.org/cvstrac/tktview?tn=2046
The virtual table interface allows for a cursor to field multiple
xFilter() calls. For instance, if a join is done with a virtual
table, there could be a call for each row which potentially matches.
Unfortunately, fulltextFilter() assumes that it has a fresh cursor,
and overwrites a prepared statement and a malloc'ed pointer, resulting
in unfinalized statements and a memory leak.
This change hacks the code to manually clean up offending items in
fulltextFilter(), emphasis on "hacks", since it's a fragile fix
insofar as future additions to fulltext_cursor could continue to have
the problem. (CVS 3521)
shess [Wed, 29 Nov 2006 01:02:03 +0000 (01:02 +0000)]
Delta-encode terms in interior nodes. While experiments have shown
that this is of marginal utility when encoding terms resulting from
regular English text, it turns out to be very useful when encoding
inputs with very large terms. (CVS 3520)
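
A minimal sketch of the delta-encoding idea, not the actual fts2.c code:
each term is stored as the number of leading bytes it shares with the
previous term, followed by the remaining suffix. The function name
delta_encode_term() and the one-byte length fields are assumptions made
for illustration; the real on-disk format may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the common prefix of a[0..na) and b[0..nb). */
static size_t common_prefix(const char *a, size_t na,
                            const char *b, size_t nb) {
    size_t n = na < nb ? na : nb;
    size_t i = 0;
    while (i < n && a[i] == b[i]) i++;
    return i;
}

/*
 * Hypothetical helper: write `term` to `out`, delta-encoded against `prev`.
 * Assumed layout: [shared][suffix_len][suffix bytes], with one-byte
 * lengths, so this sketch only handles terms of up to 255 bytes.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
size_t delta_encode_term(const char *prev, size_t nprev,
                         const char *term, size_t nterm,
                         unsigned char *out, size_t nout) {
    assert(nprev <= 255 && nterm <= 255);
    size_t shared = common_prefix(prev, nprev, term, nterm);
    size_t suffix = nterm - shared;
    if (nout < 2 + suffix) return 0;
    out[0] = (unsigned char)shared;   /* bytes reused from prev */
    out[1] = (unsigned char)suffix;   /* bytes that follow      */
    memcpy(out + 2, term + shared, suffix);
    return 2 + suffix;
}
```

With very long terms that differ only near the end, almost the whole
term collapses into the shared-prefix count, which is consistent with
the benefit described above for inputs with very large terms.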
shess [Sat, 18 Nov 2006 00:12:44 +0000 (00:12 +0000)]
Store minimal terms in interior nodes. Whenever there's a break
between leaf nodes, instead of storing the entire leftmost term of the
rightmost child, store only that portion of the leftmost term
necessary to distinguish it from the rightmost term of the leftmost
child. (CVS 3513)
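
The rule above can be sketched as follows, assuming terms are compared
bytewise and that the left term sorts strictly before the right term.
The name minimal_separator_len() is hypothetical and is not taken from
fts2.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Given the rightmost term of the left child (`left`) and the leftmost
 * term of the right child (`right`), return how many leading bytes of
 * `right` are needed to distinguish it from `left`: one byte past their
 * common prefix, capped at the length of `right`.
 */
size_t minimal_separator_len(const char *left, size_t nleft,
                             const char *right, size_t nright) {
    assert(left != NULL && right != NULL);
    size_t n = nleft < nright ? nleft : nright;
    size_t i = 0;
    while (i < n && left[i] == right[i]) i++;
    return i < nright ? i + 1 : nright;
}
```

For example, between "apple" and "apricot" only "apr" would need to be
stored in the interior node.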
shess [Fri, 17 Nov 2006 21:12:15 +0000 (21:12 +0000)]
Refactoring groundwork for coming work on interior nodes. Change
LeafWriter to use empty data buffer (instead of empty term) to detect
an empty block. Code to validate interior nodes. Moderate revisions
to leaf-node and doclist validation. Recast leafWriterStep() in terms
of LeafWriterStepMerge(). (CVS 3512)
shess [Mon, 13 Nov 2006 21:00:54 +0000 (21:00 +0000)]
Require a minimum fanout for interior nodes. This prevents cases
where excessively large terms keep the tree from finding a single
root. A downside is that this could result in large interior nodes in
the presence of large terms, which may be prone to fragmentation,
though if the nodes were smaller that would translate into more levels
in the tree, which would also have that problem. (CVS 3510)
aswift [Sat, 11 Nov 2006 01:31:58 +0000 (01:31 +0000)]
The uninitialized file descriptor from the unixFile structure was being
passed to sqlite3DetectLockingStyle in allocateUnixFile, rather than the
file descriptor passed in. This caused the locking detection on NFS file
systems to behave somewhat randomly; as a result, locks were not
respected and data loss could occur. (CVS 3508)
drh [Thu, 9 Nov 2006 00:24:53 +0000 (00:24 +0000)]
First cut at adding the sqlite3_prepare_v2() API. Test cases added, but
more testing would be useful. Still need to update the documentation. (CVS 3506)
drh [Mon, 6 Nov 2006 21:20:25 +0000 (21:20 +0000)]
Use the difference between the SQLITE_IOERR_SHORT_READ and SQLITE_IOERR_READ
returns from sqlite3OsRead() to make decisions about what to do with the
error. (CVS 3503)
drh [Tue, 31 Oct 2006 21:16:48 +0000 (21:16 +0000)]
Change the default prefix for temporary files so that it no longer
contains the text "sqlite". In this way, perhaps we will not get so
many false bug reports such as ticket #2049, #1989, and #1841. (CVS 3498)
drh [Thu, 26 Oct 2006 18:15:42 +0000 (18:15 +0000)]
Bring CSV output into more commonly accepted practice. Tickets #2030, #1573.
Add the -bail command-line option and the ".bail" command. Default behavior
is to continue after encountering an error. Ticket #2045. (CVS 3491)
drh [Thu, 26 Oct 2006 14:25:58 +0000 (14:25 +0000)]
Command-line shell enhancements. Bail out when errors are seen in
non-interactive mode. Override isatty() using -interactive or -batch
command-line options. Report line number in error messages.
Tickets #2009, #2045. (CVS 3490)
shess [Thu, 26 Oct 2006 00:41:51 +0000 (00:41 +0000)]
Empty queries should get no results. My recent change
( http://www.sqlite.org/cvstrac/chngview?cn=3486 ) broke test fts2a-5.3.
This change should make the expected result more obvious. (CVS 3489)
shess [Thu, 26 Oct 2006 00:04:31 +0000 (00:04 +0000)]
Make memset() uses less error-prone.
http://www.sqlite.org/cvstrac/tktview?tn=2036,35 describes some cases
where we were passing memset() a length which was the sizeof a
pointer, rather than the structure pointed to. Instead, wrap this
idiom up in CLEAR() and SCRAMBLE() macros. (CVS 3488)
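
A minimal sketch of the idiom, assuming macro definitions along these
lines; the real CLEAR() and SCRAMBLE() in fts2.c may differ, and the
Cursor structure and cursor_reset() are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumed definitions: size the memset() from the pointed-to object,
 * never from the pointer itself. */
#define CLEAR(p)    memset((p), 0x00, sizeof(*(p)))
#define SCRAMBLE(p) memset((p), 0x55, sizeof(*(p)))

/* Hypothetical structure, for illustration only. */
typedef struct Cursor {
    int  iRow;
    char zName[32];
} Cursor;

void cursor_reset(Cursor *pCursor) {
    assert(pCursor != NULL);
    /* The bug pattern: memset(pCursor, 0, sizeof(pCursor)) zeroes only
     * sizeof(Cursor *) bytes, leaving most of the structure untouched. */
    CLEAR(pCursor);
}
```

Because the macro takes the pointer and derives the size itself, the
caller no longer has a chance to pass the wrong sizeof expression.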
shess [Wed, 25 Oct 2006 21:00:09 +0000 (21:00 +0000)]
Replace the DocList and DocListReader structures. The new structures
distinguish reading from a static buffer from writing to a dynamic
buffer. This allows n-way doclist merging, and in-place merging of
segment leaf nodes, which together cut segment merge times in half. (CVS 3486)
shess [Wed, 25 Oct 2006 05:21:55 +0000 (05:21 +0000)]
Don't store empty segments. When inserting empty strings, the code
was writing out a segment made up of a single leaf node containing the
\0 header. LeafReader assumed that leaf nodes always contained at
least one term, so assertions would fail.
While it would be possible to support reading and merging empty
segments, there's no reason to do so. While this change could have
been done in writeZeroSegment(), I put it in leafWriterFlush() so that
it would work right if segmentMerge() created an empty segment, which
could happen with future changes to how deleted documents are handled. (CVS 3484)
shess [Thu, 12 Oct 2006 23:15:24 +0000 (23:15 +0000)]
Convert fts2 to store data in a way which allows for much faster
updates. Groups of documents form segments which are encoded in a
btree layered over a table of blocks, with various tricks to make
merges fast. This performs 20x-25x faster than fts1 when loading the
Enron corpus, and is only slightly slower for queries. (CVS 3474)
shess [Thu, 5 Oct 2006 21:48:56 +0000 (21:48 +0000)]
Fix incorrect doclist initialization in term_select_all().
docListRestrictColumn() generates a DL_POSITIONS doclist, which means
that after the first doclist is processed, the second doclist is
initialized as DL_POSITIONS, but with DL_POSITIONS_OFFSETS data.
(Note that DL_DEFAULT is now DL_POSITIONS, which masks this bug.) (CVS 3467)
drh [Tue, 3 Oct 2006 19:05:18 +0000 (19:05 +0000)]
Report the error SQLITE_CORRUPT instead of SQLITE_IOERR if unable
to roll back a hot journal that was damaged (for example) by filesystem
corruption following a power failure. (CVS 3460)
drh [Sun, 1 Oct 2006 18:58:31 +0000 (18:58 +0000)]
Remove one non-working test case from the Porter stemmer tests and add
an acknowledgement for the source of the test data (Martin Porter himself). (CVS 3453)