From: drh Date: Wed, 20 Sep 2017 09:09:34 +0000 (+0000) Subject: Updates to the "lemon.html" document received from Andy Goth. X-Git-Tag: version-3.21.0~76 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=9a243e69c2a067040aa2fe373cb25ec1278e9112;p=thirdparty%2Fsqlite.git Updates to the "lemon.html" document received from Andy Goth. FossilOrigin-Name: 5b2002f3df1902aaa571a0efd01ab8bae7f4d37ac4819cc51595277f4de93433 --- diff --git a/doc/lemon.html b/doc/lemon.html index f05c481d79..3ed85176f7 100644 --- a/doc/lemon.html +++ b/doc/lemon.html @@ -2,12 +2,12 @@ The Lemon Parser Generator - -

The Lemon Parser Generator

+ +

The Lemon Parser Generator

Lemon is an LALR(1) parser generator for C. It does the same job as "bison" and "yacc". -But lemon is not a bison or yacc clone. Lemon +But Lemon is not a bison or yacc clone. Lemon uses a different grammar syntax which is designed to reduce the number of coding errors. Lemon also uses a parsing engine that is faster than yacc and @@ -16,7 +16,7 @@ bison and which is both reentrant and threadsafe. has also been updated so that it too can generate a reentrant and threadsafe parser.) Lemon also implements features that can be used -to eliminate resource leaks, making is suitable for use +to eliminate resource leaks, making it suitable for use in long-running programs such as graphical user interfaces or embedded controllers.

@@ -58,8 +58,8 @@ Lemon comes with a default parser template which works fine for most applications. But the user is free to substitute a different parser template if desired.

-

Depending on command-line options, Lemon will generate between -one and three files of outputs. +

Depending on command-line options, Lemon will generate up to +three output files.

-The %name directive allows you to generator two or more different -parsers and link them all into the same executable. -

+The %name directive allows you to generate two or more different +parsers and link them all into the same executable.

The %nonassoc directive

This directive is used to assign non-associative precedence to -one or more terminal symbols. See the section on +one or more terminal symbols. See the section on precedence rules -or on the %left directive for additional information.

+or on the %left directive +for additional information.

The %parse_accept directive

-

The %parse_accept directive specifies a block of C code that is +

The %parse_accept directive specifies a block of C code that is executed whenever the parser accepts its input string. To "accept" an input string means that the parser was able to process all tokens without error.

@@ -821,7 +830,7 @@ without error.

The %parse_failure directive

-

The %parse_failure directive specifies a block of C code that +

The %parse_failure directive specifies a block of C code that is executed whenever the parser fails completely. This code is not executed until the parser has tried and failed to resolve an input error using its usual error recovery strategy. The routine is @@ -837,14 +846,14 @@ only invoked when parsing is unable to continue.

The %right directive

This directive is used to assign right-associative precedence to -one or more terminal symbols. See the section on +one or more terminal symbols. See the section on precedence rules or on the %left directive for additional information.

The %stack_overflow directive

-

The %stack_overflow directive specifies a block of C code that +

The %stack_overflow directive specifies a block of C code that is executed if the parser's internal stack ever overflows. Typically this just prints an error message. After a stack overflow, the parser will be unable to continue and must be reset.

@@ -857,7 +866,7 @@ will be unable to continue and must be reset.

You can help prevent parser stack overflows by avoiding the use of right recursion and right-precedence operators in your grammar. -Use left recursion and left-precedence operators instead, to +Use left recursion and left-precedence operators instead to encourage rules to reduce sooner and keep the stack size down. For example, do rules like this:

@@ -868,7 +877,7 @@ Not like this:
 
    list ::= element list.      // right-recursion.  Bad!
    list ::= .
-
+
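The effect of the two rule styles on stack depth can be sketched with a toy model in C. This is only an illustration under simplifying assumptions: it counts grammar symbols on an imaginary shift/reduce stack, ignores any bookkeeping entries a real parser keeps, and the function names max_depth_left and max_depth_right are hypothetical, not part of Lemon.

```c
#include <assert.h>

/* Toy model, not Lemon's generated code: track how deep a shift/reduce
** stack gets while parsing n elements with each style of list rule. */

/* Left recursion:   list ::= list element.   list ::= .   */
static int max_depth_left(int n){
  int depth, max, i;
  assert( n>=0 );
  depth = 1;                    /* reduce the empty rule: push "list" */
  max = depth;
  for(i=0; i<n; i++){
    depth++;                    /* shift element: stack is "list element" */
    if( depth>max ) max = depth;
    depth--;                    /* reduce back to a single "list" */
  }
  return max;
}

/* Right recursion:  list ::= element list.   list ::= .   */
static int max_depth_right(int n){
  int depth = 0, max, i;
  assert( n>=0 );
  for(i=0; i<n; i++) depth++;   /* every element is shifted before any reduce */
  depth++;                      /* reduce the empty rule: push "list" */
  max = depth;
  while( depth>1 ) depth--;     /* each reduce turns "element list" into "list" */
  return max;
}
```

In this model the left-recursive form never holds more than two symbols, while the right-recursive form holds one symbol per element plus one, which is why long inputs can exhaust a fixed-size stack.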

The %stack_size directive

@@ -876,7 +885,7 @@ Not like this:

If stack overflow is a problem and you can't resolve the trouble by using left-recursion, then you might want to increase the size of the parser's stack using this directive. Put a positive integer -after the %stack_size directive and Lemon will generate a parser +after the %stack_size directive and Lemon will generate a parser with a stack of the requested size. The default value is 100.

@@ -886,25 +895,40 @@ with a stack of the requested size.  The default value is 100.

The %start_symbol directive

-

By default, the start-symbol for the grammar that Lemon generates +

By default, the start symbol for the grammar that Lemon generates is the first non-terminal that appears in the grammar file. But you -can choose a different start-symbol using the %start_symbol directive.

+can choose a different start symbol using the +%start_symbol directive.

    %start_symbol  prog
 

+ +

The %syntax_error directive

+ +

See Error Processing.

+ + +

The %token_class directive

+ +

Undocumented. Appears to be related to the MULTITERMINAL concept. +Implementation.

+

The %token_destructor directive

-

The %destructor directive assigns a destructor to a non-terminal -symbol. (See the description of the %destructor directive above.) -This directive does the same thing for all terminal symbols.

+

The %destructor directive assigns a destructor to a non-terminal +symbol. (See the description of the +%destructor directive above.) +The %token_destructor directive does the same thing +for all terminal symbols.

Unlike non-terminal symbols which may each have a different data type for their values, terminals all use the same data type (defined by -the %token_type directive) and so they use a common destructor. Other -than that, the token destructor works just like the non-terminal +the %token_type directive) +and so they use a common destructor. +Other than that, the token destructor works just like the non-terminal destructors.

@@ -913,8 +937,9 @@ destructors.

Lemon generates #defines that assign small integer constants to each terminal symbol in the grammar. If desired, Lemon will add a prefix specified by this directive -to each of the #defines it generates. -So if the default output of Lemon looked like this: +to each of the #defines it generates.

+ +

So if the default output of Lemon looked like this:

     #define AND              1
     #define MINUS            2
@@ -931,7 +956,7 @@ to cause Lemon to produce these symbols instead:
     #define TOKEN_MINUS      2
     #define TOKEN_OR         3
     #define TOKEN_PLUS       4
-
+
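As a rough sketch of how the prefixed names might be used from a hand-written tokenizer, consider the C fragment below. The #define values are copied by assumption from the listing (TOKEN_AND is inferred from AND being 1); in a real build they would come from the header that Lemon generates. The helper operator_token and its character mapping are hypothetical.

```c
#include <assert.h>

/* Assumed values mirroring the listing; normally supplied by the
** header file that Lemon generates. */
#define TOKEN_AND    1
#define TOKEN_MINUS  2
#define TOKEN_OR     3
#define TOKEN_PLUS   4

/* Hypothetical helper: map one operator character to its token code.
** Returns -1 for characters that are not operators in this sketch. */
static int operator_token(char c){
  switch( c ){
    case '&': return TOKEN_AND;
    case '-': return TOKEN_MINUS;
    case '|': return TOKEN_OR;
    case '+': return TOKEN_PLUS;
    default:  return -1;
  }
}

/* Small usage check for the helper above. */
static void operator_token_selfcheck(void){
  assert( operator_token('+')==TOKEN_PLUS );
  assert( operator_token('x')==-1 );
}
```

The prefix keeps short names such as AND or OR from colliding with identifiers elsewhere in the program.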

The %token_type and %type directives

@@ -952,7 +977,7 @@ token structure. Like this:

is "void*".

Non-terminal symbols can each have their own data types. Typically -the data type of a non-terminal is a pointer to the root of a parse-tree +the data type of a non-terminal is a pointer to the root of a parse tree structure that contains all information about that non-terminal. For example:

@@ -973,14 +998,15 @@ and able to pay that price, fine. You just need to know.

The %wildcard directive

-

The %wildcard directive is followed by a single token name and a -period. This directive specifies that the identified token should -match any input token. +

The %wildcard directive is followed by a single token name and a +period. This directive specifies that the identified token should +match any input token.

When the generated parser has the choice of matching an input against the wildcard token and some other token, the other token is always used. -The wildcard token is only matched if there are no other alternatives. +The wildcard token is only matched if there are no alternatives.

+

Error Processing

After extensive experimentation over several years, it has been @@ -988,19 +1014,20 @@ discovered that the error recovery strategy used by yacc is about as good as it gets. And so that is what Lemon uses.

When a Lemon-generated parser encounters a syntax error, it -first invokes the code specified by the %syntax_error directive, if +first invokes the code specified by the %syntax_error directive, if any. It then enters its error recovery strategy. The error recovery strategy is to begin popping the parser's stack until it enters a state where it is permitted to shift a special non-terminal symbol named "error". It then shifts this non-terminal and continues -parsing. But the %syntax_error routine will not be called again +parsing. The %syntax_error routine will not be called again until at least three new tokens have been successfully shifted.

If the parser pops its stack until the stack is empty, and it still -is unable to shift the error symbol, then the %parse_failed routine +is unable to shift the error symbol, then the +%parse_failure routine is invoked and the parser resets itself to its start state, ready to begin parsing a new file. This is what will happen at the very -first syntax error, of course, if there are no instances of the +first syntax error, of course, if there are no instances of the "error" non-terminal in your grammar.
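The rule that the %syntax_error routine stays quiet until three tokens have been shifted can be modeled with a small counter. The C sketch below is a toy model of that reporting rule only. The ErrorGate type and the gate_* functions are hypothetical names, and the choice to restart the count on every error (reported or not) is an assumption; none of this is the code Lemon generates.

```c
#include <assert.h>

/* Toy model of the reporting rule only; all names are hypothetical. */
typedef struct ErrorGate {
  int shifts_needed;   /* tokens still to shift before reporting resumes */
  int reports;         /* how many errors would have been reported */
} ErrorGate;

static void gate_init(ErrorGate *g){
  g->shifts_needed = 0;
  g->reports = 0;
}

/* A token was shifted successfully. */
static void gate_on_shift(ErrorGate *g){
  if( g->shifts_needed>0 ) g->shifts_needed--;
}

/* A syntax error was detected.  Returns 1 if it would be reported.
** Assumption: every error restarts the three-token count. */
static int gate_on_error(ErrorGate *g){
  int report = (g->shifts_needed==0);
  if( report ) g->reports++;
  g->shifts_needed = 3;
  return report;
}

/* Usage check: the second error arrives too soon and is suppressed. */
static void gate_selfcheck(void){
  ErrorGate g;
  gate_init(&g);
  assert( gate_on_error(&g)==1 );
  gate_on_shift(&g);
  assert( gate_on_error(&g)==0 );
  gate_on_shift(&g); gate_on_shift(&g); gate_on_shift(&g);
  assert( gate_on_error(&g)==1 );
  assert( g.reports==2 );
}
```

Suppressing reports this way avoids a cascade of messages while the parser is still resynchronizing after a single mistake in the input.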

diff --git a/manifest b/manifest index b883f00be7..7d332d8350 100644 --- a/manifest +++ b/manifest @@ -1,5 +1,5 @@ -C Add\sthe\ssqlite3_mmap_warm()\sfunction\sas\san\sextension\sin\sthe\sext/misc/mmapwarm.c\ssource\sfile. -D 2017-09-18T18:17:01.889 +C Updates\sto\sthe\s"lemon.html"\sdocument\sreceived\sfrom\sAndy\sGoth. +D 2017-09-20T09:09:34.192 F Makefile.in 4bc36d913c2e3e2d326d588d72f618ac9788b2fd4b7efda61102611a6495c3ff F Makefile.linux-gcc 7bc79876b875010e8c8f9502eb935ca92aa3c434 F Makefile.msc 6033b51b6aea702ea059f6ab2d47b1d3cef648695f787247dd4fb395fe60673f @@ -33,7 +33,7 @@ F config.sub 9ebe4c3b3dab6431ece34f16828b594fb420da55 F configure e691ad9b505f1f47bc5d99be9e1d49b1be9037e9cb3821c9b14c63c3d413d055 x F configure.ac bb85c1c53e952c8c7078a2f147eba613e0128b8b6e7780d64758d8fb29bcc695 F contrib/sqlitecon.tcl 210a913ad63f9f991070821e599d600bd913e0ad -F doc/lemon.html 1f8b8d4c9f5cfe40e679fee279cc9eb2da8e6eb74ad406028538d7864cc4b6cb +F doc/lemon.html 278113807f49d12d04179a93fab92b5b917a08771152ca7949d34e928efa3941 F doc/pager-invariants.txt 27fed9a70ddad2088750c4a2b493b63853da2710 F doc/vfs-shm.txt e101f27ea02a8387ce46a05be2b1a902a021d37a F ext/README.md fd5f78013b0a2bc6f0067afb19e6ad040e89a10179b4f6f03eee58fac5f169bd @@ -1655,8 +1655,7 @@ F vsixtest/vsixtest.tcl 6a9a6ab600c25a91a7acc6293828957a386a8a93 F vsixtest/vsixtest.vcxproj.data 2ed517e100c66dc455b492e1a33350c1b20fbcdc F vsixtest/vsixtest.vcxproj.filters 37e51ffedcdb064aad6ff33b6148725226cd608e F vsixtest/vsixtest_TemporaryKey.pfx e5b1b036facdb453873e7084e1cae9102ccc67a0 -P a944719314e0ac2f1954b65668815769eba3ab3e39a74666293b8dea52a184b2 3235835babb49b4dd1acaabd1aa6cfb0b7fe19a914db1cb511e8cc872d3c0c39 -R 62da41337307696798754c70fb4a4da8 -T +closed 3235835babb49b4dd1acaabd1aa6cfb0b7fe19a914db1cb511e8cc872d3c0c39 +P 1b2de41453ac33de82f9cd6cbb92eee4fe184fb282c27e5efa5243c8cb239630 +R e0e8cf7279386534b02109d8fa18ad99 U drh -Z 9c4b90490d8e7ae619a7569445f78dbf +Z 86920861ac015f347841db4c53c64a7b diff --git 
a/manifest.uuid b/manifest.uuid index eea4571e30..0ce91438a9 100644 --- a/manifest.uuid +++ b/manifest.uuid @@ -1 +1 @@ -1b2de41453ac33de82f9cd6cbb92eee4fe184fb282c27e5efa5243c8cb239630 \ No newline at end of file +5b2002f3df1902aaa571a0efd01ab8bae7f4d37ac4819cc51595277f4de93433 \ No newline at end of file