From 4f3557e4c95b78bc15229afc06799b2bf1ef99dd Mon Sep 17 00:00:00 2001 From: drh <> Date: Fri, 7 Jan 2022 18:09:56 +0000 Subject: [PATCH] Notes on the JSON enhancement proposals. FossilOrigin-Name: 18160985ea6b2bbf27de25e0f4f3a1ebcdb079a36af039fc06e37a834e49e772 --- doc/json-enhancements.md | 123 +++++++++++++++++++++++++++++++++++++++ manifest | 11 ++-- manifest.uuid | 2 +- 3 files changed, 130 insertions(+), 6 deletions(-) create mode 100644 doc/json-enhancements.md diff --git a/doc/json-enhancements.md b/doc/json-enhancements.md new file mode 100644 index 0000000000..4c226a2c96 --- /dev/null +++ b/doc/json-enhancements.md @@ -0,0 +1,123 @@ +# Proposed Enhancements To JSON Functions + +## 1.0 New function json_nextract() + +The new function json_nextract() works the same as json_extract() except +when the input JSON is not well-formed. This is what the routines do +when the input JSON in the first argument is not well-formed: + + 1. **json_extract()** → raises an error and aborts the query. + + 2. **json_nextract()** with 2 arguments the second argument is + exactly `'$'` → work like json_quote() and return the first + argument as a JSON quoted string. + + 3. **json_nextract()** otherwise → return NULL. + +If the input is known to be JSON, then json_extract() should work just +fine for all your needs. But sometimes a table might have a column that +sometimes holds JSON and sometimes holds some other content. Suppose, +for example, an application started out holding a single phone number for +each user, but later was enhanced so that the same database file could +hold a JSON array of phone numbers. The USER table might have some entries +that are JSON arrays and some entries which are just text strings containing +phone numbers. The application can use json_nextract() to be robust in +extracting values from that column. + +The feature (2) above is envisioned to be useful for sanitizing table +content. Suppose a table is populated from dirty CSV, and some of the +JSON is mis-formatted. You could convert all entries in a table to use +well-formed JSON using something like this: + +> ~~~ +UPDATE data SET jsonData = json_nextract(jsonData,'$'); +~~~ + +In the query above, well-formed JSON would be unchanged, and mis-formatted +JSON would be converted into a well-formatted JSON string. + +## 2.0 Add the `->` and '->>` operators as aliases for json_extract(). + +Two new binary operators "`->`" and "`->>`" operators are the same +as json_nextract() and json_extract(), respectively. + +> ~~~ +SELECT '{"a":5,"b":17}' -> '$.a', '[4,1,-6]' ->> '$[0]'; +~~~ + +Is equivalent to (and generates the same bytecode as): + +> ~~~ +SELECT json_nextract('{"a":5,"b":17}','$.a'), json_extract('[4,1,-6]','$[0]'); +~~~ + +The ->> operator works the same as the ->> operator in MySQL +and mostly compatible with PostgreSQL (hereafter "PG"). Addition enhancements +in section 3.0 below are required to bring ->> into compatibility with PG. + +The -> operator is mostly compatible with MySQL and PG too. The main +difference is that in MySQL and PG, the result from -> is not a primitive +SQL datatype type but rather more JSON. It is unclear how this would ever +be useful for anything, and so I am unsure why they do this. But that is +the way it is done in those system. + +SQLite strives to be compatible with MySQL and PG with the ->> operator, +but not with the -> operator. + +## 3.0 Abbreviated JSON path specifications for use with -> and ->> + +The "->" and "->>" and operators allow abbreviated +forms of JSON path specs that omit unnecessary $-prefix text. For +example, the following queries are equivalent: + +> ~~~ +SELECT '{"a":17, "b":4.5}' ->> '$.a'; +SELECT '{"a":17, "b":4.5}' ->> 'a'; +~~~ + +Similarly, these queries mean exactly the same thing: + +> ~~~ +SELECT '[17,4.5,"hello",0]' ->> '$[1]'; +SELECT '[17,4.5,"hello",0]' ->> 1; +~~~ + +The abbreviated JSON path specs are allowed with the -> and ->> operators +only. The json_extract() and json_nextract() functions, and all the other +JSON functions, still use the full path spec and will raise an error if +the full path spec is not provided. + +This enhancement provides compatibility with PG. +PG does not support JSON path specs on its ->> operator. With PG, the +right-hand side of ->> must be either an integer (if the left-hand side +is a JSON array) or a text string which is interpreted as a field name +(if the left-hand side is a JSON object). So the ->> operator in PG is +rather limited. With this enhancement, the ->> operator in SQLite +covers all the functionality of PG, plus a lot more. + +MySQL also supports the ->> operator, but it requires a full JSON path +spec on the right-hand side. SQLite also supports this, so SQLite is +compatibility with MySQL as well. Note, however, that MySQL and PG +are incompatible with each other. You can (in theory) write SQL that +uses the ->> operator that is compatible between SQLite and MySQL, +or that is compatible between SQLite and PG, but not that is compatible +with all three. + +## 4.0 New json_ntype() SQL function + +A new function "json_ntype(JSON)" works like the existing one-argument +version of the "json_type(JSON)" function, except that json_ntype(JSON) +returns NULL if the argument is not well-formed JSON, whereas the +existing json_type() function raises an error in that case. + +In other words, "`json_ntype($json)`" is equivalent to +"`CASE WHEN json_valid($json) THEN json_type($json) END`". + +This function is seen as useful for figuring out which rows of a table +have a JSON type in a column and which do not. For example, to find +all rows in a table in which the value of the the "phonenumber" column +contains a JSON array, you could write: + +> ~~~ +SELECT * FROM users WHERE json_ntype(phonenumber) IS 'array'; +~~~ diff --git a/manifest b/manifest index 3227004e9a..0c197632b6 100644 --- a/manifest +++ b/manifest @@ -1,5 +1,5 @@ -C Reverse\sthe\smeaningn\sof\s->\sand\s->>.\s\s->>\sraises\san\serror\son\sinvalid\sJSON\nbut\s->\sdoes\snot.\s\sThis\sallows\s->>\sto\sbehave\sthe\ssame\sas\sPG\sand\sMySQL. -D 2022-01-07T17:26:40.622 +C Notes\son\sthe\sJSON\senhancement\sproposals. +D 2022-01-07T18:09:56.414 F .fossil-settings/empty-dirs dbb81e8fc0401ac46a1491ab34a7f2c7c0452f2f06b54ebb845d024ca8283ef1 F .fossil-settings/ignore-glob 35175cdfcf539b2318cb04a9901442804be81cd677d8b889fcc9149c21f239ea F LICENSE.md df5091916dbb40e6e9686186587125e1b2ff51f022cc334e886c19a0e9982724 @@ -38,6 +38,7 @@ F configure 56f2a6637cdba53788673ccc229c4f95ab3ab6fe67036e0b1291dc615e531a58 x F configure.ac c8ba54bac7e73e000acdfef5e394fe21a3876aa09d0f5c07131bf5ac5a525299 F contrib/sqlitecon.tcl 210a913ad63f9f991070821e599d600bd913e0ad F doc/F2FS.txt c1d4a0ae9711cfe0e1d8b019d154f1c29e0d3abfe820787ba1e9ed7691160fcd +F doc/json-enhancements.md b026346e18a18a90d84dbda457e3127282468ad26baaefc005a1656429fa4232 F doc/lemon.html efc0cd2345d66905505d98f862e1c571512def0ceb5b016cb658fd4918eb76a3 F doc/pager-invariants.txt 27fed9a70ddad2088750c4a2b493b63853da2710 F doc/trusted-schema.md 33625008620e879c7bcfbbfa079587612c434fa094d338b08242288d358c3e8a @@ -1937,8 +1938,8 @@ F vsixtest/vsixtest.tcl 6a9a6ab600c25a91a7acc6293828957a386a8a93 F vsixtest/vsixtest.vcxproj.data 2ed517e100c66dc455b492e1a33350c1b20fbcdc F vsixtest/vsixtest.vcxproj.filters 37e51ffedcdb064aad6ff33b6148725226cd608e F vsixtest/vsixtest_TemporaryKey.pfx e5b1b036facdb453873e7084e1cae9102ccc67a0 -P b4c8a62381755b9f1447e10ab95df7209eebda91f9a4583ef1c093a13f6b4725 -R 5e00922e334b2eca45d33167fc947357 +P 85f8170555ee0d4d28cb7e120a7062e9f64c331a936fdfa29fc0e67224eea7c6 +R f0bbc4fbd1752207d287e4842cece932 U drh -Z ca8d93554520c6d97db646c26d2dff94 +Z d04dbd17b8376af71e0674b428869308 # Remove this line to create a well-formed Fossil manifest. diff --git a/manifest.uuid b/manifest.uuid index 89e07a6e91..3de7da8f44 100644 --- a/manifest.uuid +++ b/manifest.uuid @@ -1 +1 @@ -85f8170555ee0d4d28cb7e120a7062e9f64c331a936fdfa29fc0e67224eea7c6 \ No newline at end of file +18160985ea6b2bbf27de25e0f4f3a1ebcdb079a36af039fc06e37a834e49e772 \ No newline at end of file -- 2.47.2