From: Mike Stepanek (mstepane) Date: Wed, 20 Oct 2021 11:29:02 +0000 (+0000) Subject: Merge pull request #3097 in SNORT/snort3 from ~SVLASIUK/snort3:jit_integration to... X-Git-Tag: 3.1.15.0~1 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=cd13943f3425221dac2b36798caf1e6894a84810;p=thirdparty%2Fsnort3.git Merge pull request #3097 in SNORT/snort3 from ~SVLASIUK/snort3:jit_integration to master Squashed commit of the following: commit bf4d7d74121f85dfc9cc576ac43943beca597941 Author: Serhii Vlasiuk Date: Mon Sep 27 19:08:05 2021 +0300 http_inspect: implement JIT (just-in-time) for JavaScript normalization Remove 'experimental' from JavaScript normalization documentation Update js_normalization_depth=-1 as default value Remove previous JIC implementation for JavaScript normalizatio --- diff --git a/doc/user/http_inspect.txt b/doc/user/http_inspect.txt index 6f9fd5f5e..54a7150f9 100755 --- a/doc/user/http_inspect.txt +++ b/doc/user/http_inspect.txt @@ -52,6 +52,41 @@ and specific way and individually made available for the user to write rules against it. If for example a header is supposed to be a date then normalization means put that date in a standard format. +==== Legacy and Enhanced Normalizers + +Currently, there are Legacy and Enhanced Normalizers for JavaScript +normalization. Both normalizers are independent and can be configured +separately. The Legacy normalizer should be considered deprecated. +The Enhanced Normalizer is encouraged to use for JavaScript normalization +in the first place as we continue improving functionality and quality. + +===== Legacy Normalizer + +The Legacy Normalizer can normalize obfuscated data within the JavaScript +functions such as unescape, String.fromCharCode, decodeURI, and decodeURIComponent. +It also replaces consecutive whitespaces with a single space and normalizes +the plus by concatenating the strings. 
For more information on how to enable +Legacy Normalizer, check the http_inspect.normalize_javascript option. Legacy +Normalizer is deprecated preferably to use Enhanced Normalizer. After +supporting backward compatibility in the Enhanced Normalizer, Legacy Normalizer +will be removed. + +===== Enhanced Normalizer + +Having ips option 'js_data' in the rules automatically enables Enhanced +Normalizer. The Enhanced Normalizer can normalize inline/external scripts. +It supports scripts over multiple PDUs. It is a stateful JavaScript whitespace +and identifiers normalizer. All JavaScript identifier names, except those, +are from the list of built-in identifiers, will be substituted to unified +names with the following format: var_0000 -> var_ffff. Moreover, Normalizer +validates the syntax concerning ECMA-262 Standard, including scope tracking, +and checks for restrictions for contents of script elements (since it is +HTML-embedded JavaScript). For more information on how additionally configure +Enhanced Normalizer check the following http_inspect options: js_normalization_depth, +js_norm_identifier_depth, js_norm_max_tmpl_nest, js_norm_max_scope_depth, +js_norm_built_in_ident. Eventually Enhanced Normalizer will completely replace +Legacy Normalizer. + ==== Configuration Configuration can be as simple as adding: @@ -154,41 +189,31 @@ decodeURIComponent are normalized. The different encodings handled within the unescape, decodeURI, or decodeURIComponent are %XX, %uXXXX, XX and uXXXXi. http_inspect also replaces consecutive whitespaces with a single space and normalizes the plus by concatenating the strings. Such normalizations -refer to basic JavaScript normalization. Cannot be used together with -js_normalization_depth (doing so will cause Snort to fail to load). This is -planned to be deprecated at some point. +refer to basic JavaScript normalization. 
===== js_normalization_depth -js_normalization_depth = N {-1 : max53} will set a number of input -JavaScript bytes to normalize and enable the enhanced normalizer. The enhanced -and legacy normalizers have mutual exclusion behaviour, so you cannot enable -both at the same time (doing so will cause Snort to fail to load). When the -depth is reached, normalization will be stopped. It's implemented per-script. -js_normalization_depth = -1, will set unlimited depth. By default, -the value is set to 0 which means that normalizer is disabled. The enhanced -normalizer provides more precise whitespace normalization of JavaScript, that -removes all redundant whitespaces and line terminators from the JavaScript -syntax point of view (between identifier and punctuator, between identifier and -operator, etc.) according to ECMAScript 5.1 standard. -Additionally, it performs normalization of JavaScript identifiers making a -substitution of unique names with unified names representation: a0 -> z9999. -The identifiers are variables and function names. -The normalized data is available through the js_data rule option. -This is currently experimental and still under development. +js_normalization_depth = N {-1 : max53} will set a number of input JavaScript +bytes to normalize. When the depth is reached, normalization will be stopped. +It's implemented per-script. By default js_normalization_depth = -1, will set +unlimited depth. The enhanced normalizer provides more precise whitespace +normalization of JavaScript, that removes all redundant whitespaces and line +terminators from the JavaScript syntax point of view (between identifier and +punctuator, between identifier and operator, etc.) according to ECMAScript 5.1 +standard. Additionally, it performs normalization of JavaScript identifiers making +a substitution of unique names with unified names representation: var_0000:var_ffff. +The identifiers are variables and function names. 
The normalized data is available +through the js_data rule option. ===== js_norm_identifier_depth -js_norm_identifier_depth = N {0 : 260000} will set a number of unique +js_norm_identifier_depth = N {0 : 65536} will set a number of unique JavaScript identifiers to normalize. When the depth is reached, a built-in -alert is generated. It's implemented per HTTP transaction (request/response), -so the context of identifier substitutions is shared between all the scripts in -the payload. By default, the value is set to 260000, which is the max allowed -number of unique identifiers. The generated names are in the range from -a0 to z9999. Thus, the number of unique identifiers cannot be greater than -26 * 10000 = 260000. This option takes effect only if js_normalization_depth is -set to a non-zero value, enabling the enhanced normalizer. -This is currently experimental and still under development. +alert is generated. Every HTTP Response has its own identifier substitution +context. Thus, all scripts from the same response will be normalized as if +they are a single script.. By default, the value is set to 65536, which +is the max allowed number of unique identifiers. The generated names are in +the range from var_0000 to var_ffff. ===== js_norm_max_tmpl_nest @@ -198,18 +223,14 @@ to be processed. Introduced in ES6, template literals provide syntax to define a literal multiline string, which can have arbitrary JavaScript substitutions, that will be evaluated and inserted into the string. Such substitutions can be nested, and require keeping track of every layer for proper normalization. This option -is present to limit the amount of memory dedicated to this tracking. This option -is used only when js_normalization_depth is not 0. This feature -is currently experimental and still under development. +is present to limit the amount of memory dedicated to this tracking. 
===== js_norm_max_scope_depth js_norm_max_scope_depth = N {0 : 65535} (default 256) is an option of the enhanced JavaScript normalizer that determines the deepest level of nested scope. The scope term includes code sections("{}"), parentheses("()") and brackets("[]"). This option -is present to limit the amount of memory dedicated to this tracking. This option is used -only when js_normalization_depth is not 0. This feature is currently experimental and -still under development. +is present to limit the amount of memory dedicated to this tracking. ===== js_norm_built_in_ident diff --git a/src/detection/detection_engine.cc b/src/detection/detection_engine.cc index 4b5daca44..65572cff5 100644 --- a/src/detection/detection_engine.cc +++ b/src/detection/detection_engine.cc @@ -104,7 +104,6 @@ DetectionEngine::DetectionEngine() context = Analyzer::get_switcher()->interrupt(); context->file_data = DataPointer(nullptr, 0); - context->js_data = DataPointer(nullptr, 0); reset(); } @@ -300,12 +299,6 @@ void DetectionEngine::set_file_data(const DataPointer& dp) DataPointer& DetectionEngine::get_file_data(IpsContext* c) { return c->file_data; } -void DetectionEngine::set_js_data(const DataPointer& dp) -{ Analyzer::get_switcher()->get_context()->js_data = dp; } - -DataPointer& DetectionEngine::get_js_data(IpsContext* c) -{ return c->js_data; } - void DetectionEngine::set_data(unsigned id, IpsContextData* p) { Analyzer::get_switcher()->get_context()->set_context_data(id, p); } diff --git a/src/detection/detection_engine.h b/src/detection/detection_engine.h index b3940d278..dfb1eb278 100644 --- a/src/detection/detection_engine.h +++ b/src/detection/detection_engine.h @@ -71,9 +71,6 @@ public: static void set_file_data(const DataPointer& dp); static DataPointer& get_file_data(IpsContext*); - static void set_js_data(const DataPointer& dp); - static DataPointer& get_js_data(IpsContext*); - static uint8_t* get_buffer(unsigned& max); static struct DataBuffer& get_alt_buffer(Packet*); 
@@ -133,15 +130,6 @@ static inline void set_file_data(const uint8_t* p, unsigned n) static inline void clear_file_data() { set_file_data(nullptr, 0); } -static inline void set_js_data(const uint8_t* data, unsigned len) -{ - DataPointer dp { data, len }; - DetectionEngine::set_js_data(dp); -} - -static inline void clear_js_data() -{ set_js_data(nullptr, 0); } - } // namespace snort #endif diff --git a/src/detection/fp_detect.cc b/src/detection/fp_detect.cc index 82ad63da5..b49ead2cc 100644 --- a/src/detection/fp_detect.cc +++ b/src/detection/fp_detect.cc @@ -959,22 +959,8 @@ static int fp_search(PortGroup* port_group, Packet* p, bool srvc) gadget, buf, buf.IBT_COOKIE, p, port_group, PM_TYPE_COOKIE, pc.cookie_searches); search_buffer(gadget, buf, buf.IBT_VBA, p, port_group, PM_TYPE_VBA, pc.vba_searches); - } - if ( MpseGroup* so = port_group->mpsegrp[PM_TYPE_SCRIPT] ) - { - // FIXIT-M js data should be obtained from - // inspector gadget as is done with search_buffer - DataPointer js_data = p->context->js_data; - - if ( js_data.len ) - { - debug_logf(detection_trace, TRACE_FP_SEARCH, p, - "%" PRIu64 " fp search %s[%d]\n", p->context->packet_number, - pm_type_strings[PM_TYPE_SCRIPT], js_data.len); - - batch_search(so, p, js_data.data, js_data.len, pc.script_searches); - } + search_buffer(gadget, buf, buf.IBT_JS_DATA, p, port_group, PM_TYPE_JS_DATA, pc.js_data_searches); } // file searches file only diff --git a/src/detection/fp_utils.cc b/src/detection/fp_utils.cc index e0ec958ef..99af3f0e9 100644 --- a/src/detection/fp_utils.cc +++ b/src/detection/fp_utils.cc @@ -77,8 +77,8 @@ PmType get_pm_type(CursorActionType cat) case CAT_SET_COOKIE: return PM_TYPE_COOKIE; - case CAT_SET_SCRIPT: - return PM_TYPE_SCRIPT; + case CAT_SET_JS_DATA: + return PM_TYPE_JS_DATA; case CAT_SET_STAT_MSG: return PM_TYPE_STAT_MSG; @@ -133,6 +133,9 @@ static const char* get_service(const char* opt) { if ( !strncmp(opt, "http_", 5) ) return "http"; + + if ( !strncmp(opt, "js_data", 7) ) + return 
"http"; if ( !strncmp(opt, "cip_", 4) ) // NO FP BUF return "cip"; @@ -235,7 +238,6 @@ void validate_services(SnortConfig* sc, OptTreeNode* otn) { std::string svc; bool file = false; - bool script = false; for (OptFpList* ofl = otn->opt_func; ofl; ofl = ofl->next) { @@ -256,12 +258,6 @@ void validate_services(SnortConfig* sc, OptTreeNode* otn) continue; } - if ( !strcmp(s, "js_data") ) - { - script = true; - continue; - } - s = get_service(s); if ( !s ) @@ -293,12 +289,6 @@ void validate_services(SnortConfig* sc, OptTreeNode* otn) otn->sigInfo.gid, otn->sigInfo.sid, otn->sigInfo.rev); add_service_to_otn(sc, otn, "file"); } - if ( otn->sigInfo.services.empty() and script ) - { - ParseWarning(WARN_RULES, "%u:%u:%u has no service with js_data", - otn->sigInfo.gid, otn->sigInfo.sid, otn->sigInfo.rev); - add_service_to_otn(sc, otn, "http"); - } } PatternMatchVector get_fp_content( diff --git a/src/detection/ips_context.h b/src/detection/ips_context.h index caa02bf3f..bf8c12240 100644 --- a/src/detection/ips_context.h +++ b/src/detection/ips_context.h @@ -153,7 +153,6 @@ public: SF_EVENTQ* equeue; DataPointer file_data = DataPointer(nullptr, 0); - DataPointer js_data = DataPointer(nullptr, 0); DataBuffer alt_data = {}; uint64_t context_num; diff --git a/src/framework/base_api.h b/src/framework/base_api.h index 42c3ebdc8..d0715f032 100644 --- a/src/framework/base_api.h +++ b/src/framework/base_api.h @@ -29,7 +29,7 @@ // this is the current version of the base api // must be prefixed to subtype version -#define BASE_API_VERSION 8 +#define BASE_API_VERSION 9 // set options to API_OPTIONS to ensure compatibility #ifndef API_OPTIONS diff --git a/src/framework/inspector.h b/src/framework/inspector.h index 8e4171b0d..bb88bad8d 100644 --- a/src/framework/inspector.h +++ b/src/framework/inspector.h @@ -47,7 +47,7 @@ struct InspectionBuffer // FIXIT-L file data is tbd IBT_KEY, IBT_HEADER, IBT_BODY, IBT_FILE, IBT_ALT, IBT_RAW_KEY, IBT_RAW_HEADER, IBT_METHOD, IBT_STAT_CODE, - 
IBT_STAT_MSG, IBT_COOKIE, IBT_VBA, IBT_MAX + IBT_STAT_MSG, IBT_COOKIE, IBT_JS_DATA, IBT_VBA, IBT_MAX }; const uint8_t* data; unsigned len; diff --git a/src/framework/ips_option.h b/src/framework/ips_option.h index 978f442e6..e23936b78 100644 --- a/src/framework/ips_option.h +++ b/src/framework/ips_option.h @@ -53,7 +53,6 @@ enum CursorActionType CAT_SET_OTHER, CAT_SET_RAW, CAT_SET_COOKIE, - CAT_SET_SCRIPT, CAT_SET_STAT_MSG, CAT_SET_STAT_CODE, CAT_SET_METHOD, @@ -63,6 +62,7 @@ enum CursorActionType CAT_SET_BODY, CAT_SET_HEADER, CAT_SET_KEY, + CAT_SET_JS_DATA, CAT_SET_VBA, }; diff --git a/src/ips_options/CMakeLists.txt b/src/ips_options/CMakeLists.txt index 7bbfb9a97..445143c0a 100644 --- a/src/ips_options/CMakeLists.txt +++ b/src/ips_options/CMakeLists.txt @@ -36,7 +36,6 @@ SET( PLUGIN_LIST ips_rem.cc ips_rev.cc ips_rpc.cc - ips_js_data.cc ips_seq.cc ips_sid.cc ips_soid.cc @@ -69,7 +68,6 @@ set (IPS_SOURCES ips_pkt_data.cc ips_reference.cc ips_replace.cc - ips_js_data.cc ips_service.cc ips_so.cc ips_vba_data.cc diff --git a/src/ips_options/ips_js_data.cc b/src/ips_options/ips_js_data.cc deleted file mode 100644 index 49955d38c..000000000 --- a/src/ips_options/ips_js_data.cc +++ /dev/null @@ -1,137 +0,0 @@ -//-------------------------------------------------------------------------- -// Copyright (C) 2021-2021 Cisco and/or its affiliates. All rights reserved. -// -// This program is free software; you can redistribute it and/or modify it -// under the terms of the GNU General Public License Version 2 as published -// by the Free Software Foundation. You may not use, modify or distribute -// this program under any other version of the GNU General Public License. -// -// This program is distributed in the hope that it will be useful, but -// WITHOUT ANY WARRANTY; without even the implied warranty of -// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -// General Public License for more details. 
-// -// You should have received a copy of the GNU General Public License along -// with this program; if not, write to the Free Software Foundation, Inc., -// 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. -//-------------------------------------------------------------------------- -// ips_js_data.cc author Serhii Vlasiuk - -#ifdef HAVE_CONFIG_H -#include "config.h" -#endif - -#include "detection/detection_engine.h" -#include "framework/cursor.h" -#include "framework/ips_option.h" -#include "framework/module.h" -#include "profiler/profiler.h" - -using namespace snort; - -#define s_name "js_data" -#define s_help \ - "rule option to set detection cursor to normalized JavaScript data" - -static THREAD_LOCAL ProfileStats scriptDataPerfStats; - -class ScriptDataOption : public IpsOption -{ -public: - ScriptDataOption() : IpsOption(s_name, RULE_OPTION_TYPE_BUFFER_SET) { } - - CursorActionType get_cursor_type() const override - { return CAT_SET_SCRIPT; } - - EvalStatus eval(Cursor&, Packet*) override; -}; - -IpsOption::EvalStatus ScriptDataOption::eval(Cursor& c, Packet* p) -{ - RuleProfile profile(scriptDataPerfStats); - - DataPointer dp = DetectionEngine::get_js_data(p->context); - - if ( !dp.data or !dp.len ) - return NO_MATCH; - - c.set(s_name, dp.data, dp.len); - - return MATCH; -} - -//------------------------------------------------------------------------- -// module -//------------------------------------------------------------------------- - -class ScriptDataModule : public Module -{ -public: - ScriptDataModule() : Module(s_name, s_help) { } - - ProfileStats* get_profile() const override - { return &scriptDataPerfStats; } - - Usage get_usage() const override - { return DETECT; } -}; - -//------------------------------------------------------------------------- -// api methods -//------------------------------------------------------------------------- - -static Module* mod_ctor() -{ - return new ScriptDataModule; -} - -static void 
mod_dtor(Module* m) -{ - delete m; -} - -static IpsOption* js_data_ctor(Module*, OptTreeNode*) -{ - return new ScriptDataOption; -} - -static void js_data_dtor(IpsOption* p) -{ - delete p; -} - -static const IpsApi js_data_api = -{ - { - PT_IPS_OPTION, - sizeof(IpsApi), - IPSAPI_VERSION, - 0, - API_RESERVED, - API_OPTIONS, - s_name, - s_help, - mod_ctor, - mod_dtor - }, - OPT_TYPE_DETECTION, - 0, 0, - nullptr, - nullptr, - nullptr, - nullptr, - js_data_ctor, - js_data_dtor, - nullptr -}; - -#ifdef BUILDING_SO -SO_PUBLIC const BaseApi* snort_plugins[] = -#else -const BaseApi* ips_js_data[] = -#endif -{ - &js_data_api.base, - nullptr -}; - diff --git a/src/ips_options/ips_options.cc b/src/ips_options/ips_options.cc index 809a3b987..a140ee8bb 100644 --- a/src/ips_options/ips_options.cc +++ b/src/ips_options/ips_options.cc @@ -39,7 +39,6 @@ extern const BaseApi* ips_metadata; extern const BaseApi* ips_pkt_data; extern const BaseApi* ips_reference; extern const BaseApi* ips_replace; -extern const BaseApi* ips_js_data; extern const BaseApi* ips_service; extern const BaseApi* ips_sha256; extern const BaseApi* ips_sha512; @@ -107,7 +106,6 @@ static const BaseApi* ips_options[] = ips_pkt_data, ips_reference, ips_replace, - ips_js_data, ips_service, ips_sha256, ips_sha512, diff --git a/src/main/analyzer.cc b/src/main/analyzer.cc index 95cbd243e..5891e2149 100644 --- a/src/main/analyzer.cc +++ b/src/main/analyzer.cc @@ -203,7 +203,6 @@ static bool process_packet(Packet* p) if ( !(p->packet_flags & PKT_IGNORE) ) { clear_file_data(); - clear_js_data(); // return incomplete status if the main hook indicates not all work was done if (!main_hook(p)) return false; diff --git a/src/main/test/stubs.h b/src/main/test/stubs.h index 136c79944..a2e466df4 100644 --- a/src/main/test/stubs.h +++ b/src/main/test/stubs.h @@ -163,7 +163,6 @@ void DetectionEngine::idle() { } void DetectionEngine::reset() { } void DetectionEngine::wait_for_context() { } void DetectionEngine::set_file_data(const 
DataPointer&) { } -void DetectionEngine::set_js_data(const DataPointer&) { } void DetectionEngine::clear_replacement() { } void DetectionEngine::disable_all(Packet*) { } unsigned get_instance_id() { return 0; } diff --git a/src/ports/port_group.h b/src/ports/port_group.h index fef2c6b9b..240620447 100644 --- a/src/ports/port_group.h +++ b/src/ports/port_group.h @@ -47,10 +47,10 @@ enum PmType PM_TYPE_RAW_KEY, PM_TYPE_RAW_HEADER, PM_TYPE_METHOD, - PM_TYPE_SCRIPT, PM_TYPE_STAT_CODE, PM_TYPE_STAT_MSG, PM_TYPE_COOKIE, + PM_TYPE_JS_DATA, PM_TYPE_VBA, PM_TYPE_MAX }; @@ -58,7 +58,7 @@ enum PmType const char* const pm_type_strings[PM_TYPE_MAX] = { "packet", "alt", "key", "header", "body", "file", "raw_key", "raw_header", - "method", "script", "stat_code", "stat_msg", "cookie" , "vba" + "method", "stat_code", "stat_msg", "cookie", "js_data", "vba" }; struct RULE_NODE diff --git a/src/pub_sub/test/pub_sub_http_request_body_event_test.cc b/src/pub_sub/test/pub_sub_http_request_body_event_test.cc index 0af6cb260..2776a13db 100644 --- a/src/pub_sub/test/pub_sub_http_request_body_event_test.cc +++ b/src/pub_sub/test/pub_sub_http_request_body_event_test.cc @@ -55,7 +55,7 @@ void HttpMsgBody::publish() {} void HttpMsgBody::do_file_processing(const Field&) {} void HttpMsgBody::do_utf_decoding(const Field&, Field&) {} void HttpMsgBody::do_file_decompression(const Field&, Field&) {} -void HttpMsgBody::do_js_normalization(const Field&, Field&, bool) {} +void HttpMsgBody::do_enhanced_js_normalization(char*&, size_t&) {} void HttpMsgBody::clean_partial(uint32_t&, uint32_t&, uint8_t*&, uint32_t&, int32_t) {} void HttpMsgBody::bookkeeping_regular_flush(uint32_t&, uint8_t*&, uint32_t&, int32_t) {} #ifdef REG_TEST diff --git a/src/service_inspectors/http_inspect/http_api.cc b/src/service_inspectors/http_inspect/http_api.cc index 6f7257b45..bf1d47f14 100644 --- a/src/service_inspectors/http_inspect/http_api.cc +++ b/src/service_inspectors/http_inspect/http_api.cc @@ -67,6 +67,7 @@ const 
char* HttpApi::classic_buffer_names[] = "http_true_ip", "http_uri", "http_version", + "js_data", "vba_data", nullptr }; @@ -117,6 +118,7 @@ extern const BaseApi* ips_http_trailer; extern const BaseApi* ips_http_true_ip; extern const BaseApi* ips_http_uri; extern const BaseApi* ips_http_version; +extern const BaseApi* ips_js_data; #ifdef BUILDING_SO SO_PUBLIC const BaseApi* snort_plugins[] = @@ -143,6 +145,7 @@ const BaseApi* sin_http[] = ips_http_true_ip, ips_http_uri, ips_http_version, + ips_js_data, nullptr }; diff --git a/src/service_inspectors/http_inspect/http_enum.h b/src/service_inspectors/http_inspect/http_enum.h index 7ccbe6c5f..d1c0a42ab 100755 --- a/src/service_inspectors/http_inspect/http_enum.h +++ b/src/service_inspectors/http_inspect/http_enum.h @@ -60,7 +60,7 @@ enum HTTP_BUFFER { HTTP_BUFFER_CLIENT_BODY = 1, HTTP_BUFFER_COOKIE, HTTP_BUFFER_ HTTP_BUFFER_RAW_HEADER, HTTP_BUFFER_RAW_REQUEST, HTTP_BUFFER_RAW_STATUS, HTTP_BUFFER_RAW_TRAILER, HTTP_BUFFER_RAW_URI, HTTP_BUFFER_STAT_CODE, HTTP_BUFFER_STAT_MSG, HTTP_BUFFER_TRAILER, HTTP_BUFFER_TRUE_IP, HTTP_BUFFER_URI, HTTP_BUFFER_VERSION, - BUFFER_VBA_DATA, HTTP_BUFFER_MAX }; + BUFFER_JS_DATA, BUFFER_VBA_DATA, HTTP_BUFFER_MAX }; // Peg counts // This enum must remain synchronized with HttpModule::peg_names[] in http_tables.cc diff --git a/src/service_inspectors/http_inspect/http_flow_data.h b/src/service_inspectors/http_inspect/http_flow_data.h index 415fd4c00..cf02b77f6 100644 --- a/src/service_inspectors/http_inspect/http_flow_data.h +++ b/src/service_inspectors/http_inspect/http_flow_data.h @@ -170,6 +170,8 @@ private: HttpCommon::STAT_NOT_PRESENT }; int64_t detect_depth_remaining[2] = { HttpCommon::STAT_NOT_PRESENT, HttpCommon::STAT_NOT_PRESENT }; + int64_t js_norm_depth_remaining[2] = { HttpCommon::STAT_NOT_PRESENT, + HttpCommon::STAT_NOT_PRESENT }; int32_t publish_depth_remaining[2] = { HttpCommon::STAT_NOT_PRESENT, HttpCommon::STAT_NOT_PRESENT }; uint64_t expected_trans_num[2] = { 1, 1 }; diff --git 
a/src/service_inspectors/http_inspect/http_inspect.cc b/src/service_inspectors/http_inspect/http_inspect.cc index d9a65dc61..162347108 100755 --- a/src/service_inspectors/http_inspect/http_inspect.cc +++ b/src/service_inspectors/http_inspect/http_inspect.cc @@ -133,8 +133,7 @@ HttpInspect::HttpInspect(const HttpParaList* params_) : bool HttpInspect::configure(SnortConfig* ) { - if ( params->js_norm_param.js_norm ) - params->js_norm_param.js_norm->configure(); + params->js_norm_param.js_norm->configure(); return true; } @@ -251,6 +250,9 @@ bool HttpInspect::get_buf(InspectionBuffer::Type ibt, Packet* p, InspectionBuffe case InspectionBuffer::IBT_VBA: return get_buf(BUFFER_VBA_DATA, p, b); + case InspectionBuffer::IBT_JS_DATA: + return get_buf(BUFFER_JS_DATA, p, b); + default: return false; } @@ -306,6 +308,7 @@ bool HttpInspect::get_fp_buf(InspectionBuffer::Type ibt, Packet* p, InspectionBu break; case InspectionBuffer::IBT_BODY: case InspectionBuffer::IBT_VBA: + case InspectionBuffer::IBT_JS_DATA: if ((get_latest_is(p) != IS_FIRST_BODY) && (get_latest_is(p) != IS_BODY)) return false; break; diff --git a/src/service_inspectors/http_inspect/http_js_norm.cc b/src/service_inspectors/http_inspect/http_js_norm.cc index 036dc7530..fbdaeb644 100644 --- a/src/service_inspectors/http_inspect/http_js_norm.cc +++ b/src/service_inspectors/http_inspect/http_js_norm.cc @@ -135,8 +135,8 @@ void HttpJsNorm::configure() configure_once = true; } -void HttpJsNorm::enhanced_external_normalize(const Field& input, Field& output, - HttpInfractions* infractions, HttpFlowData* ssn) const +void HttpJsNorm::enhanced_external_normalize(const Field& input, + HttpInfractions* infractions, HttpFlowData* ssn, char*& out_buf, size_t& out_len) const { if (ssn->js_built_in_event) return; @@ -212,20 +212,19 @@ void HttpJsNorm::enhanced_external_normalize(const Field& input, Field& output, } auto result = js_ctx.get_script(); - auto script_ptr = result.first; + out_buf = result.first; - if (script_ptr) 
+ if (out_buf) { - auto script_len = result.second; - output.set(script_len, reinterpret_cast(script_ptr), true); + out_len = result.second; trace_logf(1, http_trace, TRACE_JS_DUMP, nullptr, - "js_data[%zu]: %.*s\n", script_len, static_cast(script_len), script_ptr); + "js_data[%zu]: %.*s\n", out_len, static_cast(out_len), out_buf); } } -void HttpJsNorm::enhanced_inline_normalize(const Field& input, Field& output, - HttpInfractions* infractions, HttpFlowData* ssn) const +void HttpJsNorm::enhanced_inline_normalize(const Field& input, + HttpInfractions* infractions, HttpFlowData* ssn, char*& out_buf, size_t& out_len) const { const char* ptr = (const char*)input.start(); const char* const end = ptr + input.length(); @@ -339,15 +338,14 @@ void HttpJsNorm::enhanced_inline_normalize(const Field& input, Field& output, auto js_ctx = ssn->js_normalizer; auto result = js_ctx->get_script(); - auto script_ptr = result.first; + out_buf = result.first; - if (script_ptr) + if (out_buf) { - auto script_len = result.second; - output.set(script_len, (const uint8_t*)script_ptr, true); + out_len = result.second; trace_logf(1, http_trace, TRACE_JS_DUMP, nullptr, - "js_data[%zu]: %.*s\n", script_len, static_cast(script_len), script_ptr); + "js_data[%zu]: %.*s\n", out_len, static_cast(out_len), out_buf); } if (!script_continue) diff --git a/src/service_inspectors/http_inspect/http_js_norm.h b/src/service_inspectors/http_inspect/http_js_norm.h index 851ddcb65..ad6225710 100644 --- a/src/service_inspectors/http_inspect/http_js_norm.h +++ b/src/service_inspectors/http_inspect/http_js_norm.h @@ -43,8 +43,10 @@ public: void legacy_normalize(const Field& input, Field& output, HttpInfractions*, HttpEventGen*, int max_javascript_whitespaces) const; - void enhanced_inline_normalize(const Field& input, Field& output, HttpInfractions*, HttpFlowData*) const; - void enhanced_external_normalize(const Field& input, Field& output, HttpInfractions*, HttpFlowData*) const; + void 
enhanced_inline_normalize(const Field& input, HttpInfractions*, HttpFlowData*, + char*& out_buf, size_t& out_len) const; + void enhanced_external_normalize(const Field& input, HttpInfractions*, HttpFlowData*, + char*& out_buf, size_t& out_len) const; void configure(); diff --git a/src/service_inspectors/http_inspect/http_module.cc b/src/service_inspectors/http_inspect/http_module.cc index eb75456bd..f8da39760 100755 --- a/src/service_inspectors/http_inspect/http_module.cc +++ b/src/service_inspectors/http_inspect/http_module.cc @@ -89,10 +89,8 @@ const Parameter HttpModule::http_params[] = { "normalize_javascript", Parameter::PT_BOOL, nullptr, "false", "use legacy normalizer to normalize JavaScript in response bodies" }, - { "js_normalization_depth", Parameter::PT_INT, "-1:max53", "0", - "enable enhanced normalizer (0 is disabled); " - "number of input JavaScript bytes to normalize (-1 unlimited) " - "(experimental)" }, + { "js_normalization_depth", Parameter::PT_INT, "-1:max53", "-1", + "number of input JavaScript bytes to normalize (-1 unlimited)" }, // range of accepted identifier names is (var_0000:var_ffff), so the max is 2^16 { "js_norm_identifier_depth", Parameter::PT_INT, "0:65536", "65536", @@ -100,14 +98,13 @@ const Parameter HttpModule::http_params[] = { "js_norm_max_tmpl_nest", Parameter::PT_INT, "0:255", "32", "maximum depth of template literal nesting that enhanced javascript normalizer " - "will process (experimental)" }, + "will process" }, { "js_norm_max_scope_depth", Parameter::PT_INT, "0:65535", "256", - "maximum depth of scope nesting that enhanced JavaScript normalizer will process " - "(experimental)" }, + "maximum depth of scope nesting that enhanced JavaScript normalizer will process" }, { "js_norm_built_in_ident", Parameter::PT_LIST, js_built_in_ident_param, nullptr, - "list of JavaScript built-in identifiers which will not be normalized (experimental)" }, + "list of JavaScript built-in identifiers which will not be normalized" }, { 
"max_javascript_whitespaces", Parameter::PT_INT, "1:65535", "200", "maximum consecutive whitespaces allowed within the JavaScript obfuscated data" }, @@ -268,9 +265,6 @@ bool HttpModule::set(const char*, Value& val, SnortConfig*) else if (val.is("normalize_javascript")) { params->js_norm_param.normalize_javascript = val.get_bool(); - params->js_norm_param.is_javascript_normalization = - params->js_norm_param.is_javascript_normalization - or params->js_norm_param.normalize_javascript; } else if (val.is("js_norm_identifier_depth")) { @@ -278,10 +272,7 @@ bool HttpModule::set(const char*, Value& val, SnortConfig*) } else if (val.is("js_normalization_depth")) { - int64_t v = val.get_int64(); - params->js_norm_param.js_normalization_depth = v; - params->js_norm_param.is_javascript_normalization = - params->js_norm_param.is_javascript_normalization or (v != 0); + params->js_norm_param.js_normalization_depth = val.get_int64(); } else if (val.is("js_norm_max_tmpl_nest")) { @@ -480,12 +471,7 @@ bool HttpModule::end(const char* fqn, int, SnortConfig*) params->uri_param.iis_unicode_code_page); } - if ( params->js_norm_param.normalize_javascript and - params->js_norm_param.js_normalization_depth ) - ParseError("Cannot use normalize_javascript and js_normalization_depth together."); - - if ( params->js_norm_param.is_javascript_normalization ) - params->js_norm_param.js_norm = new HttpJsNorm(params->uri_param, + params->js_norm_param.js_norm = new HttpJsNorm(params->uri_param, params->js_norm_param.js_normalization_depth, params->js_norm_param.js_identifier_depth, params->js_norm_param.max_template_nesting, params->js_norm_param.max_scope_depth, params->js_norm_param.built_in_ident); diff --git a/src/service_inspectors/http_inspect/http_module.h b/src/service_inspectors/http_inspect/http_module.h index 58569955c..3703f45ac 100755 --- a/src/service_inspectors/http_inspect/http_module.h +++ b/src/service_inspectors/http_inspect/http_module.h @@ -66,8 +66,7 @@ public: public: 
~JsNormParam(); bool normalize_javascript = false; - bool is_javascript_normalization = false; - int64_t js_normalization_depth = 0; + int64_t js_normalization_depth = -1; int32_t js_identifier_depth = 0; uint8_t max_template_nesting = 32; uint32_t max_scope_depth = 256; diff --git a/src/service_inspectors/http_inspect/http_msg_body.cc b/src/service_inspectors/http_inspect/http_msg_body.cc index 9f61c19f5..b48394cc6 100644 --- a/src/service_inspectors/http_inspect/http_msg_body.cc +++ b/src/service_inspectors/http_inspect/http_msg_body.cc @@ -80,6 +80,7 @@ void HttpMsgBody::publish() void HttpMsgBody::bookkeeping_regular_flush(uint32_t& partial_detect_length, uint8_t*& partial_detect_buffer, uint32_t& partial_js_detect_length, int32_t detect_length) { + session_data->js_norm_depth_remaining[source_id] = session_data->detect_depth_remaining[source_id]; session_data->detect_depth_remaining[source_id] -= detect_length; partial_detect_buffer = nullptr; partial_detect_length = 0; @@ -160,7 +161,7 @@ void HttpMsgBody::analyze() memcpy(cumulative_buffer + partial_detect_length, decompressed_file_body.start(), decompressed_file_body.length()); cumulative_data.set(total_length, cumulative_buffer, true); - do_js_normalization(cumulative_data, js_norm_body, true); + do_legacy_js_normalization(cumulative_data, js_norm_body); if ((int32_t)partial_js_detect_length == js_norm_body.length()) { clean_partial(partial_inspected_octets, partial_detect_length, @@ -169,7 +170,7 @@ void HttpMsgBody::analyze() } } else - do_js_normalization(decompressed_file_body, js_norm_body, false); + do_legacy_js_normalization(decompressed_file_body, js_norm_body); const int32_t detect_length = (js_norm_body.length() <= session_data->detect_depth_remaining[source_id]) ? 
@@ -319,63 +320,71 @@ void HttpMsgBody::fd_event_callback(void* context, int event)
     }
 }
 
-void HttpMsgBody::do_js_normalization(const Field& input, Field& output, bool partial_detect)
+void HttpMsgBody::do_enhanced_js_normalization(char*& out_buf, size_t& out_buf_len)
 {
-    if (!params->js_norm_param.is_javascript_normalization or source_id == SRC_CLIENT)
-        output.set(input);
-    else if (params->js_norm_param.normalize_javascript)
-        params->js_norm_param.js_norm->legacy_normalize(input, output,
-            transaction->get_infractions(source_id), session_data->events[source_id],
-            params->js_norm_param.max_javascript_whitespaces);
-    else if (params->js_norm_param.js_normalization_depth)
-    {
-        output.set(input);
+    const bool has_cumulative_data = (cumulative_data.length() > 0);
+    Field& input = has_cumulative_data ? cumulative_data : decompressed_file_body;
 
-        bool js_continuation = session_data->js_normalizer;
-        uint8_t*& buf = session_data->js_detect_buffer[source_id];
-        uint32_t& len = session_data->js_detect_length[source_id];
+    bool js_continuation = session_data->js_normalizer;
+    uint8_t*& buf = session_data->js_detect_buffer[source_id];
+    uint32_t& len = session_data->js_detect_length[source_id];
 
-        if (partial_detect)
-            session_data->release_js_ctx();
-        else
-        {
-            session_data->update_deallocations(len);
-            delete[] buf;
-            buf = nullptr;
-            len = 0;
-        }
+    if (has_cumulative_data)
+        session_data->release_js_ctx();
+    else
+    {
+        session_data->update_deallocations(len);
+        delete[] buf;
+        buf = nullptr;
+        len = 0;
+    }
 
-        auto http_header = get_header(source_id);
-        if (http_header and http_header->is_external_js())
-            params->js_norm_param.js_norm->enhanced_external_normalize(input, enhanced_js_norm_body,
-                transaction->get_infractions(source_id), session_data);
-        else
-            params->js_norm_param.js_norm->enhanced_inline_normalize(input, enhanced_js_norm_body,
-                transaction->get_infractions(source_id), session_data);
+    auto http_header = get_header(source_id);
 
-        const int32_t norm_length
=
-            (enhanced_js_norm_body.length() <= session_data->detect_depth_remaining[source_id]) ?
-            enhanced_js_norm_body.length() : session_data->detect_depth_remaining[source_id];
+    if (http_header and http_header->is_external_js())
+        params->js_norm_param.js_norm->enhanced_external_normalize(input,
+            transaction->get_infractions(source_id), session_data, out_buf, out_buf_len);
+    else
+        params->js_norm_param.js_norm->enhanced_inline_normalize(input,
+            transaction->get_infractions(source_id), session_data, out_buf, out_buf_len);
 
-        if ( norm_length > 0 )
-        {
-            set_js_data(enhanced_js_norm_body.start(), (unsigned int)norm_length);
+    out_buf_len = static_cast(out_buf_len) <= session_data->js_norm_depth_remaining[source_id] ?
+        out_buf_len : session_data->js_norm_depth_remaining[source_id];
 
-            if (partial_detect)
-                return;
+    if (out_buf_len > 0)
+    {
+        if (has_cumulative_data)
+            return;
 
-            if (js_continuation)
-            {
-                auto nscript_len = enhanced_js_norm_body.length();
-                uint8_t* nscript = new uint8_t[nscript_len];
+        if (js_continuation)
+        {
+            uint8_t* nscript = new uint8_t[out_buf_len];
 
-                memcpy(nscript, enhanced_js_norm_body.start(), nscript_len);
-                buf = nscript;
-                len = nscript_len;
-                session_data->update_allocations(len);
-            }
+            memcpy(nscript, out_buf, out_buf_len);
+            buf = nscript;
+            len = out_buf_len;
+            session_data->update_allocations(len);
         }
     }
+    else
+    {
+        delete[] out_buf;
+        out_buf = nullptr;
+        out_buf_len = 0;
+    }
+}
+
+void HttpMsgBody::do_legacy_js_normalization(const Field& input, Field& output)
+{
+    if (!params->js_norm_param.normalize_javascript || source_id == SRC_CLIENT)
+    {
+        output.set(input);
+        return;
+    }
+
+    params->js_norm_param.js_norm->legacy_normalize(input, output,
+        transaction->get_infractions(source_id), session_data->events[source_id],
+        params->js_norm_param.max_javascript_whitespaces);
 }
 
 void HttpMsgBody::do_file_processing(const Field& file_data)
@@ -543,6 +552,24 @@ const Field& HttpMsgBody::get_decomp_vba_data()
     return decompressed_vba_data;
 }
+const Field& HttpMsgBody::get_norm_js_data()
+{
+    if (enhanced_js_norm_body.length() != STAT_NOT_COMPUTE)
+        return enhanced_js_norm_body;
+
+    char* buf = nullptr;
+    size_t buf_len = 0;
+
+    do_enhanced_js_normalization(buf, buf_len);
+
+    if (buf && buf_len)
+        enhanced_js_norm_body.set(buf_len, reinterpret_cast(buf), true);
+    else
+        enhanced_js_norm_body.set(STAT_NOT_PRESENT);
+
+    return enhanced_js_norm_body;
+}
+
 int32_t HttpMsgBody::get_publish_length() const
 {
     return publish_length;
diff --git a/src/service_inspectors/http_inspect/http_msg_body.h b/src/service_inspectors/http_inspect/http_msg_body.h
index ee1327181..2f1a9c388 100644
--- a/src/service_inspectors/http_inspect/http_msg_body.h
+++ b/src/service_inspectors/http_inspect/http_msg_body.h
@@ -39,6 +39,7 @@ public:
     HttpMsgBody* get_body() override { return this; }
     const Field& get_classic_client_body();
     const Field& get_decomp_vba_data();
+    const Field& get_norm_js_data();
     const Field& get_detect_data() { return detect_data; }
     const Field& get_msg_text_new() const { return msg_text_new; }
     static void fd_event_callback(void* context, int event);
@@ -62,7 +63,8 @@ private:
     void do_file_processing(const Field& file_data);
     void do_utf_decoding(const Field& input, Field& output);
     void do_file_decompression(const Field& input, Field& output);
-    void do_js_normalization(const Field& input, Field& output, bool partial_detect);
+    void do_enhanced_js_normalization(char*& out_buf, size_t& out_len);
+    void do_legacy_js_normalization(const Field& input, Field& output);
     void clean_partial(uint32_t& partial_inspected_octets, uint32_t& partial_detect_length,
         uint8_t*& partial_detect_buffer, uint32_t& partial_js_detect_length,
         int32_t detect_length);
diff --git a/src/service_inspectors/http_inspect/http_msg_header.cc b/src/service_inspectors/http_inspect/http_msg_header.cc
index e4ea11165..5f75894a4 100755
--- a/src/service_inspectors/http_inspect/http_msg_header.cc
+++
b/src/service_inspectors/http_inspect/http_msg_header.cc
@@ -449,6 +449,7 @@ void HttpMsgHeader::prepare_body()
     const int64_t& depth = (source_id == SRC_CLIENT) ? params->request_depth :
         params->response_depth;
     session_data->detect_depth_remaining[source_id] = (depth != -1) ? depth : INT64_MAX;
+    session_data->js_norm_depth_remaining[source_id] = session_data->detect_depth_remaining[source_id];
     if ((source_id == SRC_CLIENT) and params->publish_request_body and session_data->for_http2)
     {
         session_data->publish_octets[source_id] = 0;
diff --git a/src/service_inspectors/http_inspect/http_msg_section.cc b/src/service_inspectors/http_inspect/http_msg_section.cc
index fa0bc52da..5abd6d9c4 100644
--- a/src/service_inspectors/http_inspect/http_msg_section.cc
+++ b/src/service_inspectors/http_inspect/http_msg_section.cc
@@ -390,6 +390,14 @@ const Field& HttpMsgSection::get_classic_buffer(Cursor& c, const HttpBufferInfo&
         else
             return Field::FIELD_NULL;
     }
+    case BUFFER_JS_DATA:
+    {
+        HttpMsgBody* msg_body = get_body();
+        if (msg_body)
+            return msg_body->get_norm_js_data();
+        else
+            return Field::FIELD_NULL;
+    }
     default:
         assert(false);
         return Field::FIELD_NULL;
diff --git a/src/service_inspectors/http_inspect/ips_http.cc b/src/service_inspectors/http_inspect/ips_http.cc
index 5940ac00c..38d911ce1 100644
--- a/src/service_inspectors/http_inspect/ips_http.cc
+++ b/src/service_inspectors/http_inspect/ips_http.cc
@@ -70,6 +70,7 @@ bool HttpCursorModule::begin(const char*, int, SnortConfig*)
         break;
     case HTTP_BUFFER_CLIENT_BODY:
     case HTTP_BUFFER_RAW_BODY:
+    case BUFFER_JS_DATA:
         inspect_section = IS_BODY;
         break;
     case HTTP_BUFFER_RAW_TRAILER:
@@ -1201,6 +1202,46 @@ static const IpsApi version_api =
     nullptr
 };
 
+//-------------------------------------------------------------------------
+// js_data
+//-------------------------------------------------------------------------
+//
+
+#undef IPS_OPT
+#define IPS_OPT "js_data"
+#undef IPS_HELP
+#define IPS_HELP "rule option to set detection
cursor to normalized JavaScript data"
+static Module* js_data_mod_ctor()
+{
+    return new HttpCursorModule(IPS_OPT, IPS_HELP, BUFFER_JS_DATA, CAT_SET_JS_DATA,
+        PSI_JS_DATA);
+}
+
+static const IpsApi js_data_api =
+{
+    {
+        PT_IPS_OPTION,
+        sizeof(IpsApi),
+        IPSAPI_VERSION,
+        1,
+        API_RESERVED,
+        API_OPTIONS,
+        IPS_OPT,
+        IPS_HELP,
+        js_data_mod_ctor,
+        HttpCursorModule::mod_dtor
+    },
+    OPT_TYPE_DETECTION,
+    0, PROTO_BIT__TCP,
+    nullptr,
+    nullptr,
+    nullptr,
+    nullptr,
+    HttpIpsOption::opt_ctor,
+    HttpIpsOption::opt_dtor,
+    nullptr
+};
+
 //-------------------------------------------------------------------------
 // plugins
 //-------------------------------------------------------------------------
@@ -1223,4 +1264,5 @@ const BaseApi* ips_http_trailer = &trailer_api.base;
 const BaseApi* ips_http_true_ip = &true_ip_api.base;
 const BaseApi* ips_http_uri = &uri_api.base;
 const BaseApi* ips_http_version = &version_api.base;
+const BaseApi* ips_js_data = &js_data_api.base;
 
diff --git a/src/service_inspectors/http_inspect/ips_http.h b/src/service_inspectors/http_inspect/ips_http.h
index 81174d236..c97b42506 100644
--- a/src/service_inspectors/http_inspect/ips_http.h
+++ b/src/service_inspectors/http_inspect/ips_http.h
@@ -32,7 +32,7 @@
 enum PsIdx { PSI_CLIENT_BODY, PSI_COOKIE, PSI_HEADER, PSI_METHOD, PSI_PARAM, PSI_RAW_BODY,
     PSI_RAW_COOKIE, PSI_RAW_HEADER, PSI_RAW_REQUEST, PSI_RAW_STATUS, PSI_RAW_TRAILER, PSI_RAW_URI,
     PSI_STAT_CODE, PSI_STAT_MSG, PSI_TRAILER,
-    PSI_TRUE_IP, PSI_URI, PSI_VERSION, PSI_VBA_DATA, PSI_MAX };
+    PSI_TRUE_IP, PSI_URI, PSI_VERSION, PSI_JS_DATA, PSI_VBA_DATA, PSI_MAX };
 
 class HttpCursorModule : public snort::Module
 {
diff --git a/src/utils/stats.cc b/src/utils/stats.cc
index 470f44948..047ccbd0a 100644
--- a/src/utils/stats.cc
+++ b/src/utils/stats.cc
@@ -198,10 +198,10 @@ const PegInfo pc_names[] =
     { CountType::SUM, "raw_key_searches", "fast pattern searches in raw key buffer" },
     { CountType::SUM, "raw_header_searches", "fast pattern searches in
raw header buffer" },
     { CountType::SUM, "method_searches", "fast pattern searches in method buffer" },
-    { CountType::SUM, "script_searches", "fast pattern searches in script buffer" },
     { CountType::SUM, "stat_code_searches", "fast pattern searches in status code buffer" },
     { CountType::SUM, "stat_msg_searches", "fast pattern searches in status message buffer" },
     { CountType::SUM, "cookie_searches", "fast pattern searches in cookie buffer" },
+    { CountType::SUM, "js_data_searches", "fast pattern searches in js_data buffer" },
     { CountType::SUM, "vba_searches", "fast pattern searches in MS Office Visual Basic for Applications buffer" },
     { CountType::SUM, "offloads", "fast pattern searches that were offloaded" },
     { CountType::SUM, "alerts", "alerts not including IP reputation" },
diff --git a/src/utils/stats.h b/src/utils/stats.h
index 947ad2b39..76e491012 100644
--- a/src/utils/stats.h
+++ b/src/utils/stats.h
@@ -49,10 +49,10 @@ struct PacketCount
     PegCount raw_key_searches;
     PegCount raw_header_searches;
     PegCount method_searches;
-    PegCount script_searches;
     PegCount stat_code_searches;
     PegCount stat_msg_searches;
     PegCount cookie_searches;
+    PegCount js_data_searches;
     PegCount vba_searches;
     PegCount offloads;
     PegCount alert_pkts;