thirdparty/vectorscan.git
Alex Coyte [Wed, 2 Dec 2015 04:15:02 +0000 (15:15 +1100)] 
Rework literal overlap checks for merging engines

Also increase the size of chunks we consider merging for castles.

Alex Coyte [Wed, 2 Dec 2015 03:41:57 +0000 (14:41 +1100)] 
Introduce REPEAT_ALWAYS model for {0,} castle repeats
As Castle guards the repeats, no more state is needed for these repeats

Alex Coyte [Wed, 2 Dec 2015 03:23:02 +0000 (14:23 +1100)] 
Allow lag on castle infixes to be reduced
Reducing lag allows for castles to be merged more effectively

Alex Coyte [Sun, 6 Dec 2015 23:23:32 +0000 (10:23 +1100)] 
Use add_edge_if_not_present in somMayGoBackwards()

As somMayGoBackwards() operates on a copy of the graph where virtual
starts have been collapsed on to startDs, we need to be careful not to
create parallel edges.
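
The idea of adding an edge only when it is not already present can be sketched as follows. This is a minimal illustration, not the Hyperscan helper itself: `TinyGraph`, `edge_count` and `collapse_example` are hypothetical names, and a `std::multiset` stands in for a graph type that would otherwise accept duplicate edges.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical stand-in for a graph that would otherwise accept duplicate
// (u, v) edges, i.e. parallel edges.
struct TinyGraph {
    std::multiset<std::pair<int, int>> edges;

    void add_edge(int u, int v) { edges.emplace(u, v); }

    // Returns true if a new edge was inserted, false if (u, v) already existed.
    bool add_edge_if_not_present(int u, int v) {
        if (edges.count({u, v}) != 0) {
            return false;
        }
        add_edge(u, v);
        return true;
    }

    std::size_t edge_count(int u, int v) const { return edges.count({u, v}); }
};

// Collapsing two vertices onto one can request the same edge twice; only a
// single copy is kept.
inline void collapse_example() {
    TinyGraph g;
    g.add_edge_if_not_present(0, 1);
    g.add_edge_if_not_present(0, 1);
    assert(g.edge_count(0, 1) == 1);
}
```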

Matthew Barr [Fri, 18 Dec 2015 03:37:29 +0000 (14:37 +1100)] 
Bump version number

Justin Viiret [Wed, 16 Dec 2015 03:31:43 +0000 (14:31 +1100)] 
Small updates to documentation for 4.1

Justin Viiret [Wed, 16 Dec 2015 03:10:46 +0000 (14:10 +1100)] 
Add ChangeLog

Xiang Wang [Wed, 2 Dec 2015 12:24:57 +0000 (07:24 -0500)] 
simplify max clique analysis

Justin Viiret [Wed, 2 Dec 2015 07:16:49 +0000 (18:16 +1100)] 
Add per-top findMinWidth etc for NFA graphs

Justin Viiret [Wed, 2 Dec 2015 22:27:57 +0000 (09:27 +1100)] 
CastleProto: track next top explicitly

Repeats may be removed (e.g. by pruning in role aliasing passes)
leaving "holes" in the top map. Track the next top to use explicitly,
rather than using repeats.size().
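
Why a size-based top breaks once entries can be removed may be easier to see in a small sketch. `ProtoSketch` and its members are hypothetical and much simpler than CastleProto; the repeat payload is a placeholder `int`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical, simplified prototype: maps a "top" (trigger id) to a repeat.
struct ProtoSketch {
    std::map<uint32_t, int> repeats; // top -> repeat payload (placeholder int)
    uint32_t next_top = 0;           // never reused, even after an erase

    // Allocates a fresh top from the explicit counter. Using repeats.size()
    // here could hand out a top that is still in use once a hole exists.
    uint32_t add(int repeat) {
        uint32_t top = next_top++;
        assert(repeats.count(top) == 0);
        repeats.emplace(top, repeat);
        return top;
    }

    void erase(uint32_t top) { repeats.erase(top); }
};
```

With tops 0, 1 and 2 allocated and top 0 erased, `repeats.size()` is 2, which is still occupied; the explicit counter yields 3 instead.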

Justin Viiret [Wed, 2 Dec 2015 00:31:09 +0000 (11:31 +1100)] 
CastleProto: track mapping of reports to tops

This allows us to speed up report-based queries, like dedupe checking.
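
A reverse index of this kind might look like the sketch below. All names (`ReportIndexSketch`, `top_report`, `report_tops`) are assumptions for illustration; the log does not show how CastleProto stores the mapping.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical sketch: keep a report -> tops index next to the top -> report
// map, so "does any repeat raise report R?" is a lookup rather than a scan.
struct ReportIndexSketch {
    std::map<uint32_t, uint32_t> top_report;            // top -> report id
    std::map<uint32_t, std::set<uint32_t>> report_tops; // report id -> tops

    void add(uint32_t top, uint32_t report) {
        assert(top_report.count(top) == 0);
        top_report[top] = report;
        report_tops[report].insert(top);
    }

    // Removes a top and keeps the reverse index consistent.
    void erase(uint32_t top) {
        auto it = top_report.find(top);
        if (it == top_report.end()) {
            return;
        }
        auto rt = report_tops.find(it->second);
        rt->second.erase(top);
        if (rt->second.empty()) {
            report_tops.erase(rt);
        }
        top_report.erase(it);
    }

    bool has_report(uint32_t report) const {
        return report_tops.count(report) != 0;
    }
};
```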

Justin Viiret [Tue, 1 Dec 2015 23:39:32 +0000 (10:39 +1100)] 
assignDkeys: use flat_set<ReportID>, not set

Justin Viiret [Tue, 1 Dec 2015 23:24:54 +0000 (10:24 +1100)] 
findMinWidth, findMaxWidth: width for a given top

Currently only implemented for Castle suffixes.

Justin Viiret [Tue, 1 Dec 2015 22:54:55 +0000 (09:54 +1100)] 
RoseDedupeAuxImpl: collect unique suffixes first

Justin Viiret [Tue, 1 Dec 2015 22:47:59 +0000 (09:47 +1100)] 
role aliasing: simplify hashRightRoleProperties

Using the full report set for a suffix as an input to this hash was very
slow at scale.

Justin Viiret [Tue, 1 Dec 2015 22:38:20 +0000 (09:38 +1100)] 
castle: simplify find_next_top

Tops are no longer sparse in CastleProto, so the linear scan for holes
isn't necessary.

Justin Viiret [Fri, 27 Nov 2015 02:30:59 +0000 (13:30 +1100)] 
Make key 64 bits where large shifts may be used.

This fixes a long-standing issue with large multibit structures.
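
The general hazard is that shifting a 32-bit value by 32 or more bits is undefined behaviour in C++. The sketch below shows the widening idea only; `bit_at` is a hypothetical helper and is not taken from the multibit code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: build a key with a single bit set at position `pos`.
// Doing the shift on a 64-bit operand keeps positions 32..63 well defined.
inline uint64_t bit_at(unsigned pos) {
    assert(pos < 64);
    return uint64_t{1} << pos; // a plain `1u << pos` would be UB for pos >= 32
}
```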

Justin Viiret [Wed, 25 Nov 2015 06:05:36 +0000 (17:05 +1100)] 
PCRE includes U+180E in /[:print:]/8W

Justin Viiret [Wed, 25 Nov 2015 03:53:27 +0000 (14:53 +1100)] 
Update defn of class [:punct:] for PCRE 8.38

Justin Viiret [Tue, 17 Nov 2015 06:23:52 +0000 (17:23 +1100)] 
Unify handling of caseless flag in class parser

Apply caselessness to each element added to a class, rather than all at
finalize time (which required separated ucp dnf and-ucp working data).

Unifies the behaviour of AsciiComponentClass and Utf8ComponentClass in
this respect.

Justin Viiret [Mon, 16 Nov 2015 05:43:43 +0000 (16:43 +1100)] 
Fix defn of POSIX graph, print, punct classes

The POSIX classes [:graph:], [:print:] and [:punct:] are handled
specially in UCP mode by PCRE. This change matches that behaviour.

Mohammad Abdul Awal [Tue, 17 Nov 2015 17:50:23 +0000 (17:50 +0000)] 
FDR runtime simplification

Removed static specialisation of domains.

Justin Viiret [Fri, 13 Nov 2015 03:36:28 +0000 (14:36 +1100)] 
ng_execute: update interface to use flat_set

This changes all the execute_graph() interfaces so that instead of
mutating a std::set of vertices, they accept an initial flat_set of
states and return a resultant flat_set of states after execution.

(Note that internally execute_graph() still uses bitsets)

This is both faster and more flexible.

Justin Viiret [Thu, 12 Nov 2015 02:27:55 +0000 (13:27 +1100)] 
Restore \Q..\E support in character classes

Justin Viiret [Thu, 12 Nov 2015 04:27:11 +0000 (15:27 +1100)] 
Introduce copy_bytes for writing into bytecode

Protects memcpy from nullptr sources, which triggers failures in GCC's
UB sanitizer.
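
A guarded copy along these lines could be sketched as below. The name `copy_bytes_sketch` and its signature are assumptions; the actual `copy_bytes` interface is not shown in this log. The point is that `memcpy` requires valid pointers even for a zero-length copy, so the call is skipped when there is nothing to copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sketch of a guarded copy: skip memcpy entirely for empty copies, where the
// source (e.g. the data pointer of an empty container) may be nullptr.
inline void copy_bytes_sketch(void *dest, const void *src, std::size_t len) {
    if (len == 0) {
        return; // nothing to copy; src or dest may legitimately be nullptr
    }
    assert(dest != nullptr && src != nullptr);
    std::memcpy(dest, src, len);
}
```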

Justin Viiret [Tue, 10 Nov 2015 05:18:42 +0000 (16:18 +1100)] 
repeatStoreSparseOptimalP: make diff a u32

As delta is a u32, we know diff will always fit within a u32 as well.
Silences a warning from Coverity.

Matthew Barr [Thu, 5 Nov 2015 03:49:04 +0000 (14:49 +1100)] 
cmake: improve build paths for nested builds

If Hyperscan is built as a subproject of another cmake project, it helps to
refer to PROJECT_xx_DIR instead of CMAKE_xx_DIR, etc.

Matthew Barr [Thu, 5 Nov 2015 03:46:07 +0000 (14:46 +1100)] 
Fix includes to meet our usual guidelines

Justin Viiret [Mon, 9 Nov 2015 01:59:36 +0000 (12:59 +1100)] 
Refine ComponentClass::class_empty

ComponentClass::class_empty should only be used on finalized classes to
determine whether a given class contains any elements; it should not
take the cr_ucp or cps_ucp into account, as they have been folded in by
the finalize call.

Fixes our failure to identify that the pattern /[^\D\d]/8W can never
match.

Justin Viiret [Mon, 9 Nov 2015 01:50:52 +0000 (12:50 +1100)] 
Don't use class_empty in early class parsing

Instead, explicitly track whether we're still in the early class parsing
machine.

Justin Viiret [Sun, 8 Nov 2015 23:49:19 +0000 (10:49 +1100)] 
Remove dead ComponentClass::{get,set}FirstChar

Justin Viiret [Sun, 8 Nov 2015 23:37:20 +0000 (10:37 +1100)] 
Rework parser rejection for POSIX collating elems

Implement rejection of POSIX collating elements ("[.ch.]" and "[=ch=]")
entirely in the Ragel parser, using the same approach both inside and
outside character classes.

Fix buggy rejection of [^.ch.], which we should accept as a character
class.

Justin Viiret [Tue, 3 Nov 2015 05:24:06 +0000 (16:24 +1100)] 
depth: correct sign in printf format

Justin Viiret [Tue, 3 Nov 2015 05:23:27 +0000 (16:23 +1100)] 
nfa_api_queue: debug printf format fix

Justin Viiret [Tue, 3 Nov 2015 05:22:39 +0000 (16:22 +1100)] 
mpv_dump: correct hex escapes in printf format

Justin Viiret [Tue, 3 Nov 2015 05:21:56 +0000 (16:21 +1100)] 
simplegrep: use correct sign in printf format

Justin Viiret [Tue, 3 Nov 2015 05:19:47 +0000 (16:19 +1100)] 
compare: always use braces for for/if blocks

Justin Viiret [Tue, 3 Nov 2015 05:13:12 +0000 (16:13 +1100)] 
limex_dump: use 'override' keyword in subclass

Justin Viiret [Tue, 3 Nov 2015 05:12:36 +0000 (16:12 +1100)] 
NGWrapper: mark dtor with override

Justin Viiret [Tue, 3 Nov 2015 05:11:56 +0000 (16:11 +1100)] 
parser: use 'override' keyword in subclasses

Justin Viiret [Tue, 3 Nov 2015 04:17:17 +0000 (15:17 +1100)] 
Add inlined sparseLastTop

This allows the code to be inlined into other sparse optimal repeat
functions.

Justin Viiret [Tue, 3 Nov 2015 03:58:01 +0000 (14:58 +1100)] 
storeInitialRingTopPatch: fix large delta bug

Check for staleness up front, so that it is safe to use u32 values to
handle adding more tops.

Adds LargeGap unit tests.

Justin Viiret [Tue, 3 Nov 2015 02:18:34 +0000 (13:18 +1100)] 
repeat: use u32 arithmetic explicitly

In some ring-based models, we know that if the ring is not stale, then
all our bounds should fit within 32-bits. This change makes these
explicitly u32 rather than implicitly narrowing later on.

Justin Viiret [Mon, 2 Nov 2015 03:41:17 +0000 (14:41 +1100)] 
repeatRecurTable: no need for u64a return type

Xiang Wang [Thu, 29 Oct 2015 10:43:47 +0000 (06:43 -0400)] 
Optimize max clique analysis

Use vectors of state ids to avoid the overhead of subgraph copies

Alex Coyte [Mon, 2 Nov 2015 03:36:43 +0000 (14:36 +1100)] 
move oversize graph check out of Automaton_holder ctor

Alex Coyte [Mon, 2 Nov 2015 02:35:04 +0000 (13:35 +1100)] 
raw_som_dfa: initialize members in constructor

Justin Viiret [Mon, 2 Nov 2015 01:00:09 +0000 (12:00 +1100)] 
LimEx NFA: unify flush br/estate behaviour

Make the GPR NFA models only clear cached_estate conditionally based on
cached_br, as per the SIMD models.

Justin Viiret [Mon, 2 Nov 2015 00:57:55 +0000 (11:57 +1100)] 
LimEx NFA: no need to zero estate cache in STREAM

We believe that we have now solved the issues that required zeroing of
the exception state in STREAM_FN and REV_STREAM_FN.

Justin Viiret [Sun, 1 Nov 2015 23:20:37 +0000 (10:20 +1100)] 
LimEx NFA: no need to zero init cached_esucc

All of the "exception cache" members are guarded by cached_esucc.

Alex Coyte [Mon, 2 Nov 2015 00:22:51 +0000 (11:22 +1100)] 
make Automaton_Base ctor protected

Makes explicit that Automaton_Base is intended to be used only as a base class.
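
The effect of a protected constructor can be shown with a short sketch. The class names below are hypothetical and do not mirror the real Automaton_Base members; the `static_assert` lines check at compile time that only the derived class can be constructed by outside code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical base class: the protected constructor means outside code
// cannot create a bare base object, only derived classes can.
class AutomatonBaseSketch {
public:
    int start() const { return start_; }

protected:
    explicit AutomatonBaseSketch(int start) : start_(start) {}

private:
    int start_;
};

class HolderAutomatonSketch : public AutomatonBaseSketch {
public:
    HolderAutomatonSketch() : AutomatonBaseSketch(1) {}
};

static_assert(!std::is_constructible<AutomatonBaseSketch, int>::value,
              "base is not directly constructible");
static_assert(std::is_default_constructible<HolderAutomatonSketch>::value,
              "derived class is constructible");
```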

Alex Coyte [Sun, 1 Nov 2015 23:28:31 +0000 (10:28 +1100)] 
add asserts to make bounds on alphaShift clear

Alex Coyte [Fri, 30 Oct 2015 05:20:18 +0000 (16:20 +1100)] 
doComponent: make it obvious that a is never null

Justin Viiret [Fri, 30 Oct 2015 04:10:03 +0000 (15:10 +1100)] 
RoseBuildImpl: init base_id members

These are set late in the Rose build process, when final IDs are
allocated.

Justin Viiret [Fri, 30 Oct 2015 04:01:20 +0000 (15:01 +1100)] 
FDR compiler: assert that all models are < 32 bits

Justin Viiret [Thu, 29 Oct 2015 22:52:49 +0000 (09:52 +1100)] 
Init filter members to nullptr

Note that BGL filters must be default-constructible.

Justin Viiret [Thu, 29 Oct 2015 22:43:28 +0000 (09:43 +1100)] 
Add q_last_type() queue function

Analogous to q_cur_type(), asserts that queue indices are within a valid
range.

Justin Viiret [Thu, 29 Oct 2015 22:33:04 +0000 (09:33 +1100)] 
assignStringsToBuckets: assert that there are lits

Matthew Barr [Fri, 30 Oct 2015 00:29:20 +0000 (11:29 +1100)]
Merge develop into master (tag: v4.0.1)

Matthew Barr [Fri, 30 Oct 2015 00:20:28 +0000 (11:20 +1100)] 
Bump version number

Matthew Barr [Fri, 30 Oct 2015 00:14:32 +0000 (11:14 +1100)] 
unit: Don't run unit-internal in release build

Matthew Barr [Thu, 29 Oct 2015 23:43:43 +0000 (10:43 +1100)] 
Remove unneeded code at preproc stage

If we know we have BMI2 we shouldn't produce the fallback code.

Matthew Barr [Thu, 29 Oct 2015 06:29:24 +0000 (17:29 +1100)] 
docs: describe BOOST_ROOT cmake variable

Matthew Barr [Thu, 29 Oct 2015 06:29:06 +0000 (17:29 +1100)] 
cmake: collection of fixes

Alex Coyte [Thu, 29 Oct 2015 03:35:02 +0000 (14:35 +1100)] 
reduce memory use in ng_small_literal_set/ng_literal_decorated

These passes kept temporary strings/paths alive longer than was needed, which
led to high memory usage during these passes in pathological cases.

Justin Viiret [Wed, 28 Oct 2015 22:08:40 +0000 (09:08 +1100)] 
Check for (and throw on) large min repeat

We were only checking for large maximum bounds, which meant that we
would attempt to compile A{N,} where N is huge.
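
Checking both bounds might look like the following sketch. The limit `kMaxRepeat`, the sentinel `kUnbounded`, the exception type and the function name are all assumptions; the log gives neither the real limit nor the real error type.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical limit and sentinel; the real values are not given in the log.
constexpr uint32_t kMaxRepeat = 32767;
constexpr uint32_t kUnbounded = UINT32_MAX; // the open upper bound in A{N,}

// Rejects a repeat whose minimum or (finite) maximum exceeds the limit.
// Checking only the maximum would let A{N,} with a huge N slip through,
// because its maximum is the unbounded sentinel.
inline void check_repeat_bounds(uint32_t min, uint32_t max) {
    if (min > kMaxRepeat) {
        throw std::invalid_argument("repeat minimum too large");
    }
    if (max != kUnbounded && max > kMaxRepeat) {
        throw std::invalid_argument("repeat maximum too large");
    }
}
```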

Matthew Barr [Mon, 26 Oct 2015 04:53:55 +0000 (15:53 +1100)] 
Update CMake required min version to 2.8.11

RedHat/CentOS 7 ship with 2.8.11 so this is a sane minimum.

Justin Viiret [Tue, 20 Oct 2015 02:24:23 +0000 (13:24 +1100)] 
Unbreak unit-internal for builds w/o dump support

Use printable, rather than escapeString.

Justin Viiret [Thu, 22 Oct 2015 23:59:48 +0000 (10:59 +1100)] 
Remove enum mqe_event and use u32 for queue events

We were using intermediate values in the enum and casting back and forth
with a u32; it is cleaner to just use a u32 and define some special
values.

Silences ICC warning #188: enumerated type mixed with another type.
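
The pattern of a plain u32 with a few named special values can be sketched as below. The constant names and numbers are illustrative only and are not the actual queue event definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical special values; names and numbers are illustrative only.
constexpr uint32_t EVENT_START = 0;
constexpr uint32_t EVENT_END = 1;
constexpr uint32_t EVENT_TOP_FIRST = 2; // top n is encoded as EVENT_TOP_FIRST + n

// Values above the named constants are ordinary integers, so no casts between
// an enum type and uint32_t are needed.
inline uint32_t top_event(uint32_t n) { return EVENT_TOP_FIRST + n; }
inline bool is_top_event(uint32_t ev) { return ev >= EVENT_TOP_FIRST; }
```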

Justin Viiret [Fri, 23 Oct 2015 04:32:55 +0000 (15:32 +1100)] 
Allow no scratch for stream reset API calls

Bring hs_reset_stream(), hs_reset_and_copy_stream()'s functionality into
line with hs_close_stream() by accepting a NULL scratch if and only if
the match callback is also NULL, indicating that no matches should be
delivered.
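
The argument rule described here can be sketched as a predicate. The types and the function below are placeholders, not the Hyperscan API, and the sketch assumes that a non-NULL scratch with a NULL callback is also acceptable.

```cpp
#include <cassert>

struct ScratchSketch {};                       // placeholder for scratch space
using MatchCallbackSketch = int (*)(unsigned); // placeholder callback type

// A missing scratch is acceptable only when no callback was supplied, because
// then no matches need to be delivered.
inline bool reset_args_valid(const ScratchSketch *scratch,
                             MatchCallbackSketch on_event) {
    return scratch != nullptr || on_event == nullptr;
}
```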

Justin Viiret [Fri, 23 Oct 2015 00:39:31 +0000 (11:39 +1100)] 
Custom NFA_API_NO_IMPL variant for zombie_status

Silences ICC warning #188: enumerated type mixed with another type.

Justin Viiret [Tue, 20 Oct 2015 03:37:15 +0000 (14:37 +1100)] 
sidecar: use aligned_zmalloc_unique

Justin Viiret [Thu, 15 Oct 2015 05:08:51 +0000 (16:08 +1100)] 
FDRp tests: less raw malloc/free

Justin Viiret [Wed, 14 Oct 2015 01:38:13 +0000 (12:38 +1100)] 
HyperscanScanGigabytesMatch: use a vector

Justin Viiret [Mon, 12 Oct 2015 23:49:45 +0000 (10:49 +1100)] 
nfagraph_find_matches: simplify/cleanup

Justin Viiret [Mon, 12 Oct 2015 23:16:54 +0000 (10:16 +1100)] 
nfagraph_comp: use common constructGraph

Matthew Barr [Mon, 19 Oct 2015 22:13:35 +0000 (09:13 +1100)]
Initial commit of Hyperscan (tag: v4.0.0)