run_accel() passed c_end - 1 to shuftiDoubleExec().
shuftiDoubleExecReal already handles the last-byte boundary internally
via check_last_byte(), so shortening the buffer caused it to miss
valid matches near the end and apply the wildcard check to the wrong
byte. Changed to pass c_end.
Fixes: ca70a3d9beca61b58c6709fead60ec662482d36e ("Fix double shufti's
vector end false positive (#325)")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
* shufti-double: Preserve first_char_mask after peel-off
first_char_mask was reset to Ones() after the peel-off block,
discarding carry-over state for cross-boundary pattern detection.
Remove the reset.
Fixes: ca70a3d9beca61b58c6709fead60ec662482d36e ("Fix double shufti's
vector end false positive (#325)")
Add three tests exercising the double shufti edge cases.
- ExecMatchVectorEdge: Two-byte pair ("ab") spanning the peel-off to
aligned block boundary is correctly detected. Validates that
first_char_mask state carries over and is not reset after peel-off.
- ExecNoMatchLastByte: First character of a double-byte pair ('x' from
"xy") at the last buffer byte does not cause a false positive when
the second character is absent.
- ExecMatchLastByte: Single-byte pattern ('a') at the last buffer byte
is detected via check_last_byte's reduce.
Add missing sentinel element to state transition vectors for both
16-state and 32-state DFA test configurations. The alpha_size includes
a sentinel entry at index alpha_size-1, so each state's next vector
must have alpha_size elements.
Add tests covering edge cases and broader scenarios:
- Early termination via callback (single and double char)
- No-match scenarios for single and double patterns
- Empty and minimal-length buffer handling
- Large buffer scanning (multi-vector iteration)
- Case-insensitive matching for single and double patterns
- Unaligned buffer scanning
- Various alignment boundary conditions
- All-match dense buffers for single and double patterns
simd: convert rshift64 macros to functions and fix simd_utils bugs (#376)
Convert rshift64_m128/m256/m512 macros to inline functions that
support runtime (non-constant) shift amounts on x86, matching the
existing lshift64 function implementations.
Also fix:
- lshift64_m256/rshift64_m256 parameter type from int to unsigned in
the non-256-bit fallback path (common/simd_utils.h)
- isnonzero512: remove redundant self-OR operations
- load512: fix alignment assertion to check m512 instead of m256
Fixes: 3f0f9e60526d ("move x86 implementations of simd_utils.h to util/arch/x86/")
Fixes: 6ff47528ba22 ("add scalar versions of the vectorized functions for architectures that don't support 256-bit/512-bit SIMD vectors such as ARM")
Fixes: 75aadb76f82e ("split arch-agnostic simd_utils.h functions into the common file")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
In nvermicelliExecReal, the tail path loads a vector from
(buf_end - S) so the unprocessed tail bytes sit at the END of the
vector (high offsets). However, first_zero_match_inverted<64>
applies a mask selecting only the FIRST 'len' bytes (low offsets),
which means it re-checks the already-scanned overlap region and
completely misses the actual tail bytes.
This only affects AVX-512 (S=64) because the 16-byte and 32-byte
specializations of first_zero_match_inverted mark 'len' as UNUSED and
always check the full vector.
Fix by passing S instead of (buf_end - d) as the length, so the full
vector is checked. The overlap bytes are guaranteed to already match,
so no false positives are possible, and the existing (rv < buf_end)
guard prevents out-of-range results.
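The window geometry described above can be modeled in plain C (a sketch with hypothetical names, not the nvermicelli source; a plain equality match stands in for the negated compare): the window is the last S bytes of the buffer, so the unprocessed tail sits at the window's high offsets, and masking only the first `len` lanes inspects the already-scanned overlap instead of the tail.

```c
#include <assert.h>
#include <stddef.h>

#define S 8  /* stand-in for the 64-byte AVX-512 vector width */

/* Scan only the first active_len lanes of an S-byte window,
 * mirroring the low-lane mask applied by the 64-byte path. */
static int scan_window(const unsigned char *win, size_t active_len,
                       unsigned char target) {
    for (size_t i = 0; i < active_len; i++) {
        if (win[i] == target) {
            return (int)i;
        }
    }
    return -1;
}

/* Tail path model: `done` bytes are already scanned. The window is
 * loaded from buf_end - S, so the unscanned tail occupies the window's
 * HIGH offsets. Bug: passing the tail length selects the LOW lanes
 * (the overlap). Fix: pass S so the whole window is checked. */
static int find_in_tail(const unsigned char *buf, size_t buflen,
                        size_t done, unsigned char target, int fixed) {
    const unsigned char *win = buf + buflen - S;
    size_t tail_len = buflen - done;
    int r = scan_window(win, fixed ? S : tail_len, target);
    return r < 0 ? -1 : (int)(buflen - S + (size_t)r);
}
```

With a match in the last byte, the buggy length misses it while the fixed call finds it; as the commit notes, re-checking the overlap is harmless because those bytes are known to contain no match.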
fix: correct SVE accelerator bugs in shufti, truffle, and noodle (#377)
* shufti-double: use predicate mask to prevent false positives
doubleMatched() in shufti_sve.hpp used svptrue_b8() for the final
comparison instead of the caller-provided predicate pg. When called
from dshuftiOnce() with a partial predicate (buffer shorter than SVE
vector length), inactive lanes loaded as zero could satisfy the match
condition, producing false positive matches.
Changed the return statement to use pg for svnot_z, ensuring inactive
lanes are excluded from match results.
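The predication difference can be shown with a one-lane scalar model of svnot_z's zeroing semantics (my reading of the commit; the helper name is illustrative): the result is the logical NOT where the governing predicate is active, and zero where it is inactive.

```c
#include <assert.h>

/* One-lane model of svnot_z (zeroing predication): !x where the
 * predicate lane is active, 0 where it is inactive. */
static int not_z(int pred, int x) {
    return pred ? !x : 0;
}
```

An inactive lane loads as x = 0. With an all-true predicate (the svptrue_b8 bug), not_z(1, 0) yields 1, a spurious match; with the caller's partial predicate pg, not_z(0, 0) yields 0 and the lane is excluded.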
Added 5 unit tests covering short/variable-length buffers with
null-byte pair patterns and mixed single/double-byte patterns to
catch regressions.
Fixes: 60b211250562626d6536e992cc1d0d52cd128f44 ("Use SVE for double shufti")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
* truffle: fix off-by-one in rtruffleExecSVE tail and add unit tests
Fix a bug in rtruffleExecSVE where the tail processing for short
buffers used svwhilele_b8 instead of svwhilelt_b8. svwhilele_b8(0, N)
activates lanes 0..N (N+1 lanes), reading one byte past the buffer
end. The forward path (truffleExecSVE) already correctly uses
svwhilelt_b8, which activates lanes 0..N-1 (N lanes).
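The lane counts can be written down as a scalar model (based on the architectural semantics quoted above; the helper names are illustrative): the inclusive form activates one more lane than the exclusive form.

```c
#include <assert.h>

/* Scalar models of the SVE while-predicate lane counts for a
 * base..bound range: "le" is inclusive of the bound, "lt" exclusive. */
static int lanes_whilele(int base, int bound) { return bound - base + 1; }
static int lanes_whilelt(int base, int bound) { return bound - base; }
```

For an N-byte tail exactly N active lanes are wanted: whilelt(0, N) gives N, while whilele(0, N) gives N+1 and reads one byte past the buffer end.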
Add 26 new unit tests for the truffle accelerator covering:
- Compile roundtrip: character ranges, empty class, same-nibble chars
- Forward exec: single byte buffers, high byte (>=0x80) matching,
same-nibble non-match, NUL char, dot (all chars), buffer-end match,
varying lengths (1-130), alignment sweep, multi-char classes,
all 256 single-char classes, 0x7F/0x80 boundary
- Reverse exec: single byte, high byte, NUL, buffer-start match,
varying lengths, large buffer (4K), alignment sweep, all 256
single-char classes, multiple matches, boundary chars
Fixes: c67076ce22452bdfe423063b273ded8bd7444aae ("Add truffle SVE implementation")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
* hwlm: correct return types and scan length in SVE noodle engine
- Change return type from hwlmcb_rv_t to hwlm_error_t to match the
actual return type of checkMatched() and singleCheckMatched()
- Fix scanDouble short-path condition: use (e - d) instead of scan_len
which could be stale after adjusting d for history
- Fix formatting: add space after 'if' keyword
Fixes: 0ba1cbb32b5b ("Add SVE2 support for noodle")
Fixes: b2332218a474 ("Remove possibly undefined behaviour from Noodle.")
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
---------
Fix out-of-bounds read in shuftiDoubleExecReal tail handling (#381)
Commit 9e9a10ad ("Fix double shufti's vector end false positive", #325)
changed the tail code in shuftiDoubleExecReal to read a full
vector from the current pointer (loadu(d)), which can overread past
buf_end by up to S-1 bytes. When the buffer ends at a page boundary
followed by unmapped memory, this causes a SIGSEGV.
Fix by reading the last S bytes backward from buf_end (matching the
approach used in shuftiExecReal), and falling back to memcpy for
buffers shorter than one vector width.
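The fixed tail-load strategy can be sketched in C (a hypothetical helper, not the shufti source): read the last S bytes ending exactly at buf_end when the buffer is long enough, and fall back to a zero-padded memcpy otherwise, so no load ever crosses buf_end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define S 16  /* stand-in vector width */

/* Fill `win` with exactly S bytes without ever reading past buf_end.
 * Long buffers: read backward from buf_end (may overlap bytes already
 * scanned, but stays inside the buffer). Short buffers: copy what
 * exists and zero-pad the rest. */
static void load_tail(const unsigned char *buf, const unsigned char *buf_end,
                      unsigned char win[S]) {
    size_t len = (size_t)(buf_end - buf);
    if (len >= S) {
        memcpy(win, buf_end - S, S);
    } else {
        memset(win, 0, S);
        memcpy(win, buf, len);
    }
}
```

Unlike loadu(d), which can touch up to S-1 bytes past buf_end, neither branch reads outside [buf, buf_end), so a page boundary right after the buffer is safe.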
Fix critical bugs found in the SuperVector SIMD abstraction layer.
SuperVector operator!() — x86 (SSE/AVX2/AVX512), ppc64el:
- Was XOR-ing with self (always returns Zeroes instead of bitwise
NOT).
- Note that some other operators depend on operator!().
SuperVector<16> Ones_vshl() — x86:
- Called vshr_128() instead of vshl_128().
Element-wise shift boundary and Unroller range — x86:
- vshl_32/vshr_32 on SuperVector<16>: zero-boundary was N==16,
instead of N>=32; Unroller range was <1,16> not <1,32>.
- vshl_64/vshr_64 on SuperVector<16>: same issue.
- vshl_64/vshr_64 on SuperVector<32>: same issue.
- vshr_64 on SuperVector<64>: same issue.
SuperVector<32> vshr_256_imm — x86:
- Was a copy-paste of vshl_256_imm.
SuperVector<64> vshr_256_imm — x86:
- Operated on v256[0] only with broken SuperVector<32> logic.
SuperVector<64> vsh{l,r}_* — x86, ppc64el, arm:
- Were incorrectly delegating to vshl_128/vshr_128. (x86)
- Did not have boundary checks. PPC wraps when it tries to shift
more than bit length. (ppc64el)
- Had signed rshifts. (arm)
comparison operators - arm:
- operator>=: used vcgeq_u8 (unsigned) instead of vcgeq_s8 (signed).
- operator<=: used vcgeq_s8 (>=) instead of vcleq_s8 (<=).
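Two of the listed bugs are easy to state in scalar form (a model of single lanes, not the SuperVector code): bitwise NOT must XOR with all-ones rather than with self, and a 32-bit lane is only fully shifted out at a count of 32 (the lane width), not 16 (the vector's byte width).

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of operator!(): bitwise NOT is XOR with Ones.
 * XOR-ing with self, as the buggy code did, always yields zero. */
static uint32_t lane_not(uint32_t x) {
    return x ^ 0xFFFFFFFFu;
}

/* Scalar model of a per-lane vshl_32: the zero boundary is the lane
 * width (32), and it must be checked explicitly because `x << n` is
 * undefined behaviour in C for n >= 32. */
static uint32_t lane_shl32(uint32_t x, unsigned n) {
    if (n >= 32) {
        return 0;
    }
    return x << n;
}
```

With the buggy N==16 boundary, shift counts in [16, 32) were wrongly zeroed and counts of 32 and above fell through to undefined shifts.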
Fixes: 1af82e395fdce3117a1e18d9f8198a626b07cc2f
Fixes: f0e6b8459c4f1e9fb54d637c3666a4fab97f45cb
Fixes: 2f55e5b54f70693bb844015306793d29a29cd51c
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
* supervector: Add more unit tests for SuperVector
These tests cover the NOT operator and many of the element-wise shifts,
especially for AVX2 and AVX512.
Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>
* shufti: Use operator!= for non-zero test
The match result vector (c_lo & c_hi) was compared using operator>
against Zeroes to detect non-zero (matching) bytes. On ARM, operator>
delegates to signed comparison (vcgtq_s8), which treats byte values
with the high bit set (0x80–0xFF) as negative, making them compare
as less than zero and falsely reporting no match.
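The signed-compare pitfall in scalar form (a model of one lane, not the SuperVector operators): bytes 0x80..0xFF are negative as int8_t, so a "greater than zero" test misses them, while "not equal to zero" does not.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy non-zero test: on ARM, operator> compares signed bytes, so a
 * match lane with the high bit set compares as negative. */
static int nonzero_via_gt(uint8_t b) { return (int8_t)b > 0; }

/* Fixed non-zero test: inequality is sign-agnostic. */
static int nonzero_via_ne(uint8_t b) { return b != 0; }
```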
Fixes: 92e0b9a35192ae975de9fd032fb66ddcb1682c5a ("simplify shufti and
provide arch-specific block functions")
accel: Fix offset clamping in do_accel_block to not go before start (#365)
When applying the accel offset, the result was clamped to buf which
could produce a position before *start, potentially causing the caller
to re-scan bytes that were already processed. Clamp to ptr (buf +
*start) instead, so that the accelerator never rewinds past the original
start position.
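In index form (a hypothetical helper mirroring the description, with positions as offsets from buf), the fix moves the rewind floor from offset 0 (buf) to *start:

```c
#include <assert.h>
#include <stddef.h>

/* Apply the accel offset to a scan position, clamping so the
 * accelerator never rewinds past the caller's start offset. */
static size_t apply_accel_offset(size_t pos, size_t offset, size_t start) {
    size_t p = pos > offset ? pos - offset : 0;
    return p < start ? start : p;  /* clamp to start, not to 0 (buf) */
}
```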
Tomer Lev [Thu, 12 Feb 2026 13:23:09 +0000 (15:23 +0200)]
cmake: add PKGCONFIG_EXTRA_LIBS option for pkg-config (#361)
Add a CMake cache variable to pass arbitrary flags to the pkg-config
Libs line. This enables CGO builds using GCC (not g++) to link correctly
by allowing users to specify C++ stdlib dependencies.
* remove the use of macros for critical loops, easier to debug
removed switch, merged get_conf_stride functions into 1
split FDR implementations into arch specific files (same for now)
The nm implementations from GNU binutils and FreeBSD parse the format
argument value by looking only at the first character.
They accept both 'p' and 'posix'.
Reference in Linux kernel for the same change:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm/Makefile?h=v6.17-rc7&id=76ebc6a429ec2becc2fa738c85ab9688ea4b9006
Multiple changes since the last release; this will be the last release
that is 100% ABI- and API-compatible with Hyperscan.
Next versions will include major refactors and API extensions; they
will be mostly backwards compatible, however.
In no particular order, platform support is now:
In total more than 200 configurations in the CI are tested for every PR.
Other features:
- Fat Runtime supported for Arm as well (ASIMD/SVE/SVE2).
- Initial implementations for Arm SVE/SVE2 algorithms added, thanks to
Yoan Picchi from Arm.
- SIMDe support added, used as an alternative backend for existing
platforms, but mostly interesting for allowing Vectorscan to build in
new platforms without a supported SIMD engine.
- Various speedups and optimizations.
- Cppcheck and clang-tidy fixes throughout the code, both have been
added to CI for multiple configurations, but only cppcheck triggers a
build failure for now.
Various bugfixes, most important listed:
- Speed up truffle with 256b TBL instructions (#290)
- Fix Clang Tidy warnings (#295)
- Clang 17+ is more restrictive on rebind<T> on MacOS/Boost, remove
warning (#332)
- partial_load_u64 will fail if buf == NULL/c_len == 0 (#331)
- Bugfix/fix avx512vbmi regressions (#335)
- fix missing hs_version.h header (closes #198)
- hs_valid_platform: Fix check for SSE4.2 (#310)
- Fixed out of bounds read in AVX512VBMI version of fdr_exec_fat_teddy …
(#333)
- Fix noodle SVE2 off by one bug (#313)
- Make vectorscan accept \0 starting pattern (#312)
- Fix 5.4.11's config step regression (#327)
- Fix double shufti's vector end false positive (#325)
cmake - guard against failed GNUCC_ARCH extraction (#339)
Prevents overwriting GNUCC_ARCH with an empty value when parsing output
of gcc -Q --help=target. Ensures robustness if detection fails and
returns an empty string.
Signed-off-by: Ibrahim Kashif <ibrahim.kashif@arm.com>
Double shufti used to offset one vector, resulting in losing one character
at the end of every vector. This was replaced by a magic value indicating a
match. This meant that if the first char of a pattern fell on the last char of
a vector, double shufti would assume the second character is present and
report a match.
This patch fixes it by keeping the previous vector and feeding its data to the
new one when we shift it, preventing any loss of data.
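The carry idea can be modeled in scalar C (a sketch, not the SIMD implementation): the one-byte-shifted view of the current block takes its first byte from the last byte of the previous block, so a two-byte pair straddling the boundary is still visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define S 4  /* tiny stand-in vector width */

/* Look for the pair (c1, c2) within the current S-byte block, where
 * the shifted view carries in the last byte of the previous block so
 * a pair straddling the block boundary is not lost. */
static int pair_in_block(const unsigned char *prev, const unsigned char *cur,
                         unsigned char c1, unsigned char c2) {
    unsigned char shifted[S];
    shifted[0] = prev[S - 1];          /* carry from previous vector */
    memcpy(shifted + 1, cur, S - 1);
    for (size_t i = 0; i < S; i++) {
        if (shifted[i] == c1 && cur[i] == c2) {
            return 1;                  /* pair found, possibly straddling */
        }
    }
    return 0;
}
```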
Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
* vshl() will call the correct implementation
* implement missing vshr_512_imm(), simplifies caller x86 code
* Fix x86 case, use alignr instead
* it's the reverse, the avx512 alignr is incorrect, need to fix
* added static libraries in cmake to fix the unit-internal segfault on freebsd/ppc64le and the gcc13 error
* Moved gcc13 flags for freebsd-gcc13 in cmake/cflags-ppc64le.make
src/nfa/mcsheng_compile.cpp: No need for an assert here, impl_id can be set to 0
src/nfa/nfa_api_queue.h: Make sure this compiles on both C++ and C
src/nfagraph/ng_fuzzy.cpp: Fix compilation error when DEBUG_OUTPUT=on
src/runtime.c: Fix crash when data == NULL
unit/internal/sheng.cpp: Unit test has to enable AVX512VBMI manually as autodetection does not get triggered, causing the test to fail
src/fdr/teddy_fat.cpp: AVX512 loads need to be 64-bit aligned, caused a crash on clang-18
ypicchi-arm [Wed, 14 May 2025 21:58:01 +0000 (22:58 +0100)]
Fix 5.4.11's config step regression (#327)
An old commit (24ae1670d) had the side effect of moving cmake defines after
they were being used. This patch moves them back to be defined before being used.
Speed hsbench back up by ~0.8%
* Fix noodle spurious match with \0 chars for SVE2
When SVE2's noodle processes a non-full vector (before the main loop or
at the end of it), a fake \0 was being parsed, triggering a match for
patterns that end with \0. This patch fixes this.
Michael Tremer [Thu, 22 Aug 2024 07:34:05 +0000 (08:34 +0100)]
hs_valid_platform: Fix check for SSE4.2 (#310)
Vectorscan requires SSE4.2 as a minimum on x86_64. For Hyperscan this
used to be SSSE3.
Applications that use the library call hs_valid_platform() to check if
the CPU fulfils this minimum requirement. However, when Vectorscan
upgraded to SSE4.2, the check was not updated. This leads to the library
trying to execute instructions that are not supported, causing the
application to crash.
This might not have been noticed as the CPUs that do not support SSE4.2
are rather old and unlikely to run any load where performance is an
issue. However, I believe that the library should not let the
application crash.
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
ypicchi-arm [Thu, 22 Aug 2024 07:32:53 +0000 (08:32 +0100)]
Make vectorscan accept \0 starting pattern (#312)
Vectorscan used to reject such patterns because they were being compared
to "" and found to be an empty string. We now check the pattern length
instead.
ypicchi-arm [Mon, 5 Aug 2024 06:42:56 +0000 (07:42 +0100)]
Fix noodle SVE2 off by one bug (#309)
By using svmatch on 16-bit lanes with an 8-bit predicate, we end up
including an undefined character in the pattern checks. The inactive
lane after load contains an undefined value, usually \0. Patterns
using \0 as the last character would then match this spurious
character, returning a match beyond the buffer's end. The fix checks
for such matches and rejects them.
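The rejection step in miniature (an illustrative helper, not the noodle code): any candidate offset the vector compare reports at or beyond the buffer length comes from an undefined inactive lane and is discarded.

```c
#include <assert.h>

/* Accept a candidate match offset only if it lies inside the buffer;
 * -1 means the spurious beyond-the-end match is rejected. */
static long accept_match(long match_off, long buf_len) {
    return (match_off >= 0 && match_off < buf_len) ? match_off : -1;
}
```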
Fixes some of the clang-tidy warnings
clang-analyzer-deadcode.DeadStores
clang-analyzer-cplusplus.NewDelete
clang-analyzer-core.uninitialized.UndefReturn
closes some of #253
ignored in this pr:
/usr/include/boost/smart_ptr/detail/shared_count.hpp:432:24
/usr/include/boost/smart_ptr/detail/shared_count.hpp:443:24
51 in build/src/parser
gtest ones
src/fdr/teddy_compile.cpp:600:5 refactoring on way
src/fdr/fdr_compile.cpp:209:5 refactoring on way
Speed up truffle with 256b TBL instructions (#290)
256b wide SVE vectors allow some simplification of truffle. Up to 40%
speedup on graviton3, going from 12500 MB/s to 17000 MB/s on the
microbenchmark.
SVE2 also offers this capability for 128b vectors, with a speedup of
around 25% compared to plain SVE.
Add unit tests and a benchmark for this wide variant.
G.E. [Mon, 20 May 2024 15:03:56 +0000 (18:03 +0300)]
Revert a change to an assert; the original logic might have been
subtly clever (or else totally useless all these years). When we
see which of the two it is, we might delete that assert entirely.
For now, put it back as it was.