6 Commits

Author SHA1 Message Date
Adolf Belka
f86e23906e pcre2: Update to version 10.40
- Update from 10.39 to 10.40
- Update of rootfile
- Changelog
   Version 10.40 15-April-2022
	1. Merged patch from @carenas (GitHub #35, 7db87842) to fix pcre2grep incorrect
	   handling of multiple passes.
	2. Merged patch from @carenas (GitHub #36, dae47509) to fix portability issue
	   in pcre2grep with buffered fseek(stdin).
	3. Merged patch from @carenas (GitHub #37, acc520924) to fix tests when -S is
	   not supported.
	4. Revert an unintended change in JIT repeat detection.
	5. Merged patch from @carenas (GitHub #52, b037bfa1) to fix build on GNU Hurd.
	6. Merged documentation and comments patches from @carenas (GitHub #47).
	7. Merged patch from @carenas (GitHub #49) to remove obsolete JFriedl test code
	   from pcre2grep.
	8. Merged patch from @carenas (GitHub #48) to fix CMake install issue #46.
	9. Merged patch from @carenas (GitHub #53) fixing NULL checks in matching and
	   substituting.
	10. Add null_subject and null_replacement modifiers to pcre2test.
	11. Add check for NULL subject to POSIX regexec() function.
	12. Add check for NULL replacement to pcre2_substitute().
	13. For the subject arguments of pcre2_match(), pcre2_dfa_match(), and
	    pcre2_substitute(), and the replacement argument of the latter, if the pointer
	    is NULL and the length is zero, treat as an empty string. Apparently a number
	    of applications treat NULL/0 in this way.
	14. Added support for Bidi_Class and a number of binary Unicode properties,
	    including Bidi_Control.
	15. Fix some minor issues raised by clang sanitize.
	16. Very minor code speed up for maximizing character property matches.
	17. A number of changes to script matching for \p and \P:
	    (a) Script extensions for a character are now coded as a bitmap instead of
	        a list of script numbers, which should be faster and does not need a
	        loop.
	    (b) Added the syntax \p{script:xxx} and \p{script_extensions:xxx} (synonyms
	        sc and scx).
	    (c) Changed \p{scriptname} from being the same as \p{sc:scriptname} to being
	        the same as \p{scx:scriptname} because this change happened in Perl at
	        release 5.26.
	    (d) The standard Unicode 4-letter abbreviations for script names are now
	        recognized.
	    (e) In accordance with Unicode and Perl's "loose matching" rules, spaces,
	        hyphens, and underscores are ignored in property names, which are then
	        matched independent of case.
	18. The Python scripts in the maint directory have been refactored. There are
	    now three scripts that generate pcre2_ucd.c, pcre2_ucp.h, and pcre2_ucptables.c
	    (which is #included by pcre2_tables.c). The data lists that used to be
	    duplicated are now held in a single common Python module.
	19. On CHERI, and thus Arm's Morello prototype, pointers are represented as
	    hardware capabilities, which consist of both an integer address and additional
	    metadata, meaning they are twice the size of the platform's size_t type, i.e.
	    16 bytes on a 64-bit system. The ovector member of heapframe happens to only be
	    8 byte aligned, and so computing frame_size ended up with a multiple of 8 but
	    not 16. Whilst the first frame was always suitably aligned, this then
	    misaligned the frame that follows, resulting in an alignment fault when storing
	    a pointer to Fecode at the start of match. Patch to fix this issue by Jessica
	    Clarke PR#72.
	20. Added -LP and -LS listing options to pcre2test.
	21. A user discovered that the library names in CMakeLists.txt for MSVC
	    debugger (PDB) files were incorrect - perhaps never tried for PCRE2?
	22. An item such as [Aa] is optimized into a caseless single character match.
	    When this was quantified (e.g. [Aa]{2}) and was also the last literal item in a
	    pattern, the optimizing "must be present for a match" character check was not
	    being flagged as caseless, causing some matches that should have succeeded to
	    fail.
	23. Fixed a Unicode property matching issue in JIT. The character was not
	    fully read in caseless matching.
	24. Fixed an issue affecting recursions in JIT caused by duplicated data
	    transfers.
	25. Merged patch from @carenas (GitHub #96) which fixes some problems with
	    pcre2test and readline/readedit:
	      * Use the right header for libedit in FreeBSD with autoconf
	      * Really allow libedit with cmake
	      * Avoid using readline headers with libedit

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
2022-04-24 19:13:10 +00:00
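
Item 17(e) of the 10.40 changelog above says that spaces, hyphens, and underscores are ignored in property names, which are then matched independent of case. A minimal sketch of that rule follows. It is an illustration only, not PCRE2's implementation, and the helper names normalize_property_name and same_property are hypothetical.

```python
def normalize_property_name(name):
    """Drop spaces, hyphens, and underscores, then fold case.

    Hypothetical helper: it models only the loose-matching rule quoted
    in the changelog, not PCRE2's actual property lookup.
    """
    return "".join(ch for ch in name if ch not in " -_").casefold()


def same_property(left, right):
    """Return True if two spellings name the same property under loose matching."""
    return normalize_property_name(left) == normalize_property_name(right)


if __name__ == "__main__":
    for spelling in ("Script_Extensions", "script extensions", "SCRIPT-EXTENSIONS"):
        print(spelling, "->", normalize_property_name(spelling))
```

Under this sketch, all three spellings in the usage loop reduce to the same key, so a lookup table would only need to store the normalized form.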
Peter Müller
9a7e4d8506 Switch checksums from MD5 to BLAKE2
Historically, the MD5 checksums in our LFS files serve as a protection
against broken downloads, or accidentally corrupted source files.

While the sources are nowadays downloaded via HTTPS, it makes sense to
beef up integrity protection for them, since transparently intercepting
TLS is believed to be feasible for more powerful actors, and the state
of the public PKI ecosystem is clearly not helping.

Therefore, this patch switches from MD5 to BLAKE2, updating all LFS
files as well as make.sh to deal with this checksum algorithm. BLAKE2 is
notably faster (and more secure) than SHA2, so the performance penalty
introduced by this patch is negligible, if noticeable at all.

In preparation for this patch, the toolchain files currently used have
been supplied with BLAKE2 checksums as well on
https://source.ipfire.org/.

Cc: Michael Tremer <michael.tremer@ipfire.org>
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Acked-by: Michael Tremer <michael.tremer@ipfire.org>
2022-04-02 14:19:25 +00:00
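
The commit above replaces MD5 with BLAKE2 for verifying downloaded sources. A minimal sketch of such a check with Python's standard hashlib follows. It assumes the BLAKE2b variant with its default 64-byte digest, since the commit message only says "BLAKE2". The function names are hypothetical and this is not the logic used in make.sh.

```python
import hashlib


def blake2_hexdigest(path, chunk_size=65536):
    """Return the BLAKE2b hex digest of a file, reading it in chunks.

    Assumption: BLAKE2b with the default 64-byte digest; the commit
    message does not name the variant.
    """
    digest = hashlib.blake2b()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hexdigest):
    """Compare a file's digest with the recorded one, ignoring hex letter case."""
    return blake2_hexdigest(path) == expected_hexdigest.lower()
```

Reading in chunks keeps memory use flat for large source tarballs. A caller would pass the checksum recorded for the file as expected_hexdigest.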
Adolf Belka
43164c6557 pcre2: Update to version 10.39
- Update from 10.37 to 10.39
- Update of rootfile
- Changelog
  Version 10.39 29-October-2021
    1. Fix incorrect detection of alternatives in first character search in JIT.
    2. Merged patch from @carenas (GitHub #28):
       Visual Studio 2013 includes support for %zu and %td, so let newer
       versions of it avoid the fallback, and while at it, make sure that
       the first check is for DISABLE_PERCENT_ZT so it will be always
       honoured if chosen.
       ptrdiff_t is signed, so use a signed type instead, and make sure
       that an appropriate width is chosen if pointers are 64bit wide and
       long is not (ex: Windows 64bit).
       IMHO removing the cast (and therefore the possibility of truncation)
       makes the code cleaner and the fallback is likely portable enough
       with all 64-bit POSIX systems doing LP64 except for Windows.
    3. Merged patch from @carenas (GitHub #29) to update to Unicode 14.0.0.
    4. Merged patch from @carenas (GitHub #30):
       * Cleanup: remove references to no longer used stdint.h
         Since 19c50b9d (Unconditionally use inttypes.h instead of trying for stdint.h
         (simplification) and remove the now unnecessary inclusion in
         pcre2_internal.h., 2018-11-14), stdint.h is no longer used.
         Remove checks for it in autotools and CMake and document better the expected
         build failures for systems that might have stdint.h (C99) and not inttypes.h
         (from POSIX), like old Windows.
       * Cleanup: remove detection for inttypes.h which is a hard dependency
         CMake checks for standard headers are not meant to be used for hard
         dependencies, and will prevent a possible fallback from working.
         Alternatively, the header could be checked to make the configuration fail
         instead of breaking the build, but that was punted, as it was missing anyway
         from autotools.
    5. Merged patch from @carenas (GitHub #32):
       * jit: allow building with ancient MSVC versions
         Visual Studio older than 2013 fails to build with JIT enabled, because it is
         unable to parse non C89 compatible syntax, with mixed declarations and code.
         While most recent compilers wouldn't even report this as a warning since it
         is valid C99, it could be also made visible by adding to gcc/clang the
         -Wdeclaration-after-statement flag at build time.
         Move the code below the affected definitions.
       * pcre2grep: avoid mixing declarations with code
         Since d5a61ee8 (Patch to detect (and ignore) symlink loops in pcre2grep,
         2021-08-28), code will fail to build in a strict C89 compiler.
         Reformat slightly to make it C89 compatible again.
  Version 10.38 01-October-2021
    1. Fix invalid single character repetition issues in JIT when the repetition
       is inside a capturing bracket and the bracket is preceded by character
       literals.
    2. Installed revised CMake configuration files provided by Jan-Willem Blokland.
       This extends the CMake build system to build both static and shared libraries
       in one go, builds the static library with PIC, and exposes PCRE2 libraries
       using the CMake config files. JWB provided these notes:
       - Introduced CMake variable BUILD_STATIC_LIBS to build the static library.
       - Make a small modification to config-cmake.h.in by removing the PCRE2_STATIC
         variable. Added PCRE2_STATIC variable to the static build using the
         target_compile_definitions() function.
       - Extended the CMake config files.
         - Introduced CMake variable PCRE2_USE_STATIC_LIBS to easily switch between
           the static and shared libraries.
         - Added the PCRE_STATIC variable to the target compile definitions for the
           import of the static library.
       Building static and shared libraries using MSVC results in a name clash of
       the libraries. Both static and shared library builds create, for example, the
       file pcre2-8.lib. Therefore, I decided to change the static library names by
       adding "-static". For example, pcre2-8.lib has become pcre2-8-static.lib.
       [Comment by PH: this is MSVC-specific. It doesn't happen on Linux.]
    3. Increased the minimum release number for CMake to 3.0.0 because versions
       older than 2.8.12 are deprecated (it was set to 2.8.5) and cause warnings.
       Even 3.0.0 is quite old; it was released in 2014.
    4. Implemented a modified version of Thomas Tempelmann's pcre2grep patch for
       detecting symlink loops. This is dependent on the availability of realpath(),
       which is now tested for in ./configure and CMakeLists.txt.
    5. Implemented a modified version of Thomas Tempelmann's patch for faster
       case-independent "first code unit" searches for unanchored patterns in 8-bit
       mode in the interpreters. Instead of just remembering whether one case matched
       or not, it remembers the position of a previous match so as to avoid
       unnecessary repeated searching.
    6. Perl now locks out \K in lookarounds, so PCRE2 now does the same by default.
       However, just in case anybody was relying on the old behaviour, there is an
       option called PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK that enables the old behaviour.
       An option has also been added to pcre2grep to enable this.
    7. Re-enable a JIT optimization which was unintentionally disabled in 10.35.
    8. There is a loop counter to catch excessively crazy patterns when checking
       the lengths of lookbehinds at compile time. This was incorrectly getting reset
       whenever a lookahead was processed, leading to some fuzzer-generated patterns
       taking a very long time to compile when (?|) was present in the pattern,
       because (?|) disables caching of group lengths.

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Michael Tremer <michael.tremer@ipfire.org>
2022-01-14 13:38:23 +00:00
Adolf Belka
7112adbc86 pcre2: Update to 10.37
- Update from 10.36 to 10.37
- Update rootfile
- find-dependencies run to check impact of so lib bump
   No issues found
- Changelog
   Version 10.37 26-May-2021
    1. Change RunGrepTest to use tr instead of sed when testing with binary
       zero bytes, because sed varies a lot from system to system and has problems
       with binary zeros. This is from Bugzilla #2681. Patch from Jeremie
       Courreges-Anglas via Nam Nguyen. This fixes RunGrepTest for OpenBSD. Later:
       it broke it for at least one version of Solaris, where tr can't handle binary
       zeros. However, that system had /usr/xpg4/bin/tr installed, which works OK, so
       RunGrepTest now checks for that command and uses it if found.
    2. Compiling with gcc 10.2's -fanalyzer option showed up a hypothetical problem
       with a NULL dereference. I don't think this case could ever occur in practice,
       but I have put in a check in order to get rid of the compiler error.
    3. An alternative patch for CMakeLists.txt because 10.36 #4 breaks CMake on
       Windows. Patch from email@cs-ware.de fixes bugzilla #2688.
    4. Two bugs related to over-large numbers have been fixed so the behaviour is
       now the same as Perl.
       (a) A pattern such as /\214748364/ gave an overflow error instead of being
           treated as the octal number \214 followed by literal digits.
       (b) A sequence such as {65536 that has no terminating } so is not a
           quantifier was nevertheless complaining that a quantifier number was too big.
    5. A run of autoconf suggested that configure.ac was out-of-date with respect
       to the latest autoconf. Running autoupdate made some valid changes, some valid
       suggestions, and also some invalid changes, which were fixed by hand. Autoconf
       now runs clean and the resulting "configure" seems to work, so I hope nothing
       is broken. Later: the requirement for autoconf 2.70 broke some automatic test
       robots. It doesn't seem to be necessary: trying a reduction to 2.60.
    6. The pattern /a\K.(?0)*/ when matched against "abac" by the interpreter gave
       the answer "bac", whereas Perl and JIT both yield "c". This was because the
       effect of \K was not propagating back from the full pattern recursion. Other
       recursions such as /(a\K.(?1)*)/ did not have this problem.
    7. Restore single character repetition optimization in JIT. Currently fewer
       character repetitions are optimized than in 10.34.
    8. When the names of the functions in the POSIX wrapper were changed to
       pcre2_regcomp() etc. (see change 10.33 #4 below), functions with the original
       names were left in the library so that pre-compiled programs would still work.
       However, this has proved troublesome when programs link with several libraries,
       some of which use PCRE2 via the POSIX interface while others use a native POSIX
       library. For this reason, the POSIX function names are removed in this release.
       The macros in pcre2posix.h should ensure that re-compiling fixes any programs
       that haven't been compiled since before 10.33.

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-06-04 10:49:47 +00:00
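
Item 4(a) of the 10.37 changelog above says that /\214748364/ is now treated as the octal number \214 followed by literal digits. A minimal sketch of that split follows, assuming the escape takes at most three leading octal digits, as the \214 in the changelog example suggests. The function split_octal_escape is hypothetical and models only this octal fallback. It does not model how the pattern compiler decides between a backreference and an octal escape.

```python
OCTAL_DIGITS = "01234567"


def split_octal_escape(digits, max_len=3):
    """Split the digits after a backslash into (character code, literal rest).

    Hypothetical helper: takes at most max_len leading octal digits as the
    character code and leaves the remaining digits as literal text.
    """
    end = 0
    while end < len(digits) and end < max_len and digits[end] in OCTAL_DIGITS:
        end += 1
    if end == 0:
        raise ValueError("no leading octal digit")
    return int(digits[:end], 8), digits[end:]


if __name__ == "__main__":
    code, rest = split_octal_escape("214748364")
    print(oct(code), repr(rest))
```

For the changelog's example, the sketch yields octal 214 as the character code and leaves "748364" as literal digits.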
Michael Tremer
e30e60b1c6 pcre2: Disable JIT for RISC-V
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-03-06 11:14:51 +00:00
Michael Tremer
b0c37190a5 pcre2: New package
pcre is no longer receiving any feature updates, but only bug fixes.

pcre2 is the successor which is replacing pcre.

Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-02-09 16:10:07 +00:00