7 Commits

Author SHA1 Message Date
Matthias Fischer
f43be66179 pcre2: Update to 10.42
See:
https://github.com/PCRE2Project/pcre2/releases/tag/pcre2-10.41
and
https://github.com/PCRE2Project/pcre2/releases/tag/pcre2-10.42

Excerpts from changelogs:

"Version 10.41 06-December-2022
------------------------------

1. Add fflush() before and after a fork callout in pcre2grep to get its output
to be the same on all systems. (There were previously ordering differences in
Alpine Linux).

2. Merged patch from @carenas (GitHub #110) for pthreads support in CMake.

3. SSF scorecards grumbled about possible overflow in an expression in
pcre2test. It never would have overflowed in practice, but some casts have been
added and at the same time there's been some tidying of fprints that output
size_t values.

4. PR #94 showed up an unused enum in pcre2_convert.c, which is now removed.

5. Minor code re-arrangement to remove gcc warning about realloc() in
pcre2test.

6. Change a number of int variables that hold buffer and line lengths in
pcre2grep to PCRE2_SIZE (aka size_t).

7. Added an #ifdef to cut out a call to PRIV(jit_free) when JIT is not
supported (even though that function would do nothing in that case) at the
request of a user who doesn't even want to link with pcre_jit_compile.o. Also
tidied up an untidy #ifdef arrangement in pcre2test.

8. Fixed an issue in the backtracking optimization of character repeats in
JIT. Furthermore optimize star repetitions, not just plus repetitions.

9. Removed the use of an initial backtracking frames vector on the system stack
in pcre2_match() so that it now always uses the heap. (In a multi-thread
environment with very small stacks there had been an issue.) This also is
tidier for JIT matching, which didn't need that vector. The heap vector is now
remembered in the match data block and re-used if that block itself is re-used.
It is freed with the match data block.

10. Adjusted the find_limits code in pcre2test to work with change 9 above.

11. Added find_limits_noheap to pcre2test, because the heap limits are now
different in different environments and so cannot be included in the standard
tests.

12. Created a test for pcre2_match() heap processing that is not part of the
tests run by 'make check', but can be run manually. The current output is from
a 64-bit system.

13. Implemented -Z aka --null in pcre2grep.

14. A minor change to pcre2test and the addition of several new pcre2grep tests
have improved LCOV coverage statistics. At the same time, code in pcre2grep and
elsewhere that can never be obeyed in normal testing has been excluded from
coverage.

15. Fixed a bug in pcre2grep that could cause an extra newline to be written
after output generated by --output.

16. If a file has a .bz2 extension but is not in fact compressed, pcre2grep
should process it as a plain text file. A bug stopped this happening; now fixed
and added to the tests.

17. When pcre2grep was running not in UTF mode, if a string specified by
--output or obtained from a callout in a pattern contained a character (byte)
greater than 127, it was incorrectly output in UTF-8 format.

18. Added some casts after warnings from Clang sanitize.

19. Merged patch from cbouc (GitHub #139): 4 function prototypes were missing
PCRE2_CALL_CONVENTION in src/pcre2posix.h. All function prototypes returning
pointers had out of place PCRE2_CALL_CONVENTION in src/pcre2.h.*. These
produced errors when building for Windows with #define PCRE2_CALL_CONVENTION
__stdcall.

20. A negative repeat value in a pcre2test subject line was not being
diagnosed, leading to infinite looping.

21. Updated RunGrepTest to discard the warning that Bash now gives when setting
LC_CTYPE to a bad value (because older versions didn't).

22. Updated pcre2grep so that it behaves like GNU grep when matching more than
one pattern and a later pattern matches at an earlier point in the subject when
the matched substrings are being identified by colour or by offsets.

23. Updated the PrepareRelease script so that the man page that it makes for
the pcre2demo demonstration program is more standard and does not cause errors
when processed by lexgrog or mandb -c (GitHub issue #160).

24. The JIT compiler was updated."
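
Item 22 above describes choosing, among several patterns, the match that starts earliest in the subject. A minimal sketch of that selection rule follows. It uses Python's re module as a stand-in for PCRE2, the function name earliest_matches is hypothetical, and the tie-breaking (first pattern listed wins) is an assumption rather than documented pcre2grep or GNU grep behaviour.

```python
import re


def earliest_matches(patterns, line):
    """Return (start, end) spans, always taking the leftmost match of any pattern."""
    compiled = [re.compile(p) for p in patterns]
    pos = 0
    spans = []
    while pos <= len(line):
        # Try every pattern from the current position, not just the first one.
        candidates = [m for m in (c.search(line, pos) for c in compiled) if m]
        if not candidates:
            break
        # Assumption: ties on the start offset go to the earlier pattern.
        best = min(candidates, key=lambda m: m.start())
        spans.append(best.span())
        # Step past empty matches so the loop always makes progress.
        pos = best.end() if best.end() > best.start() else best.end() + 1
    return spans
```

With the patterns given as "def" then "abc" and the subject "abc def", the span for "abc" is reported first even though its pattern comes second.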

"Version 10.42 11-December-2022
------------------------------

This release is mainly to fix a problem with 10.41, which is broken for
programs that include pcre2posix.h but not pcre2.h. Some other minor fixes
are included.

1. Change 19 of 10.41 wasn't quite right; it put the definition of a default,
empty value for PCRE2_CALL_CONVENTION in src/pcre2posix.c instead of
src/pcre2posix.h, which meant that programs that included pcre2posix.h but not
pcre2.h failed to compile.

2. To catch similar issues to the above in future, a new small test program
that includes pcre2posix.h but not pcre2.h has been added to the test suite.

3. When the -S option of pcre2test was used to set a stack size greater than
the allowed maximum, the error message displayed the hard limit incorrectly.
This was pointed out on GitHub pull request #171, but the suggested patch
didn't cope with all cases. Some further modification was required.

4. Supplying an ovector count of more than 65535 to pcre2_match_data_create()
caused a crash because the field in the match data block is only 16 bits. A
maximum of 65535 is now silently applied.

5. Merged @carenas patch #175 which fixes #86 - segfault on aarch64 (ARM)."
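
Item 4 above concerns a count stored in a 16-bit field. The sketch below shows the arithmetic only. That an oversized count wrapped around is one plausible failure mode and is not stated in the changelog. The function names are hypothetical and are not part of the PCRE2 API.

```python
MAX_OVECTOR_COUNT = 0xFFFF  # largest value an unsigned 16-bit field can hold


def truncate_to_16_bits(count):
    """What storing an oversized count in a 16-bit field would keep."""
    return count & 0xFFFF


def clamp_to_16_bits(count):
    """The behaviour described for 10.42: silently cap the count."""
    return min(count, MAX_OVECTOR_COUNT)
```

A request for 65536 pairs would truncate to 0 but clamp to 65535, so the stored count no longer disagrees wildly with what the caller asked for.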

Signed-off-by: Matthias Fischer <matthias.fischer@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
2022-12-27 16:31:06 +00:00
Adolf Belka
f86e23906e pcre2: Update to version 10.40
- Update from 10.39 to 10.40
- Update of rootfile
- Changelog
   Version 10.40 15-April-2022
	1. Merged patch from @carenas (GitHub #35, 7db87842) to fix pcre2grep incorrect
	   handling of multiple passes.
	2. Merged patch from @carenas (GitHub #36, dae47509) to fix portability issue
	   in pcre2grep with buffered fseek(stdin).
	3. Merged patch from @carenas (GitHub #37, acc520924) to fix tests when -S is
	   not supported.
	4. Revert an unintended change in JIT repeat detection.
	5. Merged patch from @carenas (GitHub #52, b037bfa1) to fix build on GNU Hurd.
	6. Merged documentation and comments patches from @carenas (GitHub #47).
	7. Merged patch from @carenas (GitHub #49) to remove obsolete JFriedl test code
	   from pcre2grep.
	8. Merged patch from @carenas (GitHub #48) to fix CMake install issue #46.
	9. Merged patch from @carenas (GitHub #53) fixing NULL checks in matching and
	   substituting.
	10. Add null_subject and null_replacement modifiers to pcre2test.
	11. Add check for NULL subject to POSIX regexec() function.
	12. Add check for NULL replacement to pcre2_substitute().
	13. For the subject arguments of pcre2_match(), pcre2_dfa_match(), and
	    pcre2_substitute(), and the replacement argument of the latter, if the pointer
	    is NULL and the length is zero, treat as an empty string. Apparently a number
	    of applications treat NULL/0 in this way.
	14. Added support for Bidi_Class and a number of binary Unicode properties,
	    including Bidi_Control.
	15. Fix some minor issues raised by clang sanitize.
	16. Very minor code speed up for maximizing character property matches.
	17. A number of changes to script matching for \p and \P:
	    (a) Script extensions for a character are now coded as a bitmap instead of
	        a list of script numbers, which should be faster and does not need a
	        loop.
	    (b) Added the syntax \p{script:xxx} and \p{script_extensions:xxx} (synonyms
	        sc and scx).
	    (c) Changed \p{scriptname} from being the same as \p{sc:scriptname} to being
	        the same as \p{scx:scriptname} because this change happened in Perl at
	        release 5.26.
	    (d) The standard Unicode 4-letter abbreviations for script names are now
	        recognized.
	    (e) In accordance with Unicode and Perl's "loose matching" rules, spaces,
	        hyphens, and underscores are ignored in property names, which are then
	        matched independent of case.
	18. The Python scripts in the maint directory have been refactored. There are
	    now three scripts that generate pcre2_ucd.c, pcre2_ucp.h, and pcre2_ucptables.c
	    (which is #included by pcre2_tables.c). The data lists that used to be
	    duplicated are now held in a single common Python module.
	19. On CHERI, and thus Arm's Morello prototype, pointers are represented as
	    hardware capabilities, which consist of both an integer address and additional
	    metadata, meaning they are twice the size of the platform's size_t type, i.e.
	    16 bytes on a 64-bit system. The ovector member of heapframe happens to only be
	    8 byte aligned, and so computing frame_size ended up with a multiple of 8 but
	    not 16. Whilst the first frame was always suitably aligned, this then
	    misaligned the frame that follows, resulting in an alignment fault when storing
	    a pointer to Fecode at the start of match. Patch to fix this issue by Jessica
	    Clarke PR#72.
	20. Added -LP and -LS listing options to pcre2test.
	21. A user discovered that the library names in CMakeLists.txt for MSVC
	    debugger (PDB) files were incorrect - perhaps never tried for PCRE2?
	22. An item such as [Aa] is optimized into a caseless single character match.
	    When this was quantified (e.g. [Aa]{2}) and was also the last literal item in a
	    pattern, the optimizing "must be present for a match" character check was not
	    being flagged as caseless, causing some matches that should have succeeded to
	    fail.
	23. Fixed a Unicode property matching issue in JIT. The character was not
	    fully read in caseless matching.
	24. Fixed an issue affecting recursions in JIT caused by duplicated data
	    transfers.
	25. Merged patch from @carenas (GitHub #96) which fixes some problems with
	    pcre2test and readline/readedit:
	      * Use the right header for libedit in FreeBSD with autoconf
	      * Really allow libedit with cmake
	      * Avoid using readline headers with libedit
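
Item 17(e) above describes "loose matching" of property names. A minimal sketch of that normalisation follows, assuming only what the changelog states: spaces, hyphens and underscores are dropped and case is ignored. The helper name loose_key is hypothetical, and this is not how PCRE2 itself implements the lookup.

```python
def loose_key(name):
    """Reduce a property name to a form that ignores ' ', '-', '_' and case."""
    return "".join(ch for ch in name if ch not in " -_").lower()


def same_property(left, right):
    """True when two spellings refer to the same property under loose matching."""
    return loose_key(left) == loose_key(right)
```

Under this rule, "Script_Extensions", "script extensions" and "SCRIPT-EXTENSIONS" all compare equal.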

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
2022-04-24 19:13:10 +00:00
Peter Müller
9a7e4d8506 Switch checksums from MD5 to BLAKE2
Historically, the MD5 checksums in our LFS files serve as a protection
against broken downloads, or accidentally corrupted source files.

While the sources are nowadays downloaded via HTTPS, it makes sense to
beef up integrity protection for them, since transparently intercepting
TLS is believed to be feasible for more powerful actors, and the state
of the public PKI ecosystem is clearly not helping.

Therefore, this patch switches from MD5 to BLAKE2, updating all LFS
files as well as make.sh to deal with this checksum algorithm. BLAKE2 is
notably faster (and more secure) than SHA2, so the performance penalty
introduced by this patch is negligible, if noticeable at all.
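
A minimal sketch of checking a downloaded source file against a BLAKE2 checksum, using Python's hashlib. The commit message does not say which BLAKE2 variant or digest length make.sh uses, so BLAKE2b with its default 64-byte digest is an assumption here. Both function names are hypothetical.

```python
import hashlib


def blake2_hexdigest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.blake2b()  # assumption: BLAKE2b, default 64-byte digest
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Compare the computed digest with the recorded one, ignoring case."""
    return blake2_hexdigest(path) == expected_hex.strip().lower()
```

A mismatch means the file is corrupted or differs from the one the checksum was recorded for, and the build should stop rather than unpack it.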

In preparation of this patch, the toolchain files currently used have
been supplied with BLAKE2 checksums as well on
https://source.ipfire.org/.

Cc: Michael Tremer <michael.tremer@ipfire.org>
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Acked-by: Michael Tremer <michael.tremer@ipfire.org>
2022-04-02 14:19:25 +00:00
Adolf Belka
43164c6557 pcre2: Update to version 10.39
- Update from 10.37 to 10.39
- Update of rootfile
- Changelog
  Version 10.39 29-October-2021
    1. Fix incorrect detection of alternatives in first character search in JIT.
    2. Merged patch from @carenas (GitHub #28):
       Visual Studio 2013 includes support for %zu and %td, so let newer
       versions of it avoid the fallback, and while at it, make sure that
       the first check is for DISABLE_PERCENT_ZT so it will be always
       honoured if chosen.
       ptrdiff_t is signed, so use a signed type instead, and make sure
       that an appropriate width is chosen if pointers are 64bit wide and
       long is not (ex: Windows 64bit).
       IMHO removing the cast (and therefore the possibility of truncation)
       makes the code cleaner and the fallback is likely portable enough
       with all 64-bit POSIX systems doing LP64 except for Windows.
    3. Merged patch from @carenas (GitHub #29) to update to Unicode 14.0.0.
    4. Merged patch from @carenas (GitHub #30):
       * Cleanup: remove references to no longer used stdint.h
         Since 19c50b9d (Unconditionally use inttypes.h instead of trying for stdint.h
         (simplification) and remove the now unnecessary inclusion in
         pcre2_internal.h., 2018-11-14), stdint.h is no longer used.
         Remove checks for it in autotools and CMake and document better the expected
         build failures for systems that might have stdint.h (C99) and not inttypes.h
         (from POSIX), like old Windows.
       * Cleanup: remove detection for inttypes.h which is a hard dependency
         CMake checks for standard headers are not meant to be used for hard
         dependencies, so will prevent a possible fallback from working.
         Alternatively, the header could be checked to make the configuration fail
         instead of breaking the build, but that was punted, as it was missing anyway
         from autotools.
    5. Merged patch from @carenas (GitHub #32):
       * jit: allow building with ancient MSVC versions
         Visual Studio older than 2013 fails to build with JIT enabled, because it is
         unable to parse non C89 compatible syntax, with mixed declarations and code.
         While most recent compilers wouldn't even report this as a warning since it
         is valid C99, it could be also made visible by adding to gcc/clang the
         -Wdeclaration-after-statement flag at build time.
         Move the code below the affected definitions.
       * pcre2grep: avoid mixing declarations with code
         Since d5a61ee8 (Patch to detect (and ignore) symlink loops in pcre2grep,
         2021-08-28), code will fail to build in a strict C89 compiler.
         Reformat slightly to make it C89 compatible again.
  Version 10.38 01-October-2021
    1. Fix invalid single character repetition issues in JIT when the repetition
       is inside a capturing bracket and the bracket is preceded by character
       literals.
    2. Installed revised CMake configuration files provided by Jan-Willem Blokland.
       This extends the CMake build system to build both static and shared libraries
       in one go, builds the static library with PIC, and exposes PCRE2 libraries
       using the CMake config files. JWB provided these notes:
       - Introduced CMake variable BUILD_STATIC_LIBS to build the static library.
       - Make a small modification to config-cmake.h.in by removing the PCRE2_STATIC
         variable. Added PCRE2_STATIC variable to the static build using the
         target_compile_definitions() function.
       - Extended the CMake config files.
         - Introduced CMake variable PCRE2_USE_STATIC_LIBS to easily switch between
           the static and shared libraries.
         - Added the PCRE_STATIC variable to the target compile definitions for the
           import of the static library.
       Building static and shared libraries using MSVC results in a name clash of
       the libraries. Both static and shared library builds create, for example, the
       file pcre2-8.lib. Therefore, I decided to change the static library names by
       adding "-static". For example, pcre2-8.lib has become pcre2-8-static.lib.
       [Comment by PH: this is MSVC-specific. It doesn't happen on Linux.]
    3. Increased the minimum release number for CMake to 3.0.0 because older than
       2.8.12 is deprecated (it was set to 2.8.5) and causes warnings. Even 3.0.0 is
       quite old; it was released in 2014.
    4. Implemented a modified version of Thomas Tempelmann's pcre2grep patch for
       detecting symlink loops. This is dependent on the availability of realpath(),
       which is now tested for in ./configure and CMakeLists.txt.
    5. Implemented a modified version of Thomas Tempelmann's patch for faster
       case-independent "first code unit" searches for unanchored patterns in 8-bit
       mode in the interpreters. Instead of just remembering whether one case matched
       or not, it remembers the position of a previous match so as to avoid
       unnecessary repeated searching.
    6. Perl now locks out \K in lookarounds, so PCRE2 now does the same by default.
       However, just in case anybody was relying on the old behaviour, there is an
       option called PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK that enables the old behaviour.
       An option has also been added to pcre2grep to enable this.
    7. Re-enable a JIT optimization which was unintentionally disabled in 10.35.
    8. There is a loop counter to catch excessively crazy patterns when checking
       the lengths of lookbehinds at compile time. This was incorrectly getting reset
       whenever a lookahead was processed, leading to some fuzzer-generated patterns
       taking a very long time to compile when (?|) was present in the pattern,
       because (?|) disables caching of group lengths.

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Michael Tremer <michael.tremer@ipfire.org>
2022-01-14 13:38:23 +00:00
Adolf Belka
7112adbc86 pcre2: Update to 10.37
- Update from 10.36 to 10.37
- Update rootfile
- find-dependencies run to check impact of so lib bump
   No issues found
- Changelog
   Version 10.37 26-May-2021
    1. Change RunGrepTest to use tr instead of sed when testing with binary
       zero bytes, because sed varies a lot from system to system and has problems
       with binary zeros. This is from Bugzilla #2681. Patch from Jeremie
       Courreges-Anglas via Nam Nguyen. This fixes RunGrepTest for OpenBSD. Later:
       it broke it for at least one version of Solaris, where tr can't handle binary
       zeros. However, that system had /usr/xpg4/bin/tr installed, which works OK, so
       RunGrepTest now checks for that command and uses it if found.
    2. Compiling with gcc 10.2's -fanalyzer option showed up a hypothetical problem
       with a NULL dereference. I don't think this case could ever occur in practice,
       but I have put in a check in order to get rid of the compiler error.
    3. An alternative patch for CMakeLists.txt because 10.36 #4 breaks CMake on
       Windows. Patch from email@cs-ware.de fixes bugzilla #2688.
    4. Two bugs related to over-large numbers have been fixed so the behaviour is
       now the same as Perl.
       (a) A pattern such as /\214748364/ gave an overflow error instead of being
           treated as the octal number \214 followed by literal digits.
       (b) A sequence such as {65536 that has no terminating } so is not a
           quantifier was nevertheless complaining that a quantifier number was too big.
    5. A run of autoconf suggested that configure.ac was out-of-date with respect
       to the latest autoconf. Running autoupdate made some valid changes, some valid
       suggestions, and also some invalid changes, which were fixed by hand. Autoconf
       now runs clean and the resulting "configure" seems to work, so I hope nothing
       is broken. Later: the requirement for autoconf 2.70 broke some automatic test
       robots. It doesn't seem to be necessary: trying a reduction to 2.60.
    6. The pattern /a\K.(?0)*/ when matched against "abac" by the interpreter gave
       the answer "bac", whereas Perl and JIT both yield "c". This was because the
       effect of \K was not propagating back from the full pattern recursion. Other
       recursions such as /(a\K.(?1)*)/ did not have this problem.
    7. Restore single character repetition optimization in JIT. Currently fewer
       character repetitions are optimized than in 10.34.
    8. When the names of the functions in the POSIX wrapper were changed to
       pcre2_regcomp() etc. (see change 10.33 #4 below), functions with the original
       names were left in the library so that pre-compiled programs would still work.
       However, this has proved troublesome when programs link with several libraries,
       some of which use PCRE2 via the POSIX interface while others use a native POSIX
       library. For this reason, the POSIX function names are removed in this release.
       The macros in pcre2posix.h should ensure that re-compiling fixes any programs
       that haven't been compiled since before 10.33.

Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-06-04 10:49:47 +00:00
Michael Tremer
e30e60b1c6 pcre2: Disable JIT for RISC-V
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-03-06 11:14:51 +00:00
Michael Tremer
b0c37190a5 pcre2: New package
pcre is no longer receiving any feature updates, but only bug fixes.

pcre2 is the successor which is replacing pcre.

Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
2021-02-09 16:10:07 +00:00