- Update from version 5.2.8 to 5.4.0
- Update of rootfile
- Changelog
5.4.0 (2022-12-13)
This bumps the minor version of liblzma because new features were
added. The API and ABI are still backward compatible with liblzma
5.2.x and 5.0.x.
Since 5.3.5beta:
* All fixes from 5.2.10.
* The ARM64 filter is now stable. The xz option is now --arm64.
Decompression requires XZ Utils 5.4.0. In the future the ARM64
filter will be supported by XZ for Java, XZ Embedded (including
the version in Linux), LZMA SDK, and 7-Zip.
* Translations:
- Updated Catalan, Croatian, German, Romanian, and Turkish
translations.
- Updated German man page translations.
- Added Romanian man page translations.
Summary of new features added in the 5.3.x development releases:
* liblzma:
- Added threaded .xz decompressor lzma_stream_decoder_mt().
It can use multiple threads with .xz files that have multiple
Blocks with size information in Block Headers. The threaded
encoder in xz has always created such files.
Single-threaded encoder cannot store the size information in
Block Headers even if one used LZMA_FULL_FLUSH to create
multiple Blocks, so this threaded decoder cannot use multiple
threads with such files.
If there are multiple Streams (concatenated .xz files), one
Stream will be decompressed completely before starting the
next Stream.
- A new decoder flag LZMA_FAIL_FAST was added. It makes the
threaded decompressor report errors soon instead of first
flushing all pending data before the error location.
- New Filter IDs:
* LZMA_FILTER_ARM64 is for ARM64 binaries.
* LZMA_FILTER_LZMA1EXT is for raw LZMA1 streams that don't
necessarily use the end marker.
- Added lzma_str_to_filters(), lzma_str_from_filters(), and
lzma_str_list_filters() to convert a preset or a filter chain
string to a lzma_filter[] and vice versa. These should make
it easier to write applications that allow users to specify
custom compression options.
- Added lzma_filters_free() which can be convenient for freeing
the filter options in a filter chain (an array of lzma_filter
structures).
- lzma_file_info_decoder() to makes it a little easier to get
the Index field from .xz files. This helps in getting the
uncompressed file size but an easy-to-use random access
API is still missing which has existed in XZ for Java for
a long time.
- Added lzma_microlzma_encoder() and lzma_microlzma_decoder().
It is used by erofs-utils and may be used by others too.
The MicroLZMA format is a raw LZMA stream (without end marker)
whose first byte (always 0x00) has been replaced with
bitwise-negation of the LZMA properties (lc/lp/pb). It was
created for use in EROFS but may be used in other contexts
as well where it is important to avoid wasting bytes for
stream headers or footers. The format is also supported by
XZ Embedded (the XZ Embedded version in Linux got MicroLZMA
support in Linux 5.16).
The MicroLZMA encoder API in liblzma can compress into a
fixed-sized output buffer so that as much data is compressed
as can be fit into the buffer while still creating a valid
MicroLZMA stream. This is needed for EROFS.
- Added lzma_lzip_decoder() to decompress the .lz (lzip) file
format version 0 and the original unextended version 1 files.
Also lzma_auto_decoder() supports .lz files.
- lzma_filters_update() can now be used with the multi-threaded
encoder (lzma_stream_encoder_mt()) to change the filter chain
after LZMA_FULL_BARRIER or LZMA_FULL_FLUSH.
- In lzma_options_lzma, allow nice_len = 2 and 3 with the match
finders that require at least 3 or 4. Now it is internally
rounded up if needed.
- CLMUL-based CRC64 on x86-64 and E2K with runtime processor
detection. On 32-bit x86 it currently isn't available unless
--disable-assembler is used which can make the non-CLMUL
CRC64 slower; this might be fixed in the future.
- Building with --disable-threads --enable-small
is now thread-safe if the compiler supports
__attribute__((__constructor__)).
* xz:
- Using -T0 (--threads=0) will now use multi-threaded encoder
even on a single-core system. This is to ensure that output
from the same xz binary is identical on both single-core and
multi-core systems.
- --threads=+1 or -T+1 is now a way to put xz into
multi-threaded mode while using only one worker thread.
The + is ignored if the number is not 1.
- A default soft memory usage limit is now used for compression
when -T0 is used and no explicit limit has been specified.
This soft limit is used to restrict the number of threads
but if the limit is exceeded with even one thread then xz
will continue with one thread using the multi-threaded
encoder and this limit is ignored. If the number of threads
is specified manually then no default limit will be used;
this affects only -T0.
This change helps on systems that have very many cores and
using all of them for xz makes no sense. Previously xz -T0
could run out of memory on such systems because it attempted
to reserve memory for too many threads.
This also helps with 32-bit builds which don't have a large
amount of address space that would be required for many
threads. The default soft limit for -T0 is at most 1400 MiB
on all 32-bit platforms.
- Previously a low value in --memlimit-compress wouldn't cause
xz to switch from multi-threaded mode to single-threaded mode
if the limit cannot otherwise be met; xz failed instead. Now
xz can switch to single-threaded mode and then, if needed,
scale down the LZMA2 dictionary size too just like it already
did when it was started in single-threaded mode.
- The option --no-adjust no longer prevents xz from scaling down
the number of threads as that doesn't affect the compressed
output (only performance). Now --no-adjust only prevents
adjustments that affect compressed output, that is, with
--no-adjust xz won't switch from multi-threaded mode to
single-threaded mode and won't scale down the LZMA2
dictionary size.
- Added a new option --memlimit-mt-decompress=LIMIT. This is
used to limit the number of decompressor threads (possibly
falling back to single-threaded mode) but it will never make
xz refuse to decompress a file. This has a system-specific
default value because without any limit xz could end up
allocating memory for the whole compressed input file, the
whole uncompressed output file, multiple thread-specific
decompressor instances and so on. Basically xz could
attempt to use an insane amount of memory even with fairly
common files. The system-specific default value is currently
the same as the one used for compression with -T0.
The new option works together with the existing option
--memlimit-decompress=LIMIT. The old option sets a hard limit
that must not be exceeded (xz will refuse to decompress)
while the new option only restricts the number of threads.
If the limit set with --memlimit-mt-decompress is greater
than the limit set with --memlimit-compress, then the latter
value is used also for --memlimit-mt-decompress.
- Added new information to the output of xz --info-memory and
new fields to the output of xz --robot --info-memory.
- In --lzma2=nice=NUMBER allow 2 and 3 with all match finders
now that liblzma handles it.
- Don't mention endianness for ARM and ARM-Thumb filters in
--long-help. The filters only work for little endian
instruction encoding but modern ARM processors using
big endian data access still use little endian
instruction encoding. So the help text was misleading.
In contrast, the PowerPC filter is only for big endian
32/64-bit PowerPC code. Little endian PowerPC would need
a separate filter.
- Added decompression support for the .lz (lzip) file format
version 0 and the original unextended version 1. It is
autodetected by default. See also the option --format on
the xz man page.
- Sandboxing enabled by default:
* Capsicum (FreeBSD)
* pledge(2) (OpenBSD)
* Scripts now support the .lz format using xz.
* A few new tests were added.
* The liblzma-specific tests are now supported in CMake-based
builds too ("make test").
5.2.10 (2022-12-13)
* xz: Don't modify argv[] when parsing the --memlimit* and
--block-list command line options. This fixes confusing
arguments in process listing (like "ps auxf").
* GNU/Linux only: Use __has_attribute(__symver__) to detect if
that attribute is supported. This fixes build on Mandriva where
Clang is patched to define __GNUC__ to 11 by default (instead
of 4 as used by Clang upstream).
5.2.9 (2022-11-30)
* liblzma:
- Fixed an infinite loop in LZMA encoder initialization
if dict_size >= 2 GiB. (The encoder only supports up
to 1536 MiB.)
- Fixed two cases of invalid free() that can happen if
a tiny allocation fails in encoder re-initialization
or in lzma_filters_update(). These bugs had some
similarities with the bug fixed in 5.2.7.
- Fixed lzma_block_encoder() not allowing the use of
LZMA_SYNC_FLUSH with lzma_code() even though it was
documented to be supported. The sync-flush code in
the Block encoder was already used internally via
lzma_stream_encoder(), so this was just a missing flag
in the lzma_block_encoder() API function.
- GNU/Linux only: Don't put symbol versions into static
liblzma as it breaks things in some cases (and even if
it didn't break anything, symbol versions in static
libraries are useless anyway). The downside of the fix
is that if the configure options --with-pic or --without-pic
are used then it's not possible to build both shared and
static liblzma at the same time on GNU/Linux anymore;
with those options --disable-static or --disable-shared
must be used too.
* New email address for bug reports is <xz@tukaani.org> which
forwards messages to Lasse Collin and Jia Tan.
Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
- Update from version 5.2.5 to 5.2.8
- Update of rootfile
- Remove xzgrep-ZDI-CAN-16587 patch as the contents are now integrated into the source
tarball and with an improved quicker method - see changelog below.
- Changelog
5.2.8 (2022-11-13)
* xz:
- If xz cannot remove an input file when it should, this
is now treated as a warning (exit status 2) instead of
an error (exit status 1). This matches GNU gzip and it
is more logical as at that point the output file has
already been successfully closed.
- Fix handling of .xz files with an unsupported check type.
Previously such printed a warning message but then xz
behaved as if an error had occurred (didn't decompress,
exit status 1). Now a warning is printed, decompression
is done anyway, and exit status is 2. This used to work
slightly before 5.0.0. In practice this bug matters only
if xz has been built with some check types disabled. As
instructed in PACKAGERS, such builds should be done in
special situations only.
- Fix "xz -dc --single-stream tests/files/good-0-empty.xz"
which failed with "Internal error (bug)". That is,
--single-stream was broken if the first .xz stream in
the input file didn't contain any uncompressed data.
- Fix displaying file sizes in the progress indicator when
working in passthru mode and there are multiple input files.
Just like "gzip -cdf", "xz -cdf" works like "cat" when the
input file isn't a supported compressed file format. In
this case the file size counters weren't reset between
files so with multiple input files the progress indicator
displayed an incorrect (too large) value.
* liblzma:
- API docs in lzma/container.h:
* Update the list of decoder flags in the decoder
function docs.
* Explain LZMA_CONCATENATED behavior with .lzma files
in lzma_auto_decoder() docs.
- OpenBSD: Use HW_NCPUONLINE to detect the number of
available hardware threads in lzma_physmem().
- Fix use of wrong macro to detect x86 SSE2 support.
__SSE2_MATH__ was used with GCC/Clang but the correct
one is __SSE2__. The first one means that SSE2 is used
for floating point math which is irrelevant here.
The affected SSE2 code isn't used on x86-64 so this affects
only 32-bit x86 builds that use -msse2 without -mfpmath=sse
(there is no runtime detection for SSE2). It improves LZMA
compression speed (not decompression).
- Fix the build with Intel C compiler 2021 (ICC, not ICX)
on Linux. It defines __GNUC__ to 10 but doesn't support
the __symver__ attribute introduced in GCC 10.
* Scripts: Ignore warnings from xz by using --quiet --no-warn.
This is needed if the input .xz files use an unsupported
check type.
* Translations:
- Updated Croatian and Turkish translations.
- One new translations wasn't included because it needed
technical fixes. It will be in upcoming 5.4.0. No new
translations will be added to the 5.2.x branch anymore.
- Renamed the French man page translation file from
fr_FR.po to fr.po and thus also its install directory
(like /usr/share/man/fr_FR -> .../fr).
- Man page translations for upcoming 5.4.0 are now handled
in the Translation Project.
* Update doc/faq.txt a little so it's less out-of-date.
5.2.7 (2022-09-30)
* liblzma:
- Made lzma_filters_copy() to never modify the destination
array if an error occurs. lzma_stream_encoder() and
lzma_stream_encoder_mt() already assumed this. Before this
change, if a tiny memory allocation in lzma_filters_copy()
failed it would lead to a crash (invalid free() or invalid
memory reads) in the cleanup paths of these two encoder
initialization functions.
- Added missing integer overflow check to lzma_index_append().
This affects xz --list and other applications that decode
the Index field from .xz files using lzma_index_decoder().
Normal decompression of .xz files doesn't call this code
and thus most applications using liblzma aren't affected
by this bug.
- Single-threaded .xz decoder (lzma_stream_decoder()): If
lzma_code() returns LZMA_MEMLIMIT_ERROR it is now possible
to use lzma_memlimit_set() to increase the limit and continue
decoding. This was supposed to work from the beginning
but there was a bug. With other decoders (.lzma or
threaded .xz decoder) this already worked correctly.
- Fixed accumulation of integrity check type statistics in
lzma_index_cat(). This bug made lzma_index_checks() return
only the type of the integrity check of the last Stream
when multiple lzma_indexes were concatenated. Most
applications don't use these APIs but in xz it made
xz --list not list all check types from concatenated .xz
files. In xz --list --verbose only the per-file "Check:"
lines were affected and in xz --robot --list only the "file"
line was affected.
- Added ABI compatibility with executables that were linked
against liblzma in RHEL/CentOS 7 or other liblzma builds
that had copied the problematic patch from RHEL/CentOS 7
(xz-5.2.2-compat-libs.patch). For the details, see the
comment at the top of src/liblzma/validate_map.sh.
WARNING: This uses __symver__ attribute with GCC >= 10.
In other cases the traditional __asm__(".symver ...")
is used. Using link-time optimization (LTO, -flto) with
GCC versions older than 10 can silently result in
broken liblzma.so.5 (incorrect symbol versions)! If you
want to use -flto with GCC, you must use GCC >= 10.
LTO with Clang seems to work even with the traditional
__asm__(".symver ...") method.
* xzgrep: Fixed compatibility with old shells that break if
comments inside command substitutions have apostrophes (').
This problem was introduced in 5.2.6.
* Build systems:
- New #define in config.h: HAVE_SYMBOL_VERSIONS_LINUX
- Windows: Fixed liblzma.dll build with Visual Studio project
files. It broke in 5.2.6 due to a change that was made to
improve CMake support.
- Windows: Building liblzma with UNICODE defined should now
work.
- CMake files are now actually included in the release tarball.
They should have been in 5.2.5 already.
- Minor CMake fixes and improvements.
* Added a new translation: Turkish
5.2.6 (2022-08-12)
* xz:
- The --keep option now accepts symlinks, hardlinks, and
setuid, setgid, and sticky files. Previously this required
using --force.
- When copying metadata from the source file to the destination
file, don't try to set the group (GID) if it is already set
correctly. This avoids a failure on OpenBSD (and possibly on
a few other OSes) where files may get created so that their
group doesn't belong to the user, and fchown(2) can fail even
if it needs to do nothing.
- Cap --memlimit-compress to 2000 MiB instead of 4020 MiB on
MIPS32 because on MIPS32 userspace processes are limited
to 2 GiB of address space.
* liblzma:
- Fixed a missing error-check in the threaded encoder. If a
small memory allocation fails, a .xz file with an invalid
Index field would be created. Decompressing such a file would
produce the correct output but result in an error at the end.
Thus this is a "mild" data corruption bug. Note that while
a failed memory allocation can trigger the bug, it cannot
cause invalid memory access.
- The decoder for .lzma files now supports files that have
uncompressed size stored in the header and still use the
end of payload marker (end of stream marker) at the end
of the LZMA stream. Such files are rare but, according to
the documentation in LZMA SDK, they are valid.
doc/lzma-file-format.txt was updated too.
- Improved 32-bit x86 assembly files:
* Support Intel Control-flow Enforcement Technology (CET)
* Use non-executable stack on FreeBSD.
- Visual Studio: Use non-standard _MSVC_LANG to detect C++
standard version in the lzma.h API header. It's used to
detect when "noexcept" can be used.
* xzgrep:
- Fixed arbitrary command injection via a malicious filename
(CVE-2022-1271, ZDI-CAN-16587). A standalone patch for
this was released to the public on 2022-04-07. A slight
robustness improvement has been made since then and, if
using GNU or *BSD grep, a new faster method is now used
that doesn't use the old sed-based construct at all. This
also fixes bad output with GNU grep >= 3.5 (2020-09-27)
when xzgrepping binary files.
This vulnerability was discovered by:
cleemy desu wayo working with Trend Micro Zero Day Initiative
- Fixed detection of corrupt .bz2 files.
- Improved error handling to fix exit status in some situations
and to fix handling of signals: in some situations a signal
didn't make xzgrep exit when it clearly should have. It's
possible that the signal handling still isn't quite perfect
but hopefully it's good enough.
- Documented exit statuses on the man page.
- xzegrep and xzfgrep now use "grep -E" and "grep -F" instead
of the deprecated egrep and fgrep commands.
- Fixed parsing of the options -E, -F, -G, -P, and -X. The
problem occurred when multiple options were specied in
a single argument, for example,
echo foo | xzgrep -Fe foo
treated foo as a filename because -Fe wasn't correctly
split into -F -e.
- Added zstd support.
* xzdiff/xzcmp:
- Fixed wrong exit status. Exit status could be 2 when the
correct value is 1.
- Documented on the man page that exit status of 2 is used
for decompression errors.
- Added zstd support.
* xzless:
- Fix less(1) version detection. It failed if the version number
from "less -V" contained a dot.
* Translations:
- Added new translations: Catalan, Croatian, Esperanto,
Korean, Portuguese, Romanian, Serbian, Spanish, Swedish,
and Ukrainian
- Updated the Brazilian Portuguese translation.
- Added French man page translation. This and the existing
German translation aren't complete anymore because the
English man pages got a few updates and the translators
weren't reached so that they could update their work.
* Build systems:
- Windows: Fix building of resource files when config.h isn't
used. CMake + Visual Studio can now build liblzma.dll.
- Various fixes to the CMake support. Building static or shared
liblzma should work fine in most cases. In contrast, building
the command line tools with CMake is still clearly incomplete
and experimental and should be used for testing only.
Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
Reviewed-by: Michael Tremer <michael.tremer@ipfire.org>
- Malicious filenames can make xzgrep to write to arbitrary files
or (with a GNU sed extension) lead to arbitrary code execution.
- xzgrep from XZ Utils versions up to and including 5.2.5 are
affected. 5.3.1alpha and 5.3.2alpha are affected as well.
- This bug was inherited from gzip's zgrep. gzip 1.12 includes
a fix for zgrep.
- CU167 has gzip-1.12 with the fix already merged.
Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
Reviewed-by: Peter Müller <peter.mueller@ipfire.org>
Historically, the MD5 checksums in our LFS files serve as a protection
against broken downloads, or accidentally corrupted source files.
While the sources are nowadays downloaded via HTTPS, it make sense to
beef up integrity protection for them, since transparently intercepting
TLS is believed to be feasible for more powerful actors, and the state
of the public PKI ecosystem is clearly not helping.
Therefore, this patch switches from MD5 to BLAKE2, updating all LFS
files as well as make.sh to deal with this checksum algorithm. BLAKE2 is
notably faster (and more secure) than SHA2, so the performance penalty
introduced by this patch is negligible, if noticeable at all.
In preparation of this patch, the toolchain files currently used have
been supplied with BLAKE2 checksums as well on
https://source.ipfire.org/.
Cc: Michael Tremer <michael.tremer@ipfire.org>
Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Acked-by: Michael Tremer <michael.tremeripfire.org>
This will allow us to run multiple builds on the same
system at the same time (or at least have them on disk).
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>