Age | Commit message (Collapse) | Author |
|
Thanks to the meta buffer infrastructure, metadata-compressed inodes are
just read from the metabox inode instead of the blockdevice (or backing
file) inode.
The same is true for shared extended attributes.
When metadata compression is enabled, inode numbers are divided from
on-disk NIDs because of non-LTS 32-bit application compatibility.
Co-developed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Bo Liu (OpenAnolis) <liubo03@inspur.com>
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250722003229.2121752-1-hsiangkao@linux.alibaba.com
|
|
There is no need to keep additional local metabufs since we already
have one in `struct erofs_map_blocks`.
This was actually a leftover when applying meta buffers to zmap
operations, see commit 09c543798c3c ("erofs: use meta buffers for
zmap operations").
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250716064152.3537457-1-hsiangkao@linux.alibaba.com
|
|
- need_kmap is always true except for a ztailpacking case; thus, just
open-code that one;
- The upcoming metadata compression will add a new boolean, so simplify
this first.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250714090907.4095645-1-hsiangkao@linux.alibaba.com
|
|
All below functions will do sanity check on m->type, let's move sanity
check to z_erofs_load_compact_lcluster() for cleanup.
- z_erofs_map_blocks_fo
- z_erofs_get_extent_compressedlen
- z_erofs_get_extent_decompressedlen
- z_erofs_extent_lookback
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250708110928.3110375-1-chao@kernel.org
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Fragments aren't limited by Z_EROFS_PCLUSTER_MAX_DSIZE. However, if
a fragment's logical length is larger than Z_EROFS_PCLUSTER_MAX_DSIZE
but the fragment is not the whole inode, it currently returns
-EOPNOTSUPP because m_flags has the wrong EROFS_MAP_ENCODED flag set.
It is not intended by design but should be rare, as it can only be
reproduced by mkfs with `-Eall-fragments` in a specific case.
Let's normalize fragment m_flags using the new EROFS_MAP_FRAGMENT.
Reported-by: Axel Fontaine <axel@axelfontaine.com>
Closes: https://github.com/erofs/erofs-utils/issues/23
Fixes: 7c3ca1838a78 ("erofs: restrict pcluster size limitations")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250711195826.3601157-1-hsiangkao@linux.alibaba.com
|
|
It is possible when an inode is split into segments for multi-threaded
compression, and the tail extent of a segment could also be small.
Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250620153108.1368029-1-hsiangkao@linux.alibaba.com
|
|
Crafted encoded extents could record out-of-range `lstart`, which should
not happen in normal cases.
It caused an iomap_iter_done() complaint [1] reported by syzbot.
[1] https://lore.kernel.org/r/684cb499.a00a0220.c6bd7.0010.GAE@google.com
Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata")
Reported-and-tested-by: syzbot+d8f000c609f05f52d9b5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8f000c609f05f52d9b5
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250619032839.2642193-1-hsiangkao@linux.alibaba.com
|
|
- The MSB 32 bits of `z_fragmentoff` are available only in extent
records of size >= 8B.
- Use round_down() to calculate `lstart` as well as increase `pos`
correspondingly for extent records of size == 8B.
Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250408114448.4040220-2-hsiangkao@linux.alibaba.com
|
|
Implement the extent metadata parsing described in the previous commit.
For 16-byte and 32-byte extent records, currently it is just a trivial
binary search without considering the last access footprint, but it can
be optimized for better sequential performance later.
Tail fragments are supported, but ztailpacking feature is not
for simplicity.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Acked-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250310095459.2620647-9-hsiangkao@linux.alibaba.com
|
|
Previously, EROFS provided both (non-)compact compressed indexes to
keep necessary hints for each logical block, enabling O(1) random
indexing. This approach was originally designed for small compression
units (e.g., 4KiB), where compressed data is strictly block-aligned via
fixed-sized output compression.
However, EROFS now supports big pclusters up to 1MiB and many users use
large configurations to minimize image sizes. For such configurations,
the total number of extents decreases significantly (e.g., only 1,024
extents for a 1GiB file using 1MiB pclusters), then runtime metadata
overhead becomes negligible compared to data I/O and decoding costs.
Additionally, some popular compression algorithm (mainly Zstd) still
lacks native fixed-sized output compression support (although it's
planned by their authors). Instead of just waiting for compressor
improvements, let's adopt byte-oriented extents, allowing these
compressors to retain their current methods.
For example, it speeds up Zstd compression a lot:
Processor: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz * 96
Dataset: enwik9
Build time Size Type Command Line
3m52.339s 266653696 FO -C524288 -zzstd,22
3m48.549s 266174464 FO -E48bit -C524288 -zzstd,22
0m12.821s 272134144 FI -E48bit -C1048576 --max-extent-bytes=1048576 -zzstd,22
0m14.528s 248987648 FO -C1048576 -zlzma,9
0m14.605s 248504320 FO -E48bit -C1048576 -zlzma,9
Encoded extents are structured as an array of `struct z_erofs_extent`,
sorted by logical address in ascending order:
__le32 plen // encoded length, algorithm id and flags
__le32 pstart_lo // physical offset LSB
__le32 pstart_hi // physical offset MSB
__le32 lstart_lo // logical offset
__le32 lstart_hi // logical offset MSB
..
Note that prefixed reduced records can be used to minimize metadata for
specific cases (e.g. lstart less than 32 bits, then 32 to 16 bytes).
If the logical lengths of all encoded extents are the same, 4-byte
(plen) and 8-byte (plen, pstart_lo) records can be used. Or, 16-byte
(plen .. lstart_lo) and 32-byte full records have to be used instead.
If 16-byte and 32-byte records are used, the total number of extents
is kept in `struct z_erofs_map_header`, and binary search can be
applied on them. Note that `eytzinger order` is not considerd because
data sequential access is important.
If 4-byte records are used, 8-byte start physical offset is between
`struct z_erofs_map_header` and the `plen` array.
In addition, 64-bit physical offsets can be applied with new encoded
extent format to match full 48-bit block addressing.
Remove redundant comments around `struct z_erofs_lcluster_index` too.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Acked-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250310095459.2620647-8-hsiangkao@linux.alibaba.com
|
|
Simplify the logic in z_erofs_fill_inode_lazy() by combining the
handling of ztailpacking and fragments, as they are mutually exclusive.
Note that `h->h_clusterbits >> Z_EROFS_FRAGMENT_INODE_BIT` is handled
above, so no need to duplicate the check.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250224123747.1387122-2-hsiangkao@linux.alibaba.com
|
|
Use `z_idata_size != 0` to indicate that ztailpacking is enabled.
`Z_EROFS_ADVISE_INLINE_PCLUSTER` cannot be ignored, as `h_idata_size`
could be non-zero prior to erofs-utils 1.6 [1].
Additionally, merge `z_idataoff` and `z_fragmentoff` since these two
features are mutually exclusive for a given inode.
[1] https://git.kernel.org/xiang/erofs-utils/c/547bea3cb71a
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250225114038.3259726-1-hsiangkao@linux.alibaba.com
|
|
Since EROFS_KMAP_ATOMIC is no longer valid, get rid of erofs_kmap_type too.
Signed-off-by: Bo Liu <liubo03@inspur.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250217093141.2659-1-liubo03@inspur.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
There's no need to enumerate each type. No logic changes.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250210032923.3382136-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
- Set `compressedblks = 1` directly for non-bigpcluster cases. This
simplifies the logic a bit since lcluster sizes larger than one block
are unsupported and the details remain unclear.
- For Z_EROFS_LCLUSTER_TYPE_PLAIN pclusters, avoid assuming
`compressedblks = 1` by default. Instead, check if
Z_EROFS_ADVISE_BIG_PCLUSTER_2 is set.
It basically has no impact to existing valid images, but it's useful to
find the gap to prepare for large PLAIN pclusters.
Link: https://lore.kernel.org/r/20250123090109.973463-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
- Get rid of unpack_compacted_index() and fold it into
z_erofs_load_compact_lcluster();
- Avoid a goto.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250114034429.431408-1-hsiangkao@linux.alibaba.com
|
|
syzbot reported a WARNING in iomap_iter_done:
iomap_fiemap+0x73b/0x9b0 fs/iomap/fiemap.c:80
ioctl_fiemap fs/ioctl.c:220 [inline]
Generally, NONHEAD lclusters won't have delta[1]==0, except for crafted
images and filesystems created by pre-1.0 mkfs versions.
Previously, it would immediately bail out if delta[1]==0, which led to
inadequate decompressed lengths (thus FIEMAP is impacted). Treat it as
delta[1]=1 to work around these legacy mkfs versions.
`lclusterbits > 14` is illegal for compact indexes, error out too.
Reported-by: syzbot+6c0b301317aa0156f9eb@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/67373c0c.050a0220.2a2fcc.0079.GAE@google.com
Tested-by: syzbot+6c0b301317aa0156f9eb@syzkaller.appspotmail.com
Fixes: d95ae5e25326 ("erofs: add support for the full decompressed length")
Fixes: 001b8ccd0650 ("erofs: fix compact 4B support for 16k block size")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241115173651.3339514-1-hsiangkao@linux.alibaba.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
"The main one fixes a syzbot issue due to the invalid inode type out of
file-backed mounts. The others are minor cleanups without actual logic
changes.
Summary:
- Make sure only regular inodes can be used for file-backed mounts
- Two minor codebase cleanups"
* tag 'erofs-for-6.12-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: get rid of kaddr in `struct z_erofs_maprecorder`
erofs: get rid of z_erofs_try_to_claim_pcluster()
erofs: ensure regular inodes for file-backed mounts
|
|
`kaddr` becomes useless after switching to metabuf.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20241010235830.1535616-1-hsiangkao@linux.alibaba.com
|
|
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
|
Error out if {en,de}encoded size of a pcluster is unsupported:
Maximum supported encoded size (of a pcluster): 1 MiB
Maximum supported decoded size (of a pcluster): 12 MiB
Users can still choose to use supported large configurations (e.g.,
for archival purposes), but there may be performance penalties in
low-memory scenarios compared to smaller pclusters.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240912074156.2925394-1-hsiangkao@linux.alibaba.com
|
|
Consolidate them under erofs_map_blocks_* for simplicity since we
have many other ways to know if a given inode is compressed or not.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240710083459.208362-1-hongzhen@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Sometimes, the on-disk metadata might be invalid due to user
interrupts, storage failures, or other unknown causes.
In that case, z_erofs_map_blocks_iter() may still return a valid
m_llen while other fields remain invalid (e.g., m_plen can be 0).
Due to the return value of z_erofs_scan_folio() in some path will
be ignored on purpose, the following z_erofs_scan_folio() could
then use the invalid value by accident.
Let's reset m_llen to 0 to prevent this.
Link: https://lore.kernel.org/r/20240629185743.2819229-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
There's only one place where struct z_erofs_maprecorder ->kaddr is
used not in the same function that has assigned it -
the value read in unpack_compacted_index() gets calculated in
z_erofs_load_compact_lcluster(). With minor massage we can switch
to storing it with offset in block already added.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20240425195944.GE1031757@ZenIV
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Most of the callers of erofs_read_metabuf() have the following form:
block = erofs_blknr(sb, offset);
off = erofs_blkoff(sb, offset);
p = erofs_read_metabuf(...., erofs_pos(sb, block), ...);
if (IS_ERR(p))
return PTR_ERR(p);
q = p + off;
// no further uses of p, block or off.
The value passed to erofs_read_metabuf() is offset rounded down to block
size, i.e. offset - off. Passing offset as-is would increase the return
value by off in case of success and keep the return value unchanged in
in case of error. In other words, the same could be achieved by
q = erofs_read_metabuf(...., offset, ...);
if (IS_ERR(q))
return PTR_ERR(q);
This commit convert these simple cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20240425195915.GD1031757@ZenIV
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
just lift the call of erofs_pos() into the callers; it will
collapse in most of them, but that's better done caller-by-caller.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20240425195846.GC1031757@ZenIV
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Add Zstandard compression as the 4th supported algorithm since it
becomes more popular now and some end users have asked this for
quite a while [1][2].
Each EROFS physical cluster contains only one valid standard
Zstandard frame as described in [3] so that decompression can be
performed on a per-pcluster basis independently.
Currently, it just leverages multi-call stream decompression APIs with
internal sliding window buffers. One-shot or bufferless decompression
could be implemented later for even better performance if needed.
[1] https://github.com/erofs/erofs-utils/issues/6
[2] https://lore.kernel.org/r/Y08h+z6CZdnS1XBm@B-P7TQMD6M-0146.lan
[3] https://www.rfc-editor.org/rfc/rfc8478.txt
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240508234453.17896-1-xiang@kernel.org
|
|
Only four lcluster types here, remove redundant code.
No real logic changes.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240508123357.3266173-1-hsiangkao@linux.alibaba.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Fix a "BUG: kernel NULL pointer dereference" issue due to
inconsistent on-disk indices of compressed inodes against
per-sb `available_compr_algs` generated by Syzkaller
- Don't use certain unnecessary folio_*() helpers if the folio
type (page cache) is known
* tag 'erofs-for-6.8-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: Don't use certain unnecessary folio_*() functions
erofs: fix inconsistent per-file compression format
|
|
EROFS can select compression algorithms on a per-file basis, and each
per-file compression algorithm needs to be marked in the on-disk
superblock for initialization.
However, syzkaller can generate inconsistent crafted images that use
an unsupported algorithmtype for specific inodes, e.g. use MicroLZMA
algorithmtype even it's not set in `sbi->available_compr_algs`. This
can lead to an unexpected "BUG: kernel NULL pointer dereference" if
the corresponding decompressor isn't built-in.
Fix this by checking against `sbi->available_compr_algs` for each
m_algorithmformat request. Incorrect !erofs_sb_has_compr_cfgs preset
bitmap is now fixed together since it was harmless previously.
Reported-by: <bugreport@ubisectech.com>
Fixes: 8f89926290c4 ("erofs: get compression algorithms directly on mapping")
Fixes: 622ceaddb764 ("erofs: lzma compression support")
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Link: https://lore.kernel.org/r/20240113150602.1471050-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Previously, the block size always equaled to PAGE_SIZE, therefore
`lclusterbits` couldn't be less than 12.
Since sub-page compressed blocks are now considered, `lobits` for
a lcluster in each pack cannot always be `lclusterbits` as before.
Otherwise, there is no enough room for the special value
`Z_EROFS_LI_D0_CBLKCNT`.
To support smaller block sizes, `lobits` for each compacted lcluster is
now calculated as:
lobits = max(lclusterbits, ilog2(Z_EROFS_LI_D0_CBLKCNT) + 1)
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231206091057.87027-4-hsiangkao@linux.alibaba.com
|
|
Add DEFLATE compression as the 3rd supported algorithm.
DEFLATE is a popular generic-purpose compression algorithm for quite
long time (many advanced formats like gzip, zlib, zip, png are all
based on that) as Apple documentation written "If you require
interoperability with non-Apple devices, use COMPRESSION_ZLIB. [1]".
Due to its popularity, there are several hardware on-market DEFLATE
accelerators, such as (s390) DFLTCC, (Intel) IAA/QAT, (HiSilicon) ZIP
accelerator, etc. In addition, there are also several high-performence
IP cores and even open-source FPGA approches available for DEFLATE.
Therefore, it's useful to support DEFLATE compression in order to find
a way to utilize these accelerators for asynchronous I/Os and get
benefits from these later.
Besides, it's a good choice to trade off between compression ratios
and performance compared to LZ4 and LZMA. The DEFLATE core format is
simple as well as easy to understand, therefore the code size of its
decompressor is small even for the bootloader use cases. The runtime
memory consumption is quite limited too (e.g. 32K + ~7K for each zlib
stream). As usual, EROFS ourperforms similar approaches too.
Alternatively, DEFLATE could still be used for some specific files
since EROFS supports multiple compression algorithms in one image.
[1] https://developer.apple.com/documentation/compression/compression_algorithm
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230810154859.118330-1-hsiangkao@linux.alibaba.com
|
|
Several trivial cleanups which aren't quite necessary to split:
- Rename lcluster load functions as well as justify full indexes
since they are typically used for global deduplication for
compressed data;
- Avoid unnecessary lines, comments for simplicity.
No logic changes.
Reviewed-by: Guo Xuenan <guoxuenan@huaweicloud.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230615064421.103178-1-hsiangkao@linux.alibaba.com
|
|
In compact 4B, two adjacent lclusters are packed together as a unit to
form on-disk indexes for effective random access, as below:
(amortized = 4, vcnt = 2)
_____________________________________________
|___@_____ encoded bits __________|_ blkaddr _|
0 . amortized * vcnt = 8
. .
. . amortized * vcnt - 4 = 4
. .
.____________________________.
|_type (2 bits)_|_clusterofs_|
Therefore, encoded bits for each pack are 32 bits (4 bytes). IOWs,
since each lcluster can get 16 bits for its type and clusterofs, the
maximum supported lclustersize for compact 4B format is 16k (14 bits).
Fix this to enable compact 4B format for 16k lclusters (blocks), which
is tested on an arm64 server with 16k page size.
Fixes: 152a333a5895 ("staging: erofs: add compacted compression indexes support")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230601112341.56960-1-hsiangkao@linux.alibaba.com
|
|
Such debug messages are rarely used now. Let's get rid of these,
and revert locally if they are needed for debugging.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230414083027.12307-1-hsiangkao@linux.alibaba.com
|
|
Prior to big pclusters, non-compact compression indexes could have
empty headers.
Let's just avoid the legacy path since it can be handled properly
as a specific compression header with z_erofs_fill_inode_lazy() too.
Tested with erofs-utils exist versions.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230413092241.73829-1-hsiangkao@linux.alibaba.com
|
|
Syzbot generated a crafted image [1] with a non-compact HEAD index of
clusterofs 33024 while valid numbers should be 0 ~ lclustersize-1,
which causes the following unexpected behavior as below:
BUG: unable to handle page fault for address: fffff52101a3fff9
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 23ffed067 P4D 23ffed067 PUD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 4398 Comm: kworker/u5:1 Not tainted 6.3.0-rc6-syzkaller-g09a9639e56c0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/30/2023
Workqueue: erofs_worker z_erofs_decompressqueue_work
RIP: 0010:z_erofs_decompress_queue+0xb7e/0x2b40
...
Call Trace:
<TASK>
z_erofs_decompressqueue_work+0x99/0xe0
process_one_work+0x8f6/0x1170
worker_thread+0xa63/0x1210
kthread+0x270/0x300
ret_from_fork+0x1f/0x30
Note that normal images or images using compact indexes are not
impacted. Let's fix this now.
[1] https://lore.kernel.org/r/000000000000ec75b005ee97fbaa@google.com
Reported-and-tested-by: syzbot+aafb3f37cfeb6534c4ac@syzkaller.appspotmail.com
Fixes: 02827e1796b3 ("staging: erofs: add erofs_map_blocks_iter")
Fixes: 152a333a5895 ("staging: erofs: add compacted compression indexes support")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230410173714.104604-1-hsiangkao@linux.alibaba.com
|
|
- Get rid of all "vle" (variable-length extents) expressions
since they only expand overall name lengths unnecessarily;
- Rename COMPRESSION_LEGACY to COMPRESSED_FULL;
- Move on-disk directory definitions ahead of compression;
- Drop unused extended attribute definitions;
- Move inode ondisk union `i_u` out as `union erofs_inode_i_u`.
No actual logical change.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230331063149.25611-1-hsiangkao@linux.alibaba.com
|
|
As the first step of converting hardcoded blocksize to that specified in
on-disk superblock, convert all call sites of hardcoded blocksize to
sb->s_blocksize except for:
1) use sbi->blkszbits instead of sb->s_blocksize in
erofs_superblock_csum_verify() since sb->s_blocksize has not been
updated with the on-disk blocksize yet when the function is called.
2) use inode->i_blkbits instead of sb->s_blocksize in erofs_bread(),
since the inode operated on may be an anonymous inode in fscache mode.
Currently the anonymous inode is allocated from an anonymous mount
maintained in erofs, while in the near future we may allocate anonymous
inodes from a generic API directly and thus have no access to the
anonymous inode's i_sb. Thus we keep the block size in i_blkbits for
anonymous inodes in fscache mode.
Be noted that this patch only gets rid of the hardcoded blocksize, in
preparation for actually setting the on-disk block size in the following
patch. The hard limit of constraining the block size to PAGE_SIZE still
exists until the next patch.
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230313135309.75269-2-jefflexu@linux.alibaba.com
[ Gao Xiang: fold a patch to fix incorrect truncated offsets. ]
Link: https://lore.kernel.org/r/20230413035734.15457-1-zhujia.zj@bytedance.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
`err` could be -EINTR and it should not be the case. Actually such
DBG_BUGON is useless.
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230309053148.9223-2-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
The code can be neater without forward declarations. Let's
get rid of z_erofs_do_map_blocks() forward declaration.
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230204093040.97967-5-hsiangkao@linux.alibaba.com
|
|
Actually we could pass in inodes directly to clean up all callers.
Also rename iloc() as erofs_iloc().
Link: https://lore.kernel.org/r/20230114150823.432069-1-xiang@kernel.org
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Effective offset to add to length was being incorrectly calculated,
which resulted in iomap->length being set to 0, triggering a WARN_ON
in iomap_iter_done().
Fix that, and describe it in comments.
This was reported as a crash by syzbot under an issue about a warning
encountered in iomap_iter_done(), but unrelated to erofs.
C reproducer: https://syzkaller.appspot.com/text?tag=ReproC&x=1037a6b2880000
Kernel config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=e2021a61197ebe02
Dashboard link: https://syzkaller.appspot.com/bug?extid=a8e049cd3abd342936b6
Reported-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Siddh Raman Pant <code@siddh.me>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20221209102151.311049-1-code@siddh.me
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
syzkaller reported a KASAN use-after-free:
https://syzkaller.appspot.com/bug?extid=2ae90e873e97f1faf6f2
The referenced fuzzed image actually has two issues:
- m_pa == 0 as a non-inlined pcluster;
- The logical length is longer than its physical length.
The first issue has already been addressed. This patch addresses
the second issue by checking the extent length validity.
Reported-by: syzbot+2ae90e873e97f1faf6f2@syzkaller.appspotmail.com
Fixes: 02827e1796b3 ("staging: erofs: add erofs_map_blocks_iter")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221205150050.47784-2-hsiangkao@linux.alibaba.com
|
|
Otherwise, meta buffers could be leaked.
Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes")
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221205150050.47784-1-hsiangkao@linux.alibaba.com
|
|
Convert all mapped erofs_bread() users to use kmap_local_page()
instead of kmap() or kmap_atomic().
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-and-tested-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20221018105313.4940-1-hsiangkao@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Note that we are still accessing 'h_idata_size' and 'h_fragmentoff'
after calling erofs_put_metabuf(), that is not correct. Fix it.
Fixes: ab92184ff8f1 ("erofs: add on-disk compressed tail-packing inline support")
Fixes: b15b2e307c3a ("erofs: support on-disk compressed fragments data")
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20221005013528.62977-1-zbestahu@163.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Some conditional macros and comments are useless.
Link: https://lore.kernel.org/r/20220927063607.54832-1-hsiangkao@linux.alibaba.com
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
The name of this function looks not very accurate compared to it's
implementation and it's only a wrapper to erofs_read_metabuf(). So,
let's fold it directly instead.
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220927032518.25266-1-zbestahu@gmail.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Due to deduplication for compressed data, pclusters can be partially
referenced with their prefixes.
Together with the user-space implementation, it enables EROFS
variable-length global compressed data deduplication with rolling
hash.
Link: https://lore.kernel.org/r/20220923014915.4362-1-hsiangkao@linux.alibaba.com
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|