summaryrefslogtreecommitdiff
path: root/fs/xfs/libxfs/xfs_bmap.c
AgeCommit message (Collapse)Author
2021-02-03xfs: reserve data and rt quota at the same timeDarrick J. Wong
Modify xfs_trans_reserve_quota_nblks so that we can reserve data and realtime blocks from the dquot at the same time. This change has the theoretical side effect that for allocations to realtime files we will reserve from the dquot both the number of rtblocks being allocated and the number of bmbt blocks that might be needed to add the mapping. However, since the mount code disables quota if it finds a realtime device, this should not result in any behavior changes. Now that we've moved the inode creation callers away from using the _nblks function, we can repurpose the (now unused) ninos argument for realtime blocks, so make that change. This also replaces the flags argument with a boolean parameter to force the reservation since we don't need to distinguish between data and rt quota reservations any more, and the only flag being passed in was FORCE_RES. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03xfs: create convenience wrappers for incore quota block reservationsDarrick J. Wong
Create a couple of convenience wrappers for creating and deleting quota block reservations against future changes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-03xfs: clean up quota reservation callsitesDarrick J. Wong
Convert a few xfs_trans_*reserve* callsites that are open-coding other convenience functions. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-02-01xfs: Fix 'set but not used' warning in xfs_bmap_compute_alignments()Chandan Babu R
With both CONFIG_XFS_DEBUG and CONFIG_XFS_WARN disabled, the only reference to local variable "error" in xfs_bmap_compute_alignments() gets eliminated during pre-processing stage of the compilation process. This causes the compiler to generate a "set but not used" warning. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-01-22xfs: Introduce error injection to allocate only minlen size extents for filesChandan Babu R
This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which helps userspace test programs to get xfs_bmap_btalloc() to always allocate minlen sized extents. This is required for test programs which need a guarantee that minlen extents allocated for a file do not get merged with their existing neighbours in the inode's BMBT. "Inode fork extent overflow check" for Directories, Xattrs and extension of realtime inodes need this since the file offset at which the extents are being allocated cannot be explicitly controlled from userspace. One way to use this error tag is to, 1. Consume all of the free space by sequentially writing to a file. 2. Punch alternate blocks of the file. This causes CNTBT to contain sufficient number of one block sized extent records. 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag. After step 3, xfs_bmap_btalloc() will issue space allocation requests for minlen sized extents only. ENOSPC error code is returned to userspace when there aren't any "one block sized" extents left in any of the AGs. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Process allocated extent in a separate functionChandan Babu R
This commit moves over the code in xfs_bmap_btalloc() which is responsible for processing an allocated extent to a new function. Apart from xfs_bmap_btalloc(), the new function will be invoked by another function introduced in a future commit. Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Compute bmap extent alignments in a separate functionChandan Babu R
This commit moves over the code which computes stripe alignment and extent size hint alignment into a separate function. Apart from xfs_bmap_btalloc(), the new function will be used by another function introduced in a future commit. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Remove duplicate assert statement in xfs_bmap_btalloc()Chandan Babu R
The check for verifying if the allocated extent is from an AG whose index is greater than or equal to that of tp->t_firstblock is already done a couple of statements earlier in the same function. Hence this commit removes the redundant assert statement. Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Check for extent overflow when renaming dir entriesChandan Babu R
A rename operation is essentially a directory entry remove operation from the perspective of parent directory (i.e. src_dp) of rename's source. Hence the only place where we check for extent count overflow for src_dp is in xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns -ENOSPC when it detects a possible extent count overflow and in response, the higher layers of directory handling code do the following: 1. Data/Free blocks: XFS lets these blocks linger until a future remove operation removes them. 2. Dabtree blocks: XFS swaps the blocks with the last block in the Leaf space and unmaps the last block. For target_dp, there are two cases depending on whether the destination directory entry exists or not. When destination directory entry does not exist (i.e. target_ip == NULL), extent count overflow check is performed only when transaction has a non-zero sized space reservation associated with it. With a zero-sized space reservation, XFS allows a rename operation to continue only when the directory has sufficient free space in its data/leaf/free space blocks to hold the new entry. When destination directory entry exists (i.e. target_ip != NULL), all we need to do is change the inode number associated with the already existing entry. Hence there is no need to perform an extent count overflow check. Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Check for extent overflow when removing dir entriesChandan Babu R
Directory entry removal must always succeed; Hence XFS does the following during low disk space scenario: 1. Data/Free blocks linger until a future remove operation. 2. Dabtree blocks would be swapped with the last block in the leaf space and then the new last block will be unmapped. This facility is reused during low inode extent count scenario i.e. this commit causes xfs_bmap_del_extent_real() to return -ENOSPC error code so that the above mentioned behaviour is exercised causing no change to the directory's extent count. Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2021-01-22xfs: Check for extent overflow when trivally adding a new extentChandan Babu R
When adding a new data extent (without modifying an inode's existing extents) the extent count increases only by 1. This commit checks for extent count overflow in such cases. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-12-16xfs: remove xfs_buf_t typedefDave Chinner
Prepare for kernel xfs_buf alignment by getting rid of the xfs_buf_t typedef from userspace. [darrick: This patch is a port of a userspace patch removing the xfs_buf_t typedef in preparation to make the userspace xfs_buf code behave more like its kernel counterpart.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-12-09xfs: refactor file range validationDarrick J. Wong
Refactor all the open-coded validation of file block ranges into a single helper, and teach the bmap scrubber to check the ranges. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-12-09xfs: refactor realtime volume extent validationDarrick J. Wong
Refactor all the open-coded validation of realtime device extents into a single helper. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-12-09xfs: refactor data device extent validationDarrick J. Wong
Refactor all the open-coded validation of non-static data device extents into a single helper. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-12-09xfs: detect overflows in bmbt recordsDarrick J. Wong
Detect file block mappings with a blockcount that's either so large that integer overflows occur or are zero, because neither are valid in the filesystem. Worse yet, attempting directory modifications causes the iext code to trip over the bmbt key handling and takes the filesystem down. We can fix most of this by preventing the bad metadata from entering the incore structures in the first place. Found by setting blockcount=0 in a directory data fork mapping and watching the fireworks. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2020-09-23xfs: don't free rt blocks when we're doing a REMAP bunmapi callDarrick J. Wong
When callers pass XFS_BMAPI_REMAP into xfs_bunmapi, they want the extent to be unmapped from the given file fork without the extent being freed. We do this for non-rt files, but we forgot to do this for realtime files. So far this isn't a big deal since nobody makes a bunmapi call to a rt file with the REMAP flag set, but don't leave a logic bomb. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-09-03xfs: fix xfs_bmap_validate_extent_raw when checking attr fork of rt filesDarrick J. Wong
The realtime flag only applies to the data fork, so don't use the realtime block number checks on the attr fork of a realtime file. Fixes: 30b0984d9117 ("xfs: refactor bmap record validation") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2020-07-28xfs: Remove kmem_zone_zalloc() usageCarlos Maiolino
Use kmem_cache_zalloc() directly. With the exception of xlog_ticket_alloc() which will be dealt on the next patch for readability. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-07-28xfs: Remove kmem_zone_alloc() usageCarlos Maiolino
Use kmem_cache_alloc() directly. All kmem_zone_alloc() users pass 0 as flags, which are translated into: GFP_KERNEL | __GFP_NOWARN, and kmem_zone_alloc() loops forever until the allocation succeeds. We can use __GFP_NOFAIL to tell the allocator to loop forever rather than doing it ourself, and because the allocation will never fail, we do not need to use __GFP_NOWARN anymore. Hence, all callers can be converted to use GFP_KERNEL | __GFP_NOFAIL Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: add a comment back in about nofail] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-05-27xfs: force writes to delalloc regions to unwrittenDarrick J. Wong
When writing to a delalloc region in the data fork, commit the new allocations (of the da reservation) as unwritten so that the mappings are only marked written once writeback completes successfully. This fixes the problem of stale data exposure if the system goes down during targeted writeback of a specific region of a file, as tested by generic/042. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-05-19xfs: move the fork format fields into struct xfs_iforkChristoph Hellwig
Both the data and attr fork have a format that is stored in the legacy idinode. Move it into the xfs_ifork structure instead, where it uses up padding. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-19xfs: move the per-fork nextents fields into struct xfs_iforkChristoph Hellwig
There are there are three extents counters per inode, one for each of the forks. Two are in the legacy icdinode and one is directly in struct xfs_inode. Switch to a single counter in the xfs_ifork structure where it uses up padding at the end of the structure. This simplifies various bits of code that just wants the number of extents counter and can now directly dereference it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-19xfs: remove the NULL fork handling in xfs_bmapi_readChristoph Hellwig
Now that we fully verify the inode forks before they are added to the inode cache, the crash reported in https://bugzilla.kernel.org/show_bug.cgi?id=204031 can't happen anymore, as we'll never let an inode that has inconsistent nextents counts vs the presence of an in-core attr fork leak into the inactivate code path. So remove the work around to try to handle the case, and just return an error and warn if the fork is not present. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-05-19xfs: remove the special COW fork handling in xfs_bmapi_readChristoph Hellwig
We don't call xfs_bmapi_read for the COW fork anymore, so remove the special casing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-19xfs: only check the superblock version for dinode size calculationChristoph Hellwig
The size of the dinode structure is only dependent on the file system version, so instead of checking the individual inode version just use the newly added xfs_sb_version_has_large_dinode helper, and simplify various calling conventions. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Chandan Rajendra <chandanrlinux@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-03-13xfs: rename btree cursor private btree member flagsDave Chinner
BPRV is not longer appropriate because bc_private is going away. Script: $ sed -i 's/BTCUR_BPRV/BTCUR_BMBT/g' fs/xfs/*[ch] fs/xfs/*/*[ch] With manual cleanup to the definitions in fs/xfs/libxfs/xfs_btree.h Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: change "BC_BT" to "BTCUR_BMBT", fix subject line typo] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-03-13xfs: convert btree cursor inode-private member namesDave Chinner
bc_private.b -> bc_ino conversion via script: $ sed -i 's/bc_private\.b/bc_ino/g' fs/xfs/*[ch] fs/xfs/*/*[ch] And then revert the change to the bc_ino #define in fs/xfs/libxfs/xfs_btree.h manually. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> [darrick: tweak the subject line slightly] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2020-03-02xfs: open code insert range extent split helperBrian Foster
The insert range operation currently splits the extent at the target offset in a separate transaction and lock cycle from the one that shifts extents. In preparation for reworking insert range into an atomic operation, lift the code into the caller so it can be easily condensed to a single rolling transaction and lock cycle and eliminate the helper. No functional changes. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-01-26xfs: make xfs_*read_agf return EAGAIN to ALLOC_FLAG_TRYLOCK callersDarrick J. Wong
Refactor xfs_read_agf and xfs_alloc_read_agf to return EAGAIN if the caller passed TRYLOCK and we weren't able to get the lock; and change the callers to recognize this. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2020-01-26xfs: remove the xfs_btree_get_buf[ls] functionsDarrick J. Wong
Remove the xfs_btree_get_bufs and xfs_btree_get_bufl functions, since they're pretty trivial oneliners. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2019-12-19libxfs: resync with the userspace libxfsDarrick J. Wong
Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-12-11xfs: stabilize insert range start boundary to avoid COW writeback raceBrian Foster
generic/522 (fsx) occasionally fails with a file corruption due to an insert range operation. The primary characteristic of the corruption is a misplaced insert range operation that differs from the requested target offset. The reason for this behavior is a race between the extent shift sequence of an insert range and a COW writeback completion that causes a front merge with the first extent in the shift. The shift preparation function flushes and unmaps from the target offset of the operation to the end of the file to ensure no modifications can be made and page cache is invalidated before file data is shifted. An insert range operation then splits the extent at the target offset, if necessary, and begins to shift the start offset of each extent starting from the end of the file to the start offset. The shift sequence operates at extent level and so depends on the preparation sequence to guarantee no changes can be made to the target range during the shift. If the block immediately prior to the target offset was dirty and shared, however, it can undergo writeback and move from the COW fork to the data fork at any point during the shift. If the block is contiguous with the block at the start offset of the insert range, it can front merge and alter the start offset of the extent. Once the shift sequence reaches the target offset, it shifts based on the latest start offset and silently changes the target offset of the operation and corrupts the file. To address this problem, update the shift preparation code to stabilize the start boundary along with the full range of the insert. Also update the existing corruption check to fail if any extent is shifted with a start offset behind the target offset of the insert range. This prevents insert from racing with COW writeback completion and fails loudly in the event of an unexpected extent shift. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: don't check for AG deadlock for realtime files in bunmapiOmar Sandoval
Commit 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") added a check in __xfs_bunmapi() to stop early if we would touch multiple AGs in the wrong order. However, this check isn't applicable for realtime files. In most cases, it just makes us do unnecessary commits. However, without the fix from the previous commit ("xfs: fix realtime file data space leak"), if the last and second-to-last extents also happen to have different "AG numbers", then the break actually causes __xfs_bunmapi() to return without making any progress, which sends xfs_itruncate_extents_flags() into an infinite loop. Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-02xfs: fix realtime file data space leakOmar Sandoval
Realtime files in XFS allocate extents in rextsize units. However, the written/unwritten state of those extents is still tracked in blocksize units. Therefore, a realtime file can be split up into written and unwritten extents that are not necessarily aligned to the realtime extent size. __xfs_bunmapi() has some logic to handle these various corner cases. Consider how it handles the following case: 1. The last extent is unwritten. 2. The last extent is smaller than the realtime extent size. 3. startblock of the last extent is not aligned to the realtime extent size, but startblock + blockcount is. In this case, __xfs_bunmapi() calls xfs_bmap_add_extent_unwritten_real() to set the second-to-last extent to unwritten. This should merge the last and second-to-last extents, so __xfs_bunmapi() moves on to the second-to-last extent. However, if the size of the last and second-to-last extents combined is greater than MAXEXTLEN, xfs_bmap_add_extent_unwritten_real() does not merge the two extents. When that happens, __xfs_bunmapi() skips past the last extent without unmapping it, thus leaking the space. Fix it by only unwriting the minimum amount needed to align the last extent to the realtime extent size, which is guaranteed to merge with the last extent. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-13xfs: convert open coded corruption check to use XFS_IS_CORRUPTDarrick J. Wong
Convert the last of the open coded corruption check and report idioms to use the XFS_IS_CORRUPT macro. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-12xfs: kill the XFS_WANT_CORRUPT_* macrosDarrick J. Wong
The XFS_WANT_CORRUPT_* macros conceal subtle side effects such as the creation of local variables and redirections of the code flow. This is pretty ugly, so replace them with explicit XFS_IS_CORRUPT tests that remove both of those ugly points. The change was performed with the following coccinelle script: @@ expression mp, test; identifier label; @@ - XFS_WANT_CORRUPTED_GOTO(mp, test, label); + if (XFS_IS_CORRUPT(mp, !test)) { error = -EFSCORRUPTED; goto label; } @@ expression mp, test; @@ - XFS_WANT_CORRUPTED_RETURN(mp, test); + if (XFS_IS_CORRUPT(mp, !test)) return -EFSCORRUPTED; @@ expression mp, lval, rval; @@ - XFS_IS_CORRUPT(mp, !(lval == rval)) + XFS_IS_CORRUPT(mp, lval != rval) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 && e2)) + XFS_IS_CORRUPT(mp, !e1 || !e2) @@ expression e1, e2; @@ - !(e1 == e2) + e1 != e2 @@ expression e1, e2, e3, e4, e5, e6; @@ - !(e1 == e2 && e3 == e4) || e5 != e6 + e1 != e2 || e3 != e4 || e5 != e6 @@ expression e1, e2, e3, e4, e5, e6; @@ - !(e1 == e2 || (e3 <= e4 && e5 <= e6)) + e1 != e2 && (e3 > e4 || e5 > e6) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 <= e2)) + XFS_IS_CORRUPT(mp, e1 > e2) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 < e2)) + XFS_IS_CORRUPT(mp, e1 >= e2) @@ expression mp, e1; @@ - XFS_IS_CORRUPT(mp, !!e1) + XFS_IS_CORRUPT(mp, e1) @@ expression mp, e1, e2; @@ - XFS_IS_CORRUPT(mp, !(e1 || e2)) + XFS_IS_CORRUPT(mp, !e1 && !e2) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 == e4)) + XFS_IS_CORRUPT(mp, e1 != e2 && e3 != e4) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 <= e2) || !(e3 >= e4)) + XFS_IS_CORRUPT(mp, e1 > e2 || e3 < e4) @@ expression mp, e1, e2, e3, e4; @@ - XFS_IS_CORRUPT(mp, !(e1 == e2) && !(e3 <= e4)) + XFS_IS_CORRUPT(mp, e1 != e2 && e3 > e4) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-10xfs: refactor "does this fork map blocks" predicateDarrick J. Wong
Replace the open-coded checks for whether or not an inode fork maps blocks with a macro that will implant the code for us. This helps us declutter the bmap code a bit. Note that I had to use a macro instead of a static inline function because of C header dependency problems between xfs_inode.h and xfs_inode_fork.h. Conversion was performed with the following Coccinelle script: @@ expression ip, w; @@ - XFS_IFORK_FORMAT(ip, w) == XFS_DINODE_FMT_EXTENTS || XFS_IFORK_FORMAT(ip, w) == XFS_DINODE_FMT_BTREE + xfs_ifork_has_extents(ip, w) @@ expression ip, w; @@ - XFS_IFORK_FORMAT(ip, w) != XFS_DINODE_FMT_EXTENTS && XFS_IFORK_FORMAT(ip, w) != XFS_DINODE_FMT_BTREE + !xfs_ifork_has_extents(ip, w) @@ expression ip, w; @@ - XFS_IFORK_FORMAT(ip, w) == XFS_DINODE_FMT_BTREE || XFS_IFORK_FORMAT(ip, w) == XFS_DINODE_FMT_EXTENTS + xfs_ifork_has_extents(ip, w) @@ expression ip, w; @@ - XFS_IFORK_FORMAT(ip, w) != XFS_DINODE_FMT_BTREE && XFS_IFORK_FORMAT(ip, w) != XFS_DINODE_FMT_EXTENTS + !xfs_ifork_has_extents(ip, w) @@ expression ip, w; @@ - (xfs_ifork_has_extents(ip, w)) + xfs_ifork_has_extents(ip, w) @@ expression ip, w; @@ - (!xfs_ifork_has_extents(ip, w)) + !xfs_ifork_has_extents(ip, w) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-07xfs: null out bma->prev if no previous extentDarrick J. Wong
Coverity complains that we don't check the return value of xfs_iext_peek_prev_extent like we do nearly all of the time. If there is no previous extent then just null out bma->prev like we do elsewhere in the bmap code. Coverity-id: 1424057 Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-04xfs: always log corruption errorsDarrick J. Wong
Make sure we log something to dmesg whenever we return -EFSCORRUPTED up the call stack. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-11-03xfs: cleanup use of the XFS_ALLOC_ flagsChristoph Hellwig
Always set XFS_ALLOC_USERDATA for data fork allocations, and check it in xfs_alloc_is_userdata instead of the current obsfucated check. Also remove the xfs_alloc_is_userdata and xfs_alloc_allow_busy_reuse helpers to make the code a little easier to understand. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-03xfs: move extent zeroing to xfs_bmapi_allocateChristoph Hellwig
Move the extent zeroing case there for the XFS_BMAPI_ZERO flag outside the low-level allocator and into xfs_bmapi_allocate, where is still is in transaction context, but outside the very lowlevel code where it doesn't belong. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-11-03xfs: refactor xfs_bmapi_allocateChristoph Hellwig
Avoid duplicate userdata and data fork checks by restructuring the code so we only have a helper for userdata allocations that combines these checks in a straight foward way. That also helps to obsoletes the comments explaining what the code does as it is now clearly obvious. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-29xfs: refactor xfs_iread_extents to use xfs_btree_visit_blocksDarrick J. Wong
xfs_iread_extents open-codes everything in xfs_btree_visit_blocks, so refactor the btree helper to be able to iterate only the records on level 0, then port iread_extents to use it. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-10-29xfs: replace -EIO with -EFSCORRUPTED for corrupt metadataDarrick J. Wong
There are a few places where we return -EIO instead of -EFSCORRUPTED when we find corrupt metadata. Fix those places. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-10-23xfs: don't set bmapi total block req where minleft isBrian Foster
xfs_bmapi_write() takes a total block requirement parameter that is passed down to the block allocation code and is used to specify the total block requirement of the associated transaction. This is used to try and select an AG that can not only satisfy the requested extent allocation, but can also accommodate subsequent allocations that might be required to complete the transaction. For example, additional bmbt block allocations may be required on insertion of the resulting extent to an inode data fork. While it's important for callers to calculate and reserve such extra blocks in the transaction, it is not necessary to pass the total value to xfs_bmapi_write() in all cases. The latter automatically sets minleft to ensure that sufficient free blocks remain after the allocation attempt to expand the format of the associated inode (i.e., such as extent to btree conversion, btree splits, etc). Therefore, any callers that pass a total block requirement of the bmap mapping length plus worst case bmbt expansion essentially specify the additional reservation requirement twice. These callers can pass a total of zero to rely on the bmapi minleft policy. Beyond being superfluous, the primary motivation for this change is that the total reservation logic in the bmbt code is dubious in scenarios where minlen < maxlen and a maxlen extent cannot be allocated (which is more common for data extent allocations where contiguity is not required). The total value is based on maxlen in the xfs_bmapi_write() caller. If the bmbt code falls back to an allocation between minlen and maxlen, that allocation will not succeed until total is reset to minlen, which essentially throws away any additional reservation included in total by the caller. In addition, the total value is not reset until after alignment is dropped, which means that such callers drop alignment far too aggressively than necessary. Update all callers of xfs_bmapi_write() that pass a total block value of the mapping length plus bmbt reservation to instead pass zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation requirement. This trades off slightly less conservative AG selection for the ability to preserve alignment in more scenarios. xfs_bmapi_write() callers that incorporate unrelated or additional reservations in total beyond what is already included in minleft must continue to use the former. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-23xfs: cap longest free extent to maximum allocatableDave Chinner
Cap longest extent to the largest we can allocate based on limits calculated at mount time. Dynamic state (such as finobt blocks) can result in the longest free extent exceeding the size we can allocate, and that results in failure to align full AG allocations when the AG is empty. Result: xfs_io-4413 [003] 426.412459: xfs_alloc_vextent_loopfailed: dev 8:96 agno 0 agbno 32 minlen 243968 maxlen 244000 mod 0 prod 1 minleft 1 total 262148 alignment 32 minalignslop 0 len 0 type NEAR_BNO otype START_BNO wasdel 0 wasfromfl 0 resv 0 datatype 0x5 firstblock 0xffffffffffffffff minlen and maxlen are now separated by the alignment size, and allocation fails because args.total > free space in the AG. [bfoster: Added xfs_bmap_btalloc() changes.] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-21xfs: use a struct iomap in xfs_writepage_ctxChristoph Hellwig
In preparation for moving the XFS writeback code to fs/iomap.c, switch it to use struct iomap instead of the XFS-specific struct xfs_bmbt_irec. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-10-09xfs: move local to extent inode logging into bmap helperBrian Foster
The callers of xfs_bmap_local_to_extents_empty() log the inode external to the function, yet this function is where the on-disk format value is updated. Push the inode logging down into the function itself to help prevent future mistakes. Note that internal bmap callers track the inode logging flags independently and thus may log the inode core twice due to this change. This is harmless, so leave this code around for consistency with the other attr fork conversion functions. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-09-23xfs: revert 1baa2800e62d ("xfs: remove the unused XFS_ALLOC_USERDATA flag")Darrick J. Wong
Revert this commit, as it caused periodic regressions in xfs/173 w/ 1k blocks. [1] https://lore.kernel.org/lkml/20190919014602.GN15734@shao2-debian/ Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>