summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-11-10Merge branch 'xfs-4.10-libxfs-cleanups' into for-nextDave Chinner
2016-11-10Merge branch 'dax-4.10-iomap-pmd' into for-nextDave Chinner
2016-11-10dax: Introduce IOMAP_FAULT flagJan Kara
Introduce a flag telling iomap operations whether they are handling a fault or other IO. That may influence behavior wrt inode size and similar things. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-09xfs: check minimum block size for CRC filesystemsDarrick J. Wong
Check the minimum block size on v5 filesystems. [dchinner: cleaned up XFS_MIN_CRC_BLOCKSIZE check] Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08xfs: check return value of _trans_reserve_quota_nblksDarrick J. Wong
Check the return value of xfs_trans_reserve_quota_nblks for errors. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08xfs: move dir_ino_validate declaration per xfsprogsDarrick J. Wong
Move the declaration of _dir_ino_validate out of the private dir2 header file into the public one, since xfsprogs did that for the benefit of xfs_repair. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08xfs: don't call xfs_sb_quota_from_disk twiceEric Sandeen
Source xfsprogs commit: ee3754254e8c186c99b6cdd4d59f741759d04acb Kernel commit 5ef828c4 ("xfs: avoid false quotacheck after unclean shutdown") made xfs_sb_from_disk() also call xfs_sb_quota_from_disk by default. However, when this was merged to libxfs, existing separate calls to libxfs_sb_quota_from_disk remained, and calling it twice in a row on a V4 superblock leads to issues, because: if (sbp->sb_qflags & XFS_PQUOTA_ACCT) { ... sbp->sb_pquotino = sbp->sb_gquotino; sbp->sb_gquotino = NULLFSINO; and after the second call, we have set both pquotino and gquotino to NULLFSINO. Fix this by making it safe to call twice, and also remove the extra calls to libxfs_sb_quota_from_disk. This is only spotted when running xfstests with "-m crc=0" because the sb_from_disk change came about after V5 became default, and the above behavior only exists on a V4 superblock. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08libxfs: clean up _dir2_data_freescanDarrick J. Wong
Refactor the implementations of xfs_dir2_data_freescan into a routine that takes the raw directory block parameters and a second function that figures out the raw parameters from the directory inode. This enables us to use the exact same code for both userspace and the kernel, since repair knows exactly which directory block geometry parameters it needs. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08libxfs: fix xfs_attr_shortform_bytesfit declarationDarrick J. Wong
Change the xfs_attr_shortform_bytesfit declaration to have struct xfs_inode to avoid tripping up the libxfs-diff scanner. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08libxfs: fix whitespace problemsDarrick J. Wong
Fix some whitespace problems that trip up my libxfs-diff script. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08libxfs: synchronize dinode_verify with userspaceDarrick J. Wong
The userspace version of _dinode_verify takes a raw inode number instead of an inode itself. Since neither version actually needs the inode, port the changes to the kernel. This will also reduce the libxfs diff noise. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08libxfs: convert ushort to unsigned shortDarrick J. Wong
Since xfsprogs dropped ushort in favor of unsigned short, do that here too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: remove "depends on BROKEN" from FS_DAX_PMDRoss Zwisler
Now that DAX PMD faults are once again working and are now participating in DAX's radix tree locking scheme, allow their config option to be enabled. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08xfs: use struct iomap based DAX PMD fault pathRoss Zwisler
Switch xfs_filemap_pmd_fault() from using dax_pmd_fault() to the new and improved dax_iomap_pmd_fault(). Also, now that it has no more users, remove xfs_get_blocks_dax_fault(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: add struct iomap based DAX PMD supportRoss Zwisler
DAX PMDs have been disabled since Jan Kara introduced DAX radix tree based locking. This patch allows DAX PMDs to participate in the DAX radix tree based locking scheme so that they can be re-enabled using the new struct iomap based fault handlers. There are currently three types of DAX 4k entries: 4k zero pages, 4k DAX mappings that have an associated block allocation, and 4k DAX empty entries. The empty entries exist to provide locking for the duration of a given page fault. This patch adds three equivalent 2MiB DAX entries: Huge Zero Page (HZP) entries, PMD DAX entries that have associated block allocations, and 2 MiB DAX empty entries. Unlike the 4k case where we insert a struct page* into the radix tree for 4k zero pages, for HZP we insert a DAX exceptional entry with the new RADIX_DAX_HZP flag set. This is because we use a single 2 MiB zero page in every 2MiB hole mapping, and it doesn't make sense to have that same struct page* with multiple entries in multiple trees. This would cause contention on the single page lock for the one Huge Zero Page, and it would break the page->index and page->mapping associations that are assumed to be valid in many other places in the kernel. One difficult use case is when one thread is trying to use 4k entries in radix tree for a given offset, and another thread is using 2 MiB entries for that same offset. The current code handles this by making the 2 MiB user fall back to 4k entries for most cases. This was done because it is the simplest solution, and because the use of 2MiB pages is already opportunistic. If we were to try to upgrade from 4k pages to 2MiB pages for a given range, we run into the problem of how we lock out 4k page faults for the entire 2MiB range while we clean out the radix tree so we can insert the 2MiB entry. We can solve this problem if we need to, but I think that the cases where both 2MiB entries and 4K entries are being used for the same range will be rare enough and the gain small enough that it probably won't be worth the complexity. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: move put_(un)locked_mapping_entry() in dax.cRoss Zwisler
No functional change. The static functions put_locked_mapping_entry() and put_unlocked_mapping_entry() will soon be used in error cases in grab_mapping_entry(), so move their definitions above this function. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: move RADIX_DAX_* defines to dax.hRoss Zwisler
The RADIX_DAX_* defines currently mostly live in fs/dax.c, with just RADIX_DAX_ENTRY_LOCK being in include/linux/dax.h so it can be used in mm/filemap.c. When we add PMD support, though, mm/filemap.c will also need access to the RADIX_DAX_PTE type so it can properly construct a 4k sized empty entry. Instead of shifting the defines between dax.c and dax.h as they are individually used in other code, just move them wholesale to dax.h so they'll be available when we need them. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: dax_iomap_fault() needs to call iomap_end()Ross Zwisler
Currently iomap_end() doesn't do anything for DAX page faults for both ext2 and XFS. ext2_iomap_end() just checks for a write underrun, and xfs_file_iomap_end() checks to see if it needs to finish a delayed allocation. However, in the future iomap_end() calls might be needed to make sure we have balanced allocations, locks, etc. So, add calls to iomap_end() with appropriate error handling to dax_iomap_fault(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: add dax_iomap_sector() helper functionRoss Zwisler
To be able to correctly calculate the sector from a file position and a struct iomap there is a complex little bit of logic that currently happens in both dax_iomap_actor() and dax_iomap_fault(). This will need to be repeated yet again in the DAX PMD fault handler when it is added, so break it out into a helper function. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: correct dax iomap code namespaceRoss Zwisler
The recently added DAX functions that use the new struct iomap data structure were named iomap_dax_rw(), iomap_dax_fault() and iomap_dax_actor(). These are actually defined in fs/dax.c, though, so should be part of the "dax" namespace and not the "iomap" namespace. Rename them to dax_iomap_rw(), dax_iomap_fault() and dax_iomap_actor() respectively. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: remove dax_pmd_fault()Ross Zwisler
dax_pmd_fault() is the old struct buffer_head + get_block_t based 2 MiB DAX fault handler. This fault handler has been disabled for several kernel releases, and support for PMDs will be reintroduced using the struct iomap interface instead. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: coordinate locking for offsets in PMD rangeRoss Zwisler
DAX radix tree locking currently locks entries based on the unique combination of the 'mapping' pointer and the pgoff_t 'index' for the entry. This works for PTEs, but as we move to PMDs we will need to have all the offsets within the range covered by the PMD to map to the same bit lock. To accomplish this, for ranges covered by a PMD entry we will instead lock based on the page offset of the beginning of the PMD entry. The 'mapping' pointer is still used in the same way. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: consistent variable naming for DAX entriesRoss Zwisler
No functional change. Consistently use the variable name 'entry' instead of 'ret' for DAX radix tree entries. This was already happening in most of the code, so update get_unlocked_mapping_entry(), grab_mapping_entry() and dax_unlock_mapping_entry(). Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: remove the last BUG_ON() from fs/dax.cRoss Zwisler
Don't take down the kernel if we get an invalid 'from' and 'length' argument pair. Just warn once and return an error. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: make 'wait_table' global variable staticRoss Zwisler
The global 'wait_table' variable is only used within fs/dax.c, and generates the following sparse warning: fs/dax.c:39:19: warning: symbol 'wait_table' was not declared. Should it be static? Make it static so it has scope local to fs/dax.c, and to make sparse happy. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08ext2: remove support for DAX PMD faultsRoss Zwisler
DAX PMD support was added via the following commit: commit e7b1ea2ad658 ("ext2: huge page fault support") I believe this path to be untested as ext2 doesn't reliably provide block allocations that are aligned to 2MiB. In my testing I've been unable to get ext2 to actually fault in a PMD. It always fails with a "pfn unaligned" message because the sector returned by ext2_get_block() isn't aligned. I've tried various settings for the "stride" and "stripe_width" extended options to mkfs.ext2, without any luck. Since we can't reliably get PMDs, remove support so that we don't have an untested code path that we may someday traverse when we happen to get an aligned block allocation. This should also make 4k DAX faults in ext2 a bit faster since they will no longer have to call the PMD fault handler only to get a response of VM_FAULT_FALLBACK. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08dax: remove buffer_size_valid()Ross Zwisler
Now that ext4 properly sets bh.b_size when we call get_block() for a hole, rely on that value and remove the buffer_size_valid() sanity check. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-11-08ext4: tell DAX the size of allocation holesRoss Zwisler
When DAX calls _ext4_get_block() and the file offset points to a hole we currently don't set bh->b_size. This is current worked around via buffer_size_valid() in fs/dax.c. _ext4_get_block() has the hole size information from ext4_map_blocks(), so populate bh->b_size so we can remove buffer_size_valid() in a later patch. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-24xfs: defer should abort intent items if the trans roll failsDarrick J. Wong
If the deferred ops transaction roll fails, we need to abort the intent items if we haven't already logged a done item for it, regardless of whether or not the deferred ops has had a transaction committed. Dave found this while running generic/388. Move the tracepoint to make it easier to track object lifetimes. Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-24xfs: clear cowblocks tag when cow fork is emptiedBrian Foster
The background cowblocks scan job takes care of scanning for inodes with potentially lingering blocks in the cow fork and clearing them out. If the background scanner reclaims the cow fork blocks, however, it doesn't immediately clear the cowblocks tag from the inode. Instead, the inode remains tagged until the background scanner comes around again, discovers the inode cow fork has no blocks, clears the tag and fires the trace_xfs_inode_free_cowblocks_invalid() tracepoint to indicate that the inode may have been incorrectly tagged. This is not a major functional problem as the tag is ultimately cleared. Nonetheless, clear the tag when an inode cow fork is explicitly emptied to avoid the extra round trip through the background scanner and spurious "invalid" tracepoint. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-24xfs: fix up inode cowblocks tracking tracepointsBrian Foster
These calls are still using the eofblocks tracepoints. The cowblocks equivalents are already defined, we just aren't actually calling them. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-24fs: Do to trim high file position bits in iomap_page_mkwrite_actorJan Kara
iomap_page_mkwrite_actor() calls __block_write_begin_int() with position masked as pos & ~PAGE_MASK which is equivalent to pos & (PAGE_SIZE-1). Thus it masks off high bits of file position. However __block_write_begin_int() expects full file position on input. This does not cause any visible issues because all __block_write_begin_int() really cares about are low file position bits but still it is a bug waiting to happen. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: remove xfs_bunmapi_cowChristoph Hellwig
Since no one uses it anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: optimize xfs_reflink_end_cowChristoph Hellwig
Instead of doing a full extent list search for each extent that is to be deleted using xfs_bmapi_read and then doing another one inside of xfs_bunmapi_cow use the same scheme that xfs_bumapi uses: look up the last extent to be deleted and then use the extent index to walk downward until we are outside the range to be deleted. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: optimize xfs_reflink_cancel_cow_blocksChristoph Hellwig
Rewrite xfs_reflink_cancel_cow_blocks so that we only do a search for the first extent in the extent list and then iterate over the remaining extents using the extent index, passing the extent we operate on directly to xfs_bmap_del_extent_delay or xfs_bmap_del_extent_cow instead of going through xfs_bunmapi and doing yet another extent list lookup. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: refactor xfs_bunmapi_cowChristoph Hellwig
Split out two helpers for deleting delayed or real extents from the COW fork. This allows to call them directly from xfs_reflink_cow_end_io once that function is refactored to iterate the extent tree. It will also allow to reuse the delalloc deletion from xfs_bunmapi in the future. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: optimize writes to reflink filesChristoph Hellwig
Instead of reserving space as the first thing in write_begin move it past reading the extent in the data fork. That way we only have to read from the data fork once and can reuse that information for trimming the extent to the shared/unshared boundary. Additionally this allows to easily limit the actual write size to said boundary, and avoid a roundtrip on the ilock. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: don't bother looking at the refcount tree for readsChristoph Hellwig
There is no need to trim an extent into a shared or non-shared one, or report any flags for plain old reads. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: handle "raw" delayed extents xfs_reflink_trim_around_sharedChristoph Hellwig
Delalloc extents in the extent list contain the number of reserved indirect blocks in their startblock value and don't use the magic DELAYSTARTBLOCK constant. Ensure that xfs_reflink_trim_around_shared handles them properly by checking for isnullstartblock(). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: add xfs_trim_extentDarrick J. Wong
This helpers allows to trim an extent to a subset of it's original range while making sure the block numbers in it remain valid, In the future xfs_trim_extent and xfs_bmapi_trim_map should probably be merged in some form. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: split from a previous patch from Darrick, moved around and added support for "raw" delayed extents"] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20iomap: add IOMAP_REPORTChristoph Hellwig
This allows the file system to tell a FIEMAP from a read operation, and thus avoids the need to report flags that aren't actually used in the read path. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: merge xfs_reflink_remap_range and xfs_file_share_rangeChristoph Hellwig
There is no clear division of responsibility between those functions, so just merge them into one to keep the code simple. Also move xfs_file_wait_for_io to xfs_reflink.c together with its only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: remove xfs_file_wait_for_ioChristoph Hellwig
filemap_write_and_wait_range operates on full pages, so there is no need for the rounding operations. Additionally this allows us to micro-optimize by skipping the second inode_dio_wait for a intra-file clone. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_rangeChristoph Hellwig
We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE values, as well as preventing new buffered writers re-dirtying the file data that we just wrote out. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: fix the same_inode check in xfs_file_share_rangeChristoph Hellwig
The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit wide, so checking i_ino for equality could lead to rate false positives on 32-bit architectures. Just compare the inode pointers themselves to be safe. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: remove the same fs check from xfs_file_share_rangeChristoph Hellwig
The VFS already does the check, and the placement of this duplicate is in the way of the following locking rework. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20libxfs: v3 inodes are only valid on crc-enabled filesystemsRoger Willcocks
xfs_repair was not detecting that version 3 inodes are invalid for for non-CRC filesystems. The result is specific inode corruptions go undetected and hence aren't repaired if only the version number is out of range. The core of the problem is that the XFS_DINODE_GOOD_VERSION() macro doesn't know that valid inode versions are dependent on a superblock version number. Fix this in libxfs, and propagate the new function out into the rest of xfsprogs to fix the issue. [Darrick: port to kernel from xfsprogs] Reported-by: Leslie Rhorer <lrhorer@mygrande.net> Signed-off-by: Roger Willcocks <roger@filmlight.ltd.uk> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20libxfs: clean up _calc_dquots_per_chunkDarrick J. Wong
The function xfs_calc_dquots_per_chunk takes a parameter in units of basic blocks. The kernel seems to get the units wrong, but userspace got 'fixed' by commenting out the unnecessary conversion. Fix both. cc: <stable@vger.kernel.org> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: unset MS_ACTIVE if mount failsDarrick J. Wong
As part of the inode block map intent log item recovery process, we had to set the IRECOVERY flag to prevent an unlinked inode from being truncated during the first iput call. This required us to set MS_ACTIVE so that iput puts the inode on the lru instead of immediately evicting the inode. Unfortunately, if the mount fails later on, the inodes that have been loaded (root dir and realtime) actually need to be evicted since we're aborting the mount. If we don't clear MS_ACTIVE in the failure step, those inodes are not evicted and therefore leak. The leak was found by running xfs/130 and rmmoding xfs immediately after the test. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-10-20xfs: remove pointless error goto in xfs_bmap_remap_allocEric Sandeen
The commit: f65306ea xfs: map an inode's offset to an exact physical block added a pointless error0: target; remove it. Addresses-Coverity-Id: 1373865 Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Dave Chinner <david@fromorbit.com>