Age | Commit message (Collapse) | Author |
|
The kernfs_get_inode() returns NULL on error, it never returns error
pointers.
Fixes: aa8188253474 ("kernfs: add exportfs operations")
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
From Boris:
"
This pull request contains the following core changes:
* Fix memory leaks in the core
* Remove unused NAND locking support
* Rename nand.h into rawnand.h (preparing support for spi NANDs)
* Use NAND_MAX_ID_LEN where appropriate
* Fix support for 20nm Hynix chips
* Fix support for Samsung and Hynix SLC NANDs
and the following driver changes:
* Various cleanup, improvements and fixes in the qcom driver
* Fixes for bugs detected by various static code analysis tools
* Fix mxc ooblayout definition
* Add a new part_parsers to tmio and sharpsl platform data in order to
define a custom list of partition parsers
* Request the reset line in exclusive mode in the sunxi driver
* Fix a build error in the orion-nand driver when compiled for ARMv4
* Allow 64-bit mvebu platforms to select the PXA3XX driver
"
|
|
When mounting to older servers, such as Windows XP (or even Windows 7),
the limited error messages that can be passed back to user space can
get confusing since the default dialect has changed from SMB1 (CIFS) to
more secure SMB3 dialect. Log additional information when the user chooses
to use the default dialects and when the server does not support the
dialect requested.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Two cifs bug fixes for stable"
* tag 'cifs-fixes-for-4.13-rc7-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: remove endian related sparse warning
CIFS: Fix maximum SMB2 header size
|
|
Merge mmu_notifier fixes from Jérôme Glisse:
"The invalidate_page callback suffered from 2 pitfalls. First it used
to happen after page table lock was release and thus a new page might
have been setup for the virtual address before the call to
invalidate_page().
This is in a weird way fixed by commit c7ab0d2fdc84 ("mm: convert
try_to_unmap_one() to use page_vma_mapped_walk()") which moved the
callback under the page table lock. Which also broke several existing
user of the mmu_notifier API that assumed they could sleep inside this
callback.
The second pitfall was invalidate_page being the only callback not
taking a range of address in respect to invalidation but was giving an
address and a page. Lot of the callback implementer assumed this could
never be THP and thus failed to invalidate the appropriate range for
THP pages.
By killing this callback we unify the mmu_notifier callback API to
always take a virtual address range as input.
There is now two clear API (I am not mentioning the youngess API which
is seldomly used):
- invalidate_range_start()/end() callback (which allow you to sleep)
- invalidate_range() where you can not sleep but happen right after
page table update under page table lock
Note that a lot of existing user feels broken in respect to
range_start/ range_end. Many user only have range_start() callback but
there is nothing preventing them to undo what was invalidated in their
range_start() callback after it returns but before any CPU page table
update take place.
The code pattern use in kvm or umem odp is an example on how to
properly avoid such race. In a nutshell use some kind of sequence
number and active range invalidation counter to block anything that
might undo what the range_start() callback did.
If you do not care about keeping fully in sync with CPU page table (ie
you can live with CPU page table pointing to new different page for a
given virtual address) then you can take a reference on the pages
inside the range_start callback and drop it in range_end or when your
driver is done with those pages.
Last alternative is to use invalidate_range() if you can do
invalidation without sleeping as invalidate_range() callback happens
under the CPU page table spinlock right after the page table is
updated.
The first two patches convert existing mmu_notifier_invalidate_page()
calls to mmu_notifier_invalidate_range() and bracket those call with
call to mmu_notifier_invalidate_range_start()/end().
The next ten patches remove existing invalidate_page() callback as it
can no longer happen.
Finally the last page remove the invalidate_page() callback completely
so it can RIP.
Changes since v1:
- remove more dead code in kvm (no testing impact)
- more accurate end address computation (patch 2) in page_mkclean_one
and try_to_unmap_one
- added tested-by/reviewed-by gotten so far"
* emailed patches from Jérôme Glisse <jglisse@redhat.com>:
mm/mmu_notifier: kill invalidate_page
KVM: update to new mmu_notifier semantic v2
xen/gntdev: update to new mmu_notifier semantic
sgi-gru: update to new mmu_notifier semantic
misc/mic/scif: update to new mmu_notifier semantic
iommu/intel: update to new mmu_notifier semantic
iommu/amd: update to new mmu_notifier semantic
IB/hfi1: update to new mmu_notifier semantic
IB/umem: update to new mmu_notifier semantic
drm/amdgpu: update to new mmu_notifier semantic
powerpc/powernv: update to new mmu_notifier semantic
mm/rmap: update to new mmu_notifier semantic v2
dax: update to new mmu_notifier semantic
|
|
jfs had previously avoided the use of MAX_LFS_FILESIZE because it hadn't
accounted for the whole 32-bit index range on 32-bit systems. That has
been fixed by commit 0cc3b0ec23ce ("Clarify (and fix) MAX_LFS_FILESIZE
macros"), so we can simplify the code now.
Suggested by Andreas Dilger.
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: jfs-discussion@lists.sourceforge.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range()
and make sure it is bracketed by calls to *_invalidate_range_start()/end().
Note that because we can not presume the pmd value or pte value we have
to assume the worst and unconditionaly report an invalidation as
happening.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ceph_readpage() unlocks page prematurely prematurely in the case
that page is reading from fscache. Caller of readpage expects that
page is uptodate when it get unlocked. So page shoule get locked
by completion callback of fscache_read_or_alloc_pages()
Cc: stable@vger.kernel.org # 4.1+, needs backporting for < 4.7
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
[AV: added missing annotations in syscalls.h/compat.h]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The ->iomap_begin() operation is a hot path, so cache the
fs_dax_get_by_host() result at mount time to avoid the incurring the
hash lookup overhead on a per-i/o basis.
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The ->iomap_begin() operation is a hot path, so cache the
fs_dax_get_by_host() result at mount time to avoid the incurring the
hash lookup overhead on a per-i/o basis.
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
The ->iomap_begin() operation is a hot path, so cache the
fs_dax_get_by_host() result at mount time to avoid the incurring the
hash lookup overhead on a per-i/o basis.
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
Avoid a 32-bit time overflow in recently_deleted() since i_dtime
(inode deletion time) is stored only as a 32-bit value on disk.
Since i_dtime isn't used for much beyond a boolean value in e2fsck
and is otherwise only used in this function in the kernel, there is
no benefit to use more space in the inode for this field on disk.
Instead, compare only the relative deletion time with the low
32 bits of the time using the newly-added time_before32() helper,
which is similar to time_before() and time_after() for jiffies.
Increase RECENTCY_DIRTY to 300s based on Ted's comments about
usage experience at Google.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
|
|
When changing a file's acl mask, __gfs2_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.
If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.
Prevent this by only changing the inode mode after the acl has been set.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
The function __gfs2_xattr_set() will return -ENODATA when called to
remove a xattr that does not exist. The result is that setfacl will
show an exit status of 1 when called to set only a file's mode bits
(on a file with no ACLs), despite succeeding. A "No data available"
error will be printed as well.
To fix this return 0 instead, except when the XATTR_REPLACE flag is
set, in which case -ENODATA is appropriate. This is consistent with
how most other xattr setting functions work, in other filesystems.
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Recent patch had an endian warning ie
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
|
|
Currently the maximum size of SMB2/3 header is set incorrectly which
leads to hanging of directory listing operations on encrypted SMB3
connections. Fix this by setting the maximum size to 170 bytes that
is calculated as RFC1002 length field size (4) + transform header
size (52) + SMB2 header size (64) + create response size (56).
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
|
|
Before this patch if you truncated a file to a smaller size it
wasn't freeing all the blocks properly. There are two reasons.
First, the metapath comparison was not comparing previous heights.
I added a function, mp_eq_to_hgt, which checks the metapath at
all heights prior to the target height.
Second, in function find_nonnull_ptr, it needed to zero out all
pointers for heights following the target height. Translated into
decimal integer terms, this way a number like 299, when incremented,
becomes 300, not 399. The 2 gets incremented to 3, and the following
digits need to be reset.
These two things allow the truncate state machine to properly find
the blocks it needs to delete.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Make this const as it is never modified.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
rhashtable_params are not supposed to change at runtime. All
Functions rhashtable_* working with const rhashtable_params
provided by <linux/rhashtable.h>. So mark the non-const structs
as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
The following cleanup is needed to avoid spilling the syslog with
false warnings.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Sqlite only cares about synchronization of file data instead of other data
unrelated attribute of inode, so in commit flow, call fdatasync is enough.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
We won't wait DIO synchronously when doing AIO, so there will be potential
IO reorder in between AIO and GC, which will cause data corruption.
This patch adds inode_dio_wait to serialize aio and data GC to avoid this
issue.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch fixes to avoid needless wake ups.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If file offset is insane, we have to return error instead of kernel panic.
Reported-by: Eric Zhang <followme999@163.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If file was not opened with atomic write mode, but user uses atomic write
ioctl to fsync datas, in the flow, we should not fsync that file with
atomic write mode.
Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch fixes to clear FI_HOT_DATA correctly in below path:
- error handling in f2fs_ioc_start_atomic_write
- after commit atomic write in f2fs_ioc_commit_atomic_write
- after drop atomic write in drop_inmem_pages
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In f2fs_issue_flush, due to out-of-order execution of CPU, wake_up can
be called before we insert issue_list, result in long latency of
wait_for_completion. Fix this by adding smp_mb() to force the order of
related codes.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
It's time to issue all the discard commands, if user sets the idle time.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Add a callback to rxrpc_kernel_send_data() so that a kernel service can get
a notification that the AF_RXRPC call has transitioned out the Tx phase and
is now waiting for a reply or a final ACK.
This is called from AF_RXRPC with the call state lock held so the
notification is guaranteed to come before any reply is passed back.
Further, modify the AFS filesystem to make use of this so that we don't have
to change the afs_call state before sending the last bit of data.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Commit 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to
compat_{get,put}_bitmap()") changed the calculation on how many bytes
need to be zeroed when userspace handed over a NULL pointer for a fdset
array in the select syscall.
The calculation was changed in compat_get_fd_set() wrongly from
memset(fdset, 0, ((nr + 1) & ~1)*sizeof(compat_ulong_t));
to
memset(fdset, 0, ALIGN(nr, BITS_PER_LONG));
The ALIGN(nr, BITS_PER_LONG) calculates the number of _bits_ which need
to be zeroed in the target fdset array (rounded up to the next full bits
for an unsigned long).
But the memset() call expects the number of _bytes_ to be zeroed.
This leads to clearing more memory than wanted (on the stack area or
even at kmalloc()ed memory areas) and to random kernel crashes as we
have seen them on the parisc platform.
The correct change should have been
memset(fdset, 0, (ALIGN(nr, BITS_PER_LONG) / BITS_PER_LONG) * BYTES_PER_LONG);
which is the same as can be archieved with a call to
zero_fd_set(nr, fdset).
Fixes: 464d62421cb8 ("select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap()"
Acked-by:: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The reference count in kernfs_node structure is treated like a rwsem by
using lockdep instrumentation code. The lockdep name, however, is still
"s_active" which is carried over from the old sysfs code. As s_active
is no longer the variable name, its use may confuse users on where the
lock is when it is reported by lockdep. So it is changed to "kn->count"
which is how this variable is normally referenced in kernfs code.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We want the binder fix in here as well for testing and merge issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Although llist provides proper APIs, they are not used. Make them used.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Although llist provides proper APIs, they are not used. Make them used.
Signed-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Merge misc fixes from Andrew Morton:
"6 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/memblock.c: reversed logic in memblock_discard()
fork: fix incorrect fput of ->exe_file causing use-after-free
mm/madvise.c: fix freeing of locked page with MADV_FREE
dax: fix deadlock due to misaligned PMD faults
mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
PM/hibernate: touch NMI watchdog when creating snapshot
|
|
Pull nfsd fixes from Bruce Fields:
"Two nfsd bugfixes, neither 4.13 regressions, but both potentially
serious"
* tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux:
net: sunrpc: svcsock: fix NULL-pointer exception
nfsd: Limit end of page list when decoding NFSv4 WRITE
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Some bug fixes for stable for cifs"
* tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
cifs: Fix df output for users with quota limits
|
|
This patch cleans up various pieces of GFS2 to avoid sparse errors.
This doesn't fix them all, but it fixes several. The first error,
in function glock_hash_walk was a genuine bug where the rhashtable
could be started and not stopped.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
In DAX there are two separate places where the 2MiB range of a PMD is
defined.
The first is in the page tables, where a PMD mapping inserted for a
given address spans from (vmf->address & PMD_MASK) to ((vmf->address &
PMD_MASK) + PMD_SIZE - 1). That is, from the 2MiB boundary below the
address to the 2MiB boundary above the address.
So, for example, a fault at address 3MiB (0x30 0000) falls within the
PMD that ranges from 2MiB (0x20 0000) to 4MiB (0x40 0000).
The second PMD range is in the mapping->page_tree, where a given file
offset is covered by a radix tree entry that spans from one 2MiB aligned
file offset to another 2MiB aligned file offset.
So, for example, the file offset for 3MiB (pgoff 768) falls within the
PMD range for the order 9 radix tree entry that ranges from 2MiB (pgoff
512) to 4MiB (pgoff 1024).
This system works so long as the addresses and file offsets for a given
mapping both have the same offsets relative to the start of each PMD.
Consider the case where the starting address for a given file isn't 2MiB
aligned - say our faulting address is 3 MiB (0x30 0000), but that
corresponds to the beginning of our file (pgoff 0). Now all the PMDs in
the mapping are misaligned so that the 2MiB range defined in the page
tables never matches up with the 2MiB range defined in the radix tree.
The current code notices this case for DAX faults to storage with the
following test in dax_pmd_insert_mapping():
if (pfn_t_to_pfn(pfn) & PG_PMD_COLOUR)
goto unlock_fallback;
This test makes sure that the pfn we get from the driver is 2MiB
aligned, and relies on the assumption that the 2MiB alignment of the pfn
we get back from the driver matches the 2MiB alignment of the faulting
address.
However, faults to holes were not checked and we could hit the problem
described above.
This was reported in response to the NVML nvml/src/test/pmempool_sync
TEST5:
$ cd nvml/src/test/pmempool_sync
$ make TEST5
You can grab NVML here:
https://github.com/pmem/nvml/
The dmesg warning you see when you hit this error is:
WARNING: CPU: 13 PID: 2900 at fs/dax.c:641 dax_insert_mapping_entry+0x2df/0x310
Where we notice in dax_insert_mapping_entry() that the radix tree entry
we are about to replace doesn't match the locked entry that we had
previously inserted into the tree. This happens because the initial
insertion was done in grab_mapping_entry() using a pgoff calculated from
the faulting address (vmf->address), and the replacement in
dax_pmd_load_hole() => dax_insert_mapping_entry() is done using
vmf->pgoff.
In our failure case those two page offsets (one calculated from
vmf->address, one using vmf->pgoff) point to different order 9 radix
tree entries.
This failure case can result in a deadlock because the radix tree unlock
also happens on the pgoff calculated from vmf->address. This means that
the locked radix tree entry that we swapped in to the tree in
dax_insert_mapping_entry() using vmf->pgoff is never unlocked, so all
future faults to that 2MiB range will block forever.
Fix this by validating that the faulting address's PMD offset matches
the PMD offset from the start of the file. This check is done at the
very beginning of the fault and covers faults that would have mapped to
storage as well as faults to holes. I left the COLOUR check in
dax_pmd_insert_mapping() in place in case we ever hit the insanity
condition where the alignment of the pfn we get from the driver doesn't
match the alignment of the userspace address.
Link: http://lkml.kernel.org/r/20170822222436.18926-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: "Slusarz, Marcin" <marcin.slusarz@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Enlarge sd_fsname to be big enough for the longest long lock table name
and an arbitrary journal number. This silences two -Wformat-truncation
warnings with gcc 7.1.1.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Before this patch, if GFS2 encountered IO errors while writing to
the journal, it would not report the problem, so they would go
unnoticed, sometimes for many hours. Sometimes this would only be
noticed later, when recovery tried to do journal replay and failed
due to invalid metadata at the blocks that resulted in IO errors.
This patch makes GFS2's log daemon check for IO errors. If it
encounters one, it withdraws from the file system and reports
why in dmesg. A similar action is taken when IO errors occur when
writing to the system statfs file.
These errors are also reported back to any callers of fsync, since
that requires the journal to be flushed. Therefore, any IO errors
that would previously go unnoticed are now noticed and the file
system is withdrawn as early as possible, thus preventing further
file system damage.
Also note that this reintroduces superblock variable sd_log_error,
which Christoph removed with commit f729b66fca.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Close an attack vector by moving the arrays of per-server methods to
read-only memory.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Close an attack vector by moving the arrays of encoding and decoding
methods to read-only memory.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
With a few exceptions, most individual encoders don't handle error
cases.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Most encoders do nothing in the error case. But they can still screw
things up in that case: most errors happen very early in rpc processing,
possibly before argument fields are filled in and bounds-tested, so
encoders that do anything other than immediately bail on error can
easily crash in odd error cases.
So just handle errors centrally most of the time to remove the chance of
error.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Run a separate ->op_release function if necessary instead of depending
on the xdr encoder to do this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|
Trivial cleanup, no change in behavior.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|