Age | Commit message (Collapse) | Author |
|
Christoph Hellwig says:
====================
get rid of the address_space override in setsockopt v2
setsockopt is the last place in architecture-independ code that still
uses set_fs to force the uaccess routines to operate on kernel pointers.
This series adds a new sockptr_t type that can contained either a kernel
or user pointer, and which has accessors that do the right thing, and
then uses it for setsockopt, starting by refactoring some low-level
helpers and moving them over to it before finally doing the main
setsockopt method.
Note that apparently the eBPF selftests do not even cover this path, so
the series has been tested with a testing patch that always copies the
data first and passes a kernel pointer. This is something that works for
most common sockopts (and is something that the ePBF support relies on),
but unfortunately in various corner cases we either don't use the passed
in length, or in one case actually copy data back from setsockopt, or in
case of bpfilter straight out do not work with kernel pointers at all.
Against net-next/master.
Changes since v1:
- check that users don't pass in kernel addresses
- more bpfilter cleanups
- cosmetic mptcp tweak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For architectures like x86 and arm64 we don't need the separate bit to
indicate that a pointer is a kernel pointer as the address spaces are
unified. That way the sockptr_t can be reduced to a union of two
pointers, which leads to nicer calling conventions.
The only caveat is that we need to check that users don't pass in kernel
address and thus gain access to kernel memory. Thus the USER_SOCKPTR
helper is replaced with a init_user_sockptr function that does this check
and returns an error if it fails.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rework the remaining setsockopt code to pass a sockptr_t instead of a
plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
outside of architecture specific code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154]
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Factour out a helper to set the IPv6 option headers from
do_ipv6_setsockopt.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Note that the get case is pretty weird in that it actually copies data
back to userspace from setsockopt.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Split ipv6_flowlabel_opt into a subfunction for each action and a small
wrapper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the sockptr_t type to merge the versions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is mostly to prepare for cleaning up the callers, as bpfilter by
design can't handle kernel pointers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a uptr_t type that can hold a pointer to either a user or kernel
memory region, and simply helpers to copy to and from it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bpfilter user mode helper processes the optval address using
process_vm_readv. Don't send it kernel addresses fed under
set_fs(KERNEL_DS) as that won't work.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Split __bpfilter_process_sockopt into a low-level send request routine and
the actual setsockopt hook to split the init time ping from the actual
setsockopt processing.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The __user doesn't make sense when casting to an integer type, just
switch to a uintptr_t cast which also removes the need for the __force.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Ariel Levkovich says:
====================
TC datapath hash api
Hash based packet classification allows user to set up rules that
provide load balancing of traffic across multiple vports and
for ECMP path selection while keeping the number of rule at minimum.
Instead of matching on exact flow spec, which requires a rule per
flow, user can define rules based on a their hash value and distribute
the flows to different buckets. The number of rules
in this case will be constant and equal to the number of buckets.
The series introduces an extention to the cls flower classifier
and allows user to add rules that match on the hash value that
is stored in skb->hash while assuming the value was set prior to
the classification.
Setting the skb->hash can be done in various ways and is not defined
in this series - for example:
1. By the device driver upon processing an rx packet.
2. Using tc action bpf with a program which computes and sets the
skb->hash value.
$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower hash 0x0/0xf \
action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower hash 0x1/0xf \
action mirred egress redirect dev ens1f0_2
v3 -> v4:
*Drop hash setting code leaving only the classidication parts.
Setting the hash will be possible via existing tc action bpf.
v2 -> v3:
*Split hash algorithm option into 2 different actions.
Asym_l4 available via act_skbedit and bpf via new act_hash.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Adding new cls flower keys for hash value and hash
mask and dissect the hash info from the skb into
the flow key towards flow classication.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Retreive a hash value from the SKB and store it
in the dissector key for future matching.
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
An imbalanced TX indirection table causes netvsc to have low
performance. This table is created and managed during runtime. To help
better diagnose performance issues caused by imbalanced tables, it needs
make TX indirection tables visible.
Because TX indirection table is driver specified information, so
display it via ethtool register dump.
Signed-off-by: Chi Song <chisong@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I noticed that touching linux/rhashtable.h causes lib/vsprintf.c to
be rebuilt. This dependency came through a bogus inclusion in the
file net/flow_offload.h. This patch moves it to the right place.
This patch also removes a lingering rhashtable inclusion in cls_api
created by the same commit.
Fixes: 4e481908c51b ("flow_offload: move tc indirect block to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/pagemap, mm/shmem,
mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts,
io-mapping, MAINTAINERS, and gdb"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
MAINTAINERS: add KCOV section
io-mapping: indicate mapping failure
scripts/decode_stacktrace: strip basepath from all paths
squashfs: fix length field overlap check in metadata reading
mailmap: add entry for Mike Rapoport
khugepaged: fix null-pointer dereference due to race
mm/hugetlb: avoid hardcoding while checking if cma is enabled
mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
mm/memcg: fix refcount error while moving and swapping
mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()
mm: initialize return of vm_insert_pages
vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into master
Pull xtensa csum regression fix from Al Viro:
"Max Filippov caught a breakage introduced in xtensa this cycle
by the csum_and_copy_..._user() series.
Cut'n'paste from the wrong source - the check that belongs
in csum_and_copy_to_user() ended up both there and in
csum_and_copy_from_user()"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xtensa: fix access check in csum_and_copy_from_user
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master
Pull arm64 fix from Will Deacon:
"Fix compat vDSO build flags for recent versions of clang to tell it
where to find the assembler"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: vdso32: Fix '--prefix=' value for newer versions of clang
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into master
Pull btrfs fixes from David Sterba:
"A few resouce leak fixes from recent patches, all are stable material.
The problems have been observed during testing or have a reproducer"
* tag 'for-5.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix mount failure caused by race with umount
btrfs: fix page leaks after failure to lock page for delalloc
btrfs: qgroup: fix data leak caused by race between writeback and truncate
btrfs: fix double free on ulist after backref resolution failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs into master
Pull zonefs fixes from Damien Le Moal:
"Two fixes, the first one to remove compilation warnings and the second
to avoid potentially inefficient allocation of BIOs for direct writes
into sequential zones"
* tag 'zonefs-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: count pages after truncating the iterator
zonefs: Fix compilation warning
|
|
master
Pull io_uring fixes from Jens Axboe:
- Fix discrepancy in how sqe->flags are treated for a few requests,
this makes it consistent (Daniele)
- Ensure that poll driven retry works with double waitqueue poll users
- Fix a missing io_req_init_async() (Pavel)
* tag 'io_uring-5.8-2020-07-24' of git://git.kernel.dk/linux-block:
io_uring: missed req_init_async() for IOSQE_ASYNC
io_uring: always allow drain/link/hardlink/async sqe flags
io_uring: ensure double poll additions work with both request types
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into master
Pull iommu fix from Joerg Roedel:
"Fix a NULL-ptr dereference in the QCOM IOMMU driver"
* tag 'iommu-fix-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/qcom: Use domain rather than dev as tlb cookie
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma into master
Pull rdma fixes from Jason Gunthorpe:
"One merge window regression, some corruption bugs in HNS and a few
more syzkaller fixes:
- Two long standing syzkaller races
- Fix incorrect HW configuration in HNS
- Restore accidentally dropped locking in IB CM
- Fix ODP prefetch bug added in the big rework several versions ago"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Prevent prefetch from racing with implicit destruction
RDMA/cm: Protect access to remote_sidr_table
RDMA/core: Fix race in rdma_alloc_commit_uobject()
RDMA/hns: Fix wrong PBL offset when VA is not aligned to PAGE_SIZE
RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC
RDMA/mlx5: Use xa_lock_irq when access to SRQ table
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm into master
Pull device mapper fix from Mike Snitzer:
"A stable fix for DM integrity target's integrity recalculation that
gets skipped when resuming a device. This is a fix for a previous
stable@ fix"
* tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm integrity: fix integrity recalculation that is improperly skipped
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into master
Pull i2c fixes from Wolfram Sang:
"Again some driver bugfixes and some documentation fixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-qcom-geni: Fix DMA transfer race
i2c: rcar: always clear ICSAR to avoid side effects
MAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu
i2c: drop duplicated word in the header file
i2c: cadence: Clear HOLD bit at correct time in Rx path
Revert "i2c: cadence: Fix the hold bit setting"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc into master
Pull MMC fix from Ulf Hansson:
"Fix clock divider calculation in the ASPEED SDHCI controller"
* tag 'mmc-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-of-aspeed: Fix clock divider calculation
|
|
into master
Pull drm fixes from Dave Airlie:
"Quiet fixes, I may have a single regression fix follow up to this for
nouveau, but it might be next week, Ben was testing it a bit more .
Otherwise two amdgpu fixes, one lima and one sun4i:
amdgpu:
- Fix crash when overclocking VegaM
- Fix possible crash when editing dpm levels
sun4i:
- Fix inverted HPD result; fixes an earlier fix
lima:
- fix timeout during reset"
* tag 'drm-fixes-2020-07-24' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: Fix NULL dereference in dpm sysfs handlers
drm/amd/powerplay: fix a crash when overclocking Vega M
drm/lima: fix wait pp reset timeout
drm: sun4i: hdmi: Fix inverted HPD result
|
|
Commit ed66f991bb19 ("module: Refactor section attr into bin attribute")
removed the 'name' field from 'struct module_sect_attr' triggering the
following error when invoking lx-symbols:
(gdb) lx-symbols
loading vmlinux
scanning for modules in linux/build
loading @0xffffffffc014f000: linux/build/drivers/net/tun.ko
Python Exception <class 'gdb.error'> There is no member named name.:
Error occurred in Python: There is no member named name.
This patch fixes the issue taking the module name from the 'struct
attribute'.
Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Kieran Bingham <kbingham@kernel.org>
Link: http://lkml.kernel.org/r/20200722102239.313231-1-sgarzare@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
To link KCOV to the kasan-dev@ mailing list.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Link: http://lkml.kernel.org/r/5fa344db7ac4af2213049e5656c0f43d6ecaa379.1595331682.git.andreyknvl@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look like
this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had been
propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
[akpm@linux-foundation.org: detect ioremap_wc() errors earlier]
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently the basepath is removed only from the beginning of the string.
When the symbol is inlined and there's multiple line outputs of
addr2line, only the first line would have basepath removed.
Change to remove the basepath prefix from all lines.
Fixes: 31013836a71e ("scripts/decode_stacktrace: match basepath using shell prefix operator, not regex")
Co-developed-by: Shik Chen <shik@chromium.org>
Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org>
Signed-off-by: Shik Chen <shik@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Link: http://lkml.kernel.org/r/20200720082709.252805-1-pihsun@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is a regression introduced by the "migrate from ll_rw_block usage
to BIO" patch.
Squashfs packs structures on byte boundaries, and due to that the length
field (of the metadata block) may not be fully in the current block.
The new code rewrote and introduced a faulty check for that edge case.
Fixes: 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO")
Reported-by: Bernd Amend <bernd.amend@gmail.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Daniel Rosenberg <drosen@google.com>
Link: http://lkml.kernel.org/r/20200717195536.16069-1-phillip@squashfs.org.uk
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add an entry to correct my email addresses.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200708095414.12275-1-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|