Age | Commit message (Collapse) | Author |
|
The kmalloc was not being checked - if it fails issue a warning
and return -ENOMEM to the caller.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Fixes: b8da344b74c8 ("cifs: dynamic allocation of ntlmssp blob")
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
cc: Stable <stable@vger.kernel.org>`
|
|
Some SMB2/3 servers, Win2016 but possibly others too, adds padding
not only between PDUs in a compound but also to the final PDU.
This padding extends the PDU to a multiple of 8 bytes.
Check if the unexpected length looks like this might be the case
and avoid triggering the log messages for :
"SMB2 server sent bad RFC1001 len %d not %d\n"
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Masami found an off by one bug in the code that keeps "notrace"
functions from being traced by kprobes. During my testing, I found
that there's places that we may want to add kprobes to notrace, thus
we may end up changing this code before 4.19 is released.
The history behind this change is that we found that adding kprobes to
various notrace functions caused the kernel to crashed. We took the
safe route and decided not to allow kprobes to trace any notrace
function.
But because notrace is added to functions that just cause weird side
effects to the function tracer, but are still safe, preventing kprobes
for all notrace functios may be too much of a big hammer.
One such place is __schedule() is marked notrace, to keep function
tracer from doing strange recursive loops when it gets traced with
NEED_RESCHED set. With this change, one can not add kprobes to the
scheduler.
Masami also added code to use gcov on ftrace"
* tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Fix to check notrace function with correct range
tracing: Allow gcov profiling on only ftrace subsystem
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Fixes 2018-08-23
This series contains bug fixes to the ice driver.
Anirudh provides several fixes, starting with static analysis fixes by
replacing a bitwise-and with a constant value and replace "magic"
numbers with defines. Fixed the control queue processing by removing
unnecessary read/writes to registers, as well as getting a accurate
value for "pending". Added additional checks to avoid NULL pointer
dereferences. Fixed up code formatting issues, by cleaning up code
comments and coding style.
Bruce cleans up a duplicate check for owner, within the same function.
Also cleans up interrupt causes that are not handled or applicable.
Fix checkpatch warning about the use of bool in structures due to the
wasted space and size of bool, so convert struct members to u8 instead.
Jake fixes a number of potential bugs in the reporting of stats via
ethtool, by simply reporting all the queue statistics, even for the
queues that are not activated. Fixed a compiler warning, as well as
make the code a bit cleaner but just using order_base_2() for
calculating the power-of-2.
Preethi adds a check to avoid a NULL pointer dereference crash during
initialization.
Brett clarifies the code when it comes to port VLANs and regular VLANs,
by renaming defines and field values to match their intended use and
purpose.
Jesse initializes a variable to avoid garbage values being returned to
the caller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If we don't use paravirt; don't play unnecessary and complicated games
to free page-tables.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The generic tlb_end_vma does not call invalidate_range mmu notifier, and
it resets resets the mmu_gather range, which means the notifier won't be
called on part of the range in case of an unmap that spans multiple
vmas.
ARM64 seems to be the only arch I could see that has notifiers and uses
the generic tlb_end_vma. I have not actually tested it.
[ Catalin and Will point out that ARM64 currently only uses the
notifiers for KVM, which doesn't use the ->invalidate_range()
callback right now, so it's a bug, but one that happens to
not affect them. So not necessary for stable. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
invalidates, the PowerPC-hash where this code originated and the
Sparc-hash where this was subsequently used did not need that. ARM
which later used this put an explicit TLB invalidate in their
__p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now
be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch
now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Will noted that only checking mm_users is incorrect; we should also
check mm_count in order to cover CPUs that have a lazy reference to
this mm (and could do speculative TLB operations).
If removing this turns out to be a performance issue, we can
re-instate a more complete check, but in tlb_table_flush() eliding the
call_rcu_sched().
Fixes: 267239116987 ("mm, powerpc: move the RCU page-table freeing into generic code")
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rik van Riel <riel@surriel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
There is no need to call this from tlb_flush_mmu_tlbonly, it logically
belongs with tlb_flush_mmu_free. This makes future fixes simpler.
[ This was originally done to allow code consolidation for the
mmu_notifier fix, but it also ends up helping simplify the
HAVE_RCU_TABLE_INVALIDATE fix. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When running in a container with a user namespace, if you call getxattr
with name = "system.posix_acl_access" and size % 8 != 4, then getxattr
silently skips the user namespace fixup that it normally does resulting in
un-fixed-up data being returned.
This is caused by posix_acl_fix_xattr_to_user() being passed the total
buffer size and not the actual size of the xattr as returned by
vfs_getxattr().
This commit passes the actual length of the xattr as returned by
vfs_getxattr() down.
A reproducer for the issue is:
touch acl_posix
setfacl -m user:0:rwx acl_posix
and the compile:
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <attr/xattr.h>
/* Run in user namespace with nsuid 0 mapped to uid != 0 on the host. */
int main(int argc, void **argv)
{
ssize_t ret1, ret2;
char buf1[128], buf2[132];
int fret = EXIT_SUCCESS;
char *file;
if (argc < 2) {
fprintf(stderr,
"Please specify a file with "
"\"system.posix_acl_access\" permissions set\n");
_exit(EXIT_FAILURE);
}
file = argv[1];
ret1 = getxattr(file, "system.posix_acl_access",
buf1, sizeof(buf1));
if (ret1 < 0) {
fprintf(stderr, "%s - Failed to retrieve "
"\"system.posix_acl_access\" "
"from \"%s\"\n", strerror(errno), file);
_exit(EXIT_FAILURE);
}
ret2 = getxattr(file, "system.posix_acl_access",
buf2, sizeof(buf2));
if (ret2 < 0) {
fprintf(stderr, "%s - Failed to retrieve "
"\"system.posix_acl_access\" "
"from \"%s\"\n", strerror(errno), file);
_exit(EXIT_FAILURE);
}
if (ret1 != ret2) {
fprintf(stderr, "The value of \"system.posix_acl_"
"access\" for file \"%s\" changed "
"between two successive calls\n", file);
_exit(EXIT_FAILURE);
}
for (ssize_t i = 0; i < ret2; i++) {
if (buf1[i] == buf2[i])
continue;
fprintf(stderr,
"Unexpected different in byte %zd: "
"%02x != %02x\n", i, buf1[i], buf2[i]);
fret = EXIT_FAILURE;
}
if (fret == EXIT_SUCCESS)
fprintf(stderr, "Test passed\n");
else
fprintf(stderr, "Test failed\n");
_exit(fret);
}
and run:
./tester acl_posix
On a non-fixed up kernel this should return something like:
root@c1:/# ./t
Unexpected different in byte 16: ffffffa0 != 00
Unexpected different in byte 17: ffffff86 != 00
Unexpected different in byte 18: 01 != 00
and on a fixed kernel:
root@c1:~# ./t
Test passed
Cc: stable@vger.kernel.org
Fixes: 2f6f0654ab61 ("userns: Convert vfs posix_acl support to use kuids and kgids")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199945
Reported-by: Colin Watson <cjwatson@ubuntu.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
1) Add missing "\n" when printing link event error message.
2) Update dev_err statement in probe.
3) Add function description for ice_clear_pf_cfg.
4) Fix coding style for ice_acquire_nvm.
5) netdev->mtu is unsigned so use %u.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Recent versions of checkpatch have a new warning based on a documented
preference of Linus to not use bool in structures due to wasted space and
the size of bool is implementation dependent. For more information, see
the email thread at https://lkml.org/lkml/2017/11/21/384.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
In ice_vsi_setup_[tx|rx]_rings, err is uninitialized which can result in
a garbage value return to the caller. Fix that.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
1) When ice_ena_msix_range() fails to reserve vectors, a devm_kfree()
warning was seen in the error flow path. So check pf->irq_tracker
before use in ice_clear_interrupt_scheme().
2) In ice_vsi_cfg(), check vsi->netdev before use.
3) In ice_get_link_status, check link_up before use.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Remove the following interrupt causes that are not applicable or not
handled:
- PFINT_OICR_HLP_RDY_M
- PFINT_OICR_CPM_RDY_M
- PFINT_OICR_GPIO_M
- PFINT_OICR_STORM_DETECT_M
Add the following interrupt cause that's actually handled in ice_misc_intr:
- PFINT_OICR_PE_CRITERR_M
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
When command line parsing fails in the while loop in do_event_pipe()
because the number of arguments is incorrect or because the keyword is
unknown, an error message is displayed, but bpftool remains stuck in
the loop. Make sure we exit the loop upon failure.
Fixes: f412eed9dfde ("tools: bpftool: add simple perf event output reader")
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Prior to doing compiler feature detection in Kconfig, attempts to build
GCC plugins with Clang would fail the build, much in the same way missing
GCC plugin headers would fail the build. However, now that this logic
has been lifted into Kconfig, add an explicit test for GCC (instead of
duplicating it in the feature-test script).
Reported-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
|
|
In the struct ice_aqc_vsi_props the field port_vlan_flags is an
overloaded term because it is used for both port VLANs (PVLANs) and
regular VLANs. This is an issue and is very confusing especially when
dealing with VFs because normal VLANs and port VLANs are not the same.
To fix this the field was renamed to vlan_flags and all of the #define's
labeled *_PVLAN_* were renamed to *_VLAN_* if they are not specific to
port VLANs.
Also in ice_vsi_manage_vlan_stripping, set the ICE_AQ_VSI_VLAN_MODE_ALL
bit to allow the driver to add a VLAN tag to all packets it sends.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Currently, we use a combination of ilog2 and is_power_of_2() to
calculate the next power of 2 for the qcount. This appears to be causing
a warning on some combinations of GCC and the Linux kernel:
MODPOST 1 modules
WARNING: "____ilog2_NaN" [ice.ko] undefined!
This appears to because because GCC realizes that qcount could be zero
in some circumstances and thus attempts to link against the
intentionally undefined ___ilog2_NaN function.
The order_base_2 function is intentionally defined to return 0 when
passed 0 as an argument, and thus will be safe to use here.
This not only fixes the warning but makes the resulting code slightly
cleaner, and is really what we should have used originally.
Also update the comment to make it more clear that we are rounding up,
not just incrementing the ilog2 of qcount unconditionally.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch is a consolidation of multiple bug fixes for control queue
processing.
1) In ice_clean_adminq_subtask() remove unnecessary reads/writes to
registers. The bits PFINT_FW_CTL, PFINT_MBX_CTL and PFINT_SB_CTL
are not set when an interrupt arrives, which means that clearing them
again can be omitted.
2) Get an accurate value in "pending" by re-reading the control queue
head register from the hardware.
3) Fix a corner case involving lost control queue messages by checking
for new control messages (using ice_ctrlq_pending) before exiting the
cleanup routine.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Clean control queues only when they are initialized. One of the ways to
validate if the basic initialization is done is by checking value of
cq->sq.head and cq->rq.head variables that specify the register address.
This patch adds a check to avoid NULL pointer dereference crash when tried
to shutdown uninitialized control queue.
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
It is not safe to have the string table for statistics change order or
size over the lifetime of a given netdevice. This is because of the
nature of the 3-step process for obtaining stats. First, user space
performs a request for the size of the strings table. Second it performs
a separate request for the strings themselves, after allocating space
for the table. Third, it requests the stats themselves, also allocating
space for the table.
If the size decreased, there is potential to see garbage data or stats
values. In the worst case, we could potentially see stats values become
mis-aligned with their strings, so that it looks like a statistic is
being reported differently than it actually is.
Even worse, if the size increased, there is potential that the strings
table or stats table was not allocated large enough and the stats code
could access and write to memory it should not, potentially resulting in
undefined behavior and system crashes.
It isn't even safe if the size always changes under the RTNL lock. This
is because the calls take place over multiple user space commands, so it
is not possible to hold the RTNL lock for the entire duration of
obtaining strings and stats. Further, not all consumers of the ethtool
API are the user space ethtool program, and it is possible that one
assumes the strings will not change (valid under the current contract),
and thus only requests the stats values when requesting stats in a loop.
Finally, it's not possible in the general case to detect when the size
changes, because it is quite possible that one value which could impact
the stat size increased, while another decreased. This would result in
the same total number of stats, but reordering them so that stats no
longer line up with the strings they belong to. Since only size changes
aren't enough, we would need some sort of hash or token to determine
when the strings no longer match. This would require extending the
ethtool stats commands, but there is no more space in the relevant
structures.
The real solution to resolve this would be to add a completely new API
for stats, probably over netlink.
In the ice driver, the only thing impacting the stats that is not
constant is the number of queues. Instead of reporting stats for each
used queue, report stats for each allocated queue. We do not change the
number of queues allocated for a given netdevice, as we pass this into
the alloc_etherdev_mq() function to set the num_tx_queues and
num_rx_queues.
This resolves the potential bugs at the slight cost of displaying many
queue statistics which will not be activated.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
Use define for the unit size shift of the Rx LAN context descriptor base
address instead of the magic number 7.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
All BPF hash and LRU maps currently have a known and global seed
we feed into jhash() which is 0. This is suboptimal, thus fix it
by generating a random seed upon hashtab setup time which we can
later on feed into jhash() on lookup, update and deletions.
Fixes: 0f8e4bd8a1fc8 ("bpf: add hashtable type of eBPF maps")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
|
|
There is already a check for owner == ICE_SCHED_NODE_OWNER_LAN at the
beginning of ice_sched_update_vsi_child_nodes. Remove the additional
check to address the static analysis tool smatch issue "warn: we tested
'owner' before and it was 'false'".
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
This patch fixes the following smatch errors:
1) Fix "odd binop '0x0 & 0xc'" when performing the bitwise-and with a
constant value of zero (ICE_AQC_GSET_RSS_LUT_TABLE_SIZE_128_FLAG).
Remove a similar bitwise-and with 0 in ice_add_marker_act() and use the
right mask ICE_LG_ACT_GENERIC_OFFSET_M in the expression.
2) Fix a similar issue "odd binop '0x0 & 0x1800' in ice_req_irq_msix_misc.
3) Fix "odd binop '0x380000 & 0x7fff8'" in ice_add_marker_act(). Also, use
a new define ICE_LG_ACT_GENERIC_OFF_RX_DESC_PROF_IDX instead of magic
number '7'.
4) Fix warn: odd binop '0x0 & 0x18' in ice_set_dflt_vsi_ctx() by removing
unnecessary logic to explicitly unset bits 3 and 4 in port_vlan_bits.
These bits are unset already by the memset on ctxt->info.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
|
|
A previous commit removed the ability to have per-rq flags. We used
those flags to maintain inflight counts. Since we don't have those
anymore, we have to always maintain inflight counts, even if wbt is
disabled. This is clearly suboptimal.
Add a queue quiesce around changing the wbt latency settings from sysfs
to work around this. With that, we can reliably put the enabled check in
our bio_to_wbt_flags(), since we know the WBT_TRACKED flag will be
consistent for the lifetime of the request.
Fixes: c1c80384c8f ("block: remove external dependency on wbt_flags")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The commit e7e81847478 ("powerpc/64s: move machine check SLB flushing
to mm/slb.c") introduced a bug in reloading bolted SLB entries. Unused
bolted entries are stored with .esid=0 in the slb_shadow area, and
that value is now used directly as the RB input to slbmte, which means
the RB[52:63] index field is set to 0, which causes SLB entry 0 to be
cleared.
Fix this by storing the index bits in the unused bolted entries, which
directs the slbmte to the right place.
The SLB shadow area is also used by the hypervisor, but PAPR is okay
with that, from LoPAPR v1.1, 14.11.1.3 SLB Shadow Buffer:
Note: SLB is filled sequentially starting at index 0
from the shadow buffer ignoring the contents of
RB field bits 52-63
Fixes: e7e81847478b ("powerpc/64s: move machine check SLB flushing to mm/slb.c")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
Commit 76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in
the pinned physical page", 2018-07-17) added some checks to ensure
that guest DMA mappings don't attempt to map more than the guest is
entitled to access. However, errors in the logic mean that legitimate
guest requests to map pages for DMA are being denied in some
situations. Specifically, if the first page of the range passed to
mm_iommu_get() is mapped with a normal page, and subsequent pages are
mapped with transparent huge pages, we end up with mem->pageshift ==
0. That means that the page size checks in mm_iommu_ua_to_hpa() and
mm_iommu_up_to_hpa_rm() will always fail for every page in that
region, and thus the guest can never map any memory in that region for
DMA, typically leading to a flood of error messages like this:
qemu-system-ppc64: VFIO_MAP_DMA: -22
qemu-system-ppc64: vfio_dma_map(0x10005f47780, 0x800000000000000, 0x10000, 0x7fff63ff0000) = -22 (Invalid argument)
The logic errors in mm_iommu_get() are:
(a) use of 'ua' not 'ua + (i << PAGE_SHIFT)' in the find_linux_pte()
call (meaning that find_linux_pte() returns the pte for the
first address in the range, not the address we are currently up
to);
(b) use of 'pageshift' as the variable to receive the hugepage shift
returned by find_linux_pte() - for a normal page this gets set
to 0, leading to us setting mem->pageshift to 0 when we conclude
that the pte returned by find_linux_pte() didn't match the page
we were looking at;
(c) comparing 'compshift', which is a page order, i.e. log base 2 of
the number of pages, with 'pageshift', which is a log base 2 of
the number of bytes.
To fix these problems, this patch introduces 'cur_ua' to hold the
current user address and uses that in the find_linux_pte() call;
introduces 'pteshift' to hold the hugepage shift found by
find_linux_pte(); and compares 'pteshift' with 'compshift +
PAGE_SHIFT' rather than 'compshift'.
The patch also moves the local_irq_restore to the point after the PTE
pointer returned by find_linux_pte() has been dereferenced because
otherwise the PTE could change underneath us, and adds a check to
avoid doing the find_linux_pte() call once mem->pageshift has been
reduced to PAGE_SHIFT, as an optimization.
Fixes: 76fa4975f3ed ("KVM: PPC: Check if IOMMU page is contained in the pinned physical page")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
The Nest MMU workaround is only needed for RW upgrades. Avoid doing
that for other PTE updates.
We also avoid clearing the PTE while marking it invalid. This is
because other page table walkers will find this PTE none and can
result in unexpected behaviour due to that. Instead we clear
_PAGE_PRESENT and set the software PTE bit _PAGE_INVALID.
pte_present() is already updated to check for both bits. This makes
sure page table walkers will find the PTE present and things like
pte_pfn(pte) returns the right value.
Based on an original patch from Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
LLVM/clang/eBPF: (Arnaldo Carvalho de Melo)
- Allow passing options to llc in addition to to clang.
Hardware tracing: (Jack Henschel)
- Improve error message for PMU address filters, clarifying availability of
that feature in hardware having hardware tracing such as Intel PT.
Python interface: (Jiri Olsa)
- Fix read_on_cpu() interface.
ELF/DWARF libraries: (Jiri Olsa)
- Fix handling of the combo compressed module file + decompressed associated
debuginfo file.
Build (Rasmus Villemoes)
- Disable parallelism for 'make clean', avoiding multiple submakes deleting
the same files and causing the build to fail on systems such as Yocto.
Kernel ABI copies: (Arnaldo Carvalho de Melo)
- Update tools's copy of x86's cpufeatures.h.
- Update arch/x86/lib/memcpy_64.S copy used in 'perf bench mem memcpy'.
Miscellaneous: (Steven Rostedt)
- Change libtraceevent to SPDX License format.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Another panel that reports "DFP 1.x compliant TMDS" but it supports 6bpc
instead of 8 bpc.
Apply 6 bpc quirk for the panel to fix it.
BugLink: https://bugs.launchpad.net/bugs/1788308
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180823055332.7723-1-kai.heng.feng@canonical.com
|
|
My fix for a recursive Kconfig dependency caused another issue where the
ACPI specific options end up in the top-level menu in 'menuconfig'. This
was an unintended side-effect of having a silent option between
'menuconfig ACPI' and 'if ACPI'.
Moving the ARCH_SUPPORTS_ACPI symbol ahead of the ACPI menu solves that
problem and restores the previous presentation.
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 2c870e61132c (arm64: fix ACPI dependencies)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Johan Hedberg says:
====================
pull request: bluetooth 2018-08-23
Here are two important Bluetooth fixes for the MediaTek and RealTek HCI
drivers.
Please let me know if there are any issues pulling, thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Huazhong Tan says:
====================
net: hns3: bug fix & optimization for HNS3 driver
This patchset presents a bug fix found out when CONFIG_ARM64_64K_PAGES
enable and an optimization for HNS3 driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'truesize' is supposed to be u32, not int, so fix it.
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When enable the config item "CONFIG_ARM64_64K_PAGES", the size of
PAGE_SIZE is 65536(64K). But the type of page_offset is u16, it will
overflow. So change it to u32, when "CONFIG_ARM64_64K_PAGES" enabled.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path")
forgot to handle anycast route and init anycast rt->dst.input to ip6_forward.
Fix it by setting anycast rt->dst.input back to ip6_input.
Fixes: 6edb3c96a5f02 ("net/ipv6: Defer initialization of dst to data path")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Huazhong Tan says:
====================
net: hns: bug fixes & optimization for HNS driver
This patchset presents some bug fixes found out when CONFIG_ARM64_64K_PAGES
enable and an optimization for HNS driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update hns to drop the hns_nic_get_headlen function in favour of
eth_get_headlen, and hence also removes now redundant hns_nic_get_headlen.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb->truesize is not meant to be tracking amount of used bytes in a skb,
but amount of reserved/consumed bytes in memory.
For instance, if we use a single byte in last page fragment, we have to
account the full size of the fragment.
So skb_add_rx_frag needs to calculate the length of the entire buffer into
turesize.
Fixes: 9cbe9fd5214e ("net: hns: optimize XGE capability by reducing cpu usage")
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
'truesize' is supposed to be u32, not int, so fix it.
Signed-off-by: Huazhong tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When enable the config item "CONFIG_ARM64_64K_PAGES", the size of PAGE_SIZE
is 65536(64K). But the type of length and page_offset are u16, they will
overflow. So change them to u32.
Fixes: 6fe6611ff275 ("net: add Hisilicon Network Subsystem hnae framework support")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Kevin Yang says:
====================
tcp_bbr: PROBE_RTT minor bug fixes
This series includes two minor bug fixes for the TCP BBR PROBE_RTT
mechanism, and one preparatory patch:
(1) A preparatory patch to reorganize the PROBE_RTT logic by refactoring
(into its own function) the code to exit PROBE_RTT, since the next
patch will be using that code in a new context.
(2) Fix: When BBR restarts from idle and if BBR is in PROBE_RTT mode,
BBR should check if it's time to exit PROBE_RTT. If yes, then BBR
should exit PROBE_RTT mode and restore the cwnd to its full value.
(3) Fix: Apply the PROBE_RTT cwnd cap even if the count of fully-ACKed
packets is 0.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit fixes a corner case where TCP BBR would enter PROBE_RTT
mode but not reduce its cwnd. If a TCP receiver ACKed less than one
full segment, the number of delivered/acked packets was 0, so that
bbr_set_cwnd() would short-circuit and exit early, without cutting
cwnd to the value we want for PROBE_RTT.
The fix is to instead make sure that even when 0 full packets are
ACKed, we do apply all the appropriate caps, including the cap that
applies in PROBE_RTT mode.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch fix the case where BBR does not exit PROBE_RTT mode when
it restarts from idle. When BBR restarts from idle and if BBR is in
PROBE_RTT mode, BBR should check if it's time to exit PROBE_RTT. If
yes, then BBR should exit PROBE_RTT mode and restore the cwnd to its
full value.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch add a helper function bbr_check_probe_rtt_done() to
1. check the condition to see if bbr should exit probe_rtt mode;
2. process the logic of exiting probe_rtt mode.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tcp uses per-cpu (and per namespace) sockets (net->ipv4.tcp_sk) internally
to send some control packets.
1) RST packets, through tcp_v4_send_reset()
2) ACK packets in SYN-RECV and TIME-WAIT state, through tcp_v4_send_ack()
These packets assert IP_DF, and also use the hashed IP ident generator
to provide an IPv4 ID number.
Geoff Alexander reported this could be used to build off-path attacks.
These packets should not be fragmented, since their size is smaller than
IPV4_MIN_MTU. Only some tunneled paths could eventually have to fragment,
regardless of inner IPID.
We really can use zero IPID, to address the flaw, and as a bonus,
avoid a couple of atomic operations in ip_idents_reserve()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Geoff Alexander <alexandg@cs.unm.edu>
Tested-by: Geoff Alexander <alexandg@cs.unm.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
All the 3 callers of addrconf_add_mroute() assert RTNL
lock, they don't take any additional lock either, so
it is safe to convert it to GFP_KERNEL.
Same for sit_add_v4_addrs().
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|