summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-19nfp: release global resources only on the remove pathJakub Kicinski
NFP app is currently shut down as soon as all the vNICs are gone. This means we can't depend on the app existing throughout the lifetime of the device. Free the app only from PCI remove path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19nfp: core: make scalar CPP helpers fail on short accessesJakub Kicinski
Currently the helpers for accessing 4 or 8 byte values over the CPP bus return the length of IO on success. If the IO was short caller has to deal with error handling. The short IO for 4/8B values is completely impractical. Make the helpers return an error if full access was not possible. Fix the few places which are actually dealing with errors correctly, most call sites already only deal with negative return codes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19net/mlx5e: Add likely to the common RX checksum flowGal Pressman
Most of the packets return true for is_last_ethertype_ip, surround it with likely compiler hint. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Extend the stats group API to have update_stats()Kamal Heib
Extend the stats group API to have an update_stats() callback which will be used to fetch the hardware or software counters data. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Merge per priority stats groupsKamal Heib
Merge the per priority traffic and pfc groups into one group, because both groups share the same update_stats() callback which will be introduced in the upcoming patch. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Add per-channel counters infrastructure, use it upon TX timeoutEran Ben Elisha
Add per-channel counter ch#_eq_rearm to monitor how many lost interrupt recovery actions happened upon TX timeouts. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Poll event queue upon TX timeout before performing full channels ↵Eran Ben Elisha
recovery Up until this patch, on every TX timeout we would try to do channels recovery. However, in case of a lost interrupt for an EQ, the channel associated to it cannot be recovered if reopened as it would never get another interrupt on sent/received traffic, and eventually ends up with another TX timeout (Restarting the EQ is not part of channel recovery). This patch adds a mechanism for explicitly polling EQ in case of a TX timeout in order to recover from a lost interrupt. If this is not the case (no pending EQEs), perform a channels full recovery as usual. Once a lost EQE is recovered, it triggers the NAPI to run and handle all pending completions. This will free some budget in the bql (via calling netdev_tx_completed_queue) or by clearing pending TXWQEs and waking up the queue. One of the above actions will move the queue to be ready for transmit again. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Add Event Queue meta data info for TX timeout logsEran Ben Elisha
When TX timeout occurs, EQ consumer index and irqn can help in debug for understanding the SW state of EQ. Add them to the logger prints for the relevant EQ only. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Print delta since last transmit per SQ upon TX timeoutEran Ben Elisha
When driver callback for TX timeout is being called, it handles all stopped xmit queues (not only the ones which their timeout expired). Add usecs since last transmit to TX timeout logs per send queue in order to monitor if the queue timeout expired. Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Set hairpin queue sizeOr Gerlitz
For a given hairpin packet buffer size, different queue sizes (values of log_hairpin_num_packets) determine how the data is broken to strides on the RQ. Currently the chosen value is set to 64B strides. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5: Enable setting hairpin queue sizeOr Gerlitz
Allow to specify the size of the hairpin queues along with the packet buffer data size from the core setup code. If the driver doesn't provide this, the FW applies proper value that matches the provided data size and a FW chosen RQ stride size. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Add RSS support for hairpinOr Gerlitz
Support RSS for hairpin traffic. We create multiple hairpin RQ/SQ pairs and RSS TTC table per hairpin instance and steer the related flows through that table so they are spread between the pairs. We open one pair per 50Gbs link speed, for all speeds <= 50Gbs, there is one pair and no RSS while for 100Gbs ports two RSSed pairs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5: Vectorize the low level core hairpin objectOr Gerlitz
Enhance the hairpin setup code at the core to support a set of N (RQ,SQ) pairs. This will be later used by the caller to set RSS spreading among the different RQs. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Enlarge the NIC TC offload steering prio to support two levelsOr Gerlitz
This will allow to be able and set TC rule whose steering dest is RSS TTC steering table. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Refactor RSS related objects and codeOr Gerlitz
In order to use RSS for hairpin, we refactor the code that deals with setup of the TTC steering tables. This is done using an interim ttc params object that has the flow table attributes, TIR numbers, etc. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Set per priority hairpin pairsOr Gerlitz
As part of the QoS model, on xmit, the HW mandates that all packets going through a given SQ have the same priority. To align hairpin SQs with that, we use the priority given as part of the matching for the hairpin hash key. This ensures that flows/packets mapped to different HW priorities will go through different hairpin instances. If no priority is given for matching, we treat that as an 8th priority, this is in order not to harm cases where priority is specified. Only the PCP priority trust model is supported. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19net/mlx5e: Use vhca id as the hairpin peer identifierOr Gerlitz
The peer vhca id spans less bits vs the ifindex and can well serve for the hairpin hash key, move to use that. This is a pre-step to put more info into the hairpin hash key in downstream patch while keeping it at 32 bits. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-01-19Merge branch 'tcp-min-rtt'David S. Miller
Yuchung Cheng says: ==================== tcp: do not use RTT from delayed ACKs for min-RTT This patch set prevents TCP sender from using RTT samples from (suspected) delayed ACKs as the minimum RTT, to avoid unbounded over-estimation of the network path delay. This issue is common when a connection has extended periods of one packet chit-chat beyond the min RTT filter window. The first patch does that for TCP general min RTT estimation. The second patch addresses specifically the BBR congestion control's min RTT filter. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19tcp: avoid min RTT bloat by skipping RTT from delayed-ACK in BBRYuchung Cheng
A persistent connection may send tiny amount of data (e.g. health-check) for a long period of time. BBR's windowed min RTT filter may only see RTT samples from delayed ACKs causing BBR to grossly over-estimate the path delay depending how much the ACK was delayed at the receiver. This patch skips RTT samples that are likely coming from delayed ACKs. Note that it is possible the sender never obtains a valid measure to set the min RTT. In this case BBR will continue to set cwnd to initial window which seems fine because the connection is thin stream. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19tcp: avoid min-RTT overestimation from delayed ACKsYuchung Cheng
This patch avoids having TCP sender or congestion control overestimate the min RTT by orders of magnitude. This happens when all the samples in the windowed filter are one-packet transfer like small request and health-check like chit-chat, which is farily common for applications using persistent connections. This patch tries to conservatively labels and skip RTT samples obtained from this type of workload. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19tipc: fix race between poll() and setsockopt()Jon Maloy
Letting tipc_poll() dereference a socket's pointer to struct tipc_group entails a race risk, as the group item may be deleted in a concurrent tipc_sk_join() or tipc_sk_leave() thread. We now move the 'open' flag in struct tipc_group to struct tipc_sock, and let the former retain only a pointer to the moved field. This will eliminate the race risk. Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19l2tp: remove switch block in l2tp_nl_cmd_session_create()Lorenzo Bianconi
Remove the switch block in l2tp_nl_cmd_session_create() that checks pseudowire-specific parameters since just L2TP_PWTYPE_ETH and L2TP_PWTYPE_PPP are currently supported and no actual checks are performed. Moreover the L2TP_PWTYPE_IP/default case presents a harmless issue in error handling (break instead of goto out_tunnel) Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19cxgb4: IPv6 filter takes 2 tidsGanesh Goudar
on T6, IPv6 filter would occupy 2 tids instead of 4. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge branch 'l2tp-set-l2specific_len-based-on-l2specific_type'David S. Miller
Lorenzo Bianconi says: ==================== l2tp: set l2specific_len based on l2specific_type Do not rely on l2specific_len value provided by userspace but set sublayer length according to l2specific_type. Mark L2TP_ATTR_L2SPEC_LEN attribute as not used Changes since v2: - drop the patch related to a fix in the switch default case in l2tp_nl_cmd_session_create() - use L2SPECTYPE_NONE as default case in l2tp_get_l2specific_len() Changes since v1: - remove l2specific_len parameter - add sanity check on l2specific_type provided by userspace ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19l2tp: mark L2TP_ATTR_L2SPEC_LEN as not usedLorenzo Bianconi
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19l2tp: remove l2specific_len configurable parameterLorenzo Bianconi
Remove l2specific_len configuration parameter since now L2-Specific Sublayer length is computed according to l2specific_type provided by userspace. Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19l2tp: remove l2specific_len dependency in l2tp_coreLorenzo Bianconi
Remove l2specific_len dependency while building l2tpv3 header or parsing the received frame since default L2-Specific Sublayer is always four bytes long and we don't need to rely on a user supplied value. Moreover in l2tp netlink code there are no sanity checks to enforce the relation between l2specific_len and l2specific_type, so sending a malformed netlink message is possible to set l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than 4 leaking memory on the wire and sending corrupted frames. Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19l2tp: double-check l2specific_type provided by userspaceLorenzo Bianconi
Add sanity check on l2specific_type provided by userspace in l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and L2TP_L2SPECTYPE_NONE are currently supported. Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT only if the userspace does not provide a value for it Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge branch 'cxgb4-reduce-memory-footprint-for-collecting-firmware-dump'David S. Miller
Rahul Lakkireddy says: ==================== cxgb4: reduce memory footprint for collecting firmware dump Firmware dump can be large (upto 2 GB). In low memory conditions, ethtool fails to allocate such large memory. So, use zlib deflate to compress collected firmware dump. Patch 1 updates collection logic to use compression. Patch 2 adds zlib deflate to compress collected firmware dump. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19cxgb4: use zlib deflate to compress firmware dumpRahul Lakkireddy
Use zlib deflate to compress firmware dump. Collect and compress as much firmware dump as possible into a 32 MB buffer. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19cxgb4: update dump collection logic to use compressionRahul Lakkireddy
Update firmware dump collection logic to use compression when available. Let collection logic attempt to do compression, instead of returning out of memory early. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19net/dim: Fix fixpoint divide exception in net_dim_stats_compareTalat Batheesh
Helmut reported a bug about devision by zero while running traffic and doing physical cable pull test. When the cable unplugged the ppms become zero, so when dividing the current ppms by the previous ppms in the next dim iteration there is devision by zero. This patch prevent this division for both ppms and epms. Fixes: c3164d2fc48f ("net/mlx5e: Added BW check for DIM decision mechanism") Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com> Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge tag 'trace-v4.15-rc4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two more small fixes - The conversion of enums into their actual numbers to display in the event format file had an off-by-one bug, that could cause an enum not to be converted, and break user space parsing tools. - A fix to a previous fix to bring back the context recursion checks. The interrupt case checks for NMI, IRQ and softirq, but the softirq returned the same number regardless if it was set or not, although the logic would force it to be set if it were hit" * tag 'trace-v4.15-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix converting enum's from the map in trace_event_eval_update() ring-buffer: Fix duplicate results in mapping context to bits in recursive lock
2018-01-19devlink: Make some functions staticWei Yongjun
Fixes the following sparse warnings: net/core/devlink.c:2297:25: warning: symbol 'devlink_resource_find' was not declared. Should it be static? net/core/devlink.c:2322:6: warning: symbol 'devlink_resource_validate_children' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a fix for use-after-free in Synaptics RMI4 driver - correction to multitouch contact tracking on certain ALPS touchpads (which got broken when we tried to fix the 2-finger scrolling) - touchpad on Lenovo T640p is switched over to SMbus/RMI - a few device node refcount fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics-rmi4 - prevent UAF reported by KASAN Input: ALPS - fix multi-touch decoding on SS4 plus touchpads Input: synaptics - Lenovo Thinkpad T460p devices should use RMI Input: of_touchscreen - add MODULE_LICENSE Input: 88pm860x-ts - fix child-node lookup Input: twl6040-vibra - fix child-node lookup Input: twl4030-vibra - fix sibling-node lookup
2018-01-19mlxsw: spectrum: Make function mlxsw_sp_kvdl_part_occ() staticWei Yongjun
Fixes the following sparse warning: drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning: symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19forcedeth: remove unused variableZhu Yanjun
The variable miistat is not used. So it is removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Two bugfixes for the I2C core: Lixing Wang fixed a refcounting problem with DT nodes. Jeremy Compostella fixed a buffer overflow possibility when using a 'don't use' ioctl interface directly" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA i2c: core: decrease reference count of device node in i2c_unregister_device
2018-01-19Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixlet from Tejun Heo: "This just adds one more entry for liteon optical drives to the device blacklist for large IOs. The change is very low risk" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: apply MAX_SEC_1024 to all LITEON EP1 series devices
2018-01-19Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "cgroup.threads should be delegatable (ie. a container should be able to write to it from inside) but was missing the flag. The change is very low risk" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: make cgroup.threads delegatable
2018-01-19Merge branch 'for-4.15-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixlet from Tejun Heo: "One patch to add touch_nmi_watchdog() while dumping workqueue debug messages to avoid triggering the lockup detector spuriously. The change is very low risk" * 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: avoid hard lockups in show_workqueue_state()
2018-01-19Merge tag 'armsoc-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "We have various small DT fixes, and one important regression fix: The recent device tree bugfixes that were intended to address issues that 'dtc' started warning about in 4.15 fixed various USB PHY device nodes, but it turns out that we had code that depended on those nodes being incorrect and the probe failing with a particular error code. With the workaround we can also deal with correct device nodes. The DT fixes include: - Allwinner A10 and A20 had the display pipeline set up incorrectly (introduced in v4.15) - The Altera PMU lacked an interrupt-parent (never worked) - Pin muxing on the Openblocks A7 (never worked) - Clocks might get set up wrong on Armada 7K/8K (4.15 regression) We now have additional device tree patches to address all the remaining warnings introduced in 4.15, but decided to queue them for 4.16 instead, to avoid risking another regression like the USB PHY thing mentioned above. * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: phy: work around 'phys' references to usb-nop-xceiv devices ARM: sunxi_defconfig: Enable CMA arm64: dts: socfpga: add missing interrupt-parent ARM: dts: sun[47]i: Fix display backend 1 output to TCON0 remote endpoint ARM64: dts: marvell: armada-cp110: Fix clock resources for various node ARM: dts: da850-lcdk: Remove leading 0x and 0s from unit address ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7
2018-01-19Merge tag 'powerpc-4.15-8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "More than we'd like after rc8, but nothing very alarming either, just tying up loose ends before the release: Since we changed powernv to use cpufreq_get() from show_cpuinfo(), we see warnings with PREEMPT enabled. But the preempt_disable() in show_cpuinfo() doesn't actually prevent CPU hotplug as it suggests, so remove it. Two updates to the recently merged RFI flush code. Wire up the generic sysfs file to report the status, and add a debugfs file to allow enabling/disabling it at runtime. Two updates to xmon, one to add the RFI flush related fields to the paca dump, and another to not use hashed pointers in the paca dump. And one minor fix to add a missing include of linux/types.h in asm/hvcall.h, not seen to break the build in upstream, but correct anyway. Thanks to: Benjamin Herrenschmidt, Michal Suchanek, Nicholas Piggin" * tag 'powerpc-4.15-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: include linux/types.h in asm/hvcall.h powerpc/64s: Allow control of RFI flush via debugfs powerpc/64s: Wire up cpu_show_meltdown() powerpc: Don't preempt_disable() in show_cpuinfo() powerpc/xmon: Don't print hashed pointers in paca dump powerpc/xmon: Add RFI flush related fields to paca dump
2018-01-19ipv6: mcast: remove dead codeEric Dumazet
Since commit 41033f029e39 ("snmp: Remove duplicate OUTMCAST stat increment") one line of code became unneeded. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge tag 'drm-fixes-for-v4.15-rc9' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Nouveau, i915, vmwgfx and sun4i regression fixes. The i915 change fixes a display corruption problem introduced in 4.15, the nouveau changes are for regressions in 4.15, one of the vmwgfx fixes goes back a little further, the other is a 4.15 regression fix, the 3 sun4i changes fix blank HDMI output on those devices" * tag 'drm-fixes-for-v4.15-rc9' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/mmu/mcp77: fix regressions in stolen memory handling drm/nouveau/bar/gk20a: Avoid bar teardown during init drm/nouveau/drm/nouveau: Pass the proper arguments to nvif_object_map_handle() drm/vmwgfx: fix memory corruption with legacy/sou connectors drm/vmwgfx: Fix a boot time warning drm/i915: Fix deadlock in i830_disable_pipe() drm/i915: Redo plane sanitation during readout drm/i915: Add .get_hw_state() method for planes drm/sun4i: hdmi: Add missing rate halving check in sun4i_tmds_determine_rate drm/sun4i: hdmi: Fix incorrect assignment in sun4i_tmds_determine_rate drm/sun4i: hdmi: Check for unset best_parent in sun4i_tmds_determine_rate
2018-01-19caif: reduce stack size with KASANArnd Bergmann
When CONFIG_KASAN is set, we can use relatively large amounts of kernel stack space: net/caif/cfctrl.c:555:1: warning: the frame size of 1600 bytes is larger than 1280 bytes [-Wframe-larger-than=] This adds convenience wrappers around cfpkt_extr_head(), which is responsible for most of the stack growth. With those wrapper functions, gcc apparently starts reusing the stack slots for each instance, thus avoiding the problem. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "6 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: sparse doesn't support struct randomization proc: fix coredump vs read /proc/*/stat race scripts/gdb/linux/tasks.py: fix get_thread_info scripts/decodecode: fix decoding for AArch64 (arm64) instructions mm/page_owner.c: remove drain_all_pages from init_early_allocated_pages mm/memory.c: release locked page in do_swap_page()
2018-01-19ia64: Rewrite atomic_add and atomic_subMatthew Wilcox
Force __builtin_constant_p to evaluate whether the argument to atomic_add & atomic_sub is constant in the front-end before optimisations which can lead GCC to output a call to __bad_increment_for_ia64_fetch_and_add(). See GCC bugzilla 83653. Signed-off-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-19sparse doesn't support struct randomizationMatthew Wilcox
Without this patch, I drown in a sea of unknown attribute warnings Link: http://lkml.kernel.org/r/20180117024539.27354-1-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-19proc: fix coredump vs read /proc/*/stat raceAlexey Dobriyan
do_task_stat() accesses IP and SP of a task without bumping reference count of a stack (which became an entity with independent lifetime at some point). Steps to reproduce: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/time.h> #include <sys/resource.h> #include <unistd.h> #include <sys/wait.h> int main(void) { setrlimit(RLIMIT_CORE, &(struct rlimit){}); while (1) { char buf[64]; char buf2[4096]; pid_t pid; int fd; pid = fork(); if (pid == 0) { *(volatile int *)0 = 0; } snprintf(buf, sizeof(buf), "/proc/%u/stat", pid); fd = open(buf, O_RDONLY); read(fd, buf2, sizeof(buf2)); close(fd); waitpid(pid, NULL, 0); } return 0; } BUG: unable to handle kernel paging request at 0000000000003fd8 IP: do_task_stat+0x8b4/0xaf0 PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014 RIP: 0010:do_task_stat+0x8b4/0xaf0 Call Trace: proc_single_show+0x43/0x70 seq_read+0xe6/0x3b0 __vfs_read+0x1e/0x120 vfs_read+0x84/0x110 SyS_read+0x3d/0xa0 entry_SYSCALL_64_fastpath+0x13/0x6c RIP: 0033:0x7f4d7928cba0 RSP: 002b:00007ffddb245158 EFLAGS: 00000246 Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24 RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8 CR2: 0000000000003fd8 John Ogness said: for my tests I added an else case to verify that the race is hit and correctly mitigated. Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: "Kohli, Gaurav" <gkohli@codeaurora.org> Tested-by: John Ogness <john.ogness@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>