Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Borislav Petkov:
- Fix a CONFIG symbol's spelling
* tag 'locking_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rtmutex: Use the correct rtmutex debugging config option
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Borislav Petkov:
"A batch of fixes for the arm64 stub image loader:
- fix a logic bug that can make the random page allocator fail
spuriously
- force reallocation of the Image when it overlaps with firmware
reserved memory regions
- fix an oversight that defeated on optimization introduced earlier
where images loaded at a suitable offset are never moved if booting
without randomization
- complain about images that were not loaded at the right offset by
the firmware image loader"
* tag 'efi_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub: arm64: Double check image alignment at entry
efi/libstub: arm64: Warn when efi_random_alloc() fails
efi/libstub: arm64: Relax 2M alignment again for relocatable kernels
efi/libstub: arm64: Force Image reallocation if BSS was not reserved
arm64: efi: kaslr: Fix occasional random alloc (and boot) failure
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"Two fixes:
- An objdump checker fix to ignore parenthesized strings in the
objdump version
- Fix resctrl default monitoring groups reporting when new subgroups
get created"
* tag 'x86_urgent_for_v5.14_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Fix default monitoring groups reporting
x86/tools: Fix objdump version check again
|
|
Pull KVM fixes from Paolo Bonzini:
"ARM:
- Plug race between enabling MTE and creating vcpus
- Fix off-by-one bug when checking whether an address range is RAM
x86:
- Fixes for the new MMU, especially a memory leak on hosts with <39
physical address bits
- Remove bogus EFER.NX checks on 32-bit non-PAE hosts
- WAITPKG fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86/mmu: Protect marking SPs unsync when using TDP MMU with spinlock
KVM: x86/mmu: Don't step down in the TDP iterator when zapping all SPTEs
KVM: x86/mmu: Don't leak non-leaf SPTEs when zapping all SPTEs
KVM: nVMX: Use vmx_need_pf_intercept() when deciding if L0 wants a #PF
kvm: vmx: Sync all matching EPTPs when injecting nested EPT fault
KVM: x86: remove dead initialization
KVM: x86: Allow guest to set EFER.NX=1 on non-PAE 32-bit kernels
KVM: VMX: Use current VMCS to query WAITPKG support for MSR emulation
KVM: arm64: Fix race when enabling KVM_ARM_CAP_MTE
KVM: arm64: Fix off-by-one in range_is_memory
|
|
The 2021-model XPS 15 appears to use the same 4-speakers-on-ALC289 audio
setup as the Precision models, so requires the same quirk to enable woofer
output. Tested on my own 9510.
Signed-off-by: Kristin Paget <kristin@tombom.co.uk>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/e1fc95c5-c10a-1f98-a5c2-dd6e336157e1@tombom.co.uk
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three minor fixes, all in drivers"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Fix incorrectly assigned error return and check
scsi: storvsc: Log TEST_UNIT_READY errors as warnings
scsi: lpfc: Move initialization of phba->poll_list earlier to avoid crash
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A couple of fixes for long standing bugs, a warning fixup, and some
miscellaneous dax cleanups.
The bugs were recently found due to new platforms looking to use the
ACPI NFIT "virtual" device definition, and new error injection
capabilities to trigger error responses to label area requests. Ira's
cleanups have been long pending, I neglected to send them earlier, and
see no harm in including them now. This has all appeared in -next with
no reported issues.
Summary:
- Fix support for NFIT "virtual" ranges (BIOS-defined memory disks)
- Fix recovery from failed label storage areas on NVDIMM devices
- Miscellaneous cleanups from Ira's investigation of
dax_direct_access paths preparing for stray-write protection"
* tag 'libnvdimm-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
tools/testing/nvdimm: Fix missing 'fallthrough' warning
libnvdimm/region: Fix label activation vs errors
ACPI: NFIT: Fix support for virtual SPA ranges
dax: Ensure errno is returned from dax_direct_access
fs/dax: Clarify nr_pages to dax_direct_access()
fs/fuse: Remove unneeded kaddr parameter
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
"A single revert of a commit that caused problems in 5.14-rc5 for
5.14-rc6. It has been in linux-next almost all week, and has resolved
the issues that were reported on lots of different systems that were
not the platform that the change was originally tested on (gotta love
SoC cores used in multiple devices from multiple vendors...)"
* tag 'usb-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
Revert "usb: dwc3: gadget: Use list_replace_init() before traversing lists"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO driver fixes from Greg KH:
"Here are some small IIO driver fixes for reported problems for
5.14-rc6 (no staging driver fixes at the moment).
All of them resolve reported issues and have been in linux-next all
week with no reported problems. Full details are in the shortlog"
* tag 'staging-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: adc: Fix incorrect exit of for-loop
iio: humidity: hdc100x: Add margin to the conversion time
dt-bindings: iio: st: Remove wrong items length check
iio: accel: fxls8962af: fix i2c dependency
iio: adis: set GPIO reset pin direction
iio: adc: ti-ads7950: Ensure CS is deasserted after reading channels
iio: accel: fxls8962af: fix potential use of uninitialized symbol
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"One driver bugfix, a documentation bugfix, and an "uninitialized data"
leak fix for the core"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Documentation: i2c: add i2c-sysfs into index
i2c: dev: zero out array used for i2c reads from userspace
i2c: iproc: fix race between client unreg and tasklet
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"A small cleanup patch and a fix of a rare race in the Xen evtchn
driver"
* tag 'for-linus-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events: Fix race in set_evtchn_to_irq
xen/events: remove redundant initialization of variable irq
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- avoid passing -mno-relax to compilers that don't support it
- a comment fix
* tag 'riscv-for-linus-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix comment regarding kernel mapping overlapping with IS_ERR_VALUE
riscv: kexec: do not add '-mno-relax' flag if compiler doesn't support it
|
|
Pull configfs fix from Christoph Hellwig:
- fix to revert to the historic write behavior (Bart Van Assche)
* tag 'configfs-5.14' of git://git.infradead.org/users/hch/configfs:
configfs: restore the kernel v5.13 text attribute write behavior
|
|
commit <47595e32869f> ("<MAINTAINERS: Mark some staging directories>")
indicated the ipx network layer as obsolete in Jan 2018,
updated in the MAINTAINERS file.
now, after being exposed for 3 years to refactoring, so to
remove the ipx network layer info from MAINTAINERS.
additionally, there is no module that depends on ipx.h
except a broken staging driver(r8188eu)
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit <47595e32869f> ("<MAINTAINERS: Mark some staging directories>")
indicated the ipx network layer as obsolete in Jan 2018,
updated in the MAINTAINERS file
now, after being exposed for 3 years to refactoring, so to
delete uapi/linux/ipx.h and net/ipx.h header files for good.
additionally, there is no module that depends on ipx.h except
a broken staging driver(r8188eu)
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alex Elder says:
====================
net: ipa: last things before PM conversion
This series contains a few remaining changes needed before fully
switching over to using runtime power management rather than the
previous "IPA clock" mechanism.
The first patch moves the calls to enable and disable the IPA
interrupt as a system wakeup interrupt into "ipa_clock.c" with the
rest of the power-related code.
The second adds a flag to make it possible to distinguish runtime
suspend from system suspend.
The third and fourth patches arrange for the ->start_xmit path to
resume hardware if necessary, to ensure it is powered. If power is
not active, the TX queue is stopped, and arrangements are made for
the queue to be restarted once hardware power is active again.
The fifth patch keeps the TX queue active during suspend. This
isn't necessary for system suspend but it's important for runtime
suspend.
And the last patch makes it so we don't hold the hardware active
while the modem network device is open.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently a clock reference is taken whenever the ->ndo_open
callback for the modem netdev is called. That reference is dropped
when the device is closed, in ipa_stop().
We no longer need this, because ipa_start_xmit() now handles the
situation where the hardware power state is not active.
Drop the clock reference in ipa_open() when we're done, and take a
new reference in ipa_stop() before we begin closing the interface.
Finally (and unrelated, but trivial), change the return type of
ipa_start_xmit() to be netdev_tx_t instead of int.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently we stop the modem netdev transmit queue when suspending
the hardware. For system suspend this ensured we'd never attempt
to transmit while attempting to suspend the modem endpoints.
For runtime suspend, the IPA hardware might get suspended while the
system is operating. In that case we want an attempt to transmit a
packet to cause the hardware to resume if necessary. But if we
disable the queue this cannot happen.
So stop disabling the queue on suspend. In case we end up disabling
it in ipa_start_xmit() (see the previous commit), we still arrange
to start the TX queue on resume.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We need to ensure the hardware is powered when we transmit a packet.
But if it's not, we can't block to wait for it. So asynchronously
request power in ipa_start_xmit(), and only proceed if the return
value indicates the power state is active.
If the hardware is not active, a runtime resume request will have
been initiated. In that case, stop the network stack from further
transmit attempts until the resume completes. Return NETDEV_TX_BUSY,
to retry sending the packet once the queue is restarted.
If the power request returns an error (other than -EINPROGRESS,
which just means a resume requested elsewhere isn't complete), just
drop the packet.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Create a new work structure in the modem private data, and use it to
re-enable the modem network device transmit queue when resuming.
This is needed by the next patch, which stops the TX queue if IPA
power isn't active when a transmit request arrives. Packets will
start arriving the instant the TX queue is enabled, but resuming
isn't complete until ipa_modem_resume() returns. This way we're
sure to be resumed before transmits are allowed again.
Cancel it before calling ipa_stop() in ipa_modem_stop() to ensure
the transmit queue restart completes before it gets stopped there.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new flag that is set when the hardware is suspended due to a
system suspend operation, distingishing it from runtime suspend.
Use it in the SUSPEND IPA interrupt handler to determine whether to
trigger a system resume because of the event. Define new suspend
and resume power management callback functions to set and clear the
new flag, respectively.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Move the call to enable the IPA interrupt as a wakeup interrupt into
ipa_power_setup(), disable it in ipa_power_teardown().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Nikolay Aleksandrov says:
====================
net: bridge: mcast: dump querier state
This set adds the ability to dump the current multicast querier state.
This is extremely useful when debugging multicast issues, we've had
many cases of unexpected queriers causing strange behaviour and mcast
test failures. The first patch changes the querier struct to record
a port device's ifindex instead of a pointer to the port itself so we
can later retrieve it, I chose this way because it's much simpler
and doesn't require us to do querier port ref counting, it is best
effort anyway. Then patch 02 makes the querier address/port updates
consistent via a combination of multicast_lock and seqcount, so readers
can only use seqcount to get a consistent snapshot of address and port.
Patch 03 is a minor cleanup in preparation for the dump support, it
consolidates IPv4 and IPv6 querier selection paths as they share most of
the logic (except address comparisons of course). Finally the last three
patches add the new querier state dumping support, for the bridge's
global multicast context we embed the BRIDGE_QUERIER_xxx attributes
into IFLA_BR_MCAST_QUERIER_STATE and for the per-vlan global mcast
contexts we embed them into BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE.
The structure is:
[IFLA_BR_MCAST_QUERIER_STATE / BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE]
`[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier
`[BRIDGE_QUERIER_IP_PORT] - bridge port ifindex where the querier was
seen (set only if external querier)
`[BRIDGE_QUERIER_IP_OTHER_TIMER] - other querier timeout
`[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier
`[BRIDGE_QUERIER_IPV6_PORT] - bridge port ifindex where the querier
was seen (set only if external querier)
`[BRIDGE_QUERIER_IPV6_OTHER_TIMER] - other querier timeout
Later we can also add IGMP version of seen queriers and last seen values
from the queries.
====================
|
|
Use the new mcast querier state dump infrastructure and export vlans'
mcast context querier state embedded in attribute
BRIDGE_VLANDB_GOPTS_MCAST_QUERIER_STATE.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for dumping global IPv6 querier state, we dump the state
only if our own querier is enabled or there has been another external
querier which has won the election. For the bridge global state we use
a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
The structure is:
[IFLA_BR_MCAST_QUERIER_STATE]
`[BRIDGE_QUERIER_IPV6_ADDRESS] - ip address of the querier
`[BRIDGE_QUERIER_IPV6_PORT] - bridge port ifindex where the querier
was seen (set only if external querier)
`[BRIDGE_QUERIER_IPV6_OTHER_TIMER] - other querier timeout
IPv4 and IPv6 attributes are embedded at the same level of
IFLA_BR_MCAST_QUERIER_STATE. If we didn't dump anything we cancel the nest
and return.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for dumping global IPv4 querier state, we dump the state
only if our own querier is enabled or there has been another external
querier which has won the election. For the bridge global state we use
a new attribute IFLA_BR_MCAST_QUERIER_STATE and embed the state inside.
The structure is:
[IFLA_BR_MCAST_QUERIER_STATE]
`[BRIDGE_QUERIER_IP_ADDRESS] - ip address of the querier
`[BRIDGE_QUERIER_IP_PORT] - bridge port ifindex where the querier was
seen (set only if external querier)
`[BRIDGE_QUERIER_IP_OTHER_TIMER] - other querier timeout
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can consolidate both functions as they share almost the same logic.
This is easier to maintain and we have a single querier update function.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use a sequence counter to make sure port/address updates can be read
consistently without requiring the bridge multicast_lock. We need to
zero out the port and address when the other querier has expired and
we're about to select ourselves as querier. br_multicast_read_querier
will be used later when dumping querier state. Updates are done only
with the multicast spinlock and softirqs disabled, while reads are done
from process context and from softirqs (due to notifications).
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently when a querier port is detected its net_bridge_port pointer is
recorded, but it's used only for comparisons so it's fine to have stale
pointer, in order to dereference and use the port pointer a proper
accounting of its usage must be implemented adding unnecessary
complexity. To solve the problem we can just store the netdevice ifindex
instead of the port pointer and retrieve the bridge port. It is a best
effort and the device needs to be validated that is still part of that
bridge before use, but that is small price to pay for avoiding querier
reference counting for each port/vlan.
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Leon Romanovsky says:
====================
Devlink cleanup for delay event series
Jakub's request to make sure that devlink events are delayed and not
printed till they fully accessible [1] requires us to implement delayed
event notification system in the devlink.
In order to do it, I moved some of my patches (xarray e.t.c) from the future
series to be before "Move devlink_register to be near devlink_reload_enable" [2].
That allows us to rely on DEVLINK_REGISTERED xarray mark to decide if to print
event or not.
Other patches are simple cleanup which is needed anyway.
[1] https://lore.kernel.org/lkml/20210811071817.4af5ab34@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com
[2] https://lore.kernel.org/lkml/cover.1628599239.git.leonro@nvidia.com
Next in the queue:
* Delay event series
* Move devlink_register to be near devlink_reload_enable"
* Extension of devlink_ops to be set dynamically
* devlink_reload_* delete
* Devlink locks rework to user xarray and reference counting
* ????
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The devlink pointer always exists after hclge_devlink_init() succeed.
Remove that check together with NULL setting after release and ensure
that devlink_register is last command prior to call to devlink_reload_enable().
Fixes: b741269b2759 ("net: hns3: add support for registering devlink for PF")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The { 0 } doesn't clear all fields in the struct, but tells to the
compiler to set all fields to zero and doesn't touch any sub-fields
if they exists.
The {} is an empty initialiser that instructs to fully initialize whole
struct including sub-fields, which is error-prone for future
devlink_flash_notify extensions.
Fixes: 6700acc5f1fe ("devlink: collect flash notify params into a struct")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We can use xarray instead of linearly organized linked lists for the
devlink instances. This will let us revise the locking scheme in favour
of internal xarray locking that protects database.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The struct devlink itself is protected by internal lock and doesn't
need global lock during operation. That global lock is used to protect
addition/removal new devlink instances from the global list in use by
all devlink consumers in the system.
The future conversion of linked list to be xarray will allow us to
actually delete that lock, but first we need to count all struct devlink
users.
The reference counting provides us a way to ensure that no new user
space commands success to grab devlink instance which is going to be
destroyed makes it is safe to access it without lock.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Devlink objects are accessible only after they were registered and
have valid devlink_*->devlink pointers.
Remove that check and simplify respective fill functions as an outcome
of such change.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The devlink_pernet_pre_exit() will be called if net namespace exits.
That routine is relevant for devlink instances that were assigned to
that namespaces first. This assignment is possible only with the following
command: "devlink reload DEV netns ...", which already checks reload support.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Mat Martineau says:
====================
mptcp: Improve use of backup subflows
Multipath TCP combines multiple TCP subflows in to one stream, and the
MPTCP-level socket must decide which subflow to use when sending (or
resending) chunks of data. The choice of the "best" subflow to transmit
on can vary depending on the priority (normal or backup) for each
subflow and how well the subflow is performing.
In order to improve MPTCP performance when some subflows are failing,
this patch set changes how backup subflows are utilized and introduces
tracking of "stale" subflows that are still connected but not making
progress.
Patch 1 adjusts MPTCP-level retransmit timeouts to use data from all
subflows.
Patch 2 makes MPTCP-level retransmissions less aggressive to avoid
resending data that's still queued at the TCP level.
Patch 3 changes the way pending data is handled when subflows are
closed. Unacked MPTCP-level data still in the subflow tx queue is
immediately moved to another subflow for transmission instead of waiting
for MPTCP-level timeouts to trigger retransmission.
Patch 4 has some sysctl code cleanup.
Patches 5 and 6 add tracking of "stale" subflows, so only underlying TCP
subflow connections that appear to be making progress are considered
when selecting a subflow to (re)transmit data. How fast a subflow goes
stale is configurable with a per-namespace sysctl. Related MIBS are
added too.
Patch 7 makes sure the backup flag is always correctly recorded when the
MP_JOIN SYN/ACK is received for an added subflow.
Patch 8 adds more test cases for backup subflows and stale subflows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add more test-case for link failures scenario,
including recovery from link failure using only
backup subflows and bi-directional transfer.
Additionally explicitly check for stale count
Co-developed-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
the parsed incoming backup flag is not propagated
to the subflow itself, the client may end-up using it
to send data.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/191
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This allows monitoring exceptional events like
active backup scenarios.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The msk can use backup subflows to transmit in-sequence data
only if there are no other active subflow. On active backup
scenario, the MPTCP connection can do forward progress only
due to MPTCP retransmissions - rtx can pick backup subflows.
This patch introduces a new flag flow MPTCP subflows: if the
underlying TCP connection made no progresses for long time,
and there are other less problematic subflows available, the
given subflow become stale.
Stale subflows are not considered active: if all non backup
subflows become stale, the MPTCP scheduler can pick backup
subflows for plain transmissions.
Stale subflows can return in active state, as soon as any reply
from the peer is observed.
Active backup scenarios can now leverage the available b/w
with no restrinction.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reorder the data in mptcp_pernet to avoid wasting space
with no reasons and constify the access helpers.
No functional changes intended.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PM can close active subflow, e.g. due to ingress RM_ADDR
option. Such subflow could carry data still unacked at the
MPTCP-level, both in the write and the rtx_queue, which has
never reached the other peer.
Currently the mptcp-level retransmission will deliver such data,
but at a very low rate (at most 1 DSM for each MPTCP rtx interval).
We can speed-up the recovery a lot, moving all the unacked in the
tcp write_queue, so that it will be pushed again via other
subflows, at the speed allowed by them.
Also make available the new helper for later patches.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The current mptcp re-inject strategy is very aggressive,
we have mptcp-level retransmissions even on single subflow
connection, if the link in-use is lossy.
Let's be a little more conservative: we do retransmit
only if at least a subflow has write and rtx queue empty.
Additionally use the backup subflows only if the active
subflows are stale - no progresses in at least an rtx period
and ignore stale subflows for rtx timeout update
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As reported by Maxim, we have a lot of MPTCP-level
retransmissions when multilple links with different latencies
are in use.
This patch refactor the mptcp-level timeout accounting so that
the maximum of all the active subflow timeout is used. To avoid
traversing the subflow list multiple times, the update is
performed inside the packet scheduler.
Additionally clean-up a bit timeout handling.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Merge misc fixes from Andrew Morton:
"7 patches.
Subsystems affected by this patch series: mm (kasan, mm/slub,
mm/madvise, and memcg), and lib"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
lib: use PFN_PHYS() in devmem_is_allowed()
mm/memcg: fix incorrect flushing of lruvec data in obj_stock
mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)
mm: slub: fix slub_debug disabling for list of slabs
slub: fix kmalloc_pagealloc_invalid_free unit test
kasan, slub: reset tag when printing address
kasan, kmemleak: reset tags when scanning block
|
|
The 'imply' keyword does not do what most people think it does, it only
politely asks Kconfig to turn on another symbol, but does not prevent
it from being disabled manually or built as a loadable module when the
user is built-in. In the ICE driver, the latter now causes a link failure:
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_eth_ioctl':
ice_main.c:(.text+0x13b0): undefined reference to `ice_ptp_get_ts_config'
ice_main.c:(.text+0x13b0): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_get_ts_config'
aarch64-linux-ld: ice_main.c:(.text+0x13bc): undefined reference to `ice_ptp_set_ts_config'
ice_main.c:(.text+0x13bc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_set_ts_config'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_prepare_for_reset':
ice_main.c:(.text+0x31fc): undefined reference to `ice_ptp_release'
ice_main.c:(.text+0x31fc): relocation truncated to fit: R_AARCH64_CALL26 against undefined symbol `ice_ptp_release'
aarch64-linux-ld: drivers/net/ethernet/intel/ice/ice_main.o: in function `ice_rebuild':
This is a recurring problem in many drivers, and we have discussed
it several times befores, without reaching a consensus. I'm providing
a link to the previous email thread for reference, which discusses
some related problems.
To solve the dependency issue better than the 'imply' keyword, introduce a
separate Kconfig symbol "CONFIG_PTP_1588_CLOCK_OPTIONAL" that any driver
can depend on if it is able to use PTP support when available, but works
fine without it. Whenever CONFIG_PTP_1588_CLOCK=m, those drivers are
then prevented from being built-in, the same way as with a 'depends on
PTP_1588_CLOCK || !PTP_1588_CLOCK' dependency that does the same trick,
but that can be rather confusing when you first see it.
Since this should cover the dependencies correctly, the IS_REACHABLE()
hack in the header is no longer needed now, and can be turned back
into a normal IS_ENABLED() check. Any driver that gets the dependency
wrong will now cause a link time failure rather than being unable to use
PTP support when that is in a loadable module.
However, the two recently added ptp_get_vclocks_index() and
ptp_convert_timestamp() interfaces are only called from builtin code with
ethtool and socket timestamps, so keep the current behavior by stubbing
those out completely when PTP is in a loadable module. This should be
addressed properly in a follow-up.
As Richard suggested, we may want to actually turn PTP support into a
'bool' option later on, preventing it from being a loadable module
altogether, which would be one way to solve the problem with the ethtool
interface.
Fixes: 06c16d89d2cb ("ice: register 1588 PTP clock device object for E810 devices")
Link: https://lore.kernel.org/netdev/20210804121318.337276-1-arnd@kernel.org/
Link: https://lore.kernel.org/netdev/CAK8P3a06enZOf=XyZ+zcAwBczv41UuCTz+=0FMf2gBz1_cOnZQ@mail.gmail.com/
Link: https://lore.kernel.org/netdev/CAK8P3a3=eOxE-K25754+fB_-i_0BZzf9a9RfPTX3ppSwu9WZXw@mail.gmail.com/
Link: https://lore.kernel.org/netdev/20210726084540.3282344-1-arnd@kernel.org/
Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210812183509.1362782-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull cifs fixes from Steve French:
"Four CIFS/SMB3 Fixes, all for stable, two relating to deferred close,
and one for the 'modefromsid' mount option (when 'idsfromsid' not
specified)"
* tag '5.14-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Call close synchronously during unlink/rename/lease break.
cifs: Handle race conditions during rename
cifs: use the correct max-length for dentry_path_raw()
cifs: create sd context must be a multiple of 8
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
"A single patch to sgx test to fix Q1 and Q2 calculation"
* tag 'linux-kselftest-fixes-5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/sgx: Fix Q1 and Q2 calculation in sigstruct.c
|
|
Internal tests found out that the latest code doesn't bring up 1PPS out
as expected. As a result of incorrect define used to round the time up
the time was round down to the past second boundary.
Fix define used for rounding to properly round up to the next Top of
second in ice_ptp_cfg_clkout to fix it.
Fixes: 172db5f91d5f ("ice: add support for auxiliary input/output pins")
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20210813165018.2196013-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|