linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2021-06-11	net: pc300too: move out assignment in if condition	Peng Li
	Should not use assignment in if condition. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: pc300too: fix the code style issue about "foo * bar"	Peng Li
	Fix the checkpatch error as "foo * bar" and should be "foo *bar". Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: pc300too: add blank line after declarations	Peng Li
	This patch fixes the checkpatch error about missing a blank line after declarations. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: pc300too: remove redundant blank lines	Peng Li
	This patch removes some redundant blank lines. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: devres: Correct a grammatical error	gushengxian
	Correct a grammatical error. Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	r8169: avoid link-up interrupt issue on RTL8106e if user enables ASPM	Heiner Kallweit
	It has been reported that on RTL8106e the link-up interrupt may be significantly delayed if the user enables ASPM L1. Per default ASPM is disabled. The change leaves L1 enabled on the PCIe link (thus still allowing to reach higher package power saving states), but the NIC won't actively trigger it. Reported-by: Koba Ko <koba.ko@canonical.com> Tested-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	nfc: fdp: remove unnecessary labels	wengjianfeng
	Some labels are meaningless, so we delete them and use the return statement instead of the goto statement. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	Merge branch 's390-qeyj-next'	David S. Miller
	Julian Wiedmann says: ==================== s390/qeth: updates 2021-06-11 please apply the following patch series for qeth to netdev's net-next tree. This enables TX NAPI for those devices that didn't use it previously, so that we can eventually rip out the qdio layer's internal interrupt machinery. Other than that it's just the normal mix of minor improvements and cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: Consider dependency on SWITCHDEV module	Alexandra Winter
	Without the SWITCHDEV module, the bridgeport attribute LEARNING_SYNC of the physical device (self) does not provide any functionality. Instead of calling the no-op stub version of the switchdev functions, fail the setting of the attribute with an appropriate message. While at it, also add an error message for the 'not supported by HW' case. Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: shrink TX buffer struct	Julian Wiedmann
	Convert the large boolean array into a bitmap, this substantially reduces the struct's size. While at it also clarify the naming. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: remove TX buffer's pointer to its queue	Julian Wiedmann
	qeth_tx_complete_buf() is the only remaining user of buf->q, and the callers can easily provide this as a parameter instead. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: remove QAOB's pointer to its TX buffer	Julian Wiedmann
	Maintaining a pointer inside the aob's user-definable area is fragile and unnecessary. At this stage we only need it to overload the buffer's state field, and to access the buffer's TX queue. The first part is easily solved by tracking the aob's state within the aob itself. This also feels much cleaner and self-contained. For enabling the access to the associated TX queue, we can store the queue's index in the aob. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: consolidate completion of pending TX buffers	Julian Wiedmann
	With commit 396c100472dd ("s390/qdio: let driver manage the QAOB") a pending TX buffer now has access to its associated QAOB during TX completion processing. We can thus reduce the amount of work & state propagation that needs to be done by qeth_qdio_handle_aob(). Move all this logic into the respective TX completion paths. Doing so even allows us to determine more precise TX_NOTIFY_* values via qeth_compute_cq_notification(aob->aorc, ...). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: use ethtool_sprintf()	Julian Wiedmann
	Use a recently introduced helper to fill our ethtool stats strings. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: unify the tracking of active cmds on ccw device	Julian Wiedmann
	We have one field to track _whether_ a cmd is active on a ccw device ('irq_pending'), and one to track _which_ cmd it is ('active_cmd'). Get rid of the irq_pending field, by testing active_cmd for NULL. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: also use TX NAPI for non-IQD devices	Julian Wiedmann
	Set scan_threshold = 0 to opt out from the qdio layer's internal tasklet & timer mechanism for TX completions, and replace it with the TX NAPI infrastructure that qeth already uses for IQD devices. This avoids the fragile logic in qdio_check_output_queue(), enables tighter integration and gives us more tuning options via ethtool in the future. For now we continue to apply the same policy as the qdio layer: scan for completions if 32 TX buffers are in use, or after 1 sec. A re-scan is done after 10 sec, but only if no TX interrupt is pending. With scan_threshold = 0 we no longer get TX completion scans from within qdio_get_next_buffers(). So trigger these manually in qeth_poll() and in the RX path switch to the equivalent qdio_inspect_queue(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	s390/qeth: count TX completion interrupts	Julian Wiedmann
	While the qdio layer already tracks the number of HW interrupts for a device, there's value in understanding how many of them have been raised due to our TX completion logic. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	Merge branch 'sja1110-dsa-tagging'	David S. Miller
	Vladimir Oltean says: ==================== DSA tagging driver for NXP SJA1110 This series adds support for tagging data and control packets on the new NXP SJA1110 switch (supported by the sja1105 driver). Up to this point it used the sja1105 driver, which allowed it to send data packets, but not PDUs as those required by STP and PTP. To accommodate this new tagger which has both a header and a trailer, we need to refactor the entire DSA tagging scheme, to replace the "overhead" concept with separate "needed_headroom" and "needed_tailroom" concepts, so that SJA1110 can declare its need for both. There is also some consolidation work for the receive path of tag_8021q and its callers (sja1105 and ocelot-8021q). Changes in v3: Rebase in front of the "Port the SJA1105 DSA driver to XPCS" series which seems to have stalled for now. Changes in v2: Export the dsa_8021q_rcv and sja1110_process_meta_tstamp symbols to avoid build errors as modules. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: sja1105: implement TX timestamping for SJA1110	Vladimir Oltean
	The TX timestamping procedure for SJA1105 is a bit unconventional because the transmit procedure itself is unconventional. Control packets (and therefore PTP as well) are transmitted to a specific port in SJA1105 using "management routes" which must be written over SPI to the switch. These are one-shot rules that match by destination MAC address on traffic coming from the CPU port, and select the precise destination port for that packet. So to transmit a packet from NET_TX softirq context, we actually need to defer to a process context so that we can perform that SPI write before we send the packet. The DSA master dev_queue_xmit() runs in process context, and we poll until the switch confirms it took the TX timestamp, then we annotate the skb clone with that TX timestamp. This is why the sja1105 driver does not need an skb queue for TX timestamping. But the SJA1110 is a bit (not much!) more conventional, and you can request 2-step TX timestamping through the DSA header, as well as give the switch a cookie (timestamp ID) which it will give back to you when it has the timestamp. So now we do need a queue for keeping the skb clones until their TX timestamps become available. The interesting part is that the metadata frames from SJA1105 haven't disappeared completely. On SJA1105 they were used as follow-ups which contained RX timestamps, but on SJA1110 they are actually TX completion packets, which contain a variable (up to 32) array of timestamps. Why an array? Because: - not only is the TX timestamp on the egress port being communicated, but also the RX timestamp on the CPU port. Nice, but we don't care about that, so we ignore it. - because a packet could be multicast to multiple egress ports, each port takes its own timestamp, and the TX completion packet contains the individual timestamps on each port. This is unconventional because switches typically have a timestamping FIFO and raise an interrupt, but this one doesn't. So the tagger needs to detect and parse meta frames, and call into the main switch driver, which pairs the timestamps with the skbs in the TX timestamping queue which are waiting for one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: sja1105: add the RX timestamping procedure for SJA1110	Vladimir Oltean
	This is really easy, since the full RX timestamp is in the DSA trailer and the tagger code transfers it to SJA1105_SKB_CB(skb)->tstamp, we just need to move it to the skb shared info region. This is as opposed to SJA1105, where the RX timestamp was received in a meta frame (so there needed to be a state machine to pair the 2 packets) and the timestamp was partial (so the packet, once matched with its timestamp, needed to be added to an RX timestamping queue where the PTP aux worker would reconstruct that timestamp). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: add support for the SJA1110 native tagging protocol	Vladimir Oltean
	The SJA1110 has improved a few things compared to SJA1105: - To send a control packet from the host port with SJA1105, one needed to program a one-shot "management route" over SPI. This is no longer true with SJA1110, you can actually send "in-band control extensions" in the packets sent by DSA, these are in fact DSA tags which contain the destination port and switch ID. - When receiving a control packet from the switch with SJA1105, the source port and switch ID were written in bytes 3 and 4 of the destination MAC address of the frame (which was a very poor shot at a DSA header). If the control packet also had an RX timestamp, that timestamp was sent in an actual follow-up packet, so there were reordering concerns on multi-core/multi-queue DSA masters, where the metadata frame with the RX timestamp might get processed before the actual packet to which that timestamp belonged (there is no way to pair a packet to its timestamp other than the order in which they were received). On SJA1110, this is no longer true, control packets have the source port, switch ID and timestamp all in the DSA tags. - Timestamps from the switch were partial: to get a 64-bit timestamp as required by PTP stacks, one would need to take the partial 24-bit or 32-bit timestamp from the packet, then read the current PTP time very quickly, and then patch in the high bits of the current PTP time into the captured partial timestamp, to reconstruct what the full 64-bit timestamp must have been. That is awful because packet processing is done in NAPI context, but reading the current PTP time is done over SPI and therefore needs sleepable context. But it also aggravated a few things: - Not only is there a DSA header in SJA1110, but there is a DSA trailer in fact, too. So DSA needs to be extended to support taggers which have both a header and a trailer. Very unconventional - my understanding is that the trailer exists because the timestamps couldn't be prepared in time for putting them in the header area. - Like SJA1105, not all packets sent to the CPU have the DSA tag added to them, only control packets do: * the ones which match the destination MAC filters/traps in MAC_FLTRES1 and MAC_FLTRES0 * the ones which match FDB entries which have TRAP or TAKETS bits set So we could in theory hack something up to request the switch to take timestamps for all packets that reach the CPU, and those would be DSA-tagged and contain the source port / switch ID by virtue of the fact that there needs to be a timestamp trailer provided. BUT: - The SJA1110 does not parse its own DSA tags in a way that is useful for routing in cross-chip topologies, a la Marvell. And the sja1105 driver already supports cross-chip bridging from the SJA1105 days. It does that by automatically setting up the DSA links as VLAN trunks which contain all the necessary tag_8021q RX VLANs that must be communicated between the switches that span the same bridge. So when using tag_8021q on sja1105, it is possible to have 2 switches with ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and sw1p1, and forwarding will happen according to the expected rules of the Linux bridge. We like that, and we don't want that to go away, so as a matter of fact, the SJA1110 tagger still needs to support tag_8021q. So the sja1110 tagger is a hybrid between tag_8021q for data packets, and the native hardware support for control packets. On RX, packets have a 13-byte trailer if they contain an RX timestamp. That trailer is padded in such a way that its byte 8 (the start of the "residence time" field - not parsed by Linux because we don't care) is aligned on a 16 byte boundary. So the padding has a variable length between 0 and 15 bytes. The DSA header contains the offset of the beginning of the padding relative to the beginning of the frame (and the end of the padding is obviously the end of the packet minus 13 bytes, the length of the trailer). So we discard it. Packets which don't have a trailer contain the source port and switch ID information in the header (they are "trap-to-host" packets). Packets which have a trailer contain the source port and switch ID in the trailer. On TX, the destination port mask and switch ID is always in the trailer, so we always need to say in the header that a trailer is present. The header needs a custom EtherType and this was chosen as 0xdadc, after 0xdada which is for Marvell and 0xdadb which is for VLANs in VLAN-unaware mode on SJA1105 (and SJA1110 in fact too). Because we use tag_8021q in concert with the native tagging protocol, control packets will have 2 DSA tags. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: sja1105: make SJA1105_SKB_CB fit a full timestamp	Vladimir Oltean
	In SJA1105, RX timestamps for packets sent to the CPU are transmitted in separate follow-up packets (metadata frames). These contain partial timestamps (24 or 32 bits) which are kept in SJA1105_SKB_CB(skb)->meta_tstamp. Thankfully, SJA1110 improved that, and the RX timestamps are now transmitted in-band with the actual packet, in the timestamp trailer. The RX timestamps are now full-width 64 bits. Because we process the RX DSA tags in the rcv() method in the tagger, but we would like to preserve the DSA code structure in that we populate the skb timestamp in the port_rxtstamp() call which only happens later, the implication is that we must somehow pass the 64-bit timestamp from the rcv() method all the way to port_rxtstamp(). We can use the skb->cb for that. Rename the meta_tstamp from struct sja1105_skb_cb from "meta_tstamp" to "tstamp", and increase its size to 64 bits. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: tag_8021q: refactor RX VLAN parsing into a dedicated function	Vladimir Oltean
	The added value of this function is that it can deal with both the case where the VLAN header is in the skb head, as well as in the offload field. This is something I was not able to do using other functions in the network stack. Since both ocelot-8021q and sja1105 need to do the same stuff, let's make it a common service provided by tag_8021q. This is done as refactoring for the new SJA1110 tagger, which partly uses tag_8021q as well (just like SJA1105), and will be the third caller. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: tag_8021q: remove shim declarations	Vladimir Oltean
	All users of tag_8021q select it in Kconfig, so shim functions are not needed because it is not possible for it to be disabled and its callers enabled. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: tag_sja1105: stop resetting network and transport headers	Vladimir Oltean
	This makes no sense and is not needed, it is probably a debugging leftover. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: generalize overhead for taggers that use both headers and trailers	Vladimir Oltean
	Some really really weird switches just couldn't decide whether to use a normal or a tail tagger, so they just did both. This creates problems for DSA, because we only have the concept of an 'overhead' which can be applied to the headroom or to the tailroom of the skb (like for example during the central TX reallocation procedure), depending on the value of bool tail_tag, but not to both. We need to generalize DSA to cater for these odd switches by transforming the 'overhead / tail_tag' pair into 'needed_headroom / needed_tailroom'. The DSA master's MTU is increased to account for both. The flow dissector code is modified such that it only calls the DSA adjustment callback if the tagger has a non-zero header length. Taggers are trivially modified to declare either needed_headroom or needed_tailroom, based on the tail_tag value that they currently declare. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: sja1105: allow RX timestamps to be taken on all ports for SJA1110	Vladimir Oltean
	On SJA1105, there is support for a cascade port which is presumably connected to a downstream SJA1105 switch. The upstream one does not take PTP timestamps for packets received on this port, presumably because the downstream switch already did (and for PTP, it only makes sense for the leaf nodes in a DSA switch tree to do that). I haven't been able to validate that feature in a fully assembled setup, so I am disabling the feature by setting the cascade port to an unused port value (ds->num_ports). In SJA1110, multiple cascade ports are supported, and CASC_PORT became a bit mask from a port number. So when CASC_PORT is set to ds->num_ports (which is 11 on SJA1110), it is actually set to 0b1011, so ports 3, 1 and 0 are configured as cascade ports and we cannot take RX timestamps on them. So we need to introduce a check for SJA1110 and set things differently (to zero there), so that the cascading feature is properly disabled and RX timestamps can be taken on all ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: dsa: sja1105: enable the TTEthernet engine on SJA1110	Vladimir Oltean
	As opposed to SJA1105 where there are parts with TTEthernet and parts without, in SJA1110 all parts support it, but it must be enabled in the static config. So enable it unconditionally. We use it for the tc-taprio and tc-gate offload. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	Merge branch 'hns3-ptp'	David S. Miller
	Guangbin Huang says: ==================== net: hns3: add support for PTP This series adds PTP support for the HNS3 ethernet driver. change log: V1 -> V2: 1. use spinlock to prevent concurrency 2. add the handling when ptp_clock_register() returns NULL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: hns3: add debugfs support for ptp info	Huazhong Tan
	Add a debugfs interface for dumping ptp information, which is helpful for debugging. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	net: hns3: add support for PTP	Huazhong Tan
	Adds PTP support for HNS3 ethernet driver. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11	ice: enable transmit timestamps for E810 devices	Jacob Keller
	Add support for enabling Tx timestamp requests for outgoing packets on E810 devices. The ice hardware can support multiple outstanding Tx timestamp requests. When sending a descriptor to hardware, a Tx timestamp request is made by setting a request bit, and assigning an index that represents which Tx timestamp index to store the timestamp in. Hardware makes no effort to synchronize the index use, so it is up to software to ensure that Tx timestamp indexes are not re-used before the timestamp is reported back. To do this, introduce a Tx timestamp tracker which will keep track of currently in-use indexes. In the hot path, if a packet has a timestamp request, an index will be requested from the tracker. Unfortunately, this does require a lock as the indexes are shared across all queues on a PHY. There are not enough indexes to reliably assign only 1 to each queue. For the E810 devices, the timestamp indexes are not shared across PHYs, so each port can have its own tracking. Once hardware captures a timestamp, an interrupt is fired. In this interrupt, trigger a new work item that will figure out which timestamp was completed, and report the timestamp back to the stack. This function loops through the Tx timestamp indexes and checks whether there is now a valid timestamp. If so, it clears the PHY timestamp indication in the PHY memory, locks and removes the SKB and bit in the tracker, then reports the timestamp to the stack. It is possible in some cases that a timestamp request will be initiated but never completed. This might occur if the packet is dropped by software or hardware before it reaches the PHY. Add a task to the periodic work function that will check whether a timestamp request is more than a few seconds old. If so, the timestamp index is cleared in the PHY, and the SKB is released. Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and use the same overall logic for extending to 64 bits of nanoseconds. With this change, E810 devices should be able to perform basic PTP functionality. Future changes will extend the support to cover the E822-based devices. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: enable receive hardware timestamping	Jacob Keller
	Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to requests to enable timestamping support. If the request is for enabling Rx timestamps, set a bit in the Rx descriptors to indicate that receive timestamps should be reported. Hardware captures receive timestamps in the PHY which only captures part of the timer, and reports only 40 bits into the Rx descriptor. The upper 32 bits represent the contents of GLTSYN_TIME_L at the point of packet reception, while the lower 8 bits represent the upper 8 bits of GLTSYN_TIME_0. The networking and PTP stack expect 64 bit timestamps in nanoseconds. To support this, implement some logic to extend the timestamps by using the full PHC time. If the Rx timestamp was captured prior to the PHC time, then the real timestamp is PHC - (lower_32_bits(PHC) - timestamp) If the Rx timestamp was captured after the PHC time, then the real timestamp is PHC + (timestamp - lower_32_bits(PHC)) These calculations are correct as long as neither the PHC timestamp nor the Rx timestamps are more than 2^32-1 nanseconds old. Further, we can detect when the Rx timestamp is before or after the PHC as long as the PHC timestamp is no more than 2^31-1 nanoseconds old. In that case, we calculate the delta between the lower 32 bits of the PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx timestamp must have been captured in the past. If it's smaller, then the Rx timestamp must have been captured after PHC time. Add an ice_ptp_extend_32b_ts function that relies on a cached copy of the PHC time and implements this algorithm to calculate the proper upper 32bits of the Rx timestamps. Cache the PHC time periodically in all of the Rx rings. This enables each Rx ring to simply call the extension function with a recent copy of the PHC time. By ensuring that the PHC time is kept up to date periodically, we ensure this algorithm doesn't use stale data and produce incorrect results. To cache the time, introduce a kworker and a kwork item to periodically store the Rx time. It might seem like we should use the .do_aux_work interface of the PTP clock. This doesn't work because all PFs must cache this time, but only one PF owns the PTP clock device. Thus, the ice driver will manage its own kthread instead of relying on the PTP do_aux_work handler. With this change, the driver can now report Rx timestamps on all incoming packets. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: report the PTP clock index in ethtool .get_ts_info	Jacob Keller
	Now that the driver registers a PTP clock device that represents the clock hardware, it is important that the clock index is reported via the ethtool .get_ts_info callback. The underlying hardware resource is shared between multiple PF functions. Only one function owns the hardware resources associated with a timer, but multiple functions may be associated with it for the purposes of timestamping. To support this, the owning PF will store the clock index into the driver shared parameters buffer in firmware. Other PFs will look up the clock index by reading the driver shared parameter on demand when requested via the .get_ts_info ethtool function. In this way, all functions which are tied to the same timer are able to report the clock index. Userspace software such as ptp4l performs a look up on the netdev to determine the associated clock, and all commands to control or configure the clock will be handled through the controlling PF. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: register 1588 PTP clock device object for E810 devices	Jacob Keller
	Add a new ice_ptp.c file for holding the basic PTP clock interface functions. If the device supports PTP, call the new ice_ptp_init and ice_ptp_release functions where appropriate. If the function owns the hardware resource associated with the PTP hardware clock, register with the PTP_1588_CLOCK infrastructure to allocate a new clock object that represents the device hardware clock. Implement basic functionality for reading and setting the clock time, performing clock adjustments, and adjusting the clock frequency. Future changes will introduce functionality for handling related features including Tx and Rx timestamps. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: add low level PTP clock access functions	Jacob Keller
	Add the ice_ptp_hw.c file and some associated definitions to the ice driver folder. This file contains basic low level definitions for functions that interact with the device hardware. For now, only E810-based devices are supported. The ice hardware supports 2 major variants which have different PHYs with different procedures necessary for interacting with the device clock. Because the device captures timestamps in the PHY, each PHY has its own internal timer. The timers are synchronized in hardware by first preparing the source timer and the PHY timer shadow registers, and then issuing a synchronization command. This ensures that both the source timer and PHY timers are programmed simultaneously. The timers themselves are all driven from the same oscillator source. The functions in ice_ptp_hw.c abstract over the differences between how the PHYs in E810 are programmed vs how the PHYs in E822 devices are programmed. This series only implements E810 support, but E822 support will be added in a future change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: add support for set/get of driver-stored firmware parameters	Jacob Keller
	Depending on the device configuration, the ice hardware may share the PTP hardware clock timer between multiple PFs. Each PF is informed by firmware during initialization of the PTP timer association. When bringing up PTP, only the PFs which own the timer shall allocate a PTP hardware clock. Other PFs associated with that timer must report the correct PTP clock index in order to allow userspace software the ability to know which ports are connected to the same clock. To support this, the firmware has driver shared parameters. These parameters enable one PF to write the clock index into firmware, and have other PFs read the associated value out. This enables the driver to have only a single PF allocate and control the device timer registers, while other PFs associated with that timer can report the correct clock in the ETHTOOL_GET_TS_INFO report. Add support for the necessary admin queue commands to enable reading and writing of the driver shared parameters. This will be used in a future change to enable sharing the PTP clock index between PF drivers. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: process 1588 PTP capabilities during initialization	Jacob Keller
	The device firmware reports PTP clock capabilities to each PF during initialization. This includes various information for both the overall device and the individual function, including For functions: * whether this function has timesync enabled * whether this function owns one of the 2 possible clock timers, and which one * which timer the function is associated with * the clock frequency, if the device supports multiple clock frequencies * The GPIO pin association for the timer owned by this PF, if any For the device: * Which PF owns timer 0, if any * Which PF owns timer 1, if any * whether timer 0 is enabled * whether timer 1 is enabled Extract the bits from the capabilities information reported by firmware and store them in the device and function capability structures.o This information will be used in a future change to have the function driver enable PTP hardware clock support. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11	ice: add support for sideband messages	Jacob Keller
	In order to support certain device features, including enabling the PTP hardware clock, the ice driver needs to control some registers on the device PHY. These registers are accessed by sending sideband messages. For some hardware, these messages must be sent over the device admin queue, while other hardware has a dedicated control queue for the sideband messages. Add the neighbor device message structure for sending a message to the neighboring device. Where supported, initialize the sideband control queue and handle cleanup. Add a wrapper function for sending sideband control queue messages that read or write a neighboring device register. Because some devices send sideband messages over the AdminQ, also increase the length of the admin queue to allow more messages to be queued up. This is important because the sideband messages add additional pressure on the AQ usage. This support will be used in following patches to enable support for CONFIG_1588_PTP_CLOCK. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-10	Merge branch 'ipa-mem-2'	David S. Miller
	Alex Elder says: ==================== net: ipa: memory region rework, part 2 This is the second portion of a set of patches updating the IPA memory region code. In this portion (part 2), the focus is on adjusting the code so that it no longer assumes the memory region descriptor array is indexed by the region identifier. This brings with it some related cleanup. Three loops are changed so their loop index variable is an unsigned rather than an enumerated type. A set of functions is changed so a region identifier (rather than a memory region descriptor pointer) is passed as argument, to simplify their call sites. This isn't entirely related or required, but I think it improves the code. A validation function for filter and route table memory regions is changed to take memory region IDs, rather than determining which region to validate based on a set of Boolean flags. Finally, ipa_mem_find() is created to abstract getting a memory descriptor based on its ID, and it is used everywhere rather than indexing the array. With that implemented, all of the memory regions can be defined by arrays of entries defined without providing index designators. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: don't index mem data array by ID	Alex Elder
	Finally the code handles the IPA memory region array in the configuration data without assuming it is indexed by region ID. Get rid of the array index designators where these arrays are initialized. As a result, there's no more need to define an explicitly undefined memory region ID, so get rid of that. Change ipa_mem_find() so it no longer assumes the ipa->mem[] array is indexed by memory region ID. Instead, have it search the array for the entry having the requested memory ID, and return the address of the descriptor if found. Otherwise return NULL. Stop allowing memory regions to be defined with zero size and zero canary value. Check for this condition in ipa_mem_valid_one(). As a result, it is not necessary to check for this case in ipa_mem_config(). Finally, there is no need for IPA_MEM_UNDEFINED to be defined any more, so get rid of it. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: introduce ipa_mem_find()	Alex Elder
	Introduce a new function that abstracts finding information about a region in IPA-local memory, given its memory region ID. For now it simply uses the region ID as an index into the IPA memory array. If the region is not defined, ipa_mem_find() returns a null pointer. Update all code that accesses the ipa->mem[] array directly to use ipa_mem_find() instead. The return value must be checked for null when optional memory regions are sought. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: pass memory id to ipa_table_valid_one()	Alex Elder
	Stop passing most of the Boolean flags to ipa_table_valid_one(), and just pass a memory region ID to it instead. We still need to indicate whether we're operating on a routing or filter table. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: pass mem_id to ipa_table_reset_add()	Alex Elder
	Pass a memory region ID rather than the address of a memory region descriptor to ipa_table_reset_add() to simplify callers. Similarly, pass memory region IDs to ipa_table_init_add(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: pass mem ID to ipa_mem_zero_region_add()	Alex Elder
	Pass a memory region ID rather than the address of a memory region descriptor to ipa_mem_zero_region_add() to simplify callers. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: pass mem_id to ipa_filter_reset_table()	Alex Elder
	Pass a memory region ID rather than the address of a memory region descriptor to ipa_filter_reset_table(), to simplify callers. We can eliminate the check for a zero region size in this function because ipa_table_reset_add() checks that before adding anything to the transaction. Note that here and in subsequent commits there is no need to check whether a memory region exists, because we will have already verified that during initialization. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: clean up header memory validation	Alex Elder
	Do some general cleanup in ipa_cmd_header_valid(): - Delay assigning the mem variable until just before it's used. - Assign the maximum offset and size values together. - Improve comments explaining the single range of memory being made up of a modem portion and an AP portion. - Record the offset of the combined range in a local variable. - Do the initial size assignment right after assigning the offset. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	net: ipa: don't assume mem array indexed by ID	Alex Elder
	Change ipa_mem_valid() to iterate over the entries using a u32 index variable rather than using a memory region ID. Use the ID found inside the memory descriptor rather than the loop index. Change ipa_mem_size_valid() to iterate over the entries but without assuming the array index is the memory region ID. "Empty" entries will have zero size; and we'll temporarily assume such entries have zero offset as well (they all do, currently). Similarly, don't assume the mem[] array is indexed by ID in ipa_mem_config(). There, "empty" entries will have a zero canary count, so no special assumptions are needed to handle them correctly. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	ibmvnic: Allow device probe if the device is not ready at boot	Cristobal Forno
	Allow the device to be initialized at a later time if it is not available at boot. The device will be allowed to probe but will be given a "down" state. After completing device probe and registering the net device, the driver will await an interrupt signal from its partner device, indicating that it is ready for boot. The driver will schedule a work event to perform the necessary procedure and begin operation. Co-developed-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com> Acked-by: Lijun Pan <lijunp213@gmail.com> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10	Merge branch 'marvell-prestera-lag'	David S. Miller
	Vadym Kochan says: ==================== net: marvell: prestera: add LAG support The following features are supported: - LAG basic operations - create/delete LAG - add/remove a member to LAG - enable/disable member in LAG - LAG Bridge support - LAG VLAN support - LAG FDB support Limitations: - Only HASH lag tx type is supported - The Hash parameters are not configurable. They are applied during the LAG creation stage. - Enslaving a port to the LAG device that already has an upper device is not supported. Changes extracted from: https://lkml.org/lkml/2021/2/3/877 and marked with "v2". v2: There are 2 additional preparation patches which simplifies the netdev topology handling. 1) Initialize 'lag' with NULL in prestera_lag_create() [suggested by Vladimir Oltean] 2) Use -ENOSPC in prestera_lag_port_add() if max lag [suggested by Vladimir Oltean] numbers were reached. 3) Do not propagate netdev events to prestera_switchdev [suggested by Vladimir Oltean] but call bridge specific funcs. It simplifies the code. 4) Check on info->link_up in prestera_netdev_port_lower_event() [suggested by Vladimir Oltean] 5) Return -EOPNOTSUPP in prestera_netdev_port_event() in case [suggested by Vladimir Oltean] LAG hashing mode is not supported. 6) Do not pass "lower" netdev to bridge join/leave functions. [suggested by Vladimir Oltean] It is not need as offloading settings applied on particular physical port. It requires to do extra upper dev lookup in case port is in the LAG which is in the bridge on vlans add/del. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>