summaryrefslogtreecommitdiff
path: root/drivers/net/dsa/sja1105/sja1105_ptp.c
AgeCommit message (Collapse)Author
2020-08-03net: dsa: sja1105: poll for extts events from a timerVladimir Oltean
The current poll interval is enough to ensure that rising and falling edge events are not lost for a 1 PPS signal with 50% duty cycle. But when we deliver the events to user space, it will try to infer if they were corresponding to a rising or to a falling edge (the kernel driver doesn't know that either). User space will try to make that inference based on the time at which the PPS master had emitted the pulse (i.e. if it's a .0 time, it's rising edge, if it's .5 time, it's falling edge). But there is no in-kernel API for retrieving the precise timestamp corresponding to a PPS master (aka perout) pulse. So user space has to guess even that. It will read the PTP time on the PPS master right after we've delivered the extts event, and declare that the PPS master time was just the closest integer second, based on 2 thresholds (lower than .25, or higher than .75, and ignore anything else). Except that, if we poll for extts events (and our hardware doesn't really help us, by not providing an interrupt), then there is a risk that the poll period (and therefore the time at which the event is delivered) might confuse user space. Because we are always scheduling the next extts poll at SJA1105_EXTTS_INTERVAL "from now" (that's the only thing that the schedule_delayed_work() API gives us), it means that the start time of the next delayed workqueue will always be shifted to the right a little bit (shifted with the SPI access duration of this workqueue run). In turn, because user space sees extts events that are non-periodic compared to the PPS master's time, this means that it might start making wrong guesses about rising/falling edge. To understand the effect, here is the output of ts2phc currently. Notice the 'src' timestamps of the 'SKIP extts' events, and how they have a large wander. They keep increasing until the upper limit for the ignore threshold (.75 seconds), after which the application starts ignoring the _other_ edge. ts2phc[26.624]: /dev/ptp3 SKIP extts index 0 at 21.449898912 src 21.657784518 ts2phc[27.133]: adding tstamp 21.949894240 to clock /dev/ptp3 ts2phc[27.133]: adding tstamp 22.000000000 to clock /dev/ptp1 ts2phc[27.133]: /dev/ptp3 offset 640 s2 freq +5112 ts2phc[27.636]: /dev/ptp3 SKIP extts index 0 at 22.449889360 src 22.669398022 ts2phc[28.140]: adding tstamp 22.949884376 to clock /dev/ptp3 ts2phc[28.140]: adding tstamp 23.000000000 to clock /dev/ptp1 ts2phc[28.140]: /dev/ptp3 offset 96 s2 freq +4760 ts2phc[28.644]: /dev/ptp3 SKIP extts index 0 at 23.449879504 src 23.677420422 ts2phc[29.153]: adding tstamp 23.949874704 to clock /dev/ptp3 ts2phc[29.153]: adding tstamp 24.000000000 to clock /dev/ptp1 ts2phc[29.153]: /dev/ptp3 offset -264 s2 freq +4429 ts2phc[29.656]: /dev/ptp3 SKIP extts index 0 at 24.449870008 src 24.689407238 ts2phc[30.160]: adding tstamp 24.949865376 to clock /dev/ptp3 ts2phc[30.160]: adding tstamp 25.000000000 to clock /dev/ptp1 ts2phc[30.160]: /dev/ptp3 offset -280 s2 freq +4334 ts2phc[30.664]: /dev/ptp3 SKIP extts index 0 at 25.449860760 src 25.697449926 ts2phc[31.168]: adding tstamp 25.949856176 to clock /dev/ptp3 ts2phc[31.168]: adding tstamp 26.000000000 to clock /dev/ptp1 ts2phc[31.168]: /dev/ptp3 offset -176 s2 freq +4354 ts2phc[31.672]: /dev/ptp3 SKIP extts index 0 at 26.449851584 src 26.705433606 ts2phc[32.180]: adding tstamp 26.949846992 to clock /dev/ptp3 ts2phc[32.180]: adding tstamp 27.000000000 to clock /dev/ptp1 ts2phc[32.180]: /dev/ptp3 offset -80 s2 freq +4397 ts2phc[32.684]: /dev/ptp3 SKIP extts index 0 at 27.449842384 src 27.717415110 ts2phc[33.192]: adding tstamp 27.949837768 to clock /dev/ptp3 ts2phc[33.192]: adding tstamp 28.000000000 to clock /dev/ptp1 ts2phc[33.192]: /dev/ptp3 offset 0 s2 freq +4453 ts2phc[33.696]: /dev/ptp3 SKIP extts index 0 at 28.449833128 src 28.729412902 ts2phc[34.200]: adding tstamp 28.949828472 to clock /dev/ptp3 ts2phc[34.200]: adding tstamp 29.000000000 to clock /dev/ptp1 ts2phc[34.200]: /dev/ptp3 offset 8 s2 freq +4461 ts2phc[34.704]: /dev/ptp3 SKIP extts index 0 at 29.449823816 src 29.737416038 ts2phc[35.208]: adding tstamp 29.949819152 to clock /dev/ptp3 ts2phc[35.208]: adding tstamp 30.000000000 to clock /dev/ptp1 ts2phc[35.208]: /dev/ptp3 offset -8 s2 freq +4447 ts2phc[35.712]: /dev/ptp3 SKIP extts index 0 at 30.449814496 src 30.745554982 ts2phc[36.216]: adding tstamp 30.949809840 to clock /dev/ptp3 ts2phc[36.216]: adding tstamp 31.000000000 to clock /dev/ptp1 ts2phc[36.216]: /dev/ptp3 offset -8 s2 freq +4445 ts2phc[36.468]: /dev/ptp3 SKIP extts index 0 at 31.449805184 src 31.501109446 ts2phc[36.972]: adding tstamp 31.949800536 to clock /dev/ptp3 ts2phc[36.972]: adding tstamp 32.000000000 to clock /dev/ptp1 ts2phc[36.972]: /dev/ptp3 offset -8 s2 freq +4442 ts2phc[37.480]: /dev/ptp3 SKIP extts index 0 at 32.449795896 src 32.513320070 ts2phc[37.984]: adding tstamp 32.949791248 to clock /dev/ptp3 ts2phc[37.984]: adding tstamp 33.000000000 to clock /dev/ptp1 ts2phc[37.984]: /dev/ptp3 offset 0 s2 freq +4448 Fix that by taking the following measures: - Schedule the poll from a timer. Because we are really scheduling the timer periodically, the extts events delivered to user space are periodic too, and don't suffer from the "shift-to-the-right" effect. - Increase the poll period to 6 times a second. This imposes a smaller upper bound to the shift that can occur to the delivery time of extts events, and makes user space (ts2phc) to always interpret correctly which events should be skipped and which shouldn't. - Move the SPI readout itself to the main PTP kernel thread, instead of the generic workqueue. This is because the timer runs in atomic context, but is also better than before, because if needed, we can chrt & taskset this kernel thread, to ensure it gets enough priority under load. After this patch, one can notice that the wander is greatly reduced, and that the latencies of one extts poll are not propagated to the next. The 'src' timestamp that is skipped is never larger than .65 seconds (which means .15 seconds larger than the time at which the real event occurred at, and .10 seconds smaller than the .75 upper threshold for ignoring the falling edge): ts2phc[40.076]: adding tstamp 34.949261296 to clock /dev/ptp3 ts2phc[40.076]: adding tstamp 35.000000000 to clock /dev/ptp1 ts2phc[40.076]: /dev/ptp3 offset 48 s2 freq +4631 ts2phc[40.568]: /dev/ptp3 SKIP extts index 0 at 35.449256496 src 35.595791078 ts2phc[41.064]: adding tstamp 35.949251744 to clock /dev/ptp3 ts2phc[41.064]: adding tstamp 36.000000000 to clock /dev/ptp1 ts2phc[41.064]: /dev/ptp3 offset -224 s2 freq +4374 ts2phc[41.552]: /dev/ptp3 SKIP extts index 0 at 36.449247088 src 36.579825574 ts2phc[42.044]: adding tstamp 36.949242456 to clock /dev/ptp3 ts2phc[42.044]: adding tstamp 37.000000000 to clock /dev/ptp1 ts2phc[42.044]: /dev/ptp3 offset -240 s2 freq +4290 ts2phc[42.536]: /dev/ptp3 SKIP extts index 0 at 37.449237848 src 37.563828774 ts2phc[43.028]: adding tstamp 37.949233264 to clock /dev/ptp3 ts2phc[43.028]: adding tstamp 38.000000000 to clock /dev/ptp1 ts2phc[43.028]: /dev/ptp3 offset -144 s2 freq +4314 ts2phc[43.520]: /dev/ptp3 SKIP extts index 0 at 38.449228656 src 38.547823238 ts2phc[44.012]: adding tstamp 38.949224048 to clock /dev/ptp3 ts2phc[44.012]: adding tstamp 39.000000000 to clock /dev/ptp1 ts2phc[44.012]: /dev/ptp3 offset -80 s2 freq +4335 ts2phc[44.508]: /dev/ptp3 SKIP extts index 0 at 39.449219432 src 39.535846118 ts2phc[44.996]: adding tstamp 39.949214816 to clock /dev/ptp3 ts2phc[44.996]: adding tstamp 40.000000000 to clock /dev/ptp1 ts2phc[44.996]: /dev/ptp3 offset -32 s2 freq +4359 ts2phc[45.488]: /dev/ptp3 SKIP extts index 0 at 40.449210192 src 40.515824678 ts2phc[45.980]: adding tstamp 40.949205568 to clock /dev/ptp3 ts2phc[45.980]: adding tstamp 41.000000000 to clock /dev/ptp1 ts2phc[45.980]: /dev/ptp3 offset 8 s2 freq +4390 ts2phc[46.636]: /dev/ptp3 SKIP extts index 0 at 41.449200928 src 41.664176902 ts2phc[47.132]: adding tstamp 41.949196288 to clock /dev/ptp3 ts2phc[47.132]: adding tstamp 42.000000000 to clock /dev/ptp1 ts2phc[47.132]: /dev/ptp3 offset 0 s2 freq +4384 ts2phc[47.620]: /dev/ptp3 SKIP extts index 0 at 42.449191656 src 42.648117190 ts2phc[48.112]: adding tstamp 42.949187016 to clock /dev/ptp3 ts2phc[48.112]: adding tstamp 43.000000000 to clock /dev/ptp1 ts2phc[48.112]: /dev/ptp3 offset 0 s2 freq +4384 ts2phc[48.604]: /dev/ptp3 SKIP extts index 0 at 43.449182384 src 43.632112582 ts2phc[49.100]: adding tstamp 43.949177736 to clock /dev/ptp3 ts2phc[49.100]: adding tstamp 44.000000000 to clock /dev/ptp1 ts2phc[49.100]: /dev/ptp3 offset -8 s2 freq +4376 ts2phc[49.588]: /dev/ptp3 SKIP extts index 0 at 44.449173096 src 44.616136774 ts2phc[50.080]: adding tstamp 44.949168464 to clock /dev/ptp3 ts2phc[50.080]: adding tstamp 45.000000000 to clock /dev/ptp1 ts2phc[50.080]: /dev/ptp3 offset 8 s2 freq +4390 ts2phc[50.572]: /dev/ptp3 SKIP extts index 0 at 45.449163816 src 45.600134662 ts2phc[51.064]: adding tstamp 45.949159160 to clock /dev/ptp3 ts2phc[51.064]: adding tstamp 46.000000000 to clock /dev/ptp1 ts2phc[51.064]: /dev/ptp3 offset -8 s2 freq +4376 ts2phc[51.556]: /dev/ptp3 SKIP extts index 0 at 46.449154528 src 46.584588550 ts2phc[52.048]: adding tstamp 46.949149896 to clock /dev/ptp3 ts2phc[52.048]: adding tstamp 47.000000000 to clock /dev/ptp1 ts2phc[52.048]: /dev/ptp3 offset 0 s2 freq +4382 ts2phc[52.540]: /dev/ptp3 SKIP extts index 0 at 47.449145256 src 47.568132198 ts2phc[53.032]: adding tstamp 47.949140616 to clock /dev/ptp3 ts2phc[53.032]: adding tstamp 48.000000000 to clock /dev/ptp1 ts2phc[53.032]: /dev/ptp3 offset 0 s2 freq +4382 ts2phc[53.524]: /dev/ptp3 SKIP extts index 0 at 48.449135968 src 48.552121446 ts2phc[54.016]: adding tstamp 48.949131320 to clock /dev/ptp3 ts2phc[54.016]: adding tstamp 49.000000000 to clock /dev/ptp1 ts2phc[54.016]: /dev/ptp3 offset 0 s2 freq +4382 ts2phc[54.512]: /dev/ptp3 SKIP extts index 0 at 49.449126680 src 49.540147014 ts2phc[55.000]: adding tstamp 49.949122040 to clock /dev/ptp3 ts2phc[55.000]: adding tstamp 50.000000000 to clock /dev/ptp1 ts2phc[55.000]: /dev/ptp3 offset 0 s2 freq +4382 ts2phc[55.492]: /dev/ptp3 SKIP extts index 0 at 50.449117400 src 50.520119078 ts2phc[55.988]: adding tstamp 50.949112768 to clock /dev/ptp3 ts2phc[55.988]: adding tstamp 51.000000000 to clock /dev/ptp1 ts2phc[55.988]: /dev/ptp3 offset 8 s2 freq +4390 ts2phc[56.476]: /dev/ptp3 SKIP extts index 0 at 51.449108120 src 51.504175910 ts2phc[57.132]: adding tstamp 51.949103480 to clock /dev/ptp3 ts2phc[57.132]: adding tstamp 52.000000000 to clock /dev/ptp1 ts2phc[57.132]: /dev/ptp3 offset 0 s2 freq +4384 ts2phc[57.624]: /dev/ptp3 SKIP extts index 0 at 52.449098840 src 52.651833574 ts2phc[58.116]: adding tstamp 52.949094200 to clock /dev/ptp3 ts2phc[58.116]: adding tstamp 53.000000000 to clock /dev/ptp1 ts2phc[58.116]: /dev/ptp3 offset 8 s2 freq +4392 ts2phc[58.612]: /dev/ptp3 SKIP extts index 0 at 53.449089560 src 53.639826918 ts2phc[59.100]: adding tstamp 53.949084920 to clock /dev/ptp3 ts2phc[59.100]: adding tstamp 54.000000000 to clock /dev/ptp1 ts2phc[59.100]: /dev/ptp3 offset 8 s2 freq +4394 ts2phc[59.592]: /dev/ptp3 SKIP extts index 0 at 54.449080272 src 54.619842278 ts2phc[60.084]: adding tstamp 54.949075624 to clock /dev/ptp3 ts2phc[60.084]: adding tstamp 55.000000000 to clock /dev/ptp1 ts2phc[60.084]: /dev/ptp3 offset 8 s2 freq +4397 ts2phc[60.576]: /dev/ptp3 SKIP extts index 0 at 55.449070968 src 55.603885542 ts2phc[61.068]: adding tstamp 55.949066312 to clock /dev/ptp3 ts2phc[61.068]: adding tstamp 56.000000000 to clock /dev/ptp1 ts2phc[61.068]: /dev/ptp3 offset 0 s2 freq +4391 ts2phc[61.560]: /dev/ptp3 SKIP extts index 0 at 56.449061680 src 56.587885798 ts2phc[62.052]: adding tstamp 56.949057032 to clock /dev/ptp3 ts2phc[62.052]: adding tstamp 57.000000000 to clock /dev/ptp1 ts2phc[62.052]: /dev/ptp3 offset -8 s2 freq +4383 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-15net: dsa: sja1105: fix PTP timestamping with large tc-taprio cyclesVladimir Oltean
It isn't actually described clearly at all in UM10944.pdf, but on TX of a management frame (such as PTP), this needs to happen: - The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the desired destination port, need to be installed in one of the 4 management slots of the switch, over SPI. - The host can poll over SPI for that management slot's ENFPORT field. That gets unset when the switch has matched the slot to the frame. And therein lies the problem. ENFPORT does not mean that the packet has been transmitted. Just that it has been received over the CPU port, and that the mgmt slot is yet again available. This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb, which is called right after sja1105_mgmt_xmit. We are in a hard real-time deadline, since the hardware only gives us 24 bits of TX timestamp, so we need to read the full PTP clock to reconstruct it. Because we're in a hurry (in an attempt to make sure that we have a full 64-bit PTP time which is as close as possible to the actual transmission time of the frame, to avoid 24-bit wraparounds), first we read the PTP clock, then we poll for the TX timestamp to become available. But of course, we don't know for sure that the frame has been transmitted when we read the full PTP clock. We had assumed that ENFPORT means it has, but the assumption is incorrect. And while in most real-life scenarios this has never been caught due to software delays, nowhere is this fact more obvious than with a tc-taprio offload, where PTP traffic gets a small timeslot very rarely (example: 1 packet per 10 ms). In that case, we will be reading the PTP clock for timestamp reconstruction too early (before the packet has been transmitted), and this renders the reconstruction procedure incorrect (see the assumptions described in the comments found on function sja1105_tstamp_reconstruct). So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms (1 tick is 8 ns). So fix this case of premature optimization by simply reordering the sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It turns out that in practice, the 135 ms hard deadline for PTP timestamp wraparound is not so hard, since even the most bandwidth-intensive PTP profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms. So if we couldn't deliver a timestamp in 135 ms (which we can), we're toast and have much bigger problems anyway. Fixes: 47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06net: dsa: sja1105: the PTP_CLK extts input reacts on both edgesVladimir Oltean
It looks like the sja1105 external timestamping input is not as generic as we thought. When fed a signal with 50% duty cycle, it will timestamp both the rising and the falling edge. When fed a short pulse signal, only the timestamp of the falling edge will be seen in the PTPSYNCTS register, because that of the rising edge had been overwritten. So the moral is: don't feed it short pulse inputs. Luckily this is not a complete deal breaker, as we can still work with 1 Hz square waves. But the problem is that the extts polling period was not dimensioned enough for this input signal. If we leave the period at half a second, we risk losing timestamps due to jitter in the measuring process. So we need to increase it to 4 times per second. Also, the very least we can do to inform the user is to deny any other flags combination than with PTP_RISING_EDGE and PTP_FALLING_EDGE both set. Fixes: 747e5eb31d59 ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUTVladimir Oltean
The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable start time. On the second generation (P/Q/R/S) switches, this pin supports even more functionality. The use case described by the hardware documents talks about synchronization via oneshot pulses: given 2 sja1105 switches, arbitrarily designated as a master and a slave, the master emits a single pulse on PTP_CLK, while the slave is configured to timestamp this pulse received on its PTP_CLK pin (which must obviously be configured as input). The difference between the timestamps then exactly becomes the slave offset to the master. The only trouble with the above is that the hardware is very much tied into this use case only, and not very generic beyond that: - When emitting a oneshot pulse, instead of being told when to emit it, the switch just does it "now" and tells you later what time it was, via the PTPSYNCTS register. [ Incidentally, this is the same register that the slave uses to collect the ext_ts timestamp from, too. ] - On the sync slave, there is no interrupt mechanism on reception of a new extts, and no FIFO to buffer them, because in the foreseen use case, software is in control of both the master and the slave pins, so it "knows" when there's something to collect. These 2 problems mean that: - We don't support (at least yet) the quirky oneshot mode exposed by the hardware, just normal periodic output. - We abuse the hardware a little bit when we expose generic extts. Because there's no interrupt mechanism, we need to poll at double the frequency we expect to receive a pulse. Currently that means a non-configurable "twice a second". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23net: dsa: sja1105: unconditionally set DESTMETA and SRCMETA in AVB tableVladimir Oltean
These fields configure the destination and source MAC address that the switch will put in the Ethernet frames sent towards the CPU port that contain RX timestamps for PTP. These fields do not enable the feature itself, that is configured via SEND_META0 and SEND_META1 in the General Params table. The implication of this patch is that the AVB Params table will always be present in the static config. Which doesn't really hurt. This is needed because in a future patch, we will add another field from this table, CAS_MASTER, for configuring the PTP_CLK pin function. That can be configured irrespective of whether RX timestamping is enabled or not, so always having this table present is going to simplify things a bit. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller
Simple overlapping changes in bpf land wrt. bpf_helper_defs.h handling. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30net: dsa: sja1105: Empty the RX timestamping queue on PTP settings changeVladimir Oltean
When disabling PTP timestamping, don't reset the switch with the new static config until all existing PTP frames have been timestamped on the RX path or dropped. There's nothing we can do with these afterwards. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30net: dsa: sja1105: Use PTP core's dedicated kernel thread for RX timestampingVladimir Oltean
And move the queue of skb's waiting for RX timestamps into the ptp_data structure, since it isn't needed if PTP is not compiled. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30net: dsa: sja1105: Really make the PTP command read-writeVladimir Oltean
When activating tc-taprio offload on the switch ports, the TAS state machine will try to check whether it is running or not, but will find both the STARTED and STOPPED bits as false in the sja1105_tas_check_running function. So the function will return -EINVAL (an abnormal situation) and the kernel will keep printing this from the TAS FSM workqueue: [ 37.691971] sja1105 spi0.1: An operation returned -22 The reason is that the underlying function that gets called, sja1105_ptp_commit, does not actually do a SPI_READ, but a SPI_WRITE. So the command buffer remains initialized with zeroes instead of retrieving the hardware state. Fix that. Fixes: 41603d78b362 ("net: dsa: sja1105: Make the PTP command read-write") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-30net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slotVladimir Oltean
The PTP egress timestamp N must be captured from register PTPEGR_TS[n], where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per port. We are only using TSREG=0. As opposed to the management slots, which are 4 in number (SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which includes PTP frames) can be sent to any non-CPU port through any management slot. When the CPU port is not the last port (#4), there will be a mismatch between the slot and the port number. Luckily, the only mainline occurrence with this switch (arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the issue did not manifest itself thus far. Fixes: 47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-14net: dsa: sja1105: Implement state machine for TAS with PTP clock sourceVladimir Oltean
Tested using the following bash script and the tc from iproute2-next: #!/bin/bash set -e -u -o pipefail NSEC_PER_SEC="1000000000" gatemask() { local tc_list="$1" local mask=0 for tc in ${tc_list}; do mask=$((${mask} | (1 << ${tc}))) done printf "%02x" ${mask} } if ! systemctl is-active --quiet ptp4l; then echo "Please start the ptp4l service" exit fi now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }') # Phase-align the base time to the start of the next second. sec=$(echo "${now}" | gawk -F. '{ print $1; }') base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))" tc qdisc add dev swp5 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S $(gatemask 7) 100000 \ sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \ clockid CLOCK_TAI flags 2 The "state machine" is a workqueue invoked after each manipulation command on the PTP clock (reset, adjust time, set time, adjust frequency) which checks over the state of the time-aware scheduler. So it is not monitored periodically, only in reaction to a PTP command typically triggered from a userspace daemon (linuxptp). Otherwise there is no reason for things to go wrong. Now that the timecounter/cyclecounter has been replaced with hardware operations on the PTP clock, the TAS Kconfig now depends upon PTP and the standalone clocksource operating mode has been removed. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-14net: dsa: sja1105: Make the PTP command read-writeVladimir Oltean
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate whether the time-aware scheduler is running or not. We will be using that for monitoring the scheduler in the next patch, so refactor the PTP command API in order to allow that. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12net: dsa: sja1105: Print the reset reasonVladimir Oltean
Sometimes it can be quite opaque even for me why the driver decided to reset the switch. So instead of adding dump_stack() calls each time for debugging, just add a reset reason to sja1105_static_config_reload calls which gets printed to the console. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11net: dsa: sja1105: Restore PTP time after switch resetVladimir Oltean
The PTP time of the switch is not preserved when uploading a new static configuration. Work around this hardware oddity by reading its PTP time before a static config upload, and restoring it afterwards. Static config changes are expected to occur at runtime even in scenarios directly related to PTP, i.e. the Time-Aware Scheduler of the switch is programmed in this way. Perhaps the larger implication of this patch is that the PTP .gettimex64 and .settime functions need to be exposed to sja1105_main.c, where the PTP lock needs to be held during this entire process. So their core implementation needs to move to some common functions which get exposed in sja1105_ptp.h. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-11net: dsa: sja1105: Implement the .gettimex64 system call for PTPVladimir Oltean
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incurred while reading the PHC's time. The task itself of taking the software timestamp is delegated to the SPI subsystem, through the newly introduced API in struct spi_transfer. The goal is to cross-timestamp I/O operations on the switch's PTP clock with values in the local system clock (CLOCK_REALTIME). For that we need to understand a bit of the hardware internals. The 'read PTP time' message is a 12 byte structure, first 4 bytes of which represent the SPI header, and the last 8 bytes represent the 64-bit PTP time. The switch itself starts processing the command immediately after receiving the last bit of the address, i.e. at the middle of byte 3 (last byte of header). The PTP time is shadowed to a buffer register in the switch, and retrieved atomically during the subsequent SPI frames. A similar thing goes on for the 'write PTP time' message, although in that case the switch waits until the 64-bit PTP time becomes fully available before taking any action. So the byte that needs to be software-timestamped is byte 11 (last) of the transfer. The patch creates a common (and local) sja1105_xfer implementation for the SPI I/O, and offers 3 front-ends: - sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally requesting a PTP timestamp - sja1105_xfer_buf: this is for large transfers (e.g. the static config buffer) and other misc data, and there is no point in giving timestamping capabilities to this. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-18net: dsa: sja1105: Switch to hardware operations for PTPVladimir Oltean
Adjusting the hardware clock (PTPCLKVAL, PTPCLKADD, PTPCLKRATE) is a requirement for the auxiliary PTP functionality of the switch (TTEthernet, PPS input, PPS output). Therefore we need to switch to using these registers to keep a synchronized time in hardware, instead of the timecounter/cyclecounter implementation, which is reliant on the free-running PTPTSCLK. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14net: dsa: sja1105: Change the PTP command access patternVladimir Oltean
The PTP command register contains enable bits for: - Putting the 64-bit PTPCLKVAL register in add/subtract or write mode - Taking timestamps off of the corrected vs free-running clock - Starting/stopping the TTEthernet scheduling - Starting/stopping PPS output - Resetting the switch When a command needs to be issued (e.g. "change the PTPCLKVAL from write mode to add/subtract mode"), one cannot simply write to the command register setting the PTPCLKADD bit to 1, because that would zeroize the other settings. One also cannot do a read-modify-write (that would be too easy for this hardware) because not all bits of the command register are readable over SPI. So this leaves us with the only option of keeping the value of the PTP command register in the driver, and operating on that. Actually there are 2 types of PTP operations now: - Operations that modify the cached PTP command. These operate on ptp_data->cmd as a pointer. - Operations that apply all previously cached PTP settings, but don't otherwise cache what they did themselves. The sja1105_ptp_reset function is such an example. It copies the ptp_data->cmd on stack before modifying and writing it to SPI. This practically means that struct sja1105_ptp_cmd is no longer an implementation detail, since it needs to be stored in full into struct sja1105_ptp_data, and hence in struct sja1105_private. So the (*ptp_cmd) function prototype can change and take struct sja1105_ptp_cmd as second argument now. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14net: dsa: sja1105: Move PTP data to its own private structureVladimir Oltean
This is a non-functional change with 2 goals (both for the case when CONFIG_NET_DSA_SJA1105_PTP is not enabled): - Reduce the size of the sja1105_private structure. - Make the PTP code more self-contained. Leaving priv->ptp_data.lock to be initialized in sja1105_main.c is not a leftover: it will be used in a future patch "net: dsa: sja1105: Restore PTP time after switch reset". Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14net: dsa: sja1105: Make all public PTP functions take dsa_switch as argumentVladimir Oltean
The new rule (as already started for sja1105_tas.h) is for functions of optional driver components (ones which may be disabled via Kconfig - PTP and TAS) to take struct dsa_switch *ds instead of struct sja1105_private *priv as first argument. This is so that forward-declarations of struct sja1105_private can be avoided. So make sja1105_ptp.h the second user of this rule. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-14net: dsa: sja1105: Get rid of global declaration of struct ptp_clock_infoVladimir Oltean
We need priv->ptp_caps to hold a structure and not just a pointer, because we use container_of in the various PTP callbacks. Therefore, the sja1105_ptp_caps structure declared in the global memory of the driver serves no further purpose after copying it into priv->ptp_caps. So just populate priv->ptp_caps with the needed operations and remove sja1105_ptp_caps. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02net: dsa: sja1105: Rename sja1105_spi_send_packed_buf to sja1105_xfer_bufVladimir Oltean
The most commonly called function in the driver is long due for a rename. The "packed" word is redundant (it doesn't make sense to transfer an unpacked structure, since that is in CPU endianness yadda yadda), and the "spi" word is also redundant since argument 2 of the function is SPI_READ or SPI_WRITE. As for the sja1105_spi_send_long_packed_buf function, it is only being used from sja1105_spi.c, so remove its global prototype. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-02net: dsa: sja1105: Replace sja1105_spi_send_int with sja1105_xfer_{u32, u64}Vladimir Oltean
Having a function that takes a variable number of unpacked bytes which it generically calls an "int" is confusing and makes auditing patches next to impossible. We only use spi_send_int with the int sizes of 32 and 64 bits. So just make the spi_send_int function less generic and replace it with the appropriate two explicit functions, which can now type-check the int pointer type. Note that there is still a small weirdness in the u32 function, which has to convert it to a u64 temporary. This is because of how the packing API works at the moment, but the weirdness is at least hidden from callers of sja1105_xfer_u32 now. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-06net: dsa: sja1105: Really fix panic on unregistering PTP clockVladimir Oltean
The IS_ERR_OR_NULL(priv->clock) check inside sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync from actually being run. Additionally, sja1105_ptp_clock_unregister() does not actually get run, when placed in sja1105_remove(). The DSA switch gets torn down, but the sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister needs to be moved to sja1105_teardown, to be symmetrical with sja1105_ptp_clock_register which is called from the DSA sja1105_setup. It is strange to fix a "fixes" patch, but the probe failure can only be seen when the attached PHY does not respond to MDIO (issue which I can't pinpoint the reason to) and it goes away after I power-cycle the board. This time the patch was validated on a failing board, and the kernel panic from the fixed commit's message can no longer be seen. Fixes: 29dd908d355f ("net: dsa: sja1105: Cancel PTP delayed work on unregister") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: sja1105: Cancel PTP delayed work on unregisterVladimir Oltean
Currently when the driver unloads and PTP is enabled, the delayed work that prevents the timecounter from expiring becomes a ticking time bomb. The kernel will schedule the work thread within 60 seconds of driver removal, but the work handler is no longer there, leading to this strange and inconclusive stack trace: [ 64.473112] Unable to handle kernel paging request at virtual address 79746970 [ 64.480340] pgd = 008c4af9 [ 64.483042] [79746970] *pgd=00000000 [ 64.486620] Internal error: Oops: 80000005 [#1] SMP ARM [ 64.491820] Modules linked in: [ 64.494871] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-rc5-01634-ge3a2773ba9e5 #1246 [ 64.503007] Hardware name: Freescale LS1021A [ 64.507259] PC is at 0x79746970 [ 64.510393] LR is at call_timer_fn+0x3c/0x18c [ 64.514729] pc : [<79746970>] lr : [<c03bd734>] psr: 60010113 [ 64.520965] sp : c1901de0 ip : 00000000 fp : c1903080 [ 64.526163] r10: c1901e38 r9 : ffffe000 r8 : c19064ac [ 64.531363] r7 : 79746972 r6 : e98dd260 r5 : 00000100 r4 : c1a9e4a0 [ 64.537859] r3 : c1900000 r2 : ffffa400 r1 : 79746972 r0 : e98dd260 [ 64.544359] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 64.551460] Control: 10c5387d Table: a8a2806a DAC: 00000051 [ 64.557176] Process swapper/0 (pid: 0, stack limit = 0x1ddb27f0) [ 64.563147] Stack: (0xc1901de0 to 0xc1902000) [ 64.567481] 1de0: eb6a4918 3d60d7c3 c1a9e554 e98dd260 eb6a34c0 c1a9e4a0 ffffa400 c19064ac [ 64.575616] 1e00: ffffe000 c03bd95c c1901e34 c1901e34 eb6a34c0 c1901e30 c1903d00 c186f4c0 [ 64.583751] 1e20: c1906488 29e34000 c1903080 c03bdca4 00000000 eaa6f218 00000000 eb6a45c0 [ 64.591886] 1e40: eb6a45c0 20010193 00000003 c03c0a68 20010193 3f7231be c1903084 00000002 [ 64.600022] 1e60: 00000082 00000001 ffffe000 c1a9e0a4 00000100 c0302298 02b64722 0000000f [ 64.608157] 1e80: c186b3c8 c1877540 c19064ac 0000000a c186b350 ffffa401 c1903d00 c1107348 [ 64.616292] 1ea0: 00200102 c0d87a14 ea823c00 ffffe000 00000012 00000000 00000000 ea810800 [ 64.624427] 1ec0: f0803000 c1876ba8 00000000 c034c784 c18774b8 c039fb50 c1906c90 c1978aac [ 64.632562] 1ee0: f080200c f0802000 c1901f10 c0709ca8 c03091a0 60010013 ffffffff c1901f44 [ 64.640697] 1f00: 00000000 c1900000 c1876ba8 c0301a8c 00000000 000070a0 eb6ac1a0 c031da60 [ 64.648832] 1f20: ffffe000 c19064ac c19064f0 00000001 00000000 c1906488 c1876ba8 00000000 [ 64.656967] 1f40: ffffffff c1901f60 c030919c c03091a0 60010013 ffffffff 00000051 00000000 [ 64.665102] 1f60: ffffe000 c0376aa4 c1a9da37 ffffffff 00000037 3f7231be c1ab20c0 000000cc [ 64.673238] 1f80: c1906488 c1906480 ffffffff 00000037 c1ab20c0 c1ab20c0 00000001 c0376e1c [ 64.681373] 1fa0: c1ab2118 c1700ea8 ffffffff ffffffff 00000000 c1700754 c17dfa40 ebfffd80 [ 64.689509] 1fc0: 00000000 c17dfa40 3f7733be 00000000 00000000 c1700330 00000051 10c0387d [ 64.697644] 1fe0: 00000000 8f000000 410fc075 10c5387d 00000000 00000000 00000000 00000000 [ 64.705788] [<c03bd734>] (call_timer_fn) from [<c03bd95c>] (expire_timers+0xd8/0x144) [ 64.713579] [<c03bd95c>] (expire_timers) from [<c03bdca4>] (run_timer_softirq+0xe4/0x1dc) [ 64.721716] [<c03bdca4>] (run_timer_softirq) from [<c0302298>] (__do_softirq+0x130/0x3c8) [ 64.729854] [<c0302298>] (__do_softirq) from [<c034c784>] (irq_exit+0xbc/0xd8) [ 64.737040] [<c034c784>] (irq_exit) from [<c039fb50>] (__handle_domain_irq+0x60/0xb4) [ 64.744833] [<c039fb50>] (__handle_domain_irq) from [<c0709ca8>] (gic_handle_irq+0x58/0x9c) [ 64.753143] [<c0709ca8>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90) [ 64.760583] Exception stack(0xc1901f10 to 0xc1901f58) [ 64.765605] 1f00: 00000000 000070a0 eb6ac1a0 c031da60 [ 64.773740] 1f20: ffffe000 c19064ac c19064f0 00000001 00000000 c1906488 c1876ba8 00000000 [ 64.781873] 1f40: ffffffff c1901f60 c030919c c03091a0 60010013 ffffffff [ 64.788456] [<c0301a8c>] (__irq_svc) from [<c03091a0>] (arch_cpu_idle+0x38/0x3c) [ 64.795816] [<c03091a0>] (arch_cpu_idle) from [<c0376aa4>] (do_idle+0x1bc/0x298) [ 64.803175] [<c0376aa4>] (do_idle) from [<c0376e1c>] (cpu_startup_entry+0x18/0x1c) [ 64.810707] [<c0376e1c>] (cpu_startup_entry) from [<c1700ea8>] (start_kernel+0x480/0x4ac) [ 64.818839] Code: bad PC value [ 64.821890] ---[ end trace e226ed97b1c584cd ]--- [ 64.826482] Kernel panic - not syncing: Fatal exception in interrupt [ 64.832807] CPU1: stopping [ 64.835501] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 5.2.0-rc5-01634-ge3a2773ba9e5 #1246 [ 64.845013] Hardware name: Freescale LS1021A [ 64.849266] [<c0312394>] (unwind_backtrace) from [<c030cc74>] (show_stack+0x10/0x14) [ 64.856972] [<c030cc74>] (show_stack) from [<c0ff4138>] (dump_stack+0xb4/0xc8) [ 64.864159] [<c0ff4138>] (dump_stack) from [<c0310854>] (handle_IPI+0x3bc/0x3dc) [ 64.871519] [<c0310854>] (handle_IPI) from [<c0709ce8>] (gic_handle_irq+0x98/0x9c) [ 64.879050] [<c0709ce8>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90) [ 64.886489] Exception stack(0xea8cbf60 to 0xea8cbfa8) [ 64.891514] bf60: 00000000 0000307c eb6c11a0 c031da60 ffffe000 c19064ac c19064f0 00000002 [ 64.899649] bf80: 00000000 c1906488 c1876ba8 00000000 00000000 ea8cbfb0 c030919c c03091a0 [ 64.907780] bfa0: 600d0013 ffffffff [ 64.911250] [<c0301a8c>] (__irq_svc) from [<c03091a0>] (arch_cpu_idle+0x38/0x3c) [ 64.918609] [<c03091a0>] (arch_cpu_idle) from [<c0376aa4>] (do_idle+0x1bc/0x298) [ 64.925967] [<c0376aa4>] (do_idle) from [<c0376e1c>] (cpu_startup_entry+0x18/0x1c) [ 64.933496] [<c0376e1c>] (cpu_startup_entry) from [<803025cc>] (0x803025cc) [ 64.940422] Rebooting in 3 seconds.. In this case, what happened is that the DSA driver failed to probe at boot time due to a PHY issue during phylink_connect_phy: [ 2.245607] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy [ 2.258051] sja1105 spi0.1: failed to create slave for port 0.0 Fixes: bb77f36ac21d ("net: dsa: sja1105: Add support for the PTP clock") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-27net: dsa: sja1105: Build PTP support in main DSA driverVladimir Oltean
As Arnd Bergmann pointed out in commit 78fe8a28fb96 ("net: dsa: sja1105: fix ptp link error"), there is no point in having PTP support as a separate loadable kernel module. So remove the exported symbols and make sja1105.ko contain PTP support or not based on CONFIG_NET_DSA_SJA1105_PTP. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08net: dsa: sja1105: Expose PTP timestamping ioctls to userspaceVladimir Oltean
This enables the PTP support towards userspace applications such as linuxptp. The switches can timestamp only trapped multicast MAC frames, and therefore only the profiles of 1588 over L2 are supported. TX timestamping can be enabled per port, but RX timestamping is enabled globally. As long as RX timestamping is enabled, the switch will emit metadata follow-up frames that will be processed by the tagger. It may be a problem that linuxptp does not restore the RX timestamping settings when exiting. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08net: dsa: sja1105: Add logic for TX timestampingVladimir Oltean
On TX, timestamping is performed synchronously from the port_deferred_xmit worker thread. In management routes, the switch is requested to take egress timestamps (again partial), which are reconstructed and appended to a clone of the skb that was just sent. The cloning is done by DSA and we retrieve the pointer from the structure that DSA keeps in skb->cb. Then these clones are enqueued to the socket's error queue for application-level processing. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08net: dsa: sja1105: Add support for the PTP clockVladimir Oltean
The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>