summaryrefslogtreecommitdiff
path: root/Documentation/core-api/real-time/architecture-porting.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/core-api/real-time/architecture-porting.rst')
-rw-r--r--Documentation/core-api/real-time/architecture-porting.rst109
1 files changed, 109 insertions, 0 deletions
diff --git a/Documentation/core-api/real-time/architecture-porting.rst b/Documentation/core-api/real-time/architecture-porting.rst
new file mode 100644
index 000000000000..d822fac29922
--- /dev/null
+++ b/Documentation/core-api/real-time/architecture-porting.rst
@@ -0,0 +1,109 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================================
+Porting an architecture to support PREEMPT_RT
+=============================================
+
+:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
+
+This list outlines the architecture specific requirements that must be
+implemented in order to enable PREEMPT_RT. Once all required features are
+implemented, ARCH_SUPPORTS_RT can be selected in architecture’s Kconfig to make
+PREEMPT_RT selectable.
+Many prerequisites (genirq support for example) are enforced by the common code
+and are omitted here.
+
+The optional features are not strictly required but it is worth to consider
+them.
+
+Requirements
+------------
+
+Forced threaded interrupts
+ CONFIG_IRQ_FORCED_THREADING must be selected. Any interrupts that must
+ remain in hard-IRQ context must be marked with IRQF_NO_THREAD. This
+ requirement applies for instance to clocksource event interrupts,
+ perf interrupts and cascading interrupt-controller handlers.
+
+PREEMPTION support
+ Kernel preemption must be supported and requires that
+ CONFIG_ARCH_NO_PREEMPT remain unselected. Scheduling requests, such as those
+ issued from an interrupt or other exception handler, must be processed
+ immediately.
+
+POSIX CPU timers and KVM
+ POSIX CPU timers must expire from thread context rather than directly within
+ the timer interrupt. This behavior is enabled by setting the configuration
+ option CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
+ When KVM is enabled, CONFIG_KVM_XFER_TO_GUEST_WORK must also be set to ensure
+ that any pending work, such as POSIX timer expiration, is handled before
+ transitioning into guest mode.
+
+Hard-IRQ and Soft-IRQ stacks
+ Soft interrupts are handled in the thread context in which they are raised. If
+ a soft interrupt is triggered from hard-IRQ context, its execution is deferred
+ to the ksoftirqd thread. Preemption is never disabled during soft interrupt
+ handling, which makes soft interrupts preemptible.
+ If an architecture provides a custom __do_softirq() implementation that uses a
+ separate stack, it must select CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK. The
+ functionality should only be enabled when CONFIG_SOFTIRQ_ON_OWN_STACK is set.
+
+FPU and SIMD access in kernel mode
+ FPU and SIMD registers are typically not used in kernel mode and are therefore
+ not saved during kernel preemption. As a result, any kernel code that uses
+ these registers must be enclosed within a kernel_fpu_begin() and
+ kernel_fpu_end() section.
+ The kernel_fpu_begin() function usually invokes local_bh_disable() to prevent
+ interruptions from softirqs and to disable regular preemption. This allows the
+ protected code to run safely in both thread and softirq contexts.
+ On PREEMPT_RT kernels, however, kernel_fpu_begin() must not call
+ local_bh_disable(). Instead, it should use preempt_disable(), since softirqs
+ are always handled in thread context under PREEMPT_RT. In this case, disabling
+ preemption alone is sufficient.
+ The crypto subsystem operates on memory pages and requires users to "walk and
+ map" these pages while processing a request. This operation must occur outside
+ the kernel_fpu_begin()/ kernel_fpu_end() section because it requires preemption
+ to be enabled. These preemption points are generally sufficient to avoid
+ excessive scheduling latency.
+
+Exception handlers
+ Exception handlers, such as the page fault handler, typically enable interrupts
+ early, before invoking any generic code to process the exception. This is
+ necessary because handling a page fault may involve operations that can sleep.
+ Enabling interrupts is especially important on PREEMPT_RT, where certain
+ locks, such as spinlock_t, become sleepable. For example, handling an
+ invalid opcode may result in sending a SIGILL signal to the user task. A
+ debug excpetion will send a SIGTRAP signal.
+ In both cases, if the exception occurred in user space, it is safe to enable
+ interrupts early. Sending a signal requires both interrupts and kernel
+ preemption to be enabled.
+
+Optional features
+-----------------
+
+Timer and clocksource
+ A high-resolution clocksource and clockevents device are recommended. The
+ clockevents device should support the CLOCK_EVT_FEAT_ONESHOT feature for
+ optimal timer behavior. In most cases, microsecond-level accuracy is
+ sufficient
+
+Lazy preemption
+ This mechanism allows an in-kernel scheduling request for non-real-time tasks
+ to be delayed until the task is about to return to user space. It helps avoid
+ preempting a task that holds a sleeping lock at the time of the scheduling
+ request.
+ With CONFIG_GENERIC_IRQ_ENTRY enabled, supporting this feature requires
+ defining a bit for TIF_NEED_RESCHED_LAZY, preferably near TIF_NEED_RESCHED.
+
+Serial console with NBCON
+ With PREEMPT_RT enabled, all console output is handled by a dedicated thread
+ rather than directly from the context in which printk() is invoked. This design
+ allows printk() to be safely used in atomic contexts.
+ However, this also means that if the kernel crashes and cannot switch to the
+ printing thread, no output will be visible preventing the system from printing
+ its final messages.
+ There are exceptions for immediate output, such as during panic() handling. To
+ support this, the console driver must implement new-style lock handling. This
+ involves setting the CON_NBCON flag in console::flags and providing
+ implementations for the write_atomic, write_thread, device_lock, and
+ device_unlock callbacks.