author | Richard Braun <rbraun@sceen.net> | 2017-03-26 22:05:50 +0200 |
---|---|---|
committer | Richard Braun <rbraun@sceen.net> | 2017-03-26 22:05:50 +0200 |
commit | 766befe52cd7182519f8e64d18a066a5a2b9f37a (patch) | |
tree | 15d442620323d73f5a7e9b508b4bfb5f3c824687 /doc/style.9.txt | |
parent | 89d039180d18c9a5ca33f5225d7bd3c0b5c6120b (diff) |
doc: add style(9)
Diffstat (limited to 'doc/style.9.txt')
-rw-r--r-- | doc/style.9.txt | 477 |
1 files changed, 477 insertions, 0 deletions
diff --git a/doc/style.9.txt b/doc/style.9.txt
new file mode 100644
index 0000000..15df3ac
--- /dev/null
+++ b/doc/style.9.txt
@@ -0,0 +1,477 @@
+STYLE(9)
+========
+:doctype: manpage
+:man source: X15
+:man manual: X15 Kernel Developer{rsquo}s Manual
+
+NAME
+----
+
+style - kernel coding style and rules
+
+DESCRIPTION
+-----------
+
+This document describes the preferred coding style, a close variant of
+the modern K&R style, for the X15 kernel. It probably does not encompass
+all the rules, which are quite numerous despite efforts to keep them
+simple and compact. As a result, developers should compare their work
+against existing recent modules and infer the rules that may not be
+mentioned here. If you identify multiple rules that seem to conflict,
+assume any of them is valid. If in doubt, make a pragmatic decision
+and move forward.
+
+LANGUAGES
+---------
+
+The kernel is mostly written in the C programming language, conforming
+to ISO/IEC 9899:1999 (C99) and slowly transitioning to ISO/IEC 9899:2011
+(C11), with GNU extensions. While many GNU extensions are used, nested
+functions are forbidden. They are considered too specific to the GCC
+compiler. Note that X15 can currently be built with both GCC and Clang,
+and the latter has no support for nested functions. Also note that the
+kernel only expects the freestanding C environment from the compiler.
+
+Since X15 is low level system software, processor-specific assembly is
+inevitable. Developers should always refrain from writing assembly code
+when possible. Obviously, that code must only be used in
+architecture-specific modules.
+
+LINES
+-----
+
+Lines are normally limited to 80 columns, unless there is a compelling
+reason not to. This makes it easy to identify code with too many levels
+of indentation, and also allows viewing multiple buffers side-by-side
+on small screens, which happens to also be comfortable on larger screens.
+
+When defining a function, always break the line right before its name,
+so that qualifiers and the return type are on their own line. As a
+result, there is enough space for arguments even when the function
+name is long. It also makes functions easier to find with grep.
+
+Use at most one empty line as a separator. Use empty lines around
+functions and blocks, and also when you consider it appropriate
+inside blocks, e.g. to isolate critical sections, and/or locking
+functions.
+
+COMMENTS
+--------
+
+Each source file must start with a copyright statement comment,
+followed by a description of the file, unless the content is simple
+and obvious enough not to need one. The copyright statement and the
+description must be separated by two empty comment lines :
+
+[source,c]
+--------------------------------------------------------------------------------
+/*
+ * Copyright (c) 2017 John Smith.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * etc...
+ *
+ *
+ * Description of the module.
+ */
+--------------------------------------------------------------------------------
+
+Comments are written in C style, not C++. Here are a few examples :
+
+[source,c]
+--------------------------------------------------------------------------------
+/* Most single-line comments look like this */
+
+/*
+ * Multi-line comments look like this. They should also be used for
+ * "reference" descriptions in public headers.
+ */
+struct dummy {
+    int useless;    /* Here is a way to provide short member descriptions */
+    char moot;      /* But don't hesitate to move descriptions above members
+                       when they span too many lines on the right side. */
+};
+
+/*
+ * Instead of using C comments to disable code blocks, use the #if 0
+ * preprocessor directive. Remember that C comments do not nest.
+ */
+#if 0
+...
+#endif
+--------------------------------------------------------------------------------
+
+Comments should not be abused. A large number of comments creates noise
+that readers must filter. Instead of writing a comment, ask yourself
+whether you can improve the meaning of the code itself, e.g. by changing
+names or breaking the code into smaller functions with names that are
+good enough to convey the message of the comment. Prefer to describe data
+over code. Comments in the code must point at details that are important
+and not obvious, and quality code should have few of those.
+
+Note that the project does not use an annotation format for documentation
+generation such as Doxygen, because, despite the undeniable improvement over
+other types of tools, the benefit still seems too small considering the
+additional rules that developers must keep in mind, the effort spent on
+checking the annotations and the duplication that they may cause, and the
+actual use of the generated documentation compared to direct source code
+browsing.
+
+INDENTATION
+-----------
+
+An indentation level is 4 spaces. It is strongly recommended, although
+not compulsory, to avoid using more than 3 indentation levels. More
+levels are tolerated for very short blocks. Combining 4 spaces per level
+with 80 columns allows the use of somewhat long, significant names, and
+naturally warns that a function should be simplified when the code becomes
+overly crammed to the right.
+
+Tabulation characters are forbidden in source files, and should only be
+used in Makefiles where absolutely required. This allows everyone to get
+the same view, whatever the editor used.
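
Editorial sketch, not part of the patch: one possible way to follow the three-level guideline above is to move the body of a loop into a small helper, so that the caller never goes deeper than two levels. The type and function names (`item`, `item_is_ready`, `count_ready`) are hypothetical and only serve the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object type, used only for this illustration. */
struct item {
    bool valid;
    int value;
};

/*
 * Helper extracted from the loop body so that the caller stays well
 * within three indentation levels.
 */
static bool
item_is_ready(const struct item *item)
{
    if (!item->valid) {
        return false;
    }

    return (item->value > 0);
}

/* Count the items that are both valid and strictly positive. */
static size_t
count_ready(const struct item *items, size_t nr_items)
{
    size_t i, count;

    count = 0;

    for (i = 0; i < nr_items; i++) {
        if (!item_is_ready(&items[i])) {
            continue;
        }

        count++;
    }

    return count;
}
```

Without the helper, the two tests would probably be nested inside the loop, which is the kind of rightward drift that the 80-column limit is meant to expose.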
+
+In a switch statement, the case labels are aligned on the same column as
+the switch statement :
+
+[source,c]
+--------------------------------------------------------------------------------
+switch (var) {
+case VAL1:
+    do_one_thing();
+    break;
+case VAL2:
+case VAL3:
+    do_another();
+    break;
+}
+--------------------------------------------------------------------------------
+
+When conditions don't fit on a single line, break before operators,
+and indent the new line on the same column as the associated condition.
+Use parentheses to make precedence explicit, even when not strictly needed.
+Here is an example :
+
+[source,c]
+--------------------------------------------------------------------------------
+static void
+do_something(void)
+{
+    if (overly_long_stupid_condition_name_that_you_shouldn_t_use
+        && another_overly_long_stupid_condition_name
+        && (yet_another_variable_with_an_overly_long_name
+            || (a_first_long_variable_name != a_second_long_variable_name))) {
+        return;
+    }
+}
+--------------------------------------------------------------------------------
+
+One reason for those rules is to make reviews easier, by pushing code as much
+to the left as possible. The code should allow superficial reviews by just
+taking a quick look at the left side of the code, e.g. control statements,
+most conditions, operators, function names, their first parameter, etc...
+A line which reaches the right side marks a spot that may need more careful
+review.
+
+Don't write multiple statements on a single line.
+
+BRACES
+------
+
+Opening braces are written at the end of lines containing control statements
+or struct/union/enum definitions. Closing braces are written on the same
+column as their associated control statement. Opening braces for functions
+are written on their own line after the arguments. Braces must also be used
+for single-line blocks, as this reduces both efforts when adding debugging
+code in such blocks and risks when merging conflicting code. In cases of
+multi-part statements, closing braces are written before the continuation.
+Here is an example :
+
+[source,c]
+--------------------------------------------------------------------------------
+void
+do_something(void *arg)
+{
+    if (arg == NULL) {
+        return;
+    } else {
+        print_what_it_is(arg);
+    }
+
+    do {
+        play_with(arg);
+    } while (can_play_with(arg));
+}
+--------------------------------------------------------------------------------
+
+The main reason for this style is that it allows line compression with
+almost no loss of readability, since it is easy to identify where blocks
+start.
+
+SPACES
+------
+
+Spaces are used around most keywords. The exceptions are sizeof, typeof,
+alignof, and pass:[__attribute__], because of their similarity with
+functions. These exceptions must be used with parentheses. Spaces are also
+used around operators, except unary operators. For example :
+
+[source,c]
+--------------------------------------------------------------------------------
+if (!skip_copy && (a != b)) {
+    memcpy(&a, &b, sizeof(a));
+}
+--------------------------------------------------------------------------------
+
+When declaring pointer variables, use a space before "`*`", but not after,
+so that the "`*`" character and the variable name are adjacent. The main
+reason is that, while pointers are variables, with a type of their own,
+they are a special kind of variable that deserves more attention. Another
+is to reduce mistakes when declaring several variables on the same line.
+On the other hand, when writing functions returning pointers, use spaces
+both before and after "`*`". The reason is to always clearly separate
+returned types from the function name. Here is an example :
+
+[source,c]
+--------------------------------------------------------------------------------
+static struct my_type * my_function(struct my_type *var);
+
+/* ... */
+
+static struct my_type *
+my_function(struct my_type *var)
+{
+    struct my_type *a, tmp;
+
+    a = do_something(var, &tmp);
+    return a;
+}
+--------------------------------------------------------------------------------
+
+Don't leave trailing spaces at the end of lines.
+
+NAMES
+-----
+
+First of all, use underscores instead of camel case to separate words in
+compound names. Function and variable names are in lower case. Macro
+names generally are in upper case, except for macros which behave strictly
+like functions, which could be replaced with inline functions with no
+API change.
+
+There are two ways to name symbols, depending on whether they're global
+or not. Global symbols, either private (declared static) or public
+(part of the publicly exposed interface), must always be prefixed with
+the namespace of the module they belong to. This makes it easy to know
+that 1/ a symbol is global and 2/ where to find it in the project. It also
+makes it possible to move definitions between the opaque implementation
+and header files with few diffs. Names for local variables should be
+short, with no prefix, since prefixes are used to add context, whereas
+local variables are confined to the current context.
+
+When a module defines an object type, the type name immediately follows
+the module name. Functions applying to objects (methods) are named by
+adding the method name to the object type name. The name of helper
+functions and methods is simply added at the end of the main function
+that calls them.
+
+Here are some examples :
+
+[source,c]
+--------------------------------------------------------------------------------
+/*
+ * Object type "cpu_pool" of module "kmem".
+ */
+struct kmem_cpu_pool;
+
+/*
+ * Method "init" of object type "cpu_pool" of module "kmem".
+ */
+static void kmem_cpu_pool_init(struct kmem_cpu_pool *cpu_pool,
+                               struct kmem_cache *cache);
+
+/*
+ * Object type "cache" of module "kmem".
+ */
+struct kmem_cache;
+
+/*
+ * Helper function "verify" of method "alloc" of object type "cache" of
+ * module "kmem".
+ */
+static void kmem_cache_alloc_verify(struct kmem_cache *cache, void *buf,
+                                    int construct);
+
+/*
+ * Method "alloc" of object type "cache" of module "kmem".
+ */
+void *
+kmem_cache_alloc(struct kmem_cache *cache)
+{
+    struct kmem_cpu_pool *cpu_pool;
+    int filled, verify;
+    void *buf;
+
+    /* ... */;
+}
+
+/*
+ * Function "info" of module "kmem".
+ */
+void kmem_info(void);
+--------------------------------------------------------------------------------
+
+There is currently one exception to the naming rules, which is the
+module:arch/param module. The only API changes allowed on this module
+are removals, in order to move certain symbols where they should be,
+and renames using the param namespace, for symbols that actually belong
+to the module.
+
+FUNCTIONS
+---------
+
+Functions should do one thing, and do it well. Their name should be
+carefully chosen to best reflect their operation. Ideally, functions
+should be short enough to fit in a "page", i.e. not much more than
+25 lines. The real metric developers should use to decide whether to
+break a function or not is the number of variables, including parameters.
+Assuming most people can remember around 5 things at the same time, you
+should break functions when the number of variables gets higher than
+this value. Variables such as a buffer pointer and its size can be
+considered a single variable.
+
+Do not hesitate to break functions up when things get even slightly
+complicated. If that helps, remember that static functions can easily
+be inlined by the compiler, even without the inline keyword. Link-time
+optimizations (LTO) improve the situation further by considering the
+whole program as a single compilation unit.
+
+Always name arguments in function prototypes.
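
Editorial sketch, not part of the patch: the rules above on named prototype arguments, variable counts and small static helpers might look as follows in practice. The `checksum` module and both of its functions are hypothetical names chosen for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Prototypes name every argument, so that the role of each one is
 * clear without reading the definition.
 */
static uint32_t checksum_compute_add(uint32_t sum, const uint8_t *buf,
                                     size_t size);
uint32_t checksum_compute(const uint8_t *buf, size_t size);

/*
 * Helper "add" of function "compute" of hypothetical module "checksum".
 *
 * The buffer pointer and its size count as a single variable, so this
 * function deals with three things : the sum, the buffer and the index.
 */
static uint32_t
checksum_compute_add(uint32_t sum, const uint8_t *buf, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        sum += buf[i];
    }

    return sum;
}

/* Return the one's complement of the byte-wise sum of the buffer. */
uint32_t
checksum_compute(const uint8_t *buf, size_t size)
{
    return ~checksum_compute_add(0, buf, size);
}
```

Since the helper is static and tiny, a compiler is free to inline it, so splitting the function this way should cost nothing at run time.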
+
+Common call protocols
+~~~~~~~~~~~~~~~~~~~~~
+
+Functions should match one of the following call protocols :
+
+type get_value(...)::
+  Accessor with no side effect.
+void do_something(...)::
+  Function that cannot fail.
+struct my_struct * my_struct_get(...)::
+  Function that returns an object, or NULL if an error occurred.
+int do_something(...)::
+  Function that returns an error code, unless stated otherwise.
+
+It's easy to know when to return an object, or an error and the
+object through a pointer argument : if there is a single possible
+error code, let a NULL value encode that error, otherwise return
+it explicitly :
+
+struct my_struct * my_struct_create(...)::
+  Return NULL if an error occurs, which may only be ERROR_NOMEM.
+int my_struct_create(struct my_struct **my_structp, ...)::
+  Return an error (ERROR_NOMEM or something else) if creation fails,
+  otherwise pass the new object to the caller through the double pointer
+  argument and return 0.
+
+Methods, which are functions invoked on an object instance, must be
+defined so that their first argument is always the object instance
+on which the method is invoked.
+
+Unconditional branch statements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Developers are encouraged to use unconditional branches in cases where
+they contribute to maintaining complexity low. Such cases include return
+on invalid argument, breaking or continuing simple loops, and centralized
+error handling. Note that this includes the *goto* statement. The rationale,
+beyond making error handling code common, is to keep the indentation level
+low. The main code flow, with the lowest indentation level, should match
+the most likely case. The compiler often uses heuristics based on keywords
+such as *return* or *goto* which are applied to the conditions of the
+containing block. You may also use the *likely* and *unlikely* macros
+to give hints about branches to the compiler in performance critical
+paths.
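
Editorial sketch, not part of the patch: the call protocols and branch hints described above could combine as shown below. Several assumptions apply. The `likely` and `unlikely` definitions rely on `__builtin_expect`, which both GCC and Clang provide, but the kernel's own macros may differ. The `counter` type, the error code values and the use of `malloc` as a hosted stand-in for a kernel allocator are all hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed definitions; the kernel's own macros may differ. */
#define likely(expr)    __builtin_expect(!!(expr), 1)
#define unlikely(expr)  __builtin_expect(!!(expr), 0)

/* Hypothetical error codes, the values are arbitrary. */
#define ERROR_NOMEM 1
#define ERROR_INVAL 2

/* Hypothetical object type. */
struct counter {
    int value;
};

/*
 * Several error codes are possible, so the error is returned and the
 * object is passed through the double pointer argument.
 */
static int
counter_create(struct counter **counterp, int value)
{
    struct counter *counter;

    /* Return early on invalid argument, keeping the main flow flat */
    if (unlikely(value < 0)) {
        return ERROR_INVAL;
    }

    /* malloc() stands in for a kernel allocator in this hosted sketch */
    counter = malloc(sizeof(*counter));

    if (unlikely(counter == NULL)) {
        return ERROR_NOMEM;
    }

    counter->value = value;
    *counterp = counter;
    return 0;
}

/* Accessor with no side effect. */
static int
counter_get_value(const struct counter *counter)
{
    return counter->value;
}

/* Function that cannot fail. */
static void
counter_destroy(struct counter *counter)
{
    free(counter);
}
```

Had allocation failure been the only possible error, the simpler protocol returning `struct counter *` with NULL on failure would have been enough.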
+
+Here is an example for object creation :
+
+[source,c]
+--------------------------------------------------------------------------------
+static int
+obj_create(struct obj **objp, int var)
+{
+    struct subobj *subobj;
+    struct obj *obj;
+    int error;
+
+    obj = kmem_cache_alloc(&obj_cache);
+
+    if (obj == NULL) {
+        return ERROR_NOMEM;
+    }
+
+    error = subobj_create(&subobj);
+
+    if (error) {
+        goto error_subobj;
+    }
+
+    error = obj_check_var(var);
+
+    if (error) {
+        goto error_var;
+    }
+
+    obj_init(obj, subobj, var);
+    *objp = obj;
+    return 0;
+
+error_var:
+    subobj_destroy(subobj);
+error_subobj:
+    kmem_cache_free(&obj_cache, obj);
+    return error;
+}
+--------------------------------------------------------------------------------
+
+Common functions and methods
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following list can be used to easily find appropriate standard names
+for the most common operations.
+
+*bootstrap*::
+  This function is used for early module initialization. It should only
+  exist if initialization must be broken down into multiple steps. If
+  there is a single initialization step, run it in the *setup* function.
+*setup*::
+  This function is used for module initialization.
+*init*::
+  Object initialization, without memory allocation. Object initialization
+  should normally never fail.
+*create*::
+  Object creation, including both allocation and initialization.
+*destroy*::
+  Object destruction.
+*ref*::
+  Add a reference to an object.
+*unref*::
+  Remove a reference from an object, destroying it if there are no more
+  references.
+*acquire* or *lock*::
+  Acquire exclusive access to an object.
+*release* or *unlock*::
+  Release exclusive access to an object.
+*add*::
+  Add an object to a container.
+*remove*::
+  Remove an object from a container.
+*lookup*::
+  Look up an object in a container.
+
+SEE
+---
+
+manpage:intro
+
+{x15-operating-system}