Subject: Re: Coding Standards, anyone?
Date: Fri, 3 Feb 1995 12:38:51 +0200
From: Linus Torvalds <Linus.Torvalds@cs.helsinki.fi>
In-Reply-To: Greg McGary's message as of Feb  2, 20:06
To: linux-kernel@vger.rutgers.edu

Greg McGary: "Coding Standards, anyone?" (Feb  2, 20:06):
> Am I the only one, or is there anyone else out there who thinks that
> the Linux kernel could benefit from consistent adherence to a coding
> standard?
> 
> I don't much care *what* it is, just as long as it's *something*.
> 
> Since this is still Linus's sandbox, I think we could all agree to
> standardize on Linus's own style, immortalize it as a set of indent
> parameters, and use it.  As an alternate, we could adhere to the GNU
> style.  I think it's Linus's call.

The GNU style is *definitely* out when it comes to code that I want to
keep up.  It's simply a rather horrible standard, and if you look at
some of the GNU sources you'll find them often completely
incomprehensible. 

I do have a standard of my own, BUT..  I know this is religious, and
when it comes to device drivers that I don't expect to be able to update
anyway and code like that, I allow almost any coding standard the author
wants to use, as he's the one that will have to keep it up.  If somebody
else takes over (and the original author obviously doesn't keep it up
any more), the new person is free to change the style.  I'm happy to say
that this has happened only a very few times. 

Anyway, here's my standard in a nutshell, with comments (and if I don't
always adhere to it 100%, it's only because sometimes I'm lazy too, but
you'll find it true for almost all code I write):

Physical layout:
 - K&R brace placement.
	1) It's the way God intended them to be, and K&R are his
	   prophets.  Enough said. 
 - indentation is hard-tabs, 8 characters wide.
	1) This makes it obvious at a glance how the code is indented. 
	   2 characters is *much* too small, and 4 characters isn't enough
	   over 40 lines of code or so, especially if you don't have 40
	   lines on your screen.
	2) If you think this makes the code move too much to the right,
	   see the next point.
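
As a rough sketch of the layout above (count_spaces is a made-up name,
used only for illustration): the function's opening brace sits on its
own line as in K&R, every other opening brace stays on the line of its
statement, and each level of indentation is one hard tab.

```c
#include <stddef.h>

/* Hypothetical helper, named here only for illustration. */
size_t count_spaces(const char *s)
{
	size_t n = 0;

	while (*s) {
		if (*s == ' ')
			n++;
		s++;
	}
	return n;
}
```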

Coding style:
 - not more than 4 levels of indentation, preferably not even more than
   three levels. 
	1) If you have more, you'll confuse yourself and others. 
	2) Only BASIC programmers don't know how to use functions and
	   procedures.  They are there to make the code more readable and
	   maintainable, so use them.  When you notice that you need to go
	   over the 4-level limit, move the innermost levels into a
	   separate function.  If your flow of control makes that
	   impossible, you /still/ shouldn't go for 5 levels: you should
	   look at what you're doing wrong in the first place. 
 - functions shouldn't be longer than 50 lines. 20 lines preferable.
	1) see indentation. Same reasons, same fixes.
 - one function shouldn't have more than a few variables as its working
   set.  Only about 5-10 local variables, and 0-2 global variables.
   Function arguments count as local variables. 
	1) see indentation. Same reason, same fixes.
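
One way the "move the innermost levels into a separate function" fix
can look, as a sketch only.  The struct and both function names are
invented for the example.  The inner loop gets its own small function,
so the caller stays at two levels of indentation and each function
needs only a couple of locals.

```c
#include <stddef.h>

/* Hypothetical example: count the nonzero bytes in a set of buffers. */
struct buffer {
	const unsigned char *data;	/* NULL for an unused slot */
	size_t len;
};

/* The innermost loop lives in its own small function... */
static size_t count_nonzero(const unsigned char *buf, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++) {
		if (buf[i])
			n++;
	}
	return n;
}

/* ...so the caller never needs more than two levels of indentation. */
size_t count_nonzero_all(const struct buffer *bufs, size_t nbufs)
{
	size_t i, total = 0;

	for (i = 0; i < nbufs; i++) {
		if (!bufs[i].data)
			continue;
		total += count_nonzero(bufs[i].data, bufs[i].len);
	}
	return total;
}
```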

There are *very* few reasons to break any of the above.  Just about the
only good reason is to go beyond the function length limit: some
functions are inherently "flat" (ie only a few levels of indentation)
and obvious, but tend to be rather long just because they have a lot of
equivalent cases that need testing.  This may be due to bad programming
(trying to make one function do many things), but it can also be simply
due to external circumstances (you get input you have no control over
and have to make decisions upon that). 
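
A "flat" function of that kind might look like the sketch below.  The
enum and the function name are invented for illustration.  It is just
one switch over input the function has no control over, and a real one
would keep going for many more cases without getting any deeper.

```c
/* Hypothetical classification of one input byte; all names are invented. */
enum char_class { CC_DIGIT, CC_SPACE, CC_PUNCT, CC_OTHER };

enum char_class classify(int c)
{
	switch (c) {
	case '0': case '1': case '2': case '3': case '4':
	case '5': case '6': case '7': case '8': case '9':
		return CC_DIGIT;
	case ' ':
	case '\t':
	case '\n':
		return CC_SPACE;
	case '.':
	case ',':
	case ';':
	case ':':
		return CC_PUNCT;
	default:
		return CC_OTHER;
	}
}
```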

Variables:
 - Use variable scoping.
	1) Use global variables only when different processes need to
	   access them.  Even then, see if you can make them static to
	   one function. 
 - Naming: global entities need good names, local entities don't.  Don't
   use automatic naming rules: it just hides other problems with your
   code. 
	1) Global functions/variables need to have descriptive names, so
	   that you don't have to look up what they actually do.  If you
	   think you get sore fingers from the typing, you're doing
	   something wrong (probably using too much global information). 
	2) Local counters and other "temporary" values don't need
	   special names: their use really *is* obvious.  "i" and "tmp" are
	   perfectly good local variable names in many circumstances.
	3) *don't* name according to argument type ("automatic naming
	   rules"): argument type checking is done by the compiler, and the
	   types may change, anyway.  Only idiots and Microsoft code the
	   types in the names.  Give names according to what they do, or
	   how they are used. 
 - Typechecking is your friend.
	1) Don't use typecasts.  If you do use them, you'd better have a
	   *very* good reason, and "the compiler warns if I don't" isn't
	   good enough. 
	2) NULL is *obviously* ((void *) 0), and anything else is wrong
	   (even in C++, with the gcc extension which makes it ok).  Being
	   able to say "int i = NULL;" is horrible.  NULL being just 0 is
	   allowed by ANSI, but is bad in every other respect.  Typecasting
	   NULL for function parameters is bad: it's unreadable, and
	   doesn't make sense if you have prototypes (and if you don't have
	   prototypes you're failing already). 
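
A small sketch of the variable rules above.  Both function names are
hypothetical.  It shows a value that must survive across calls kept
static to the one function that uses it, a global function whose name
says what it does, a local counter that is just "i", and NULL passed
without a cast because a prototype is in scope.

```c
#include <stddef.h>

/* Global function: the name says what it does.  (Hypothetical name.) */
unsigned long next_sequence_number(void)
{
	/* Needed across calls, but only in here: static to one function. */
	static unsigned long last_issued;

	return ++last_issued;
}

/* With this prototype in scope, callers can pass NULL without a cast. */
size_t string_length_or_zero(const char *s);

size_t string_length_or_zero(const char *s)
{
	size_t i;	/* plain local counter; "i" is all the name it needs */

	if (s == NULL)
		return 0;
	for (i = 0; s[i]; i++)
		;
	return i;
}
```

Note that the parameter is called "s", named for how it is used, and
not something like "lpszString" that encodes its type.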

Preprocessor:
 - Don't use it.
	1) Inline functions are mostly as good as a macro, and allow you
	   to do type checking.
	2) Macros are good for doing strange things.  Which is exactly
	   why you should *not* use them. 
	3) Use macros only for simple substitutions, to avoid magic
	   numbers in your code. 
	4) #ifdef's inside the code are ugly and unreadable.  In header
	   files they're ok. 
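
To make the macro-versus-inline point concrete, here is a sketch.  All
the names are invented, and it assumes a compiler that accepts "inline"
(a gcc extension at the time of writing, standard C since C99).  The
macro pastes its argument in twice, so an argument with a side effect
runs twice, while the inline function evaluates it once and checks its
type.

```c
/* A macro is fine for a simple substitution that avoids a magic number. */
#define MAX_RETRIES 5	/* hypothetical constant */

/* Macro version: the argument text appears twice in the expansion. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline version: the argument is evaluated once and type checked. */
static inline int square(int x)
{
	return x * x;
}

/* Hypothetical helper with a side effect, to make the difference visible. */
static int calls;

static int next_value(void)
{
	return ++calls;
}
```

With calls starting at 0, SQUARE_MACRO(next_value()) calls next_value()
twice and multiplies 1 by 2, while square(next_value()) calls it once
and returns 1.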

Comments:
 - Well..  If you have the energy.  Management and teaching like them,
   I like it more if you just follow the rules above. 
	1) Comments are ok. Don't overdo it.
	2) Comments inside code (especially when they are several lines)
	   just break the flow of code and don't help anything.  Putting
	   a comment above your function to tell what it does is fine, but
	   don't do it just to repeat the C declaration, or for functions
	   that don't need it. 

Reasons I don't generally like GNU code:
 - the indentation is all messed up: K&R has a good reason: it makes the
   code compact in the vertical direction, without making it unreadable. 
   Having braces on lines of their own doesn't much help readability,
   but *does* mean that you won't see as much of the real code in your
   window. 
 - the functions are usually *way* too long.  200 lines with 4-character
   indentation is horrible. 

Anyway, this got longer than I intended it to be, and most of it is
religious.  Ignore it if you wish, but you'll be sorry ;-)

		Linus