After running indent on the code, I manually selected each change that
now looks better than before. The remaining changes are left for later.
All in all, indent did a pretty good job, except for syntactic additions
from after 1990, but that was to be expected. Examples for such
additions are GCC's __attribute__ and C99 designated initializers.
Indent has only a few knobs to tune the indentation. The knob for the
continuation indentation applies to function declarations as well as to
expressions. The knob for indentation of local variable declarations
applies to struct members as well, even if these are members of a
top-level struct.
Several code comments crossed the right margin in column 78. Several
other code comments were correctly broken though. The cause for this
difference was not obvious.
No functional change.
Before C99, C had no boolean type. Instead, indent used int for that,
just like many other programs. Even with C99, bool and int can be used
interchangeably in many situations, such as querying '!i' or '!ptr' or
'cond == 0'.
Since January 2021, lint provides the strict bool mode, which makes bool
a non-arithmetic type that is incompatible with any other type. Having
clearly separate types helps in understanding the code.
To migrate indent to strict bool mode, the first step is to apply all
changes that keep the resulting binary the same. Since sizeof(bool) is
1 and sizeof(int) is 4, the type ibool serves as an intermediate type.
For now it is defined as int; later it will become bool.
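A minimal sketch of the intermediate type, assuming a plain typedef
named ibool as described above. The helper is_blank is hypothetical and
only illustrates that ibool is used in boolean contexts.

```c
/* Step 1: ibool is still int, so object sizes and the binary stay the same. */
typedef int ibool;

/* Hypothetical helper that returns a truth value as ibool. */
static ibool
is_blank(char ch)
{
	return ch == ' ' || ch == '\t';
}
```

In the second step, only the typedef would change to bool; callers that
use ibool purely as a truth value would not need to be touched.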
The current code compiles cleanly in C99 and C11 mode, as well as in
lint's strict bool mode. There are a few tricky places:
In args.c in 'struct pro', there are two types of options: boolean and
integer. Boolean options point to a bool variable, integer options
point to an int variable. To keep the current structure of the code,
the pointer has been changed to 'void *'. To ensure type safety, the
definition of the options is done via preprocessor magic, which in C11
mode ensures the correct pointer types. (Add CFLAGS+=-std=gnu11 at the
very bottom of the Makefile.)
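The following sketch shows one way such a compile-time check can be
built on C11's _Generic. The struct layout, the macro names and the
option variables are assumptions for illustration, not the exact
definitions from args.c.

```c
#include <stdbool.h>

struct pro {
	const char *p_name;	/* option name */
	bool p_is_bool;		/* selects how p_var is interpreted */
	void *p_var;		/* points to a bool or to an int */
};

/* In C11 mode, a pointer of the wrong type fails to compile. */
#if __STDC_VERSION__ >= 201112L
#define assert_type(expr, type) _Generic((expr), type: (expr))
#else
#define assert_type(expr, type) (expr)
#endif

#define bool_option(name, var) { name, true, assert_type(&(var), bool *) }
#define int_option(name, var) { name, false, assert_type(&(var), int *) }

/* Hypothetical option variables. */
static bool opt_blank_after_comma;
static int opt_line_length = 78;

static const struct pro pro[] = {
	bool_option("bc", opt_blank_after_comma),
	int_option("l", opt_line_length),
};
```

Swapping the two variables in the table above would be rejected by the
compiler in C11 mode, since neither pointer matches the type listed in
the generic selection.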
In indent.c in process_preprocessing, a boolean variable is
post-incremented. That variable is only assigned to another variable,
and that variable is only used in a boolean context. To provoke a
different behavior between the '++' and the '= true', the source code
to be indented would need 1 << 32 preprocessing directives, which is
unlikely to happen in practice.
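A scaled-down model of that difference, assuming an 8-bit unsigned
counter in place of the 32-bit int so that the wrap-around is reachable
and well-defined. The function names are hypothetical.

```c
#include <stdint.h>

/* Old style: the flag is incremented once per directive. */
static int
flag_after_increments(unsigned n)
{
	uint8_t flag = 0;

	for (unsigned i = 0; i < n; i++)
		flag++;		/* wraps around to 0 after 256 increments */
	return flag != 0;
}

/* New style: the flag is set to true once per directive. */
static int
flag_after_assignments(unsigned n)
{
	int flag = 0;

	for (unsigned i = 0; i < n; i++)
		flag = 1;
	return flag;
}
```

The two functions agree for every count from 0 to 255 and first differ
at 256, which corresponds to 1 << 32 in the real code.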
In io.c in dump_line, the variables ps.in_stmt and ps.in_decl only ever
get the values 0 and 1. For these values, the expressions 'a & ~b' and
'a && !b' are equivalent, in all versions of C. The compiler may
generate different code for them, though.
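The equivalence can be checked exhaustively, since there are only four
combinations. The function names below are hypothetical.

```c
/* The form used before the change. */
static int
bitwise_form(int a, int b)
{
	return a & ~b;
}

/* The form used after the change. */
static int
logical_form(int a, int b)
{
	return a && !b;
}

/* Returns 1 if both forms agree for all combinations of 0 and 1. */
static int
forms_agree(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (bitwise_form(a, b) != logical_form(a, b))
				return 0;
	return 1;
}
```

For other values the two forms differ, for example a = 1 and b = 2, so
the equivalence relies on the variables being restricted to 0 and 1.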
In io.c in parse_indent_comment, the assignment to inhibit_formatting
takes place in integer context. If the compiler is smart enough to
detect the possible values of on_off, it may generate the same code
before and after the change, but that is rather unlikely.
The second step of the migration will be to replace ibool with bool,
step by step, just in case there are any hidden gotchas in the code,
such as sizeof or pointer casts.
No change to the resulting binary.
The additional parameter last_bl_ptr was only necessary because the last
blank was stored as a pointer into the buffer. By storing the index in
the buffer instead, it doesn't need to be updated all the time.
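A sketch of the idea, assuming a growable buffer that may be moved by
realloc. The struct and function names are hypothetical and not taken
from indent's code.

```c
#include <stdlib.h>

struct buffer {
	char *mem;
	size_t len;
	size_t cap;
	size_t last_blank;	/* index of the last blank, (size_t)-1 if none */
};

static void
buf_add(struct buffer *buf, char ch)
{
	if (buf->len == buf->cap) {
		buf->cap = buf->cap == 0 ? 16 : 2 * buf->cap;
		buf->mem = realloc(buf->mem, buf->cap);	/* may move the memory */
		if (buf->mem == NULL)
			abort();
	}
	if (ch == ' ')
		buf->last_blank = buf->len;	/* stays valid after realloc */
	buf->mem[buf->len++] = ch;
}
```

A pointer into buf->mem would have to be recomputed after each
reallocation, while the index needs no adjustment.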
No functional change.
The manual page says that the default maximum length of a comment line
is 78. The test 'comments.0' wrongly assumed that this 78 would refer
to the maximum _column_ allowed, which is off by one.
Fix the wording in the test 'comments.0' and remove the (now satisfied)
expectation comments in the test 'token-comment.0'.
Several other tests just happened to hit that limit, fix these as well.
The invariant is: column == 1 + indentation.
In addition, indentation is a relative distance while column is an
absolute position. Keep these two concepts apart to prevent
off-by-one errors.
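The relation between the two concepts can be sketched as follows. The
function names and the tab handling are assumptions for illustration.

```c
/* Indentation: 0-based distance from the left margin. */
static int
indentation_after(int ind, const char *text, int tabsize)
{
	for (const char *p = text; *p != '\0'; p++) {
		if (*p == '\n')
			ind = 0;
		else if (*p == '\t')
			ind = tabsize * (ind / tabsize + 1);
		else
			ind++;
	}
	return ind;
}

/* Column: 1-based absolute position, derived only where it is needed. */
static int
column_of(int ind)
{
	return 1 + ind;
}
```

Working in indentation throughout keeps the tab arithmetic free of
'+ 1' and '- 1' corrections; the column is computed in a single place.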
No functional change.
The word 'col' should only be used for the 1-based column number. This
name is completely inappropriate for a line length since that provokes
off-by-one errors. The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.
Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.
No functional change.
These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'. Their names were
completely wrong since these functions did not count anything, instead
they computed the column.
No functional change.
The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in. Having only one of these
avoids off-by-one errors.
No functional change.
Use the invariant 'column == 1 + indent'. This removes several '+ 1'
adjustments from the code that are not needed conceptually.
No functional change.
Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.
No functional change.
It's strange that indent's own code is not formatted by indent itself,
which would be a good demonstration of its capabilities.
In its current state, I don't trust indent to get even the tokenization
correct, therefore the only safe way is to format the code manually.
This is a prerequisite for converting the token types to an enum instead
of a preprocessor define, since the return type of lexi will become
token_type. Having the enum will make debugging easier.
There was a single naming collision, which forced the variable in
scan_profile to be renamed. All other token names are used nowhere
else.
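A sketch of the planned conversion. The enumerator names and the
function classify are hypothetical and only stand in for the real
token names and for lexi.

```c
/*
 * Before: '#define newline 1', '#define lparen 2', and so on, with the
 * lexer returning a plain int.
 * After: an enum, so that a debugger can show the symbolic name.
 */
typedef enum token_type {
	end_of_file,
	newline,
	lparen,
	rparen,
	semicolon,
	ident
} token_type;

/* Heavily simplified stand-in for a lexer that returns token_type. */
static token_type
classify(int ch)
{
	switch (ch) {
	case '\0':
		return end_of_file;
	case '\n':
		return newline;
	case '(':
		return lparen;
	case ')':
		return rparen;
	case ';':
		return semicolon;
	default:
		return ident;
	}
}
```

Unlike macros, enumerators are ordinary identifiers, which is why a
variable with the same name as a token has to be renamed.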
No change to the resulting binary.
Merge all the changes from the recent FreeBSD HEAD snapshot
into our local copy.
FreeBSD actively maintains this program in their sources and their
repository contains over 100 commits with changes.
Keep the delta between the FreeBSD and NetBSD versions to an absolute
minimum, mostly RCS Id and compatibility fixes.
Major changes in this import:
- Add option -ldi<N> to control indentation of local variable names.
- Add option -P for loading user-provided files as profiles.
- Add option -ts<N> for setting the tab size.
- Rename -nsac/-sac ("space after cast") to -ncs/-cs.
- Add option -fbs, which enables (disables) splitting the function
  declaration and opening brace across two lines.
- Respect the SIMPLE_BACKUP_SUFFIX environment variable in indent(1).
- Group global option variables into an options structure.
- Use bsearch() for looking up type keywords.
- Don't produce an unneeded space character in function declarators.
- Don't unnecessarily add a blank before a comment ends.
- Don't ignore newlines after comments that follow braces.
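As an illustration of the bsearch() item above, a keyword lookup could
look like this. The table contents and function names are assumptions,
not the actual list from the FreeBSD sources.

```c
#include <stdlib.h>
#include <string.h>

/* Must be sorted in strcmp order, as bsearch requires. */
static const char *const type_keywords[] = {
	"char", "double", "float", "int", "long", "short", "unsigned", "void",
};

/* The key is the word itself, each element is a pointer to a keyword. */
static int
compare_keyword(const void *key, const void *elem)
{
	return strcmp(key, *(const char *const *)elem);
}

static int
is_type_keyword(const char *word)
{
	return bsearch(word, type_keywords,
	    sizeof(type_keywords) / sizeof(type_keywords[0]),
	    sizeof(type_keywords[0]), compare_keyword) != NULL;
}
```

The lookup only works while the table stays sorted, so new keywords
have to be inserted at the right position.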
Merge the FreeBSD indent(1) tests with our ATF framework.
All tests pass.
Upgrade prepared by Manikishan Ghantasala.
Final polishing by myself.