136 Commits

Author SHA1 Message Date
rillig c162536975 indent: don't claim that indent is "the nicest C pretty printer around"
That statement may have been true in 1993, but definitely is not true
anymore, as of 2021.

The part about "needs to be completely redone" is still true though
since indent cannot even format its own source code in an acceptable
way.
2021-03-26 22:33:54 +00:00
rillig 0554da173f indent: remove workaround for array initialization bug in lint
The bug has been fixed in init.c 1.133 from 2021-03-25.
2021-03-26 22:27:43 +00:00
rillig f8d94ead78 indent: fix Clang build everywhere but on amd64
No idea why Clang didn't complain about this on amd64, only on all other
platforms.
2021-03-26 22:02:00 +00:00
rillig de1cec24a0 indent: clean up check_size_comment
The additional parameter last_bl_ptr was only necessary because the last
blank was stored as a pointer into the buffer.  By storing the index in
the buffer instead, it doesn't need to be updated all the time.

No functional change.
2021-03-14 05:26:42 +00:00
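The reasoning in the commit above (an index into a growable buffer stays valid, while a pointer into it must be updated whenever the buffer moves) can be sketched as follows. This is only an illustration, not the code from indent: `struct text_buffer`, `text_buffer_add`, and `NO_BLANK` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical sentinel meaning "no blank seen yet". */
#define NO_BLANK ((size_t)-1)

/* Hypothetical growable buffer, not the one used by indent. */
struct text_buffer {
    char *data;
    size_t len;
    size_t cap;
    size_t last_blank;	/* index of the last ' ', or NO_BLANK */
};

/* Appends ch, growing the buffer as needed.  Returns 0 on success. */
static int
text_buffer_add(struct text_buffer *b, char ch)
{
    if (b->len == b->cap) {
	size_t new_cap = b->cap == 0 ? 4 : 2 * b->cap;
	char *new_data = realloc(b->data, new_cap);
	if (new_data == NULL)
	    return -1;
	/*
	 * The data may have moved.  A pointer to the last blank would
	 * have to be recomputed here; the index needs no adjustment.
	 */
	b->data = new_data;
	b->cap = new_cap;
    }
    if (ch == ' ')
	b->last_blank = b->len;
    b->data[b->len++] = ch;
    return 0;
}
```

A caller initializes the structure with `{ NULL, 0, 0, NO_BLANK }` and reads the last blank as `b.data[b.last_blank]` after checking for `NO_BLANK`.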
rillig a8e5d6abb4 indent: remove trailing whitespace 2021-03-14 04:52:10 +00:00
rillig f21f69f749 indent: clean up target column computation in process_comment
No functional change.
2021-03-14 04:42:17 +00:00
rillig 3cc06592f5 indent: make compute_code_indent more readable
The '?:' operator computing the factor was too hard to read.  When
quickly scanning the code, the 1 in the expression looked too much like
it would be added to the indentation, which would turn the indentation
length into a column number, and that again would smell like an
off-by-one error.

No functional change.
2021-03-14 01:44:37 +00:00
rillig 1e4c413bac indent: fix off-by-one error in comment wrapping
The manual page says that the default maximum length of a comment line
is 78.  The test 'comments.0' wrongly assumed that this 78 would refer
to the maximum _column_ allowed, which is off by one.

Fix the wording in the test 'comments.0' and remove the (now satisfied)
expectation comments in the test 'token-comment.0'.

Several other tests just happened to hit that limit; fix these as well.
2021-03-14 01:34:13 +00:00
rillig bd908d33db indent: give indent a try at formatting its own code
Formatting indent.h required the following manual corrections
afterwards:

The first tab in the comment in line 1 was replaced with a space but
shouldn't be.

The spacing around the '...' in function prototypes was completely
wrong.  It looked like 'const char *,...)__printflike', without any
spaces.

The '*' of the return type 'const char *' was tied to the function name,
even though this declaration was only for a single function.  In such a
case, it's more appropriate to line up the function names.

The function-like macros were not indented to -di.  This is something
that I would not expect from indent, so it's ok to do that manually.
2021-03-14 00:33:25 +00:00
rillig 744982a9b6 indent: fix lint warnings
No functional change.
2021-03-14 00:22:16 +00:00
rillig 1eb04b8cb5 indent: remove disabled duplicate RCS ID from header
By convention, headers don't record their RCS ID.
2021-03-13 23:42:23 +00:00
rillig c960730f11 indent: fix documentation of parser_state.paren_indents
The column position is not the same as the indentation (off-by-one).
2021-03-13 23:36:10 +00:00
rillig fea66a5127 indent: add debug logging for switching the input buffer
No functional change outside debug mode.
2021-03-13 18:46:39 +00:00
rillig eb340c3e4a indent: align comments in indent's own code
No functional change.
2021-03-13 18:24:56 +00:00
rillig 56c4653e3f indent: remove the '+ 1' from right margin calculation in comment
No functional change.
2021-03-13 18:11:31 +00:00
rillig 7e11a5f382 indent: rename local variable in dump_line
This clarifies that the variable names a column, not an indentation.
2021-03-13 13:55:42 +00:00
rillig 7301e2c2d1 indent: in dump_line, reduce scope of local variable
This allows the variable 'target' in the lower half of the function to
get a more specific name.

No functional change.
2021-03-13 13:54:01 +00:00
rillig 66af9142ab indent: distinguish between 'column' and 'indentation'
column == 1 + indentation.

In addition, indentation is a relative distance while column is an
absolute position.  Therefore, don't confuse these two concepts, to
prevent off-by-one errors.

No functional change.
2021-03-13 13:51:08 +00:00
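The invariant from the commit above, column == 1 + indentation, can be written down as a pair of conversions. The function names below are hypothetical and only illustrate the distinction between an absolute position and a relative distance; they are not taken from indent.

```c
/* A column is an absolute, 1-based position on the line. */
static int
column_of(int indent)
{
    return 1 + indent;
}

/* An indentation is a 0-based distance from the left border. */
static int
indent_of(int column)
{
    return column - 1;
}

/*
 * Distances can simply be added: after writing text_len characters
 * starting at the given indentation, the new indentation is the sum.
 * No '+ 1' or '- 1' is involved.
 */
static int
indent_after(int indent, int text_len)
{
    return indent + text_len;
}
```

Keeping all arithmetic in indentations and converting to a column only at the boundary (for example in a message shown to the user) leaves a single place where the 1 appears.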
rillig 7ad6720446 indent: rename pr_comment to process_comment, clean up documentation
No functional change.
2021-03-13 13:25:23 +00:00
rillig 93e110b15f indent: fix handling of '/*' in string literal in preprocessing line
Previously, the '/*' in the string literal had been interpreted as the
beginning of a comment, which was wrong.  Because of that, the variable
declaration in the following line was still interpreted as part of the
comment.  The comment even continued until the end of the file.

Due to indent's forgiving nature, it neither complained nor even
mentioned that anything had gone wrong.  The decision to produce wrong
output rather than fail early is a dangerous one.

At least, there should have been an error message that at the end of the
file, the parser was still in a comment, expecting the closing '*/'.
2021-03-13 13:14:14 +00:00
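The bug described in the commit above comes down to looking for '/*' without tracking whether the scanner is inside a string literal. A minimal sketch of a scan that skips literals is shown below. It is not the fix that was committed to indent; `find_comment_start` is a hypothetical name, and the sketch ignores special cases such as an apostrophe in an `#error` line.

```c
#include <stddef.h>

/*
 * Returns a pointer to the first slash-star sequence that is outside
 * of a string or character literal, or NULL if there is none.
 */
static const char *
find_comment_start(const char *line)
{
    char quote = '\0';		/* '"' or '\'' while inside a literal */

    for (const char *p = line; *p != '\0'; p++) {
	if (quote != '\0') {
	    if (*p == '\\' && p[1] != '\0')
		p++;		/* skip the escaped character */
	    else if (*p == quote)
		quote = '\0';	/* the literal ends here */
	} else if (*p == '"' || *p == '\'') {
	    quote = *p;		/* a literal starts here */
	} else if (p[0] == '/' && p[1] == '*') {
	    return p;
	}
    }
    return NULL;
}
```

For a line such as `#define S "/*"` the function returns NULL, so the following lines are not swallowed as part of a comment.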
rillig 102d371a8a indent: split 'main_loop' into several functions
No functional change.
2021-03-13 12:52:24 +00:00
rillig b97b269d52 indent: split 'main' into manageable parts
For several years (maybe even decades), compilers have known how to inline
static functions that are only used once.  Therefore there is no need to
have overly long functions anymore, especially not 'main', which is only
called a single time and thus does not add any noticeable performance
degradation.

No functional change.
2021-03-13 11:47:22 +00:00
rillig acec5beac9 indent: remove redundant parentheses
No functional change.
2021-03-13 11:27:01 +00:00
rillig 0a99ae80ca indent: fix confusing variable names
The word 'col' should only be used for the 1-based column number.  This
name is completely inappropriate for a line length since that provokes
off-by-one errors.  The name 'cols' would be acceptable although
confusing since it sounds so similar to 'col'.

Therefore, rename variables that are related to the maximum line length
to 'line_length' since that makes for obvious code and nicely relates to
the description of the option in the manual page.

No functional change.
2021-03-13 11:19:43 +00:00
rillig 2c2459a1fa indent: document undefined behavior in processing of comments
No functional change.
2021-03-13 10:47:59 +00:00
rillig f3b63c94c8 indent: inline calls to count_spaces and count_spaces_until
These two functions operated on column numbers instead of indentation,
which required adjustments of '+ 1' and '- 1'.  Their names were
completely wrong since these functions did not count anything; instead,
they computed the column.

No functional change.
2021-03-13 10:32:25 +00:00
rillig 526591ce10 indent: replace column computation with indentation computation
No functional change.
2021-03-13 10:20:54 +00:00
rillig 0c51d9451c indent: replace compute_code_column with compute_code_indent
The goal is to only ever be concerned about the _indentation_ of a
token, never the _column_ it appears in.  Having only one of these
avoids off-by-one errors.

No functional change.
2021-03-13 10:06:47 +00:00
rillig 5888ddac66 indent: replace compute_label_column with compute_label_indent
Using the invariant 'column == 1 + indent'.  This removes several overly
complicated '+ 1' from the code that are not needed conceptually.

No functional change.
2021-03-13 09:54:11 +00:00
rillig 4f1ab5eff9 indent: manually fix indentation in indent's own source code 2021-03-13 09:48:04 +00:00
rillig 6f2286deb6 indent: add debug logging for actually writing to the output file
Together with the results of the tokenizer and the 4 buffers for token,
label, code and comment, the debug log now provides a good high-level
view on how the indentation happens and where to look for the many
remaining bugs.
2021-03-13 09:21:57 +00:00
rillig 6892e0dc50 indent: remove strange debugging code that went in the output file
Whenever the code to be output contained the magic byte 0x80, instead of
writing this byte, indent wrote the column number at the beginning of
the code snippet, times 7.  Especially the 'times 7' does not make any
sense at all.

In ISO-8859-1, this character position is not assigned.  In Microsoft
Codepage 1252 it is the Euro sign.  In UTF-8 (which was probably not on
the author's list when the code was originally written) it occurs as the
middle byte for code points like U+2026 (horizontal ellipsis) from the
block General Punctuation.

Remove this strange code, thereby fixing indent for UTF-8 code.  The
code had been there since at least 1993-04-09, when it was first
imported to NetBSD.
2021-03-13 09:06:12 +00:00
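The claim in the commit above that the byte 0x80 occurs in the middle of the UTF-8 encoding of U+2026 can be checked with a small encoder for the three-byte range. This is a sketch under the standard UTF-8 rules; `utf8_encode_3` is a hypothetical helper and not part of indent.

```c
#include <stddef.h>

/*
 * Encodes a code point from U+0800 to U+FFFF (excluding surrogates)
 * as three UTF-8 bytes: 1110xxxx 10xxxxxx 10xxxxxx.
 * Returns 3 on success, or 0 if the code point is outside that range.
 */
static size_t
utf8_encode_3(unsigned long cp, unsigned char out[3])
{
    if (cp < 0x0800 || cp > 0xFFFF || (cp >= 0xD800 && cp <= 0xDFFF))
	return 0;
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}
```

For U+2026 the result is the byte sequence E2 80 A6, so any source file containing a horizontal ellipsis in UTF-8 would have hit the removed code path.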
rillig 689a1f7922 indent: replace pad_output with output_indent
Calculating the indentation is simpler than calculating the column,
since that saves the constant addition and subtraction of the 1.

No functional change.
2021-03-13 00:26:56 +00:00
rillig 47a823baba indent: clean up verbose documentation comments from the 1970s
Since C90, there is no need to repeat the type of the function
parameters.

In the whole code of indent, there is a lot of confusion between the
concepts of a 'column' (which is a position on the screen, counting
starts at 1) and 'indentation' (which is a length, not a position).  To
avoid this confusion, the code will be rewritten anyway very soon.

Repeatedly adding and subtracting 1 from the 'current column' is not
elegant; this should rather be done by consistently measuring only the
indentation from the left border (at offset 0), as a distance, not as an
absolute position.
2021-03-13 00:03:29 +00:00
rillig 8f15c12d1f indent: add 'const', rename variables, reorder formula for tab width
Column counting starts at 1.  This 1 should rather be at the beginning
of the formula since it is thought of being added at the very beginning
of the line, not at the end.

When adding a tab, the newly added tab is added at the end of the
string, therefore that '+ 1' should be at the end of the formula as
well.

No functional change.
2021-03-12 23:27:41 +00:00
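The ordering argument in the commit above can be illustrated with a tab-stop computation. The formulas below are a sketch, not the exact expression from indent's source: the function names are hypothetical and `tabsize` is assumed to be positive.

```c
/*
 * Column-based version.  The leading 1 is the column of the first
 * character on the line; the trailing '+ 1' inside the parentheses
 * accounts for the tab that is being added at the end.
 */
static int
next_tab_column(int column, int tabsize)
{
    return 1 + tabsize * ((column - 1) / tabsize + 1);
}

/*
 * Indentation-based version of the same computation.  Since the
 * indentation starts at 0, the leading 1 and the '- 1' disappear.
 */
static int
next_tab_indent(int indent, int tabsize)
{
    return tabsize * (indent / tabsize + 1);
}
```

With a tab size of 8, a tab at column 1 or at column 8 leads to column 9, and a tab at column 9 leads to column 17. The two functions agree under the invariant column == 1 + indentation.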
rillig a0306e684f indent: replace 'target' with 'indent' in function names
The word 'target' was not as specific as possible.

No functional change.
2021-03-12 23:16:00 +00:00
rillig 2be5ec967d indent: use consistent indentation for 'else'
Half of the code used -ce, the other half the opposite -nce.

No functional change.
2021-03-12 23:10:18 +00:00
rillig 2603dcca0f indent: make output_string inline
GCC 9.3.0 didn't notice that the argument to this function is always a
string literal, which makes it worthwhile to inline the call.
2021-03-12 19:14:18 +00:00
rillig 51ad939870 indent: add helper functions for doing the actual output
This allows adding debug logging to these few functions instead of all
other places that might output something.

Reducing the possible output formats to a few primitives makes dump_line
simpler, especially the fprintf calls.  It also removes the non-constant
printf string.

The call to output_int may be meant for debugging, as the character 0x80
is unlikely to appear in any real-world code.

No functional change.
2021-03-12 19:11:29 +00:00
rillig d07ba49995 indent: fix misleading indentation in indent's own code
No functional change.
2021-03-12 18:11:50 +00:00
rillig 61a2b8d236 indent: move code for tokenizing numbers further up
Having it directly below the table makes it easier to understand.

I also tried to omit this function entirely by moving the code into the
initializer itself, but that made the code redundant and furthermore
increased the size of the resulting binary, probably because of the new
relocation records.

No functional change.
2021-03-12 17:46:48 +00:00
rillig 81c22dc68a indent: manually fix indentation
No functional change.
2021-03-12 00:15:34 +00:00
rillig 200ea2d398 indent: reduce indentation of check_size functions
No functional change.
2021-03-11 22:32:06 +00:00
rillig c7f0688822 indent: remove redundant cast after allocation functions
No functional change.
2021-03-11 22:28:30 +00:00
rillig 72f722fd46 indent: use consistent array indexing
No functional change.
2021-03-11 22:15:44 +00:00
rillig a855cd0537 indent: merge duplicate code for reading from the input buffer
No functional change.
2021-03-11 21:47:36 +00:00
rillig ee51961c3d indent: extract search_brace from main
No functional change.
2021-03-09 19:46:28 +00:00
rillig 4ca89b5791 indent: extract capsicum code out of the main function
No functional change.
2021-03-09 19:32:41 +00:00
rillig d60ce5707b indent: rename a few more token types
The previous names were either too short or ambiguous.

No functional change.
2021-03-09 19:23:08 +00:00
rillig ed0fd2fe70 indent: make token names more precise
The previous 'casestmt' was wrong since a case label is not a statement
at all.

The previous 'swstmt' was overly short, and wrong as well, since it
represents only the 'switch (expr)' part, which is not a complete switch
statement.  Same for 'ifstmt', 'whilestmt', 'forstmt'.

The previous word 'head' was not precise enough since it didn't specify
exactly where the head ends and the body starts.  Especially for
handling the dangling else, this distinction is important.

No functional change.
2021-03-09 19:14:39 +00:00