Commit Graph

9554 Commits

Author SHA1 Message Date
Benno Schulenberg
49d8b99e4f softwrap: avoid time-consuming computations, to burden large files less
Whenever softwrap was toggled on or line numbers were toggled on/off or
the window was resized, the extra rows per line needed to be recomputed
for ALL the lines.  For large files with many long lines this was too
costly.

(This change causes the indicator to have an incorrect size when there
are many softwrapped chunks onscreen, but that will be addressed later.)

This fixes https://savannah.gnu.org/bugs/?60429.

Problem existed since version 5.0, since the indicator was introduced.
2021-04-21 16:40:20 +02:00
Benno Schulenberg
bb81932422 chars: work around the wrong private-use-character widths on OpenBSD
This fixes https://savannah.gnu.org/bugs/?60393.
2021-04-20 11:13:08 +02:00
Benno Schulenberg
5efb6836a8 options: retire the obsolete 'smooth', 'morespace', and 'nopauses' 2021-04-15 11:43:39 +02:00
Benno Schulenberg
48fa14acc0 tweaks: simplify two fragments of code
This makes the handling of plain ASCII a tiny bit slower, but it
affects only the users of --constantshow without --minibar, so...

All other uses of mbstrlen() and collect_char() are not in speed-
critical code paths.
2021-04-13 11:19:32 +02:00
Benno Schulenberg
eb7181b35e tweaks: adjust and improve one comment, and frob another 2021-04-12 15:14:05 +02:00
Benno Schulenberg
8db42023bb files: when Mac format has been detected, stay with it
This fixes https://savannah.gnu.org/bugs/?60382.

Bug existed since commit 09b919a6 from three weeks ago.
2021-04-12 14:50:04 +02:00
Benno Schulenberg
018a8e12ca build: fix compilation for --enable-tiny plus --enable-multibuffer
This will not show the error messages for other buffers when using
a tiny build, but... one cannot have everything.
2021-04-10 12:01:34 +02:00
Benno Schulenberg
b4a5aedc6c tweaks: remove a misplaced (and nested) #ifdef
It was accidentally introduced two weeks ago by commit 1c010d8e.
2021-04-09 16:55:07 +02:00
Benno Schulenberg
d6ed174d09 tweaks: morph a function into what it is actually used for
Since the previous commit, mbwidth() is used only to determine whether
a character is either double width or zero width.  There is no need to
return the actual width of the character; a simple yes or no is enough.

Transforming mbwidth() into is_doublewidth() also allows streamlining
it and is_zerowidth() a bit, so that they become slightly faster.
2021-04-09 16:38:23 +02:00
Benno Schulenberg
78f92e044a tweaks: avoid parsing a multibyte character twice
The number of bytes in the character were determined twice: first in
mbwidth() and then in char_length().  Do it just once, in mbtowide().

Also, avoid calling is_cntrl_char(), because it does unneeded checks
when we already know that the high bit is set.

This duplicates some code, but advance_over() is called a lot, so it
is important that it is as fast as possible.

This shouldn't slow down plain ASCII, as the extra checks (use_utf8
and *string < 0xA0) are done only for non-ASCII (apart from DEL).
2021-04-09 11:32:15 +02:00
Benno Schulenberg
f11931a0dd tweaks: rename a variable, for contrast with another
The 'start_index' was in index in the given text, while 'index' is an
index in the displayable string.  Having both of them using 'index' in
their name was somewhat confusing.
2021-04-08 12:19:34 +02:00
Benno Schulenberg
31a6931be9 tweaks: elide a call of strlen() for every row
For a normal file (without overlong lines) the strlen() wasn't much
of a problem.  But when there are very long lines, it wasted time
counting stuff that wouldn't be displayed on the current row anyway,
and reserved *far* too much memory for the displayable string.

Problem existed since commit cf0eed6c from five years ago that traded
a continuous comparison (of the used space with the reserved space)
against a one-time big reservation up front involving a strlen().
In retrospect that was not a good trade-off when softwrapping.

The extra check (charwidth == 0) is incurred only by characters that
have their high bit set, so the average file (with only ASCII) is not
affected by this -- it just loses an unneeded call of strlen().
2021-04-08 12:15:12 +02:00
Benno Schulenberg
debb288115 tweaks: reduce the maximum character length from six bytes to four
In UTF-8 valid multibyte characters are at most four bytes long,
and now that we no longer make use of mblen() and mbtowc() from
the underlying system, we won't get five- or six-byte sequences
mistakenly reported as valid (by glibc).  So it is always enough
to reserve space for just four bytes per character.
2021-04-07 17:21:25 +02:00
Benno Schulenberg
c75a3839da tweaks: elide a small function that is used just once 2021-04-07 17:08:05 +02:00
Benno Schulenberg
b6a32fbd5f tweaks: elide an unneeded resetting NULL call to wctomb()
Calling wctomb() with NULL as the first parameter returns zero in a
UTF-8 locale, meaning that there is no state, so there is no point
in resetting it either.
2021-04-07 16:11:40 +02:00
Benno Schulenberg
09e4c86606 tweaks: improve a couple of comments 2021-04-07 12:28:48 +02:00
Benno Schulenberg
20eb422829 tweaks: avoid converting a file name for more than will fit on screen 2021-04-07 12:12:06 +02:00
Benno Schulenberg
90c6b572d0 display: avoid determining twice from and until where to draw each row
The two calls of draw_row() are each immediately preceded by a call to
display_string(), which has already determined from which x position
and until which x position in the relevant line the current row will
be drawn -- doing this again in draw_row() is a waste of time.  Even
though it is ugly, pass the two data points from one function to the
other via global variables.

For normal files (without overlong lines), this saves on average some
fifty calls of advance_over() per row.  When softwrapping a file with
overlong lines, the savings for each softwrapped chunk are much higher.
2021-04-07 11:27:10 +02:00
Benno Schulenberg
712b574fb7 tweaks: rename a variable, away from an abbreviation 2021-04-06 16:27:46 +02:00
Benno Schulenberg
fd023d6dcf tweaks: put the most likely condition first, for a quicker return
Also condense a comment.
2021-04-06 11:32:25 +02:00
Benno Schulenberg
9c16be32d7 tweaks: reshuffle two conditions, to have the most unlikely one first
This also better fits the preceding comment.
2021-04-05 16:10:44 +02:00
Benno Schulenberg
c3cdb099da tweaks: reshuffle a comment, and put the main extension first
And add some air to the compact file.
2021-04-05 16:06:54 +02:00
Mike Frysinger
682a088fb8 syntax: tcl: support Expect scripts too 2021-04-05 16:02:44 +02:00
Benno Schulenberg
8c25bd0e94 tweaks: elide two more instances of useless character copying
Just point at the relevant characters directly
instead of first copying them out.
2021-04-02 16:39:10 +02:00
Benno Schulenberg
20e122ef41 startup: show the helpful message only when ^G has not been rebound
(Well, it now checks that ^G is still the first shortcut that is bound
to 'do_help', but that is good enough: if the user did any rebinding,
they probably do not need any reminder about how to invoke 'Help'.)

This fixes https://savannah.gnu.org/bugs/?60315.
Reported-by: Robert Goulding <goulding.2@nd.edu>
2021-04-01 10:09:28 +02:00
Benno Schulenberg
0dcac9188f tweaks: simplify two fragments of code, eliding useless character copying 2021-03-29 20:06:05 +02:00
Benno Schulenberg
1c010d8ec9 chars: implement mbtowc() ourselves, for more efficiency
This saves a function call, and the passing and checking of the
MAXCHARLEN parameter, and the checking whether wc is maybe NULL
(which for nano is never the case), and who knows what other
overheads mbtowc() has, and our workaround for glibc.

Code was written after looking at gnulib/lib/mbrtowc-impl-utf8.h.
2021-03-29 12:36:10 +02:00
Benno Schulenberg
b020937475 chars: implement mblen() ourselves, for efficiency
Most implementations of mblen() do a call to mbtowc(), which is
a waste of time when all we want to know is the number of bytes
(and when we already know that we're using UTF-8 and that the
first byte is at least 0xC2).

(This also avoids burdening correct implementations with the
workaround that was needed only for glibc.)

Code was written after looking at gnulib/lib/mbrtowc-impl-utf8.h.
2021-03-27 14:38:28 +01:00
Benno Schulenberg
e3f46b066a build: fix compilation when configured with --disable-multibuffer 2021-03-26 12:21:44 +01:00
Benno Schulenberg
df7fe1280d tweaks: drop unneeded braces and adjust indentation after previous change 2021-03-26 12:17:44 +01:00
Benno Schulenberg
929770191e chars: work around a UTF-8 bug in glibc, to display invalid codes right
The mblen() and mbtowc() functions will happily return 4 or 5 or 6
for byte sequences that start with 0xF4 0x90 or higher.  But those
sequences encode for U+110000 or higher, which are not valid Unicode
code points.  The libc of FreeBSD and OpenBSD and Alpine correctly
return -1 for such sequences.  Make nano behave correctly also when
linked against glibc, so that invalid sequences are always presented
as a series of invalid bytes and never as a single invalid code.

This fixes https://savannah.gnu.org/bugs/?60262.

Bug existed since before version 2.0.0.
2021-03-26 11:07:05 +01:00
Benno Schulenberg
66d9d6c6d2 tweaks: elide the pointless is_valid_unicode() function
The call of this function in make_mbchar() does not add anything,
because wctomb() already returns -1 for codes U+D800 to U+DFFF,
and parse_verbatim_kbinput() already rejects anything that starts
with U+11.... or higher, so make_mbchar() is never called for codes
beyond U+10FFFF.

And the call in display_string() just needs to check for wc <= 0x10FFFF
because mbtowc() already returns -1 for codes U+D800 to U+DFFF.
2021-03-25 11:24:41 +01:00
Benno Schulenberg
de816840cb input: accept Unicode codes for non-characters as valid, since they are
That is, accept U+FDD0 to U+FDEF, and accept U+xxFFFE and U+xxFFFF
for xx from 00 to 10 hex, being the 66 reserved "non-characters".

It may not be wise of the user to input these "things" (by typing
their code after M-V), but the codes are valid Unicode code points
and should not be rejected.

See https://www.unicode.org/faq/private_use.html#nonchar8 et al.

This fixes https://savannah.gnu.org/bugs/?60263.

Bug existed since before version 2.0.0.
2021-03-24 17:11:05 +01:00
Benno Schulenberg
74fcc3be79 tweaks: normalize the indentation after an earlier change
(Should have been done yesterday, right after commit 2f718e11.)
2021-03-24 12:29:50 +01:00
Benno Schulenberg
b6909d3737 build: fix compilation when configured with --enable-tiny
Commit 0c1bf429 from last week added two calls to digits()
for --constantshow.
2021-03-24 12:21:50 +01:00
Benno Schulenberg
823d79b36c tweaks: shorten a comment and trim an #ifdef 2021-03-24 12:16:10 +01:00
Benno Schulenberg
735757b0c1 tweaks: set the file format only when unset, so it doesn't need saving
When inserting a file into the current buffer, the 'fmt' element will
already be set.  When we avoid overwriting the current value of 'fmt'
(when it's other than UNSPECIFIED), we don't need to save and restore
the value when inserting a file.
2021-03-24 12:00:34 +01:00
Benno Schulenberg
09b919a68f files: always register the format, also when the file is unwritable
When saving the buffer under a different name, it should by default
have the same format as the original file.

This fixes https://savannah.gnu.org/bugs/?60278.

Bug existed since version 2.6.0, commit 0293eac1.
2021-03-24 11:53:56 +01:00
Benno Schulenberg
2f718e11a7 files: create a new buffer earlier, so that error messages can be stored
This improves the fix for https://savannah.gnu.org/bugs/?60269,
by not dropping error messages that happen before a buffer is opened.

This basically reverts commit b63c90bf from a year ago, except that
it now always deletes the created buffer when the user does not want
to override the lock file, also when it is the only buffer.
2021-03-23 16:46:37 +01:00
Benno Schulenberg
77da54c6c6 startup: do not store an error message in the record of another buffer
Set the 'format' of a file only when it has been fully read in,
so that this field can be used to indicate that any later error
message cannot be meant for this buffer.

This fixes https://savannah.gnu.org/bugs/?60269.

Bug existed since commit 6bf52dcc from yesterday.
2021-03-23 16:19:07 +01:00
Benno Schulenberg
6bf52dcc8d startup: do not crash when trying to open a device or directory
Make sure there is an 'openfile' record before trying to save an
error message in this record.

This fixes https://savannah.gnu.org/bugs/?60268.

Bug existed since commit ede64d7e from yesterday.
2021-03-22 15:50:31 +01:00
Benno Schulenberg
3d9e803aed syntax: c: colorize also labels that contain digits, and uncolorize colon
Labels may contain digits (after the first character).
And the colon after "default" should not be colored.

Inspired-by: Hussam al-Homsi <sawuare@gmail.com>
2021-03-22 11:41:08 +01:00
Benno Schulenberg
ede64d7ea0 feedback: upon first switch to a buffer, show its error message (if any)
When opening multiple files and some of them had an error, only the
first message was shown and the others were lost -- indicated only
by three dots.  Improve upon this by storing the first error message
for each buffer and showing this message when the buffer is first
switched to.

Requested-by: Mike Frysinger <vapier@gentoo.org>
2021-03-21 16:52:29 +01:00
Benno Schulenberg
e8db390d6f tweaks: reshuffle a fragment of code, to prepare for the next change
Also, don't reserve MAXCHARLEN bytes for the terminating NUL byte.
2021-03-21 16:52:29 +01:00
Benno Schulenberg
0c1bf429e8 display: make the output of --constantshow less jittery
That is: reserve for the current line and current character the number
of positions needed for the total number of lines and characters, and
reserve two positions for both the current column and the total number
of columns.  This will keep all nine numbers in the output in the same
place -- as long as there are no lines with more than 99 columns.  In
this latter case there will still be some jitter, but all-in-all the
output is much stabler than it was.

Suggested-by: Mike Frysinger <vapier@gentoo.org>
2021-03-18 16:20:04 +01:00
Benno Schulenberg
f6357a73f0 memory: fix an off-by-one error to free also the last line in a group
The number of lines in a group is the difference in line numbers
between the last line and the first line *plus one*.

This fixes https://savannah.gnu.org/bugs/?60104.

Bug existed since version 2.9.0, since indenting and unindenting
became undoable, and commit f722c532 formed a part of that.
2021-03-05 18:50:34 +01:00
Benno Schulenberg
49ca7e5aa8 memory: do not allocate space for multidata when it's already allocated
Allocating it again would leak the existing space.

This fixes https://savannah.gnu.org/bugs/?60172.
Reported-by: Mike Frysinger <vapier@gentoo.org>

Bug existed since version 5.6, commit 1fdd23d3.
2021-03-05 11:25:14 +01:00
Benno Schulenberg
be9b1a1887 tweaks: avoid a warning on newer compilers, by writing an extra byte
When the version number is a trio, the version string will occupy
ten bytes and the terminating NUL byte would not be written (which
was not a problem as byte 12 of the lock data is zero anyway).
But it's better to not have the compiler complain, so allow writing
the terminating NUL byte outside of the ten bytes reserved for the
version string.
2021-03-03 10:47:41 +01:00
Benno Schulenberg
bb74aacf2d po: update translations and regenerate POT file and PO files 2021-03-03 10:11:58 +01:00
Benno Schulenberg
b72d9af52c bump version numbers and add a news item for the 5.6.1 release 2021-03-03 09:47:41 +01:00