Commit Graph

1005 Commits

Author SHA1 Message Date
Ron Yorston
305a30d80b awk: fix read beyond end of buffer
Commit 7d06d6e18 (awk: fix printf %%) can cause awk printf to read
beyond the end of a strduped buffer:

      while (*f && *f != '%')
          f++;
      c = *++f;

If the loop terminates because a NUL character is detected the
character after the NUL is read.  This can result in failures
depending on the value of that character.
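
One way to avoid this kind of over-read is to test for the terminator before advancing. The following is only a sketch of the idea: `next_conversion` is a hypothetical helper and is not taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the BusyBox code): scan to the next '%'
 * and return a pointer to the character that follows it.  If the
 * string ends first, return a pointer to the terminating NUL so the
 * caller never looks past the end of the buffer. */
static const char *next_conversion(const char *f)
{
    assert(f != NULL);
    while (*f && *f != '%')
        f++;
    if (*f == '\0')
        return f;       /* no '%' found: stay on the terminator */
    return f + 1;       /* character after '%' (may itself be NUL) */
}
```

For a format with no '%' the returned pointer is the terminator itself, so a caller that dereferences it sees NUL rather than whatever byte follows the buffer.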

function                                             old     new   delta
awk_printf                                           672     665      -7

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-09 18:12:21 +02:00
Daniel Thau
7d06d6e186 awk: fix printf %%
A refactor of the awk printf code in e2e3802987 appears to have
broken the printf interpretation of two percent signs, which
normally outputs only one percent sign.

The patch below brings busybox awk printf behavior back into alignment
with the pre-e2e380 behavior, the busybox printf util, and other common
(awk and non-awk) printf implementations.
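
The expected handling of "%%" can be illustrated in isolation. This is a minimal sketch under the assumption that only the "%%" pair needs special treatment; `collapse_percent` is a hypothetical name and the real awk_printf code is organised differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy fmt into out, turning each "%%" pair
 * into a single '%'.  All other characters, including other '%'
 * sequences, are copied unchanged.  Returns the number of bytes
 * written, not counting the NUL; output is truncated if out is
 * too small. */
static size_t collapse_percent(char *out, size_t outsz, const char *fmt)
{
    size_t n = 0;

    assert(out != NULL && fmt != NULL && outsz > 0);
    while (*fmt && n + 1 < outsz) {
        if (fmt[0] == '%' && fmt[1] == '%')
            fmt++;              /* skip the first '%' of the pair */
        out[n++] = *fmt++;
    }
    out[n] = '\0';
    return n;
}
```

With this helper, the format "100%%\n" yields the five bytes "100%\n".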

function                                             old     new   delta
awk_printf                                           626     672     +46

Signed-off-by: Daniel Thau <danthau at bedrocklinux.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-09-05 03:42:51 +02:00
Ron Yorston
a51d953b95 vi: further changes to colon addresses
Improved error messages:

- specify when a search fails or a mark isn't set;
- warn when line addresses are out of range or when a range of
  lines is reversed.

Addresses are limited to the number of lines in the file so a
command like ':2000000000' (go to the two billionth line) no
longer causes a long pause.

Improved vi compatibility of '+' and '-' operators that aren't
followed immediately by a number:

   :4+++=       7
   :3-2=        1
   :3 - 2=      4 (yes, really!)
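
The three results above follow from the POSIX ex rules for address offsets: a '+' or '-' immediately followed by a number adds or subtracts that number, a bare '+' or '-' adds or subtracts 1, and a bare number is added. A rough sketch of those rules, with the hypothetical name `eval_offsets` (this is not the code in colon()):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: apply a sequence of ex-style address offsets
 * in p to the starting line number base and return the result. */
static long eval_offsets(long base, const char *p)
{
    char *end;

    assert(p != NULL);
    for (;;) {
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '+' || *p == '-') {
            int sign = (*p++ == '+') ? 1 : -1;
            if (isdigit((unsigned char)*p)) {
                /* sign immediately followed by a number */
                base += sign * strtol(p, &end, 10);
                p = end;
            } else {
                base += sign;   /* bare '+' or '-' means one line */
            }
        } else if (isdigit((unsigned char)*p)) {
            /* a number on its own is added */
            base += strtol(p, &end, 10);
            p = end;
        } else {
            break;
        }
    }
    return base;
}
```

In ':3 - 2' the '-' is followed by a space, so it subtracts 1, and the separate '2' is then added, giving 4.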

In a command like ':,$' the empty address before the separator now
correctly refers to the current line.  (The similar case ':1,' was
already being handled.)

And all with a tidy reduction in bloat (32-bit build):

function                                             old     new   delta
colon                                               4029    4069     +40
.rodata                                            99348   99253     -95
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 40/-95)            Total: -55 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-29 20:07:20 +02:00
Ron Yorston
74c4f356ae vi: code shrink print_literal()
Simplify the function print_literal() which is used to format a
string that may contain unprintable characters or control
characters.

- Unprintable characters were being displayed in normal text rather
  than the bold used for the rest of the message.  This doesn't seem
  particularly helpful and it upsets the calculation of the width
  of the message in show_status_line().  Use '?' rather than '.' for
  unprintable characters.

- Newlines in the string were displayed as both '^J' and '$', which
  is somewhat redundant.
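
The formatting rules can be sketched as follows. This is an illustration only: `format_literal` is a hypothetical name, the sketch assumes every byte with the high bit set is unprintable, and it assumes newline is shown as '^J' like any other control character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy src to dst, showing control characters
 * as ^X (and DEL as ^?) and bytes >= 0x80 as '?'.  Returns the
 * length of the result; output is truncated if dst is too small. */
static size_t format_literal(char *dst, size_t dstsz, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && dstsz > 0);
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (n + 3 > dstsz)
            break;              /* need room for two chars plus NUL */
        if (c >= 0x80) {
            dst[n++] = '?';     /* assumed unprintable */
        } else if (c < ' ' || c == 0x7f) {
            dst[n++] = '^';
            dst[n++] = (c == 0x7f) ? '?' : (char)(c | 0x40);
        } else {
            dst[n++] = (char)c;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Because every input byte maps to one or two ordinary characters with no change of attribute, the width of the result is simply its length.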

function                                             old     new   delta
not_implemented                                      199     108     -91
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-91)             Total: -91 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-22 00:09:57 +02:00
Ron Yorston
08ad934ac4 vi: searches in colon commands should wrap
The '/' and '?' search commands wrap to the other end of the buffer
if the search target isn't found.  When searches are used to specify
addresses in colon commands they should do the same.

(In traditional vi and vim this behaviour is controlled by the
'wrapscan' option.  BusyBox vi doesn't have this option and always
uses the default behaviour.)
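
Wrapping can be pictured with a small sketch over an array of lines. BusyBox vi keeps its text in a single buffer rather than an array, so `search_forward_wrap` and its parameters are hypothetical and serve only to show the order in which lines are visited.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the index of the first line after
 * start that contains needle, continuing from the last line to
 * line 0.  The starting line is checked last.  Returns -1 if no
 * line matches. */
static int search_forward_wrap(const char *const *lines, int count,
                               int start, const char *needle)
{
    assert(lines != NULL && needle != NULL);
    assert(count > 0 && start >= 0 && start < count);
    for (int step = 1; step <= count; step++) {
        int i = (start + step) % count;   /* wraps past the end */

        if (strstr(lines[i], needle) != NULL)
            return i;
    }
    return -1;
}
```

Starting from the second of three lines, a target that occurs only on the first line is still found, because the scan continues from the end of the array back to index 0.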

function                                             old     new   delta
colon                                               4033    4077     +44
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/0 up/down: 44/0)               Total: 44 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-22 00:09:57 +02:00
Ron Yorston
38e9c8c95b vi: don't right shift empty lines
The right shift command ('>') shouldn't affect empty lines.

function                                             old     new   delta
do_cmd                                              4860    4894     +34
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/0 up/down: 34/0)               Total: 34 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-20 15:26:09 +02:00
Ron Yorston
f9217cd235 vi: support ~/.exrc
Run initialisation commands from ~/.exrc.  As with EXINIT these
commands are processed before the first file is loaded.

Commands starting with double quotes are ignored.  This is how
comments are often included in .exrc.
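
The comment rule amounts to skipping any line whose first character is a double quote. A minimal sketch, assuming the file contents are already in a writable buffer; `next_command` is a hypothetical helper, not the function added by this commit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the next command in a newline-separated
 * buffer, skipping empty lines and lines that start with a double
 * quote.  The buffer is modified in place (newlines become NULs) and
 * *cursor is advanced.  Returns NULL when no commands remain. */
static char *next_command(char **cursor)
{
    char *p;

    assert(cursor != NULL);
    p = *cursor;
    while (p != NULL && *p != '\0') {
        char *line = p;
        char *nl = strchr(p, '\n');

        if (nl != NULL) {
            *nl = '\0';
            p = nl + 1;
        } else {
            p = line + strlen(line);
        }
        if (*line != '\0' && *line != '"') {
            *cursor = p;
            return line;
        }
    }
    *cursor = p;
    return NULL;
}
```

A caller would loop on `next_command()` and pass each returned string to whatever executes colon commands.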

function                                             old     new   delta
vi_main                                              268     406    +138
colon                                               4033    4071     +38
.rodata                                           108411  108442     +31
packed_usage                                       34128   34118     -10
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 3/1 up/down: 207/-10)           Total: 197 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-20 15:26:09 +02:00
Ron Yorston
f07772f19e vi: changes to handling of -c and EXINIT
Rewrite handling of command line arguments so any number of -c
commands will be processed.  Previously only two -c commands
were allowed (or one if EXINIT was set).

Process commands from EXINIT before the first file is read into
memory, as specified by POSIX.

function                                             old     new   delta
run_cmds                                               -      77     +77
.rodata                                           108410  108411      +1
vi_main                                              305     268     -37
edit_file                                            816     764     -52
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/2 up/down: 78/-89)            Total: -11 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-20 15:26:09 +02:00
Denys Vlasenko
d6a7203042 vi: fix compile-time error if !ENABLE_FEATURE_VI_SETOPTS
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-08-16 01:31:32 +02:00
Denys Vlasenko
dabbeeb793 awk: whitespace and debugging tweaks
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:58:05 +02:00
Denys Vlasenko
95fffd8a7f vi: remove redundant assignment
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:32:19 +02:00
Denys Vlasenko
d3480dd582 awk: disallow break/continue outside of loops
function                                             old     new   delta
.rodata                                           104139  104186     +47
chain_group                                          610     633     +23
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 70/0)               Total: 70 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:32:19 +02:00
Denys Vlasenko
d62627487a awk: tighten parsing - disallow extra semicolons
'; BEGIN {...}' and 'BEGIN {...} ;; {...}' are not accepted by gawk

function                                             old     new   delta
parse_program                                        332     353     +21

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-14 16:32:19 +02:00
Ron Yorston
e6f4145f29 vi: fix regex search compilation error
Building with FEATURE_VI_REGEX_SEARCH enabled fails.

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>
2021-07-13 18:23:43 +02:00
Denys Vlasenko
36feb26824 vi: somewhat more readable code, no logic changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 16:16:21 +02:00
Ron Yorston
2759201401 vi: allow delimiter in ':s' to be escaped
When regular expressions are allowed in search commands it becomes
possible to escape the delimiter in search/replace commands.  For
example, this command will replace '/abc' with '/abc/':

   :s/\/abc/\/abc\//g

The code to split the command into 'find' and 'replace' strings
should allow for this possibility.
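
Splitting on the delimiter then needs a search that skips backslash-escaped characters. The commit adds strchr_backslash() for this; the version below is an independent sketch with the hypothetical name `find_unescaped` and may differ from the real function in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer to the first occurrence of
 * delim in s that is not preceded by a backslash, or NULL if there
 * is none. */
static const char *find_unescaped(const char *s, int delim)
{
    assert(s != NULL);
    while (*s) {
        if (*s == delim)
            return s;
        if (*s == '\\' && s[1] != '\0')
            s++;                /* skip the escaped character */
        s++;
    }
    return NULL;
}
```

Applied to the text after ':s/' in the example above, the first unescaped '/' is the one that separates the 'find' and 'replace' strings.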

VI_REGEX_SEARCH isn't enabled by default.  When it is:

function                                             old     new   delta
strchr_backslash                                       -      38     +38
colon                                               4378    4373      -5
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/1 up/down: 38/-5)              Total: 33 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 14:42:20 +02:00
Denys Vlasenko
95ac4a48f1 vi: allow regular expressions in ':s' commands
BusyBox vi has never supported the use of regular expressions in
search/replace (':s') commands.  Implement this using GNU regex
when VI_REGEX_SEARCH is enabled.

The implementation:

- uses basic regular expressions, to match those used in the search
  command;

- only supports substitution of back references ('\0' - '\9') in the
  replacement string.  Any other character following a backslash is
  treated as that literal character.
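
The replacement rule can be sketched with the POSIX regex API. This is an illustration under assumptions: `expand_replacement` is a hypothetical helper, and the actual implementation uses the GNU regex interface rather than regcomp()/regexec().

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build the replacement text in out.  "\0".."\9"
 * insert the corresponding match from subject; any other character
 * after a backslash is copied literally.  Output is truncated if out
 * is too small.  Returns the length of the result. */
static size_t expand_replacement(char *out, size_t outsz,
                                 const char *subject,
                                 const regmatch_t *m, size_t nmatch,
                                 const char *repl)
{
    size_t n = 0;

    assert(out != NULL && subject != NULL && m != NULL);
    assert(repl != NULL && outsz > 0);
    for (; *repl; repl++) {
        if (*repl == '\\' && repl[1] != '\0') {
            repl++;
            if (*repl >= '0' && *repl <= '9') {
                size_t g = (size_t)(*repl - '0');

                if (g < nmatch && m[g].rm_so >= 0) {
                    size_t len = (size_t)(m[g].rm_eo - m[g].rm_so);

                    if (len > outsz - 1 - n)
                        len = outsz - 1 - n;    /* truncate to fit */
                    memcpy(out + n, subject + m[g].rm_so, len);
                    n += len;
                }
                continue;
            }
            /* fall through: escaped character is taken literally */
        }
        if (n + 1 < outsz)
            out[n++] = *repl;
    }
    out[n] = '\0';
    return n;
}
```

With the basic regular expression `\(a*\)b` matched against "xaab", the replacement `[\1|\0|\n]` expands to "[aa|aab|n]".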

VI_REGEX_SEARCH isn't enabled in the default build.  In that case:

function                                             old     new   delta
colon                                               4036    4033      -3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3)               Total: -3 bytes

When VI_REGEX_SEARCH is enabled:

function                                             old     new   delta
colon                                               4036    4378    +342
.rodata                                           108207  108229     +22
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 364/0)             Total: 364 bytes

v2: Rebase.  Code shrink.  Ensure empty replacement string is null terminated.

Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 14:38:20 +02:00
Ron Yorston
c76c78740a vi: improve handling of anchored searches
Suppose we search for a git conflict marker '<<<<<<< HEAD' using
the command '/^<<<'.  Using 'n' to go to the next match finds
'<<<' on the current line, apparently ignoring the '^' anchor.

Set a flag in the compiled regular expression to indicate that the
start of the string should not be considered a beginning-of-line
anchor.  An exception has to be made when the search starts from
the beginning of the file.  Make a similar change for end-of-line
anchors.
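
The effect of such a flag can be shown with the POSIX regexec() flag REG_NOTBOL, used here as a stand-in for the field in the GNU compiled pattern that the commit sets. `search_from` is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search line starting at byte offset.  When
 * the search does not start at the beginning of the line, tell the
 * matcher that the start of the string is not a beginning-of-line,
 * so '^' cannot match there.  Returns the match offset within line,
 * or -1 if there is no match. */
static long search_from(const regex_t *re, const char *line, size_t offset)
{
    regmatch_t m;
    int eflags;

    assert(re != NULL && line != NULL && offset <= strlen(line));
    eflags = (offset > 0) ? REG_NOTBOL : 0;
    if (regexec(re, line + offset, 1, &m, eflags) != 0)
        return -1;
    return (long)offset + (long)m.rm_so;
}
```

For the pattern '^<<<' and the line '<<<<<<< HEAD', a search from offset 0 matches at 0, while a search from offset 1 finds nothing instead of matching the '<<<' that begins there.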

This doesn't affect a default build with VI_REGEX_SEARCH disabled.
When it's enabled:

function                                             old     new   delta
char_search                                          247     285     +38

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:50:59 +02:00
Ron Yorston
2916443ab6 vi: use basic regular expressions for search
Both traditional vi and vim use basic regular expressions for
search.  Also, they don't allow matches to extend across line
endings.  Thus with the file:

   123
   234

the search '/2.*4$' should find the second '2', not the first.

Make BusyBox vi do the same.
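
The line-ending behaviour can be reproduced with POSIX regcomp() and the REG_NEWLINE flag. That flag is a substitute chosen for this sketch; the commit itself adjusts the GNU regex syntax bits instead. `find_within_lines` is a hypothetical helper.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile pattern as a basic regular expression
 * in which '.' does not match a newline and '^' / '$' match at line
 * boundaries, then search text.  Returns the offset of the first
 * match, or -1 if there is none or the pattern is invalid. */
static long find_within_lines(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    long pos = -1;

    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_NEWLINE) != 0)
        return -1;
    if (regexec(&re, text, 1, &m, 0) == 0)
        pos = (long)m.rm_so;
    regfree(&re);
    return pos;
}
```

Searching "123\n234\n" for '2.*4$' returns offset 4, the '2' on the second line, because the match is not allowed to run from the first line into the second.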

Whether or not VI_REGEX_SEARCH is enabled:

function                                             old     new   delta
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0)                 Total: 0 bytes

Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:36:29 +02:00
Ron Yorston
b50ac07cba vi: allow 'gg' to specify a range
Commit 7b93e317c (vi: enable 'dG' command. Closes 11801) allowed
'G' to be used as a range specifier for change/yank/delete
operations.

Add similar support for 'gg'.  This requires setting the 'cmd_error'
flag if 'g' is followed by any character other than another 'g'.

function                                             old     new   delta
do_cmd                                              4852    4860      +8
.rodata                                           108179  108180      +1
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 9/0)                 Total: 9 bytes

Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-13 13:32:13 +02:00
Denys Vlasenko
ab755e3717 awk: in parsing, remove superfluous NEWLINE check; optimize builtin arg evaluation
function                                             old     new   delta
exec_builtin                                        1149    1145      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 13:30:30 +02:00
Denys Vlasenko
8d269ef859 awk: fix printf "%-10c", 0
function                                             old     new   delta
awk_printf                                           596     626     +30

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-12 11:27:11 +02:00
Denys Vlasenko
caa93ecdd3 awk: fix corner case in awk_printf
Example where it wasn't working:
    awk 'BEGIN { printf "qwe %s rty %c uio\n", "a", 0, "c" }'
The NUL printed by %c caused printing to stop prematurely.
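
The underlying issue is that formatted output containing a NUL must be handled by length, not as a C string. A small sketch of the idea using snprintf() and fwrite(); both helper names are hypothetical and the awk code does not necessarily work this way.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the sample text into buf.  The return
 * value is the full length, including the embedded NUL produced by
 * %c with argument 0. */
static int format_sample(char *buf, size_t bufsz)
{
    assert(buf != NULL && bufsz > 0);
    return snprintf(buf, bufsz, "qwe %s rty %c uio\n", "a", 0);
}

/* Hypothetical helper: write exactly len bytes, so an embedded NUL
 * does not end the output early as it would with fputs(). */
static size_t write_counted(FILE *fp, const char *buf, size_t len)
{
    assert(fp != NULL && buf != NULL);
    return fwrite(buf, 1, len, fp);
}
```

The formatted text is 16 bytes long, but treating the buffer as a string would stop at the NUL in position 10 and lose " uio\n".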

function                                             old     new   delta
awk_printf                                           593     596      +3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 18:16:10 +02:00
Denys Vlasenko
39aabfe8f0 awk: unbreak "cmd" | getline
function                                             old     new   delta
evaluate                                            3337    3343      +6

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:51:43 +02:00
Denys Vlasenko
4ef8841b21 awk: unbreak "printf('%c') can output NUL" testcase
function                                             old     new   delta
awk_printf                                           546     593     +47

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:25:33 +02:00
Denys Vlasenko
3d57a84907 awk: undo TI_PRINT, it introduced a bug (print with any redirect acting as printf)
function                                             old     new   delta
evaluate                                            3329    3337      +8

Patch by Ron Yorston <rmy@pobox.com>

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 12:00:31 +02:00
Denys Vlasenko
49c3ce64f0 awk: rollback_token() + chain_group() == chain_until_rbrace()
function                                             old     new   delta
parse_program                                        336     332      -4

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-11 11:46:21 +02:00
Denys Vlasenko
e2e3802987 awk: fix printf buffer overflow
function                                             old     new   delta
awk_printf                                           468     546     +78
fmt_num                                              239     247      +8
getvar_s                                             125     111     -14
evaluate                                            3343    3329     -14
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/2 up/down: 86/-28)             Total: 58 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-04 01:25:34 +02:00
Denys Vlasenko
08ca313d7e awk: simplify tests for operation class
Usually, an operation class has only one possible value of the
"info" word.  In that case, just compare the entire info word
rather than masking the OPCLSMASK bits.

(Example where this is not the case: OC_REPLACE for "<op>=")

function                                             old     new   delta
mk_splitter                                          106     100      -6
chain_group                                          616     610      -6
nextarg                                               40      32      -8
exec_builtin                                        1157    1149      -8
as_regex                                             111     103      -8
awk_split                                            553     543     -10
parse_expr                                           948     936     -12
awk_getline                                          656     642     -14
evaluate                                            3387    3343     -44
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-116)           Total: -116 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 14:11:51 +02:00
Denys Vlasenko
cb042b0582 awk: restore strdup elision optimization in assignment
function                                             old     new   delta
evaluate                                            3339    3387     +48

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 13:30:45 +02:00
Denys Vlasenko
90404ed2f6 awk: match(): code shrink
function                                             old     new   delta
do_match                                               -     165    +165
exec_builtin_match                                   202       -    -202
------------------------------------------------------------------------------
(add/remove: 1/1 grow/shrink: 0/0 up/down: 165/-202)          Total: -37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 12:20:36 +02:00
Denys Vlasenko
0e3ef4efb0 awk: rand(): 64-bit constants should be ULL
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:57:59 +02:00
Denys Vlasenko
2211fa70cc awk: do not use a copy of g_progname for node->l.new_progname
We never destroy the g_progname strings, so they still exist and there
is no need to copy them.

function                                             old     new   delta
chain_node                                           104      97      -7

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 11:54:01 +02:00
Denys Vlasenko
e1e7ad6b60 awk: support %F %a %A in printf
function                                             old     new   delta
.rodata                                           104111  104120      +9

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:59:36 +02:00
Denys Vlasenko
1f765709ed awk: open-code TS_OPTERM, no logic changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:32:03 +02:00
Denys Vlasenko
2b65e73db3 awk: tighten rules in action parsing
Disallow:
    BEGIN
	{ action }  - must start on the same line
Disallow:
    func f()
	print "hello" - must be in {...}

function                                             old     new   delta
chain_until_rbrace                                     -      41     +41
parse_program                                        307     336     +29
chain_group                                          649     616     -33
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/1 up/down: 70/-33)             Total: 37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 01:16:48 +02:00
Denys Vlasenko
717200eb43 awk: rename GRPSTART/END to L/RBRACE, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-03 00:39:55 +02:00
Denys Vlasenko
b705bf5539 awk: move match() code out-of-line
function                                             old     new   delta
exec_builtin_match                                     -     202    +202
exec_builtin                                        1434    1157    -277
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 0/1 up/down: 202/-277)          Total: -75 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:48:48 +02:00
Denys Vlasenko
646429e05e awk: use smaller regmatch_t arrays, they had 2 elements for no apparent reason
function                                             old     new   delta
exec_builtin                                        1479    1434     -45

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:26:09 +02:00
Denys Vlasenko
a5d7b0f4f4 awk: fix detection of VAR=VAL arguments
1NAME=VAL is not a valid assignment, and neither is VA.R=VAL.

function                                             old     new   delta
next_input_file                                      216     214      -2
is_assignment                                        115      91     -24
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-26)             Total: -26 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 23:07:21 +02:00
Denys Vlasenko
4d902ea9de awk: fix behavior of "exit" without parameter
function                                             old     new   delta
evaluate                                            3336    3339      +3
awk_exit                                              93      94      +1
awk_main                                             829     827      -2
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-2)                Total: 2 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 22:28:51 +02:00
Denys Vlasenko
8bb03da906 awk: rand() could return 1.0, fix this - should be in [0,1)
While at it, make it finer-grained (63 bits of randomness)
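
Why a naive conversion can reach 1.0: dividing an integer by a maximum that is itself a possible value gives exactly 1.0, and even dividing by one more than the maximum can round up once the integer has more bits than a double's 53-bit mantissa. The sketch below uses the common 53-bit technique, which is not the 63-bit scheme this commit describes; `to_unit_interval` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map a 64-bit random integer to a double in
 * [0,1).  Only the top 53 bits are kept, so the conversion to double
 * is exact and the largest result is (2^53 - 1) / 2^53. */
static double to_unit_interval(uint64_t r)
{
    double d = (double)(r >> 11) * (1.0 / 9007199254740992.0); /* 2^53 */

    assert(d >= 0.0 && d < 1.0);
    return d;
}
```

Even for an input with all bits set, the result stays strictly below 1.0.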

function                                             old     new   delta
evaluate                                            3303    3336     +33
.rodata                                           104107  104111      +4
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 37/0)               Total: 37 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 19:38:03 +02:00
Denys Vlasenko
37ae8cdc6e awk: beautify builtins table, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 18:55:00 +02:00
Denys Vlasenko
47d9133896 awk: enforce simple builtins' argument number
function                                             old     new   delta
evaluate                                            3215    3303     +88
.rodata                                           104036  104107     +71
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 159/0)             Total: 159 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 18:28:12 +02:00
Denys Vlasenko
786ca197ad awk: make builtin definitions more understandable, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 17:32:08 +02:00
Denys Vlasenko
640212ae0e awk: do not special-case "delete"
Rework of the previous fix:
Can use operation attributes to disable arg evaluation instead of special-casing.

function                                             old     new   delta
.rodata                                           104032  104036      +4
evaluate                                            3223    3215      -8
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 4/-8)               Total: -4 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 15:21:36 +02:00
Denys Vlasenko
ef5463cf16 awk: shuffle globals for smaller offsets
function                                             old     new   delta
awk_main                                             832     829      -3
evaluate                                            3229    3223      -6
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-9)               Total: -9 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:53:52 +02:00
Denys Vlasenko
966cafcc77 awk: use "static" tmpvars in main and exit
function                                             old     new   delta
awk_exit                                             103      93     -10
awk_main                                             850     832     -18
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-28)             Total: -28 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:33:13 +02:00
Denys Vlasenko
1193c68fa7 awk: when parsing length(), simplify eating of LPAREN
function                                             old     new   delta
parse_expr                                           945     948      +3

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:29:01 +02:00
Denys Vlasenko
40573556f2 awk: shuffle functions to reduce forward declarations, no code changes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2021-07-02 14:27:40 +02:00