BusyBox vi has never supported the use of regular expressions in
search/replace (':s') commands. Implement this using GNU regex
when VI_REGEX_SEARCH is enabled.
The implementation:
- uses basic regular expressions, to match those used in the search
command;
- only supports substitution of back references ('\0' - '\9') in the
replacement string. Any other character following a backslash is
treated as that literal character.
VI_REGEX_SEARCH isn't enabled in the default build. In that case:
function old new delta
colon 4036 4033 -3
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-3) Total: -3 bytes
When VI_REGEX_SEARCH is enabled:
function old new delta
colon 4036 4378 +342
.rodata 108207 108229 +22
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 364/0) Total: 364 bytes
v2: Rebase. Code shrink. Ensure empty replacement string is null terminated.
Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Suppose we search for a git conflict marker '<<<<<<< HEAD' using
the command '/^<<<'. Using 'n' to go to the next match finds
'<<<' on the current line, apparently ignoring the '^' anchor.
Set a flag in the compiled regular expression to indicate that the
start of the string should not be considered a beginning-of-line
anchor. An exception has to be made when the search starts from
the beginning of the file. Make a similar change for end-of-line
anchors.
This doesn't affect a default build with VI_REGEX_SEARCH disabled.
When it's enabled:
function old new delta
char_search 247 285 +38
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Both traditional vi and vim use basic regular expressions for
search. Also, they don't allow matches to extend across line
endings. Thus with the file:
123
234
the search '/2.*4$' should find the second '2', not the first.
Make BusyBox vi do the same.
Whether or not VI_REGEX_SEARCH is enabled:
function old new delta
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0) Total: 0 bytes
Signed-off-by: Andrey Dobrovolsky <andrey.dobrovolsky.odessa@gmail.com>
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Commit 7b93e317c (vi: enable 'dG' command. Closes 11801) allowed
'G' to be used as a range specifier for change/yank/delete
operations.
Add similar support for 'gg'. This requires setting the 'cmd_error'
flag if 'g' is followed by any character other than another 'g'.
function old new delta
do_cmd 4852 4860 +8
.rodata 108179 108180 +1
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 9/0) Total: 9 bytes
Signed-off-by: Ron Yorston <rmy@pobox.com>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Example where it wasn't working:
awk 'BEGIN { printf "qwe %s rty %c uio\n", "a", 0, "c" }'
- the NUL printing in %c caused premature stop of printing.
function old new delta
awk_printf 593 596 +3
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Usually, an operation class has only one possible value of "info" word.
In this case, just compare the entire info word, do not bother
to mask OPCLSMASK bits.
(Example where this is not the case: OC_REPLACE for "<op>=")
function old new delta
mk_splitter 106 100 -6
chain_group 616 610 -6
nextarg 40 32 -8
exec_builtin 1157 1149 -8
as_regex 111 103 -8
awk_split 553 543 -10
parse_expr 948 936 -12
awk_getline 656 642 -14
evaluate 3387 3343 -44
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-116) Total: -116 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
We never destroy g_progname's, the strings still exist, no need to copy
function old new delta
chain_node 104 97 -7
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Disallow:
BEGIN
{ action } - must start on the same line
Disallow:
func f()
print "hello" - must be in {...}
function old new delta
chain_until_rbrace - 41 +41
parse_program 307 336 +29
chain_group 649 616 -33
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/1 up/down: 70/-33) Total: 37 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
While at it, make it finer-grained (63 bits of randomness)
function old new delta
evaluate 3303 3336 +33
.rodata 104107 104111 +4
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 2/0 up/down: 37/0) Total: 37 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Rework of the previous fix:
Can use operation attributes to disable arg evaluation instead of special-casing.
function old new delta
.rodata 104032 104036 +4
evaluate 3223 3215 -8
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 1/1 up/down: 4/-8) Total: -4 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
ptest() was using this idea already.
As far as I can see, this is safe. Ttestsuite passes.
One downside is that a temporary from e.g. printf invocation
won't be freed until the next printf call.
function old new delta
awk_printf 481 468 -13
as_regex 137 111 -26
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-39) Total: -39 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
It seems to be designed to reduce overhead of malloc's auxiliary data,
by allocating at least 64 variables as a block.
With "struct var" being about 20-32 bytes long (32/64 bits),
malloc overhead for one temporary indeed is high, ~33% more memory used
than needed.
function old new delta
evaluate 3137 3145 +8
modprobe_main 798 803 +5
exec_builtin 1414 1419 +5
awk_printf 476 481 +5
as_regex 132 137 +5
EMSG_INTERNAL_ERROR 15 - -15
nvfree 169 116 -53
nvalloc 145 - -145
------------------------------------------------------------------------------
(add/remove: 0/2 grow/shrink: 5/1 up/down: 28/-213) Total: -185 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
hash_find(): do not caclculate hash twice. Do not divide - can use
cheap multiply-by-8 shift.
nextword(): do not repeatedly increment in-memory value, do it in register,
then store final result.
hashwalk_init(): do not strlen() twice.
function old new delta
hash_search3 - 49 +49
hash_find 259 281 +22
nextword 19 16 -3
evaluate 3141 3137 -4
hash_search 54 28 -26
------------------------------------------------------------------------------
(add/remove: 1/0 grow/shrink: 1/3 up/down: 71/-33) Total: 38 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
We can free them after they are no longer needed.
(Currently, being a NOEXEC applet is much larger waste of memory
for the case of long-running awk script).
function old new delta
awk_main 831 827 -4
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>