Commit f9002b85 authored by Warner Losh

awk: bring in vendor branch from upstream 20210727

Changes since the last import:

July 27, 2021:
	As per IEEE Std 1003.1-2008, -F "str" is now consistent with
	-v FS="str" when str is null. Thanks to Warner Losh.

July 24, 2021:
	Fix readrec's definition of a record. This fixes an issue
	with NetBSD's RS regular expression support that can cause
	an infinite read loop. Thanks to Miguel Pineiro Jr.

	Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
	know if it is reading the first record of a file. This change
	restores a missing line that was overlooked when porting NetBSD's
	RS regex functionality. Thanks to Miguel Pineiro Jr.

	Fix size computation in replace_repeat() for special case
	REPEAT_WITH_Q. Thanks to Todd C. Miller.

Also, for the first time, import all the tests.

Sponsored by:		Netflix
parent 746b7396
......@@ -25,6 +25,23 @@ THIS SOFTWARE.
This file lists all bug fixes, changes, etc., made since the AWK book
was sent to the printers in August, 1987.
July 27, 2021:
As per IEEE Std 1003.1-2008, -F "str" is now consistent with
-v FS="str" when str is null. Thanks to Warner Losh.
July 24, 2021:
Fix readrec's definition of a record. This fixes an issue
with NetBSD's RS regular expression support that can cause
an infinite read loop. Thanks to Miguel Pineiro Jr.
Fix regular expression RS ^-anchoring. RS ^-anchoring needs to
know if it is reading the first record of a file. This change
restores a missing line that was overlooked when porting NetBSD's
RS regex functionality. Thanks to Miguel Pineiro Jr.
Fix size computation in replace_repeat() for special case
REPEAT_WITH_Q. Thanks to Todd C. Miller.
February 15, 2021:
Small fix so that awk will compile again with g++. Thanks to
Arnold Robbins.
......
# The One True Awk
This is the version of `awk` described in _The AWK Programming Language_,
by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 1988, ISBN 0-201-07981-X).
## Copyright
Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved
Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.
LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
## Distribution and Reporting Problems
Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`. If you distribute this code further, please please please
distribute `FIXES` with it.
If you find errors, please report them
to bwk@cs.princeton.edu.
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.
## Submitting Pull Requests
Pull requests are welcome. Some guidelines:
* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).
* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:
1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
1. Run `oldawk=nawk make check > check.out 2>&1`.
1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.
* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.
## Building
The program itself is created by
make
which should produce a sequence of messages roughly like this:
yacc -d awkgram.y
conflicts: 43 shift/reduce, 85 reduce/reduce
mv y.tab.c ytab.c
mv y.tab.h ytab.h
cc -c ytab.c
cc -c b.c
cc -c main.c
cc -c parse.c
cc maketab.c -o maketab
./maketab >proctab.c
cc -c proctab.c
cc -c tran.c
cc -c lib.c
cc -c run.c
cc -c lex.c
cc ytab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm
This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.
If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
NOTE: This version uses ANSI C (C 99), as you should also. We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.
This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.
You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.
The version of `malloc` that comes with some systems is sometimes
astonishingly slow. If `awk` seems slow, you might try fixing that.
More generally, turning on optimization can significantly improve
`awk`'s speed, perhaps by 1/3 for highest levels.
## A Note About Releases
We don't do releases.
## A Note About Maintenance
NOTICE! Maintenance of this program is on a ''best effort''
basis. We try to get to issues and pull requests as quickly
as we can. Unfortunately, however, keeping this program going
is not at the top of our priority list.
#### Last Updated
Sat Jul 25 14:00:07 EDT 2021
Wed Jan 22 02:10:35 MST 2020
============================
Here are some things that it'd be nice to have volunteer
help on.
1. Rework the test suite so that it's easier to maintain
and see exactly which tests fail:
A. Extract beebe.tar into separate file and update scripts
B. Split apart multiple tests into separate tests with input
and "ok" files for comparisons.
2. Pull in more of the tests from gawk that only test standard features.
The beebe.tar file appears to be from sometime in the 1990s.
3. Make the One True Awk valgrind clean. In particular add a
test suite target that runs valgrind on all the tests and
reports if there are any definite losses or any invalid reads
or writes (similar to gawk's test of this nature).
......@@ -935,7 +935,7 @@ replace_repeat(const uschar *reptok, int reptoklen, const uschar *atom,
if (special_case == REPEAT_PLUS_APPENDED) {
size++; /* for the final + */
} else if (special_case == REPEAT_WITH_Q) {
- size += init_q + (atomlen+1)* n_q_reps;
+ size += init_q + (atomlen+1)* (n_q_reps-init_q);
} else if (special_case == REPEAT_ZERO) {
size += 2; /* just a null ERE: () */
}
......@@ -964,11 +964,8 @@ replace_repeat(const uschar *reptok, int reptoklen, const uschar *atom,
}
}
memcpy(&buf[j], reptok+reptoklen, suffix_length);
- if (special_case == REPEAT_ZERO) {
- buf[j+suffix_length] = '\0';
- } else {
- buf[size] = '\0';
- }
+ j += suffix_length;
+ buf[j] = '\0';
/* free old basestr */
if (firstbasestr != basestr) {
if (basestr)
......
#! /bin/bash
if [ ! -f ../a.out ]
then
echo Making executable
(cd .. ; make) || exit 0
fi
for i in *.awk
do
echo === $i
OUT=${i%.awk}.OUT
OK=${i%.awk}.ok
IN=${i%.awk}.in
input=
if [ -f $IN ]
then
input=$IN
fi
../a.out -f $i $input > $OUT 2>&1
if cmp -s $OK $OUT
then
rm -f $OUT
else
echo ++++ $i failed!
fi
done
{
for (i = 1; i <= NF; i++)
print i, $i, $i + 0
}
-inf -inform inform -nan -nancy nancy -123 0 123 +123 nancy +nancy +nan inform +inform +inf
1 -inf -inf
2 -inform 0
3 inform 0
4 -nan -nan
5 -nancy 0
6 nancy 0
7 -123 -123
8 0 0
9 123 123
10 +123 123
11 nancy 0
12 +nancy 0
13 +nan +nan
14 inform 0
15 +inform 0
16 +inf +inf
\ No newline at end of file
../a.out: syntax error at source line 1 source file pfile-overflow.awk
context is
>>> <<<
../a.out: bailing out at source line 1 source file pfile-overflow.awk
BEGIN { RS="zx" } { print $1 }
......@@ -176,6 +176,7 @@ int getrec(char **pbuf, int *pbufsize, bool isrecord) /* get next input record *
infile = stdin;
else if ((infile = fopen(file, "r")) == NULL)
FATAL("can't open file %s", file);
+ innew = true;
setfval(fnrloc, 0.0);
}
c = readrec(&buf, &bufsize, infile, innew);
......@@ -241,6 +242,7 @@ int readrec(char **pbuf, int *pbufsize, FILE *inf, bool newflag) /* read one rec
}
if (found)
setptr(patbeg, '\0');
+ isrec = (found == 0 && *buf == '\0') ? false : true;
} else {
if ((sep = *rs) == 0) {
sep = '\n';
......@@ -270,10 +272,10 @@ int readrec(char **pbuf, int *pbufsize, FILE *inf, bool newflag) /* read one rec
if (!adjbuf(&buf, &bufsize, 1+rr-buf, recsize, &rr, "readrec 3"))
FATAL("input record `%.30s...' too long", buf);
*rr = 0;
+ isrec = (c == EOF && rr == buf) ? false : true;
}
*pbuf = buf;
*pbufsize = bufsize;
- isrec = *buf || !feof(inf);
DPRINTF("readrec saw <%s>, returns %d\n", buf, isrec);
return isrec;
}
......
......@@ -22,7 +22,7 @@ ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
****************************************************************/
- const char *version = "version 20210215";
+ const char *version = "version 20210724";
#define DEBUG
#include <stdio.h>
......@@ -91,9 +91,7 @@ setfs(char *p)
/* wart: t=>\t */
if (p[0] == 't' && p[1] == '\0')
return "\t";
- else if (p[0] != '\0')
- return p;
- return NULL;
+ return p;
}
static char *
......@@ -169,8 +167,6 @@ int main(int argc, char *argv[])
break;
case 'F': /* set field separator */
fs = setfs(getarg(&argc, &argv, "no field separator"));
- if (fs == NULL)
- WARNING("field separator FS is empty");
break;
case 'v': /* -v a=1 to be done NOW. one -v for each */
vn = getarg(&argc, &argv, "no variable name");
......
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i in T.*
do
$i
done
# an arbitrary collection of input data
cat td.1 td.1 >foo.td
sed 's/^........................//' td.1 >>foo.td
pr -m td.1 td.1 td.1 >>foo.td
pr -2 td.1 >>foo.td
wc foo.td
td=foo.td
>footot
for i in $*
do
echo $i >/dev/tty
echo $i '<<<'
cd ..
echo testdir/$i:
ind <testdir/$i
a.out -f testdir/$i >drek.c
cat drek.c
make drek || ( echo $i ' ' bad compile; echo $i ' ' bad compile >/dev/tty; continue )
cd testdir
time /usr/bin/awk -f $i $td >foo2 2>foo2t
cat foo2t
time ../drek $td >foo1 2>foo1t
cat foo1t
cmp foo1 foo2 || ( echo $i ' ' bad; echo $i ' ' bad >/dev/tty; diff foo1 foo2 | sed 20q )
echo '>>>' $i
echo
echo $i: >>footot
cat foo1t foo2t >>footot
done
ctimes footot
oldawk=${oldawk-awk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i
do
echo "$i:"
$oldawk -f $i test.countries test.countries >foo1
$awk -f $i test.countries test.countries >foo2
if cmp -s foo1 foo2
then true
else echo -n "$i: BAD ..."
fi
diff -b foo1 foo2 | sed -e 's/^/ /' -e 10q
done
oldawk=${oldawk-myawk}
awk=${awk-../a.out}
echo oldawk=$oldawk, awk=$awk
for i
do
echo "$i:"
$oldawk -f $i test.data >foo1
$awk -f $i test.data >foo2
if cmp -s foo1 foo2
then true
else echo -n "$i: BAD ..."
fi
diff -b foo1 foo2 | sed -e 's/^/ /' -e 10q
done