egrep
wc is a program that does character, word, and
line counts on files. The -l option indicates line count,
and that's most useful, I think. For instance: how many
lines does the tmpdata file from last homework have?
> wc -l ~wcbrown/tmpdata
195 /users/faculty/wcbrown/tmpdata
This tells me it has 195 lines.
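Here is a small sketch of the three counts wc can give, run on a throwaway file rather than on tmpdata. The file contents are made up for illustration, and the use of mktemp to create the scratch file is an assumption about your system (it is common but not universal). Note that -c counts bytes, which is the same as characters for plain ASCII text.

```shell
# Build a small sample file (hypothetical contents, three lines).
sample=$(mktemp)
printf 'one two\nthree\nfour five six\n' > "$sample"

# Reading from standard input makes wc print only the number, no file name.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
chars=$(wc -c < "$sample")

# $((...)) strips the leading padding some wc implementations add.
echo "lines=$((lines)) words=$((words)) bytes=$((chars))"

rm -f "$sample"
```

Redirecting the file into wc with < is a handy trick when you want just the number, for example to store it in a variable as above.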
Piped commands are a powerful part of unix. You "pipe" the
output of one command into the input of another command.
For example, ls -l prints out all the files in
the current directory - one per line.
valiant[202] [~/courses/SI472/classes/C07/]> ls -l
total 8
-rw-r--r-- 1 wcbrown faculty 1679 Sep 12 17:15 Class.html
-rw-r--r-- 1 wcbrown faculty 936 Sep 12 15:52 Class.html~
drwxr-x--x 2 wcbrown faculty 512 Sep 14 07:46 HW/
If I want to know how many files are in a directory, I simply pipe the output of
ls -l to the input of
wc -l:
valiant[203] [~/courses/SI472/classes/C07/]> ls -l | wc -l
4
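A pipe is indicated with a | character. Note that the count of 4 includes the "total 8" line that ls -l prints first, so the directory actually holds 3 entries.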
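To see the pipe at work without depending on any particular directory, here is a sketch that builds a scratch directory and counts its entries two ways. The directory and the file name notes.txt are hypothetical, and mktemp -d is assumed to be available. It compares piping ls -l with piping plain ls, which prints one name per line when its output goes to a pipe.

```shell
# Work in a scratch directory holding three hypothetical files.
dir=$(mktemp -d)
touch "$dir/Class.html" "$dir/Class.html~" "$dir/notes.txt"

# ls -l prints a "total" line before the entries, so this count
# is one more than the number of entries.
with_total=$(ls -l "$dir" | wc -l)

# Plain ls, when piped, prints one name per line and no header.
names_only=$(ls "$dir" | wc -l)

echo "ls -l | wc -l: $((with_total))"
echo "ls | wc -l:    $((names_only))"

rm -rf "$dir"
```

If what you want is the number of entries, piping plain ls is probably the simpler choice.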
Web server access logs: Log on to csb and type:
cd /usr/netscape/server4/https-csb.mathsci.usna.edu/logs
to change to the directory in which the departmental access log is kept. The file
access is the access log. Type
more access
to move through the file (hitting space brings up the next page in the file). This file is HUGE, so do not copy it or try to open it in a text editor. How huge is it?
csb[123] [/usr/netscape/server4/https-csb.mathsci.usna.edu/logs/]> wc -l access
184312 access
It's 184,312 lines long! Here's a line from this file:
valiant.mathsci.usna.edu - - [11/Jul/2000:09:13:10 -0400] "GET /~wcbrown/blank.html HTTP/1.0" 200 514
What does this tell you? Well, first of all, the hit came from the machine
valiant.mathsci.usna.edu
... that's me. The date for this access was
11/Jul/2000 (the time's in there too), and the
page that was accessed was ~wcbrown/blank.html,
which is part of my homepage. So on 11 July, I accessed my
homepage from the machine valiant.
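The reading of the log line above can be sketched with awk, which splits a line on whitespace. This is just one way to pull the pieces apart, not something the server provides. The field numbers below are counted from this one sample line, so treat them as an assumption about the log layout rather than a guarantee.

```shell
# The sample line from the access log, stored in a variable.
line='valiant.mathsci.usna.edu - - [11/Jul/2000:09:13:10 -0400] "GET /~wcbrown/blank.html HTTP/1.0" 200 514'

# Field 1 is the machine the hit came from.
host=$(printf '%s\n' "$line" | awk '{ print $1 }')

# Field 4 is the date and time, with a leading "[" that we strip off.
when=$(printf '%s\n' "$line" | awk '{ sub(/^\[/, "", $4); print $4 }')

# Field 7 is the page that was requested.
page=$(printf '%s\n' "$line" | awk '{ print $7 }')

echo "host=$host time=$when page=$page"
```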
I might want to know how many accesses there have been to my homepage.
Well, any access with a ~wcbrown in it is to my
homepage, so there have been
csb[125] [/usr/netscape/server4/https-csb.mathsci.usna.edu/logs/]> egrep '~wcbrown' access | wc -l
42038
... 42,038 accesses of pages belonging to me.
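Since the real access log lives only on csb, here is a sketch of the same count on a tiny stand-in file. The first line is the sample from the log; the other two, including the example.com hosts, are made up. It uses grep -E, which is the standard spelling of egrep, and also shows the -c option, which makes grep count the matching lines itself.

```shell
# A tiny stand-in for the access log (lines 2 and 3 are hypothetical).
log=$(mktemp)
cat > "$log" <<'EOF'
valiant.mathsci.usna.edu - - [11/Jul/2000:09:13:10 -0400] "GET /~wcbrown/blank.html HTTP/1.0" 200 514
host1.example.com - - [12/Jul/2000:10:00:00 -0400] "GET /~coleman/index.html HTTP/1.0" 200 1200
host2.example.com - - [12/Jul/2000:10:05:00 -0400] "GET /~wcbrown/courses/ HTTP/1.0" 200 2048
EOF

# Pipe the matching lines into wc -l, as in the text.
piped=$(grep -E '~wcbrown' "$log" | wc -l)

# grep -c counts the matching lines directly, with no pipe needed.
counted=$(grep -E -c '~wcbrown' "$log")

echo "pipe: $((piped))  -c: $((counted))"
rm -f "$log"
```

Both approaches count matching lines, so they should agree; the pipe version has the advantage that you can drop the wc -l and look at the lines themselves.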
The first thing on a line is the address (symbolic if possible, otherwise numeric) of the person accessing the site. That address is followed by the string " - - ". Suppose I wanted to find all accesses to our site from Russia. Well, their addresses end in .ru. Unfortunately, a "." matches "any character" in a regexp, so we have to put in "\." (i.e. use an escape sequence) to match a literal . character:
csb[135] [/usr/netscape/server4/https-csb.mathsci.usna.edu/logs/]> egrep '\.ru - -' access
ppp96-201.dialup.mtu-net.ru - - [18/Jul/2000:20:49:23 -0400] "GET /~coleman/classes/si221/labs/lab2/ HTTP/1.1" 404 319
uxse118.jinr.ru - - [26/Jul/2000:06:16:15 -0400] "GET /~wcbrown/courses/SI420/mylisp/Tutorial.html HTTP/1.0" 200 586
uxse118.jinr.ru - - [26/Jul/2000:06:16:19 -0400] "GET /~wcbrown/courses/SI420/mylisp/Lisp1.html HTTP/1.0" 200 4836
uxse118.jinr.ru - - [26/Jul/2000:06:17:27 -0400] "GET /~wcbrown/courses/SI420/mylisp/Lisp2.html HTTP/1.0" 200 3521
uxse118.jinr.ru - - [26/Jul/2000:06:17:51 -0400] "GET /~wcbrown/courses/SI420/mylisp/Lisp3.html HTTP/1.0" 200 6789
uxse118.jinr.ru - - [26/Jul/2000:06:18:33 -0400] "GET /~wcbrown/courses/SI420/mylisp/Lisp4.html HTTP/1.0" 200 6999
mcc2-pool-239.cell.ru - - [18/Aug/2000:13:47:06 -0400] "GET /~needham/courses/si434/fall99/slides/chap3.htm HTTP/1.0" 200 6337
dima.stu.neva.ru - - [28/Aug/2000:12:11:20 -0400] "GET /~wcbrown/courses/SI433/classes/C12/Class.html HTTP/1.1" 200 3138
dima.stu.neva.ru - - [28/Aug/2000:12:11:48 -0400] "GET /~wcbrown/courses/SI433/classes/C12/LCS.cpp HTTP/1.1" 200 2556
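To see why the backslash matters, here is a sketch that runs the pattern with and without it on three lines. The first and third lines come from the log; the middle one, with the host pc12.example.guru, is made up so that its name ends in "ru" without ending in ".ru". As before, grep -E stands in for egrep.

```shell
# Three log-style lines; the second host name is hypothetical.
log=$(mktemp)
cat > "$log" <<'EOF'
uxse118.jinr.ru - - [26/Jul/2000:06:16:15 -0400] "GET /~wcbrown/courses/SI420/mylisp/Tutorial.html HTTP/1.0" 200 586
pc12.example.guru - - [26/Jul/2000:07:00:00 -0400] "GET /index.html HTTP/1.0" 200 100
valiant.mathsci.usna.edu - - [11/Jul/2000:09:13:10 -0400] "GET /~wcbrown/blank.html HTTP/1.0" 200 514
EOF

# Unescaped: "." matches any character, so "uru - -" in the .guru line matches too.
loose=$(grep -E -c '.ru - -' "$log")

# Escaped: "\." matches only a literal dot, so only the .ru host matches.
strict=$(grep -E -c '\.ru - -' "$log")

echo "unescaped: $((loose))  escaped: $((strict))"
rm -f "$log"
```

The single quotes around the pattern keep the shell from touching the backslash before grep sees it.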
M014560.mid4.usna.edu - - [28/Aug/2000:23:34:46 -0400] "GET /~wcbrown/courses/SI472/classes/C02/HW/HW.css HTTP/1.0" 200 892