This section introduces other useful UNIX system
utilities and covers:
- Connecting to remote machines.
- Network routing utilities.
- Remote file transfer.
- Other Internet-related utilities.
- Facilities for user information and communication.
- Printer control.
- Email utilities.
- Advanced text file processing with sed and awk.
- Target-directed compilation with make.
- Version control with CVS.
- C++ compilation facilities.
- Manual pages.
- ssh machinename
ssh is a secure alternative for remote login
and also for executing commands on a remote machine. It is intended to
replace rlogin and rsh, and to provide secure encrypted
communications between two untrusted hosts over an insecure network. X11
connections (i.e. graphics) can also be forwarded over the secure channel
(another advantage over telnet, rlogin and rsh).
ssh is not a standard system utility, although it is a de facto standard.
To login to a remote machine with your current user name:
$ ssh hostname
To login to a remote machine with a different user name:
$ ssh username@hostname
To execute a single command on a remote machine (the connection closes when the command finishes):
$ ssh username@hostname command
ssh clients are also available for Windows machines
(e.g. there is a good ssh client called PuTTY).
- ping machinename
The ping utility is useful for checking round-trip
response time between machines. e.g.
$ ping www.google.com
measures the response time delay between the current machine
and the web server at Google.
ping is also useful to check whether a machine is still "alive" in
some sense.
- traceroute machinename
traceroute shows the full path taken to reach
a remote machine, including the delay to each machine along the route.
This is particularly useful in tracking down the location of network problems.
- ftp machinename (file transfer protocol)
ftp is an insecure way of transferring files
between computers. When you connect to a machine via ftp, you will be asked
for your username and password. If you have an account on the machine,
you can use it; otherwise you can often use the user "ftp" or "anonymous".
Once logged in via FTP, you can list files (dir), receive files
(get and mget) and send files (put and mput).
Unusually for UNIX, typing help will show you a list of available commands.
Particularly useful are binary (transfer files preserving all
8 bits) and prompt n (do not confirm each file on multiple file
transfers). Type quit to leave ftp and return to the
shell prompt.
- scp sourcefiles destination (secure copy)
scp is an ssh tool which provides a secure way of transferring files between
computers. It works just like the UNIX cp command except that
the arguments can specify a user and remote machine as well as files. For example:
$ scp jon@srv.ucla.edu:~/hello.txt .
will (subject to correct authentication) copy the file hello.txt
from the user account jon on the remote machine
srv.ucla.edu into the current directory (.) on the local machine.
- wget URL
wget provides a way to retrieve files from the
web (using the HTTP protocol). For example:
$ wget http://www.example.com/index.html
saves a copy of the page in the current directory. wget is non-interactive,
which means it can run in the background while the user is not logged
in (unlike most web browsers). The content retrieved by wget
is stored as raw HTML text (which can be viewed later using a web browser).
- finger, who
finger and who show the list of users logged into a machine,
the terminal they are using, and the time at which they logged in.
$ who
jon pts/2 Dec 5 19:41
$
- write, talk
write lets users logged into the same machine send
messages to each other. You should specify the user and (optionally)
the terminal they are on:
$ write will pts/2
hello jon
Lines are only transmitted when you press return.
To return to the shell prompt, press ctrl-d (the UNIX end of file marker).
- sed (stream editor)
sed allows you to perform basic text transformations
on an input stream (i.e. a file or input from a pipeline). For example,
you can delete lines containing a particular string of text, or you can substitute
one pattern for another wherever it occurs in a file. Although sed is
a mini-programming language all on its own and can execute entire scripts,
its full language is obscure and probably best forgotten (being based on
the old and esoteric UNIX line editor ed). sed is probably
at its most useful when used directly from the command line with simple
parameters:
$ sed "s/pattern1/pattern2/" inputfile > outputfile
(substitutes pattern2 for the first occurrence of pattern1 on each line)
$ sed "s/pattern1/pattern2/g" inputfile > outputfile
(substitutes pattern2 for every occurrence of pattern1 on each line)
$ sed "/pattern1/d" inputfile > outputfile
(deletes all lines containing pattern1)
$ sed "y/string1/string2/" inputfile > outputfile
(replaces each character in string1 with the corresponding character in string2)
Note that sed writes its results to standard output, so the output is redirected
into a file with >.
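The effect of these commands is easiest to see on a short pipeline. A quick sketch (assuming a POSIX shell and sed):

```shell
# s command: substitute the first match on each line...
echo "hello world, wide world" | sed "s/world/UNIX/"
# prints: hello UNIX, wide world

# ...or every match, with the g flag:
echo "hello world, wide world" | sed "s/world/UNIX/g"
# prints: hello UNIX, wide UNIX

# y command: transliterate characters (a becomes A, b becomes B):
echo "abc" | sed "y/ab/AB/"
# prints: ABc
```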
- awk (Aho, Weinberger and Kernighan)
awk is useful for manipulating files that contain
columns of data on a line by line basis. Like sed, you can either
pass awk statements directly on the command line, or you can write
a script file and let awk read the commands from the script.
Say we have a file of cricket scores called cricket.dat
containing columns for player number, name, runs and the way in which they
were dismissed:
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
To print out only the name and runs columns (the second and third fields) we can say:
$ awk '{ print $2 " " $3 }' cricket.dat
atherton 0
hussain 20
stewart 47
thorpe 33
gough 6
$
Here $n stands for the nth field
or column of each line in the data file. $0 can be used to denote
the whole line.
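awk also maintains the built-in variable NF, the number of fields on the current line, so $NF is always the last field. A small sketch (the data file is recreated inline here so the example is self-contained):

```shell
# Recreate (part of) the cricket data used above:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n' > cricket.dat

# Print the number of fields and the last field of each line:
awk '{ print NF, $NF }' cricket.dat
# prints: 4 bowled
#         4 caught
```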
We can do much more with awk. For example, we
can write a script cricket.awk to calculate the team's batting
average and to check if Mike Atherton got another duck:
$ cat > cricket.awk
BEGIN { players = 0; runs = 0 }
{ players++; runs +=$3 }
/atherton/ { if (runs==0) print "atherton duck!"
}
END { print "the batting average is " runs/players
}
(ctrl-d)
$
$ awk -f cricket.awk cricket.dat
atherton duck!
the batting average is 21.2
$
The BEGIN clause is executed once before any input is read,
the main clause once for every line of input, the /atherton/
clause only for lines containing the word atherton, and the
END clause once after all the input has been read.
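A clause can also be guarded by an arbitrary condition on the fields rather than a regular expression. For example, to print the name of every player who scored more than 20 runs (a sketch; the data file is recreated inline so the example is self-contained):

```shell
# Recreate the cricket data:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n3 stewart 47 stumped\n4 thorpe 33 lbw\n5 gough 6 run-out\n' > cricket.dat

# The { print $2 } action runs only on lines where field 3 exceeds 20:
awk '$3 > 20 { print $2 }' cricket.dat
# prints: stewart
#         thorpe
```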
awk can do a lot more. See the manual pages for
details (type man awk).
- make
make is a utility which can determine automatically
which pieces of a large program need to be recompiled, and issue the commands
to recompile them. To use make, you need to create a file called Makefile
or makefile that describes the relationships among the files
in your program, and states the commands for updating each file.
Here is an example of a simple makefile:
scores.out: cricket.awk cricket.dat
[TAB]awk -f cricket.awk cricket.dat > scores.out
Here [TAB] indicates the TAB key. The interpretation of
this makefile is as follows: the target scores.out depends on
cricket.awk and cricket.dat; if either of those files is newer than
scores.out (or scores.out does not yet exist), rebuild it by running
the command on the indented line.
make is invoked simply by typing:
$ make
awk -f cricket.awk cricket.dat > scores.out
$
Since scores.out did not exist, make
executed the commands to create it. If we now invoke make again, nothing
happens:
$ make
make: `scores.out' is up to date.
$
But if we modify cricket.dat and then run make
again, scores.out will be updated:
$ touch cricket.dat (touch simulates file modification)
$ make
awk -f cricket.awk cricket.dat > scores.out
$
make is mostly used when compiling large C, C++
or Java programs, but can (as we have seen) be used to automatically and
intelligently produce a target file of any kind.
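The dependency checking described above can be reproduced end to end from the shell. A sketch (printf's \t supplies the mandatory TAB before the command, and sort stands in for the awk script so the example is self-contained):

```shell
# Write a minimal two-line makefile:
printf 'scores.out: cricket.dat\n\tsort -n -k 3 cricket.dat > scores.out\n' > Makefile

# Create the data file the target depends on:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n' > cricket.dat

make    # scores.out is missing, so the sort command runs
make    # nothing has changed, so make reports scores.out is up to date
touch cricket.dat
make    # cricket.dat is now newer than scores.out, so the command runs again
```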
- cvs (Concurrent Versions System)
CVS is a source code control system often used
on large programming projects to control the concurrent editing of source
files by multiple authors. It keeps old versions of files and maintains
a log of when and why changes occurred, and who made them.
- cvs add files
You can use this to add new files into a repository that you have checked out.
It does not actually affect the repository until a "cvs commit" is
performed.
- cvs commit files
This command publishes your changes to other developers by updating
the source code in the central repository.
- cc, gcc, CC, g++
UNIX installations usually come with a C and/or C++ compiler.
The C compiler is usually called cc or gcc, and the C++
compiler is usually called CC or g++. Most large C or
C++ programs will come with a makefile and will support the
configure utility, so that compiling and installing a package is often as simple
as:
$ ./configure
$ make
$ make install
However, there is nothing to prevent you from writing
and compiling a simple C program yourself:
$ cat > hello.c
#include <stdio.h>
int main() {
printf("hello world!\n");
return 0;
}
(ctrl-d)
$ cc hello.c -o hello
$ ./hello
hello world!
$
Here the C compiler (cc) takes as input the C
source file hello.c and produces as output an executable program
called hello. The program hello may then be executed
(the ./ tells the shell to look in the current directory to find
the hello program).
5.13 Manual Pages
- man
More information on most UNIX commands is
available via the online manual pages, which are accessible through the
man command. The online documentation is in fact divided into sections. Traditionally,
they are:
- Section 1: user-level commands
- Section 2: system calls
- Section 3: library functions
- Section 4: devices and device drivers
- Section 5: file formats
- Section 6: games
- Section 7: various miscellaneous stuff - macro packages etc.
- Section 8: system maintenance and operation commands
Sometimes man gives you a manual page from the
wrong section. For example, say you were writing a program and you needed
to use the rmdir system call. man rmdir gives you the
manual page for the user-level command rmdir. To force man
to look in Section 2 of the manual instead, type man 2 rmdir (or man
-s2 rmdir on some systems).
man can also find manual pages which mention
a particular topic. For example, man -k postscript should produce
a list of utilities that can produce and manipulate PostScript files.
- info
info is an interactive, somewhat more friendly
and helpful alternative to man. It may not be installed on all
systems, however.