This section introduces other useful UNIX system
utilities and covers:
- Connecting to remote machines.
- Network routing utilities.
- Remote file transfer.
- Other Internet-related utilities.
- Facilities for user information and communication.
- Printer control.
- Email utilities.
- Advanced text file processing with sed and awk.
- Target-directed compilation with make.
- Version control with CVS.
- C++ compilation facilities.
- Manual pages.
- ssh machinename
ssh is a secure alternative for remote login
and also for executing commands on a remote machine. It is intended to
replace rlogin and rsh, and to provide secure encrypted
communications between two untrusted hosts over an insecure network. X11
connections (i.e. graphics) can also be forwarded over the secure channel
(another advantage over telnet, rlogin and rsh).
ssh is not a standard system utility, although it is a de facto standard.
To login to a remote machine with your current user name:
$ ssh hostname
To login to a remote machine with a different user name:
$ ssh username@hostname
To execute a single command on a remote machine (the connection closes when the command finishes):
$ ssh username@hostname command
ssh clients are also available for Windows machines
(e.g. there is a good ssh client called PuTTY).
- ping machinename
The ping utility is useful for checking round-trip
response time between machines. e.g.
$ ping www.google.com
measures the response time delay between the current machine
and the web server at Google.
ping is also useful to check whether a machine is still "alive" in
some sense.
- traceroute machinename
traceroute shows the full path taken to reach
a remote machine, including the delay to each machine along the route.
This is particularly useful in tracking down the location of network problems.
- ftp machinename (file transfer protocol)
ftp is an insecure way of transferring files
between computers. When you connect to a machine via ftp, you will be asked
for your username and password. If you have an account on the machine,
you can use it; otherwise you can often use the user "ftp" or "anonymous".
Once logged in via FTP, you can list files (dir), receive files
(get and mget) and send files (put and mput).
Unusually for UNIX, typing help will show you a list of available commands.
Particularly useful are binary (transfer files preserving all
8 bits) and prompt n (do not confirm each file on multiple file
transfers). Type quit to leave ftp and return to the
shell prompt.
- scp sourcefiles destination (secure copy)
scp is an ssh tool which provides a secure way of transferring files between
computers. It works just like the UNIX cp command except that
the arguments can specify a user and remote machine as well as files. For example:
$ scp jon@srv.ucla.edu:~/hello.txt .
will (subject to correct authentication) copy the file hello.txt
from the user account jon on the remote machine
srv.ucla.edu into the current directory (.) on the local machine.
- wget URL
wget provides a way to retrieve files from the
web (using the HTTP protocol). For example:
$ wget http://www.example.com/index.html
saves a copy of the page in the current directory. wget is non-interactive,
which means it can run in the background while the user is not logged
in (unlike most web browsers). The content retrieved by wget
is stored as raw HTML text (which can be viewed later using a web browser).
- finger, who
finger and who show the list of users logged into a machine,
the terminal they are using, and the time at which they logged in.
$ who
jon pts/2 Dec 5 19:41
$
- write, talk
write lets users logged into the same machine send
messages to each other. You should specify the user and (optionally)
the terminal they are on:
$ write will pts/2
hello jon
Lines are only transmitted when you press return.
To return to the shell prompt, press ctrl-d (the UNIX end of file marker).
- sed (stream editor)
sed allows you to perform basic text transformations
on an input stream (i.e. a file or input from a pipeline). For example,
you can delete lines containing a particular string of text, or you can substitute
one pattern for another wherever it occurs in a file. Although sed is
a mini-programming language all on its own and can execute entire scripts,
its full language is obscure and probably best forgotten (being based on
the old and esoteric UNIX line editor ed). sed is probably
at its most useful when used directly from the command line with simple
parameters:
$ sed "s/pattern1/pattern2/" inputfile > outputfile
(substitutes pattern2 for the first occurrence of pattern1 on each line)
$ sed "s/pattern1/pattern2/g" inputfile > outputfile
(substitutes pattern2 for every occurrence of pattern1 on each line)
$ sed "/pattern1/d" inputfile > outputfile
(deletes all lines containing pattern1)
$ sed "y/string1/string2/" inputfile > outputfile
(replaces each character in string1 with the corresponding character in string2)
Note that sed writes its results to standard output, so the output is redirected
into a file with >.
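The effect of these commands is easiest to see on a short pipeline. A quick sketch (assuming a POSIX shell and sed):

```shell
# s command: substitute the first match on each line...
echo "hello world, wide world" | sed "s/world/UNIX/"
# prints: hello UNIX, wide world

# ...or every match, with the g flag:
echo "hello world, wide world" | sed "s/world/UNIX/g"
# prints: hello UNIX, wide UNIX

# y command: transliterate characters (a becomes A, b becomes B):
echo "abc" | sed "y/ab/AB/"
# prints: ABc
```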
- awk (Aho, Weinberger and Kernighan)
awk is useful for manipulating files that contain
columns of data on a line by line basis. Like sed, you can either
pass awk statements directly on the command line, or you can write
a script file and let awk read the commands from the script.
Say we have a file of cricket scores called cricket.dat
containing columns for player number, name, runs and the way in which they
were dismissed:
1 atherton 0 bowled
2 hussain 20 caught
3 stewart 47 stumped
4 thorpe 33 lbw
5 gough 6 run-out
To print out only the name and runs columns (the second and third fields) we can say:
$ awk '{ print $2 " " $3 }' cricket.dat
atherton 0
hussain 20
stewart 47
thorpe 33
gough 6
$
Here $n stands for the nth field
or column of each line in the data file. $0 can be used to denote
the whole line.
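awk also maintains the built-in variable NF, the number of fields on the current line, so $NF is always the last field. A small sketch (the data file is recreated inline here so the example is self-contained):

```shell
# Recreate (part of) the cricket data used above:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n' > cricket.dat

# Print the number of fields and the last field of each line:
awk '{ print NF, $NF }' cricket.dat
# prints: 4 bowled
#         4 caught
```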
We can do much more with awk. For example, we
can write a script cricket.awk to calculate the team's batting
average and to check if Mike Atherton got another duck:
$ cat > cricket.awk
BEGIN { players = 0; runs = 0 }
{ players++; runs +=$3 }
/atherton/ { if (runs==0) print "atherton duck!"
}
END { print "the batting average is " runs/players
}
(ctrl-d)
$
$ awk -f cricket.awk cricket.dat
atherton duck!
the batting average is 21.2
$
The BEGIN clause is executed once before any input is read,
the main clause once for every line of input, the /atherton/
clause only for lines containing the word atherton, and the
END clause once after all the input has been read.
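A clause can also be guarded by an arbitrary condition on the fields rather than a regular expression. For example, to print the name of every player who scored more than 20 runs (a sketch; the data file is recreated inline so the example is self-contained):

```shell
# Recreate the cricket data:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n3 stewart 47 stumped\n4 thorpe 33 lbw\n5 gough 6 run-out\n' > cricket.dat

# The { print $2 } action runs only on lines where field 3 exceeds 20:
awk '$3 > 20 { print $2 }' cricket.dat
# prints: stewart
#         thorpe
```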
awk can do a lot more. See the manual pages for
details (type man awk).
- make
make is a utility which can determine automatically
which pieces of a large program need to be recompiled, and issue the commands
to recompile them. To use make, you need to create a file called Makefile
or makefile that describes the relationships among the files
in your program, and states the commands for updating each file.
Here is an example of a simple makefile:
scores.out: cricket.awk cricket.dat
[TAB]awk -f cricket.awk cricket.dat > scores.out
Here [TAB] indicates the TAB key. The interpretation of
this makefile is as follows: the target scores.out depends on
cricket.awk and cricket.dat; if either of those files is newer than
scores.out (or scores.out does not yet exist), rebuild it by running
the command on the indented line.
make is invoked simply by typing:
$ make
awk -f cricket.awk cricket.dat > scores.out
$
Since scores.out did not exist, make
executed the commands to create it. If we now invoke make again, nothing
happens:
$ make
make: `scores.out' is up to date.
$
But if we modify cricket.dat and then run make
again, scores.out will be updated:
$ touch cricket.dat (touch simulates file modification)
$ make
awk -f cricket.awk cricket.dat > scores.out
$
make is mostly used when compiling large C, C++
or Java programs, but can (as we have seen) be used to automatically and
intelligently produce a target file of any kind.
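The dependency checking described above can be reproduced end to end from the shell. A sketch (printf's \t supplies the mandatory TAB before the command, and sort stands in for the awk script so the example is self-contained):

```shell
# Write a minimal two-line makefile:
printf 'scores.out: cricket.dat\n\tsort -n -k 3 cricket.dat > scores.out\n' > Makefile

# Create the data file the target depends on:
printf '1 atherton 0 bowled\n2 hussain 20 caught\n' > cricket.dat

make    # scores.out is missing, so the sort command runs
make    # nothing has changed, so make reports scores.out is up to date
touch cricket.dat
make    # cricket.dat is now newer than scores.out, so the command runs again
```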
- cvs (Concurrent Versions System)
CVS is a source code control system often used
on large programming projects to control the concurrent editing of source
files by multiple authors. It keeps old versions of files and maintains
a log of when and why changes occurred, and who made them.
- cvs add files
You can use this to add new files into a repository that you have checked out.
It does not actually affect the repository until a "cvs commit" is
performed.
- cvs commit files
This command publishes your changes to other developers by updating
the source code in the central repository.
- cc, gcc, CC, g++
UNIX installations usually come with a C and/or C++ compiler.
The C compiler is usually called cc or gcc, and the C++
compiler is usually called CC or g++. Most large C or
C++ programs will come with a makefile and will support the
configure utility, so that compiling and installing a package is often as simple
as:
$ ./configure
$ make
$ make install
However, there is nothing to prevent you from writing
and compiling a simple C program yourself:
$ cat > hello.c
#include <stdio.h>
int main() {
printf("hello world!\n");
return 0;
}
(ctrl-d)
$ cc hello.c -o hello
$ ./hello
hello world!
$
Here the C compiler (cc) takes as input the C
source file hello.c and produces as output an executable program
called hello. The program hello may then be executed
(the ./ tells the shell to look in the current directory to find
the hello program).
5.13 Manual Pages
- man
More information on most UNIX commands is
available via the online manual pages, which are accessible through the
man command. The online documentation is in fact divided into sections. Traditionally,
they are:
- Section 1: user-level commands
- Section 2: system calls
- Section 3: library functions
- Section 4: devices and device drivers
- Section 5: file formats
- Section 6: games
- Section 7: various miscellaneous stuff - macro packages etc.
- Section 8: system maintenance and operation commands
Sometimes man gives you a manual page from the
wrong section. For example, say you were writing a program and you needed
to use the rmdir system call. man rmdir gives you the
manual page for the user-level command rmdir. To force man
to look in Section 2 of the manual instead, type man 2 rmdir (or man
-s2 rmdir on some systems).
man can also find manual pages which mention
a particular topic. For example, man -k postscript should produce
a list of utilities that can produce and manipulate PostScript files.
- info
info is an interactive, somewhat more friendly
and helpful alternative to man. It may not be installed on all
systems, however.