Last time, we learned some basics of the UNIX system and of how to use the UNIX shell. We also covered a few basic bash commands that can be used for browsing and making changes to the files and directories on the file system. Specifically, we learned about:
pwd
Displays the “current working directory” of the shell session
ls
Lists all of the files and folders (directories) in the current working directory
cd
Allows us to change to a new directory in the shell session
mkdir
Creates one or more new directories
touch
Creates one or more new files
rm
Removes one or more files
rmdir
Removes one or more directories
In today’s lecture, we will add several other shell commands to our arsenal. Here we go…
Back to the echo command for a bit. As mentioned above, the echo command prints out whatever text comes after the command. A very simple example is echo Hello World:
$ echo Hello World
Hello World
echo reads in all of the text that follows it, which in this case is Hello World. All that it does is print this out to the command line. Pretty nifty! This command has some more complex functionality, but for now, this is all we need to know about it.
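A minimal sketch of how echo sees its text, assuming a standard bash shell. The shell splits unquoted text into separate words before echo receives them, so extra spaces between words are dropped, while quoted text is handed over as a single piece. Quoting goes slightly beyond what the lecture covers and is shown only for illustration.

```shell
# Unquoted: the shell hands echo two words, and echo joins them with one space.
echo Hello      World      # prints: Hello World

# Quoted: everything between the quotes is one argument, so the spacing is kept.
echo "Hello      World"    # prints: Hello      World
```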
No, we’re not talking about animals here. cat is a command that prints out the contents of files. Let’s say we have a file in the current working directory called names.txt:
$ pwd
/Users/bddicken/Desktop/test
$ ls
names.txt image.jpg
We can discover the contents of the names.txt file by running cat names.txt:
$ cat names.txt
Anne Berkley
Donna King
Bill Jimmer
Cam Smith
This file has four names in it, one per line. Cool! Above we saw that the current working directory also contained an image file called image.jpg. Can cat display an image? Let’s find out. If we try running cat image.jpg, something like this will be displayed by bash:
$ cat image.jpg
????JFIFHH??ExifMM* z ??(1
?2??i?SAMSUNGSCH-I535HHI535VRUCML12014:03:21 16:28:00???????"?'d?0220?֑??
???
?0100????????q2014:03:21 16:28:002014:03:21 16:28:00)d)d
?
http://ns.adobe.com/xap/1.0/<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?> <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="XMP Core 5.4.0"> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> <rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/" xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/" xmp:ModifyDate="2014-03-21T16:28:00" xmp:CreatorTool="I535VRUCML1" xmp:CreateDate="2014-03-21T16:28:00" photoshop:DateCreated="2014-03-21T16:28:00"/> </rdf:RDF> </x:xmpmeta>
<?xpacket end="w"?>??xPhotoshop 3.08BIM?ZG?1628002014032120140321<1628008BIM%?S??n??j?Q?7????????
???}!1AQa"q2??#B??R??$3br?
%&'()*456789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz???????????????????????????????????????????????????????????????????????????
???w!1AQaq"2B???? #3R?br?
$4?%?&'()*56789:CDEFGHIJSTUVWXYZcdefghijstuvwxyz??????????????????????????????????????????????????????????????????????????C
?? ??
Ew! What is this? Well, it makes sense that cat would have trouble displaying an image. cat prints the contents of the file to the text output of the shell. A .jpg file is not a text file, so cat prints its raw bytes as if they were characters, and the image cannot be displayed in a reasonable way. Thus, cat should only be used for printing out files whose contents are text. You will have the opportunity to use cat in the upcoming homework assignments.
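cat can also print more than one file at a time (its name is short for “concatenate”). The sketch below assumes two small text files, first.txt and second.txt, which are hypothetical names created here only for the example. The -n option is widely available but may differ between systems.

```shell
# Setup (hypothetical files): '>' saves the output of printf into a file.
printf 'Anne Berkley\nDonna King\n' > first.txt
printf 'Bill Jimmer\nCam Smith\n' > second.txt

# With several file names, cat prints the files one after another.
cat first.txt second.txt

# -n numbers each output line, starting at 1.
cat -n first.txt
```

The first cat command prints all four names in the order the files were listed.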
As the name suggests, the sort command can be used to sort text content. We can re-use the names.txt file from the cat example above. As we already saw, the names in names.txt are not sorted alphabetically. What should we do if we want to sort them?
$ sort names.txt
Anne Berkley
Bill Jimmer
Cam Smith
Donna King
Look at that, the names get printed similarly to how they were printed with cat, but now they are in alphabetical order. One important thing to note is that running sort names.txt does not change the contents of the actual file. It just reads the contents of the file and prints them in sorted order. After running this sort command, we can again cat the file to see that the order has not changed:
$ cat names.txt
Anne Berkley
Donna King
Bill Jimmer
Cam Smith
sort comes in very handy when doing data processing on large text files. It is often very useful to sort the contents of a file either alphabetically or numerically.
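The difference between alphabetical and numerical sorting can be sketched as follows. The file numbers.txt is a hypothetical example created just for this illustration, and -n and -r are standard sort options that the lecture has not otherwise covered.

```shell
# Setup (hypothetical file): '>' saves the output of printf into a file.
printf '10\n9\n100\n' > numbers.txt

# By default, sort compares lines as text, character by character.
sort numbers.txt         # prints 10, 100, 9 (one per line)

# -n compares lines as numbers instead.
sort -n numbers.txt      # prints 9, 10, 100 (one per line)

# -r reverses the order, so the largest number comes first.
sort -n -r numbers.txt   # prints 100, 10, 9 (one per line)
```

As text, “100” comes before “9” because the character 1 sorts before the character 9, which is why -n matters for numeric data.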
grep can be used to search for text in text files. grep is a very powerful tool with many features, but in this lecture we are just going to cover some basics. Again, we shall use the example file names.txt from above.
Perhaps we want to check if someone with the name “Bill” exists in our file. We would run the following:
$ grep Bill names.txt
Bill Jimmer
We found a Bill! Now, what if we want to search for all names that have an “nn” in them? The command is very similar:
$ grep nn names.txt
Anne Berkley
Donna King
We found two!
Now, what if we want to search for all names that have an “a” in them?
This time we want to ignore case, so that we match both the capitalized first letter and the lowercase letters in the rest of the name.
To ignore case, we can add the -i option to the command:
$ grep -i A names.txt
Anne Berkley
Donna King
Cam Smith
Three names!
Another capability grep has is to match only whole words, rather than matching the text anywhere within a line.
This can be achieved with the -w option.
$ grep -w anne names.txt
$
The file contains “Anne” but not the lowercase word “anne”, and the search is case-sensitive, so nothing is returned.
As with most options, the -w option can be used in conjunction with other options, such as -i.
If we add -i to the previous command, we will get a match, because we will be doing an exact word search and case will be ignored.
$ grep -w -i anne names.txt
Anne Berkley
$
And of course, it works even without -i as long as the correct capitalization is used:
$ grep -w Bill names.txt
Bill Jimmer
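A few more grep options follow the same pattern. The sketch below is not part of the lecture material. It recreates names.txt and uses -n, -v, and -c, which are standard grep options.

```shell
# Setup: recreate the example file ('>' saves the output of printf into a file).
printf 'Anne Berkley\nDonna King\nBill Jimmer\nCam Smith\n' > names.txt

# -n prefixes each matching line with its line number in the file.
grep -n nn names.txt     # prints 1:Anne Berkley and 2:Donna King

# -v inverts the search: print the lines that do NOT match.
grep -v nn names.txt     # prints Bill Jimmer and Cam Smith

# -c prints only the number of matching lines.
grep -c -i a names.txt   # prints 3
```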
These two commands are closely related, so I will introduce them together. head and tail are used for printing the beginning (head) and end (tail) lines of a text file. This is very useful when you have large file(s) and don’t want to look at the whole file at once. For example, you might have a text file with student grades that is sorted by grade. If you want to see the students with the best scores, you could use head to find these.
Say we have a file named grades.txt with the following text contents:
99 Jimmy Smith
97 Sally Talbot
90 Donna Sloan
89 John Cooper
87 Jared Ganzales
87 Mary West
83 Sherlock Holmes
77 Bob Thorton
76 David Abraham
62 Fred Francis
The first column is the grade, the second is the first name, and the last is the last name. To look at only the top three grades, we can run:
$ head -n 3 grades.txt
99 Jimmy Smith
97 Sally Talbot
90 Donna Sloan
The -n 3 option tells head how many lines to show from the beginning of the file, in this case three. Similarly, to see the lowest grades, we can run:
$ tail -n 3 grades.txt
77 Bob Thorton
76 David Abraham
62 Fred Francis
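As a further sketch, the number after -n can be any line count, and leaving -n out falls back to the default of 10 lines. The block recreates grades.txt so that it can be run on its own.

```shell
# Setup: recreate the example file ('>' saves the output of printf into a file).
printf '%s\n' \
  '99 Jimmy Smith' '97 Sally Talbot' '90 Donna Sloan' '89 John Cooper' \
  '87 Jared Ganzales' '87 Mary West' '83 Sherlock Holmes' '77 Bob Thorton' \
  '76 David Abraham' '62 Fred Francis' > grades.txt

# -n 1 keeps a single line: the highest grade...
head -n 1 grades.txt   # prints: 99 Jimmy Smith
# ...or the lowest one.
tail -n 1 grades.txt   # prints: 62 Fred Francis

# Without -n, head prints the first 10 lines. This file has exactly
# 10 lines, so the whole file is shown.
head grades.txt
```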
The cut command is used to cut out specific regions of each line of a file or input stream.
This command is very useful if we don’t want to do a specific operation (such as sorting or searching) on an entire file, but rather on a particular “column” of the file.
When using cut, we typically need to specify two command-line arguments.
We need to specify the character that we want to use as the delimiter to separate columns of the input file.
This is specified with the -d option.
We also need to specify which column we want to extract, once the separator has been specified.
This is specified with the -f option (“f” is for “field”).
In grades.txt, the columns of data are separated with spaces, so to operate on this file we should use -d " ".
We can grab just the first column (the grades) with:
$ cut -d " " -f 1 grades.txt
99
97
90
89
87
87
83
77
76
62
Or the second column (first-name) with:
$ cut -d " " -f 2 grades.txt
Jimmy
Sally
Donna
John
Jared
Mary
Sherlock
Bob
David
Fred
Or the third column (last-name) with:
$ cut -d " " -f 3 grades.txt
Smith
Talbot
Sloan
Cooper
Ganzales
West
Holmes
Thorton
Abraham
Francis
What if the “columns” in grades.txt were separated with a character other than spaces?
For example, commas:
99,Jimmy,Smith
97,Sally,Talbot
90,Donna,Sloan
89,John,Cooper
87,Jared,Ganzales
87,Mary,West
83,Sherlock,Holmes
77,Bob,Thorton
76,David,Abraham
62,Fred,Francis
To extract a column from this version of the file, just replace -d " " with -d ",":
$ cut -d "," -f 2 grades.txt
Jimmy
Sally
Donna
John
Jared
Mary
Sherlock
Bob
David
Fred
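cut can also extract more than one column at a time. The sketch below assumes a shortened, space-separated copy of the grades file named top_grades.txt (a hypothetical name used only here). The -f option accepts a comma-separated list or a range of field numbers.

```shell
# Setup (hypothetical file): '>' saves the output of printf into a file.
printf '%s\n' '99 Jimmy Smith' '97 Sally Talbot' '90 Donna Sloan' > top_grades.txt

# A comma-separated list after -f selects several fields.
# The delimiter is kept between the selected fields.
cut -d " " -f 2,3 top_grades.txt   # prints Jimmy Smith, Sally Talbot, Donna Sloan (one per line)

# A range such as 1-2 selects fields 1 through 2.
cut -d " " -f 1-2 top_grades.txt   # prints 99 Jimmy, 97 Sally, 90 Donna (one per line)
```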
A man page (short for manual page) is a form of software documentation usually found on a Unix or Unix-like operating system. Most common tools and commands in bash have their own man page. To read a manual page for a Unix command, a user can type:
man <command_name>
So, to read the man page for cat, one would simply type man cat, and the following would show up in bash:
CAT(1) BSD General Commands Manual CAT(1)
NAME
cat -- concatenate and print files
SYNOPSIS
cat [-benstuv] [file ...]
DESCRIPTION
The cat utility reads files sequentially, writing them to the standard output. The file oper-
ands are processed in command-line order. If file is a single dash (`-') or absent, cat reads
from the standard input. If file is a UNIX domain socket, cat connects to it and then reads it
until EOF. This complements the UNIX domain binding capability available in inetd(8).
The options are as follows:
-b Number the non-blank output lines, starting at 1.
-e Display non-printing characters (see the -v option), and display a dollar sign (`$') at
the end of each line.
-n Number the output lines, starting at 1.
...
The man page can be browsed with the up and down arrows. Man pages, as well as other tutorials and documentation, can also be found online (just Google for them). When looking for documentation on various bash commands, look to either the man pages or Google.
In a few of the examples in these notes, we have seen the concept of passing “command-line arguments” to bash programs.
Ultimately, command-line argument(s) are token(s) (strings of characters) that come after the command name.
Shell commands have the ability to read the text that comes “after” the command name, and use this text to change the behavior of the program.
Even in the simple example of grep Bill names.txt, both Bill and names.txt are command-line arguments.
grep reads in these strings and interprets them as necessary.
It uses the first token/string it encounters (Bill) to determine what to search for.
It uses the second token/string (names.txt) as the name of the file to search in.
These kinds of arguments are called positional arguments, because the command interprets them based on the position (order) in which they are listed.
grep assumes that the search term comes first, and the file name comes second.
Many shell programs also have non-positional arguments and options.
These are typically specified in the form of command -CHAR specification.
In the head example earlier, we ran the command head -n 3 grades.txt.
-n 3 tells head to only print the first three lines of grades.txt.
The -n names the option being set, and the 3 that follows is its value.
This is a non-positional argument, because the -n 3 could be put in (nearly) any place in the command.
On many systems, both head -n 3 grades.txt and head grades.txt -n 3 would work fine. Some implementations only recognize options that come before the file name, so putting options first is the safest habit.
Some non-positional arguments are boolean (specifying a particular feature to be either enabled or disabled).
The command grep -i A names.txt is an example of this.
-i tells grep to ignore case when doing the search.
Both grep -i A names.txt and grep A names.txt -i are valid.
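Two related details can be sketched with grep, assuming a typical GNU or BSD grep. Options may be listed in either order (and single-letter boolean options can usually be combined behind one dash), and quotes turn text containing a space into a single argument. Neither detail is covered in the lecture itself. The block recreates names.txt so that it can be run on its own.

```shell
# Setup: recreate the example file ('>' saves the output of printf into a file).
printf 'Anne Berkley\nDonna King\nBill Jimmer\nCam Smith\n' > names.txt

# Options can be given in either order...
grep -w -i anne names.txt      # prints: Anne Berkley
grep -i -w anne names.txt      # prints: Anne Berkley
# ...and single-letter options can usually be combined behind one dash.
grep -iw anne names.txt        # prints: Anne Berkley

# Quotes make "Bill Jimmer" ONE argument, so it is used as the search term.
grep "Bill Jimmer" names.txt   # prints: Bill Jimmer
```

Without the quotes in the last command, grep would receive Bill as the search term and would treat Jimmer as the name of a file to search.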
It is sometimes useful to run multiple bash commands with just a single line of bash.
If you want to do this, you can separate the commands with ;.
Let’s run through a few examples.
Perhaps one wants to list all of the files in the Desktop directory and the Documents directory with a single line:
$ echo "--- DESKTOP FILES ---" ; ls /Users/bddicken/Desktop ; echo "--- DOCUMENT FILES ---" ; ls /Users/bddicken/Documents ;
--- DESKTOP FILES ---
ben-small.jpg hagura.zip ubuntu-14.04.5-desktop-amd64.iso
create-dirs-and-files.sh test unetbootin-mac-625.dmg
--- DOCUMENT FILES ---
Untitled.txt peers pw
In this single line of bash, four separate bash commands were executed:
echo "--- DESKTOP FILES ---"
ls /Users/bddicken/Desktop
echo "--- DOCUMENT FILES ---"
ls /Users/bddicken/Documents
These are run in the order in which they appear on the command line, separated by ;.
Another example:
$ echo "--- The first two lines of grades.txt ---" ; head -n 2 grades.txt ; echo "--- The last two lines of grades.txt ---" ; tail -n 2 grades.txt
--- The first two lines of grades.txt ---
99,Jimmy,Smith
97,Sally,Talbot
--- The last two lines of grades.txt ---
76,David,Abraham
62,Fred,Francis
In a single line of bash, we ran the following four commands in the order provided:
echo "--- The first two lines of grades.txt ---"
head -n 2 grades.txt
echo "--- The last two lines of grades.txt ---"
tail -n 2 grades.txt
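The same idea works with the commands from the previous lecture. The sketch below assumes that no directory called demo_dir exists yet in the current working directory. Both demo_dir and notes.txt are hypothetical names used only for this example.

```shell
# Make a directory, move into it, create a file, list the contents,
# and move back up, all on one line.
mkdir demo_dir ; cd demo_dir ; touch notes.txt ; ls ; cd ..

# A label followed by a listing, as in the examples above.
echo "--- Files in demo_dir ---" ; ls demo_dir
```

Each command starts only after the one before it has finished. Note that ; does not check whether the previous command succeeded, so the later commands run even if an earlier one reports an error.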
Future homework assignments and exams may require you to know more bash commands than the ones specifically covered in class. In such cases, use Google, man pages, or office hours to get help on learning new commands.
All of the commands we have learned about so far:
pwd
Displays the “current working directory” of the shell session
ls
Lists all of the files and folders (directories) in the current working directory
cd
Allows us to change to a new directory in the shell session
mkdir
Creates one or more new directories
touch
Creates one or more new files
rm
Removes one or more files
rmdir
Removes one or more directories
echo
Prints the text passed to it
cat
Prints the contents of files. Works best for text-based files
sort
Sorts text input
grep
Searches for strings in files and input
head
Prints the beginning of files
tail
Prints the end of files
cut
Extracts regions of each line from a file or input stream