Using the Shell


Why the shell

  • Programmatically interact with OS (file system)
  • Origins in the 1960s and 1970s
  • “designed by computer scientists for computer scientists”
  • Evolution: from interactive command language to scripting language
  • “quick and dirty” prototyping

Unix Design Philosophy

“Even though the UNIX system introduces a number of innovative programs and techniques, no single program or idea makes it work well. Instead, what makes it effective is the approach to programming, a philosophy of using the computer. Although that philosophy can’t be written down in a single sentence, at its heart is the idea that the power of a system comes more from the relationships among programs than from the programs themselves. Many UNIX programs do quite trivial things in isolation, but, combined with other programs, become general and useful tools.”

The UNIX Programming Environment, Brian Kernighan and Rob Pike

Where am I? (pwd, ls, cd)

pwd
## /Users/rundel/Data/shell_ex
ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
cd bob
pwd
## /Users/rundel/Data/shell_ex/bob

ls -la
## total 24
## drwxr-xr-x  6 rundel  staff   204 Aug 26 22:27 .
## drwxr-xr-x  5 rundel  staff   170 Aug 29 11:45 ..
## -rw-r--r--@ 1 rundel  staff  6148 Aug 31 01:14 .DS_Store
## drwxr-xr-x  4 rundel  staff   136 Aug 14 16:16 data
## drwxr-xr-x  8 rundel  staff   272 Aug 29 11:45 labs
## -rw-r--r--  1 rundel  staff  3944 Aug 26 22:27 notes.txt
cd .
pwd
## /Users/rundel/Data/shell_ex/bob
cd ..
pwd
## /Users/rundel/Data/shell_ex

Creating or Deleting directories (mkdir, rmdir)

ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
mkdir test
ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
## drwxr-xr-x  2 rundel  staff   68 Sep  4 14:33 test

rmdir test
ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
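
Two related behaviors are worth knowing: mkdir -p creates any missing parent directories, and rmdir refuses to remove a directory that is not empty. The sketch below shows both in a scratch directory made with mktemp, so nothing in shell_ex is touched. The names project/data/raw are made up for illustration.

```shell
# Work in a throwaway directory (all names below are hypothetical).
scratch=$(mktemp -d)
cd "$scratch"

# -p creates missing parents: project, project/data and project/data/raw.
mkdir -p project/data/raw

# rmdir only removes empty directories; project still contains data,
# so this attempt fails and we record that.
if rmdir project 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi
echo "rmdir project: $rmdir_result"
## rmdir project: refused

# Remove the directories from the inside out instead.
rmdir project/data/raw project/data project
```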

Copying, moving and deleting (cp, mv, rm)

cp haiku.txt awesome_haiku.txt
ls -l
## total 16
## -rw-r--r--  1 rundel  staff  218 Sep  4 14:33 awesome_haiku.txt
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
rm awesome_haiku.txt
ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt

ls -l
## total 8
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 haiku.txt
mv haiku.txt better_haiku.txt
ls -l
## total 8
## -rw-r--r--  1 rundel  staff  218 Aug 14 16:16 better_haiku.txt
## drwxr-xr-x  6 rundel  staff  204 Aug 26 22:27 bob

Wildcards and the shell

  • * - matches any number of characters in a filename, including none.
  • ? - matches any single character.
  • [ ] - set of characters that may match a single character at that position.
  • - - used within [ ] denotes a range of characters or numbers.

ls bob/labs
## Lab1-PartA.txt
## Lab1-PartB.txt
## Lab1-PartC.txt
## Lab2.txt
## Lab3.txt

ls bob/labs/Lab*.txt
## bob/labs/Lab1-PartA.txt
## bob/labs/Lab1-PartB.txt
## bob/labs/Lab1-PartC.txt
## bob/labs/Lab2.txt
## bob/labs/Lab3.txt
ls bob/labs/Lab?.txt
## bob/labs/Lab2.txt
## bob/labs/Lab3.txt

ls -l bob/labs/*[AC].txt
## -rw-r--r--  1 rundel  staff  7 Aug 26 22:09 bob/labs/Lab1-PartA.txt
## -rw-r--r--  1 rundel  staff  7 Aug 26 22:09 bob/labs/Lab1-PartC.txt
ls -l bob/labs/*[A-C].txt
## -rw-r--r--  1 rundel  staff  7 Aug 26 22:09 bob/labs/Lab1-PartA.txt
## -rw-r--r--  1 rundel  staff  7 Aug 26 22:09 bob/labs/Lab1-PartB.txt
## -rw-r--r--  1 rundel  staff  7 Aug 26 22:09 bob/labs/Lab1-PartC.txt
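
It is the shell, not ls, that expands a wildcard: the command only ever sees the resulting list of names. A minimal sketch using echo in a scratch directory (the files are empty stand-ins with the same names as some of those in bob/labs):

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch Lab1-PartA.txt Lab1-PartB.txt Lab2.txt Lab3.txt

# Unquoted: the shell replaces the pattern with the matching names
# (in sorted order) before echo is started.
expanded=$(echo Lab?.txt)
echo "$expanded"
## Lab2.txt Lab3.txt

# Quoted: no expansion happens, so echo receives the literal pattern.
literal=$(echo "Lab?.txt")
echo "$literal"
## Lab?.txt
```

This is why patterns meant for another program (grep, find) are usually quoted.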

Home directory and ~

~ is a special character that expands to the name of your home directory. If you append another user’s login name to the character, it refers to that user’s home directory.

cd ~
pwd
## /Users/rundel
cd ~guest
pwd
## /Users/Guest
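
A short sketch of when ~ is and is not expanded, assuming HOME is set as it normally is in a login session. The variable names are only for illustration.

```shell
# Unquoted ~ at the start of a word is replaced by your home directory,
# which is normally also stored in the HOME variable.
expanded=$(echo ~)

# Inside quotes, ~ is an ordinary character and is left alone.
quoted=$(echo "~")

echo "$expanded"
echo "$quoted"
```

In scripts, "$HOME" is often the safer choice because it still expands inside double quotes.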

Examining files (cat, more, head, tail)

cat bob/notes.txt
## Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec eget
## interdum nibh. Curabitur ut accumsan eros, nec maximus urna. Aliquam
## turpis dolor, dapibus vel erat quis, pretium dictum ligula. Etiam
## placerat eros sem, nec gravida ante facilisis eu. Curabitur vel elit
## suscipit, viverra risus quis, laoreet enim. Praesent efficitur felis a
## turpis imperdiet, eget ullamcorper massa condimentum. Duis ex lorem,
## ornare ut volutpat at, congue fermentum massa. Donec hendrerit enim
## ultrices dapibus finibus. Vivamus vehicula nisl eget tellus cursus, in
## porttitor lacus commodo. Suspendisse mattis libero a sem blandit, vel
## vehicula diam imperdiet. Duis pellentesque sem mauris, sit amet
## vehicula erat placerat rutrum. Maecenas sollicitudin lectus id
## accumsan porta. Integer ac lacinia massa. Ut ut convallis felis.
## Maecenas eu eros mollis, sagittis nunc sit amet, rutrum eros. Aliquam
## sed lacinia lacus.
## 
## Donec aliquam sodales mauris ut malesuada. Nam at accumsan tortor.
## Integer a lectus ut lacus fringilla viverra. Etiam bibendum dictum
## odio non ullamcorper. Vestibulum pulvinar et est nec viverra. Morbi
## tempus auctor enim id tristique. Pellentesque vulputate pretium leo,
## in mattis tellus laoreet ut. Quisque aliquet a erat quis pulvinar.
## Duis vehicula porttitor rutrum. Integer convallis at leo in tristique.
## In erat arcu, mattis vitae ante nec, ultricies porttitor lorem.
## Interdum et malesuada fames ac ante ipsum primis in faucibus. Sed a
## nisi eget eros egestas interdum. Nunc eu odio fringilla, euismod massa
## a, placerat arcu. Sed finibus tellus nulla, ac sodales elit suscipit
## 
## ...

head -n 7 bob/notes.txt
## Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec eget
## interdum nibh. Curabitur ut accumsan eros, nec maximus urna. Aliquam
## turpis dolor, dapibus vel erat quis, pretium dictum ligula. Etiam
## placerat eros sem, nec gravida ante facilisis eu. Curabitur vel elit
## suscipit, viverra risus quis, laoreet enim. Praesent efficitur felis a
## turpis imperdiet, eget ullamcorper massa condimentum. Duis ex lorem,
## ornare ut volutpat at, congue fermentum massa. Donec hendrerit enim

tail -n 15 bob/notes.txt
## semper libero. Nam vel magna convallis, lacinia odio non, tempus nunc.
## Proin fermentum justo condimentum lectus dignissim dictum. Nulla ac
## magna nibh.
## 
## Praesent id enim eget ex rutrum auctor id at quam. Mauris ultricies
## velit eu turpis condimentum ultrices. Mauris lacinia scelerisque
## efficitur. Nunc eu sem eget nulla luctus mattis. Integer ultrices dui
## eget tellus fermentum dapibus. Nullam interdum ante sit amet
## condimentum tincidunt. Nulla ac ullamcorper turpis. Etiam hendrerit
## lectus mi, vitae vehicula felis lobortis blandit. Morbi maximus
## efficitur libero, ac efficitur mi sollicitudin in. Donec dictum et
## arcu consequat consequat. Phasellus pharetra cursus ligula, vitae
## faucibus enim dictum quis. Proin eu eros cursus, tincidunt metus sed,
## pretium velit. Etiam a laoreet urna. Integer sed tristique odio, sed
## venenatis leo. Aliquam erat volutpat.

Pipes and Redirection

cat bob/notes.txt | wc 
##       65     580    3944
cat bob/notes.txt | grep "[Ll]orem"
## Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec eget
## turpis imperdiet, eget ullamcorper massa condimentum. Duis ex lorem,
## In erat arcu, mattis vitae ante nec, ultricies porttitor lorem.
## mauris. In orci lectus, placerat in orci a, pretium facilisis lorem.

cd bob/labs/
cat Lab1-PartA.txt >  Lab1_1.txt
cat Lab1-PartB.txt >> Lab1_1.txt
cat Lab1-PartC.txt >> Lab1_1.txt
cat Lab1_1.txt
## Part A
## Part B
## Part C
cat Lab1-PartA.txt Lab1-PartB.txt Lab1-PartC.txt > Lab1_2.txt
cat Lab1_2.txt
## Part A
## Part B
## Part C

cat Lab1-Part[A-C].txt > Lab1_3.txt
cat Lab1_3.txt
## Part A
## Part B
## Part C
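
The difference between > and >> matters: > starts the file over, while >> adds to the end. A small sketch in a scratch directory (out.txt is a made-up file name), which also shows < for sending a file to a command's standard input:

```shell
scratch=$(mktemp -d)
log="$scratch/out.txt"     # hypothetical file name

echo "first"  >  "$log"    # > creates the file, or empties it if it exists
echo "second" >> "$log"    # >> appends to the end
echo "third"  >  "$log"    # > again: the two earlier lines are gone

cat "$log"
## third

# < feeds the file to wc on standard input, so wc prints no file name.
# tr strips the padding that some versions of wc add around the number.
line_count=$(wc -l < "$log" | tr -d ' ')
echo "$line_count"
## 1
```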

Ouroboros cleverness

Want to see the 50th line of the file and nothing else?

head -n 50 bob/notes.txt | tail -n 1
## quis, consectetur risus. Praesent sit amet pellentesque ipsum, ac

What about just the penultimate line?

tail -n 2 bob/notes.txt | head -n 1
## pretium velit. Etiam a laoreet urna. Integer sed tristique odio, sed
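
The same head-into-tail trick can be wrapped in a small function. This is a sketch, not a standard command: nth_line and sample.txt are names invented here.

```shell
# nth_line N FILE: print only line N of FILE.
# Caveat: if FILE has fewer than N lines, this prints its last line instead.
nth_line() {
    head -n "$1" "$2" | tail -n 1
}

scratch=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\n' > "$scratch/sample.txt"

nth_line 3 "$scratch/sample.txt"
## three
```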

Finding stuff (find)

find . -name "*.txt"
## ./bob/data/first.txt
## ./bob/data/second.txt
## ./bob/labs/Lab1-PartA.txt
## ./bob/labs/Lab1-PartB.txt
## ./bob/labs/Lab1-PartC.txt
## ./bob/labs/Lab2.txt
## ./bob/labs/Lab3.txt
## ./bob/notes.txt
## ./haiku.txt
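
find can combine tests. As a sketch, adding -type f keeps only regular files, which matters if a directory happens to have a matching name. The layout below is made up for the demonstration; note that "*.txt" is quoted so that find, not the shell, interprets the pattern.

```shell
scratch=$(mktemp -d)
# A directory whose name ends in .txt, one real .txt file, one other file.
mkdir -p "$scratch/data" "$scratch/labs.txt"
touch "$scratch/data/first.txt" "$scratch/notes.md"

# -type f restricts matches to regular files, so the directory labs.txt
# is skipped even though its name matches the pattern.
found=$(find "$scratch" -type f -name "*.txt")
echo "$found"
```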

NASA Web logs

Let’s download the data:

curl -O ftp://ita.ee.lbl.gov/traces/NASA_access_log_Jul95.gz
curl -O ftp://ita.ee.lbl.gov/traces/NASA_access_log_Aug95.gz

These files are compressed, so we need to decompress them with gunzip:

gunzip NASA_access_log_Jul95.gz
gunzip NASA_access_log_Aug95.gz

We can also look at how many entries there are:

cat NASA_access_log_Jul95 | wc -l
##  1891714
cat NASA_access_log_Aug95 | wc -l
##  1569898

head NASA_access_log_Jul95
## 199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
## unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
## 199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0" 200 4085
## burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0
## 199.120.110.21 - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0" 200 4179
## burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 304 0
## burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /shuttle/countdown/video/livevideo.gif HTTP/1.0" 200 0
## 205.212.115.106 - - [01/Jul/1995:00:00:12 -0400] "GET /shuttle/countdown/countdown.html HTTP/1.0" 200 3985
## d104.aa.net - - [01/Jul/1995:00:00:13 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
## 129.94.144.152 - - [01/Jul/1995:00:00:13 -0400] "GET / HTTP/1.0" 200 7074
head NASA_access_log_Aug95
## in24.inetnebr.com - - [01/Aug/1995:00:00:01 -0400] "GET /shuttle/missions/sts-68/news/sts-68-mcc-05.txt HTTP/1.0" 200 1839
## uplherc.upl.com - - [01/Aug/1995:00:00:07 -0400] "GET / HTTP/1.0" 304 0
## uplherc.upl.com - - [01/Aug/1995:00:00:08 -0400] "GET /images/ksclogo-medium.gif HTTP/1.0" 304 0
## uplherc.upl.com - - [01/Aug/1995:00:00:08 -0400] "GET /images/MOSAIC-logosmall.gif HTTP/1.0" 304 0
## uplherc.upl.com - - [01/Aug/1995:00:00:08 -0400] "GET /images/USA-logosmall.gif HTTP/1.0" 304 0
## ix-esc-ca2-07.ix.netcom.com - - [01/Aug/1995:00:00:09 -0400] "GET /images/launch-logo.gif HTTP/1.0" 200 1713
## uplherc.upl.com - - [01/Aug/1995:00:00:10 -0400] "GET /images/WORLD-logosmall.gif HTTP/1.0" 304 0
## slppp6.intermind.net - - [01/Aug/1995:00:00:10 -0400] "GET /history/skylab/skylab.html HTTP/1.0" 200 1687
## piweba4y.prodigy.com - - [01/Aug/1995:00:00:10 -0400] "GET /images/launchmedium.gif HTTP/1.0" 200 11853
## slppp6.intermind.net - - [01/Aug/1995:00:00:11 -0400] "GET /history/skylab/skylab-small.gif HTTP/1.0" 200 9202

Working with (delimited) textual data

The first column contains the hostname or IP address of the client that made the HTTP request. Let’s see what we can do with the July logs.

cut -d" " -f1 NASA_access_log_Jul95 | head
## 199.72.81.55
## unicomp6.unicomp.net
## 199.120.110.21
## burger.letters.com
## 199.120.110.21
## burger.letters.com
## burger.letters.com
## 205.212.115.106
## d104.aa.net
## 129.94.144.152

cut -d" " -f1 NASA_access_log_Jul95 | sort | uniq | head
## cut: NASA_access_log_Jul95: Illegal byte sequence
## 128.102.86.254
## 128.111.114.75
## 128.117.71.26
## 128.120.12.14
## 128.126.50.31
## 128.138.177.51
## 128.147.44.103
## 128.148.15.20
## 128.148.30.57
## 128.158.21.103
cut -d" " -f1 NASA_access_log_Jul95 | sort | uniq | wc
## cut: NASA_access_log_Jul95: Illegal byte sequence
##     9021    9021  184451

cut -d" " -f1 NASA_access_log_Jul95 | sort | uniq -c | head -n 7
## cut: NASA_access_log_Jul95: Illegal byte sequence
##    3 128.102.86.254
##    8 128.111.114.75
##    4 128.117.71.26
##   17 128.120.12.14
##   22 128.126.50.31
##    2 128.138.177.51
##    5 128.147.44.103
cut -d" " -f1 NASA_access_log_Jul95 | sort | uniq -c | sort -rn | head -n 7
## cut: NASA_access_log_Jul95: Illegal byte sequence
## 1400 piweba3y.prodigy.com
## 1099 alyssa.prodigy.com
##  861 piweba1y.prodigy.com
##  829 disarray.demon.co.uk
##  669 www-b6.proxy.aol.com
##  620 piweba4y.prodigy.com
##  538 www-d4.proxy.aol.com
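
The "Illegal byte sequence" message appears when cut meets bytes that are not valid text in the current locale (typically UTF-8). A common workaround, offered here as an assumption about this setup rather than something shown in the runs above, is to set LC_ALL=C for that one command so the input is treated as plain bytes. The sample file and host names below are invented.

```shell
scratch=$(mktemp -d)
# Two sample lines; the second contains the byte 0xE9 (octal 351),
# which is not valid UTF-8 on its own.
printf 'host-a.example - -\nhost-b\351.example - -\n' > "$scratch/sample_log"

# LC_ALL=C applies only to this cut command, not to the rest of the session.
hosts=$(LC_ALL=C cut -d" " -f1 "$scratch/sample_log" | wc -l | tr -d ' ')
echo "$hosts"
## 2
```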

HTTP Status Codes

  • 1xx Informational
  • 2xx Success
  • 3xx Redirection
  • 4xx Client Error
  • 5xx Server Error

Examples:

  • 200 OK - Standard response for successful HTTP requests.
  • 404 Not Found - The requested resource could not be found but may be available later.
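
In the log format shown earlier, the status code is the ninth space-separated field when the request is written as exactly three parts (method, path, protocol). Under that assumption, a sketch of pulling the code out of two sample lines copied from the July log:

```shell
scratch=$(mktemp -d)
# The quoted 'EOF' keeps the shell from touching the quotes and brackets.
cat > "$scratch/sample_log" <<'EOF'
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0
EOF

# Field 9 is the status code for lines laid out like the two above.
codes=$(cut -d" " -f9 "$scratch/sample_log")
echo "$codes"
## 200
## 304
```

Lines whose request contains extra or missing spaces would put something else in field 9.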

Some Exercises

Enough talking, now it is time for you to try some of this stuff. If you haven’t yet, download the data onto your computer, then see if you can do the following (work with your neighbors):

  • Examine the status codes. Which are the most common?
  • In looking at the status codes, did you notice anything out of the ordinary?
  • What can you tell me about missing web pages (404s)? Is it always the same pages that are missing, or is there no obvious pattern?
  • What was the most popular page during each of the two month-long logging periods?

Acknowledgments

Above materials are derived in part from the following sources: