What You Should Know For Lecture: Week 4

Monday:

Announcements-

Homework

Any exercises not completed in class will be due Wednesday. There is also an HW 4 assignment due on Wednesday by 11 pm.

Readings Read about job submission, job monitoring, and compute nodes on Hoffman2.

Project If you are worried about your project, come to my office hours. They are after class at the Music Cafe.


What is a loop?

We played around with a ‘for’ loop last time, but what is the purpose of loops? They make our lives easier buy doing repetitive tasks. They can ‘do’ an number of things for a list of items, a range of numbers, while some condition is true, or if something exists. I personally love loops, and I hope that you will too.


The ‘for’ loop revisited. This is essentially what it does:

for thing in list_of_things
do
    operation_using $thing    # Indentation within the loop is not required, but aids legibility
done

The loop we saw last time looked like this:

$ for i in ./*  # i is a script in the list of scripts found in ./*
> do
> bash $i # bash ran each script in the list of scripts.
> done

We can also do this as a one liner….

for i in ./*; do bash $i; done  # the ';' indicate new lines

We used a for loop to do something to a bunch of files. We can use a for loop to do something to a list or items:

for word in alpha beta kappa delta epsilon
do
  touch ${word}-file.txt
done

How would this look as a one liner?

for word in alpha beta kappa delta epsilon; do touch ${word}-file.txt; done

These could also be filenames…. cd into ~/classdata/Homework_data/data-shell/creatures for this…

for x in basilisk.dat unicorn.dat
do
    head -n 2 $x | tail -n 1
done

cd back into the directory that you made for today’s class.

If there are spaces in items of a list you need to use “”. For example:

for spacenames in "outer space" "office space" "personal space"
do
  echo $spacenames
done

We can also iterate over a sequence of numbers:

for NUM in `seq 1 1 10`
do
  echo ${NUM}
done

What happens if we change the values in the seq?

for NUM in `seq 1 2 10`
do
  echo ${NUM}
done
for NUM in `seq 2 1 10`
do
  echo ${NUM}
done
for NUM in `seq 1 1 5`
do
  echo ${NUM}
done

‘seq X Y Z’ * X=starting value * Y=number of count between digits to return * Z= upper limit to return

We can also iterate over a list of things contained within a document.

for line in `cat ~/classdata/Homework_data/data-shell/data/animals.txt`;
do
  echo $line
done

Did it work? Did you use ` or ’ it makes a difference….

Write a for loop that loops over the colors of the rainbow and prints them to a file. The colors are red, orange, yellow, green, blue, indigo, violet.


While and Until loops!

The while loop does something until a condition is true. We will use a similar example to what was in the reading:

counter=3  
while [ $counter -le 5 ]
do
echo $counter
((counter++))
done

Why did we do counter=3

What does [ $counter -le 5 ] mean?

What does ((counter++)) do?

The until loop does something until a condition is met. We will use a similar example to what was in the reading:

counter=2
until [ $counter -gt 5 ]
do
echo $counter
((counter++))
done

How does [ $counter -le 5 ] differ from [ $counter -gt 5 ]?


integer comparison

-eq is equal to

[ “\(a" -eq "\)b” ]

-ne is not equal to

[ “\(a" -ne "\)b” ]

-gt is greater than

[ “\(a" -gt "\)b” ]

-ge is greater than or equal to

[ “\(a" -ge "\)b” ]

-lt is less than

[ “\(a" -lt "\)b” ]

-le is less than or equal to

[ “\(a" -le "\)b” ]

< is less than (within double parentheses)

((“\(a" < "\)b”))

<= is less than or equal to (within double parentheses)

((“\(a" <= "\)b”))

> is greater than (within double parentheses)

((“\(a" > "\)b”))

>= is greater than or equal to (within double parentheses)

((“\(a" >= "\)b”))


Ranges

Similar to until and while loops, we can write a for loop that covers ranges of integers

for numb in {10..1}
do
  echo $numb
done

How is this different?

for numb in {1..10}
do
  echo $numb
done

We can also skip over integers like we did with seq 1 2 10

for numb in {1..10..3}
do
  echo $numb
done

How is this different?

for numb in {10..1..3}
do
  echo $numb
done

String Comparison

= : is equal to

if [ “\(a" = "\)b” ]

if [ “\(a"="\)b” ] this does not work because it is missing whitespace.

== : is equal to

if [ “\(a" == "\)b” ]

if [ “\(a" =. "\)b” ] also does the same thing

!= : is not equal to

if [ “\(a" != "\)b” ]

< : is less than, in ASCII alphabetical order

if [[ “\(a" < "\)b” ]] Notice that you can use double brackets

if [ “\(a" \< "\)b” ]

Note that the “<” needs to be escaped within a single brackets. Escaped is when you add an “" in front of a character. We will see this again with regular expressions.

> : is greater than, in ASCII alphabetical order

if [[ “\(a" > "\)b” ]]

if [ “\(a" \> "\)b” ]

-z : string is null, that is, has zero length

if [ -z $a ]

-n : string is not null.


The If then else statement

This is very much like the if then statement but instead of having a single condition to meet, it has two.

String="not dead yet"  

if [ -n "$String" ] # -n means not null see above
then
  echo "\$String is NOT null."
  echo $String
else
  echo "\$String is null."
fi     # $String is Not null.

Then this one:

String=""  

if [ -n "$String" ]
then
  echo "\$String is NOT null."
  echo $String
else
  echo "\$String is null."
fi     # $String is Not null.

In class exercises:

Make a directory for todays class. cd into that directory.

Copy all of your scripts and text files into ~/classdata/In_class/Week4/Monday

  1. Write a script called cut_stuff.sh . Your script needs to include the following:
  • comments on usage
  • It must take an input file, a delimiter, the column to be printed, and an output file as arguments. For example: $ sh cut_stuff_EC.sh ~/classdata/Homework_data/data-shell/data/amino-acids.txt : 1 cut_stuff_EC_DC_out.sh

  • It must cut the text file at the first argument keep the text in the column indicated by the second argument and print that column to a new folder. Hint see line 171-177 of Week 2 Wednesday.

  1. Take the for loop that you used in class to print the colors of the rainbow (as is) and make a script called rainbow1.sh that runs that for loop, then takes that output file and generates another output file that has the colors printed in reverse order.

  2. Write and comment a script called rainbow2.sh that takes the for loop that you used in class to print the colors of the rainbow (as is) and if the color is equal to blue print sky, for every other color print fire. Save this script as color.sh . The output should be as follows:

    red
    fire
    orange
    fire
    yellow
    fire
    green
    fire
    blue
    sky
    indigo
    fire
    violet
    fire
  3. Write an script called until8.sh that contains an until loop that prints a counter from 5 until the counter is greater than 8.

  4. Write an script called while_8.sh that contains an while loop that prints a counter while the counter is not greater than 8. Start the counter at 5.

  5. Write an script called range8.sh that contains an range loop that prints a range of numbers from 5 to 8.


Make a text file called Week4_monday.txt

If you ls the the directory ~/classdata/Homework_data/data-shell/molecules directory, you get the following output:

cubane.pdb  ethane.pdb  methane.pdb  octane.pdb  pentane.pdb  propane.pdb
  1. What is the output of the following code?

    $ for datafile in *.pdb
    > do
    >    ls *.pdb
    > done
  2. What is the output of the following code?

    $ for datafile in *.pdb
    > do
    >   ls $datafile
    > done
  3. Why do these two loops give different outputs?

  4. What is output of running the following loop in the data-shell/molecules directory?

    $ for filename in c*
    > do
    >    ls $filename
    > done

    a. No files are listed. b. All files are listed. c. Only cubane.pdb, octane.pdb and pentane.pdb are listed. d. Only cubane.pdb is listed.

  5. How would the output differ from using this command instead?

    $ for filename in *c*
    > do
    >    ls $filename
    > done

    a. The same files would be listed. b. All the files are listed this time. c. No files are listed this time. d. The files cubane.pdb and octane.pdb will be listed. e. Only the file octane.pdb will be listed.

  6. In the data-shell/molecules directory, what is the effect of this loop?

    for alkanes in *.pdb
    do
    echo $alkanes
    cat $alkanes > alkanes.pdb
    done

    a. Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb. b. Prints cubane.pdb, ethane.pdb, and methane.pdb, and the text from all three files would be concatenated and saved to a file called alkanes.pdb. c. Prints cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb, and the text from propane.pdb will be saved to a file called alkanes.pdb. d. None of the above.

  7. What would be the output of the following loop?

for datafile in *.pdb
> do
> cat $datafile >> all.pdb
> done

a. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, and pentane.pdb would be concatenated and saved to a file called all.pdb. b. The text from ethane.pdb will be saved to a file called all.pdb. c. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb would be concatenated and saved to a file called all.pdb. d. All of the text from cubane.pdb, ethane.pdb, methane.pdb, octane.pdb, pentane.pdb and propane.pdb would be printed to the screen and saved to a file called all.pdb.