COMP1511 19T2

Objectives

  • simple processing of characters
  • using command line arguments
  • an introduction to encryption & decryption

Preparation

Before the lab you should re-read the relevant lecture slides and their accompanying examples.

Getting Started

Create a new directory for this lab called lab06 by typing:
mkdir lab06
Change to this directory by typing:
cd lab06

Introduction

WWII code-breaking at Bletchley Park was the genesis of modern computing. In this lab you too will perform computer-assisted code-breaking, but don't worry: the ciphers you must break are simpler than the Nazis' Lorenz and Enigma ciphers.

The exercises in the labs require you to read characters. Use the library function getchar to do this.

getchar reads the next character from standard input and returns it. If getchar is unable to read a character it returns the special value EOF.

When input is coming from a file, getchar will return EOF after the last character in the file is read.

When input is coming from a (Linux/OSX) terminal, you can indicate no more characters can be read by typing Ctrl+D. This will cause getchar to return EOF.

In some of this week's lab exercises you will find it convenient to put the test input in a file, rather than type it every time you want to test the program.

You can use a < character to indicate to the shell that you want to run a program taking its input from a file.

So, for example, you might create the file input.txt with gedit and then run a.out so that it takes its input from the file rather than the terminal:

gedit input.txt &
./a.out <input.txt

Exercise: Devowelling Text (pair)

This is a pair exercise to complete with your lab partner.
Write a C program devowel.c which reads characters from its input and writes the same characters to its output, except it does not write lower case vowels ('a', 'e', 'i', 'o', 'u').

Your program should stop only at the end of input.

For example:

./devowel
Are you saying 'Boo' or 'Boo-Urns'?
Ar y syng 'B' r 'B-Urns'?
In this house, we obey the laws of thermodynamics!
In ths hs, w by th lws f thrmdynmcs!

Hint: use getchar to read characters (don't use scanf or fgets).

Hint: you need only a single int variable. Don't use an array.

Hint: use putchar to output each character.

Hint: make sure you understand this example program which reads characters until end of input.

Hint: make sure you understand this example program which reads characters, printing them with lower case letters converted to upper case.

Hint: create a function with a prototype like this:

int is_vowel(int character);
which returns 1 if the character is a lower case vowel and 0 otherwise.

Hint: To tell the program you have finished typing, you can press Ctrl+D.

New! You can run an automated code style checker using the following command:
1511 style devowel.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest devowel

Autotest Results

97% of 447 students who have autotested devowel.c so far, passed all autotest tests.
  • 97% passed test 0
  • 97% passed test 1
  • 97% passed test 2
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_devowel devowel.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Exercise: It's a Case of Swapping (pair)

This is a pair exercise to complete with your lab partner.
Write a C program swap_case.c which reads characters from its input and writes the same characters to its output with lower case letters converted to upper case and upper case letters converted to lower case.

Your program should stop only at the end of input.

For example:

dcc swap_case.c -o swap_case
./swap_case
Are you saying 'Boo' or 'Boo-Urns'?
aRE YOU SAYING 'bOO' OR 'bOO-uRNS'?
In this house, we obey the laws of thermodynamics!
iN THIS HOUSE, WE OBEY THE LAWS OF THERMODYNAMICS!
UPPER !@#$% lower
upper !@#$% LOWER

Hint: use getchar to read characters (don't use scanf or fgets).

Hint: you need only a single int variable. Don't use an array.

Hint: use putchar to output each character.

Hint: make sure you understand this example program which reads characters until end of input.

Hint: make sure you understand this example program which reads characters, printing them with lower case letters converted to upper case.

Hint: create a function with a prototype like this:

int swap_case(int character);
which:
  • returns the character in lower case if it is an upper case letter
  • returns the character in upper case if it is a lower case letter
  • returns the character unchanged otherwise

New! You can run an automated code style checker using the following command:
1511 style swap_case.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest swap_case

Autotest Results

98% of 432 students who have autotested swap_case.c so far, passed all autotest tests.
  • 98% passed test 0
  • 98% passed test 1
  • 98% passed test 2
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_swap_case swap_case.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Exercise: Encrypting Text with a Caesar Cipher (pair)

This is a pair exercise to complete with your lab partner.
Write a C program caesar.c which reads characters from its input and writes the characters to its output encrypted with a Caesar cipher.

A Caesar cipher shifts each letter a certain number of positions in the alphabet.

The number of positions to shift will be given to your program as a command line argument.

Characters other than letters should not be encrypted.

Your program should stop only at the end of input.

Your program should contain at least one function other than main.

For example:

./caesar 1
This life well it's slipping right through my hands
Uijt mjgf xfmm ju't tmjqqjoh sjhiu uispvhi nz iboet
These days turned out nothing like I had planned
Uiftf ebzt uvsofe pvu opuijoh mjlf J ibe qmboofe

./caesar 10
abcdefghijklmnopqrstuvwxyz
klmnopqrstuvwxyzabcdefghij
ABCDEFGHIJKLMNOPQRSTUVWXYZ
KLMNOPQRSTUVWXYZABCDEFGHIJ

./caesar -42
Control well it's slipping right through my hands
Myxdbyv govv sd'c cvszzsxq bsqrd drbyeqr wi rkxnc
These days?
Droco nkic?

Hint: handle upper and lower case letters separately

Hint: use %

Hint: use atoi to convert the first command-line argument to an int.

Hint: make sure you understand this example program which uses atoi to convert command-line arguments to ints.

Hint: create a function with a prototype like this:

int encrypt(int character, int shift);
which returns the character shifted by the specified amount

Manually Cracking a Caesar Cipher

Here is some (New Zealand) English text that has been encrypted with a Caesar cipher.
Z uf dp drbvlg ze jfdvsfup vcjv'j tri
Nv fiuvi uzwwvivek uizebj rk kyv jrdv srij
Z befn rsflk nyrk pfl uzu reu Z nreer jtivrd kyv kilky
Jyv kyzebj pfl cfmv kyv svrty, pfl'iv jlty r urde czri
Use the program you have just written to discover the secret text.

Hint: try different shifts until you see English.

Your program will only be tested with an appropriate command line argument - but a good programmer would check the command line argument is present and appropriate.

New! You can run an automated code style checker using the following command:
1511 style caesar.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest caesar

Autotest Results

88% of 385 students who have autotested caesar.c so far, passed all autotest tests.
  • 98% passed test 0
  • 96% passed test 1
  • 95% passed test 2
  • 95% passed test 3
  • 95% passed test 4
  • 94% passed test 5
  • 89% passed test 6
  • 94% passed test 7
  • 88% passed test 8
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_caesar caesar.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Exercise: Working Out the Letter Frequencies of Text (pair)

This is a pair exercise to complete with your lab partner.
Write a C program frequency_analysis.c which reads characters from its input until end of input.

It should then print the occurrence frequency for each of the 26 letters 'a'..'z'.

The frequency should be printed as a decimal value and an absolute number in exactly the format below.

Note upper and lower case letters are counted together.

For example:

./frequency_analysis
Hello and goodbye.

'a' 0.066667 1
'b' 0.066667 1
'c' 0.000000 0
'd' 0.133333 2
'e' 0.133333 2
'f' 0.000000 0
'g' 0.066667 1
'h' 0.066667 1
'i' 0.000000 0
'j' 0.000000 0
'k' 0.000000 0
'l' 0.133333 2
'm' 0.000000 0
'n' 0.066667 1
'o' 0.200000 3
'p' 0.000000 0
'q' 0.000000 0
'r' 0.000000 0
's' 0.000000 0
't' 0.000000 0
'u' 0.000000 0
'v' 0.000000 0
'w' 0.000000 0
'x' 0.000000 0
'y' 0.066667 1
'z' 0.000000 0
./frequency_analysis
Hey! Hey! Hey!
I don't like walking around this old and empty house
So hold my hand, I'll walk with you my dear

'a' 0.072289 6
'b' 0.000000 0
'c' 0.000000 0
'd' 0.084337 7
'e' 0.084337 7
'f' 0.000000 0
'g' 0.012048 1
'h' 0.096386 8
'i' 0.072289 6
'j' 0.000000 0
'k' 0.036145 3
'l' 0.084337 7
'm' 0.036145 3
'n' 0.060241 5
'o' 0.084337 7
'p' 0.012048 1
'q' 0.000000 0
'r' 0.024096 2
's' 0.036145 3
't' 0.048193 4
'u' 0.036145 3
'v' 0.000000 0
'w' 0.036145 3
'x' 0.000000 0
'y' 0.084337 7
'z' 0.000000 0
Hint: use getchar to read characters (don't use scanf or fgets).

Hint: make sure you understand this example program which reads characters until end of input.

Hint: use an array to store counts of each letter.

Hint: make sure you understand this example program which counts integers from the range 0..99.

New! You can run an automated code style checker using the following command:
1511 style frequency_analysis.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest frequency_analysis

Autotest Results

96% of 330 students who have autotested frequency_analysis.c so far, passed all autotest tests.
  • 97% passed test 0
  • 96% passed test 1
  • 96% passed test 2
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_frequency_analysis frequency_analysis.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Exercise: Encrypting Text with a Substitution Cipher (pair)

This is a pair exercise to complete with your lab partner.
Write a C program substitution.c which reads characters from its input and writes the characters to its output encrypted with a Substitution cipher.

A Substitution cipher maps each letter to another letter.

The mapping will be given to your program as a single command line argument. This command line argument will contain 26 characters: an ordering of the letters 'a'..'z'.

Characters other than letters should not be encrypted.

Your program should stop only at the end of input.

Your program should contain at least one function other than main.

For example:

./substitution qwertyuiopasdfghjklzxcvbnm
I was scared of dentists and the dark
O vql leqktr gy rtfzolzl qfr zit rqka
I was scared of pretty girls and starting conversations
O vql leqktr gy hktzzn uoksl qfr lzqkzofu egfctklqzogfl

./substitution abcdefghijklmnopqrstuvwxyz
The identity cipher!!!
The identity cipher!!!

./substitution bcdefghijklmnopqrstuvwxyza
The Caesar cipher is a subset of the substitution cipher!
Uif Dbftbs djqifs jt b tvctfu pg uif tvctujuvujpo djqifs!

Your program will only be tested with an appropriate command line argument - but a good programmer would check the command line argument is present and appropriate.
New! You can run an automated code style checker using the following command:
1511 style substitution.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest substitution

Autotest Results

94% of 282 students who have autotested substitution.c so far, passed all autotest tests.
  • 94% passed test 0
  • 95% passed test 1
  • 94% passed test 2
  • 94% passed test 3
  • 94% passed test 4
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_substitution substitution.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Exercise: Decrypting a Substitution Cipher (pair)

This is a pair exercise to complete with your lab partner.
Write a C program decode.c which decrypts text encrypted by substitution.c

For example:

./decode qwertyuiopasdfghjklzxcvbnm
O vql leqktr gy rtfzolzl qfr zit rqka
I was scared of dentists and the dark
O vql leqktr gy hktzzn uoksl qfr lzqkzofu egfctklqzogfl
I was scared of pretty girls and starting conversations

./decode abcdefghijklmnopqrstuvwxyz
The identity cipher!!!
The identity cipher!!!
./decode bcdefghijklmnopqrstuvwxyza
Uif Dbftbs djqifs jt b tvctfu pg uif tvctujuvujpo djqifs!
The Caesar cipher is a subset of the substitution cipher!

Your program will only be tested with an appropriate command line argument - but a good programmer would check the command line argument is present and appropriate.

Manually Cracking a Substitution Cipher

This English text was encrypted with a substitution cipher.
Di jd, vdl'ht xtqa dh O qn
Vdl rdlwk O'ss wdkith htqromu omkd ok
O fhdwqwsv xdm'k
Styk kd nv dxm rtzoetj
Wlk kiqk'j kit royythtmet om dlh dfomodmj

Vdl'ht q ndlkiyls
Kiqk qndlmkj ydh qmdkith xtta dm nv dxm
Mdx O'n q mdzts nqrt htjdlhetyls
O jkqhk q eiqom xoki nv kidluik

Kqsa oj eitqf, nv rqhsomu
Xitm vdl'ht yttsomu houik qk idnt
O xqmmq nqat vdl ndzt xoki edmyortmet
O xqmmq wt xoki vdl qsdmt
What was the original text?

Hint: use frequency_analysis.c on the encrypted text and compare the frequencies to English letter frequencies and then try your guesses with decode.c

New! You can run an automated code style checker using the following command:
1511 style decode.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest decode

Autotest Results

96% of 250 students who have autotested decode.c so far, passed all autotest tests.
  • 96% passed test 0
  • 97% passed test 1
  • 97% passed test 2
  • 97% passed test 3
  • 96% passed test 4
When you are finished on this exercise you and your lab partner must both submit your work by running give:
give cs1511 lab06_decode decode.c
Note, even though this is a pair exercise, you both must run give from your own account before Monday 15 July 17:00 to obtain the marks for this lab exercise.

Challenge Exercise: Cracking A Caesar Cipher (individual)

This is an individual exercise to complete by yourself.
Write a C program crack_caesar.c which decrypts text encrypted by an unknown Caesar cipher.

Your program should make no assumptions about the language of the original text - don't assume it's English. However, you can assume the English alphabet ('a'..'z').

Your program will be given as a command-line argument the name of a file containing a large amount of unencrypted text in the same language as the encrypted text.

For example, your program might be given this file containing 188k characters of English text (Wikipedia sentences from here).

Your program will be given the encrypted text on standard input. It should print its decryption.

For example, here is some English text encrypted with a Caesar cipher with an unknown shift:

Kyzj zj fli crjk xffuspv
Z yrkv kf wvvc kyv cfmv svknvve lj uzv
Slk zk'j fmvi
Aljk yvri kyzj reu kyve Z'cc xf
Pfl xrmv dv dfiv kf czmv wfi
Dfiv kyre pfl'cc vmvi befn
So for example:
./crack_caesar wiki_sentences.txt
Kyzj zj fli crjk xffuspv
Z yrkv kf wvvc kyv cfmv svknvve lj uzv
Slk zk'j fmvi
Aljk yvri kyzj reu kyve Z'cc xf
Pfl xrmv dv dfiv kf czmv wfi
Dfiv kyre pfl'cc vmvi befn

This is our last goodbye
I hate to feel the love between us die
But it's over
Just hear this and then I'll go
You gave me more to live for
More than you'll ever know
You may assume the encrypted text on stdin contains at most 10000 characters.

You may assume the unencrypted example text in the file contains at most 250000 characters.

Hint: use fopen to open the file and fgetc to read the file. If you haven't seen them in lectures yet, read this example program to see how to use these functions to read a file.

Hint: read all the encrypted text into an array, then decrypt it.

New! You can run an automated code style checker using the following command:
1511 style crack_caesar.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest crack_caesar

Autotest Results

47% of 19 students who have autotested crack_caesar.c so far, passed all autotest tests.
  • 47% passed test 0
  • 47% passed test 1
  • 47% passed test 2
  • 63% passed test 3
When you are finished working on this exercise you must submit your work by running give:
give cs1511 lab06_crack_caesar crack_caesar.c
You must run give before Monday 15 July 17:00 to obtain the marks for this lab exercise. Note, this is an individual exercise, the work you submit with give must be entirely your own.

Extra-hard challenge: Cracking A Substitution Cipher (individual - attempt if you dare)

This is an individual exercise to complete by yourself.
Write a C program crack_substitution.c which decrypts text encrypted by an unknown substitution cipher.

Your program should make no assumptions about the language of the original text - don't assume it's English. In other words, don't hard code English properties into your program; extract the statistical properties from the sample plain text. However, you can assume the English alphabet ('a'..'z').

Your program will be given as a command-line argument the name of a file containing a large amount of unencrypted text in the same language as the encrypted text.

Your program will be given the encrypted text on standard input. You may read it all before printing the decryption.

For example:

./crack_substitution wiki_sentences.txt
M'ka paat dra qegbu, ueta md xbb
Rxu vw fxya teq
Umxvetup, ogmbbmxtd, mt Oab-Xmg teq
Red psvvag tmlrdp, vmu Jsbw
Qrat wes xtu M qaga negakag qmbu
Dra fgxzw uxwp, fmdw bmlrdp
Dra qxw wes'u cbxw qmdr va bmya x frmbu
Qmbb wes pdmbb beka va
Qrat M'v te betlag westl xtu oaxsdmnsb?
Qmbb wes pdmbb beka va
Qrat M'ka led tedrmtl osd vw xfrmtl pesb?
M yteq wes qmbb, M yteq wes qmbb
M yteq drxd wes qmbb
Qmbb wes pdmbb beka va qrat M'v te betlag oaxsdmnsb?
M'ka paat dra qegbu, bmd md sc
Xp vw pdxla teq
Frxbbatlmtl xtlabp mt x taq xla teq
Red psvvag uxwp, gefy t gebb
Dra qxw wes cbxw neg va xd wesg preq
Xtu xbb dra qxwp, M led de yteq
Wesg cgaddw nxfa xtu abafdgmf pesb

I've seen the world, done it all
Had my cake now
Diamonds, brilliant, in Bel-Air now
Hot summer nights, mid July
When you and I were forever wild
The crazy days, city lights
The way you'd play with me like a child
Will you still love me
When I'm no longer young and beautiful?
Will you still love me
When I've got nothing but my aching soul?
I know you will, I know you will
I know that you will
Will you still love me when I'm no longer beautiful?
I've seen the world, lit it up
As my stage now
Challenging angels in a new age now
Hot summer days, rock n roll
The way you play for me at your show
And all the ways, I got to know
Your pretty face and electric soul
You may assume the encrypted text on stdin contains at most 10000 characters.

You may assume the unencrypted example text in the file contains at most 250000 characters.

Hint: you will need to look at the probabilities of sequences of 2 or perhaps 3 letters occurring or perhaps the probabilities of words.

Hint: the C library functions fopen and fgetc, which you need to read the file, haven't been covered in lectures. You'll need to research these.

An autotest is available to help you test your program but because this is a difficult problem it is possible very good attempts at the problem won't pass the autotests.

New! You can run an automated code style checker using the following command:
1511 style crack_substitution.c

When you think your program is working you can use autotest to run some simple automated tests:

1511 autotest crack_substitution

Autotest Results

0% of 5 students who have autotested crack_substitution.c so far, passed all autotest tests.
When you are finished working on this exercise you must submit your work by running give:
give cs1511 lab06_crack_substitution crack_substitution.c
You must run give before Monday 15 July 17:00 to obtain the marks for this lab exercise. Note, this is an individual exercise, the work you submit with give must be entirely your own.

Submission

When you have finished each exercise, make sure you submit your work by running give.

You can run give multiple times. Only your last submission will be marked.

Don't submit any exercises you haven't attempted.

If you are working at home, you may find it more convenient to upload your work via give's web interface.

Remember you have until Monday 15 July 17:00 to submit your work.

You cannot obtain marks by e-mailing lab work to tutors or lecturers.

You can check the files you have submitted here.

Automarking will be run by the lecturer several days after the submission deadline for the test, using test cases that you haven't seen: different to the test cases autotest runs for you.

(Hint: do your own testing as well as running autotest)

After automarking is run by the lecturer you can view the results here. The resulting mark will also be available via give's web interface.

Lab Marks

When all components of a lab are automarked you should be able to view the marks via give's web interface or by running this command on a CSE machine:

1511 classrun -sturec
The lab exercises for each week are worth in total 2 marks.

The best 8 of your 9 lab marks for weeks 2-10 will be summed to give you a mark out of 13. If their sum exceeds 13, your mark will be capped at 13.

  • You can obtain full marks for the labs without completing challenge exercises
  • You can miss 1 lab without affecting your mark.