Lab 5: Using Classes


Introduction

In this lab's exercise, we examine two language translation problems. The first problem is the problem of making plural nouns, and the second problem is converting a word from English into Pig Latin. While they're rather different problems, both involve string manipulation. We will have to explore the string library of C++.

Making Plural Nouns

Making a noun plural usually consists of adding an 's', but sometimes there are special cases we must watch out for. Consider these examples:
Singular noun Plural noun
exercise exercises
noun nouns
word words
abscess abscesses
summons summonses
box boxes
hobby hobbies
party parties

Translating English into Pig Latin

As for Pig Latin, there are two rules:

  1. If the English word begins with a consonant, move the initial consonants to the end of the word and tack on "ay".
  2. If the English word begins with a vowel, tack on "yay" to the end of the word.
English Pig Latin English Pig Latin
alphabetalphabetyay nerderdnay
billygoatillygoatbay orthodoxorthodoxyay
crazyazycray pricklyicklypray
drippingippingdray quasimodoasimodoquay
eligibleeligibleyay rhythmythmrhay
farmarmfay spryyspray
ghostostghay threeeethray
happyappyhay uglyuglyyay
illegalillegalyay vigilantigilantvay
juryuryjay wretchedetchedwray
killjoyilljoykay xerxeserxesxay
limitimitlay yellowellowyay
messyessymay zippyippyzay

Files

Directory: lab5

Create the specified directory, and copy the files above into the new directory. Only gcc users need a makefile; all others should create a project and add all of the .cpp files to it.

Add your name, date, and purpose to the opening documentation of the code and documentation files; if you're modifying and adding to the code written by someone else, add your data as part of the file's modification history.

Strings of Characters

To solve both problems, we first look at the string library.

The string data type in C++ is implemented with a class. A class is a way for us to encapsulate both data and actions together into one object. In a future lab, we'll look at how we can write our own classes, but for now we'll use the classes that C++ provides for us.

A string object is used to hold a string (or sequence) of characters. A single character (i.e., a char) is rarely that useful by itself, so you'll find yourself using strings any time you need to process text.

string Objects

We can easily declare and initialize string objects:

string englishWord = "farm";
After the declaration, englishWord is a string object. A string is an indexed type, so that we can access the individual characters in the string:
The numbers 0, 1, 2 and 3 are the index values of the characters within englishWord, and can be used to access the individual characters within the string object.

The most important thing about indexing the characters of a string is where the indexing starts:

Helpful hint: The indexing of a string always starts at index 0.

So the first character of the string has index 0, the second characters has index 1, the third character has index 2, and so on up to the last character whose index is one less than the size of the string. Keep all of this in mind as you use indices to access the characters and substrings of a string.

string Operations

As mentioned above, a class encapsulates both data and actions into one object. As a class, the string class has many operations we can use on string objects.

An operation in a class is usually implemented as a method (also known as an instance method or a function member). A method is merely a function defined in a class. We'll see how to make these definitions ourselves in a later lab; for now all we need is the ability to call them. This requires a slightly different syntax for our function call:

obj.method(args);
In the past, we didn't have an object before the function name when we called the function; but with a method, the method must be applied to an object. Many object-oriented programmers think of methods as messages that they send to the objects. For example,
int size = stringObject.size();
We'd think of stringObject receiving a message size() which asks the object to figure out its size. In other words, "Hey, stringObject, what's your size?" This metaphor becomes more important when we implement classes, but it's also helpful in mastering this new syntax for calling methods.

Let's look at some of the provided operations of the string class. Suppose that str is a string:

Description Syntax Explanation
Index operator str[index] Returns the character at index from str. If index is out of range for str, it will return junk or crash your program.
Size of string str.size() Returns the size of the string.
Concatenation of two strings str1 + str2 Returns a string equal to the first string followed by the second.
Equality of two strings str1 == str2 Returns true if the two strings are equal, false otherwise.
Substrings str.substr(start,size); Returns the substring starting at index start of length size.
Finding characters in a string. str.find_first_of(pattern,index) Returns the index of the first character in pattern that it can find in str, starting at index. If it cannot find a matching character, then it returns string::npos.

The first three operations are pretty straightforward. Let's look at some examples of the substr() method:

string str = "hello";
string sub1 = str.substr(0,2);
assert (sub1 == "he");
string sub2 = str.substr(3,2);
assert (sub2 == "lo");

This code compiles and executes without any problems. The last example (involving sub2) is important because the second argument is a size, not an index (a common mistake). Often we have the index of the last character we want; if we also have the index of the first character of the substring, computing the size is simple:

size = last - first + 1

This is a very important computation, and you should keep this very handy.

The find_first_of() is a little more involved. Consider this code:

string sample = "Hello.  Nice to meet you!";
int firstIndex = sample.find_first_of(".!?", 0);
This searches sample to find the first occurrence of a character from the pattern ".!?". It does not need to find the entire pattern, just one of the characters from the pattern. So, in the code above, firstIndex should be set to 5, the index of the first period in the string.

The starting index (0 in our example) allows us to start a search in the middle of a string. Often you'll use 0 as your starting index just like above, but we could continue our example:

int secondIndex = sample.find_first_of(".!?", firstIndex+1);
This starts the search right after the previous search and finds the second occurrence of one of the characters.

Question #5.1: What will be the value of secondIndex?

Question #5.2: What would be the value of secondIndex if we started the search at firstIndex instead of firstIndex+1?

Remember that it is not important to memorize a library description like this. It is important to generally know what's available in the library, where you can find the library, and (most importantly) where you can find documentation for the library (Appendix D of C++: An Introduction to Computing by Adams and Nyhoff is one place). The discussion in this section will suffice for this lab---don't memorize this, but be sure to come back to this section when working with these methods.

if Statements

In the previous lab, we used a simple if statement. Our functions in this lab require a more powerful if statement. A full if statement looks like this:

if (BooleanExpression)
{
   Statements1
}
else
{
   Statements2
}
If the BooleanExpression is true, then only Statements1 will be executed; otherwise (that is, if BooleanExpression is false), then only Statements2 will be executed. We can chain two if statements into what is known as a multi-branch if:
if (BooleanExpression1)
{
  Statements1
}
else if (BooleanExpression2)
{
  Statements2
}
else
{
  Statements3
}
The pattern generalizes (as we'll see in the next lab). The boolean expressions are evaluated in order until the first one that evaluates to true, and then its corresponding statements are executed. If none are true, the last block of statements (which an if test) is executed. Exactly one block of statements is executed each time through the statement.

Plural Nouns

Analysis

Examine the chart of singular and plural nouns at the beginning of this exercise. Can you pick out the rules we need?

We'll use some simple rules:

  1. In general, add "s" to the end of the word.
  2. If the word ends in "s" or "x", add "es" to the end of the word.
  3. If the word ends in "y", replace the "y" with "ies".
These rules are not comprehensive, but deriving a set of comprehensive rules for making plural nouns in English is rather difficult. For example, consider the word "radius" whose plural is "radii", not "radiuses". However, this only applies to English words with Latin roots: the plural of "bonus" is "bonuses"; "-us" words are very tricky. Making the rules comprehensive would take a very long time, so we'll stick with just these simple rules.

Question #5.3: For each words in the chart above, match it up with the rule that made it plural.

Seems simple enough. It also seems reasonable that a "pluralizing function" could be generally useful. So it certainly deserves a function and even a library.

OCD

Here's our behavior:

Our function should receive a singular noun. If the noun ends in "s" or "x", return the noun with "es" tacked on the end. If the noun ends in "y", return the noun with the "y" replaced with "ies". Otherwise, return the noun with "s" tacked on the end.

So far so good. It's just a transliteration of the rules. Note that the general case comes last.

Here are the objects we need:
Description Type Kind Movement Name
the singular noun string varying in singularNoun
the index of the last character of the noun int varying local lastCharIndex
the last character of the noun char varying local lastChar
the plural noun string varying in --

We have a specification for our function:

Specification:
receive: a singular noun, a string
precondition: the noun should be singular
return: the plural version of the noun, a string
In the previous lab, we used an assert() statement to enforce our precondition. We could try to enforce the precondition for this function, but it would take a lot of work. In fact, we'd have to implement an entire dictionary because you cannot tell just by looking at the letters in a word if it is already plural or not.

This is okay. Not every precondition can or should be enforced by your code. It certainly makes the code better if you can enforce your preconditions, but it's not absolutely necessary. It is absolutely necessary, however, to clearly indicate that this is a precondition for the function. If a programmer uses your function and passes in a plural word (like "nouns") and gets strange results (like "nounses") it shouldn't be a surprise to the programmer---we warned him!

Using the string operations listed above and other operations we've seen before, here are the operations we need for this function:
Description Predefined? Name Library
receive a value yes parameter built-in
if...then... yes if statement built-in
get last character of string yes [] and size() string
compare characters yes == built-in
concatenate two strings yes + string
extract a substring yes substr() string
return a string yes return built-in

So, finally, we have our algorithm:

  1. Receive singularNoun.
  2. Let lastCharIndex be the size of singularNoun minus 1.
  3. Let lastChar be the character of singularNoun at index lastCharIndex.
  4. If lastChar is 's' or 'x',
        Return singularNoun + "es".
    Otherwise if lastChar is 'y',
    1. Let base be singularNoun without the trailing "y".
    2. Return base + "ies".
    Otherwise,
        Return singularNoun + "s".

Pig Latin Translator

Let's turn to your coding of the Pig Latin translator.

Analysis

Like the pluralizer, our Pig Latin translator will have certain rules to transform a word based on the contents of the word. In the case of the pluralizer, the rule we applied was determined by the last letter of the word. For your Pig Latin translator, it's the location of the first vowel.

Consider some examples:

The focal point in both rules is the first vowel. We'll define 'a', 'e', 'i', 'o', 'u', and 'y' as vowels. While 'y' is only sometimes a vowel, the rules work better if we consider it a vowel.

OCD

Let's try this behavior:

Receive an English word. Find the position of the first vowel in that word, checking that a vowel was actually present. If a vowel begins the word, then return the concatenation of the English word and "yay". Otherwise, the Pig Latin word consists of three parts (in order): (1) the portion of the English word from the first vowel to its end, (2) the initial consonants of the English word, and (3) "ay". Return the Pig Latin word.

Here are the objects that you need:
Description Type Kind Movement Name
the English word string varying in englishWord
index of the first vowel int varying local vowelPosition
the Pig Latin word code string varying out piglatinWord
portion of the English word from its first vowel until its end string varying local lastPart
consonants at the beginning of the English word string varying local firstPart
"yay" string constant local --
"ay" string constant local --

This gives us the following specification for our function:

Specification:
receive: englishWord, a string.
precondition: englishWord should have at least one vowel in it.
return: piglatinWord, a string.

Since this method could be useful in many places (okay, just a bit of a stretch there), we'll use a library. Use the specification to create a prototype for a function named EnglishToPigLatin() in piglatin.h; copy the prototype into the documentation file; and then add a stub for the same function in piglatin.cpp.

The rest of the design is up to you:

Question #5.4: Write out a chart outlining the operations you need for the Pig Latin translator.

Question #5.5: Write an algorithm for the Pig Latin translator.

Plural Nouns (Again)

Coding

We can code up our algorithm for the "pluralizing" algorithm like so:

string Pluralize (string singularNoun)
{
  int lastCharIndex = singularNoun.size() - 1;
  char lastChar = singularNoun[lastCharIndex];
  if ((lastChar == 's') || (lastChar == 'x'))
    return singularNoun + "es";
  else if (lastChar == 'y')
    {
      string base = singularNoun.substr (0, lastCharIndex);
      return base + "ies";
    }
  else
    return singularNoun + "s";
}

Spend some time comparing the algorithm to the code.

Question #5.6: How much time did you spend comparing the algorithm and the code?

It wasn't enough time. Spend some more time.

Now let's break this down:

Step 1 is done automatically through the parameter passing mechanism.

Step 2 uses the size() method to get the index of the last character. Step 3 uses this index to get the last character of singularNoun.

Step 4 is a multi-branch if statement. The conditions of the ifs compare two chars. When we figure out what rule we need to apply, each rule results in a string concatenation. Two of those are quite simple, but the second one deserves a closer examination.

Recall that the substr() method needs a beginning index and a size of the substring. The beginning index is easy: since we want everything but the last character (which is a 'y' we're discarding), we need to start at index 0. It doesn't matter what's in singularNoun or how long it is, a string always begins at index 0.

But how long is this substring? lastCharIndex is the index of the last character. That's the character we need to avoid, so for our substring, the previous character is the last character of our substring. That previous character has index lastCharIndex-1. Using the formula above, we come up with:

substrsize = (lastCharIndex-1) - 0 + 1 = lastCharIndex
This explains why lastCharIndex is doing double duty as a index into singularNoun and as a size for the substring.

Go back one more time to compare the algorithm and the code. There's supposed to be little thought going from the algorithm to the code. You'll struggle with the syntax, but you don't have to get too creative. The double use of lastCharIndex is a bit tricky, but otherwise the two match up very well.

Pig Latin Translator (Again)

Coding

The trickiest operation you'll have to worry about is finding the first vowel in a word. Unlike the noun pluralizer, you cannot test a fixed position with the index operator. You have to search for it. However, read the list of operations above carefully because one of those methods (probably the one we didn't use in the pluralizer---hint! hint! hint!) will mostly solve the problem for you.

Then, extract substrings from the English word to build up the Pig Latin word. Remember to use the formula above to compute the size of a substring.

Code up your algorithm, compile, and run your program.

Testing

Test your Pig Latin translator on all of the words listed above plus others that you come up with.

Maintenance

As you test your function, you'll discover that your function doesn't work on all words. (This should teach you to read the entire lab before working on it!) In particular, your function has problems with words that start with 'y' (like "yellow") or that contain a 'q' in the initial consonants (like "quiet" or "squire").

Words beginning with 'y' pose a problem because we consider 'y' to be a vowel. If we didn't, words like "style" or "spry" wouldn't translate properly. It appears that we need a special case for words beginning with 'y' because then it usually works as a consonant. But this new rule isn't perfect: "Ypsilanti" begins with a 'y' vowel.

Words with a 'q' in the initial consonants also poses a problem because the 'u' after it should also be moved to the end of the word: "squire" should translate to "iresquay", not "uiresqay". This requires a special case. We can complicate the special case further if we permit English words that do not have a 'u' after a 'q'. Strictly speaking, this isn't permitted in English, but many foreign words absorbed into English have a 'q' without a 'u' right after it.

These are both cases that should be considered for a full-fledged Pig Latin translator, although we won't do them here. See Project #5.1.

Submit

Turn in a copy of your code, your answers to the questions, and sample runs of your program demonstrating that it works correctly.

Terminology

class, encapsulate, function member, index, indexed type, instance method, message, method
Lab Home Page | Prelab Questions | Homework Projects
© 2003 by Prentice Hall. All rights reserved.
Report all errors to Jeremy D. Frens.