Perl


In-Class Assignment #4

Write a program that searches a long string for all duplicate words. For each duplicate found, print all the text that appears between the two instances.

Caveats

You must find all instances of duplicates, even when the two copies appear on different lines.

Duplicate matching must be case-insensitive ('Hello' and 'heLLo' are duplicates).

For the purposes of this assignment, a "word" is a sequence of one or more letters, digits, or underscores (the same characters Perl's \w matches).

Match whole words only ("The" and "there" must not be reported as duplicates).
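
The caveats about word characters, case, and whole words can be illustrated with a small sketch. The helper name has_whole_word and the test strings are hypothetical, chosen only for illustration; they are not part of the assignment.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when $text contains $word as a whole
# word, ignoring case, and 0 otherwise. \b matches at a boundary between
# a \w character and a non-\w character (or the end of the string).
sub has_whole_word {
    my ($text, $word) = @_;
    return $text =~ /\b\Q$word\E\b/i ? 1 : 0;
}

print has_whole_word('say heLLo again', 'Hello'), "\n";  # 1: same word, different case
print has_whole_word('over there',      'The'),   "\n";  # 0: "the" only occurs inside "there"

# \w+ treats letters, digits, and underscores as word characters
my @words = 'foo_bar, baz42 qux!' =~ /(\w+)/g;
print join('|', @words), "\n";                           # foo_bar|baz42|qux
```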

Advice

Start with the most basic part of this assignment, and then gradually make the expression more complicated:

  1. Find a single pair of duplicate word-character sequences, and print out what's in between
  2. Make that search case insensitive
  3. Allow the search to span newlines
  4. Restrict the search to whole words only (for both the first and second occurrence)
  5. Look for all occurrences of this pattern, in a looping construct
  6. Prevent your regexp from "eating up" everything after the first occurrence of the first duplicate word
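
The steps above can be sketched as follows. This is one possible progression, not the required solution: the test string is hypothetical, and the non-greedy `.*?` and the lookahead `(?=...)` are assumptions about how steps 5 and 6 might be approached.

```perl
use strict;
use warnings;

# Hypothetical test string, not the assignment's sample data
my $text = "This is a test.\nA TEST it is.";

# Step 1 only: capture some word characters, then require the same
# characters again via the backreference \1. With no flags and no \b,
# this pairs the "is" inside "This" with the word "is".
my @naive = $text =~ /(\w+)(.*)\1/;
print "naive:   '$naive[0]' / '$naive[1]'\n";

# Steps 2-4: /i ignores case, /s lets . match newlines, and \b anchors
# both the capture and the backreference to whole words.
my @refined = $text =~ /\b(\w+)\b(.*?)\b\1\b/is;
print "refined: '$refined[0]'\n";

# Steps 5-6: loop with /g. Placing the rest of the pattern inside a
# lookahead means only the first word is consumed, so the next search
# resumes right after that word instead of after its second copy.
my @pairs;
while ($text =~ /\b(\w+)\b(?=(.*?)\b\1\b)/gis) {
    push @pairs, [$1, $2];
}
print "found: ", join(', ', map { $_->[0] } @pairs), "\n";
```

Because this sketch uses a non-greedy `.*?`, each word is paired with its nearest later duplicate; a greedy `.*` would pair it with the farthest one instead.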

Getting Started

Here is a skeletal program you can use to get started. Copy and paste it into your editor:

#!/usr/bin/env perl
use strict;
use warnings;

# Read the entire contents of __DATA__ into one scalar
local $/ = undef;
my $string = <DATA>;

####Your code here####

__DATA__
The quick brown fox jumps over the lazy dog,
A fox stands by my dog over there.
This text is over now.
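
The skeleton relies on Perl's input record separator, $/. A minimal sketch of the difference between the default line-by-line mode and slurp mode is shown here; the in-memory filehandle and the input string are hypothetical stand-ins for DATA.

```perl
use strict;
use warnings;

# Hypothetical input; an in-memory filehandle stands in for DATA
my $input = "first line\nsecond line\nthird line\n";

open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my $first = <$fh>;        # default $/ is "\n", so this reads one line

my $rest;
{
    local $/ = undef;     # slurp mode, limited to this block
    $rest = <$fh>;        # reads everything that remains
}
close $fh;

print "first: $first";
print "rest:\n$rest";
```

Using local restores the previous value of $/ when the enclosing scope ends, so other reads in a larger program are not affected.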

Sample Output

Using the sample input built into the code above, your output should be similar to:

Between copies of 'The':
===================================
 quick brown fox jumps over
===================================

Between copies of 'fox':
===================================
 jumps over the lazy dog,
A
===================================

Between copies of 'over':
===================================
 the lazy dog,
A fox stands by my dog over there.
This text is
===================================

Between copies of 'dog':
===================================
,
A fox stands by my
===================================

Between copies of 'over':
===================================
 there.
This text is
===================================
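
One way to produce blocks in this layout is sketched below. The helper name format_block and the separator width of 35 are assumptions made for illustration; adjust the width to match the expected output.

```perl
use strict;
use warnings;

# Hypothetical helper that formats one block of the report.
# $width is an assumed separator length, not taken from the assignment.
sub format_block {
    my ($word, $between) = @_;
    my $width = 35;
    my $rule  = '=' x $width;    # the x operator repeats a string
    return "Between copies of '$word':\n$rule\n$between\n$rule\n\n";
}

print format_block('fox', " jumps over the lazy dog,\nA ");
```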

Submission

Your program must run on rcs-sun4.rpi.edu using Perl 5.8. When you are ready to submit, execute the program ~lallip/public/ic_submit.pl and follow the prompts.

This program must be submitted by 6:00 pm today, Wednesday, September 28.
