Homework 5
This assignment will get you used to using built-in and CPAN modules to do much of your work for you, and to searching CPAN to find the module best suited for your task. While it is conceivable that you could write this entire assignment without using any external modules, it would probably take you five times as long and be three times as hard.
Description
Write a program that takes two to four options on the command line. The --url option will be a web site address that you will retrieve. The --file option will be the name of a local file in which to store the modified web site. The optional --email option will be an email address to which you will send results. Finally, the user may or may not specify a --debug option (which has no value) which will determine whether or not debugging statements should be printed out.
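As a rough illustration of the option handling described above, here is a minimal sketch using the core Getopt::Long module. The subroutine name parse_options, the hash keys, and the wording of the error messages are assumptions made for this example, not part of the assignment.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the four options from an array reference
# (pass \@ARGV in a real script). Dies with a message on bad input.
sub parse_options {
    my ($args) = @_;
    my %opt = (debug => 0);

    GetOptionsFromArray(
        $args,
        'url=s'   => \$opt{url},      # required, takes a string value
        'file=s'  => \$opt{file},     # required, takes a string value
        'email=s' => \$opt{email},    # optional, takes a string value
        'debug'   => \$opt{debug},    # optional flag with no value
    ) or die "Usage: $0 --url URL --file FILE [--email ADDRESS] [--debug]\n";

    # --url and --file must both be present.
    for my $required (qw(url file)) {
        die "Missing required option --$required\n"
            unless defined $opt{$required};
    }
    return \%opt;
}
```

A script would typically call this as my $opt = parse_options(\@ARGV); and then read $opt->{url}, $opt->{file}, and so on.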
Obtain the contents of the URL that the user specified on the command line. Within that document, find all numbers (with or without decimal points, signed or unsigned), and replace them with the Roman-numeral version of the integer portion of that number. Save the modified document to the file named by the user's --file option.
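The substitution step can be sketched with nothing but core Perl, as below. This is a hand-rolled stand-in so the example stays self-contained; it is not the CPAN-based approach the assignment expects. The subroutine names to_roman and romanize are hypothetical, the pattern is a simplification that requires at least one digit, and leaving values outside 1..3999 (including 0) untouched is an assumption of this sketch.

```perl
use strict;
use warnings;

# Convert an integer in 1..3999 to a Roman numeral.
# Returns undef for anything outside that range (an assumption here).
sub to_roman {
    my ($n) = @_;
    return if $n < 1 || $n > 3999;
    my @table = (
        [1000, 'M'], [900, 'CM'], [500, 'D'], [400, 'CD'],
        [100,  'C'], [90,  'XC'], [50,  'L'], [40,  'XL'],
        [10,   'X'], [9,   'IX'], [5,   'V'], [4,   'IV'], [1, 'I'],
    );
    my $roman = '';
    for my $pair (@table) {
        my ($value, $symbol) = @$pair;
        while ($n >= $value) {
            $roman .= $symbol;
            $n     -= $value;
        }
    }
    return $roman;
}

# Replace each number in $text with the Roman numeral of its integer part.
# Returns the new text followed by the numeric values that were replaced.
sub romanize {
    my ($text) = @_;
    my @found;
    $text =~ s{([-+]?)(\d+(?:\.\d+)?|\.\d+)}{
        my ($sign, $number) = ($1, $2);
        my $roman = to_roman(int $number);
        push @found, "$sign$number" + 0 if defined $roman;
        defined $roman ? ($sign eq '-' ? '-' : '') . $roman : "$sign$number";
    }ge;
    return ($text, @found);
}
```

The /e modifier evaluates the replacement as Perl code for every match, which is what lets the conversion and the bookkeeping happen in a single pass. This sketch keeps a leading '-' in front of the numeral and drops a leading '+'.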
If the user specified a --email option, send that address a summary email that contains:
- The URL of the original web site
- The total number of substitutions made
- The minimum and maximum numbers found and replaced
- The current date and time, in format: 'November 2, 2005 4:52PM'
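The body of such a summary could be assembled along these lines, using only core modules. The subroutine names, the field labels, and the choice to report the minimum and maximum of the original values (rather than their integer portions) are assumptions of this sketch. Actually sending the message is left out, since that depends on the mail module you choose.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Format a localtime-style list as 'November 2, 2005 4:52PM'.
# With no arguments, the current local time is used.
sub format_timestamp {
    my @t = @_ ? @_ : localtime;
    my @months = qw(January February March April May June July
                    August September October November December);
    my ($minute, $hour, $mday, $mon, $year) = @t[1 .. 5];
    my $ampm   = $hour >= 12 ? 'PM' : 'AM';
    my $hour12 = $hour % 12 || 12;          # 0 and 12 both display as 12
    return sprintf '%s %d, %d %d:%02d%s',
        $months[$mon], $mday, $year + 1900, $hour12, $minute, $ampm;
}

# Build the summary text from the URL and an array reference of the
# numbers that were replaced. Extra arguments are passed on as the time.
sub build_summary {
    my ($url, $numbers, @time) = @_;
    my $count = @$numbers;
    my ($low, $high) = $count
        ? (min(@$numbers), max(@$numbers))
        : ('n/a', 'n/a');
    return join "\n",
        "URL: $url",
        "Substitutions: $count",
        "Minimum: $low",
        "Maximum: $high",
        'Date: ' . format_timestamp(@time),
        '';
}
```

Building the date by hand avoids relying on strftime conversions for unpadded days and hours, which are not available on every platform.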
If the user provided the --debug option, print out explanatory debugging statements at each step of the way. ("Now obtaining $url", "Making substitutions", "$url saved in $file", "sending email to $address", etc.)
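One way to keep the debugging statements from cluttering the main logic is a small closure that prints only when the flag is set. This is just a sketch: the name make_debug_logger, the "DEBUG: " prefix, and the default of writing to STDERR are all assumptions.

```perl
use strict;
use warnings;

# Return a code reference that prints its arguments when $enabled is true
# and does nothing otherwise. $fh defaults to STDERR (an assumption).
sub make_debug_logger {
    my ($enabled, $fh) = @_;
    $fh ||= \*STDERR;
    return sub {
        return unless $enabled;
        print {$fh} "DEBUG: @_\n";
        return;
    };
}
```

With my $debug = make_debug_logger($opt->{debug});, each step then becomes a one-liner such as $debug->("Now obtaining $url");.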
Notes
- All but one of the modules I presume you will use for this assignment were covered in class. The remaining module can be found with a simple search at search.cpan.org.
- --url and --file are required options - your program should exit with an error if the user does not specify them. --email and --debug are not.
- Regexp::Common does have a minor bug when it comes to searching for numbers. For some reason, a single period is taken to be a valid real number. For the purposes of this assignment, you can assume that the document you obtain does not have any periods that are not decimal points. (Though if you can find a way to work around this bug, that would obviously be preferable.)
- I strongly suggest you download all of the modules you believe you will need ASAP. You do not want to be dealing with CPAN errors three hours before the deadline.
- Please do not email us to ask how to use a particular module until after you have read that module's documentation. After that, you are of course welcome and encouraged to request help in understanding any module.
- You are permitted and encouraged to use any standard or CPAN modules you think will be helpful to this assignment.
- You may find it very helpful to review the modifiers portion of the second Regular Expressions lecture.
- Under no circumstances should you be shelling out to any external command like wget, lynx, date, or ls.
- The concept of "negative" isn't really something Roman numerals deal with. If you encounter a negative number, you can either drop the '-' or prepend it to the Roman numeral.
Grading Criteria
| Criterion | Points |
|---|---|
| Obtain four command line options | 5 |
| Retrieve website document | 10 |
| Replace all numbers with Roman numerals | 20 |
| Store modified document in file | 5 |
| Send email | 12.5 |
| Compute number of subs, min and max | 15 |
| Include formatted date | 12.5 |
| Clean Compilation | 5 |
| Error Checking | 5 |
| Output Style | 5 |
| Code Style | 5 |
Compilation
Points are deducted for any warning messages your code produces. Note that warnings will be enabled when we run your program, regardless of whether or not you've enabled warnings within the code.
As a reminder, while use strict; is not a requirement
for the homework, it is a very strong suggestion (also keep in mind
that it is required before emailing for help.)
If your code produces compilation errors, you will lose all the points for compilation, and the remainder of your program will be graded subjectively based on the code you've written itself. You should never submit code that does not compile.
Output Style
Output, in this case, refers to the menu and prompts printed to the user. In general, output should be well formatted and clearly identifiable. All output on one single line is bad. The raw data, without being labeled, is also bad. If it hurts my (or the TA's) eyes to look at, it's bad.
Code Style
All code that you write should be well styled. Three features are most important: a consistent and helpful indentation scheme; meaningful variable names; and explanatory but not over-abundant comments.
For examples of well-styled code, please see perldoc perlstyle.
Supplemental Files
The main homework page will soon have a sample URL to use, a sample output file, sample debugging output, a sample email text, and a Frequently Asked Questions page.
Submission Instructions
This homework is due Tuesday, November 15, 2005 at 11:59:59pm. As a
reminder, your program must be functional on the machine
rcs-sun4.rpi.edu. To submit your homework, log in to
rcs-sun4.rpi.edu, and run the program:
~lallip/public/hw_submit.pl
Follow the prompts. You will receive an email confirmation of your
submission. If anything goes wrong, please email Paul at lallip@cs.rpi.edu ASAP.
Remember that you may submit as many times as you like. Only the last homework submission will be looked at.
The homework will be accepted until Wednesday, November 16, 2005 at 3:59:59pm for a reduction of 20 points. Please note that if you make a submission at 11:30pm on the due date, and another 2 hours later, only the late submission will be graded. If your program is not complete, please use the table above to determine if a 20 point deduction warrants you completing the assignment and submitting late.
