Homework 2
This homework will involve using regular expressions to parse and modify a file.
Write a program that does basic style-checks on a Perl program. For every file listed on the command line, read the file line by line, and print out a warning if any of the lines are "suspicious". Suspicious lines include:
- lines ending in whitespace (other than the newline, of course)
- lines starting with tabs followed by spaces
- lines starting with spaces followed by tabs
- lines that are more than 80 characters long
- lines that attempt to open a bareword filehandle instead of a lexical. For example, open FH, '<', $file instead of open my $fh, '<', $file
- lines following a line that ends with {, that do not have more whitespace starting them than the previous line did (i.e., blocks that aren't indented)
- lines that have nothing but whitespace (not including the newline)
- lines with variables that have one-letter names
For that last one, every time you find a badly-named variable and warn the user about it, prompt the user to enter a different variable name. Do this only once per badly-named variable. Make sure the replacement the user entered is a valid user-defined Perl variable name, and that it's the same kind of variable (scalar, array, hash).
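A few of the whitespace and length checks from the list above can each be written as a single regular expression. The following is only a minimal sketch, not a reference solution: the helper name check_line and the warning strings are hypothetical, and the remaining checks (bareword filehandles, indentation, whitespace-only lines, one-letter variables) are left out.

```perl
use strict;
use warnings;

# check_line is a hypothetical helper: it returns a list of warning
# strings for one line of input (an empty list means nothing was found).
sub check_line {
    my ($line) = @_;
    (my $text = $line) =~ s/\n\z//;    # ignore the newline itself
    my @found;
    push @found, 'ends in whitespace'      if $text =~ /\s\z/;
    push @found, 'tabs followed by spaces' if $text =~ /\A\t+ /;
    push @found, 'spaces followed by tabs' if $text =~ /\A +\t/;
    push @found, 'more than 80 characters' if length($text) > 80;
    return @found;
}

# Example: a line that starts with a tab and two spaces and ends in a space.
print join(', ', check_line("\t  my \$total = 1; \n")), "\n";
# prints: ends in whitespace, tabs followed by spaces
```

Returning a list instead of printing directly keeps the checks separate from the output format, so the same helper could be reused when writing the .mod file.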
Once you've finished your analysis of the file, open a new file with the same name, but with a ".mod" extension appended. Print a better-formatted version of the file:
- Add in the shebang, use strict; and use warnings; if they weren't the first three lines in the original
- Remove the extraneous whitespace ending a line
- Replace all instances of the badly-named variables with the replacement entered by the user.
- Replace all instances of the global-bareword filehandles with a similarly-named lexical variable. Make sure the first instance is declared. For example,
    open FH, '<', $file or die ...
    while (my $line = <FH>) { ... }
    close FH;
  becomes
    open my $FH, '<', $file or die ...
    while (my $line = <$FH>) { ... }
    close $FH;
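The filehandle rewrite in the last item can be sketched as a handful of substitutions. This is an illustration under stated assumptions, not the expected solution: the helper name lexicalize_handle is hypothetical, the handle name is assumed to be already known, and only open without parentheses, the <FH> read operator, and close, print, or printf followed directly by the handle are covered.

```perl
use strict;
use warnings;

# lexicalize_handle is a hypothetical helper. It takes a bareword handle
# name and a list of lines, and returns rewritten copies of the lines.
sub lexicalize_handle {
    my ($name, @lines) = @_;
    my $declared = 0;
    for my $line (@lines) {
        # Only the first open gets "my", so the variable is declared once.
        my $decl = $declared ? '' : 'my ';
        if ($line =~ s/\bopen\s+\Q$name\E\b/open $decl\$$name/) {
            $declared = 1;
        }
        $line =~ s/<\Q$name\E>/<\$$name>/g;    # <FH> becomes <$FH>
        $line =~ s/\b(close|print|printf)\s+\Q$name\E\b/$1 \$$name/g;
    }
    return @lines;
}

print lexicalize_handle('FH',
    "open FH, '<', \$file or die;\n",
    "while (my \$line = <FH>) { }\n",
    "close FH;\n",
);
```

The \Q...\E quoting keeps the handle name from being treated as regular expression syntax, and the \b anchors stop FH from matching inside a longer name such as FHANDLE.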
Finally, print to STDOUT a list of all the scalar, array, and hash variables that have been declared,
sorted by the number of times the variable was used in the file (include this count in your list).
Keep in mind that variables can be declared by themselves: my $foo;, or in a list:
my ($foo, @bar, %baz);. Also keep in mind that to find all the times
an array was used, you have to search not only for @foo, but also $foo[,
and similarly for hashes, not only %bar but also $bar{ and
@bar{. (This also applies to replacing the badly-named variables in the
modified file.)
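The search rules above can be sketched as one pattern per kind of variable. The helper name count_uses is hypothetical, and the sketch makes simplifying assumptions: it does not skip comments or quoted strings, and it ignores forms such as ${foo} and $#foo.

```perl
use strict;
use warnings;

# count_uses is a hypothetical helper: given a sigil, a bare name, and a
# string of code, it returns how many times that variable appears.
sub count_uses {
    my ($sigil, $name, $code) = @_;
    my $pattern
        = $sigil eq '@' ? qr/(?:\@\Q$name\E\b(?!\{)|\$\Q$name\E\[)/    # @foo, $foo[
        : $sigil eq '%' ? qr/(?:%\Q$name\E\b|[\$\@]\Q$name\E\{)/       # %bar, $bar{, @bar{
        :                 qr/\$\Q$name\E\b(?![\[\{])/;                 # plain $foo only
    my $count = () = $code =~ /$pattern/g;    # count matches in list context
    return $count;
}

my $code = 'my @foo = (1, 2); print $foo[0]; my %bar; $bar{a} = 1; '
         . 'my @s = @bar{qw(a)}; my $foo = 3;';
printf "%s used %d time(s)\n", $_->[0] . $_->[1], count_uses(@$_, $code)
    for ['@', 'foo'], ['%', 'bar'], ['$', 'foo'];
```

The negative lookaheads keep the three kinds apart: $foo[0] counts toward @foo and @bar{...} counts toward %bar, so neither is counted as a use of the scalar $foo or the array @bar.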
Sample I/O
A sample badly formatted file can be found at ~lallip/perl/bad_format.pl. The modified version with variables and filehandles replaced is at ~lallip/perl/bad_format.pl.mod. I've also made available Sample Output for this file.
Grading Criteria
| Criterion | Points |
|---|---|
| Whitespace-ending warning | 5 |
| tab-space and space-tab warning | 5 |
| 80-character warning | 2.5 |
| bareword filehandle warning | 7.5 |
| indentation warning | 10 |
| empty line warning | 2.5 |
| one-letter variable warning | 7.5 |
| Add shebang, strict, warnings | 5 |
| Remove ending whitespace | 2.5 |
| Replace bad variable names | 10 |
| Replace bareword filehandles | 10 |
| Show all declared variables, sorted | 12.5 |
| No Warnings | 5 |
| Error Checking | 5 |
| Code Style | 5 |
| Output Style | 5 |
Penalties
For a description of the late and compilation penalties, see the grading criteria for Homework 1.
No Warnings
This, of course, applies to warnings generated by Perl, not to the warnings you intentionally create. Remember, when we run your submission, we will enable warnings even if you didn't.
Error Checking
As always, your program should never crash. Conditions you need to check include (but are not necessarily limited to): at least one command line argument is given; every file on the command line exists and can be opened; a new .mod file can be created in the current directory; the replacement variable name is a valid variable name; and the replacement variable name has not already been used in the code or as another replacement.
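Two of these checks can be sketched as small helpers. Both names (valid_replacement and open_for_reading) are hypothetical, and the sketch assumes that a "valid user-defined" name means a sigil followed by a letter or underscore and at least one more word character. The check that a replacement has not already been used is left out.

```perl
use strict;
use warnings;

# valid_replacement is a hypothetical helper. $old and $new both include
# the sigil, e.g. '$x' and '$count'. Returns 1 if $new is acceptable.
sub valid_replacement {
    my ($old, $new) = @_;
    my $old_sigil = substr($old, 0, 1);
    # Sigil, then a letter or underscore, then at least one more word
    # character, so the replacement is not itself a one-letter name.
    return 0 unless $new =~ /\A([\$\@%])[A-Za-z_]\w+\z/;
    return $1 eq $old_sigil ? 1 : 0;
}

# open_for_reading is a hypothetical helper: it warns and returns undef
# instead of dying, so the caller can skip to the next file.
sub open_for_reading {
    my ($path) = @_;
    open my $fh, '<', $path or do {
        warn "Cannot open '$path': $!\n";
        return;
    };
    return $fh;
}

print valid_replacement('$x', '$count') ? "ok\n" : "rejected\n";
```

The replacement name should be chomped before it is validated, since the pattern anchors on the very end of the string with \z.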
Code Style
One of the good checks for this assignment would be feeding your own code to your own
code and seeing what results you get. In general, code should be easily readable by a
human being. Meaningful variable names, consistent indentation, and explanatory
comments are the most important facets. For a full guide to well-styled Perl code, see
perldoc perlstyle
Output Style
Your prompts and variable list printed to STDOUT should be well labeled. The code you print to the .mod file should only have the modifications noted above.
Submission Instructions
Remember, we're now using solaris.remote.cs.rpi.edu exclusively. To submit,
run the script ~lallip/public/submit.pl and follow the prompts. You may submit
as many times as you like; only the last submission will be graded. Your submission is due at 11:59:59pm
Eastern time on Tuesday, February 26, 2008.
