OpSys Fall 2006 - HW2

Directories, Files and Threads

Due Date: 10/5 at 11:59PM

Submit to WebCT drop box labeled HW2

- Recursively find files in a directory

The objectives of this assignment are:

  1. Use the stat() system call to determine some of the attributes of files and directories.
  2. Learn how to process directories (opendir() and readdir()).
  3. Create threads (pthread_create()).
  4. Deal with general systems programming issues, including checking for and handling errors.

./hw2 directory substring

You are to write a program that accepts two command line parameters. Your program assumes the first parameter is a directory path, which may be absolute (like /usr/bin) or relative (like ../foo). The second parameter is a string that identifies the files you are looking for in the named directory. Your program should search through the directory for files whose name includes the string (any file name that contains the string is a match). Every time a match is found, your program should print a line containing the thread id (pthread_self()), the entire path to the file including the filename, and the file size in bytes.
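As a rough illustration of the output line described above, here is a minimal sketch that uses stat() to get the file size. The helper name report_file is made up for this example. Casting pthread_self() to unsigned long is also an assumption: pthread_t is an opaque type, and the cast only makes sense on systems where it is an integer or a pointer.

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: prints "<thread id>: <path> <size>" for one file.
 * Returns 0 on success, -1 if stat() fails. */
static int report_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1) {
        fprintf(stderr, "stat %s: %s\n", path, strerror(errno));
        return -1;
    }

    /* Assumption: pthread_t can be cast to unsigned long for printing. */
    printf("%lu: %s %lld\n",
           (unsigned long)pthread_self(), path, (long long)st.st_size);
    return 0;
}
```

The same struct stat also tells you whether an entry is a directory (S_ISDIR(st.st_mode)), which is one way to decide when a subdirectory search is needed.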

If the initial directory holds any subdirectories, your program must search these as well. Each time your program finds a subdirectory, it must create a new thread, and that new thread should search the subdirectory.

Below is a sample session. The line beginning with > is the command typed by a human user; the lines that follow are the output of the program. This command searches in the directory /usr/share for all files whose name contains the substring "ifi".

> ./hw2 /usr/share ifi

134560768: /usr/share/openssl/man/cat3/SSL_CTX_use_certificate.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_get_peer_certificate.3.gz 810
134560768: /usr/share/openssl/man/cat3/SSL_CTX_use_certificate_ASN1.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_CTX_use_certificate_file.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_use_certificate.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_use_certificate_ASN1.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_use_certificate_file.3.gz 2519
134560768: /usr/share/openssl/man/cat3/SSL_CTX_use_certificate_chain_file.3.gz 2519
134559232: /usr/share/openssl/man/man3/SSL_CTX_use_certificate.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_get_peer_certificate.3.gz 2357
134559232: /usr/share/openssl/man/man3/SSL_CTX_use_certificate_ASN1.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_CTX_use_certificate_file.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_use_certificate.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_use_certificate_ASN1.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_use_certificate_file.3.gz 3943
134559232: /usr/share/openssl/man/man3/SSL_CTX_use_certificate_chain_file.3.gz 3943
134619136: /usr/share/sendmail/cf/feature/accept_unqualified_senders.m4 417
134560768: /usr/share/man/man3/significand.3.gz 1570
134560768: /usr/share/man/man3/significandf.3.gz 1570
134560768: /usr/share/man/man3/acl_get_qualifier.3.gz 1766
134560768: /usr/share/man/man3/acl_set_qualifier.3.gz 1392
134560768: /usr/share/man/man3/pthread_getspecific.3.gz 1432
134560768: /usr/share/man/man3/pthread_setspecific.3.gz 1664
134560256: /usr/share/man/man9/pmap_is_modified.9.gz 1220
134560256: /usr/share/man/man9/pmap_ts_modified.9.gz 1220
134560256: /usr/share/man/man9/uifind.9.gz 1468

Your threads should be joinable, and each time a new thread is created, the thread that called pthread_create must wait for the new thread to finish (by calling pthread_join). In other words, you should not search subdirectories in parallel. If you try to do this with a large search (a large directory hierarchy), your program will quickly run out of resources (having a bunch of open directories will fill up the file descriptor table, etc.). Although there is no real benefit to using threads like this (only one thread at a time is actually doing anything), it's a good beginning exercise in threads programming. If you want to try parallel threads, you will need to limit the number of threads running at any one time (see the extra credit section!).
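The create-then-wait pattern described above might look roughly like the sketch below. The names search_dir and search_subdir_and_wait are invented for this example, and the thread function is only a placeholder that prints the path it was given; a real one would do the directory search.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Placeholder thread entry point (hypothetical). A real version would
 * open and search the directory named by arg. */
static void *search_dir(void *arg)
{
    const char *path = arg;

    printf("%lu: searching %s\n", (unsigned long)pthread_self(), path);
    return NULL;
}

/* Start a joinable thread for one subdirectory and wait for it to finish.
 * Returns 0 on success, or the error number from the pthread call. */
static int search_subdir_and_wait(const char *path)
{
    pthread_t tid;
    int rc;

    /* pthread functions return the error number instead of setting errno. */
    rc = pthread_create(&tid, NULL, search_dir, (void *)path);
    if (rc != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        return rc;
    }

    rc = pthread_join(tid, NULL);
    if (rc != 0)
        fprintf(stderr, "pthread_join: %s\n", strerror(rc));
    return rc;
}
```

Because the caller blocks in pthread_join until the new thread is done, the path string can live on the caller's stack for the whole lifetime of the new thread.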

Note: In C, to determine whether one string contains another, use the strstr() function (man strstr for the details). You should only print files whose name matches the string on the command line exactly (case sensitive).
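A minimal sketch of the strstr() test, assuming a helper named name_matches (the name is just for illustration):

```c
#include <string.h>

/* Returns 1 if name contains pattern (case sensitive), 0 otherwise.
 * strstr() returns NULL when the substring is not found. */
static int name_matches(const char *name, const char *pattern)
{
    return strstr(name, pattern) != NULL;
}
```

With this helper, name_matches("significand.3.gz", "ifi") is 1, while name_matches("IFI.txt", "ifi") is 0 because the comparison is case sensitive.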

- Reading Directories

Your program must be able to determine the contents of a directory. You can open a directory with the opendir function and read the names of the items in it with the readdir function. Both of these have man pages ("man opendir", "man readdir").
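Here is a small sketch of an opendir/readdir loop with error checking. The function name list_directory is hypothetical, and it only prints entry names; it does not search subdirectories.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every entry name in dirpath except "." and "..".
 * Returns the number of entries printed, or -1 on error. */
static int list_directory(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    struct dirent *entry;
    int count = 0;

    if (dir == NULL) {
        fprintf(stderr, "opendir %s: %s\n", dirpath, strerror(errno));
        return -1;
    }

    for (;;) {
        /* readdir() returns NULL both at end of directory and on error,
         * so clear errno first to tell the two cases apart. */
        errno = 0;
        entry = readdir(dir);
        if (entry == NULL)
            break;
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        printf("%s\n", entry->d_name);
        count++;
    }
    if (errno != 0) {
        fprintf(stderr, "readdir %s: %s\n", dirpath, strerror(errno));
        count = -1;
    }

    if (closedir(dir) == -1) {
        fprintf(stderr, "closedir %s: %s\n", dirpath, strerror(errno));
        count = -1;
    }
    return count;
}
```

Skipping "." and ".." matters once you start descending into subdirectories, since following either one would send the search in circles.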

NOTE: Since your program will need to open many (nested) directories, you will need to do one of the following:

  • Call chdir() to change the current working directory before calling opendir.
  • Build pathnames (absolute or relative) to give to opendir.
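If you go with building pathnames, a sketch along these lines may help. The helper join_path is a made-up name, and the caller is assumed to supply the output buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "dir/name" into out (at most outsz bytes).
 * Returns 0 on success, -1 if the result does not fit. */
static int join_path(char *out, size_t outsz, const char *dir, const char *name)
{
    size_t len = strlen(dir);
    /* Do not add a second '/' if dir already ends with one. */
    const char *sep = (len > 0 && dir[len - 1] == '/') ? "" : "/";
    int n = snprintf(out, outsz, "%s%s%s", dir, sep, name);

    if (n < 0 || (size_t)n >= outsz)
        return -1;   /* output error or truncated path */
    return 0;
}
```

For example, joining "/usr/share" and "man" gives "/usr/share/man", and joining "../foo/" and "x" gives "../foo/x". Checking for truncation is important: a silently shortened path would name the wrong file.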

- Using the POSIX Threads Library

Your program must use the POSIX threads library (pthreads) on the CS BSD machines (freebsd.remote.cs.rpi.edu, or any of monica, ashley or mary-kate.cs.rpi.edu). To link your program with the pthreads library on the CS BSD machines you need to add -pthread to your compile command. Your compilation should look something like this:

gcc -o hw2 -Wall hw2.c -pthread

Below is a sample Makefile that will compile a file named "hw2.c" and create an executable named "hw2", including the threads library:

all: hw2

# On the CS BSD machines keep the next line as-is
THREAD=-pthread

# use this if you use good ol' gcc
CC = gcc
# uncomment this instead if you want the c99 version
# CC = gcc3

hw2: hw2.c
        ${CC} -o hw2 -Wall hw2.c ${THREAD}

Remember that if you copy/paste from the above, it won't include the tab character needed in the rule for building hw2, and it won't work!

The Makefile above will work as-is on the BSD machines. If you are developing on another OS, you may need to change the definition of the make variable THREAD.

- Project Requirements

The following are the requirements for the project:

Grading: Grades will be based on the formula below. Note that to get full credit we must be able to understand your code (it must be commented!)

20%  Proper output
20%  Can handle any kind of path (absolute or relative)
30%  Uses threads correctly
20%  Includes code that can handle all possible errors returned by system calls
10%  Code quality (comments, organization, how hard is it to understand?)

You can get partial credit for any part (for example if you don't get all the commands working properly).

If your code does not compile and run under FreeBSD on the CS machines, you will lose at least 50% (the remaining 50% partial credit will be awarded based on visual inspection of the code).

- How to Submit

Log in to WebCT at webct.rpi.edu using your RCS id and password. Once you get to MyWebCT click on "Operating Systems", and from there go to the homework drop boxes. Submit your files (individually, zipped or tarred) to the drop box labeled HW2.

- Extra Credit

You can get extra credit (up to 10 points) for searching directories in parallel. In this case you should make all your threads run as detached, and you don't call pthread_join(). Each time you find a subdirectory, you create a new thread to handle that subdirectory and move on to the next entry in the current directory.

You won't get the extra credit unless your solution actually works! You need to consider that each thread will be using an open file descriptor (for the directory it is processing) and you can easily run out of these if you don't limit the number of threads that can be running at the same time. You also need to worry about the current working directory, as this is an attribute of the process (all threads share the same working directory, so using chdir() won't work).
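One possible way to cap the number of running threads is a counter guarded by a mutex, sketched below. This is only an illustration under assumptions, not the required design: MAX_ACTIVE, try_acquire_slot, release_slot and spawn_detached are all invented names, and the limit of 8 is arbitrary. The sketch never blocks waiting for a slot; when none is free, the caller is expected to search the subdirectory itself, which avoids a deadlock where every thread is waiting for some other thread to finish.

```c
#include <pthread.h>

#define MAX_ACTIVE 8   /* assumed limit; pick one that fits your system */

static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static int active_threads = 0;   /* protected by slot_lock */

/* Claim a slot without blocking. Returns 1 on success, 0 if the limit is reached. */
static int try_acquire_slot(void)
{
    int ok = 0;

    pthread_mutex_lock(&slot_lock);
    if (active_threads < MAX_ACTIVE) {
        active_threads++;
        ok = 1;
    }
    pthread_mutex_unlock(&slot_lock);
    return ok;
}

/* Give the slot back. A worker thread calls this just before it returns. */
static void release_slot(void)
{
    pthread_mutex_lock(&slot_lock);
    active_threads--;
    pthread_mutex_unlock(&slot_lock);
}

/* Start fn(arg) in a detached thread if a slot is free.
 * Returns 0 if the thread was started, -1 if the caller should do the
 * work itself. arg must stay valid after the caller moves on (for
 * example, memory from malloc() that fn frees). fn must call release_slot(). */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    if (!try_acquire_slot())
        return -1;

    rc = pthread_attr_init(&attr);
    if (rc == 0) {
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if (rc == 0)
            rc = pthread_create(&tid, &attr, fn, arg);
        pthread_attr_destroy(&attr);
    }
    if (rc != 0) {
        release_slot();   /* the thread never started, so free its slot */
        return -1;
    }
    return 0;
}
```

Keep in mind that when main() returns the whole process exits, so the initial thread still has to wait until all detached threads are done (or call pthread_exit()) before the program ends.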

If you use shared variables among threads, you need to use a mutex! Even if you can't produce a problem when testing, you must treat access to shared memory as a critical section and make sure no two threads could be in their critical sections at the same time!
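As a small example of a critical section, the sketch below protects a shared counter with a mutex. The variable match_count and the functions record_match, get_match_count and record_many are hypothetical; the assignment does not require a counter like this.

```c
#include <pthread.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long match_count = 0;   /* shared among threads, protected by count_lock */

/* Increment the shared counter inside a critical section. */
static void record_match(void)
{
    pthread_mutex_lock(&count_lock);
    match_count++;
    pthread_mutex_unlock(&count_lock);
}

/* Read the shared counter; reads take the lock too. */
static long get_match_count(void)
{
    long n;

    pthread_mutex_lock(&count_lock);
    n = match_count;
    pthread_mutex_unlock(&count_lock);
    return n;
}

/* Demo thread function: records *arg matches, one at a time. */
static void *record_many(void *arg)
{
    long i, n = *(long *)arg;

    for (i = 0; i < n; i++)
        record_match();
    return NULL;
}
```

Without the lock, match_count++ is a read-modify-write sequence that two threads can interleave, so some increments could be lost even if testing never shows it.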

- Resources