Computer programming - Lab 10

1. Write a simple version of the Unix strings program that prints every sequence of at least 4 consecutive printable characters other than space (as checked by isgraph()) found in a binary file.
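
One possible starting point for the scanning loop is sketched below; it is not the required solution. The names print_strings and MIN_RUN are made up for this illustration, and printing one sequence per line is an assumption about the output format. The sketch holds back the first characters of a run until the run is known to be long enough.

```c
#include <ctype.h>
#include <stdio.h>

#define MIN_RUN 4   /* minimum run length, taken from the exercise text */

/* Copy every run of at least MIN_RUN isgraph() characters from `in`
   to `out`, one run per line. Returns the number of runs printed.
   (Hypothetical helper; `in` should be opened in binary mode.) */
int print_strings(FILE *in, FILE *out)
{
    char pending[MIN_RUN];  /* start of a run that has not qualified yet */
    int len = 0;            /* current run length, capped at MIN_RUN */
    int count = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (isgraph(c)) {
            if (len < MIN_RUN) {
                pending[len++] = (char)c;
                if (len == MIN_RUN)          /* run just became long enough */
                    fwrite(pending, 1, MIN_RUN, out);
            } else {
                putc(c, out);                /* run already being printed */
            }
        } else {
            if (len == MIN_RUN) {            /* a printed run ends here */
                putc('\n', out);
                count++;
            }
            len = 0;
        }
    }
    if (len == MIN_RUN) {                    /* run that reaches end of file */
        putc('\n', out);
        count++;
    }
    return count;
}
```

A main function would open the file named on the command line with fopen(name, "rb") and call print_strings(f, stdout).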

2. Write a simple version of the Unix grep program that prints all lines from a file (second command-line argument) that contain a given string (first command-line argument). Keep the common case simple, but try to avoid fixed size limits (for example, on line length).
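
One way to avoid a fixed line-length limit is to read with fgets into a buffer that doubles whenever a line does not fit. The sketch below assumes text input without embedded NUL bytes; read_line and grep_stream are hypothetical names, and the initial capacity of 128 is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from f. Returns a malloc'ed string
   (with the trailing '\n' if the line has one), or NULL at end of
   file or on allocation failure. The caller frees the result. */
char *read_line(FILE *f)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while (fgets(buf + len, (int)(cap - len), f) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;                     /* complete line */
        /* No newline yet: grow the buffer and keep reading. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }
    if (len > 0)
        return buf;                         /* last line without '\n' */
    free(buf);
    return NULL;
}

/* Print to `out` every line of `in` that contains `pattern`.
   Returns the number of matching lines. */
int grep_stream(const char *pattern, FILE *in, FILE *out)
{
    int matches = 0;
    char *line;

    while ((line = read_line(in)) != NULL) {
        if (strstr(line, pattern) != NULL) {
            fputs(line, out);
            matches++;
        }
        free(line);
    }
    return matches;
}
```

Short lines take the simple path (one fgets call, no reallocation); only long lines pay for growing the buffer.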

3. Write a simple version of the Unix cut program that accepts a command-line argument of the form M-N, with M and N nonnegative integers, and prints fields M through N from each line of a file (also named on the command line). Treat the comma as the separator between fields.
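
The argument parsing and the per-line field selection could be sketched as follows. Several choices here are assumptions, not part of the exercise: fields are numbered from 1 as in Unix cut, a range with M greater than N is rejected, and the helper names parse_range and cut_line are invented for the example.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse "M-N" into *m and *n. Returns 1 on success, 0 if the argument
   is malformed or M > N (rejecting M > N is an assumption). */
int parse_range(const char *arg, unsigned long *m, unsigned long *n)
{
    char *end;

    if (!isdigit((unsigned char)arg[0]))
        return 0;
    *m = strtoul(arg, &end, 10);
    if (*end != '-' || !isdigit((unsigned char)end[1]))
        return 0;
    *n = strtoul(end + 1, &end, 10);
    return *end == '\0' && *m <= *n;
}

/* Print fields m..n of one comma-separated line, followed by '\n'.
   Fields are counted from 1 (an assumption, following Unix cut). */
void cut_line(const char *line, unsigned long m, unsigned long n, FILE *out)
{
    unsigned long field = 1;

    for (const char *p = line; *p != '\0' && *p != '\n'; p++) {
        if (*p == ',') {
            field++;
            /* Keep the comma only between two selected fields. */
            if (field > m && field <= n)
                putc(',', out);
        } else if (field >= m && field <= n) {
            putc(*p, out);
        }
    }
    putc('\n', out);
}
```

With the range 2-4, the line a,b,c,d,e yields b,c,d, and a line with fewer than two fields yields an empty line.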

4. Write a simple version of the Unix split program that splits a file named on the command line into equally sized chunks (with the chunk size also given on the command line). The last chunk should contain the remainder. Output files are named xaa, xab, ..., xba, xbb, ... . Report an error if the available names run out.
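
The naming scheme can be derived from the chunk index, which also makes the "out of names" check straightforward. This is only a sketch of that one step: chunk_name and MAX_CHUNKS are hypothetical names, and two-letter suffixes (xaa through xzz, 676 names) are assumed from the examples in the text.

```c
#define MAX_CHUNKS (26 * 26)   /* xaa .. xzz, assuming two-letter suffixes */

/* Write the name of chunk number `index` (0-based) into `name`.
   Returns 1 on success, 0 if the names have run out. */
int chunk_name(unsigned index, char name[4])
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";

    if (index >= MAX_CHUNKS)
        return 0;
    name[0] = 'x';
    name[1] = letters[index / 26];   /* first suffix letter changes slowly */
    name[2] = letters[index % 26];   /* second suffix letter changes fastest */
    name[3] = '\0';
    return 1;
}
```

The main loop would call chunk_name before opening each output file and report an error when it returns 0 while input data still remains.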

5. Write a program that receives on the command line an argument of the form -Dstr1=str2 and a file name, and creates a processed version of the file in which each occurrence of the string str1 is replaced by str2. (The C preprocessor, cpp, accepts definitions in this form in addition to the ones specified with #define.) The name of the output file is the name of the input file with .pp appended.
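
A minimal sketch of the two core steps, splitting the -D argument and replacing occurrences in a piece of text, might look like this. The names parse_define and replace_all are invented here. The sketch replaces plain substrings, as the exercise asks, and assumes it is applied to one line at a time, so matches that span a line break are not found.

```c
#include <stdio.h>
#include <string.h>

/* Split "-Dstr1=str2" in place: on success *from points at str1 and
   *to at str2. Returns 0 if the form is wrong or str1 is empty.
   Note that this overwrites the '=' in arg with a terminator. */
int parse_define(char *arg, char **from, char **to)
{
    if (strncmp(arg, "-D", 2) != 0)
        return 0;
    char *eq = strchr(arg + 2, '=');
    if (eq == NULL || eq == arg + 2)
        return 0;
    *eq = '\0';
    *from = arg + 2;
    *to = eq + 1;
    return 1;
}

/* Write `text` to `out` with every occurrence of `from` replaced by
   `to`. Returns the number of replacements made. */
int replace_all(const char *text, const char *from, const char *to, FILE *out)
{
    size_t from_len = strlen(from);
    const char *hit;
    int count = 0;

    if (from_len == 0) {            /* nothing to search for */
        fputs(text, out);
        return 0;
    }
    while ((hit = strstr(text, from)) != NULL) {
        fwrite(text, 1, (size_t)(hit - text), out);  /* text before the match */
        fputs(to, out);
        text = hit + from_len;                       /* continue after the match */
        count++;
    }
    fputs(text, out);               /* remainder after the last match */
    return count;
}
```

Searching resumes after each match, so text inserted by a replacement is never scanned again.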

6. A BMP file contains in its header (54 bytes) several pieces of information about the image and file size, all as 4-byte little-endian integers:

Each line of pixels in the image occupies space rounded up to a multiple of 4 bytes.
Verify that all these data are consistent, and report any inconsistencies. You can find out the actual file size by seeking to its end, and then getting the current position.
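
The checks could be sketched as below. The field offsets used here are assumptions taken from the common BMP header layout rather than from this text (file size at byte 2, start of pixel data at byte 10, width at 18, height at 22, image data size at 34), and the sketch further assumes an uncompressed image with 3 bytes per pixel and a positive height. All function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed field offsets (common BMP layout, not stated in the exercise). */
enum { OFF_FILE_SIZE = 2, OFF_DATA_START = 10, OFF_WIDTH = 18,
       OFF_HEIGHT = 22, OFF_IMAGE_SIZE = 34 };

/* Decode a 4-byte little-endian integer: least significant byte first. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Bytes occupied by one row of pixels, rounded up to a multiple of 4. */
uint64_t padded_row_size(uint64_t width, uint64_t bytes_per_pixel)
{
    return (width * bytes_per_pixel + 3) / 4 * 4;
}

/* Actual file size: seek to the end and ask for the position.
   Returns -1 on error. */
long actual_file_size(FILE *f)
{
    if (fseek(f, 0, SEEK_END) != 0)
        return -1;
    return ftell(f);
}

/* Compare the header fields with each other and with the actual size.
   Returns the number of inconsistencies found (assumes 3 bytes/pixel). */
int check_bmp(const unsigned char hdr[54], long actual_size, FILE *report)
{
    uint32_t size_field = read_le32(hdr + OFF_FILE_SIZE);
    uint32_t data_start = read_le32(hdr + OFF_DATA_START);
    uint32_t width      = read_le32(hdr + OFF_WIDTH);
    uint32_t height     = read_le32(hdr + OFF_HEIGHT);
    uint32_t image_size = read_le32(hdr + OFF_IMAGE_SIZE);
    int problems = 0;

    if (actual_size < 0 || (unsigned long)actual_size != size_field) {
        fprintf(report, "file size field %lu differs from actual size %ld\n",
                (unsigned long)size_field, actual_size);
        problems++;
    }
    if (padded_row_size(width, 3) * height != image_size) {
        fprintf(report, "image size %lu does not match %lu x %lu pixels\n",
                (unsigned long)image_size, (unsigned long)width,
                (unsigned long)height);
        problems++;
    }
    if ((uint64_t)data_start + image_size != size_field) {
        fprintf(report, "data start + image size differs from file size field\n");
        problems++;
    }
    return problems;
}
```

For example, a 2 x 2 image with 3 bytes per pixel has rows of 6 bytes padded to 8, so 16 bytes of pixel data and a 70-byte file when the data starts at byte 54.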

7. Write a program that processes a .jpg file named on the command line and identifies its parts. A JPEG file starts with the bytes 0xFF 0xD8. These are followed by a variable number of segments. Each segment starts with a two-byte marker: 0xFF, then a second byte giving the segment type. Next comes the segment length, a two-byte integer stored in big-endian format. The length includes the two bytes for the length itself, but not the two bytes for the marker.
Your program should print out for each segment the byte for the segment type (in hex) and the segment length. Stop at either the end of the image (marker 0xD9) or the start of the image stream (marker 0xDA).
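
The segment walk described above could be sketched as follows. The function name list_segments and the output format are illustrative choices, and the sketch assumes markers follow each other directly, with no padding bytes between segments. It stops without printing anything for the 0xD9 or 0xDA marker itself.

```c
#include <stdio.h>

/* Print the type and length of each segment of the JPEG stream `f`
   (opened in binary mode) to `out`. Returns the number of segments
   listed, or -1 if the data does not have the expected structure. */
int list_segments(FILE *f, FILE *out)
{
    int count = 0;

    /* The file must start with the bytes 0xFF 0xD8. */
    if (getc(f) != 0xFF || getc(f) != 0xD8)
        return -1;

    for (;;) {
        int ff = getc(f);
        int type = getc(f);
        if (ff != 0xFF || type == EOF)
            return -1;
        if (type == 0xD9 || type == 0xDA)   /* end of image / start of stream */
            return count;

        int hi = getc(f);
        int lo = getc(f);
        if (hi == EOF || lo == EOF)
            return -1;
        long len = ((long)hi << 8) | lo;    /* big-endian: high byte first */
        if (len < 2)                        /* length counts its own 2 bytes */
            return -1;

        fprintf(out, "segment 0x%02X, length %ld\n", (unsigned)type, len);
        count++;

        /* Skip the payload: the length bytes were already consumed. */
        if (fseek(f, len - 2, SEEK_CUR) != 0)
            return -1;
    }
}
```

Because the length excludes the marker but includes its own two bytes, the next marker is found len - 2 bytes after the length field.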

Marius Minea
Last modified: Wed Nov 27 9:10:00 EET 2013