Computer programming - Lab 6

If any questions remain about these problems, please ask during office hours.

Arrays and bits

Set encoding

Any set can be encoded by its characteristic function: f(x) = 1 if x is in the set, f(x) = 0 otherwise. Thus, having fixed a universe of N elements, any subset of this set can be represented by N bits (bit k tells whether element a_k is in the set).

1. Letters appearing in a string Write a function which, given a string (char array terminated by '\0'), returns the set of lowercase letters that appear in the string. The set is represented as a 32-bit integer value (enough for the 26 lower-case letters of the alphabet). In main, use the returned value to print in alphabetical order all lowercase letters appearing in the string.

2. Filtering input Write a function that takes as parameter a 32-bit integer representing a set of lowercase characters (one bit per letter, bit 0 for 'a'), and filters text read from input until EOF, printing only those lowercase letters that are not in the given set, and any characters which are not lowercase letters. Use the function to filter out any lowercase vowels from input.

3. Character classification Implement your own version for some character classification functions from ctype.h: isalpha, isalnum, isdigit, islower, isupper, isxdigit. Use a global array with one byte per character, and one bit per property. Your functions should work for any unsigned char value, and for EOF (-1), which is not a character, thus not in any class.
Initialize the array statically or assign it in main, before calling any functions.
Optionally, write a program that after initializing the array, prints its C definition: char ctab[...] = { octal/hex, octal/hex, ... };. Redirect program output to somefile.c which you can then compile with your functions.

4. Rationals in base 2 Given two natural numbers m < n ≤ 64, compute and print the digits in the binary representation of fraction m / n . If the fraction is periodic, print out the period in parantheses, e.g. 1/10 = 0.0(0011) . Starting from the code written in class, use bit operations on a 64-bit int both for tracking remainders and storing the quotient.

String and number encodings

5. URL encoding (Percent-encoding). In URLs, several characters including &, =, $, ?, etc. have special meaning. If a URL contains these as normal characters, they must be represented (encoded) differently. The percent encoding does this with three characters: a percent % and the two hexadecimal digits of the character's ASCII code. For example, the dot '.' character has code 0x2E, thus is encoded as %2E (or %2e, case is ignored). The percent character itself (ASCII code 0x25) is encoded as %25. For simplicity, we will assume that all non-alphanumeric characters must be percent-encoded.
a) Encode. Write a function that takes a null-terminated string, an array of bytes (characters) and its length and percent-encodes as much of the string as fits, ending it with '\0'. Use bitwise operators to extract the hexadecimal digits of a character that must be encoded.
b) Decode. Write a function that takes a null-terminated string, an array of characters and its length and percent-decodes as much of the text as fits. If a percent character is not followed by two hexadecimal digits, stop decoding. Always end the decoded string with '\0'. Use bitwise operators to assemble a character that was encoded.

6. 12-bit ints. An array of bytes (unsigned char) stores unsigned integers, each represented using 12 bits (two integers fit in three bytes). Bits are used least significant first, thus the low-order nibble of the second char in a triple forms the high-order bit of the first number, and the high-order nibble forms the low-order four bits of the second number. (A nibble is 4 bits from a byte).
Write a function that takes an array and its byte length and prints all numbers in hexadecimal.
(This encoding was used for the FAT-12 file allocation table in DOS).
For example, the byte array { '\x21', '\x43', '\x65', '\x87', '\xA9', '\xCB' } would represent the numbers 0x321, 0x654, 0x987, 0xCBA.

7. Base64 encoding. Not all 8-bit characters are printable. Base64 defines an encoding of any character sequence using 64 printable characters: uppercase letters, lowercase letters, digits (for the first 62), and two other characters: + and / (considering the particular case of MIME encoding).
Any group of 3 bytes (3 * 8 bits) is encoded using 4 printable characters (4 * 6 bits). The most significant bits available in those 4 printable characters are used first. If the bytecount to be encoded is not divisible by 3, unused bits in the last incomplete encoding character are set to zero. Any completely unused encoding character (there could be one or two in the last group of 4) is set to '='.
a) Encode. Write a function that takes a null-terminated string, an array of bytes and its length and base64-encodes as much of the string as fits, ending it with '\0'.
b) Decode. Write a function that takes a null-terminated base64 text, an array of characters and its length and decodes as much of the text as fits, ending it with '\0'.

Arrays

8. 3n+1 conjecture The Collatz conjecture says that for any positive n, the following sequence of operations will eventually reach 1: if n is even, divide it by 2; else, if n is odd, compute n = 3n + 1.
a) Tabulate (i.e, compute in a table) for n = 1 to 1 million the number of steps it takes to reach 1 (the "stopping time") starting from n.
Avoid repeating computations: once you find the stopping time for k, store it, and use it from the table if reaching k again in some other computation. (Use recursion to store the stopping time just before returning the value).
Careful to only store values only for numbers smaller than the table bound, as the sequence may reach values larger than the initial n.
b) Compute and print from the tabulated values the sequence of numbers whose stopping time is larger than that of any smaller number, i.e.: 1, 2, 3, 6, 7, 9, ...

9. Bignum addition. Write a function that adds two arbitrarily large unsiged integers given as null-terminated strings (made of decimal digits). Place the result in the same form, in a character array also given as parameter, together with its size. If the result array is too short, produce the empty string.

10. Bignum multiplication. Implement multiplication of arbitrarily large natural numbers, stored as null-terminated strings (arrays of decimal digits). Write a function that takes as parameters two such numbers, computes and prints their product.

Marius Minea

Last modified: Thu Nov 2 18:50:00 EET 2017