Computer programming - Lab 6

If any questions remain about these problems, please ask during office hours. Arrays and bits

1. Set encoding. Write a function, which given a string (char array terminated by '\0') returns the set of lowercase letters that appear in the string. The set is represented as a 32-bit integer mask (enough to assign one bit position to each lowercase letter in the alphabet) which has a 1 if that letter is part of the set.
Use the created bit mask to filter text read from standard input until EOF, printing only those characters which do not appear in the initial string.

2. Base64 encoding. Not all 8-bit characters are printable. Base64 defines an encoding of any character sequence using 64 printable characters: uppercase letters, lowercase letters, digits (for the first 62), and two other characters: + and / (considering the particular case of MIME encoding).
Any group of 3 bytes (3 * 8 bits) is encoded using 4 printable characters (4 * 6 bits). The most significant bits available in those 4 printable characters are used first. If the bytecount to be encoded is not divisible by 3, unused bits in the last incomplete encoding character are set to zero. Any completely unused encoding character (there could be one or two in the last group of 4) is set to '='.
a) Encode. Write a function that takes a null-terminated string, an array of bytes and its length and base64-encodes as much of the string as fits, ending it with '\0'.
b) Decode. Write a function that takes a null-terminated base64 text, an array of characters and its length and decodes as much of the text as fits, ending it with '\0'.

3. URL encoding (Percent-encoding). In URLs, several characters including &, =, $, ?, etc. have special meaning. If a URL contains these as normal characters, they must be represented (encoded) differently. The percent encoding does this with three characters: a percent % and the two hexadecimal digits of the character's ASCII code. For example, the dot '.' character has code 0x2E, thus is encoded as %2E (or %2e, case is ignored). The percent character itself (ASCII code 0x25) is encoded as %25. For simplicity, we will assume that all non-alphanumeric characters must be percent-encoded.
a) Encode. Write a function that takes a null-terminated string, an array of bytes and its length and percent-encodes as much of the string as fits, ending it with '\0'. Use bitwise operators to extract the hexadecimal digits of a character that must be encoded.
b) Decode. Write a function that takes a null-terminated string, an array of characters and its length and percent-decodes as much of the text as fits. If a percent character is not followed by two hexadecimal digits, stop decoding. Always end the decoded string with '\0'. Use bitwise operators to assemble a character that was encoded.

Arrays

4. 3n+1 conjecture The Collatz conjecture says that for any positive n, the following sequence of operations will eventually reach 1: if n is even, divide it by 2; else, if n is odd, compute n = 3n + 1.
a) Tabulate (i.e, compute in a table) for n = 1 to 1 million the number of steps it takes to reach 1 (the "stopping time") starting from n.
Avoid repeating computations: once you find the stopping time for k, store it, and use it from the table if reaching k again in some other computation. (Use recursion to store the stopping time just before returning the value).
Careful to only store values only for numbers smaller than the table bound, as the sequence may reach values larger than the initial n.
b) Compute and print from the tabulated values the sequence of numbers whose stopping time is larger than that of any smaller number, i.e.: 1, 2, 3, 6, 7, 9, ...

5. Bignum addition. Write a function that adds two arbitrarily large unsiged integers given as null-terminated strings (made of decimal digits). Place the result in the same form, in a character array also given as parameter, together with its size. If the result array is too short, produce the empty string.

6. Rationals in base 2 Given two natural numbers m < n, compute and print the digits in the binary representation of fraction m / n . If the fraction is periodic, print out the period in parantheses, e.g. 1/10 = 0.0(0011) .
Hint: use an array to track remainders you have already seen, as shown in the course slides for base 10, and another array for the quotient digits.
Extra credit: if n ≤ 64, use bit operations on a uint64_t both for tracking remainders and constructing the result.

7. Hex-encoding a string Write a function that takes as parameter a null-terminated string (assumed non-constant) and changes it in-place in the following way: any two consecutive hexadecimal digits are replaced with the character that has the corresponding ASCII code; other characters are not changed. For example, "$75 and 41 cents" becomes "$u and A Înts", since 0x75 is 'u', 0x41 is 'A' and 0xce may be displayed as Î (depending on the character mapping for codes > 127).


The following problems were not assigned during the lab. Use them as optional exercises if you have solved (most of) the above and would like to do more.

8. 12-bit ints. An array of bytes (unsigned char) stores unsigned integers, each represented using 12 bits (two integers fit in three bytes). Bits are used least significant first, thus the low-order nibble of the second char in a triple forms the high-order bit of the first number, and the high-order nibble forms the low-order four bits of the second number. (A nibble is 4 bits from a byte).
Write a function that takes an array and its byte length and prints all numbers in hexadecimal.
(This encoding was used for the FAT-12 file allocation table in DOS).

9. Character compression. A text is represented using only four bits (nibbles) for frequent characters, as follows: 4-bit values from 1 to 15 represent characters from the following string: char key[] = " etaoinsrhldcf\n";, a nibble with value zero is followed by two nibbles representing an 8-bit character not in that string. Nibbles are stored least significant first. The end of the text is marked by two null bytes '\0'.
a) Write a function which takes as parameter a char array representing an encoded (compressed) text and prints out the decompressed version
b) Write a function which takes as parameter a string (character array terminated by '\0') and prints out the compressed text

10. Bignum multiplication. Implement multiplication of arbitrarily large natural numbers, stored as null-terminated strings (arrays of decimal digits). Write a function that takes as parameters two such numbers, computes and prints their product.


Marius Minea
Last modified: Thu Nov 3 18:13:00 EET 2016