Write a checker for a subset of Java-like bytecode.
The checker should work with the following instruction set:
iadd | fadd | imul | fmul
| ldc idx | iload idx | fload idx
| istore idx | fstore idx
| ireturn | freturn | return | goto instr | ifeq instr
.
Here, idx is an index into the constant pool
(for ldc) or the array of locals (otherwise),
and instr is an integer designating an instruction
(assumed to be numbered from zero).
The program will take an instruction stream for a function and check type and memory safety. The sizes of the constant pool, the locals array and the maximum stack size are given.
You can choose any input representation (including text, program terms, or work with a bytecode analysis infrastructure).
Hints
To write a dataflow analysis, you first have to build a control flow graph (CFG) from the code. From every node (instruction), you will need access to its predecessors and successors.
The lattice of type values is formed of int, float, ⊤ (unknown type, anything) and ⊥ (impossible). Meeting paths in the CFG produces the meet (⊓) in the type lattice, e.g. int ⊓ float = ⊥
For each instruction s, you will need to compute and store a dataflow fact at its input, in[s] and one at its output, out[s]. In this problem, dataflow facts consist of: an array of types for each local, and a list of types on the stack (which also provides the stack height). Initially, all dataflow facts are ⊤ (unknown), except for the input of the first instruction, in[init], which has an empty stack.
The worklist algorithm applies the transfer function fs for each statement s, propagating the changes to all successor statements, and inserting into the worklist the statements for which the input has changed. The algorithm reaches a fixpoint because the transfer functions are monotone and the lattices are of finite height.
worklist = { init } while worklist ≠ ∅ do s = extract(worklist) out[s] = fs(in[s]) forall s' in succ(s) do newin = ⊓p in pred(s') out[p] if newin ≠ in[s'] then in[s'] = newin insert(s', worklist)Extra credit Assuming that the array of locals contains parameters in its first part (of known size), and local variables in the remainder, check if the code uses any uninitialized variables.