Diploma Project Proposals 2016-2017

We are looking for 4-5 students interested in working on projects in the following domain:

Tools for supporting program comprehension of large software systems at the architectural level

Domain description and motivation

The main problem of software development is the increasing size and complexity of software systems. Software Engineering emerged as a discipline in order to cope with this problem. It began by defining software process models, identifying structured sets of activities (requirements engineering, design, testing, etc.) and developing guidelines for how these should be carried out.

But abstract guidelines are not enough to support the software process; tools have been developed to support all of its phases. Tools such as integrated development environments (IDEs), modeling tools, and collaboration and versioning tools have long been indispensable for software engineers.

Current research is directed toward developing more intelligent tools, such as recommendation systems for software engineering. These are tools that help software engineers master the large amount of code, models, documentation, etc. that they must navigate, and that assist them with a wide range of their activities. In this project, we focus on supporting some activities related to maintenance.

Software engineers can maintain complex software systems only after they understand the existing code well. Program comprehension (the "scientific" term for the activity of understanding existing code) is supported by documentation - either developer documentation, if it is available and up-to-date, or reverse-engineered documentation.

Reverse engineering a large software system often produces a huge amount of information, whose comprehension or further processing would take a long time. Let us imagine that a class diagram has been reverse engineered from a system with hundreds or even thousands of classes. Such a class diagram is of little use when trying to understand the system in the absence of any documentation. Even when documentation is available, it may be too detailed and scattered - such as the documentation generated by javadoc from all the classes and packages of the system. What is most often missing is a short document providing the new user with useful information to start with - an "executive summary".

This is the goal of the proposed projects: to develop tools which (automatically) produce "summaries" of software systems in order to help their comprehension.

General project description

A summary of a document can be obtained in two ways: abstractive summarization or extractive summarization. In software engineering, a form of abstractive summarization is architectural reconstruction, that is, generating higher-level software abstractions out of the primary software artifacts that are given; it mainly works toward generating high-level architectural "boxes and lines" diagrams starting from the raw code, identifying the meaningful boxes and lines. Extractive summaries in software engineering are built by identifying and selecting the most important software artifacts from the existing ones.
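
To make the difference between the two kinds of summary concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy dependency graph, the class and package names, and the choice of PageRank as importance score are hypothetical and are not the methods prescribed for the projects. The extractive summary selects existing classes unchanged; the abstractive summary builds new, higher-level elements (packages as boxes, package dependencies as lines).

```python
# Hypothetical primary model: each class mapped to the classes it depends on.
DEPENDS_ON = {
    "ui.MainWindow": {"core.Engine", "core.Config"},
    "ui.Dialog": {"core.Config"},
    "core.Engine": {"core.Config", "util.Log"},
    "core.Config": {"util.Log"},
    "util.Log": set(),
}


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict-of-sets graph."""
    nodes = sorted(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A class without outgoing dependencies spreads its rank evenly.
            targets = graph[src] or set(nodes)
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def extractive_summary(graph, k=2):
    """Select the k highest-ranked classes, unchanged, as the summary."""
    rank = pagerank(graph)
    return sorted(graph, key=lambda n: (-rank[n], n))[:k]


def package_of(class_name):
    """Return the package part of a qualified name such as 'core.Engine'."""
    return class_name.rsplit(".", 1)[0]


def abstractive_summary(graph):
    """Lift class-level dependencies to package-level boxes and lines."""
    edges = set()
    for src, targets in graph.items():
        for dst in targets:
            if package_of(src) != package_of(dst):
                edges.add((package_of(src), package_of(dst)))
    return sorted(edges)


if __name__ == "__main__":
    print("extractive: ", extractive_summary(DEPENDS_ON, k=2))
    print("abstractive:", abstractive_summary(DEPENDS_ON))
```

In this sketch the score follows dependency direction, so heavily used classes rank highest; whether that matches what a human would call "important" is exactly the kind of question the projects would have to validate.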

We can build tools for both approaches of summarization and combinations thereof.

Project implementations

How exactly does such a tool work? It follows these general steps:
  1. First, the tool takes as input some existing code, analyzes it and generates a primary model of the code. We do not do this from scratch, but use various libraries and frameworks for code analysis.
  2. Then the tool analyzes the primary model generated in step 1 and produces the "summaries". Projects will focus on this step, defining and implementing methods for analysis and "summarization": abstractive, extractive, or a combination of both. The methods used come from domains such as data mining (clustering), graph algorithms, network analysis, fuzzy inference, etc.
The individual assignment for each student project will be discussed at the appointment.
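
The two steps above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: Python's standard `ast` module stands in for the code analysis libraries mentioned in step 1, the analyzed source text and its class names are made up, and the "summary" in step 2 is a deliberately naive count of incoming references rather than one of the methods the projects will develop.

```python
import ast
from collections import Counter

# Hypothetical code to analyze; a real tool would read a whole code base.
SOURCE = '''
class Logger:
    pass

class Config:
    def __init__(self):
        self.log = Logger()

class Engine:
    def __init__(self):
        self.cfg = Config()
        self.log = Logger()
'''


def build_primary_model(source):
    """Step 1: map each top-level class to the other classes it refers to."""
    tree = ast.parse(source)
    classes = [node for node in tree.body if isinstance(node, ast.ClassDef)]
    known = {cls.name for cls in classes}
    model = {}
    for cls in classes:
        # Collect every plain name used anywhere inside the class definition.
        used = {n.id for n in ast.walk(cls) if isinstance(n, ast.Name)}
        model[cls.name] = (used & known) - {cls.name}
    return model


def summarize(model, k=1):
    """Step 2: keep the k classes referenced by the most other classes."""
    fan_in = Counter(dst for targets in model.values() for dst in targets)
    return sorted(model, key=lambda name: (-fan_in[name], name))[:k]


if __name__ == "__main__":
    primary_model = build_primary_model(SOURCE)
    print(primary_model)
    print(summarize(primary_model, k=1))
```

Keeping the primary model as a plain mapping between the two steps means the summarization method in step 2 can be replaced (clustering, network analysis, etc.) without touching the code analysis in step 1.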

Candidate requirements

Interested students should send an email to ioana.sora@cs.upt.ro to set up an appointment for discussion.

Starting points

The projects will start by implementing methods described in some of the publications, then work on their improvement, extension and validation. We also have existing implementations of certain parts, previously developed as part of the ART project, that can be reused or extended.