Introducing VisCad
-
Detection and analysis of similar code fragments (“code clones”) has become an integral part of software maintenance. In response, a great many clone detection techniques and tools have been proposed over the last decade. However, identifying useful cloning information in the large volume of textual data produced by these detectors is challenging. VisCad is a tool for visualizing and analyzing large volumes of raw cloning data interactively. Users can analyze and identify distinctive code clones through a set of visualization techniques, metrics, and data filtering operations. The loosely coupled architecture of VisCad allows users to work with the output of any clone detection tool that reports the source coordinates of the clones it finds. Users can therefore work with the clone detector of their choice, which is important for clone analysis since each clone detector has its own strengths and weaknesses.
-
1. VisCad requires Java Runtime Environment (JRE) 6 or later. You can download a recent JRE from here.
2. We have successfully tested VisCad on Windows XP, Windows 7, Mac OS X (version 10.6), and some Linux distributions (such as Ubuntu).
4. The more RAM your computer has, the better VisCad performs. We recommend 2.0 GB or more.
5. A display resolution of 1024 x 768 or greater is recommended.
-
Follow the steps listed below:
1. Download the VisCad_beta.zip file from here. The most recent version of VisCad, its documentation, and its source code are available at this location.
2. Extract the contents of the archive.
3. Double-click the VisCad.jar file to run the program.
-
VisCad requires the subject system and the clone detection result obtained by running one of the supported clone detectors on that system.
Currently, you can directly import clone detection results from CCFinder, Simian, SimScan, and NiCad. If you have a clone detection result in RCF format (e.g., an iClones result), you can also import and analyze it in VisCad. For other clone detection tools, you need to convert the result into the VisCad input file format. The VisCadBeta.zip file contains an example VisCad input file.
-
The main user interface of VisCad can be divided into three parts.
1. Left Part: The left part accommodates the clone browser, which itself has two parts. One part, known as the System Navigation Tree, displays the distribution of clones over the directories and subdirectories of the subject system. The other part, located at the bottom of the clone browser and called the Clone Class Tree, lists all clone classes and the number of clone snippets in each class.
2. Middle Part: The middle part of VisCad accommodates different views in separate tabs. We refer to this part as the Viewer.
3. Right Part: The top-right window shows the clone-detection-specific information that VisCad obtained while parsing the result file for the selected subject system. For any directory selected in the System Navigation Tree, the bottom-right window shows the distribution of clones across its subdirectories as a pie chart.
-
You can analyze the source code of the clone fragments using the code browser. The same component is also used elsewhere in VisCad for analyzing clone fragments.
-
You can use this view to analyze and compare clone files with grouping and selection features.
Visualization
-
Visualization plays an important role in code clone analysis because it can provide a high-level overview of cloning in a system. At present, VisCad supports three visualizations: scatter plot, treemap, and hierarchical dependency graph.
-
A scatter plot can be viewed as a two-dimensional matrix in which each cell represents the cloning status between a pair of files or directories. In VisCad, each cell renders the clone pairs shared by its pair of files or directories using a color heatmap. Cells are also labelled along the horizontal and vertical axes.
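As a rough illustration of the data behind such a scatter plot, the following Python sketch counts clone pairs for every pair of files. The function name and the shape of `clone_pairs` are assumptions made for this example and do not reflect VisCad's internal data model; a heatmap would then map each count to a color.

```python
from collections import Counter


def clone_pair_matrix(clone_pairs):
    """Count clone pairs for each (file, file) cell of a scatter plot.

    clone_pairs: iterable of (file_a, file_b) tuples, one per clone pair.
    This input shape is hypothetical, chosen only for illustration.
    """
    pairs = list(clone_pairs)
    files = sorted({name for pair in pairs for name in pair})
    index = {name: i for i, name in enumerate(files)}

    counts = Counter()
    for a, b in pairs:
        i, j = index[a], index[b]
        counts[(i, j)] += 1
        if i != j:
            # Keep the matrix symmetric: a pair (a, b) is also a pair (b, a).
            counts[(j, i)] += 1

    size = len(files)
    matrix = [[counts[(i, j)] for j in range(size)] for i in range(size)]
    return files, matrix
```

Cells on the diagonal hold clone pairs whose fragments both lie in the same file.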
-
A treemap preserves the hierarchical structure of the subject system; each rectangle represents a file or directory. The rectangles representing files are aggregated to indicate the cloning status of a directory in the system hierarchy.
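The aggregation step can be sketched as follows. This example assumes that the cloning status of a file is measured as a count of cloned lines, which the text above does not specify; the function name and input format are likewise hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def aggregate_cloned_lines(cloned_lines_per_file):
    """Sum per-file cloned line counts into every ancestor directory.

    cloned_lines_per_file: dict mapping a relative file path (with "/"
    separators) to its number of cloned lines. Both the metric and the
    input format are assumptions for this sketch.
    """
    totals = defaultdict(int)
    for path, cloned in cloned_lines_per_file.items():
        # parents yields the containing directory, then its parent, up to "."
        for parent in PurePosixPath(path).parents:
            totals[str(parent)] += cloned
    return dict(totals)
```

The entry for "." holds the total for the whole subject system, which corresponds to the outermost rectangle of the treemap.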
-
Clones are more problematic when the members of a clone class are scattered across different parts of a software system, because a change to one member may then have to be repeated in several parts of the system. It is therefore important to discover how clone fragments are distributed across subsystems/directories. Moreover, understanding cloning relationships among different subsystems can also reveal their dependencies. VisCad can render the hierarchical organization of a software system along with the distribution of clones using a hierarchical dependency graph.
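One way to derive the cross-directory relationships that such a graph could display is sketched below. The input format and function name are assumptions for illustration; how VisCad actually computes its graph edges is not described here.

```python
from collections import Counter
from itertools import combinations
from pathlib import PurePosixPath


def directory_clone_edges(clone_classes):
    """Count, for each pair of directories, the clone classes they share.

    clone_classes: dict mapping a clone class id to the file paths of its
    fragments (a hypothetical input format).
    """
    edges = Counter()
    for fragment_files in clone_classes.values():
        # Directories touched by this clone class, without duplicates.
        dirs = sorted({str(PurePosixPath(f).parent) for f in fragment_files})
        for a, b in combinations(dirs, 2):
            edges[(a, b)] += 1
    return dict(edges)
```

A clone class confined to a single directory contributes no edge, so the result highlights only the classes whose members are scattered.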
Code Clone Metrics
-
To support in-depth clone analysis, VisCad can compute a set of metrics. We can divide the metrics into two broad categories. The first set of metrics (the clone system metric set) relates clones to the organizational structure of the subject system and can be computed for different system boundaries, such as the entire system, subsystems/directories, or source files, as the user chooses. Depending on the granularity of operation, we can subdivide this set into two groups: the file metric set and the directory metric set. The second set of metrics (the clone class metric set) deals with the clone classes.
VisCad supports four operations for each metric set. These are:
1. Exporting: Results of metric computations can be exported in CSV (comma-separated values) format.
2. Plotting: Although metrics are important for quantitative analysis, identifying important patterns in a large set of data is difficult. To ease this task, users can plot the metric values as a bar chart, which makes anomalies in clone patterns easier to spot.
3. Browsing Clone Code: Depending on the metric values, users may want to explore the clone fragments located within a file or directory. VisCad supports this operation as well.
4. Sorting: Values can be sorted to locate the maximum or the minimum value easily.
-
This section discusses the steps for obtaining various metrics values for the clone files located in a directory.
-
This section discusses the steps for obtaining various metrics values for all clone directories within a selected directory. A clone directory is a directory that contains at least one clone fragment.
-
This section discusses the steps for obtaining various metrics values for the clone classes.
Filtering
-
The first and foremost challenge in clone analysis is the large volume of clone detection results. Not all clones are useful to the user, and the objective of the analysis at hand determines which clones are useful. Here, the term ‘useful clones’ refers to those clone fragments that the maintenance engineers are looking for or are interested in. For example, when the objective is to analyze inter-project clones, users may be more interested in the clone classes whose fragments are distributed across different projects, and these clone fragments form the set of useful clones. In that case, we can filter out the clone pairs that are not distributed across different projects. VisCad supports a set of filtering operations to remove clones that are not useful or interesting to the users.
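The inter-project example can be sketched in a few lines of Python. The sketch assumes that each project occupies its own top-level directory and that a clone pair is given as two file paths; both are assumptions for illustration, and the function names are hypothetical.

```python
def project_of(path):
    """Return the top-level directory of a path, taken here as the project."""
    return path.split("/", 1)[0]


def inter_project_pairs(clone_pairs):
    """Keep only the clone pairs whose fragments lie in different projects.

    clone_pairs: iterable of (file_a, file_b) tuples (hypothetical format).
    """
    return [(a, b) for a, b in clone_pairs if project_of(a) != project_of(b)]
```

For instance, a pair spanning `projA/x.c` and `projB/y.c` is kept, while a pair spanning `projA/x.c` and `projA/z.c` is filtered out.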
-
Clones may overlap each other, and removing such overlapping clones also reduces the size of the result set.
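The text does not say which of two overlapping clones is kept, so the sketch below makes one possible choice explicit: it keeps the longer fragment and drops any fragment in the same file whose line range overlaps a fragment already kept. The fragment representation, the function name, and the keep-the-longest policy are all assumptions, not VisCad's documented behavior.

```python
def remove_overlapping(fragments):
    """Drop fragments that overlap a longer fragment in the same file.

    fragments: list of (file, start_line, end_line) tuples with inclusive
    line ranges. The keep-the-longest policy is an assumption.
    """
    kept = []
    # Visit longer fragments first; ties are broken by file and start line
    # so that the result is deterministic.
    ordered = sorted(fragments, key=lambda f: (-(f[2] - f[1]), f[0], f[1]))
    for file, start, end in ordered:
        overlaps = any(
            k_file == file and start <= k_end and k_start <= end
            for k_file, k_start, k_end in kept
        )
        if not overlaps:
            kept.append((file, start, end))
    return sorted(kept)
```

Two ranges overlap when each starts no later than the other ends, which is the condition tested inside `any`.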
-
Textual filtering removes clones that are only structurally similar and have no semantic similarity. For each clone class, VisCad determines the clone fragment that maximizes the sum of the textual similarity to all other fragments of that class. We call this fragment the ‘leading clone fragment’ of that class. If the textual similarity between the leading clone fragment and any other clone fragment in the clone class falls below a given threshold value, we remove that fragment from the analysis. We discard the entire clone class from the analysis when the textual similarities between the leading clone fragment and all other fragments of that clone class fall below the threshold value.
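The procedure above can be sketched as follows. The text does not name the textual similarity measure, so this example substitutes the ratio computed by `difflib.SequenceMatcher` from the Python standard library; that choice, the function names, and the representation of fragments as plain strings are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Textual similarity in [0, 1]; a stand-in for the unspecified measure."""
    return SequenceMatcher(None, a, b).ratio()


def filter_clone_class(fragments, threshold):
    """Apply textual filtering to one clone class.

    fragments: list of source strings, one per clone fragment.
    Returns the fragments that survive, or [] if the class is discarded.
    """
    if len(fragments) < 2:
        return list(fragments)

    def total_similarity(i):
        return sum(
            similarity(fragments[i], fragments[j])
            for j in range(len(fragments))
            if j != i
        )

    # Leading fragment: maximizes the summed similarity to all others.
    lead = max(range(len(fragments)), key=total_similarity)

    kept = [
        frag
        for i, frag in enumerate(fragments)
        if i == lead or similarity(fragments[lead], frag) >= threshold
    ]
    # If only the leading fragment survives, the whole class is discarded.
    return kept if len(kept) > 1 else []
```

When several fragments tie for the highest summed similarity, this sketch picks the first one; the text does not specify a tie-breaking rule.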