Code Swarm: Visualizing the Evolution of Software Systems

Software systems grow over time. They evolve through additions, extensions and modifications. A successful application is not one that satisfies all the original requirements and then is delivered to its users and remains unchanged. If people are using a system, they will require it to be changed. They will have ideas about new features and capabilities that were not originally planned, and will feel that they really need them. Of course users also find bugs that need to be fixed, but this kind of maintenance is actually a small effort when compared to the addition of new functionality. Thus, if a system is not evolving it is because it is not being used, and if a system is being used it certainly will need to evolve.

Now the question is: How do systems evolve?

For a particular system, we can ask specific questions:

  • What parts of the system are being changed more frequently?
  • How is the total number of files growing over time?
  • How is the work being distributed among the software developers?
  • Are there developers that contribute more than others?
  • Is there an “ownership” relationship between developers and software components?

The answers to these questions can be obtained by analyzing the history of the software repository. Every commit records a change made by a particular developer to one or more files. A commit also establishes a relationship between a programmer and a file at a particular moment in time. This relationship may continue for a period of time if the same person makes several successive changes to the same file.
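The questions listed above can be answered with a few counters over the commit history. Below is a minimal sketch in Python, assuming the history has already been exported from the repository as a list of (day, developer, path) records. The `Change` type, the function names, and the sample data are hypothetical and only for illustration. "Ownership" is approximated here as the developer with the most changes to a file, which is just one possible definition.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Change(NamedTuple):
    """One file touched by one commit (hypothetical record layout)."""
    day: int          # day number of the commit, counted from project start
    developer: str    # who made the commit
    path: str         # which file was changed


def summarize(changes):
    """Count changes per file and per developer, and guess an owner per file."""
    changes_per_file = Counter(c.path for c in changes)
    changes_per_dev = Counter(c.developer for c in changes)

    # For each file, count how often each developer changed it.
    per_file_dev = defaultdict(Counter)
    for c in changes:
        per_file_dev[c.path][c.developer] += 1

    # Assumption: the "owner" is the developer with the most changes to the file.
    owners = {path: devs.most_common(1)[0][0]
              for path, devs in per_file_dev.items()}
    return changes_per_file, changes_per_dev, owners


def files_over_time(changes):
    """Number of distinct files seen so far, per day (deletions are ignored)."""
    seen = set()
    growth = {}
    for c in sorted(changes, key=lambda c: c.day):
        seen.add(c.path)
        growth[c.day] = len(seen)
    return growth


# Invented sample history, for illustration only.
history = [
    Change(1, "alice", "core.py"),
    Change(1, "alice", "README"),
    Change(2, "bob", "core.py"),
    Change(3, "alice", "core.py"),
    Change(5, "carol", "docs.txt"),
]

per_file, per_dev, owners = summarize(history)
growth = files_over_time(history)
print(per_file.most_common(1))   # the most frequently changed file
print(per_dev.most_common())     # developers ranked by number of changes
print(owners)                    # most frequent committer for each file
print(growth)                    # cumulative number of files per day
```

This sketch counts file changes only. It does not look at the size of each change, and it does not account for files that were deleted or renamed.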

But for big systems it can be difficult to summarize all this information and understand how the system is growing and changing. It would therefore be very helpful to visualize the evolution of a software system graphically. This is exactly the concept behind Code Swarms, as conceived by Michael Ogawa:

This visualization, called code_swarm, shows the history of commits in a software project. A commit happens when a developer makes changes to the code or documents and transfers them into the central project repository. Both developers and files are represented as moving elements. When a developer commits a file, it lights up and flies towards that developer. Files are colored according to their purpose, such as whether they are source code or a document. If files or developers have not been active for a while, they will fade away. A histogram at the bottom keeps a reminder of what has come before.

As an example, see the Code Swarm for Twitter below:

It is interesting to note how a small number of developers play a very central role in the evolution of Twitter, changing a large number of files over a long period of time. It is also interesting to see that other developers join the project, make a relatively small contribution, and then “fade away”. Of course, here we can only observe the number of files being changed, and not the size of the contribution in lines of code (LOC) nor the complexity of the code being added or modified. But even so, it is clear that a few programmers are “hubs” while others are on the periphery.
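The "hub" observation can also be expressed as a number: the share of all file changes made by the top few developers. The following is a rough sketch, assuming we already have one developer name per file change. The function name `top_share` and the sample names are hypothetical, and the figures are invented rather than taken from the Twitter repository.

```python
from collections import Counter


def top_share(developers, k):
    """Fraction of all file changes made by the k most active developers."""
    counts = Counter(developers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Invented data: one entry per file change, naming the developer who made it.
sample = ["alice"] * 6 + ["bob"] * 3 + ["carol"]

share_top1 = top_share(sample, 1)
share_top2 = top_share(sample, 2)
print(f"top 1 developer: {share_top1:.0%} of changes")
print(f"top 2 developers: {share_top2:.0%} of changes")
```

A value close to 1 for a small k suggests that a few developers act as hubs. Like the visualization, this measure counts file changes and says nothing about lines of code or complexity.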

I really find this useful, and intend to run Code Swarm for the systems I’ve developed recently with my colleagues at work.

I would appreciate hearing what you think about this idea of Code Swarms. Please leave a comment below.

About Hayim Makabee

Veteran software developer, enthusiastic programmer, author of a book on Object-Oriented Programming, co-founder and CEO at KashKlik, an innovative Influencer Marketing platform.

2 Responses to Code Swarm: Visualizing the Evolution of Software Systems

  1. Very interesting idea. I am wondering how to use it. While the video is wonderful, I wanted to understand whether I can generate custom reports that tell me which developers have made the most changes (in decreasing order), which files have changed the most in size and in number of times (again in decreasing order), which source files are changed together, etc. If the information is potentially available as a schema, then I would be able to specify which reports I am interested in. Is such a facility available, or being planned? Will the video and the specification-driven reports work with any kind of versioning system?

    • makabee says:

      I think these are very good ideas and I’m sure it’s possible to implement all of them based on the existing system.
      But I’m not directly involved with its development, so I suggest you contact Michael Ogawa.
