Last week I traveled to Thessaloniki to participate in the Conference on Advanced Information Systems Engineering (CAISE’14). I presented a paper at the Workshop on Cognitive Aspects of Information Systems Engineering (COGNISE). This is joint work with Yulia Shmerlin and Prof. Doron Kliger from the University of Haifa.
Title: Reducing Technical Debt: Using Persuasive Technology for Encouraging Software Developers to Document Code
Abstract: Technical debt is a metaphor for the gap between the current state of a software system and its hypothesized ‘ideal’ state. One of the significant and under-investigated elements of technical debt is documentation debt, which may occur when code is created without supporting internal documentation, such as code comments. Studies have shown that outdated or lacking documentation is a considerable contributor to increased costs of software systems maintenance. The importance of comments is often overlooked by software developers, resulting in a notably slower growth rate of comments compared to the growth rate of code in software projects. This research aims to explore and better understand developers’ reluctance to document code, and accordingly to propose efficient ways of using persuasive technology to encourage programmers to document their code. The results may assist software practitioners and project managers to control and reduce documentation debt.
You can freely download the paper.
Below are the slides of my presentation at the workshop:
Hi,
Really interesting subject.
I wonder why you think that documentation, or the lack of it, causes debt.
I will start by saying that I am an advocate of ‘no comment in code’. As you mention in your slides:
“…some development approaches promote the idea that good code should be self-explanatory…”
It’s not only about being self-explanatory. I find that an even stronger reason is something you also mention:
“…outdated documentation…”
Clean code suggests that you should be able to understand the code by just reading it “like a story”.
If the code is clean, well named, OOD, etc., one does not need to explain in comments what the code does.
You also mention:
“…time is spent on studying the software prior to modification because lack of appropriate documentation…”
I agree with the beginning of this argument, but not with the reasoning behind it.
It’s true. Time is spent on studying the software prior to modification.
You must spend time understanding the software.
But if the code is written in a clean, conventional, well-mannered way, you should be able to understand it by reading just the code.
Moreover, if you have tests, they are the best documentation possible.
Which brings me to the research plan.
In Think-aloud sessions…
I wonder what the answer would be if the subject relies on documentation.
I suggest you also run this experiment on clean code with 100% test coverage and no comments.
Also, in the think-aloud sessions, ask them whether each time they read a comment, they also read the code.
Ask the participants whether they “believe” the comment or the code.
Pilot experiments: I am not sure undergraduate students are the best candidates.
Perhaps an intermediate developer (3-5 years of experience) would be a better fit.
One last remark.
I used to write beautiful comments back in the day.
Mostly while at university.
But in real life, I really find that having comments is the debt. Not missing them.
And a “must note”.
If you write a library that people should use, you really need to have really great documentation (javadoc).
Eyal
Eyal, thanks for your comment! I agree completely that there are several factors that affect the readability and understandability of source code, including modularity, naming conventions and the existence of unit tests. However, in practice most code is not really “clean”, and for such code previous research has shown that comments contribute significantly to maintainability. Please see the references in the paper for several such studies on real software systems.
I like your suggestion for including in the Think-aloud sessions some experiments with really clean and undocumented code, thanks!
Regarding our experiments, we intend to use students only for the pilots, and to study professional software developers to derive our conclusions.
Yeah… usually most code is not as clean as we would have liked 🙂
My thesis is that when someone reads the comment, he will also read the code itself.
(that is at least my experience).
Which means basically that one reads twice.
BTW,
you can do something nice:
add a comment, which is wrong (outdated).
And this is real life, isn’t it?
Then see how the pilot participants read the comments and/or the code.
So even outside a clean-code think-aloud session, I wonder what the results would be.
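To make the suggestion concrete, a test snippet with a deliberately outdated comment might look something like the sketch below. The function name, the fee values and the scenario are all hypothetical, invented purely for illustration; nothing here comes from the paper or the planned experiments.

```python
def shipping_cost(weight_kg: float) -> float:
    """Return the shipping cost for a parcel (hypothetical example)."""
    # Flat fee of 5.0 for every parcel, regardless of weight.
    # ^ Deliberately outdated: the code below charges a base fee plus a per-kg rate.
    return 2.0 + 1.5 * weight_kg


# A reader who trusts the comment expects 5.0 for any weight;
# a reader who trusts the code computes 2.0 + 1.5 * 4.
cost = shipping_cost(4)
```

Asking a participant what `shipping_cost(4)` returns would show whether they relied on the comment or on the code.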
“If the code is clean, well named, OOD, etc., one does not need to explain in comments what the code does.”
True…but understand that knowing “what” it does is less important than “why” it does what it does.
I wonder,
why is knowing “why” more important than knowing “what”?
Can you clarify how you see the difference between the two?
There’s always more than one way to code something, and two blocks of code that appear to do the same thing can yield different results. Take the case of what’s known as Banker’s Rounding: knowing why it’s used can prevent you from substituting a different (incorrect) rounding function during refactoring. Vagaries of APIs can also be important: when creating an attachment to an email message in .NET, you can’t dispose of the stream that was used to construct the attachment until after the message has been sent. Noting that in a comment is useful for preventing bugs when someone does housecleaning on a method (yes, a unit test may catch the problem with the attachment, but it doesn’t tell you what caused it).
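As a rough illustration of the rounding point, here is a minimal Python sketch using the standard `decimal` module. The helper names and the sample amounts are my own assumptions, chosen only to show how a “why” comment guards a choice that the code alone does not explain.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")


def round_bankers(amount: Decimal) -> Decimal:
    # WHY: half-even ("banker's") rounding sends ties to the nearest even digit,
    # so over many values the rounding errors tend to cancel out instead of
    # drifting upward. Do not swap in half-up rounding during refactoring.
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)


def round_half_up(amount: Decimal) -> Decimal:
    # The "obvious" substitute: ties always round away from zero.
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


# Hypothetical amounts that all sit exactly on a tie.
amounts = [Decimal("0.125"), Decimal("0.135"), Decimal("0.145"), Decimal("0.155")]

bankers_total = sum(round_bankers(a) for a in amounts)  # 0.12 + 0.14 + 0.14 + 0.16
half_up_total = sum(round_half_up(a) for a in amounts)  # 0.13 + 0.14 + 0.15 + 0.16
```

The exact sum of the amounts is 0.56. Here `bankers_total` matches it, while `half_up_total` comes out at 0.58. Both functions read as “round to cents”; only the comment says why one was chosen over the other.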
The big question is the granularity of the documentation, not whether documentation should be written, which is a false debate.
Code is not a good place for documenting high-level design because it is only a small part of an application. What is interesting is the communication between different applications, potentially written in different languages and with different owners: this belongs in an architecture document.
If we now consider very fine-grained documentation which has the same level of detail as the code itself, then the question is: can such documentation do better than the code? With a sufficiently high-level language, it should be obvious that the answer is no.
So the big question really is: at what point do we leave case 1 and enter case 2? But I doubt that a hard rule can easily be formulated.
———————————–
“…some development approaches promote the idea that good code should be self-explanatory…”
As for this comment, this is really a matter of skill level (see Shuhari or the Dreyfus model).
New developers should be somehow forced to write documentation because
1) it is something that should be learned
2) the best way to know that something has been learned is to be able to explain it.
Experienced developers should be somehow forced to not write documentation because they should know when documentation is necessary or not. Experience is required to understand the value of documentation and whether (good) code can be sufficient or not.
Hayim,
I’m very much looking forward to seeing this develop. I agree with Bertrand’s comment above that some documentation should be external to the code, but as I noted in a reply to Eyal, the reasoning behind coding choices is extremely important and orthogonal to the readability of code.
Cheers!