A (Fast and Loose) Model of Software Communication

I’d like to show that diction is a critical component of successful software products. Diction has nuanced meaning, yet “word choice” is possibly the simplest definition. It is a necessary, but not sufficient, requisite for successful communication in a software team, though I think it is perhaps 80% sufficient.

I want to approach this essay in the spirit of a (really) informal sketch of a proof. Maybe I’ll refine this further when I get more time.

Let’s first define these sets: software team, communication artifacts, and products. Then, a short methodology-independent discussion of the software process. Finally, the argument for how a failure of diction in an element of the communication artifact set is to the detriment of the product set.

Definitions and Sets

A software team is a mapping of roles to a set of people. Often, the size of the set of roles is much larger than the size of the set of people. We need to revisit the mapping question later, but for now, let’s assume the role-set and people-set are of equal size and the mapping is one-to-one. Here’s a sampling of the software team set:

  • Developer
  • Tester
  • Analyst (commonly known as Business Analyst or BA, but I want a single word identifier)
  • Writer (commonly known as Technical Writer)
  • Director
  • Client (is often plural, Audience may be used as well)

I’m purposely excluding incidental, but common, accessories to software teams, such as Business Manager or Systems Administrator. Here, Business Manager is meant more like “Administrator”: she makes sure the timecards (if any) are punched and the seating chart is accurate. Folks like this are important, but for the purposes of dialogue concerning abstract concepts, they are left out. I think this choice is fair, since communications to and from these folks will be very concrete (e.g. “Take down the production server on node 2” or “Update my cell phone in the company rolodex”).

Let’s treat Database Administrators and Artists/Designers (if any) like Developers. The Director can be anyone from your Lead Developer to the CTO or CEO: informally, the “Boss,” the Strategist, or if you prefer, the “Buck Stops Here” person, the decision arbiter. The Client could be anything from an individual end-user of your product to tens of thousands of users.

Generally, each element of the software team set has a voice, and they have the direct responsibility for products. Communication artifacts are generated as part of the process that creates these products.

The set of communication artifacts includes intangibles like phone calls, hallway conversations, and dialogue in meetings, and tangibles like chat or instant message logs, emails, documents, source control comments, and anything input by users of an issue tracking system.

Finally, the set of products includes anything tangible produced by the software team, which is usually software.

Software Process Theory

Communication is nothing more than the generation of communication artifacts that orchestrate the production of products by a software team. In other words, it’s the activity of creating a common understanding in each element of the software team through communication artifacts, with the goal of creating a successful product.

Ultimately, a successful product can be defined as that which most accurately expresses the mental “will-outcome” or “sphere of thought-will” of the Client. This is a stupidly complicated way of saying the customer is always right. We cannot argue that what the Client expects in a product is wrong. However, we can accept that how the Client expresses their desire for the outcome (the “will-outcome”) may be incomplete or inadequate. This doesn’t mean at all that the Client has a shortcoming – it is an extremely difficult exercise and often the full scope of the desired outcome is unknown to the Client! So, don’t jump to conclusions and fault the Client for anything. The product could be silly, or the business model flawed, but these are irrelevant concerns in this argument.

We can define a function s on the Client communication artifacts C, such that s(C) returns a number between 0 and 1 inclusive, so that 0 means the Client has conveyed no information about their “will-outcome” and 1 means all the information of the “will-outcome” is expressed in full. My experience is that this function s(C) is usually something between 0.25 and 0.60, if the products are moderately complex [0]. Simpler products mean a higher s(C) value, more complex products introduce more unknowns, so it can be lower.

The Client role interfaces primarily with the Analyst role. It is the responsibility of the Analyst to take what is likely an imperfect representation and create communication artifacts that expand on the Client’s “will-outcome” – like insulating foam around cracks in the attic walls. In math terms, the Analyst applies a transformation A to Client generated communication artifacts C such that C’ = A(C). We can measure the success of the work of the Analyst if they apply a transformation to the Client communication artifacts C, and s(C’) > s(C). Simply put, we are modeling the concept of clarification.

The set C’ is distributed to Developers and Testers. Products P can be produced by the function D, such that P = D(C’). The function T(P, C’) measures the difference between s(P) and s(C’), or T(P, C’) = s(P) – s(C’). If T is negative, then the product is not meeting the current clarification of the Client’s “will-outcome”. T has this interesting property where it shouldn’t be positive (given a strict C’), but it can be positive if somehow s(P) tells us that the current iteration of the product satisfies the Client more than the clarification of the Client’s “will-outcome” that the Analyst produced. In practice, this does happen because the function D can incorporate more information than just what is present in C’.
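As a rough sketch of the two checks above, here is how the clarification test and the function T might look in Python. The scores are hypothetical numbers I picked for illustration, and the helper names clarified and gap are mine, not part of the model.

```python
def clarified(s_before: float, s_after: float) -> bool:
    """The Analyst's transformation A succeeded if s(C') > s(C)."""
    return s_after > s_before


def gap(s_product: float, s_clarified: float) -> float:
    """T(P, C') = s(P) - s(C'); negative means the product lags the clarification."""
    return s_product - s_clarified


# Hypothetical scores, chosen only for illustration.
s_C = 0.40        # what the Client originally conveyed
s_C_prime = 0.55  # after the Analyst's clarification
s_P = 0.50        # what the current product expresses

print(clarified(s_C, s_C_prime))   # True: the clarification added information
print(gap(s_P, s_C_prime) < 0)     # True: the product is behind C'
```

A positive gap would correspond to the case where D pulled in information that never made it into C’.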

Writers generally invoke transform W on C’ to add to the set, but W can also accept P and produce P’ (informally, the software and its documentation, but also includes “support”). This is important because the Client ultimately queries s(P’).

Directors are interesting in that they use function B(C, C’) and modify C’ – meaning, since they are arbiter of Client meaning, they alter the value of s(C’), hopefully, for the better.

Immediately, it should be apparent that the waterfall model would never work. Only through successive applications of A, D, and B can we hope to increase s(C’) significantly. So, some variant of iterative development, be it Agile or Scrum, or whatever, has hope of producing a set of products such that s(P’) >> s(C).
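To make the iteration argument a little more concrete, here is a toy simulation. It assumes (and this is purely my assumption, not something the model dictates) that each round of A, D, and B closes a fixed fraction of whatever is still missing from the “will-outcome.” The starting score and the gain are made-up numbers.

```python
def iterate(s0: float, gain: float, rounds: int) -> float:
    """Apply `rounds` clarification rounds to an initial score s0.

    Assumed rule: each round recovers `gain` of the remaining gap (1 - s).
    """
    s = s0
    for _ in range(rounds):
        s += gain * (1.0 - s)
    return s


# Hypothetical inputs: s(C) = 0.4 and each round recovers 30% of the gap.
single_pass = iterate(0.4, 0.3, rounds=1)   # one pass, waterfall-style
five_rounds = iterate(0.4, 0.3, rounds=5)   # several iterations

print(round(single_pass, 2), round(five_rounds, 2))
```

Under these assumptions a single pass leaves the score well short of what five rounds reach, which is the informal point about waterfall.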

Bringing it Home

We can define a function e (for effectiveness) that compares the elements of a set of communication artifacts C between a subset S of a software team, and let’s say it’s a value between 0 and 1 inclusive where 0 means “identical meaning across all elements in C” and 1 means “no common understanding among elements in C.” For simplicity, let’s say e(C, S) for a set C composed of two elements c1 and c2. We calculate e as a sort of Hamming distance on “understanding.” For example, let’s say c1 and c2 are requirements “Store spreadsheets on file system” and “Store documents in database,” respectively. One verb, two nouns. The verb is the same, so the distance is zero, and e remains 0. Documents and spreadsheets are similar but not identical nouns, so let’s just arbitrarily say the distance is 0.25. File system and database are completely different, so the distance is 1. Informally, let’s just say the value of e(C, S) here is about 0.4 [1]. Note that we assume the words store, spreadsheets, documents, file system, and database mean the same thing to all elements in S. This is the “understandability” calculation. If for some reason, one element in S believes a local SQLite instance is a database, but another element in S assumes this is referring to the MySQL instance on a server, then e(C, S) can be higher, since 1 is the “no common understanding” end of the scale.
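The arithmetic behind “about 0.4” can be sketched as a plain average of the per-word distances. Averaging is my assumption about how the three distances combine; the distances themselves are the ones given above.

```python
def e(distances):
    """Estimate e(C, S) as the mean of per-word distances, each in [0, 1]."""
    return sum(distances) / len(distances)


# Word-by-word comparison of the two requirements:
#   c1 = "Store spreadsheets on file system"
#   c2 = "Store documents in database"
word_distances = {
    "store vs. store": 0.0,               # same verb
    "spreadsheets vs. documents": 0.25,   # similar nouns (arbitrary value)
    "file system vs. database": 1.0,      # completely different nouns
}

print(round(e(list(word_distances.values())), 2))  # 0.42
```

This only covers the distance between the words; it still assumes every element of S reads each word the same way.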

Note that the value of e(C, S) approaching 0 does not strictly mean s(C) approaches 1. The relationship between s(C) and e(C, S) is that e(C, S) is the error term of the calculation of s(C).

Generally, if the diction between the elements in a set of communication artifacts C is similar, we should expect e(C, S) to be closer to 0, but only if the diction is understood in common between the elements in S.

That means that under this model, as iterations of software development go on, the more e(C, S) is allowed to introduce error, the more these errors can become amplified and result in a less satisfying product set for the Client.

This is all really hand-wavy, but if we expand s(P’) = s(W(D(A(C)))), where C’ = A(C), P = D(C’), and P’ = W(P), then the error of A(C) is e(C, {Analyst, Client}), the error of D(C’) is e(C’, {Developer, Analyst, Client}), and the error of W(P) is e(C’, {Developer, Analyst, Writer, Client}). In simpler terms, as we try to measure Client satisfiability, we introduce error that reduces the value of the s(P’) function at potentially each step of an iterative software process.
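Here is one hand-wavy way to see the compounding. I’m assuming each stage scales the score by (1 - e) for that stage; the model does not actually say how the error terms combine, so treat the multiplication rule, the stage errors, and the starting score as hypothetical.

```python
import math
from typing import Sequence


def degraded_score(s_ideal: float, stage_errors: Sequence[float]) -> float:
    """Scale an error-free score by (1 - e) for every stage (an assumed rule)."""
    return s_ideal * math.prod(1.0 - err for err in stage_errors)


# Hypothetical error terms for the three stages named above.
stage_errors = {
    "A(C):  e(C,  {Analyst, Client})": 0.10,
    "D(C'): e(C', {Developer, Analyst, Client})": 0.15,
    "W(P):  e(C', {Developer, Analyst, Writer, Client})": 0.20,
}

# Hypothetical score the product would reach with no misunderstanding at all.
s_ideal = 0.80

print(round(degraded_score(s_ideal, list(stage_errors.values())), 2))
```

Even modest per-stage errors take a visible bite out of s(P’) under this rule, and the bite grows with every stage that adds people to S.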

The Long and Short of It

If we accept that common understanding in a software team creates better software, then it follows that a common language implied by consistent diction in communication artifacts is necessary for satisfiable products under this loosely defined sketch of a software process model.

Conclusions and Recommendations

If you like this model and its thesis, then what it means is that you should consider if a misalignment of diction has ever resulted in a bug or requirements misunderstanding within your software team.

Consider making somebody a full-time “Editor” for all tangible communication artifacts. This person is solely responsible for curating the language of all communications. If there is a mismatch (e.g. “database” vs. “file-system”) then the Editor should get the parties together to clear it up.

Naturally, the Analyst or Writer is a good choice for the role. However, we must bear in mind that subsets of the software team also have private language, and maintenance of the private language in correspondence with the team language is also needed, but an outsider shouldn’t do that. The Editor should be somebody who can mediate and curate this correspondence. I believe it’s a critical job, and warrants a full-time position. However, I haven’t seen this role recognized or elevated to importance at all in any place I’ve worked.

Explorations and Observations

I mentioned that diction was perhaps 80% sufficient for successful software communication. I estimate that 80% of understandability among software team members is driven by diction, so among folks who work in this environment, most language is going to correspond to common understanding. For example, “database” will generally mean the same thing to all team members, but “customer orders” can convey distinct meaning to each person on a software team.

So, I believe that the remaining 20% of understanding mismatch can come from having team members with really different backgrounds, different generations, differences in general. That is not to say a homogeneous team is better, it just means there’s a little more work to be done to keep communications consistent, to foster understanding, and to verify understanding. Often Analysts and Directors are much more experienced, which means there might be a generation gap where understanding fails.

In short, this is the team “fit” everyone keeps talking about. It’s not easy to quantify, but we’re pretty sure if you can “get me” then we’ll make a great team. For one example, “getting me” starts with having a common understanding of the phrase “don’t break the build.”

An interesting observation is that this model fits the anecdote that a smaller team has an easier time communicating, and the model indicates this is because the diction is easier to curate and maintain in sync with understandability. However, as a team gets smaller, the bar is set higher for the initial value of s(C), because a team where many roles are shared or compressed contributes less to increasing s(C) with each round.

[0] I’m really trying not to wing this. If a software estimate captures only 1/4 to 1/2 of the effort required to satisfy the Client, then we can imagine multiple rounds of this process to capture more of the Client’s “will-outcome.” This all implies that the initial s(C) value was low.

[1] In order to formally define the rules and calculation of e(C, S), we are tasked to define and measure “understandability” between elements of the set of humans in the software team set. Obviously, this is challenging, if not impossible. Let’s say e(C, S) is a placeholder for an estimate of effectiveness of a communication artifact.
