The first level of engineering software for maintainability is coding. As I said before, tidy programs are easier to understand and modify. The second level is design and integration. Well-organized, simple, and clean interfaces among components make future changes easier.
In the 20th century, many programs were single stand-alone systems. 21st century software, however, is often collections of integrated software systems that live in an eco-system, not in isolation. They interact with other applications, they use shared libraries, they communicate with related applications on the internet and the cloud, and they share data with multiple computing devices. This increase in couplings among software systems is very powerful but makes careful design more important. In effect, distributed computing has become the norm, not the exception.
Many factors make this kind of integration-heavy software difficult. Networks are notoriously unreliable and much slower than the devices that use them. Programs are diverse in terms of language, operating system, data formats, and many other characteristics. And of course, change is inevitable and continuous. The eco-system changes every time hardware or software is updated and when new applications are brought on board.
Back “in the day,” software components were coupled through function calls and shared non-local variables. Now software components are coupled through networks, messages, the cloud, databases, and other convenient mechanisms. The notion of coupling has been extended:
Extremely loose coupling (ELC) leads directly to XML as a simple but powerful way to support it. Passing data from one software component to another has always been difficult. The two components must agree on format, types, and organization. In method calls, this agreement is syntactically hard-coded into method signatures. But we don’t have the same level of type checking in distributed software (such as web applications), and we have additional requirements:
I examine this through a common form of coupling: One program, P2, needs to use data created by another program, P1. In the 1970s, before the emphasis on data abstraction, P1 would write the data into a file. To save space, files were saved in very compact forms (binary, not text), with rigidly structured records that were often not documented. If P2 was written after the source of P1 was no longer available, the structure of the file had to be deduced by a slow trial-and-error process of reading bytes into memory and printing them in different formats to see if they were legible.
By the 1980s, the concept of data abstraction led to the file being controlled by a “wrapper module” that could read and write the file. The wrapper was shared by both P1 and P2. This was still slow, and since the wrapper module was shared among multiple programs, it was very difficult to change it or the structure of the file. Adding a single field could disrupt dozens of programs.
The modern solution is to use file formats that are free-form, textual, and self-documenting. That is, XML. XML files take a lot more space:
<book>
  <title>Don’t Make Me Think, Revisited: A Common Sense Approach to Web Usability</title>
  <author>Steve Krug</author>
  <year>2014</year>
</book>
but make programs easier to change.
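Here is a minimal sketch of why the self-describing format is easier to change, written in Python with the standard-library `xml.etree.ElementTree` module. The functions `p1_write` and `p2_read_title_and_year` are hypothetical stand-ins for P1 and P2, and the `<publisher>` field (with its placeholder value) is an assumed later addition, not part of the record above. P2 looks fields up by name, so it keeps working when P1 starts writing a field that P2 has never heard of.

```python
import xml.etree.ElementTree as ET


def p1_write(title, author, year, publisher=None):
    """Hypothetical P1: serialize one book record as XML text."""
    book = ET.Element("book")
    ET.SubElement(book, "title").text = title
    ET.SubElement(book, "author").text = author
    ET.SubElement(book, "year").text = str(year)
    if publisher is not None:
        # A field added in a later version of P1 (an assumption for illustration).
        ET.SubElement(book, "publisher").text = publisher
    return ET.tostring(book, encoding="unicode")


def p2_read_title_and_year(xml_text):
    """Hypothetical P2: read only the fields it needs, by name."""
    book = ET.fromstring(xml_text)
    return book.findtext("title"), int(book.findtext("year"))


old_format = p1_write("Don't Make Me Think, Revisited", "Steve Krug", 2014)
new_format = p1_write("Don't Make Me Think, Revisited", "Steve Krug", 2014,
                      publisher="Example Press")  # placeholder value

# P2 gets the same answer from both versions of the file.
assert p2_read_title_and_year(old_format) == p2_read_title_and_year(new_format)
```

With a fixed binary record layout, the same added field would shift byte offsets and could break every reader that was not updated in step with P1.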
A major factor in the maintainability of software is how data is shared and information is passed among software components. Four major styles are:
The distinction between synchronous and asynchronous messages is very important. A telephone call is synchronous, because both parties need to be on the call at the same time. Synchronous communication has greater bandwidth, but introduces two restrictions. First, both parties have to be available at the same time, and second, communication must be in real-time. Voice mail is asynchronous. We leave messages for later retrieval, so the real-time aspects are less important. In web applications, the traditional request-response cycle imposes a synchronous model, but Ajax introduces asynchronous calls.
Figure 1: A synchronous call
Figure 2: An asynchronous call
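The difference can be sketched in Python using only the standard library. The names `slow_service`, `mailbox`, and `worker` are hypothetical. The synchronous caller blocks until the reply is ready, as in a phone call. The asynchronous caller drops a message in a queue and moves on, as with voice mail, and a separate worker thread handles the message later.

```python
import queue
import threading
import time


def slow_service(request):
    """Hypothetical service that takes a while to answer."""
    time.sleep(0.05)
    return f"reply to {request}"


# Synchronous: the caller waits for the reply before doing anything else.
sync_reply = slow_service("sync request")

# Asynchronous: the caller leaves a message and continues immediately.
mailbox = queue.Queue()
replies = []


def worker():
    """Retrieve messages later, independently of the sender."""
    while True:
        request = mailbox.get()
        if request is None:       # sentinel: no more messages
            break
        replies.append(slow_service(request))


thread = threading.Thread(target=worker)
thread.start()

mailbox.put("async request")      # returns at once; no reply yet
# The sender is free to do other work here.

mailbox.put(None)                 # tell the worker to stop
thread.join()                     # wait only so this sketch can show the result
```

The final `join` exists only to make the sketch deterministic. In a real asynchronous design the sender would not wait, which is exactly what makes the programming model harder to reason about.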
Asynchronous messaging architectures are very powerful, but we have to change how we design and develop software. We have decades of deep knowledge about using files, shared databases, and remote method invocations, but we teach college students relatively little about asynchronous software engineering.
With message passing, data encapsulation is stronger than with databases and file storage. RMIs have reliability problems because any glitch in the network will disrupt the communication, whereas messages simply wait until the network recovers. Message passing reduces dependencies, making it less likely that changes will cause problems elsewhere in the system. That is, fewer ripple effects. This improves maintainability, as well as reliability, security, and scalability.
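A minimal sketch of that reliability difference, in Python. The `FlakyNetwork` class, the `remote_call` function, and the `Outbox` class are all hypothetical, and real messaging systems are far more elaborate. The direct call fails the moment the network is down, while the outbox holds the message and delivers it once the network is back.

```python
from collections import deque


class FlakyNetwork:
    """Hypothetical network that can be switched up or down."""

    def __init__(self):
        self.up = True
        self.delivered = []

    def send(self, payload):
        if not self.up:
            raise ConnectionError("network is down")
        self.delivered.append(payload)


def remote_call(network, payload):
    """RMI-style call: succeeds or fails right now."""
    network.send(payload)


class Outbox:
    """Message-style sender: queue first, deliver when possible."""

    def __init__(self, network):
        self.network = network
        self.pending = deque()

    def post(self, payload):
        self.pending.append(payload)
        self.flush()

    def flush(self):
        # Deliver in order; stop at the first failure and keep the rest queued.
        while self.pending:
            try:
                self.network.send(self.pending[0])
            except ConnectionError:
                return
            self.pending.popleft()


network = FlakyNetwork()
outbox = Outbox(network)

network.up = False
try:
    remote_call(network, "rmi: update order 1")
    rmi_failed = False
except ConnectionError:
    rmi_failed = True               # the caller must cope with the failure itself

outbox.post("msg: update order 1")  # accepted even though the network is down

network.up = True
outbox.flush()                      # the queued message goes out after recovery
```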
On the other hand, the lack of deep knowledge about how to write asynchronous software makes such programs less likely to be reliable and harder to understand (a negative for maintainability).
The programming model is different and complex. Logic is distributed across several software components, which does not match how we teach topics like algorithms. Many universities do not teach event-driven software at all.
The sequencing of software tasks is harder. Message systems do not guarantee when a message will arrive, so messages sent in one sequence may arrive in a different sequence. In fact, many applications that could use asynchronous events intentionally do not because of the engineering challenges.
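One common way to cope with reordering is for the sender to number its messages and for the receiver to hold back any that arrive early. The sketch below, in Python, assumes that sequence numbers start at 0 and that no message is lost or duplicated. The `Resequencer` class and its method names are hypothetical.

```python
class Resequencer:
    """Release messages in sequence-number order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = 0         # the next sequence number we may release
        self.held = {}            # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message; return the payloads that are now in order."""
        self.held[seq] = payload
        ready = []
        while self.next_seq in self.held:
            ready.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return ready


# Messages sent as 0, 1, 2 but delivered by the network as 2, 0, 1.
resequencer = Resequencer()
processed = []
for seq, payload in [(2, "ship order"), (0, "create order"), (1, "pay order")]:
    processed.extend(resequencer.receive(seq, payload))
```

Even this small fix adds state and a new failure mode (a missing message stalls everything behind it), which hints at why some teams avoid asynchronous designs.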
Enterprise systems contain hundreds, sometimes thousands, of separate applications. They are a mix of custom-built components, third-party products, and legacy systems. They are often designed with multiple tiers that run on different computers and different operating systems. Many companies depend on large enterprise systems that encapsulate the operation of many aspects of the business. Patriot Web is an example at my university. That and Blackboard, which is used widely to support teaching, are both unreliable, hard to modify, and have extreme usability problems. Although universities suffer more than many companies, problems with enterprise systems are quite common. Many actually grew from multiple smaller software components, just as small towns grow together, slowly integrating to form cities. This type of organic growth invariably creates maintenance debt and confusion; just think of the last time you saw a street change its name without warning.
Integrating diverse applications into a coherent enterprise application will be an important task for years to come. It’s not easy, but understanding important goals like maintainability and usability helps. Lots of frameworks and integration platforms are available. One of the most important things to understand about each is its set of basic assumptions. Some assume that the data never changes, but new functions will be continually added. Thus, APIs should be strong and clear, although the central database may be very hard to change. Others assume that the functions will remain constant, but they will be adapted to new hardware platforms and to new users. Thus, new features will be hard to add, but the UI should be easy to change.
These choices are about tradeoffs. When systems are integrated, we usually can’t support maintainability in all aspects, so a crucial early decision is which types of changes should be planned for. If an organization’s software architect gets that wrong, the entire organization will suffer for years. One of the hardest parts of making these decisions is that the planning team must be able to look 5 or 10 years into the future. This is a rare ability.
To summarize, reducing coupling is a key goal in ensuring maintainability at any level. Software engineers have known about the importance of coupling since the 1970s, although the specifics change with each generation of language, hardware, and software engineering technologies. The primary goal of reducing coupling is to limit the assumptions that two software components have to make when exchanging data. Loose coupling means fewer assumptions. Local method calls have very tight coupling, as do remote method invocations. Worse, RMIs come with the complexity of distributed processing. Message passing, however, has extremely loose coupling and is thus a strong way to increase maintainability.
[1] Dave Parnas, On the Criteria To Be Used in Decomposing Systems into Modules, Communications of the ACM, 15(12):1053-1058, December 1972
Jeff Offutt
George Mason University
offutt@gmu.edu
January 2018