The Journal of Object-Oriented Programming

The "MVC" architecture - a "skill-driven" account

by Avner Ben.

This article originally appeared in the November 2000 issue of JOOP. Some figures have been updated - last update - 4 Sep 2000.

While a comprehensive system design must feature a number of methodical approaches, most systems stress one model, which is the key to understanding them. The well-known Model/View/Controller architecture of GUI-intensive design is a fine example of a system whose functionality almost single-handedly determines what it is. The notation and concept of "skill-driven" design are called upon to furnish a short account of this notoriously hard-to-understand piece of object-oriented design, which is as event-driven as it is procedural. Even the object model of this system may be derived from its functional decomposition model!


In a previous JOOP article[1], I introduced the "skill-driven" approach to object-oriented design. In essence, a "skill" is both a functional requirement of the system and the ability of some entity inside the system to satisfy it. This omnipresent duality is there to ensure traceability and enforce structured project control. Skill-driven design artifacts may be applied to capture the functional complexity of a complete system as well as a discrete algorithm (side by side with UML-compliant - or other - accounts of the object model, use-cases, etc.).

The present article is the first in a series aimed at demonstrating the method's application scope. Here, I demonstrate the use of skill-driven design to capture a software architecture that stresses distribution of functionality. The application domain is the familiar "MVC" architecture of GUI design (see for example [2]). While this may justifiably be considered yet another MVC account, it is primarily a technical effort, intended to show the expressive power of the notation.

Our guided tour of the MVC commences with a free introduction to the application domain, using unstructured text, illustrations and a number of specific use-cases. The structured part follows: three opening statements define (as distinct from "describe") the domain. Then, a skill list and a skill/reliance model exhaust (hopefully) the functional decomposition, which is the essence of this application. Finally, an entity/relationship model (as "class diagram") is factored from the skill/reliance model. A use-case model, abundant in OOD specs, would not contribute much to the understanding of this particular application and is thus omitted.

As a result of this structured tutorial, a newcomer to the MVC architecture should gain a sufficient understanding of its paradigm to proceed with the details of some GUI builder or GUI library that implements it. The veteran MVC user should gain a formal foundation for evaluating existing implementations or designing new ones.

Free-form introduction

The three windows on the computer screen, in figure 1, display the same set of data in three views. The top-left view formats it in hierarchy style. The right view formats it in decision-tree style. The bottom-left view may look like text, but it is just as structured as its companions - this listing style is called military notation. All three views refer to actual data that is stored by a program in computer memory in the traditional tree data structure, using records made of fields, connected by pointers. This in-memory-database object is the data-model (sometimes called document). Since objects do not exist outside computer memory, all the persistent data that the views would need is loaded into the model when it is constructed (or re-opened) and written back to disk when it is saved (or closed).

Each view may be edited and browsed by the user. To support the promise of "WYSIWYG" (What You See Is What You Get), views sport a "point and shoot" - rather than the simple "command-line" - interface. The graphic user interface consists largely of controls (sometimes called widgets) - visible command-objects, normally set by - and reporting to - views. Most of the things you see on the screen are controls. The canvas on which the view paints is a control. The scroll-bar at the side of the canvas-frame is a control. The menu on top of the canvas-frame is a control. Pressing a control with the mouse pointer does something (usually, to the view to which the control reports). Other commands, such as typing on the keyboard, are either intercepted by the view (where it functions as a text-editor) or mapped to a menu option. To be able to interpret target-less commands (e.g., "delete" and "next"), the view must maintain a current-element pointer and use some graphic convention to highlight it (e.g., reverse video). Note that each of the three views in figure 1 maintains a different current element!

This added level of abstraction between view and user - the controls - invites an event-driven software architecture, based upon "registration." Views never talk to the user directly. They set controls and wait for their alert. In fact, some commands, e.g., "save" and "undo," are normally handled directly by the data-model. (To be precise, most real-life MVC architectures actually apply a complicated "chain of command" to resolve the handler of the request. However, for our purpose, this detail may be safely ignored.) The illusion of graphic manipulation of controls (and, in an extreme manifestation of the "point and shoot" metaphor, of "dragging" objects on the screen) is enabled by a background entity called the controller - another event-driven "registration" abstraction layer. All input is first intercepted by the controller. It is often dispatched to controls, based upon simple identification by physical location, to be processed by the control and eventually executed by the views. In its simplest design, a controller consists of an "event loop" that repeatedly listens to inputs, analyses them and dispatches their treatment to the proper handlers.
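
The registration and dispatch scheme just described may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names Control and Controller follow the vocabulary of this article, but the method names (register, contains, alert, dispatch, run), the rectangle geometry and the event tuples are hypothetical and are not taken from any particular GUI library.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

Event = Tuple[str, int, int]  # hypothetical shape: (kind, x, y)


@dataclass
class Control:
    """A visible rectangle that alerts whoever registered with it."""
    name: str
    left: int
    top: int
    width: int
    height: int
    handlers: List[Callable[[Event], None]] = field(default_factory=list)

    def register(self, handler: Callable[[Event], None]) -> None:
        self.handlers.append(handler)

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)

    def alert(self, event: Event) -> None:
        for handler in self.handlers:
            handler(event)


class Controller:
    """Intercepts all input and dispatches it by physical location."""

    def __init__(self) -> None:
        self.controls: List[Control] = []

    def add(self, control: Control) -> None:
        self.controls.append(control)

    def dispatch(self, event: Event) -> bool:
        _, x, y = event
        for control in self.controls:
            if control.contains(x, y):
                control.alert(event)
                return True
        return False  # no control occupies this position: ignore the input

    def run(self, events: Iterable[Event]) -> None:
        # The "event loop": listen, analyse, dispatch.
        for event in events:
            self.dispatch(event)


received = []
canvas = Control("canvas", left=0, top=0, width=200, height=150)
scroll_bar = Control("scroll-bar", left=200, top=0, width=16, height=150)
canvas.register(lambda e: received.append(("view", e)))          # a view registers
scroll_bar.register(lambda e: received.append(("scroll", e)))

controller = Controller()
controller.add(canvas)
controller.add(scroll_bar)
controller.run([("mouse-down", 25, 112), ("mouse-down", 205, 37)])
```

Note that the controller knows nothing about views or models; it only locates the control under the event and lets the registered handlers take over.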

This software architecture was called MVC for its three principal components - Model, View and Controller - and for the very design decision to encapsulate their functionality in three distinct entities in the first place. Its main objective is to create the illusion of manipulating "the real document." Contrary to what some users may imagine, the software is not looking at us from the other side of the cathode-ray tube, watching the mouse-cursor move over the reverse image of the screen. However, this is exactly the illusion the software designer would like to create! The key point in computer-based graphics is that the program never "sees the screen." What the program does see is a sequence of mouse operations and the locations in which they happened. The program also remembers the locations where it "put things on the screen." Communication with the data model that is hidden deep inside the computer is still held using an old-fashioned command-driven interface, but at the API (Application Programmer Interface) level. The view is so smart that it hides this fact from the user, understanding mouse movements and key presses and translating them to data-model-level commands. If the view is fast enough in refreshing its display to reflect changes in the data-model, then the illusion of dragging and other forms of graphic manipulation will be complete.

To realize that even the least significant user-action requires collaboration among more than one object internally, look at the simple "use-case" scenario of changing the current node.

User

Presses left mouse button over square of node "B".
(See figure 2).

Controller

Detects mouse-down event at coordinate 25:112.
Locates control occupying this position - finds a view's canvas.
Alerts canvas to mouse-down event at 25:112.

A canvas : Control

Alerts view to element-selection event at 25:112.

"Hierarchy" : View

Locates element intersecting 25:112 - finds node "B".
Establishes node "B" as current.
Re-displays canvas. (See figure 3).
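
The view's share of this scenario (locating the element under the point and establishing it as current) may be sketched as follows. This is a hedged illustration only: the layout table of remembered rectangles, the method names (element_at, on_element_selection, redisplay) and the node geometry are assumptions made for the example, not part of any real library.

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[int, int, int, int]  # hypothetical shape: (left, top, width, height)


class HierarchyView:
    def __init__(self, layout: Dict[str, Rect], current: str) -> None:
        self.layout = layout      # where the view remembers it "put things on the screen"
        self.current = current    # the current-element pointer
        self.redisplays = 0

    def element_at(self, x: int, y: int) -> Optional[str]:
        for node, (left, top, width, height) in self.layout.items():
            if left <= x < left + width and top <= y < top + height:
                return node
        return None

    def on_element_selection(self, x: int, y: int) -> None:
        node = self.element_at(x, y)
        if node is None:
            return                # the click hit empty canvas: nothing to select
        self.current = node
        self.redisplay()

    def redisplay(self) -> None:
        # A real view would repaint its canvas, highlighting the current node.
        self.redisplays += 1


view = HierarchyView({"A": (10, 10, 60, 30), "B": (10, 100, 60, 30)}, current="A")
view.on_element_selection(25, 112)   # the canvas alerts the view
```

The data-model is never consulted here, which is why this counts as browsing rather than editing.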

The view in focus orchestrates most of what the user sees, as long as we restrict ourselves to browsing functionality, i.e., operations that leave the data-model intact. Consider, for example, this use-case scenario of scrolling down.

User

Presses left mouse button over vertical scroll-bar.
(See figure 4).

Controller

Detects mouse-down event at coordinate 14:37.
Locates control occupying this position - finds a scroll-bar.
Alerts scroll-bar to mouse-down event at 14:37.

A vertical scroll-bar : Control

Computes relative offset - understands 20% down.
Alerts view to vertical-scroll event of 20% down.

"Decision tree" : View

Re-computes display point of origin.
(Assuming the entire display-data is pre-formatted.)
Displays the visible part of the formatted data in its window.
(See figure 5).
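
The division of labour in this scenario may be sketched as follows: the scroll-bar translates a raw position into a relative offset, and the view translates the offset into a new point of origin over its pre-formatted data. The sketch is hypothetical throughout; the scroll-bar geometry (a bar 185 units tall, starting at the top of the screen), the rounding rule and the method names are assumptions chosen so that the mouse position 14:37 maps to 20% down.

```python
class ScrollBar:
    """A vertical scroll-bar control occupying rows top .. top + height."""

    def __init__(self, top, height, on_scroll):
        self.top = top
        self.height = height
        self.on_scroll = on_scroll      # handler registered by the owning view

    def alert(self, x, y):
        # Translate the raw position into a view-level request.
        fraction = (y - self.top) / self.height
        self.on_scroll(fraction)


class DecisionTreeView:
    def __init__(self, formatted_lines, window_height):
        self.formatted = formatted_lines    # assumed to be pre-formatted
        self.window_height = window_height
        self.origin = 0

    def on_vertical_scroll(self, fraction):
        # Re-compute the display point of origin; no re-formatting takes place.
        last_origin = max(0, len(self.formatted) - self.window_height)
        self.origin = round(fraction * last_origin)

    def visible(self):
        return self.formatted[self.origin:self.origin + self.window_height]


view = DecisionTreeView([f"line {i}" for i in range(100)], window_height=20)
scroll_bar = ScrollBar(top=0, height=185, on_scroll=view.on_vertical_scroll)
scroll_bar.alert(14, 37)                # 37 out of 185 is 20% down
window = view.visible()
```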

Often, the view does not really do the job. User operations that imply modification of the data model are forwarded to it by the view in question, which then sits back and does nothing! So who is responsible for refreshing the display to the correct state? It so happens that all views register with their model, so that whenever the latter changes, it notifies these "observers" of the change - its very originator included.

Often, the view maintains a non-trivial data-structure related to displaying the data-model (e.g., element positions, colors, fonts, the current entry, additional navigation paths, etc.). This internal formatting-related data must be re-constructed with every refresh (unless the view has some strategy of optimizing the refresh - a rather risky business!) Consider the use-case scenario for "node deletion."

User

Presses the "delete" key.
(See figure 6).

Controller

Detects key-down event.
Maps delete-key to "delete" menu command.
Alerts menu of view in focus to "delete" command.

A menu : Control

Alerts view in focus to "delete" request.

"Military Notation" : Data view

Requests the data-model to delete node "D."

Data model

Deletes node "D."
Alerts its three views to refresh.

(Let us concentrate on the originator of the change.)

"Military Notation" : Data view

Retrieves data from model.

Data model

Returns requested data to view.

"Military Notation" : Data view

Re-builds formatted data.
Notices invalid current-pointer - "D."
Establishes next node - "E" - as current.
Displays formatted data on its canvas.
(See figure 7).
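
The deletion scenario may be sketched as follows, including the registration of views with their model and the repair of an invalid current-pointer during refresh. This is a simplified, hypothetical illustration: the model is reduced to a flat list of node names instead of a tree, all three views share one class for brevity, and every method name (register, retrieve, delete, on_delete_request, refresh) is an assumption.

```python
class DataModel:
    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._views = []

    def register(self, view):
        self._views.append(view)

    def retrieve(self):
        return list(self._nodes)

    def delete(self, node):
        self._nodes.remove(node)
        for view in self._views:        # every observer, the originator included
            view.refresh()


class NodeListView:
    def __init__(self, model, current):
        self.model = model
        self.current = current
        self.formatted = []
        self.canvas = []
        self.refresh_count = 0
        model.register(self)

    def on_delete_request(self):
        # The view forwards the request and then "sits back".
        self.model.delete(self.current)

    def refresh(self):
        previous = self.formatted
        nodes = self.model.retrieve()
        if self.current not in nodes:   # invalid current-pointer
            self.current = self._next_valid(previous, nodes)
        self.formatted = nodes          # re-build the formatted data
        self.canvas = [("> " if n == self.current else "  ") + n for n in nodes]
        self.refresh_count += 1

    def _next_valid(self, previous, nodes):
        if self.current in previous:
            for candidate in previous[previous.index(self.current) + 1:]:
                if candidate in nodes:
                    return candidate
        return nodes[-1] if nodes else None


model = DataModel(["A", "B", "C", "D", "E", "F"])
views = [NodeListView(model, "D"), NodeListView(model, "B"), NodeListView(model, "A")]
for v in views:
    v.refresh()                         # views are told to refresh, even the first time
views[0].on_delete_request()
```

Each view keeps its own current element; only the view whose current node was deleted has to move its pointer.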

Opening statements

To begin the structured part, let us summarize all the important facts we know into a concise - but still textual - form. The normal purpose of this "minispec" (miniature specification) is to show the originator of the system (if we are lucky enough to have one present - otherwise, to ourselves) that we have understood the assignment correctly. In case we have missed a point, the semi-precise phrasing should arouse constructive criticism leading to this end. To show that the text is not entirely free-form, nouns that suggest entities of design significance are emphasized in bold type and functionality that involves them is emphasized in italics.

The first item - the domain statement (sometimes called system charter) - is an important part of any design. In the "skill-driven" version, it presents a working model that satisfies the functional requirements above, detailed to a level that is understood by the system initiator. It features essential domain entities and processes, in human language. However, it is the illusion of the domain to be created that is presented, rather than its internal architecture. The latter is detailed by the second item - the statement of architecture. This exposes hidden mechanisms and practices that must be introduced to sustain the above illusion. However, an alternative architecture does not have to affect the domain statement. The third item - the system contract - "sells" it. It expresses the system functionality in terms of design by contract. A few skills of the system together express what its user has actually come for. In return, the system expects the user to have some skills.

Finally, some words of caution: Obviously, this design addresses the general case - any WYSIWYG editor (the tree-model with three views, featured in figure 1, is one example). However, I have found it impossible to be 100% generic about the domain. This account is largely inspired by the Microsoft Windows version of the architecture (e.g., "SDI" and "MDI" models), as manifested in class libraries such as Microsoft's own "MFC" and even cross-platform libraries such as "wxWindows"[5]. In addition, the role of the entities called "controls" (or "widgets") has been somewhat idealized in the present account, to allow for a more elegant architecture than is often found in practice.

Here it comes:

"MVC-based graphic editor" - definition:

The domain: A graphic-editing session commences by either opening a file and displaying the data it contains in a formatted, editable form, or displaying a blank form. The user then appears to manipulate the graphic representation of the data ("WYSIWYG"), by that meaning either browsing the displayed view or manipulating the data behind it. The editor intercepts user input through edit controls, and interprets it, resulting in manipulation of the data. The editor session ends with closing the file, saving the updated data. A data-model may sustain a number of editing sessions at once, in the same or different presentation styles with editing-results synchronized in all of them.

An architecture sustaining it: The functionality involved is distributed among three distinct entities.

Contract:

Post-condition skills:

Pre-condition skills:

"MVC-based graphic editor" - required skills

"While Model/View/Controller may look like an object-model, it is really a functional decomposition. The model stores and manipulates data, the view displays it and understands user's intent about it and the controller gets raw input from user and dispatches it to those who can understand it. The name of this game is "teamwork." In designing teamwork, we are concerned with what must be done and who should take care of it, i.e., with distribution of responsibility. The traditional tools of the procedural paradigm - flowcharts, Pseudocode and the newer "use-case scenarios" are still useful in collecting requirements and in testing the final product, but are of little design value. Actually, according to many authorities (e.g., Meyer [3]) time-sequence-based notations are detrimental to the design. A "skill-driven" design commences by specifying the required skills (functionality featured by team members) and ascertains the selection of skills by plotting the network of dependencies among them.

Here is the complete list of skills required to sustain an MVC application framework, assigned to the five entities involved. The fifth entity - the editor-framework itself - has been taken for granted thus far. It encapsulates the entire system, as is normal practice in object-oriented design. The functionality that it encapsulates - constructing views and models - had to be introduced to remove the skill of initialization from the view itself. The view is always told to refresh - even the first time!

Note that the skills are not described by a process. To prevent a procedural account, we have borrowed the "Design By Contract" (DBC) idiom[3], with a variation. While DBC, in its original context of programming, expresses extreme caution, assuming that each function is doomed to fail unless proven otherwise, design-level DBC is essentially optimistic, being goal-driven. A skill is there for its post-conditions (its goal). Pre-conditions are what must be assumed for it to succeed. Otherwise, the exception must be handled - most of the time, by simply ignoring the request. When the skills are decomposed to program functions, a stricter form of DBC will make itself apparent. In addition to DBC items (pre-conditions, post-conditions, exceptions) we also borrow from event-driven design ("triggering event") and design-pattern terminology ("context," "motivation") - all in the name of clarity and conciseness.
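
To make the optimistic, goal-driven reading of DBC concrete, a skill may be recorded as data rather than as a process. The following Python sketch is an illustration only: the Skill record, its field names and the condition strings are hypothetical and are not taken from the skill-list of this article; only the skill name "view display" and the notions of triggering event, pre-conditions and post-conditions come from the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Skill:
    entity: str                 # who owns the skill
    name: str                   # what is promised
    trigger: str                # the triggering event
    pre: FrozenSet[str]         # what must be assumed for the skill to succeed
    post: FrozenSet[str]        # the goal: what holds once the skill has been exercised

    def perform(self, facts: Set[str]) -> bool:
        """Exercise the skill against a set of facts currently known to hold."""
        if not self.pre <= facts:
            return False        # exception: most of the time, simply ignore
        facts.update(self.post)
        return True


# Hypothetical conditions, phrased only for this example.
view_display = Skill(
    entity="View",
    name="view display",
    trigger="refresh request",
    pre=frozenset({"display data formatted"}),
    post=frozenset({"display data shown"}),
)

facts = set()
first_attempt = view_display.perform(facts)      # pre-condition not met: ignored
facts.add("display data formatted")
second_attempt = view_display.perform(facts)     # goal reached
```

Nothing in the record says how the goal is reached or in what order steps are taken; that information is deliberately left out at this level.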

Editor. The editing-application framework. Note that its only significant functionality is constructing views (and models). Once they exist, they proceed to interact with the user directly.
Invariant: Editor sustains one or more fully-constructed models each of which sustains one or more fully-constructed views.

Features.

View. An edit window.
Invariant: View formatting reflects last known model state.
Features...

Model. An in-memory database whose contents are edited. Features...

Control. A visible object accepting commands for a view or model. Features...

Controller. Display and command infrastructure. Features...

Note that some legitimate skills have been left out of the list, for example, the controller's skill of supplying graphic primitives. This facility is not essential to MVC and may be skipped - the skill-list is too crowded already. Consider this: If the view were to handle painting the screen all by itself, pixel by pixel - would that affect the MVC-ness of the architecture?

In addition, we have taken the liberty of uniting a number of distinct skills into generalized skills, allowing for a higher-level view of the design. E.g., the many skills of the model for update, delete, insert, get, set etc. are united into two distinct skills: data retrieval and data manipulation. They are not united into one skill, e.g., data handling, because the model's client - the view - needs two distinct model functions, one for view editing and one for view browsing.

This overview of the MVC architecture has kept to the basics. Real-life examples often involve additional levels of abstraction suggesting "advanced" skills. Some frequent examples are: data retrieval - producing an iterator to separate view from model internals and structured data manipulation - a transaction protocol with undo capabilities. In addition, one may sometimes meet a persistent view (storing configuration data) and a distributed model (adding a server/client dimension to the design).

Note the title "View Integrity." The infamous "re-entrant loop" inflicted by introducing the "observer" pattern[4] to MVC has kindled some heated semantic debates. Some people would see it as a simple notification issue - i.e., the model does not know what it is doing and is not responsible for what the view (or whoever it may be) does with it. Others would see it as a synchronous procedure call, which places the responsibility for "view integrity" upon the model. I have borrowed from the two camps. The name of the skill [guaranteeing] "view integrity" places some responsibility upon the model; however, it is implemented by the weak bond of event notification. The post-conditions of the skill state the point clearly: The model's responsibility ends when each view is "commencing refresh." From there on, it is their responsibility!

The main justification for DBC over explicit temporal coupling is that the latter may be deduced by matching the pre-conditions of one piece of functionality with the post-conditions of another[3]. The "switchboard chart" in figure 8 does just that. Apart from discarding descriptive details, the skill-list above remains the same. The web of reliances on the left margin of the list exposes explicit supervision: By promising "view refreshment" the view must also promise (and actively supervise) both "view formatting" and "view display" because its post-conditions are the sum of theirs. The web of reliances on the right margin of the list exposes implicit coupling: The post-conditions of "data formatting" are the pre-conditions of "data display." Consequently, data formatting is data-coupled with data display. However, it cannot supervise it because it has ended before data display begins. Such passive coupling reliances expose important information needed to synchronize the processes suggested by the supervision reliance network on the left margin.
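
The matching of pre-conditions against post-conditions is mechanical enough to be sketched in code. The following Python fragment is a hedged illustration, not the procedure used to draw figure 8: the three skill names come from the text, but the condition strings, the use of a proper-subset test for supervision and the rule that supervised pairs are excluded from coupling are all assumptions made for the example.

```python
# Hypothetical pre- and post-conditions for three skills of the view.
skills = {
    "view refreshment": {
        "pre": {"model data available"},
        "post": {"display data formatted", "display data shown"},
    },
    "view formatting": {
        "pre": {"model data available"},
        "post": {"display data formatted"},
    },
    "view display": {
        "pre": {"display data formatted"},
        "post": {"display data shown"},
    },
}


def supervision(skills):
    """Left margin: (a, b) when the post-conditions of a include those of b."""
    return sorted(
        (a, b)
        for a in skills
        for b in skills
        if a != b and skills[b]["post"] < skills[a]["post"]
    )


def coupling(skills):
    """Right margin: (consumer, producer) when the consumer's pre-conditions
    depend on the producer's post-conditions and neither supervises the other."""
    supervised = set(supervision(skills))
    return sorted(
        (consumer, producer)
        for producer in skills
        for consumer in skills
        if producer != consumer
        and skills[producer]["post"] & skills[consumer]["pre"]
        and (producer, consumer) not in supervised
        and (consumer, producer) not in supervised
    )


left_margin = supervision(skills)
right_margin = coupling(skills)
```

Under these assumptions, the left margin holds the two supervision reliances of "view refreshment" and the right margin holds the single data coupling of "view display" on "view formatting", in agreement with the description above.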

The graph has three "physical" entry points: there are three solid arrows leading from the outside. I.e., we are dealing with three processes that must be synchronized by the using application. This is not a single-entrance black-box. We call this an asynchronous skill-list. The many dashed (or pale) arcs represent "logical" reliance - some of the functionality supported by this system is not approached directly. The user "as if" activates some functions but is actually addressing some go-between. Logical reliances are redundant from a programmatic point of view. However, they are essential to demonstrate traceability with the system's functional requirements - which is the main purpose of skill-driven design!

Following is a guided walk through the skill-chart:

1. Skills related to view editing: The user of the editor may order a few functions directly: instruct the editor to construct the view, re-display the view when it regains the focus and make the controller activate controls. The latter operation is not what the user actually has in mind, and the user may even be unaware of it being done this way. The editor uses an indirect, event-driven approach, involving registration of functions and execution dispatch. When the user wishes to instruct the view to edit - and the model to save to disk (object "persistence - out") - he uses the graphic interface and presses the appropriate control. By this, the user thinks he instructs some control to activate some editor function (e.g., instruct a scroll-bar to scroll down the view to which it is attached). Actually, the user is addressing the hidden controller that, in turn, locates the control indicated by the coordinates of the graphic event and activates the function that the specific control registered for the event. The control, in turn, activates the functions registered by its owner for the event - editing by the view and data-persistence by the model.

2. Skills related to data manipulation: All data-model activities that fall under the category of data manipulation are also responsible for view integrity. This means instructing each view to refresh, in its own respective way. When the view is told to refresh, it first "re-formats," i.e., reconstructs its internal display-data-structure, and then displays the result. (It is also possible to display without re-formatting, as in the case of view browsing.) In order to format the data for display, the view must retrieve the complete set of data from the model once again. (Of course, it may optimize the retrieval and somehow restrict it to only the part that has changed, where that is feasible and worth the effort.) A peculiar feature of this architecture is that the view seldom refreshes the display of its own accord. It is instructed to refresh for the first time upon construction (by the editor) and then by the data-model, each time the latter is modified (possibly by this very view). On other occasions (e.g., browse events) the view displays the formatted data but does not normally re-format it for the event!

3. Skills related to view construction: When the editor constructs a view, it first constructs the respective model, if needed. If a file name was specified, the view instructs the model to load the file to its in-memory database (object "persistence - in"); otherwise, the model is left empty but nevertheless in a working state. Then, the view is instructed to refresh (see data manipulation). The fact that the view must be told to refresh, rather than take the initiative and do it as part of its initialization routine, is significant in this event-driven architecture.
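
The view-construction walk-through above may be sketched as follows. This Python fragment is a hypothetical illustration: the class and method names, the injected "read" function that stands in for real file access, and the decision to share one model among all views opened on the same file name are assumptions made for the example.

```python
class Model:
    def __init__(self):
        self.data = []              # empty, but in a working state
        self.load_count = 0

    def load(self, file_name, read):
        self.data = list(read(file_name))   # object "persistence - in"
        self.load_count += 1


class View:
    def __init__(self, model, style):
        self.model = model
        self.style = style
        self.formatted = None       # nothing is formatted until told to refresh
        self.refresh_count = 0

    def open(self, file_name, read):
        self.model.load(file_name, read)

    def refresh(self):
        self.formatted = [f"{self.style}:{item}" for item in self.model.data]
        self.refresh_count += 1


class Editor:
    def __init__(self, read):
        self.read = read            # stand-in for real file access (assumption)
        self.models = []
        self._by_file = {}

    def construct_view(self, style, file_name=None):
        model = self._by_file.get(file_name) if file_name is not None else None
        is_new = model is None
        if is_new:                  # construct the respective model, if needed
            model = Model()
            self.models.append(model)
            if file_name is not None:
                self._by_file[file_name] = model
        view = View(model, style)
        if is_new and file_name is not None:
            view.open(file_name, self.read)
        view.refresh()              # the view is told to refresh, even the first time
        return view


editor = Editor(read=lambda name: ["A", "B"])
v1 = editor.construct_view("hierarchy", "tree.dat")
v2 = editor.construct_view("military", "tree.dat")
v3 = editor.construct_view("hierarchy")          # blank form
```

The constructor of the view deliberately does no formatting; the first refresh arrives from outside, exactly like every later one.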

"MVC-based graphic editor" - the object model

A comprehensive system design should consist of at least two autonomous models: a static model (e.g., an E/R dialect) and one or more dynamic models (e.g., skill-driven, use-case driven, finite state machine, flow-chart). These parallel models furnish specialized views of the system. It is left for the human eye to resolve them into a comprehensive picture. (Attempts at a "unified" one-size-fits-all notation have always failed to deliver, for obvious reasons.) However, in most systems, one view predominates. For example, our present system - MVC-based graphic editor - is profoundly functional. Where functionality rules, other aspects of the design may sometimes be derived from it alone. Actually, some methods - e.g., structured analysis and design - even featured a procedure for doing this in a structured way! Let us see if this is feasible with a skill-driven design.

Let us begin by "summarizing" the reliances in figure 8 on the entity level. Let us leave only one box per entity and unite all skill reliances between any two entities in the same direction. The result (see figure 9) should show us the net reliances on the object-oriented level - i.e., who needs whom. Note that "data" coupling has been suppressed while "event" coupling remains. The former does not suggest permanent visibility. The latter requires further research. The graph in figure 9 has also omitted the reliances from outside the system, since we are now interested only in the associations inside it. Looking at figure 9, the following facts become apparent:

All visibilities in figure 9 suggest permanence. We may expect the view to remember which model it must display and the control to remember which view it must alert. None of these associations suggests, for example, a temporary visibility such as a function argument. Implementing "event" coupling by registering a pointer means reversing the direction of the arrow. From the functional perspective, the view relies upon the control (to alert it to requests). From the object model point of view, however, since the view has registered at the control, the control must point to the view. Whether, in addition, the view also keeps eye contact with its controls is a secondary issue. (In many GUI systems, it is the control's responsibility to stick to the frame, rather than the frame's responsibility to sustain its embedded controls, or at least this is the impression it gives you.) The name of the association - "activates" - preserves the coupling semantics.
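
The summarizing step, and the reversal of the arrow for event coupling, may be sketched as follows. This is a hypothetical illustration and not a transcription of figures 8 and 9: the list of skill-level reliances, the skill names "control activation" and "input dispatch" and the kind labels ("call," "event," "data") are assumptions made for the example.

```python
from collections import defaultdict

# (relying entity, relying skill, relied-upon entity, relied-upon skill, kind)
skill_reliances = [
    ("View", "view editing", "Control", "control activation", "event"),
    ("View", "view formatting", "Model", "data retrieval", "call"),
    ("View", "view editing", "Model", "data manipulation", "call"),
    ("View", "view refreshment", "Model", "view integrity", "event"),
    ("View", "view display", "View", "view formatting", "data"),
    ("Control", "control activation", "Controller", "input dispatch", "event"),
]


def collapse(reliances):
    """One box per entity: unite all reliances between two entities in one direction."""
    net = defaultdict(set)
    for source, _, target, _, kind in reliances:
        if kind == "data" or source == target:
            continue                # data coupling suggests no permanent visibility
        net[(source, target)].add(kind)
    return dict(net)


def pointers(net):
    """Object-model visibility: who must remember (point to) whom."""
    result = set()
    for (source, target), kinds in net.items():
        if "event" in kinds:
            result.add((target, source))    # registration reverses the arrow
        if kinds - {"event"}:
            result.add((source, target))
    return sorted(result)


net = collapse(skill_reliances)
visibility = pointers(net)
```

Under these assumptions the control ends up pointing to the view and the model to its views, while the view points to the model for direct requests.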

Figure 10 reconstructs the result of the analysis in UML-compliant "class diagram" form. In addition, the diagram features some design decisions that have no direct roots in the functional decomposition that has been the subject of our original presentation. The associations from editor to model, view and controller are composition (full ownership semantics, whole/part). The association from view to model is aggregation (pointing without ownership). These containment semantics have been in the background all the time - they have been taken for granted. They do not follow from the functional decomposition. The controller, which has so far been treated as an independent entity, is now presented as part of the editor. The existence of orthogonal notations and design concepts is essential to a comprehensive design. This way, we ensure that each view is not obscured by details which - from its unique point of view - are redundant.

The object model in figure 10 is preliminary. A few more iterations will result in a detailed, more accurate model. One possible such model is expressed in figure 11.

References:

1. Avner Ben, "Entity/Skill/Reliance: Applying Functional Decomposition Safely to Object-Oriented Design," JOOP, October 1998. For a version with up-to-date drawings, see http://www.skilldesign.com/articles/sddintro/sddintro.html.

2. James Rumbaugh, "Modelling models and viewing views: A look at the model-view-controller framework," JOOP, May 1994.

3. Bertrand Meyer, Object-Oriented Software Construction, Second Edition, Prentice Hall PTR, Upper Saddle River, New Jersey, 1997.

4. Gamma, E. et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, Reading, MA, 1995.

5. http://wxwindows.org/.

Figures:

Figure 1. MVC-based graphic editor - an illustration.

Figure 2. "Make current" use-case - before.

Figure 3. "Make current" use-case - after.

Figure 4. "Scroll down" use-case - before.

Figure 5. "Scroll down" use-case - after.

Figure 6. "Delete node" use-case - before.

Figure 7. "Delete node" use-case - after.

Figure 8. The complete skill-chart for a generic MVC-based graphic editor. The skill-list is adorned with "reliance graphs" in its margins. The graph on the left margin details direct reliances where the post-conditions of skill include the post-conditions of another. The graph on the right margin exposes hidden "coupling" reliances where the pre-conditions of one skill depend upon the post-conditions of another, featuring two such cases: reliance on prepared data and reliance on invocation on event.

Figure 9. Beginning of an object model: All skills collapsed and united to rely on entity. Reliances by an anonymous source have been omitted.

Figure 10. Preliminary object model. The information in figure 9, expressed in the "class diagram" medium.

Figure 11. A possible object-model. The "canvas," first mentioned here, is a case of control of special importance. The entity "control owner" has been introduced to allow the control to be generic in a strongly-typed language. Of course, other solutions are possible.
