Yet aside from the features that HTML 5 will bring, the entire point of HTML as a language that brings together formatting and content in one schema has been subverted by the MVC model, which uses HTML only at the very end of the formatting process. That is, when you visit an MVC-style website, you are often looking at HTML, but the HTML itself is being rendered on the fly by a controller program that is also determining what data should be displayed rather than allowing the format and the data to be determined by a human, namely the web developer. When you visit eBay and search for “clogs” and then change from “Sort by time remaining” to “Sort by price”, there isn’t just some guy who pulls up the new auctions and rewrites the HTML really fast for you, but rather an intervening controller takes care of it. Getting the controllers right, making them more efficient, and making them easier to work with is where the real focus of web development is today, and even technologies like Flash that have been associated with views are in fact increasingly taking on the responsibility of controllers by directly accessing databases.
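To make the eBay example concrete, here is a minimal sketch in Python of what such a controller might do. Everything in it is hypothetical: the auction data, the `search`, `render`, and `controller` names, and the sort options are my own illustrations, not eBay's actual code. The point is only that the HTML appears at the very last step, after the controller has already decided which data to show and in what order.

```python
from html import escape

# Hypothetical auction data standing in for the model
# (a real site would query a database instead).
AUCTIONS = [
    {"title": "Wooden clogs", "price": 24.50, "minutes_left": 90},
    {"title": "Leather clogs", "price": 12.00, "minutes_left": 240},
    {"title": "Garden clogs", "price": 18.75, "minutes_left": 15},
]

# Maps the visitor's sort choice to a field of the data (assumed names).
SORT_KEYS = {"time": "minutes_left", "price": "price"}


def search(query):
    """Model: return the auctions whose title contains the query."""
    return [a for a in AUCTIONS if query.lower() in a["title"].lower()]


def render(auctions):
    """View: turn the chosen rows into HTML, only at the very end."""
    items = "\n".join(
        f"  <li>{escape(a['title'])}: ${a['price']:.2f}</li>" for a in auctions
    )
    return f"<ul>\n{items}\n</ul>"


def controller(query, sort_by="time"):
    """Controller: pick the data and its order, then hand off to the view."""
    results = sorted(search(query), key=lambda a: a[SORT_KEYS[sort_by]])
    return render(results)


if __name__ == "__main__":
    # Switching from "Sort by time remaining" to "Sort by price"
    # is just a different argument; no human rewrites the page.
    print(controller("clogs", sort_by="price"))
```

Nobody hand-writes the list that comes out; changing the sort order only changes an argument to `controller`, and the markup is regenerated from the data each time.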
The net result is that the real losers of the past 15 years haven’t been companies like Forrester, which have been (or should have been) benefiting tremendously from the development of controllers that allow direct access to abstracted data, but rather the amateurs who built the web from pure static HTML. We can blame them for populating Geocities, Tripod, and Angelfire in the mid-90s with those horribly designed monstrosities, but it was also they who were the first to use the web as a tool for expressing themselves digitally rather than as a tool for replicating their pre-written or pre-thought expressions in a digital manner. That is, as they wrote HTML, they were also writing the very content that they wanted to format in one fell swoop.
As I write this article, I’m typing into a WYSIWYG text box that renders my writing cleanly in a TrueType font with proper indentation, kerning, and italics, leaving me to focus on the raw, abstracted content rather than its format. Once it is published, the world will be able to read what I have written, but what they see will only partly be my own personal expression. Half of it will be this blog’s theme, which has determined the format for me through this blog’s content management system’s controller. The result is something elegant, professionally designed, and clean, but not completely expressive of what I had to say, because I had only limited input into the formatting. Those who write in raw static HTML, on the other hand, can manipulate formatting to say what they mean with much greater control. Hence, this page’s background color, which melds with the yellowish beige of the map to suggest that the website itself is like an ancient chart of Greece, or the intricate if gaudy buttons of this page, the gold sparkles racing around a sumptuous cracked ruby-red marble background suggesting the author’s admiration of the richness and splendor of Arabic calligraphy. I could replicate the same effects on my WordPress blog by editing the theme and its CSS, but it would be a separate act from writing the content and not a representation of my thoughts and emotions during the moment of creating the web page. This is why an old Geocities page used to look so hideous: its formatting was the impulsive expression of the author in situ. That is the kind of unity, between content and format through a continuous moment of expression, that the Web has lost.
The separation of content creation and formatting has deep ramifications for the Internet and computing in general that go beyond merely how easy it is to create a website from scratch. Every profession has a school of practitioners who believe in the unity of strategy with tactics, or the coalescence of the broad strokes with the nitty-gritty that fills in the borders. Architects like Louis Sullivan and Frank Lloyd Wright were famous not only for planning out their buildings elegantly, but also for designing the details that filled in the space they envisioned, down to the doorknobs and the chairs. Cabinet makers today will still choose their wood by hand to select the grain that best expresses their vision rather than just accept homogeneous pressed wood made from pulp composite. These artists are the craftsmen of their trade, and their breathtaking works depend on their ability to create content and format through a magically unified act of imagination and labor. This is unity in a very deep sense: a craftsman’s act isn’t just a bunch of careful strokes and moments of thoughtful repose strung together; it is a dramatic narrative of those individual time spans brought together by a meta-act, which is the unified vision-creation. Perhaps not all goods need to be produced this way, but craftsmen have their place and they should be respected; without them, we wouldn’t have Fallingwater, Chippendale furniture, or any of the other marvelous contributions to the world that craftsmen have made over the millennia.
Internet craftsmen, however, are a dying breed because the tools being developed are increasingly hostile to creating format and content in one motion. Even among designers who want to create static HTML pages, the introduction of CSS as the solution for overcoming the limitations of HTML has effectively turned each page into a scientific experiment instead of an artwork: controlling for certain variables gives different widths, twiddling with other figures gives different margins, and so on, rather than just writing and designing the page at once as you would with pen and paper. Moreover, the impossibility of crafting sites reflects a much broader trend since the spread of broadband toward estranging people’s content from not only their formatting but also their tools in general. Gone are the days when users would set up their own web hosts on their PCs so that people could download their content directly. Gone are the days, too, of the democratic ideal when each IP address represented a single computer on the Web, replaced instead by NAT (Network Address Translation) and home routers; it would seem ridiculous today (at least, with IPv4) for each device to have its own IP address, yet before the Internet, content was so unified with its tool that people would literally “dial in” to each other’s computers to access it.
Now, I’m not advocating that we return to telnet, and generally the separation of content creation and formatting makes sense for enterprise-scale web applications like Twitter or Facebook where it would be insane for each user to hand-code HTML updates, or for the companies to manage such a mess. Hence the MVC paradigm, which allows for an easily maintainable code base. But is it the only way possible? If there were an easier way to code a web page, one that would abstract data while also allowing us to create format and content in one swoop, wouldn’t that be preferable? Users would have much greater flexibility in expressing themselves while programmers would maintain the cleanliness of the MVC model that so effectively separates data from its implementations.
No such solution seems to exist yet, either online or offline. The lack of the former may well be a result of the lack of the latter, and if text editors had been able to capture formatting more advanced than EOL characters then perhaps we wouldn’t need HTML, which is ultimately an imperfect solution. Yet text editors reflect the hardware capabilities of most PCs, and there is a deep divide in this regard between the keyboard, generally used to manipulate pure abstract data, and the mouse, which usually controls and interacts with format. Even on keyboards themselves, users operate the Num Lock key to switch between the keypad’s spatial navigation and number input modes, further dividing the two functions. Perhaps the last time that hardware unified content creation and formatting was the Altair 8800, when switches gave a visual confirmation of the binary opcode being entered while simultaneously serving to enter the opcode.
Are there any hardware solutions in sight? Touchscreens offer some hope, but we haven’t found the right algorithms to extract text from digitized handwriting reliably, nor is that desirable when typing is so much quicker. We’ve also experimented with trackballs, knobs, punch cards, facial and voice recognition, and other methods, but no solution has allowed users to unify content creation and formatting in a single act while also abstracting meaningful data efficiently. The solution, some argue, is better algorithms, but I would suggest that the answer lies in designing hardware that is more true to the digital soul, or more precisely the computeral soul, which is that inexpressible feeling one has when using a computer that distinguishes and unifies all experiences with computers. This means thinking beyond writing or virtual reality into new, uninvented, and undiscovered realms of data interaction.
The questions to be asked are “What is the essence of using a computer, precisely?”, and “What tools are best adapted to it?” If the physical act of writing is the shaping of glyphs on a surface, what could be more natural than a pen and a page? Yet this solution was not obvious to the first humans who began making signs on cave walls with spit and charcoal, and we easily forget how many thousands of years it took for the tools that define writing today to evolve. Like those cave men and their signs, we are still in the infancy of using computers and we hardly know what to make of them. It is significant, for example, that nearly all of our computeral interactions take place through dissonant paradigms and incongruous metaphors that unsurprisingly engender countless debates over usability and ergonomics. Yet a painter can only create a new work with the colors on her palette, and we can only imagine a digital world with the tools and formats that we know. Is it any wonder that a cave man who is accustomed to communicating orally would draw maps of hunting grounds with his saliva, or that today we would write documents in typewriter emulators or marry each other in pixelated 3D chapels? Are any of those the best way to draw a map, write a letter, or marry? More specifically: are the last two examples the best ways that we can imagine to communicate to a friend computerally or to feel the emotions of marriage computerally?
Moving beyond such paradigms and metaphors is the job of the artist and the scholar, who in this new age are necessarily programmers, as well. Computer design will remain in the hands of usability experts so long as we assume that we must adapt today’s paradigms to better fit a fixed and universal set of human behaviors. The future of computer systems, however, will depend on those who can tap into their inexpressible soul and thereby reinvent that essence and its manifestation, or in this case the unified act of creating content and formatting while abstracting meaningful text on the one hand and the tool we use to execute that act (e.g., HTML, word processor) on the other, according to their own vision. Then, and only then, will computer system design take its rightful place among the arts and enter the great cycle of artistic and spiritual reinvention and redefinition, a call that beckons each generation anew.
About the author:
Ersin Akinci is a history PhD candidate at the University of Illinois at Urbana-Champaign (home of the usable web browser) studying medieval and early modern intellectual history and the history of information technologies and formats. He is interested in the methods and tools of scholarship, and specifically how the humanities can better help us gain insights into the human condition. When he’s not engrossed in obscure fifteenth-century theological debates, he likes to write on his blog, What Digital Revolution?