File: Document

(2014-11-09)

Readings

  1. Raggett, Dave, ed. “Introduction to The World Wide Web.” Raggett on HTML 4. 2nd ed. Harlow, England ; Reading, Mass: Addison-Wesley, 1998. Web.
  2. ---, ed. “A History of HTML.” Raggett on HTML 4. 2nd ed. Harlow, England ; Reading, Mass: Addison-Wesley, 1998. Web.
  3. “Transclusion: Fixing Electronic Literature.” Jan. 2007. Web.
  4. Anderson, Tim. “Introducing XML, Its History, What It Is, Its Significance.” Tim Anderson’s ITWriting 2004. Web.
  5. Buckland, Michael K. “What Is a ‘Document’?” Journal of the American Society for Information Science 48.9 (1997): 804–809. Web.
  6. Goldfarb, Charles F. “The Roots of SGML: A Personal Recollection.” Charles F. Goldfarb’s SGML Source Home Page 1996. Web.
  7. Levy, David M. “Fixed Or Fluid?: Document Stability And New Media.” ACM Press, 1994. 24–31. Web.
  8. Watson, Dennis G. Brief History of Document Markup. Agricultural and Biological Engineering Department, Florida Cooperative Extension Service, Institute of Food and Agricultural Sciences, University of Florida, 1992. Web.

File as document

  • A digital file meets the most basic definition of document: a written representation
  • In consumer computing, a document usually describes a file containing text data

Plain text

Formatted or “rich” or “styled” text

  • WYSIWYG (“what you see is what you get”)
  • Markup is hidden
  • Content and form are combined, but not easily distinguished

Text editors vs. word processors

  • Text editors
  • Desktop word processors
    • Plain text interfaces (before 1984-1985)
      • MS-DOS operating system, 1981
      • Plain text with formatting characters added to text and visible to user
        • Electric Pencil, 1976: illustration
        • WordStar, 1978
        • WordPerfect, 1980
        • Microsoft Word, 1983
    • Graphical user interfaces
      • Mac OS operating system, 1984
      • Microsoft Windows operating system, 1985
      • Rich text with formatting codes added to text and hidden from user
      • WordPerfect “reveal codes” feature
      • WYSIWYG: screen display mimics, and produces, printed document

Markup

  • Processing markup
    • Producing printed or printable documents
  • Semantic markup
    • Describing the structure of documents
  • Presentation markup
    • Presenting documents on a screen

Markup history: typesetting

Markup history: text processing

  • roff-type utilities
    • TYPSET and RUNOFF on MIT Compatible Time-Sharing System (CTSS), 1964
    • runoff on MIT Multiplexed Information and Computing Service (Multics), after 1964
    • roff on Bell Labs UNiplexed Information and Computing Service (UNICS, then UNIX/Unix), after 1970
    • nroff (“new roff”) on Unix, after 1973, for line printers and terminals
    • troff (“typesetter roff”) on Unix, after 1973, for phototypesetters
  • TeX (1978) and LaTeX (1984)
    • Designed for typesetting mathematics, among other functions
    • writeLaTeX

Markup history: semantics

  • IBM Generalized Markup Language (GML) (Goldfarb)

      :h1.Chapter 1:  Introduction
      :p.GML supported hierarchical containers, such as
      :ol
      :li.Ordered lists (like this one),
      :li.Unordered lists, and
      :li.Definition lists
      :eol.
      as well as simple structures.
      :p.Markup minimization (later generalized and formalized in SGML),
      allowed the end-tags to be omitted for the "h1" and "p" elements.
    
  • SGML (Standard Generalized Markup Language)

      <h1>Chapter 1: Introduction</h1>
      <p>GML supported hierarchical containers, such as
      <ol>
      <li>Ordered lists (like this one),
      <li>Unordered lists, and
      <li>Definition lists
      </ol>
      as well as simple structures.
      <p>Markup minimization (later generalized and formalized in SGML),
      allowed the end-tags to be omitted for the "h1" and "p" elements.
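
The line-by-line correspondence between the GML sample and the SGML sample above can be sketched in a few lines of JavaScript. This is only an illustration, assuming the simplified GML subset shown in the sample: a tag is a colon, a name, and an optional period at the start of a line, and `:eol.` closes an ordered list. The function name `gmlToSgml` and the `CONTAINERS` list are hypothetical; real GML processors handled far more than this.

```javascript
// Hypothetical list of container elements whose GML end tag is ":eNAME."
const CONTAINERS = ["ol", "ul", "dl"];

// Hypothetical helper: converts the small GML subset used in the sample
// into SGML-style tags. Not a real GML processor.
function gmlToSgml(gml) {
  return gml
    .split("\n")
    .map((rawLine) => {
      const line = rawLine.trim();
      // Match ":name." (or ":name" alone) followed by optional text.
      const match = /^:([a-z][a-z0-9]*)\.?(.*)$/i.exec(line);
      if (!match) {
        return line; // continuation text: leave unchanged
      }
      const name = match[1].toLowerCase();
      const text = match[2].trim();
      // ":eol." style end tags become "</ol>".
      if (name.charAt(0) === "e" && CONTAINERS.includes(name.slice(1))) {
        return "</" + name.slice(1) + ">";
      }
      // Start tag; end tags for elements such as "p" and "li" are omitted,
      // as markup minimization allows.
      return "<" + name + ">" + text;
    })
    .join("\n");
}

console.log(gmlToSgml(":p.Lists:\n:ol\n:li.First\n:li.Second\n:eol.\nmore text"));
```

The sketch never emits `</p>` or `</li>`, which mirrors the omitted end tags that the sample text describes.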
    

Markup history: text presentation

  • HTML (HyperText Markup Language)

      <h1>Chapter 1: Introduction</h1>
      <p>GML supported hierarchical containers, such as
      <ol>
      <li>Ordered lists (like this one),
      <li>Unordered lists, and
      <li>Definition lists
      </ol>
      as well as simple structures.
      <p>Markup minimization (later generalized and formalized in SGML),
      allowed the end-tags to be omitted for the "h1" and "p" elements.
    
  • “…many desktop publishing methods were in vogue: SGML, Interleaf, LaTeX, Microsoft Word, and Troff among many others. Commercial hypertext packages were computer-specific and could not easily take text from other sources; besides, they were far too complicated and involved tedious compiling of text into internal formats to create the final hypertext system… What was needed was something very simple, at least in the beginning” (Raggett)
  • “The HTML that Tim [Berners-Lee] invented was strongly based on SGML (Standard Generalized Mark-up Language), an internationally agreed upon method for marking up text into structural units such as paragraphs, headings, list items and so on… What SGML does not include, of course, are hypertext links: the idea of using the anchor element with the HREF attribute was purely Tim [Berners-Lee]’s invention” (Raggett)
  • View HTML source of this page
  • Tryit Editor v2.0

Structure vs. presentation

  • Structural elements of a document
    • Section breaks with their headings
    • Paragraph boundaries
    • Less variable than presentation elements of a text
  • Presentation elements of a document
    • Typeface color
    • Typeface size
    • Typeface: roman, italic
    • Type weight: regular, bold
    • Other forms of emphasis: underscores, spacing of letters
    • Vertical spacing between structural elements like section breaks, paragraph boundaries
    • One might use italics, bold weight, underlining, or letter spacing to emphasize a particular word when presenting a text in different contexts and media (print, screen), all without changing any structural element of the text
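
The separation described above can be illustrated with a minimal JavaScript sketch: one structural description of a document, presented in two different ways. The names `doc`, `renderPlainText`, and `renderHtml` are hypothetical, and the sketch assumes the text contains no characters that would need escaping in HTML.

```javascript
// A document described only by its structure: no typefaces, sizes, or colors.
const doc = [
  { type: "heading", text: "Chapter 1: Introduction" },
  { type: "paragraph", text: "Structure stays the same." },
];

// Presentation 1: plain text, for example on a terminal.
// Headings are upper-cased and underlined with "=".
function renderPlainText(elements) {
  return elements
    .map((el) =>
      el.type === "heading"
        ? el.text.toUpperCase() + "\n" + "=".repeat(el.text.length)
        : el.text
    )
    .join("\n\n");
}

// Presentation 2: HTML for a browser. Visual styling would be left to CSS.
// Assumption: no HTML escaping is performed on the text.
function renderHtml(elements) {
  return elements
    .map((el) =>
      el.type === "heading"
        ? "<h1>" + el.text + "</h1>"
        : "<p>" + el.text + "</p>"
    )
    .join("\n");
}

console.log(renderPlainText(doc));
console.log(renderHtml(doc));
```

Both renderers read the same `doc` array; only the presentation changes.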

Today: the front-end Web development “stack”

  • HTML + CSS
    • CSS (Cascading Style Sheets): a style language developed for use with HTML
    • Internal style sheets: <style></style>
    • External style sheets: <link rel="stylesheet" type="text/css" href="mystyle.css">
    • CSSDesk - Online CSS Sandbox
  • HTML + CSS + JavaScript
    • JavaScript is a lightweight, special-purpose programming language
    • Developed along with HTML and CSS for Web sites
    • Every major browser includes its own JavaScript implementation, so nothing extra needs to be installed
      • Safari: Develop > Show Error Console
      • Chrome: View > Developer > JavaScript Console
    • Use inside document with HTML markup viewed in browser
      • <script></script>
    • Simple JavaScript functions
      • alert("Hello World!");
      • document.write("Hello World!");
    • Dabblet
  • Client-side (browser) vs. server-side scripting
  • PHP (server-side)
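
The `alert` and `document.write` calls listed above exist only inside a browser. As a minimal sketch, the following script builds the same greeting with two hypothetical functions, `greeting` and `greetingHtml`, and then checks whether a browser `document` object is available before deciding how to display it. Outside a browser (for example in Node.js, which is an assumption here and not part of the outline) it falls back to `console.log`.

```javascript
// Hypothetical function for illustration: builds a greeting string.
function greeting(name) {
  return "Hello " + name + "!";
}

// Hypothetical function: wraps the greeting in an HTML paragraph element.
function greetingHtml(name) {
  return "<p>" + greeting(name) + "</p>";
}

if (typeof document !== "undefined") {
  // Client side: running inside a <script></script> element in a browser.
  document.write(greetingHtml("World"));
} else {
  // Outside a browser: print to the console instead.
  console.log(greeting("World"));
}
```

To try the client-side branch, place the script between `<script>` and `</script>` in an HTML file and open that file in a browser.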