INFO 320

Post-Web Information Systems


Background reading: MSDN XML Library or download

The World before the World Wide Web: (Prior to early 1990s)
  • Information elites determined form and meaning of information (e.g., librarians)
  • Information oligarchs vended access to information (e.g. Dialog Corporation)
  • Use of fairly rigid record structures (e.g., MARC records)
  • Databases grow "slowly", records are "permanent".
  • Use of "one" source of meaning (e.g., the Library of Congress Subject Headings)
  • Use of "one" classification of meaning (e.g., the Library of Congress Classification)
  • Computers act as big indexes, helping us find paper resources (e.g., Library catalog at the University of Washington)
The Internet/World Wide Web: Yet another information technology revolution!
The World after the World Wide Web: (1990 to 1998)
  • Anyone can publish a web page. The conflation of the author and publisher?
  • Anyone can indicate the meaning of his or her web page. Meaning is freed from the control of elites. Your page is about what you say it is about.
  • Information escapes personal control, escapes organizations, spans borders, etc. Multi-cultural, multi-language, multi-ethnic, etc.?
  • Web content churn. Web pages tend to be ephemeral.
  • Web content is "full-text"
  • Web content can be the product of scripts, database reads, streaming video/audio, cookies, caching, etc.
1998 and following...

Second Generation Web

The World Wide Web Consortium (W3C) develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding.
Class investigation assignment: Crawl all over the W3C site
  • Extensible Markup Language, XML:

    XML is a set of rules, guidelines, conventions, whatever you want to call them, for designing text formats for such data, in a way that produces files that are easy to generate and read (by a computer), that are unambiguous, and that avoid common pitfalls, such as lack of extensibility, lack of support for internationalization/localization, and platform-dependency.
    Documents that declare their meaning? Deja Vu: What is the relationship between syntax and semantics?


  • XLink

    This specification defines the XML Linking Language (XLink), which allows elements to be inserted into XML documents in order to create and describe links between resources.
    Linkages among documents that declare their meaning?


  • XML Query

    The mission of the XML Query working group is to provide flexible query facilities to extract data from real and virtual documents on the Web, therefore finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases.
    A structured query language to use on XML documents? The Web as a database?


  • XML Schemas

    XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide a means for defining the structure, content and semantics of XML documents.
    Common, shared document descriptions? Community implications? The Web is a community? Is there a policeman?

The Extensible Markup Language (XML)

  • A simple XML information source. The source.
  • A bad example of naming XML elements. The source.
  • Using XML attributes. The source.

    Class discussion: Does the MARC record have "elements" and "attributes"?

  • Advantages of using Extensible Markup Language (XML)
  • XML files are simply text files, and XML itself is only a text string. Because they are only structured text strings, they are ideal for communication between two components or two systems that were never designed to communicate with each other.

    An Example of XML
    <?xml version="1.0"?>
    <colors>
       <paint number="123">
             <name>Scarlet</name>
             <hue>opaque</hue>
             <price>12.99</price>
      </paint>
       <paint number="456">
             <name>Ultramarine</name>
             <hue>transparent</hue>
             <price>8.75</price>
      </paint>	 
    </colors>
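
Because the example is only structured text, any program with an XML parser can read it. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (neither is part of the course materials); the function name read_paints is made up for illustration. It shows that the number attribute and the child elements are reached in different ways.

```python
import xml.etree.ElementTree as ET

# The paint example from above, held as an ordinary text string.
COLORS_XML = """<?xml version="1.0"?>
<colors>
   <paint number="123">
      <name>Scarlet</name>
      <hue>opaque</hue>
      <price>12.99</price>
   </paint>
   <paint number="456">
      <name>Ultramarine</name>
      <hue>transparent</hue>
      <price>8.75</price>
   </paint>
</colors>"""


def read_paints(xml_text):
    """Return one dict per <paint> element in the document (illustrative helper)."""
    root = ET.fromstring(xml_text)
    paints = []
    for paint in root.findall("paint"):
        paints.append({
            "number": paint.get("number"),        # attribute on <paint>
            "name": paint.findtext("name"),       # text of a child element
            "hue": paint.findtext("hue"),
            "price": float(paint.findtext("price")),
        })
    return paints


if __name__ == "__main__":
    for entry in read_paints(COLORS_XML):
        print(entry["number"], entry["name"], entry["hue"], entry["price"])
```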

    Frequently when two systems are being designed to communicate, which is increasingly the case in this Internet era, programmers face the sticky issue of how to manage COM objects created on separate machines. Obviously, moving COM objects, classes and ActiveX controls around the Internet is not ideal. Moving the data and “state” (the values and content) is a much better approach in terms of maintenance and speed. With this in mind, how do you structure your data? Do you create your own custom format in which your distributed objects communicate, or do you use XML so that many other systems can participate in the future? XML is the perfect medium in which to describe the state of these objects for communication to other systems or objects. In a disconnected medium such as the Internet, synchronous communication (where two systems are synchronized in their operations) is not easy to achieve. Therefore there is a huge need for asynchronous messaging, and even for indicating the “state” of that message.

    Imagine this scenario: Two products from two different vendors have different custom data formats for their purchase orders - therefore they are unable to exchange their data. But if their developers use the XML approach to structuring purchase order data, the two systems may be able to exchange data – without overhauling their existing systems.
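
The purchase order scenario can be sketched as two functions that share nothing except the XML text passed between them. This is only an illustration in Python: the element and attribute names (purchaseOrder, item, sku, quantity) and both function names are assumptions invented for the sketch, not part of any real vendor format.

```python
import xml.etree.ElementTree as ET


def vendor_a_export(order):
    """Vendor A: turn its internal dict into an XML text string."""
    root = ET.Element("purchaseOrder", {"id": order["id"]})
    for sku, quantity in order["lines"]:
        ET.SubElement(root, "item", {"sku": sku, "quantity": str(quantity)})
    return ET.tostring(root, encoding="unicode")


def vendor_b_import(xml_text):
    """Vendor B: read the same text into its own tuple-based structure."""
    root = ET.fromstring(xml_text)
    lines = [
        (item.get("sku"), int(item.get("quantity")))
        for item in root.findall("item")
    ]
    return root.get("id"), lines


if __name__ == "__main__":
    message = vendor_a_export({"id": "PO-1", "lines": [("A100", 2), ("B200", 5)]})
    print(message)                    # the only thing the two systems share
    print(vendor_b_import(message))
```

Neither function knows how the other stores an order internally; agreeing on the XML structure is enough.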

    XML can contain links to other files. This is useful because an XML document can link to an XML file called a “schema”, which holds a description of the correct XML structure.

    The XSL file (eXtensible Stylesheet Language) holds the details on how to appropriately display an XML file. For example, if I send you an XML file with several hundred lines of numbers in it, the same XML data/message can be displayed as an email, a Word document, as a table containing data or even as a graph or simply as text. All I have to do is provide the correct XSL file for each choice.

    This is most useful because the XML file itself is only the data, and the “look and feel” is kept separately in the XSL file and only applied according to the context it is in and according to the data it is displaying.
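
The separation of data from presentation can be imitated in a few lines. Note the assumption: Python's standard library has no XSL processor, so this sketch does not run XSL at all. Two ordinary functions stand in for two stylesheets, and the readings/value element names are invented for the example. The point is only that the XML data stays the same while the presentation is chosen separately.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data file: just numbers, no display information.
DATA_XML = "<readings><value>3</value><value>7</value><value>5</value></readings>"


def render_text(root):
    """Stand-in for stylesheet 1: one comma-separated line of plain text."""
    return ", ".join(value.text for value in root.findall("value"))


def render_html_table(root):
    """Stand-in for stylesheet 2: an HTML table with one row per value."""
    rows = "".join(
        "<tr><td>{}</td></tr>".format(escape(value.text))
        for value in root.findall("value")
    )
    return "<table>{}</table>".format(rows)


def present(xml_text, renderer):
    """Apply the chosen presentation to the unchanged XML data."""
    return renderer(ET.fromstring(xml_text))


if __name__ == "__main__":
    print(present(DATA_XML, render_text))
    print(present(DATA_XML, render_html_table))
```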

    Consortia promoting extensible information technologies
    BizTalk: A Microsoft-backed consortium for the development and distribution of the BizTalk flavor of business-oriented XML schemas
    CommerceNet: Defines specifications to facilitate the interoperability of information and integration of content and services across and between vertical markets
    FinXML: A consortium supporting the creation and management of the FinXML language for the integration and exchange of digital information in capital markets
    Organization for the Advancement of Structured Information Standards: Nonprofit, international consortium of companies and organizations dedicated to accelerating the adoption of product-independent formats based on public standards
    RosettaNet: Standardizes the mechanisms used to define the business processes of vertical markets

    Public market for XML schemas: XML.org Registry by Oasis

    Some Examples of XML schemas

    Assignment Four

    Defining a data source

    Metadata has come to mean a complete record, including encoding, that describes and stands in place of a larger document or a collection in a bibliographic tool...Metadata has been defined in many places as 'data about data.'...There are three parts to creating metadata for an information package: 1) encoding, 2) providing a description of the information package along with other information necessary for management and preservation of the package, and 3) providing for access to this description. Arlene G. Taylor, The Organization of Information, p. 77

    24 October 2000:
    W3C is pleased to announce advancement of the XML Schema language to Candidate Recommendation status.
    Press Release
    The situation right now: XML's Grand Schema


    Metadata - Structure: XML schemas

    What is an XML Schema? An XML schema defines what counts as a valid XML document. For example: Is this XML document a valid order form? The XML parser uses the XML schema to check whether the XML document is marked up in a valid way. This means that the schema defines the structure of the document: which elements are child elements of others, the sequence in which the child elements can appear, and the number of child elements.





    The XML schema is itself written in XML and allows you to specify an element as an integer, a float, a boolean, a URL, etc. The schema defines whether an element is empty or can include text, and can also define default values for attributes.
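
To make the kinds of rules a schema expresses concrete, here is a hand-written check in Python. It is not an XML Schema processor and does not read schema files; it only mimics three rules a schema could state for a paint element: which children are allowed, in what order, and what data type each holds. The rule table PAINT_RULES and the function check_paint are hypothetical names chosen for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a <paint> element: child names in their required
# order, each paired with a function that converts (and so checks) the text.
PAINT_RULES = [("name", str), ("hue", str), ("price", float)]


def check_paint(paint):
    """Return a list of problems; an empty list means the element passed."""
    problems = []
    children = list(paint)
    expected = [name for name, _ in PAINT_RULES]
    # Structure rule: exactly these children, in exactly this sequence.
    if [child.tag for child in children] != expected:
        problems.append("children must be exactly {} in that order".format(expected))
        return problems
    # Data type rule: each child's text must convert to the declared type.
    for child, (name, convert) in zip(children, PAINT_RULES):
        try:
            convert(child.text or "")
        except ValueError:
            problems.append("<{}> has a bad value: {!r}".format(name, child.text))
    return problems


if __name__ == "__main__":
    sample = ET.fromstring(
        "<paint><name>Scarlet</name><hue>opaque</hue><price>cheap</price></paint>"
    )
    print(check_paint(sample))
```

A real schema states such rules declaratively in XML, so that any validating parser can enforce them without custom code like this.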

    You point your browser at an XML document. The XML parser in the browser checks the document against the XML schema. If it is valid, then the browser uses the stylesheet to present the document to the end user (i.e., you).



    XSD Schema Generator 
    Article: Posted 7/3/2001 
    Source: XML for ASP.NET 
    The XSD Schema Generator takes XML data as input and generates a fully compliant W3C XSD Schema. It is completely web-based.
    XSD Schema Generator

    Metadata - Knowledge representation: XML attributes
    <HTML>
    <HEAD>
    <TITLE>Mating Habits of the Northern Hairy Nosed Wombat</TITLE>
    <META NAME= "DC.Creator" CONTENT="Smythe, Pearl">
    </HEAD>
    <BODY>
    <H1>Northern Hairy Nosed Wombats</H1>
    <P>The Northern Hairy Nosed Wombat is an animal native to Australia....</P>
    </BODY>
    </HTML>
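
Metadata stored in attributes like this is easy for a program to collect. A minimal sketch, assuming Python's standard html.parser module; the class name MetaCollector and the function read_meta are invented for the example. It gathers every NAME/CONTENT pair from the META tags of a page.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect NAME/CONTENT pairs from <META> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "meta":
            fields = dict(attrs)
            if "name" in fields and "content" in fields:
                self.meta[fields["name"]] = fields["content"]


def read_meta(html_text):
    """Return a dict mapping each META name to its content."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return collector.meta


if __name__ == "__main__":
    page = (
        '<HTML><HEAD><TITLE>Wombats</TITLE>'
        '<META NAME="DC.Creator" CONTENT="Smythe, Pearl">'
        '</HEAD><BODY></BODY></HTML>'
    )
    print(read_meta(page))
```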
    


    For class discussion:
    Reading three: The semantic web (?!)

    Extensible Stylesheet Language

    The W3C DOM is a platform- and language-neutral interface that permits script to access and update the content, structure, and style of a document. The W3C DOM includes a model for how a standard set of objects representing HTML and Extensible Markup Language (XML) documents are combined, and an interface for accessing and manipulating them. Web authors can use the W3C DOM interface in Internet Explorer 5 and later to take advantage of this dynamic model.
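
The same DOM interface is available outside the browser. As a sketch only, the following uses Python's standard xml.dom.minidom module rather than Internet Explorer script; the sample document and the function name rename_first_paint are assumptions made for illustration. It accesses a node, updates its content, and adds an attribute, all through DOM calls.

```python
from xml.dom.minidom import parseString

SOURCE = "<colors><paint number='123'><name>Scarlet</name></paint></colors>"


def rename_first_paint(xml_text, new_name):
    """Read and update a document through DOM calls; return the new XML text."""
    document = parseString(xml_text)
    # Access: find the first <name> element and the text node inside it.
    name_element = document.getElementsByTagName("name")[0]
    text_node = name_element.firstChild
    # Update: change the text and add an attribute to the parent <paint>.
    text_node.data = new_name
    name_element.parentNode.setAttribute("renamed", "yes")
    return document.documentElement.toxml()


if __name__ == "__main__":
    print(rename_first_paint(SOURCE, "Crimson"))
```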

    Linking to an Extensible Stylesheet Language (XSL) styling source




    XPath Examples (Version 3 parser)

    MSDN XPath Documentation
    Some XPath notes Version 2 Parser Examples
    XPath Expression Builder Tool
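
A few XPath expressions can be tried without the MSXML parser. This sketch assumes Python's xml.etree.ElementTree, which supports only a limited subset of XPath, so it is not a substitute for the MSXML examples linked above. The sample data is a shortened version of the paint document.

```python
import xml.etree.ElementTree as ET

COLORS = ET.fromstring(
    "<colors>"
    "<paint number='123'><name>Scarlet</name><hue>opaque</hue></paint>"
    "<paint number='456'><name>Ultramarine</name><hue>transparent</hue></paint>"
    "</colors>"
)

# All <name> elements anywhere below the root.
all_names = [element.text for element in COLORS.findall(".//name")]

# The <name> of the paint whose number attribute is 456.
by_attribute = COLORS.findtext("./paint[@number='456']/name")

# Paints that have a <hue> child whose text is exactly 'opaque'.
opaque = [paint.get("number") for paint in COLORS.findall("./paint[hue='opaque']")]

# The second <paint> element (XPath positions start at 1).
second = COLORS.find("./paint[2]").get("number")

if __name__ == "__main__":
    print(all_names, by_attribute, opaque, second)
```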
    The MSXML Parser v.3 stylesheet opening tag and namespaces:
    <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:fo="http://www.w3.org/1999/XSL/Format">
    

    HTML, JavaScript, etc.

    There seem to be two strategies for showing an XML source without using a server. You can style the source and show it on the same page (first example below) or open a new XML page that refers to its own style sheet (second example below). Both of these examples use this simple XML data source.

    Using the DOM (Document Object Model) and JavaScript

    • A JavaScript function selects a single node from the DOM. The XML instance. The XSL with the JavaScript functions.
    • A select widget receives user input and sends it to a single JavaScript function. The XML instance. The XSL with the single JavaScript function.
    • A select widget sends user input to a receiving page, which dynamically reveals one of two stylesheets. The XML instance. One example of a stylesheet. The select widget. The receiving HTML page that parses for user input.