Autumn 2003

INFO 320

Information Needs, Searching and Presentation

Introduction

Active Management of Information

Premises:
  • Information Sharing
    • Geography
    • Time
    • Across applications
  • Modular Information
    • Document bursting (XHTML)
    • Information is initially constructed as XML
Abandon Assumptions:
  • Everyone will agree on a single view of data
  • Everyone will have the same use of information
  • Everyone will use the same technology

Managed Store of Information

One or more XML sources
  • Technologically neutral: XML is text based
  • Can create XML schemas to establish certain types of XML sources, e.g., plant catalog, sales

An information design that reflects my needs/view of my information right now. Yesterday it was different, tomorrow it may be different. It may be unlike anyone else's view of the same or similar data.



Input Update and Integration

I receive XML and blend it into my own information store
  • Sender's technological platform is of no concern to me
  • Sender's presentation preferences are of no concern to me
  • I can validate sender's data against an XML schema
  • I can bridge the semantics of sender's data to my own
Diverse origins of input
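
To make "bridge the semantics" concrete, here is a minimal sketch that uses only Python's standard library. The sender's element names (items, item, commonName, cost), my own names (catalog, plant, name, price), and the helper functions bridge and integrate are all hypothetical, chosen for illustration. Schema validation is left out because the standard library has no XML Schema validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the sender's vocabulary to my own.
TAG_MAP = {
    "item": "plant",
    "commonName": "name",
    "cost": "price",
}

# Hypothetical incoming data, using the sender's element names.
SENDER_XML = """<items>
  <item><commonName>Bloodroot</commonName><cost>2.44</cost></item>
  <item><commonName>Columbine</commonName><cost>9.37</cost></item>
</items>"""

# Hypothetical existing store, using my element names.
MY_STORE_XML = """<catalog>
  <plant><name>Marsh Marigold</name><price>6.81</price></plant>
</catalog>"""


def bridge(element):
    """Return a copy of element whose tags are renamed through TAG_MAP."""
    renamed = ET.Element(TAG_MAP.get(element.tag, element.tag), element.attrib)
    renamed.text = element.text
    renamed.tail = element.tail
    for child in element:
        renamed.append(bridge(child))
    return renamed


def integrate(store_xml, sender_xml):
    """Blend the sender's records into my store, in my vocabulary."""
    store = ET.fromstring(store_xml)
    incoming = ET.fromstring(sender_xml)
    for record in incoming.findall("item"):
        store.append(bridge(record))
    return store


catalog = integrate(MY_STORE_XML, SENDER_XML)
names = [plant.findtext("name") for plant in catalog.findall("plant")]
prices = [plant.findtext("price") for plant in catalog.findall("plant")]
```

Nothing in the sketch depends on how the sender produced or displayed the data; only the text of the XML and the tag mapping matter.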


Styling Output

No output is pre-ordained; all output is contingent on time, taste, technology, etc.
  • Style output as XML to share with downstream information consumers
  • Style output to present on Web, Wireless, etc.
  • Style output as input to other technologies, e.g., databases
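
As a sketch of the idea that one source can be styled several ways, the example below renders the same XML once as HTML for the Web and once as a reduced XML document for a downstream consumer. Styling of this kind is often done with XSLT; plain Python is swapped in here to keep the example self-contained. The catalog content, the plantNames element, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical managed source.
SOURCE = """<catalog>
  <plant><name>Bloodroot</name><price>2.44</price><zone>4</zone></plant>
  <plant><name>Columbine</name><price>9.37</price><zone>3</zone></plant>
</catalog>"""


def style_as_html(catalog):
    """Present the catalog as an HTML list for a browser."""
    items = [
        f"<li>{escape(p.findtext('name'))}: ${escape(p.findtext('price'))}</li>"
        for p in catalog.findall("plant")
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def style_as_xml(catalog):
    """Share only the plant names with a downstream XML consumer."""
    out = ET.Element("plantNames")
    for p in catalog.findall("plant"):
        ET.SubElement(out, "name").text = p.findtext("name")
    return ET.tostring(out, encoding="unicode")


catalog = ET.fromstring(SOURCE)
html_view = style_as_html(catalog)
xml_view = style_as_xml(catalog)
```

The source is never altered; each output is a separate view that can be changed or discarded without touching the store.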





Question: What's the best way to store large amounts of XML data in SQL Server? What are the performance implications of storing it in large chunks versus breaking it out into tables? (MSDN Magazine, Web Q&A, May 2003, p. 17)
Answer: Different criteria play a role in that decision. If the data in the XML document is highly structured and fits a relational model, is often queried at a granular level, and rarely needs to be returned in its original form (in other words, order does not matter), then decomposition into columnar data is better. If you have more document-oriented XML, where order matters and recomposition costs are high, a Character Large Object (CLOB) or XML-datatype-like approach is better.
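
The two storage choices in the answer can be sketched side by side. The example uses SQLite from Python's standard library as a stand-in for SQL Server, so it illustrates the trade-off only and says nothing about SQL Server's own features or performance. The table names (documents, plants) and the sample catalog are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML document to be stored.
DOC = """<catalog>
  <plant><name>Bloodroot</name><price>2.44</price></plant>
  <plant><name>Columbine</name><price>9.37</price></plant>
</catalog>"""

conn = sqlite3.connect(":memory:")

# Option 1: keep the document whole in one large text column.
# Order and original form are preserved; granular queries must re-parse it.
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO documents (body) VALUES (?)", (DOC,))

# Option 2: decompose the document into rows and columns.
# Granular queries are simple; the original document order is not kept.
conn.execute("CREATE TABLE plants (name TEXT, price REAL)")
rows = [
    (p.findtext("name"), float(p.findtext("price")))
    for p in ET.fromstring(DOC).findall("plant")
]
conn.executemany("INSERT INTO plants VALUES (?, ?)", rows)

# A granular query is straightforward against the decomposed form.
cheap = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM plants WHERE price < ? ORDER BY name", (5.0,)
    )
]

# Retrieving the original document is trivial against the whole-document form.
(stored,) = conn.execute("SELECT body FROM documents WHERE id = 1").fetchone()
round_trip_ok = stored == DOC
```

With the decomposed table, rebuilding the original document would mean writing code to reassemble the elements, which is the recomposition cost the answer refers to.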