Our focus in Knowledge Management is what we call “Knowledge Provenance”. Knowledge Provenance addresses the problem of how to determine the origin and validity of information and knowledge. Over the last 30 years, information and telecommunications technologies have simplified the acquisition, codification and distribution of information. The World Wide Web accelerated this trend in two ways: first, it made information globally accessible to anyone with a computer; second, it enabled almost anyone to produce and distribute information.
Accompanying the expected information glut (web searches routinely return thousands of pages) is a more insidious problem: is the information we are consuming valid? Consider the many cases of securities fraud caused by false statements in chat rooms. For example, in 1999, two men posted a message on electronic bulletin boards that caused the stock price of NEI to soar from $0.13 to $15, earning them a profit of more than $350,000. Enterprise information is not immune to this problem. Relying on invalid parts catalogue information, product requirements, financial information, etc. can be quite costly. For example, an aerospace company designed a device without knowing that the NASA-approved parts catalogue it was using had been replaced by a newer version, forcing a redesign, a delay in delivery and a cost overrun.
Knowledge Provenance addresses this problem by investigating how to model and maintain the validity of knowledge. A key factor underlying Knowledge Provenance is modeling the evolution of knowledge. That is, where does it come from? Who created it? What does its existence depend on? What changes were made to it? To what extent can it be believed?
The application to the web is obvious, as demonstrated above. Following are other examples of its use:
- Customer Relationship Management (CRM): A CRM database is often the result of integrating information from many different sources, e.g., order databases, customer databases, marketing databases, service databases, etc. When using this information it is important to understand its origins, the degree to which it is believed to be true, what changes or processing have been applied to it (e.g., revisions), etc.
- Requirements Management: Any engineering project begins with the definition of requirements. Over time these requirements change. Knowledge Provenance provides the underlying representation and mechanisms for recording the source, revisions and validity of requirements as they evolve during a project.
Our research explores a multi-level approach to Knowledge Provenance, where each level extends the sophistication and relevance of the next lower level. Four levels are currently being explored:
- Level 1: Static Knowledge Provenance. This is the simplest form of provenance. We model information using a 3-valued logic: True, False or Unknown. Any statement in a web page (or database) can be tagged as an assertion with an accompanying truth value, or as being derived from another assertion (in the same page or a different web page) with a dependency link back to its source assertion.
- Level 2: Dynamic Knowledge Provenance. Extends Level 1 to deal with dynamic changes in truth values due to incorrect or hypothesized knowledge. Using Truth Maintenance techniques, the validity of dependent information is updated as the validity of the underlying assertions changes.
- Level 3: Uncertain Knowledge Provenance. Extends Level 2 from a 3-valued logic to representing validity as a degree of belief in the interval [0, 1]. Using uncertainty modelling techniques, such as Bayes Nets, belief in statements can be asserted and propagated along lines of dependency.
- Level 4: Judgemental Knowledge Provenance. Extends Level 3 by defining a Socio-Technical process for the acquisition of provenance information from groups of people across the web. The issue this level addresses is how subjective beliefs of provenance can be acquired, combined and attached to information.
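To make the first three levels concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's implementation: the names `Truth`, `Statement`, `ProvenanceStore` and `chained_belief` are hypothetical. Level 1 is modelled as statements that are either asserted with a truth value or derived through a dependency link. Level 2 is approximated by re-resolving links on every query, a simple stand-in for a real truth maintenance system. Level 3 is approximated by multiplying beliefs along a single dependency chain under an independence assumption, which is far simpler than a Bayes net.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Truth(Enum):
    """The three truth values of Level 1."""
    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"


@dataclass
class Statement:
    """A tagged statement: asserted directly, or derived from a source."""
    ident: str
    asserted: Optional[Truth] = None  # set when the statement is an assertion
    source: Optional[str] = None      # set when the statement is derived


class ProvenanceStore:
    """Hypothetical container for tagged statements from pages or databases."""

    def __init__(self) -> None:
        self.statements: Dict[str, Statement] = {}

    def add(self, stmt: Statement) -> None:
        self.statements[stmt.ident] = stmt

    def truth(self, ident: str) -> Truth:
        """Follow dependency links back to the source assertion."""
        seen = set()
        while True:
            stmt = self.statements.get(ident)
            if stmt is None or ident in seen:
                # Missing source or circular dependency: validity is unknown.
                return Truth.UNKNOWN
            seen.add(ident)
            if stmt.asserted is not None:
                return stmt.asserted
            if stmt.source is None:
                return Truth.UNKNOWN
            ident = stmt.source


def chained_belief(source_belief: float, link_strengths: List[float]) -> float:
    """Assumed Level 3 simplification: multiply beliefs along one chain."""
    belief = source_belief
    for value in [source_belief, *link_strengths]:
        if not 0.0 <= value <= 1.0:
            raise ValueError("beliefs must lie in [0, 1]")
    for strength in link_strengths:
        belief *= strength
    return belief


if __name__ == "__main__":
    store = ProvenanceStore()
    store.add(Statement("a1", asserted=Truth.TRUE))
    store.add(Statement("d1", source="a1"))
    print(store.truth("d1"))

    # Level 2 flavour: the source assertion is retracted, and the
    # dependent statement reflects the change on the next query.
    store.statements["a1"].asserted = Truth.FALSE
    print(store.truth("d1"))
```

Resolving links at query time keeps the sketch short. A truth maintenance system would instead record justifications and push updates to dependents when an assertion changes, which matters once statements have multiple sources.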
We are using the World Wide Web as our research platform and XML/RDFS as the technologies for embedding provenance information in web pages. We are building a Java applet that will display provenance information for any web page in which provenance XML tags are embedded, and will enable the attachment of provenance tags.
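As an illustration of what reading embedded provenance tags could involve, the following Python sketch parses a small XML fragment. The element and attribute names (`assertion`, `derived`, `id`, `truth`, `source`) and the sample text are assumptions made up for this example; the actual tag vocabulary used by the project is not specified here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Hypothetical page fragment: tag and attribute names are assumptions.
SAMPLE_PAGE = """
<page>
  <assertion id="a1" truth="True">Example asserted statement.</assertion>
  <derived id="d1" source="a1">Example statement derived from a1.</derived>
</page>
"""


def extract_provenance(xml_text: str) -> List[Dict[str, Optional[str]]]:
    """Collect provenance-tagged statements in document order."""
    root = ET.fromstring(xml_text.strip())
    records: List[Dict[str, Optional[str]]] = []
    for elem in root.iter():
        if elem.tag not in ("assertion", "derived"):
            continue  # ignore elements that carry no provenance tag
        records.append({
            "id": elem.get("id"),
            "kind": elem.tag,
            "truth": elem.get("truth"),    # None for derived statements
            "source": elem.get("source"),  # None for assertions
            "text": (elem.text or "").strip(),
        })
    return records


if __name__ == "__main__":
    for record in extract_provenance(SAMPLE_PAGE):
        print(record)
```

A viewer along these lines would follow each `source` value back to its assertion, possibly in a different page, to display the statement's validity.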