CHAPTER ONE
INTRODUCTION
The number of documents in a computerized information monitoring system usually grows rapidly over time. Storing, managing, and searching these documents within the system is a challenging problem. Documents in a computerized information monitoring system are stored as semi-structured data, whereas a traditional relational database stores structured data. A relational database management system cannot manage semi-structured data efficiently and cannot satisfy the requirements of content-based text retrieval. Much research has been done on semi-structured data, covering data modeling, query languages for text retrieval, indexing methods, text retrieval algorithms, and similarity search algorithms.
These research results have been widely applied in computerized information monitoring systems. The SSREADER computerized information monitoring system, the national computerized information monitoring system, and the Wanfang database are popular monitoring systems in China. All of them classify documents into several classes and support querying inside a given class. They support both metadata search and full-text search using a single keyword or an expression.
Other examples of monitoring systems are the Greenstone computerized information monitoring system, the UC Berkeley computerized information monitoring system, the Tufts computerized information monitoring system, the ACM computerized information monitoring system, and NCSTRL. These systems support similar functions, such as metadata search, full-text search, document classification, and browsing. Greenstone has a suite of software that provides management tools for creating and maintaining a computerized information monitoring system. The Tufts system is designed for the integration of collections that exist or may be developed in the future. Lore, developed at Stanford, is a database management system for managing semi-structured data. NCSTRL at Cornell University is a distributed technical report monitoring system developed by the ARPA-sponsored Computer Science Technical Report project. The NCSTRL collection is distributed among a set of interoperating servers operated by participating national archives.
None of the monitoring systems described above supports the following functions: structure- and content-based queries, automatic entry of external documents, and parallel document processing. The CIRS system described in this study has the following features.
(1) Generalization. CIRS is essentially a general-purpose document database management system. It can be used to build monitoring systems tailored to user needs and provides a suite of tools to maintain them.
(2) Parallelism. CIRS uses multiple processors to execute queries and manage documents, which improves both storage capacity and query efficiency.
(3) Structure- and content-based retrieval. Users can query inside a document for a single element, e.g. a chapter of a book, which not only allows more accurate queries but also reduces the amount of data transmitted over the network.
(4) Personalization. CIRS can tailor queries to a user’s interests and recommend documents relevant to that user.
(5) Automatic entry of external data. CIRS can work together with other search engines to find and add references automatically.
(6) Multi-format support. CIRS collects a wide range of document resources, including books, journal papers, and proceedings, and supports document retrieval across many document formats.
(7) DLSQL query. CIRS defines an SQL-like query language named DLSQL. Using DLSQL, users can write programs and perform all CIRS operations.
(8) Automatic document classification. CIRS builds a classifier from sample documents loaded by the system manager and uses it to classify documents automatically.
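To make feature (3) concrete, the following is a minimal sketch of structure- and content-based retrieval over a semi-structured document. It is not the CIRS implementation: the XML layout, the element names, and the function find_elements are all assumptions introduced for illustration, and only the Python standard library is used. The query names a structural unit (a chapter) and a keyword, and only the matching elements are returned instead of the whole book.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured document; element and attribute names are
# illustrative assumptions, not a format defined by CIRS.
BOOK_XML = """
<book title="Database Systems">
  <chapter number="1" title="Introduction">
    <para>Relational databases store structured data in tables.</para>
  </chapter>
  <chapter number="2" title="Semi-structured Data">
    <para>Documents are stored as semi-structured data.</para>
    <para>Content-based retrieval matches keywords inside elements.</para>
  </chapter>
</book>
"""


def find_elements(xml_text, tag, keyword):
    """Return the elements named `tag` whose text content contains `keyword`.

    The tag is the structural part of the query and the keyword is the
    content part. Matching is a case-insensitive substring test on the
    element's text, not on its attributes.
    """
    root = ET.fromstring(xml_text)
    keyword = keyword.lower()
    matches = []
    for element in root.iter(tag):
        text = " ".join(element.itertext()).lower()
        if keyword in text:
            matches.append(element)
    return matches


# Ask for chapters that mention "semi-structured" and keep only their titles.
chapters = find_elements(BOOK_XML, "chapter", "semi-structured")
titles = [chapter.get("title") for chapter in chapters]
```

Because only the matching chapter is returned, a client would receive one element rather than the full document, which is the reduction in transmitted data that feature (3) refers to.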
A method for cross monitoring material in a reference work having a plurality of portions therein, comprising the following steps:
(a) providing a reference work including a plurality of major sections, a plurality of minor sections within at least two of the major sections, and a plurality of instructional steps within at least two of the minor sections;
(b) further providing a first series of sequential numbers for monitoring each of the major sections, with each of the numbers of the first series corresponding to one of the major sections in sequential order;
(c) further providing a second series of sequential numbers for monitoring each of the minor sections, with each of the numbers of the second series corresponding to one of the minor sections in sequential order;
(d) further providing a series of sequential letters for monitoring each of the instructional steps, with each of the letters of the instructional steps corresponding to one of the instructional steps in sequential order;
(e) indicating a specific major section, minor section, and instructional step from a first portion of the reference work in a second portion of the reference work, by placing one of the first series of numbers, one of the second series of numbers, and one of the series of letters referring to the major section, minor section, and instructional step of the first portion of the reference work, in the second portion of the reference work, thereby providing backward cross monitoring in the reference work; and
(f) indicating a specific major section, minor section, and instructional step from the second portion of the reference work in the first portion of the reference work, by placing one of the first series of numbers, one of the second series of numbers, and one of the series of letters referring to the major section, minor section, and instructional step of the second portion of the reference work, in the first portion of the reference work, thereby providing both forward and backward cross monitoring in the reference work.
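Steps (a) through (f) can be sketched as a small data structure. This is one possible reading under stated assumptions, not a definitive implementation: the label format "major.minor.letter", the function make_label, and the class CrossIndex are hypothetical names chosen for illustration. Major and minor sections are numbered sequentially from 1, instructional steps are lettered from "a", and linking two locations records each label at the other location, which gives both the backward and the forward reference.

```python
from collections import defaultdict


def make_label(major, minor, step):
    """Build a label such as '2.1.b' (steps (b), (c), and (d)).

    `major` and `minor` are 1-based section numbers; `step` is the 1-based
    position of the instructional step, mapped to a lowercase letter.
    """
    if not 1 <= step <= 26:
        raise ValueError("this sketch supports at most 26 lettered steps")
    return f"{major}.{minor}.{chr(ord('a') + step - 1)}"


class CrossIndex:
    """Stores, for each labelled location, the labels placed there."""

    def __init__(self):
        self._refs = defaultdict(set)

    def link(self, first, second):
        # Step (e): the label of the first portion is placed in the second.
        self._refs[second].add(first)
        # Step (f): the label of the second portion is placed in the first.
        self._refs[first].add(second)

    def references_at(self, location):
        """Return the labels recorded at `location`, in sorted order."""
        return sorted(self._refs.get(location, ()))


# Link step a of section 1.2 with step c of section 3.1.
index = CrossIndex()
first = make_label(1, 2, 1)
second = make_label(3, 1, 3)
index.link(first, second)
```

After the call to link, a reader at either location can follow the recorded label to the other one, so the reference can be traversed in both directions.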