ADVANCING THE IDEAS OF WORLD-WIDE-WEB: HYPER-G

H. Maurer
Institute for Information Processing and Computer Supported New Media,
Graz University of Technology, Graz/Austria
email: hmaurer@iicm.tu-graz.ac.at

EXTENDED ABSTRACT

WWW (World Wide Web) [Berners-Lee 92] has become the most successful networked multimedia system employing the hypertext paradigm over the last few years: documents consisting of textual information can have embedded pictures, movies or audio clips and reside on a server, accessible e.g. via the Internet using suitable viewers. The only structuring mechanism for sets of documents is the facility to place -- in hypertext fashion -- anchors in documents leading to (linking them to) other documents. Although this mechanism can be used to create menu-like hierarchical structures, WWW databases are basically "flat" (stratified) collections of documents linked together. Thus, a WWW database can be seen as a graph whose nodes are the documents and whose edges are the links between them.

WWW has become an easy-to-use tool, mainly for small and medium-scale multimedia presentations that are accessible world-wide, thanks to the excellent Mosaic viewer that is available on all major platforms: X-Windows, Mac and MS-Windows. However, WWW has a number of limitations that become apparent once tasks more complex than "a few hundred multimedia screen applications" are considered. No full-text search is provided as part of the WWW server, let alone the possibility to search across the boundaries of WWW servers. Authorisation features are lacking; hence it is not uncommon for an organisation to install a number of independent WWW servers simply to prevent access by unauthorised groups. This fragmentation prevents more global searches: thus, one of the main aims, to tie information together, gets lost due to the lack of authorisation features and the boundaries imposed by each WWW server.
To overcome such weaknesses WWW offers an ingenious way out: it allows arbitrary application programs to be started, thus letting users link into other databases, employ complex search algorithms or activate any other program desirable on top of WWW servers. This great flexibility is achieved, however, at a big cost: the uniformity of the interface disappears and different WWW servers start to behave differently. The whole "jungle" of scattered databases, each with a different feel, as we all know it from the Internet, starts to reappear, now on the level of WWW.

Realizing this dilemma, a group of some 30 researchers and developers at the Graz University of Technology has started to systematically examine the ideas, structure and underlying features of large distributed multimedia servers, leading, eventually, to a concept embracing WWW yet more general than WWW: Hyper-G.

Hyper-G has been developed carefully to ensure interoperability with WWW. Hyper-G databases have gateways to WWW (and Gopher [Alberti 92]), and conversely; the Hyper-G viewers Amadeus (for MS-Windows) and Harmony (for X-Windows) will allow the perusal of WWW databases; and the Mosaic viewers of WWW can be used to access Hyper-G servers. (For more details on Hyper-G see [Andrews 94a], [Andrews 94b], [Hyper-G 94], [Kappe 93a], and [Kappe 93b].)

There are two main differences between WWW and Hyper-G servers. First, Hyper-G provides, integrated into it (and hence uniform in nature), much functionality that has to be implemented on top of WWW (and hence potentially differs from site to site). Second, Hyper-G servers work on a truly distributed platform: a user can activate a number of Hyper-G servers (that may or may not be geographically dispersed) in such a fashion that the union of all the databases involved appears as if it were one single database.
Indeed, the Hyper-G concept is a bit deeper and more general. The basic item of a Hyper-G database is a document cluster rather than a single document: this is a convenient tool to handle such diverse features as multiple languages, multiple windows or multiple representations. A typical example of the latter is the treatment of LaTeX documents in WWW (Mosaic) and Hyper-G. In Hyper-G the basic idea is to store a LaTeX document as a cluster of two documents: one of them is a textual document with in-line pictures for all formulae (this is the approach taken in WWW/Mosaic); the other is the DVI file corresponding to the LaTeX document, containing links to e.g. pictures, i.e. a file retaining all the precision and beauty of the original LaTeX version. For casual browsing on a medium-resolution screen the first alternative is the only viable one; for serious viewing (or printing on a laser printer) the second one (not available in WWW/Mosaic without some contortions) is the only one that makes sense.

Document clusters in Hyper-G are put together in so-called collections, and a collection can be part of one (or more) parent collections. Thus, Hyper-G structures its documents into a kind of hierarchy (actually not a tree, but a DAG). This is useful for many reasons: documents can be inserted without defining links (not possible in WWW: a document without a link to it is not (properly) accessible in WWW; it is accessible in Hyper-G, however, due to the collection structure); collections (and document clusters) can have attributes, allowing Boolean searches on those attributes; and although Hyper-G provides the full anchor-link hypertext paradigm, it also allows (Boolean or WAIS-like) full-text searches within the scope of any number of user-selected collections.
Since each Hyper-G database is a collection, users can activate even geographically remote Hyper-G databases (better still: arbitrary sub-collections within them) and perform powerful searches across all of them. Note that such a facility is rather hard to implement in WWW: although full-text search can be added on top of WWW databases, scope definitions are very difficult and automatic searches across various WWW databases are next to impossible. Thus, Hyper-G avoids the danger of independent "WWW-empires", the "Balkanisation" of databases as Ted Nelson has so aptly called it! However, it must be clearly recognized that Hyper-G generalises WWW but makes full use of WWW and Mosaic facilities.

It is worthwhile to look at a specific example. Suppose a university has five different WWW servers operated by five different departments, each department authorized to modify only its own database, and the departments unable or unwilling to combine their data for exactly the authorization problems mentioned above. Although this is a workable solution, it is not perfect: to find information on person xxx within that university, each of the five databases has to be queried. Some of them may not support full-text searching at all, or may support it using different mechanisms: thus, the problem "where do I have to look and how do I do it" (well-known from the world of Internet and international databases) can arise using WWW even within a single institution (and even more so if the databases are spread over various institutions).

Using Hyper-G, the five WWW databases mentioned can be converted into five Hyper-G collections, all of them belonging to the collection "University yyy". Authorisation to modify data remains where it is desirable, yet a single full-text search for "Person xxx" in the collection "University yyy" will reveal all information on person xxx, if any such information is available.
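The university example can be sketched in a few lines of Python. This is only an illustration of the idea of collections forming a DAG and of a search restricted to a user-selected scope; it is not the Hyper-G server's data model or API. The class names, the search function, the department names and the sample documents are all hypothetical, and three departments are used instead of five for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str


@dataclass
class Collection:
    name: str
    documents: list = field(default_factory=list)
    children: list = field(default_factory=list)  # sub-collections

    def add(self, child):
        """Make `child` a sub-collection; a child may have several parents."""
        self.children.append(child)
        return child


def search(scope, term):
    """Return sorted titles of documents containing `term` in any
    collection reachable from the collections listed in `scope`."""
    term = term.lower()
    seen = set()  # ids of visited collections, so shared children count once
    hits = []
    stack = list(scope)
    while stack:
        coll = stack.pop()
        if id(coll) in seen:
            continue
        seen.add(id(coll))
        hits.extend(d.title for d in coll.documents if term in d.text.lower())
        stack.extend(coll.children)
    return sorted(hits)


# Hypothetical data: each department keeps its own collection.
university = Collection("University yyy")
physics = university.add(
    Collection("Physics", [Document("Physics staff", "Person xxx, room 12")]))
maths = university.add(
    Collection("Mathematics", [Document("Maths seminar", "Talk by person xxx")]))
library = university.add(
    Collection("Library", [Document("Opening hours", "Mon-Fri 9-17")]))

# A sub-collection with two parents turns the hierarchy into a DAG.
joint = Collection("Joint projects", [Document("Joint project", "Led by Person XXX")])
physics.add(joint)
maths.add(joint)

all_hits = search([university], "person xxx")   # one search, whole university
maths_hits = search([maths], "person xxx")      # narrower, user-selected scope
```

The point of the sketch is the scope argument: the same search function serves the whole university or a single department, and the shared sub-collection is reported once even though it is reachable along two paths.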
Observe that no manual changes in any of the WWW databases are necessary; nor is it necessary to abandon the viewer Mosaic, if users have started to like this beautiful piece of software. On the other hand, the Harmony viewer (see below) does provide all of Mosaic's features and a few additional ones (and is available free of charge like Mosaic), so it may become a welcome addition at some stage.

To take a larger and maybe more pertinent example: suppose a number of universities in Germany use Hyper-G, each with a collection "Mathematics". Once a collection "Mathematics in Germany" has been defined, a single search will examine all sub-collections "Mathematics" automatically, independent of where they are located geographically.

We believe that the above notion of unions of collections defining scopes for searches (or other actions!) is essential to prevent a new kind of fragmentation from occurring on a new level.

Hyper-G was developed with knowledge of and experience with WWW; Hyper-G is thus influenced by WWW and has systematically tried to stay consistent with WWW without giving up the insights gained in the meantime:

- Big hypermedia systems must have a structure: a "flat" graph with no "semantic" meaning of links will not do for large systems.
- Activities in large networked multimedia systems have to be restricted to scopes definable by the users.
- Activities that are considered central (like searching, structuring, non-private annotations, etc.) have to be integrated into the basic system so as to avoid unsystematic proliferation of "unorthogonal" features.

Hyper-G is based on the above premises. More on those and other points (like mechanisms for gathering statistics, billing and "active" mail) will be contained in a full version of this paper.

However, a few more specific aspects should be mentioned. The annotation concept in WWW (actually part of the Mosaic client) allows "private" annotations that are stored locally.
Hyper-G allows authorisation classes to be defined for annotations, permitting "private", "group" or "public" annotations. Links in WWW are restricted to textual anchors, while Hyper-G supports anchors in arbitrary data types like pictures or movies. Links in Hyper-G are bi-directional. Hyper-G introduces a sophisticated authorisation mechanism defining for each user the rights to read, create links, modify and annotate. This provides the basis for sophisticated customisation and even CSCW within Hyper-G, features that, like all other more sophisticated ones, have to be built on top of WWW (potentially creating confusion and incompatibility).

Hyper-G is being used as an information system at a number of universities (Graz University of Technology and the University of Auckland are two examples); it has been selected as an information system by major organisations such as ESA (European Space Agency); it is the basis of multimedia guidance systems for large museums and exhibition operators (such as the new Museum of New Zealand, or the Images of Austria presentation at EXPO '92 in Sevilla); it is the platform of one of the most ambitious (30 GByte of data) multimedia projects anywhere (the millennium celebration of Austria); and it is the basis of the first serious attempt at electronically publishing a high-quality journal in computer science, J.UCS (Journal of Universal Computer Science). J.UCS is supported by Springer Pub. Co., has an editorial board of over 100 prominent computer scientists, and more than 25 universities world-wide have agreed to act as servers.

Hyper-G, as a late-comer in the field, has been able to profit from and incorporate experience from earlier projects such as Gopher and WWW. And despite the fact that Hyper-G will not be officially released before June 30, 94, the above list does show a fairly wide acceptance even of its pre-beta-release version.
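The link and annotation features listed earlier in this section (bi-directional links; "private", "group" and "public" annotations) can be made concrete with a small Python sketch. It rests on assumptions of our own: links are indexed in both directions in a separate store, and annotations are plain dictionaries with hypothetical field names. It does not describe how the Hyper-G server actually stores either.

```python
from collections import defaultdict


class LinkStore:
    """Hypothetical link index: every link is recorded in both directions."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # source -> targets
        self.incoming = defaultdict(set)  # target -> sources

    def add_link(self, source, target):
        self.outgoing[source].add(target)
        self.incoming[target].add(source)

    def links_from(self, doc):
        return sorted(self.outgoing[doc])

    def links_to(self, doc):
        return sorted(self.incoming[doc])


def visible_annotations(annotations, user, groups):
    """Return the texts of annotations that `user` may see, given the
    three annotation classes named in the text."""
    result = []
    for note in annotations:
        if note["class"] == "public":
            result.append(note["text"])
        elif note["class"] == "group" and note["group"] in groups:
            result.append(note["text"])
        elif note["class"] == "private" and note["author"] == user:
            result.append(note["text"])
    return result


# Hypothetical documents and annotations.
links = LinkStore()
links.add_link("intro", "details")
links.add_link("overview", "details")

annotations = [
    {"text": "typo in section 2", "class": "private", "author": "anna", "group": None},
    {"text": "review by Friday", "class": "group", "author": "ben", "group": "editors"},
    {"text": "good summary", "class": "public", "author": "carl", "group": None},
]

seen_by_anna = visible_annotations(annotations, "anna", {"editors"})
seen_by_dora = visible_annotations(annotations, "dora", set())
```

With the second index in place, the question "which documents point here?" is answered by a lookup rather than by scanning every document, which is what a browser showing incoming as well as outgoing links needs.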
Since Mosaic has been the main driving force for WWW, it is worthwhile to mention that the current X-Windows viewer of Hyper-G will be replaced by Harmony. (The current Harmony version is available as a development prototype for functionality tests if specifically requested for such tests; it must not be considered an operational tool before June 30, 94 [Hyper-G 94].) Harmony includes all features of Mosaic, plus a graphical browser showing document type, history, incoming and outgoing links, and dynamic and static environments; it also incorporates a viewer for 3D objects and scenes (including navigation within them) and a first attempt at producing 3D "information landscapes".

The MS-Windows viewer Amadeus is available as of May 94. It implements a subset of Harmony's features on a PC-Windows platform.

Summarizing, WWW has been successful in establishing networked multimedia as a major option for information systems of the future. Hyper-G has been built using experiences with WWW and other large-scale networked multimedia systems, preserving full interoperability with WWW, yet incorporating into the basic system all those features that have been universally accepted as indispensable. In this sense, Hyper-G tries to contribute to a more uniform and controlled environment of the world opened by WWW.

REFERENCES

[Alberti 92] Alberti, B., Anklesaria, F., Lindner, P., McCahill, M., Torrey, D.: The Internet Gopher Protocol: A Distributed Document Search and Retrieval Protocol; anonymous FTP boombox.micro.umn.edu in: pub/gopher/gopher_protocol.

[Andrews 94a] Andrews, K., Kappe, F.: Soaring Through Hyperspace: A Snapshot of Hyper-G and its Harmony Client; Proc. Eurographics-Multimedia 94, Graz, June 94; anonymous FTP iicm.tu-graz.ac.at in: pub/Hyper-G/papers.

[Andrews 94b] Andrews, K., Kappe, F.: Hyper-G: A New Tool for Distributed Hypermedia; submitted to ISMM Int. Conf. on Distributed Multimedia Systems and Applications, Hawaii (1994); anonymous FTP iicm.tu-graz.ac.at in: pub/Hyper-G/papers.

[Berners-Lee 92] Berners-Lee, T., Cailliau, R., Groff, J., Pollermann, B.: WorldWideWeb: The Information Universe; Electronic Networking: Research, Applications and Policy 1, 2 (1992), 52-58.

[Hyper-G 94] Reports, Information and SW concerning Hyper-G; anonymous FTP iicm.tu-graz.ac.at in: pub/Hyper-G.

[Kappe 93a] Kappe, F., Maurer, H., Scherbakov, N.: Hyper-G -- A Universal Hypermedia System; J.EMH (Journal of Educational Multimedia and Hypermedia) 2, 1 (1993), 39-66.

[Kappe 93b] Kappe, F., Maurer, H.: Hyper-G: A Large Universal Hypermedia System and Some Spin-offs; anonymous FTP siggraph.org in: publications/May-93-online/Kappe.Maurer