Development of a Multimedia Information System for an Office Environment
S. Christodoulakis, J. Vanderbroek, J. Li, T. Li, S. Wan, Y. Wang, M. Papa*, E. Bertino**
Computer Systems Research Institute, University of Toronto, 10 King's College Road, Toronto, M5S 1A4
* visiting from CRAI, Italy
** visiting from CNR, Italy
Abstract
We describe an experimental multimedia information system for an office environment which is being developed at the University of Toronto. Multimedia messages are composed of text, image, voice and attribute information.
We discuss issues related to internal representation, presentation and communication with the outside world, content addressability in the various data types, user interface and access methods.<br><br> 1. Introduction<br><br> There is a growing interest among computer science researchers in office information systems that handle complex data such as text, attributes, graphics, images, and voice ([VLDB 83a], [VLDB 83b]).
We will call the unit of multimedia information a multimedia message. Multimedia messages are composed of attribute, text, image and voice information. Some of the functions that these systems may provide are filing of multimedia information, content addressability of multimedia messages, and multimedia message transmission and reconstruction at a different site.
There are several important problems associated with the development of such systems. Some of these problems are identified next:<br><br> 1. Query environment: Queries in this environment may be different from queries in traditional Data Base Management System environments.<br><br> Users may only have a vague idea of what they are looking for. Their understanding of what they want and how to specify it may increase as they look at other messages. Their query may prove to be inadequate.<br><br> In that case they may want to reformulate it. Some other users may want to enhance their retrieval capability by specifying some characteristics of the messages that have to do with the presentation form of the messages rather than the content. Finally, queries on the image and text part of messages are not often handled by traditional DBMS's.<br><br> 2. Content addressability in various data types: Content addressability in these diverse data types presents serious problems. Content addressability in messages containing attribute value and text information can be achieved by allowing the user to specify expressions involving the attribute values of the message as well as regular expressions of words appearing within the text message ([Aho et al. 78], [Tsichritzis and Christodoulakis 83]).<br><br> Proceedings of the Tenth International Conference on Very Large Data Bases.<br><br> Structures for efficient retrieval of formatted data from single and multi-file environments have been studied extensively for various retrieval request types and frequencies ([Teorey and Fry 82], [Wiederhold 83], [Ullman 83], [Christodoulakis 84], [Christodoulakis 83a], [Christodoulakis 83b]).
Content retrieval from text files has also been studied for various environments and efficient methods have been described ([Salton and McGill 83], [Rijsbergen 79], [Floyd and Ullman 80], [Haskin 81], [Tsichritzis and Christodoulakis 83], [Christodoulakis and Faloutsos 84]).<br><br> Content addressability of the image message and voice message part is much harder. One reason is that the current technology for image and voice recognition has serious limitations. Picture recognition involves very expensive pattern recognition routines ([Tou and Gonzalez 73], [Duda and Hart 72]).<br><br> In addition, recognition of general pictures is still remote ([Ballard and Brown 82], [Rosenberg 76], [Fu 83], [Pavlidis 77]). Existing experimental and commercial systems based on high power machines (array processors) can be successful only when much knowledge about the scene presented in a picture is available. A second, equally important problem related to pictures is that it is very difficult for the user to specify precisely the content of the pictures that he wants.<br><br> Speech recognition presents similar problems ([Fu 83], [Reddy 75], [Reddy 76], [Erman et al. 80], [Electronics 83]). Currently only speaker dependent, discrete speech, voice recognition devices with a limited vocabulary of words exist on the market.<br><br> 3. Information Organization and Access: In an office information environment files are seldom static. The information is usually diverse, the world changes fast, and people do not like to spend time on organization and reorganization of information.<br><br> In addition the good naming, structuring, consistency and quality of information which is assumed in data base environments is not easy to maintain in an office information system. Information may be inserted using error prone methods and the quality of information and the consistency of naming may be very difficult to maintain. Singapore, August, 1984.
The query capability and the access methods used should be able to cope with these problems.<br><br> Of course performance is of very high importance in this environment: by specializing the application environment we expect to provide better services (e.g. content addressability) and usable systems. Performance may easily become unacceptable due to the large volume of data and the flexibility of the query environment.<br><br> 4. Query Interfaces: Users of office information systems may have very diverse backgrounds varying from the sophisticated type to the naive type. The precise syntax required by many DBMS's may not be appropriate for this environment.<br><br> The high quality screens, voice input/output and other sophisticated devices that are now available have the potential for providing very effective interfaces. In this environment it would also be desirable that the query interface help the user to express his queries in a better way.<br><br> 5. Information extraction and internal representation: In a multimedia message environment several possible ways of message creation exist. Multimedia messages may be interactively generated in a given station and sent to another station via communication lines. In the receiving station additional editing of the message may take place.<br><br> Alternatively messages or parts of messages (pictures) may be in paper form, in which case a powerful image segmentation and OCR capability may be used for extracting the information from the messages. Automatic extraction of information from documents is a difficult task. Segmentation into text regions and image regions is an easier problem and the performance of existing techniques is adequate [Wong et al. 82].<br><br> In addition optical character recognition techniques perform well for a variety of fonts. Thus existing techniques can be used for the automatic recognition of the text part of documents.<br><br> However, in many cases this may not be adequate.
The information contained in various images may contain much redundancy [Pratt et al. 80].<br><br> For example if an image of a document contains a simple graph, this graph may be encoded in an internal representation form with much reduced storage requirements. Thus an internal representation may be used to reduce storage requirements as well as telecommunication costs. The internal representation may need to be different from system to system depending on the availability of devices of various types, as well as on the workload characteristics of the system (e.g. CPU bound versus IO bound system, capacity of the communication medium).<br><br> 6. Data presentation: Data base management systems have traditionally emphasized data organization and manipulation.<br><br> Presentation of data on devices with diverse capabilities has traditionally been deemphasized [Gray 83]. This problem is particularly important in the presence of bit map display capabilities with different numbers of gray levels, colors, display sizes, as well as facsimile input/output devices [Horak 83]. In addition the large storage requirements of image information and the large cost of transmitting this information through communication lines may impose an internal representation of some image types which is different from the presentation form of these images. Finally, voice input and output devices also present similar problems.<br><br> In this report we describe an approach for the development of an office information system which handles multimedia messages. We present aspects related to internal representation, presentation, content addressability, access method, user interface, message formation and information extraction.<br><br> 2. A framework for multimedia messages<br><br> In this section we present a conceptual framework for multimedia messages [Christodoulakis 84a].
The conceptual framework describes the logical components of multimedia messages and their interrelationships.<br><br> The framework is useful for describing the capabilities of the system to users. The logical components of a multimedia message are shown in figures 1a and 1b. Multimedia messages have a type associated with them and they are composed of one or more of the following: a set of attributes, a voice message, a text message and a set of images.<br><br> In addition multimedia messages may have an annotation part. The message type defines the minimal common information (a set of common attributes) shared by a large number of messages. Attributes have an attribute name, a type, and a value.<br><br> The value may be a repeating group of values. The text message is composed of text sections. Each text section is composed of text paragraphs.<br><br> Each text paragraph is composed of text words. Each text word is composed of overlapping parts of words. This structuring of the text message allows queries to restrict retrieval based on the proximity of words within the text message as well as to associate annotation with each of the text components.<br><br> An image is composed of an image type, a vector form, a raster form, a statistical part, and a text part. The image type can be: graph if it contains at least one graph, pie chart if it contains at least one pie chart, histogram if it contains at least one histogram, table if it contains at least one table, statistical (any of the previous), and picture (anything else). The vector form represents the image as a set of image objects.<br><br> An image object is composed of a set of ordered points and an object caption. Points are pairs of values indicating the position of a point within an image.<br><br> Points of an object may be connected to form lines, polygons, polylines, ... The object caption is composed of object caption words.
Object caption words are of type text and they are composed of parts of words.<br><br> The raster form represents the image as an ordered set of pixels in two dimensions. The raster form of an image may contain possibly overlapping raster objects which are sets of adjacent pixels.<br><br> Each raster object corresponds to a distinct vector object of the same picture which is a polygon. The implication is that the set of pixels composing the raster object is defined by the boundaries of the vector object when it is superimposed on the raster form of the image. The statistical part of the image is composed of a set of tables. Each table has a set of attributes.<br><br> Attributes have a name, a type, and a set of values. Tables within an image are independent of each other. We do not allow joins among tables.<br><br> Tables are used internally to store the statistical information contained in images of type graph, pie chart, histogram or table. The image text part is composed of image text words. Image text words are composed of parts of words.<br><br> The image text part is text related to a given image. The text part is formed by the following: the image caption of a given image, text paragraphs related to the image, text annotation, object caption words of objects within the image, attribute names of attributes in the statistical part of the image, and attribute values of attributes of type text in the statistical part of the image. The voice message is composed of voice words.<br><br> Annotation is composed of text annotation and voice annotation. Text annotation is composed of text annotation sections and voice annotation is composed of voice annotation sections. Annotation may be associated with a text message, text sections, text paragraphs, text words, and images.<br><br> (The lines from voice annotation are not shown in the figure.)
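The logical components described above can be pictured as nested record types. The following is a minimal sketch in Python, not the system's actual schema: every class name, field name and type label is an assumption made for illustration, and `image_text_part` simply gathers the six sources of image text listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Attribute:
    name: str
    type: str  # "text" or "number" are assumed type labels
    values: List[object] = field(default_factory=list)  # may be a repeating group


@dataclass
class ImageObject:
    points: List[Tuple[float, float]]  # ordered (x, y) positions within the image
    caption_words: List[str] = field(default_factory=list)


@dataclass
class Table:
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Image:
    image_type: str  # graph, pie chart, histogram, table, statistical, picture
    caption: str = ""
    related_paragraphs: List[str] = field(default_factory=list)
    text_annotation: List[str] = field(default_factory=list)
    vector_form: List[ImageObject] = field(default_factory=list)
    raster_form: Optional[List[List[int]]] = None  # rows of pixel values, if kept
    statistical_part: List[Table] = field(default_factory=list)


def image_text_part(image: Image) -> List[str]:
    """Collect the words that make up the image text part (lowercased)."""
    words: List[str] = []
    words += image.caption.split()
    for paragraph in image.related_paragraphs:
        words += paragraph.split()
    for note in image.text_annotation:
        words += note.split()
    for obj in image.vector_form:
        words += obj.caption_words
    for table in image.statistical_part:
        for attr in table.attributes:
            words += attr.name.split()
            # Only values of attributes of type text contribute to the text part.
            if attr.type == "text":
                for value in attr.values:
                    words += str(value).split()
    return [w.lower() for w in words]
```

Keeping the text part as a derived view, rather than a stored field, is only one possible design; a real system might materialize it when the image is inserted.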
Annotation is a further informal explanation about the contents of a message, paragraph, word, image or image object.<br><br> 3. Internal Representation and Presentation Form of Multimedia Messages<br><br> The presentation form of the constituents of a message may be different from the internal representation of the message.<br><br> The internal representation of an image does not have to have both an object form and a raster form. It may have only one of the two. An example of an image where both forms exist in the internal representation is a photograph where objects have been identified and stored in the object form for enhancing content retrieval.<br><br> An example of an image having only a raster internal representation is an uninterpreted photograph. An image having only an object form as internal representation can be an engineering design. (At the presentation level, however, the object form may be used to display the design on a raster display.) The internal representation of the object form of an image is a collection of objects.<br><br> With each object is stored information related to its type (polygon, circle, ...), its name, name display specifications (font, size, position of display), shading information, and the coordinates of a set of points or other information specific to the object type (radius, ...).<br><br> This information enables the reconstruction of the set of points which compose an object. The internal representation of statistical type images (graphs, pie charts, histograms, tables) is a collection of tables.<br><br> This information is not displayed and it is in fact a duplication of information since the information about the objects composing the presentation of these images on a specific device is also maintained.
However, the information duplication is usually not very large, and the approach facilitates both answering queries on the image contents and presenting the image in a different form, or in the same form but with different parameters (a different coordinate system, say), at a later point in time. In addition it can be used to display the contents of the image on devices which do not have graphics or bit map display capability.<br><br> The presentation of a multimedia message on an output device is called a physical message. With a physical message we associate some default information (such as font, size, line spacing, ...) which is used for displaying the message on an output device.<br><br> The structure of a physical message is shown in figure 2. A physical message is divided into physical pages. Each physical page is composed of rectangles.<br><br> A rectangle can be a text rectangle or an image rectangle. Rectangles are identified by their location within a physical page and their size. Image rectangles correspond one to one to images of a multimedia message.<br><br> Text rectangles may contain some information that is used for displaying messages on an output device (alternative font, alternative size, ...). Since sequences of words may be displayed in a different way we also use word sequence rectangles which are contained within text rectangles. Finally the voice message and the annotation message part of a multimedia message are not displayed in the physical message.<br><br> However, the voice part of the message, voice annotation sections, and text annotation sections are mapped one to one to image rectangles and paragraph rectangles of the physical message. An indication of their existence is a special symbol associated with the relevant rectangle, which may be optionally displayed on the output device.
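The physical message structure described above (pages made of rectangles, message-level defaults, per-rectangle overrides) might be sketched as follows. This is an illustrative sketch only: the class names, the `style` and `indications` fields and the `effective_style` helper are assumptions, not part of the described system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rectangle:
    kind: str  # "text", "image" or "word sequence" (assumed labels)
    x: int     # location within the physical page
    y: int
    width: int
    height: int
    style: Dict[str, object] = field(default_factory=dict)  # alternative font, size, ...
    indications: List[str] = field(default_factory=list)    # e.g. "voice annotation"


@dataclass
class PhysicalPage:
    rectangles: List[Rectangle] = field(default_factory=list)


@dataclass
class PhysicalMessage:
    defaults: Dict[str, object]  # default font, size, line spacing, ...
    pages: List[PhysicalPage] = field(default_factory=list)


def effective_style(message: PhysicalMessage, rect: Rectangle) -> Dict[str, object]:
    """Start from the message defaults and apply the rectangle's overrides."""
    style = dict(message.defaults)
    style.update(rect.style)
    return style
```

For example, a text rectangle that overrides only the size would inherit the default font of its physical message.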
The indication symbol can denote voice indication, voice annotation section indication, or text annotation section indication.<br><br> A descriptor is associated with each created multimedia message. The descriptor indicates the parts of the message, the internal form for each part and its mapping to a physical message. Compression information may also be encoded in the message descriptor.<br><br> As we mentioned before, the compression to be used in such an environment depends on the system workload and the devices used. In addition, since there may be a variety of techniques that can be used (and none is best for all cases [Gonzalez and Wintz 77]), the particular method used (if not the default) and its parameters may be encoded within the descriptor. This may be more important for the image part than for the text or attribute part of multimedia messages due to the large amount of information in images.<br><br> The simplest case (which has been implemented in our prototype) is the encoding of an image as a set of objects (and regions of uniform shade). The image is expanded to a complete bitmap at presentation time.<br><br> [Figure 1a: Multimedia message structure]<br><br> [Figure 1b: Multimedia message structure (image part)]<br><br> 4. Content Addressability<br><br> In our system multimedia messages are retrieved by specifying message content information instead of a unique message identifier. The user will have some idea of the content of the messages that he wants to see (or not see) and he will specify this information in his query. The system will try to return to him all relevant messages.<br><br> We would like to avoid the general pattern recognition problem associated with images in our system, and still provide as much content addressability as possible.
In some cases converting image recognition problems to attribute and text recognition problems provides us with a powerful alternative. The system should provide some basic but powerful support for content retrieval.<br><br> Image content addressability can be achieved by specifying conditions on the image text part, on the image statistical part, as well as similarity and spatial relationships among image objects. Retrieving messages based on conditions on the image text part is logically and physically different from specifying conditions on the text part of the message. The former specifies that the user wants to see a message that has an image related to the condition specified while the latter specifies a message related to the condition specified.<br><br> The search is also limited to the image text part. An image in our system may contain a number of statistical objects (graphs, pie charts, histograms, tables). Each one of those has an internal representation in the form of a table.<br><br> In our system the user can focus his attention on only one of the statistical objects at a time. We do not allow content addressability based on relationships among tables. We follow this approach because we believe that it would be confusing for the user to remember which statistical objects belong to the same image.<br><br> In addition, conditions on single tables may be very selective so that the size of the response is limited. However, the presentation of a message allows more than one statistical object (graphs, tables, ...) to appear in the same picture. Finally, similarity and spatial relationships of objects in pictures may be found useful in restricting the size of the response for non-statistical images.<br><br> The specification of the query will be done interactively by the user by using a set of menu options and the graphics editor [Christodoulakis and Elles 84]. We have not implemented this option yet.
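Conditions on the image statistical part, evaluated against one table at a time as described above, could look roughly like the sketch below. It is an illustration under stated assumptions, not the implemented query processor: the dictionary layout of images and tables, the function names, and the substring rule used for partial attribute-name matching are all hypothetical.

```python
from typing import Dict, List, Optional

# Assumed set of concrete statistical image types.
STATISTICAL_TYPES = {"graph", "pie chart", "histogram", "table"}

# A table maps an attribute name to its column of values (assumed layout).
Table = Dict[str, List[float]]


def type_matches(image_type: str, wanted: str) -> bool:
    """'statistical' in a query accepts any statistical presentation form."""
    if wanted == "statistical":
        return image_type in STATISTICAL_TYPES or image_type == "statistical"
    return image_type == wanted


def find_attribute(table: Table, partial_name: str) -> Optional[str]:
    """Partial match on attribute names: first name containing the given text."""
    for name in table:
        if partial_name.lower() in name.lower():
            return name
    return None


def all_greater(table: Table, left: str, right: str) -> bool:
    """True if every value of attribute `left` exceeds the paired value of `right`."""
    left_name = find_attribute(table, left)
    right_name = find_attribute(table, right)
    if left_name is None or right_name is None:
        return False
    pairs = list(zip(table[left_name], table[right_name]))
    return bool(pairs) and all(a > b for a, b in pairs)


def image_matches(image: dict, wanted_type: str, left: str, right: str) -> bool:
    """Each table is tested independently; no condition spans two tables."""
    if not type_matches(image["type"], wanted_type):
        return False
    return any(all_greater(table, left, right) for table in image["tables"])
```

Because every table is tested on its own, a condition can never relate attributes from two different tables, which mirrors the no-joins restriction.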
Some examples of possible queries and the way that they can be formulated follow:
Ex1: Give me any documents with images that have to do with IBM
  . IBM exists in the text part of the image
Ex2: Give me any documents that have some statistics on IBM
  . image type statistical
  . IBM in the text part
Ex3: Give me any documents that have a graph related to IBM
  . image type graph
  . IBM in the text part
Ex4: Give me any documents with statistical figures relating IBM sales and year
  . image type statistical
  . IBM sales and year are attribute names (the user may specify partial match on the attribute names if he is not sure about the exact name of the attribute)
Ex5: Give me all documents which have graphs where all IBM sales were greater than CDC sales
  . image type graph
  . attribute values of IBM sales greater than attribute values of CDC sales
[Figure 2: Physical message structure]
In our system the user is able to specify the following for message retrieval:
0. Message type.
1. Conjunctions of attribute values and attribute ranges (=, >, <, ...).
2. Conjunctions of disjunctions of words or parts of words appearing within the text message, text section, text paragraph.
3. Existence of voice.
4. Existence of images.
5. Approximate location of an image.
6. Conjunctions of words or parts of words appearing within the text related to the image (text, paragraphs, image caption, names of objects, attribute names, attribute values of attributes of type text). This capability may be useful when the user wants to ask queries about an image that he has not seen before. In this case he may not know if the word IBM, for example, appears as attribute name or value.
8. For statistical images (pie charts, graphs, tables, ...) existence of attributes, attribute values, relationships of attribute values.
9. Existence and relationships of image objects for non-statistical images. Such relationships are contains, intersects, ... We plan to incorporate in this option some aspects of fuzzy retrieval of images. The user will draw an image and the system will search for images that contain objects similar to it. This option has not yet been implemented.
10. Conjunctions of the above.
<br><br> 5. User Interface and Query Reformulation<br><br> The important task of the user interface is to support in a uniform and integrated way the various data forms and activities. In order to ask a query the user has to specify a filter. The specification of the filter is done using menus in a by-example fashion.<br><br> This approach allows the user to specify his selection non-procedurally using a set of options and thus it presents advantages for non-expert users [Vanderbroek 84]. The screen is divided into two regions. The left displays a message template.<br><br> A message template has two main components: 1) a set of fields that corresponds to the attributes of the message, 2) the message body. The various components of the template are filled in during the process of query formulation. The right region of the screen is the menu area and displays the available options for definition of restrictions on the voice and image part of the message.<br><br> For restrictions on voice the options that can be specified are present, absent, or no restriction. For image restrictions several options are available:<br><br> Location on/off: When the location is on, the user indicates the location by pointing to the approximate position within the message body. A small frame is outlined to indicate the position chosen.<br><br> A corresponding frame is displayed in a fixed area of the screen called repository.
This frame is always used (even if location has not been specified) to display information which has been so far specified about the image.<br><br> Image type: The following image types are available: pie chart, histogram, graph, table, statistical, picture. The statistical type permits the user to query for images containing statistics without specifying their representation.<br><br> The user chooses one of the above types by pointing to the icon representing the image type. The information that the user specifies about the image is inserted in a frame for the image. Frames are displayed below the menu options on the right hand side of the screen.<br><br> The location of the [...] (if [...]) is [...] to outline a dotted frame where the approximate location of the image exists. A new menu is displayed indicating the various options for that image type.<br><br> The user can either quit, going back to the previous menu, or specify further details. The options for each image type for statistical images can be divided into two classes: 1) image type attributes such as column names, 2) image attribute relationships. These options allow the user to specify relationships among the attributes of an image type.<br><br> For example if the type is graph the menu options presented describe objects related to graphs (e.g. axis, lines, ...). This is expected to be desirable for a user that has a particular presentation in his mind (possibly because he saw the image recently) and he specifies the type of the image.
However, all queries (independent of type) are examined versus a common internal representation (tables) [Vanderbroek 84].<br><br> Thus the user has the option to specify the type of the image to be statistical if he does not remember (or has not seen) the presentation form of an image which contains statistical information. After the query specification by the user the system starts the search for the desired messages. Very often the user will not be able to specify a very tight filter.<br><br> Thus he will often describe a filter for which a superset of the messages that he wants to see qualifies. A browsing capability is used to assist the user in identifying the appropriate messages. Miniatures and voice abstractions are used for better browsing [Tsichritzis et al. 83].<br><br> Miniatures are realistic visual abstractions of messages which are displayed for the user during browsing. Several miniatures can fit on the screen at the same time. Miniatures are automatically generated and stored for each message.<br><br> Voice abstractions are voice excerpts associated with a message that highlight the meaning of the message. Voice abstractions may be generated either manually by inserting a voice abstraction message or automatically by using an automatic indexing technique. In the case that an automatic indexing technique is used the abstraction can be stored in a binary form to save space.<br><br> When the miniature is displayed, the voice output capability uses the binary form to produce voice. The voice abstraction can be turned on or off by the user. We have not implemented the automatic indexing option yet in our system.<br><br> In a multimedia information system environment it may often be the case that the user cannot exactly describe the information that he wants. This is not typical of a data base environment where the information is well structured and named, and attributes take values from a fixed set of attribute values.
An example from a text retrieval environment which demonstrates that this may not be true in a more general information retrieval environment is synonyms, words with similar meaning.<br><br> There is a need for query reformulation in this environment. Dynamic query reformulation in image messages is even more important. The reason is that the information extraction process may fail to [...] all the existing objects within an image.<br><br> This may be due to several reasons:
1. The person extracting the information is not careful or patient.
2. There is too much information in a given picture.
3. Certain objects may not be [...] in a given picture. This will affect both manual and automatic extraction of information. Some of this problem may be avoided by allowing a powerful set of similarity measures and transforms for raster objects (or user defined, query specific transforms and similarity measures).
4. Certain objects are not known or important at the time that the images were inserted in the multimedia message repository. Again this is true for both manual and automatic extraction of information.
<br><br> It is possible that the user will feel the need for query reformulation at some point in time as he browses through the messages. One reason is that something that he saw in these messages may have prompted his memory to a better specification of his query.<br><br> Another may be that he has decided that he is receiving too many documents back. The query reformulation may restrict the number of qualifying documents further, it may expand the query with a disjunctive term, or it may completely change the query.
In our system we allow options for query expansion using an environment dependent thesaurus (not implemented yet), query modification (more restrictions) and continuing the search forwards or backwards, or changing the query and restarting without seeing the documents seen so far.<br><br> For images the query reformulation capability should allow the user to extract a part of an image and to use it for expanding his filter. This will be useful when a user, as he browses through qualifying messages, sees an object of a class of objects that he wants. It will be easier for him to extract this information from the image itself instead of redrawing the image. It will also probably result in a more accurate specification of the query.<br><br> A possible different scenario would be that the user is not able to draw or specify his image objects very well. Thus he starts by using some text words to select messages that possibly contain an image similar to the one that he wants to use in his filter. When he finds one, he extracts the information that he wants and he uses it as a new filter.<br><br> One objective of the interactive image editor that is described below is to facilitate this type of query. Thesaurus mechanisms have been traditionally used in information retrieval for replacing one word in a query with its synonyms. An extension of the thesaurus idea would be to use thesaurus mechanisms that associate words with their pictorial representation.<br><br> This type of thesaurus mechanism can be environment dependent so that excessive overhead is avoided.<br><br> 6. Access Method<br><br> Multimedia messages coming into a station are stored in general files.<br><br> At a later point in time a user of the station may want to view these messages or extract some information from these messages to form a new message. An access method based on abstractions is used to achieve fast response time in user queries.
An abstraction of the multimedia message is much smaller than the multimedia message itself and restricts the attention to a small number of qualifying messages.<br><br> The abstraction file contains abstractions of text, image and voice data. The text abstraction scheme is based on superimposed coding [Christodoulakis and Faloutsos 84]. A fixed length block signature is created for each block of text data. Originally all the bits of the block signature are set to zero. The signature is constructed by taking each non-trivial word in the text message, splitting it into successive overlapping triplets of letters and hashing each triplet into a bit position within the block signature.<br><br> These bits are set to one. If the word is too short, additional bit positions are created by using a random number generator, which is initialized with a numeric encoding of the word. Thus a constant number of bits corresponds to each non-trivial word.<br><br> These bits are set to one. The size of the signatures and the number of bits per word have been determined in such a way that the performance of the system is optimized [Christodoulakis and Faloutsos 84]. To examine if a given word appears within a logical block of the message, the signature of this block is examined.<br><br> The same transformation is performed on the word and the bits determined by the transformation are examined. If they are all one, the word is assumed to appear in the text message. This access method retrieves supersets of the qualifying messages.<br><br> The browsing capability described before allows the user to pinpoint the relevant messages. Parts of words can also be specified in queries.
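The signature construction and lookup just described can be illustrated with a short sketch. It is written under several assumptions that do not come from the paper: the signature width, the minimum number of bits per word, the stop-word list and the use of CRC32 as the hash are all placeholders. As a simplification, long words contribute all of their triplets rather than a strictly constant number of bits, so that queries on parts of words remain possible.

```python
import random
import zlib

SIGNATURE_BITS = 256   # assumed signature width
MIN_BITS_PER_WORD = 6  # assumed minimum number of positions per word
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is"}  # assumed trivial words


def triplets(word):
    """Successive overlapping triplets of letters, e.g. 'image' -> ima, mag, age."""
    return [word[i:i + 3] for i in range(len(word) - 2)]


def triplet_positions(word):
    """Hash each triplet of the word to a bit position."""
    return [zlib.crc32(t.encode("utf-8")) % SIGNATURE_BITS for t in triplets(word.lower())]


def bit_positions(word):
    """Triplet positions, padded for short words by a generator seeded with the word."""
    word = word.lower()
    positions = triplet_positions(word)
    rng = random.Random(zlib.crc32(word.encode("utf-8")))
    while len(positions) < MIN_BITS_PER_WORD:
        positions.append(rng.randrange(SIGNATURE_BITS))
    return positions


def block_signature(text):
    """Superimpose (bitwise OR) the bits of every non-trivial word in the block."""
    signature = 0
    for word in text.lower().split():
        if word in STOP_WORDS:
            continue
        for pos in bit_positions(word):
            signature |= 1 << pos
    return signature


def may_contain(signature, word):
    """True if all bits of the word are set; false positives are possible."""
    return all((signature >> pos) & 1 for pos in bit_positions(word))


def may_contain_part(signature, part):
    """Test a part of a word using only its triplets (parts under 3 letters always pass)."""
    return all((signature >> pos) & 1 for pos in triplet_positions(part))
```

A negative answer from `may_contain` is definitive, while a positive answer only marks the block as a candidate, which is why the method returns a superset of the qualifying messages.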
More complicated query patterns (including conjunctions and disjunctions of words) can be examined against the signature in an obvious manner.

Information related to attribute values is also abstracted using a signature technique. The only difference is that order-preserving transformations are used in order to answer inequality queries. Some further evaluation done by [Rabitti and Zizka 84] shows that the approach is more appropriate for an information system environment than word signatures [Tsichritzis and Christodoulakis 83], [Larson 83], or rigid indexing techniques like IBM STAIRS.

Important information regarding images, like the image type and approximate location, is also inserted in the abstraction file. In addition, information related to the objects of a picture, as well as an abstraction of the image text part and the statistical part, is also inserted in the abstraction file. Finally, the only information related to voice that is inserted in the abstraction file is whether a voice section is present in the message.

As with text information, the information stored in the abstraction file for answering queries involving attributes, pictures and voice guarantees that a superset of the qualifying messages for a given request is retrieved. The blocks of the access file are accessed sequentially. The sequential access, the use of large blocking factors, and the small size of the access file result in a small cost for the access method.

7. Message Formation and Information Extraction

Messages may be interactively created using the bit map display capability of a workstation. The text formatting software may provide the same basic features seen in traditional formatters [Furuta et al. 82].
Alternatively, the formatting software may be integrated with the filing capability so that new messages are synthesized from old ones [Christodoulakis et al. 84b].

Interactive creation of certain image types (graphs, diagrams, pie charts, histograms, tables, drawings) seems easier for the user and the system. The user has immediate feedback on what his images look like, and he is able to edit them directly and see the result on the screen. The system may avoid some overhead that arises when user-encoded specifications do not map exactly to what the user wanted to see, or, even if they do, the resulting figure is not satisfactory when displayed.

The interactive image editor formatter in our system is used to assist the user in creating images interactively, extracting information from other images already in the system, or manually editing digitized images, extracting the information in them and possibly discarding the raster form of the image, which is expensive to store.

We plan to use the image editor also for specifying or reformulating queries that refer to non-statistical images. Thus the image editor becomes [illegible] multimedia messages: it is useful for message formation, change of presentation form, query specification, query reformulation, information extraction for achieving content addressability, and information compression (when the raster form of the incoming messages is discarded). The image editor should be powerful enough to support these functions. In addition it should provide a good interface to the user.

The communication with the image editor in our system is based on menus. The right part of the screen is used for communication with the user during interactive design. Menus are small and well identified.
The user gives commands to the system mainly through a single communication medium (the mouse). He uses another medium (the keyboard) only when there is no alternative. This, together with descriptive menu options, relatively short menus, confidence about where exactly the user is at a particular time in his navigation through the menu options and modes, and confidence about what exactly the system expects from him and what the system is doing, constitutes the main principles used for designing a good user interface for interactive image creation.

The image editor provides capabilities for creation and manipulation of individual objects and for creating an internal representation of the image in a vector form. It can also be used to create special objects (graphs, pie charts, ...). For these objects an internal representation in a table form is maintained.

The image extraction mode of the image editor extracts information from images of other messages stored in the system, tables and picture bitmaps in order to synthesize a new picture. Information that can be extracted from images of other documents is the raster form of an object or a set of objects, the vector form of an object or a set of objects, and an attribute or a set of attributes. Extracted information may be stored in a subpicture. Thus a new picture can be synthesized from subpictures that have been selected from different pictures. The subpicture creation mode may be used to adjust the relative positioning of images. This is basically a cut-and-paste form of creating new images from old ones.

Frequently multimedia messages that have been prepared with this editor formatter may arrive in the system.
The system may have to deal with both multimedia messages that have been prepared using formatters that are known to the system [Christodoulakis et al. 84a] and multimedia messages that have been prepared using unknown formatters. The latter is similar to the problem of general document conversion to an internal form [Wong et al. 82].

Information may be extracted from pictures in a raster form using the interactive object form edit mode of the image editor to draw objects on top of the raster form. The image extraction mode is used to read the raster form of the picture first. For special images (graphs, histograms, ...) the required attribute information is also extracted at the same time by specifying the values of the parameters that were used in the presentation (minimum and maximum values on the axes, scale, ...). The vector form of images requires in general much less storage space. The raster form of these images may or may not be maintained after the creation of the object form of the image.

Objects within pictures in a bit map form will be defined manually in this mode. The user will indicate on the screen a point or an ordered set of points which composes an object. He may also introduce type, caption and/or annotation information. (This information may be the only information needed since not all images have a raster form.) In the future the object description for pictures in raster form can become semi-automatic. An edge detector could be used to detect edges, and the user may name the objects and annotate them. He may also [illegible]. The edge detector need not be fancy. It will only see the dominant objects of the picture. For simple images specific to our environment (graphs, diagrams, ...) the information extraction process may be completely automated.
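The attribute extraction for special images mentioned above, where the user supplies the minimum and maximum values on the axes, can be pictured as a mapping from screen coordinates to data values. The following Python sketch assumes linear axes; the `Axis` class, its fields and `extract_points` are hypothetical names introduced for this example and do not describe the editor's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Axis:
    """One axis of a graph as indicated by the user (linear scale assumed)."""
    pixel_min: float   # screen coordinate where the axis starts
    pixel_max: float   # screen coordinate where the axis ends
    value_min: float   # data value at pixel_min
    value_max: float   # data value at pixel_max

    def to_value(self, pixel):
        """Map a screen coordinate to a data value by linear interpolation."""
        fraction = (pixel - self.pixel_min) / (self.pixel_max - self.pixel_min)
        return self.value_min + fraction * (self.value_max - self.value_min)


def extract_points(pixel_points, x_axis, y_axis):
    """Convert points indicated on the raster image into attribute values."""
    return [(x_axis.to_value(px), y_axis.to_value(py))
            for px, py in pixel_points]


# Screen y grows downward, so the y axis runs from pixel 400 (value 0)
# up to pixel 100 (value 100).
x_axis = Axis(pixel_min=100, pixel_max=500, value_min=0, value_max=10)
y_axis = Axis(pixel_min=400, pixel_max=100, value_min=0, value_max=100)
print(extract_points([(300, 250)], x_axis, y_axis))  # [(5.0, 50.0)]
```

Once the points are stored as attribute values in a table, the raster form is no longer needed to answer queries on them, which is consistent with the option of discarding it.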
The image editor is a part of the document formation unit that currently exists in the system. Simple text formatting capabilities complement the unit.

8. Concluding Remarks

In this report we have presented issues related to the development of a multimedia information system for an office environment. Messages in this environment are retrieved based on content. The user can specify attribute value relationships as well as words or parts of words that appear in the text part of multimedia messages. Image content retrieval is achieved by allowing queries on the image text part, queries on the internal table representation and queries on similarity and spatial relationships among image objects. Some aspects of the presentation of messages may also be specified in queries. We presented issues related to the user interface, query reformulation, access method, image creation and information extraction. We described the internal representation and the presentation form of multimedia messages and the mapping between them.

We are implementing a prototype of a multimedia information system for an office environment based on the framework presented in this paper. Multimedia message filing and retrieval have been completed [Vanderbroek 84]. Most of the image formatter is complete at the time that we write this paper (except the information extraction mode). Spatial and similarity retrieval for images has not been implemented yet.

Some relevant or complementary research has been done in other places. Aspects of document segmentation in an office environment are discussed in [Wong et al. 82]. Image compression is discussed in [Pratt et al. 80]. Text retrieval hardware is discussed in [Haskin 81]. Attribute and text retrieval aspects are also discussed in [McLeod 81] and [Stonebraker et al. 83].
Text retrieval for an office environment is also discussed in [Larson 83]. Several systems for image processing and pattern recognition have been implemented in the past ([Kunii and Harada 80], [Christodoulakis 80], [Kulick et al. 75], [Chang and Fu 81], [Economopoulos and Lochovsky 83]). Finally, an ambitious design for a system with attributes, text and pictures is described in [Penny and Picard 83].

References

[Aho et al. 78] Aho, A.V., Kernighan, B.W., Weinberger, P.J.: "Awk - A Pattern Searching and Processing Language" (2nd edition), September 1978.

[Ballard and Brown 82] Ballard, D. and Brown, C.: "Computer Vision", Prentice Hall, 1982.

[Chang and Fu 81] Chang, N. and Fu, K.: "Picture Query Languages for Pictorial Data-Base Systems", Computer 14, 11, 1981.

[Christodoulakis and Faloutsos 84] S. Christodoulakis and C. Faloutsos: "Performance Analysis of a Message File Server", IEEE Transactions on Software Engineering, March 1984.

[Christodoulakis 84] Christodoulakis, S.: "Implications of Certain Assumptions in Data Base Performance Evaluation", ACM TODS, June 1984.

[Christodoulakis 84a] Christodoulakis, S.: "Framework for the Development of a Mixed-Mode Message System", Proceedings ACM-BCS Symposium on Research and Development in Information Retrieval, Cambridge, England, 1984.

[Christodoulakis 83a] Christodoulakis, S.: "Estimating Block Transfers and Join Sizes", Proc. IEEE-ACM SIGMOD 83, San Jose, California, May 1983.

[Christodoulakis 83b] Christodoulakis, S.: "Estimating Record Selectivities", Information Systems 8,2, 1983.

[Christodoulakis 80] Christodoulakis, S.: "IPRL: An Interactive Pattern Recognition Laboratory", Proc. ACM SIGCSE 1980.

[Christodoulakis and Elles 84] "Fuzzy Retrieval of Images in a Multimedia Server", Internal Report, CSRI, 1984.

[Christodoulakis et al.
84a] S. Christodoulakis, M. Papa and M. Theodoridou: "Presentation and Communication of Multimedia Messages", Internal Report, CSRI, 1984.

[Christodoulakis et al. 84b] S. Christodoulakis, M. Papa, M. Theodoridou, J. Li and J. Vanderbroek: "Interactive Document Formation Using a Multimedia Server", Internal Report, CSRI, 1984.

[Date 82] Date, C.: "An Introduction to Database Systems vol. 3", Addison Wesley, 1982.

[Duda and Hart 73] Duda, R. and Hart, P.: "Pattern Classification and Scene Analysis", Wiley, 1973.

[Economopoulos and Lochovsky 83] Economopoulos, P. and Lochovsky, F.: "A System for Managing Image Data", in Proceedings of 9th World Congress, IFIP 83, Paris, Sept. 83.

[Electronics 83] "Special Report on Voice Systems", Electronics, April 83, pp. 126-143.

[Erman et al. 80] Erman, L.D., Hayes-Roth, F., Lesser, V.R. and Reddy, D.R.: "The Hearsay-II Speech-Understanding System: Integrating Knowledge to Resolve Uncertainty", Computing Surveys 12,2, June 1980.

[Feiner et al. 81] Feiner, S., Nagy, S., and van Dam: "An Integrated System for Creating and Presenting Complex Computer-Based Documents", Computer Graphics 15,3, August 1981.

[Fu 83] Fu, K.S.: "Syntactic Pattern Recognition and Applications", Prentice Hall, 1983.

[Furuta et al. 82] Furuta, R., Scofield, J., and Shaw, A.: "Document Formatting Systems: Survey, Concepts, and Issues", ACM Computing Surveys 14,3, Sept. 1982.

[Gonzalez and Wintz 77] Gonzalez, R., and Wintz, P.: "Digital Image Processing", Addison Wesley, 1977.

[Gray 83] Gray, J.: "Practical Problems in Data Management, A Position Paper", Proceedings SIGMOD 83, San Jose, May 1983.

[Haskin 81] Haskin: "Special Purpose Processors for Text Retrieval", Database Engineering 4,1, Sept. 1981.

[Horak 83] Horak, W.: "Interchanging Mixed Text Image Documents in an Office Environment", Comput.
& Graphics 7,1, 1983.

[IBM 79] IBM: "STAIRS/VS: Reference Manual", IBM System Manual, 1979.

[Kulick et al. 75] Kulick, J., Challis, T., Brace, C., Christodoulakis, S., Merrit, I., and Neelands, P.: "An Image Processing Laboratory for Automated Screening of Chest X-rays", Proceedings of IEEE Third International Conference on Pattern Recognition, Nov. 1975.

[Kunii and Harada 80] Kunii, T.L. and Harada, M.: "SID: A System for Interactive Design", Proc. AFIPS 1980 NCC, May 1980, pp. 33-40.

[Larson 83] Larson, P.A.: "A Method for Speeding up Text Retrieval", Proc. ACM SIGMOD, May 83.

[McLeod 81] McLeod, I.: "A Data Base Management System for Document Retrieval Applications", Information Systems 6,2, 1981, pp. 131-137.

[McLeod 83] McLeod, R.: "Management Information Systems", SRA, Second Edition, 1983.

[Pavlidis 77] Pavlidis, T.: "Structural Pattern Recognition", Springer-Verlag, 1977.

[Penny and Picard 83] Penny, P. and Picard, M.: "Application of Novel Technologies to the Management of a Very Large Data Base", Proceedings VLDB-83, pp. 20-30.

[Pratt et al. 80] Pratt, W.K., Capitant, P.J., Chen, W.H., Hamilton, E.R. and Wallis, R.H.: "Combined Symbol Matching Facsimile Data Compression System", Proceedings IEEE 68, July 80, pp. 786-796.

[Rabitti and Zizka 84] Rabitti, F. and Zizka, J.: "Evaluation of Access Methods to Text Documents in Office Systems", Proc. ACM-BCS Symposium on Research and Development in Information Retrieval, 1984.

[Reddy 75] Reddy, D.R.: "Speech Recognition", Academic Press, 1975.

[Reddy 76] Reddy, D.R.: "Speech Recognition by Machine: A Review", Proceedings of the IEEE, 64,4, April 1976.

[Rijsbergen 79] C.J. Rijsbergen: "Information Retrieval", Butterworths, 1979, Second Edition.

[Robertson et al. 81] S.E. Robertson, C.J.
van Rijsbergen, M.F. Porter: "Probabilistic Models of Indexing and Searching", ACM-BCS Conference on Research and Development in Information Retrieval, in Information Retrieval Research, Oddy, Robertson, van Rijsbergen, Williams, editors, Butterworths, 1981.

[Rosenfeld 76] Rosenfeld, A.: "Digital Picture Analysis", Springer-Verlag, 1976.

[Salton and McGill 83] G. Salton and McGill: "Introduction to Modern Information Retrieval", McGraw-Hill, 1983.

[Stonebraker et al. 83] Stonebraker, M., Stettner, H., Lynn, N., Kalash, J. and Guttman, A.: "Document Processing in a Relational Database System", ACM Transactions on Office Information Systems.

[Teorey and Fry 82] Teorey, T. and Fry, J.: "Design of Database Structures", Prentice Hall, 1982.

[Tou and Gonzalez 74] Tou, J.T. and Gonzalez, R.C.: "Pattern Recognition Principles", Addison Wesley, 1974.

[Tsichritzis 83] Tsichritzis, D.: "Message Addressing Schemes", ACM TOOIS, 1984.

[Tsichritzis 83b] Tsichritzis, D. (editor): "Beta-Gamma", Technical Report CSRG-150, University of Toronto, 1983.

[Tsichritzis and Christodoulakis 83] D. Tsichritzis and S. Christodoulakis: "Message Files", ACM Transactions on Office Information Systems 1,1, 1983.

[Tsichritzis et al. 83] D. Tsichritzis, S. Christodoulakis, P. Economopoulos, C. Faloutsos, A. Lee, D. Lee, J. Vanderbroek, C. Woo: "A Multimedia Office Filing System", Proceedings VLDB 83, Florence, Italy, 1983.

[Tsichritzis and Lochovsky 82] Tsichritzis and Lochovsky: "Data Models", Prentice Hall, 1982.

[Ullman 83] "Principles of Data Base Systems" vol. 2, 1983, Computer Science Press.

[Vanderbroek 84] Vanderbroek, J.: "A Message File System", M.Sc. Thesis, University of Toronto, 1984 (to appear).
[Verning 77] Verning, C.: "Automatic Query Adjustment in Document Retrieval", Information Processing and Management 13,6, 1977.

[VLDB 83a] "Panel on Complex Data Objects: text, voice, images: Can DBMS manage them?", Proceedings VLDB 83.

[VLDB 83b] "Panel on Office Information Systems: What is our role?", Proceedings VLDB 83.

[Wiederhold 83] Wiederhold, G.: "Database Design", McGraw Hill, 1983.

[Wong et al. 82] Wong, K.Y., Casey, R.G., Wahl, F.M.: "Document Analysis System", IBM J. Res. Develop. 26,6, Nov. 82.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the Tenth International Conference on Very Large Data Bases. Singapore, August, 1984