SEAMLESS: Introduction to the Project - Ariadne Web Magazine for Information Professionals

Mary Rowlatt describes SEAMLESS, the Essex-based project.

SEAMLESS is a two-year research project, funded by the British Library, which aims to develop a new model for citizens’ information - one which is distributed, and based on partnerships and common standards.

The objectives of the SEAMLESS project are to:

* build strong and sustainable partnerships between the various information providers operating in the region
* develop and implement common standards (technical and informational) so as to achieve interoperability between their systems and data
* develop a SEAMLESS interface which will allow simultaneous querying of distributed information sources (whether stored in a database, made available on a website, or in word-processed documents) and return all the information to the user in a unified list
* facilitate electronic communication between the information providers and their customers, and between the various participating agencies
* develop a current awareness/alerting service for users (second phase)

Currently the project team (Essex Libraries, Fretwell Downing Data Systems Ltd. and Education for Change Ltd.) is working with 29 organisations in Essex (national government departments, County Council departments, District Councils, Health authorities, business organisations, educational establishments, CABs, voluntary and charitable groups etc.) to develop the necessary standards and set up a prototype system. The application of metatags and the creation of a common thesaurus are being investigated.
Once the system has been tested, modified, and proved viable, it is hoped that it will be opened up to all information providers in the region, and that it will form the basis for the development and delivery of citizens’ information in Essex in the future.

Why do we need a profile for citizens’ information?

Developments in three discrete, but inter-related, areas are converging to create a need for a new profile, or standard attribute set, for citizens’ information which will support greater interoperability, help to improve resource description and discovery, and act as a basis for the development of new e-services:

* a growing emphasis on joint working and partnership arrangements, often described as ‘joined up thinking’ or ‘joined up government’, amongst organisations which provide services to the public. Increasingly, this includes cross-sectoral initiatives involving players from the public, private and voluntary sectors. Examples include: Health Action Zones, Employment Action Zones, Regeneration (Pathfinder) Projects, social exclusion initiatives, Local Agenda 21 projects, the implementation of the Crime and Disorder Act, the development of a National Childcare Strategy and implementation of Early Years Development Plans, Lifelong Learning, University for Industry, and the setting up of Regional Development Agencies. All depend on strong local partnerships, and in this context the ability to share information effectively is a key requirement, one which in turn rests upon the development and adoption of suitable standards.
* a concern for the users of local services, and the difficulties they face in trying to access the services and information they need in an increasingly complex and fragmented information environment. In order to live their lives and play their full part in society, people need information from a wide range of public, private and voluntary sector organisations, nationally and locally.
Most of these organisations produce information in a variety of printed and electronic formats, many of which are available in the public library. Users are confused by a multitude of overlapping information sources, with a multitude of different search interfaces, which act as a barrier to easy access. Although this approach is gradually being upgraded to a WWW-based environment, which allows access to an increasing amount of information, it still produces information ‘islands’ which can only be bridged through superficial high-level hyperlinks. Another problem, which is well recognised by information professionals, is that the current state of indexing and description of documents and resources on the web is inadequate, which means that searches tend to favour recall at the expense of precision. There is a need for further development, and practical application, in areas such as the use of metadata, and automated techniques based on harvesting and web crawlers, in order to improve this situation.
* the publication of the influential report on the future for Britain’s public libraries, ‘New Library - the Peoples’ Network’ [1] (http://www.lic.gov.uk/publications/newlibrary.html), has highlighted the need for public libraries to be linked up to a high speed, high capacity digital network. Attention now is turning to the content and services that public libraries will be able to deliver over that network. The provision of citizens’, or community, information has traditionally been one of the public library’s core functions, and there is considerable interest in the question of whether and how citizens’ information resources held locally can be aggregated, or made available, as a national resource.

Related research

The British Library funded CIRCE project (http://www.gloscc.gov.uk/circe/index.htm) has been investigating the potential for networking public library community information databases.
The fundamental difference between CIRCE and SEAMLESS is that the SEAMLESS team do not see a long-term future for public library community information databases as such. Rather, they take the view that there is a danger that public libraries may become marginalised as information providers unless the twin ‘threats’ of competition from other information providers and the trend to remote access to information encouraged by the development of the WWW are addressed. The SEAMLESS project proposes to develop, test and evaluate a new model for citizens’ information provision in which the public library becomes the facilitator, co-ordinator and standard setter for a distributed system (made up of the information resources of a network of local information providers) and provides expertise and training on demand.

Two basic, but crucial, pre-conditions underpin this new model. The first is that a substantial degree of co-operation is needed between the various information providers in any given locality: no one organisation can provide a successful citizens’ information service in isolation. The second is that some common technical and information standards need to be developed and adopted in order to facilitate successful co-operation and to enable the necessary sharing of data between partners and efficient dissemination of data to the wider public.

One of the key aims of the SEAMLESS project is to test whether some of the large body of previous research into interoperability and metadata could beneficially be applied to a new domain - that of citizens’ information. (See www.ukoln.ac.uk/metadata/, www.ukoln.ac.uk/elib/ and www2.echo.lu/libraries/en/metadata/matahome.html for more information on a number of European Union (EU) and Joint Information Systems Committee (JISC) projects funded under the Telematics for Libraries and Electronic Libraries (e-Lib) programmes.)
Interest in this area continues to grow, and JISC and BLRIC (British Library Research and Innovation Centre) have recently established UK Interoperability Focus to explore, publicise and mobilise the benefits and practice of interoperability across diverse information sectors (www.ukoln.ac.uk/interop-focus/).

Extant profiles

Standard attribute sets are a useful starting point for considering data representation in any area. A number of these either currently exist or are emerging in the area of citizens’ information. The SEAMLESS team studied existing standard attribute sets and compared their elements and possible application. The team also looked at a variety of sources describing the general application of metadata [2] [3] [4] [5] [6].

US MARC Community Information Format

This is the extension of the US MARC attribute set that covers cataloguing of community information. Further details about this attribute set are available at the Library of Congress website (http://lcweb.loc.gov/marc/community/eccihome.html).

Dublin Core

The Dublin Core seeks to establish a way to describe documents and “document-like objects”, such as web pages, in a way which will enable search engines to index and retrieve them. Further information is available from the website (http://purl.org/dc/).

GILS

The Government (or Global) Information Locator Service is the result of an international agreement (based on original work among government departments in the US) to provide a standard for locating information, whether held in libraries, data centres, or published on the Internet. The standard adopted for this service is ISO 23950, also known as (ANSI) Z39.50 [7]. Further information is available from the website (http://www.usgs.gov/gils/).

CIMI

Consortium for the Computer Interchange of Museum Information.
Since 1990 CIMI has made substantial progress in the development of standards for structuring museums’ data and enabling widespread search and retrieval capabilities. Further information is available from the CIMI website (http://www.cimi.org).

IMS

Instructional Management Systems. The IMS Project is developing and promoting open specifications for facilitating online activities such as locating and using educational content, tracking learner progress, reporting learner performance and exchanging student records between administrative systems. Further information can be found on the IMS website (http://www.imsproject.org/what.html).

Development of the SEAMLESS profile

SEAMLESS was established with the intention that a wide range of types of organisation should be included, so it was important to ensure that the final system would be hospitable to different types of information and that it would meet the needs of varying types of organisation and the particular needs of their customers.

In setting out to define a common information profile (attribute set), the project team contacted a wide variety of potential partner organisations, selected to include some who had expressed interest following the launch conference, some who had worked with the library service before, and some who, it was felt, would enhance the variety of information challenges for the pilot project. Meetings were held with each organisation during Spring 1998 to give them more information about SEAMLESS and to collect information about their role and services, and a workshop was held in April. An Information Audit was carried out during June and July 1998 to analyse the organisations’ information products and systems in detail and to assist them in the selection of information sets to make available for the pilot project.
This information was then collated, and there followed an iterative process of developing a set of information attributes which were both broad enough to encompass the range of domains represented and suitably constrained so as to be manageable in the real-world working environment of the organisations concerned.

Following the research into existing standards, the team undertook a detailed analysis of the sample data supplied by partner organisations during the Information Audit. The team identified and mapped the various elements within each data set to establish overlaps and common terms. Research staff from Essex Libraries, Education for Change and Fretwell Downing then met to discuss the various options.

The original proposal for the SEAMLESS project postulated an information profile based upon the Dublin Core. Initial research within the project indicated that GILS provided a better basis for development. It was felt that it provided a more hospitable attribute set for the elements identified within the sample data than Dublin Core, while being less complicated to apply and offering more potential for accommodating future developments than USMARC. It is also compliant with the international standard for information searching, ISO 23950 (Z39.50), which is used in the project.

Having decided that GILS might be the standard to use, the research team then undertook detailed matching of the data obtained from partners in the Information Audit to the full GILS Core Elements. The profile had to be able to cope with elements from three data formats: databases, where every field would need to be tagged in order to be displayed; web pages, where only searchable elements were required; and word-processed documents, where again searchable elements were needed but where substantial editing might be required to produce usable data. This work showed that the majority of data would fit into the GILS Core Elements.
The major gap was for information relating to educational courses, where there seemed to be nowhere to include information about entry requirements, resulting qualification, target audience or the duration or type of course. The team therefore reconsidered the other extant standards and decided that the IMS profile included elements which would plug this gap. Following advice from Fretwell Downing, four IMS elements were included in the SEAMLESS Information Profile as a Learning Provision Subset.

In addition, discussions with the participating information providers indicated a desire to incorporate the Alta Vista format for the keyword and description attributes. These therefore appear in the SEAMLESS profile without the SEAMLESS prefix (se.), the intention being that these tags can be recognised by the Alta Vista robots as well as by SEAMLESS.

Matching also showed that, for the majority of the data currently included in the project, the full set of GILS Core Elements was not required. GILS includes some quite complicated nested tags and requires some expertise to implement correctly. The intention is that partner organisations will add the tags themselves, and the team was conscious that the process had to be simplified as much as possible. The workload involved in manipulating data for SEAMLESS had already been identified as a potential problem by many of the organisations, and it was felt that any long and complicated tagging process might cause some organisations to drop out of the project.

After discussion with GILS experts at Fretwell Downing and Sebastian Hammer of Index Data, Denmark, the team developed a set of 33 SEAMLESS information attributes (the ‘SEAMLESS Information Profile’) which can for the most part be mapped directly onto the equivalent GILS Core Elements.

Details of the SEAMLESS profile

The 33 elements are (mandatory elements in bold type):

| Element No. | Name | Description |
|---|---|---|
| 1 | title | assigned title or description of the resource |
| 2 | source | the organisation or provider who is making the information available to SEAMLESS |
| 3 | date-last-modified | in the form DD/MM/YYYY |
| 4 | channel | term(s) from the SEAMLESS Channels list |
| 5 | keywords | term(s) from the SEAMLESS thesaurus |
| 6 | originator | the body primarily responsible for the intellectual content of the information |
| 7 | contact-name | the person to contact for more information |
| 8 | contact-organisation | the name of the organisation to contact for more information |
| 9 | contact-address | the address of the organisation to contact for more information |
| 10 | contact-network-address | email address to contact for more information |
| 11 | distributor | this element will apply mainly to bibliographic items |
| 12 | cost | cost information |
| 13 | begin-date | in the form DD/MM/YYYY |
| 14 | end-date | in the form DD/MM/YYYY |
| 15 | time-textual | time/date expressed in words |
| 16 | linkage | show URL, URI, SICI, PII, DOI, PURL, ISBN, ISSN etc. here |
| 17 | linkage-type | e.g. HTML, MIME, plain text etc. |
| 18 | medium | e.g. CD-ROM, Book, Video etc. |
| 19 | place | one term plus its post town, e.g. Chelmsford |
| 20 | description | a textual description relating to the general nature and content |
| 21 | contributor | e.g. co-author |
| 22 | date-of-publication-structured | in the form DD/MM/YYYY |
| 23 | date-of-publication-textual | date expressed in words |
| 24 | language | language of the intellectual content of the resource |
| 25 | general-constraint | e.g. copyright, use & reuse, intellectual property etc. |
| 26 | control-identifier | any local reference number that uniquely identifies the resource within its domain |
| 27 | record-review-date | in the form DD/MM/YYYY |
| 28 | supplemental-information | a field to map miscellaneous information |
| 29 | body | body text (where appropriate). Basic formatting (white space) is preserved. |
Learning provision sub-set

| Element No. | Name | Description |
|---|---|---|
| 30 | ims.prerequisite | entry requirements for courses |
| 31 | ims.educationalobjective | qualification or intended learning result of course |
| 32 | ims.level | the target audience or level of the course |
| 33 | ims.duration | length of the course and/or the type of study e.g. full time, part time etc. |

Mapping of SEAMLESS Profile Attributes to GILS Core Elements

The mappings are as shown in the table below. Note that where GILS provides several groupings of sub-elements, the decision was taken within the SEAMLESS project to provide a “flat” (i.e. non-nested) schema, which it was felt would ease the process of data preparation across a wide variety of locations and by staff with varying levels of technical understanding.

| SEAMLESS Element No. | Name | GILS Element No. | Equivalent GILS Core Element |
|---|---|---|---|
| 1 | title | 4 | Title |
| 2 | source | 1019 | Record source |
| 3 | date-last-modified | 1012 | Date of last modification |
| 4 | channel | 2074 | Controlled Subject Index sub-group: Controlled term |
| 5 | keywords | 2074 | Controlled Subject Index sub-group: Controlled term |
| 6 | originator | 1005 | Originator |
| 7 | contact-name | 2023 | Point of Contact sub-group: Contact Name |
| 8 | contact-organisation | 2024 | Point of Contact sub-group: Contact Organization |
| 9 | contact-address | 2025 - 2029 | Point of Contact sub-group: Contact Street Address; Contact City; Contact State or Province; Contact Zip or Postal Code |
| 10 | contact-network-address | 2030 | Point of Contact sub-group: Contact Network Address |
| 11 | distributor | 2006 | Availability sub-group: Distributor Name |
| 12 | cost | 2055 | Order Process sub-group: Cost Information |
| 13 | begin-date | 2072 | Availability sub-group: Beginning Date |
| 14 | end-date | 2073 | Availability sub-group: Ending Date |
| 15 | time-textual | 2045 | Availability sub-group: Available Time Textual |
| 16 | linkage | 2021 | Availability sub-group: Linkage |
| 17 | linkage-type | 2022 | Availability sub-group: Linkage Type |
| 18 | medium | 1031 | Availability sub-group: Medium |
| 19 | place | 2042 | Spatial Domain sub-group: Place Keyword |
| 20 | description | 62 | Abstract |
| 21 | contributor | 1003 | Contributor |
| 22 | date-of-publication-structured | 31 | Date of Publication sub-group: Date of Publication Structured |
| 23 | date-of-publication-textual | 31 | Date of Publication sub-group: Date of Publication Textual |
| 24 | language | 54 | Language of Resource |
| 25 | general-constraint | 2005 | Use Constraint |
| 26 | control-identifier | 1007 | Control Identifier |
| 27 | record-review-date | 2051 | Record Review Date |
| 28 | supplemental-information | 2050 | Supplemental Information |
| 29 | body | None | None |
| | Learning provision sub-set | | |
| 30 | ims.prerequisite | None | SEAMLESS/IMS specific sub-group |
| 31 | ims.educationalobjective | None | SEAMLESS/IMS specific sub-group |
| 32 | ims.level | None | SEAMLESS/IMS specific sub-group |
| 33 | ims.duration | None | SEAMLESS/IMS specific sub-group |

Mapping of SEAMLESS Profile Attributes to Dublin Core Elements

During discussion several partners expressed concern about implementing a SEAMLESS attribute set which would not provide additional retrieval advantages in the wider web community beyond those systems already recognising GILS. There was some feeling, particularly in the academic organisations, that they did not wish to cut themselves off from the Dublin Core community. The team therefore decided to include a mapping of SEAMLESS attributes to Dublin Core Elements as part of the system. This is shown in the following table. For details of similar work see the ‘Dublin Core/MARC/GILS crosswalk’ [8] (http://www.loc.gov/marc/dccrocc.htm).

| SEAMLESS element | Purpose | Dublin Core element | Purpose |
|---|---|---|---|
| title | The assigned title or description of the resource. | Title | The name given to the resource by the Creator or Publisher. |
| originator | To identify the organisation(s) or person(s) responsible for the creation of the resource. | Creator | The person(s) or organisation(s) primarily responsible for the intellectual content of the resource. |
| keywords | To specify the subject or topic of the resource using a controlled vocabulary that describes its content for resource description and discovery purposes. | Subject | The topic of the resource, or keywords or phrases that describe the subject or content of the resource. |
| description | A textual description relating to the general nature and content of the resource. | Description | A textual description of the contents of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources. |
| distributor | To identify the entity responsible for making the resource available in its present form, such as a publishing house, university department or corporate entity. | Publisher | The entity responsible for making the resource available in its present form, such as a publisher, a university department, or a corporate entity. |
| contributor | To identify other significant contributors to the intellectual content of the resource in addition to the originator. | Contributor | Person(s) or organisation(s) in addition to those specified in the Creator element who have made significant intellectual contributions to the resource but whose contribution is secondary to the individuals or entities specified in the Creator element. |
| date of publication | To show the date the resource was published. | Date | The date the resource was made available in its present form. |
| medium | To specify the physical format and data representation of the resource. | Type | The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary. |
| linkage | To provide the location or address of an automatic linkage to an electronic resource. | Identifier | String or number used to uniquely identify the resource. |
| linkage type | To identify the data content type associated with the electronic resource, e.g. HTML for a web page, PDF for a Portable Document Format file. | Format | The data representation of the resource, such as text/html, ASCII, Postscript file, executable application, or JPEG image. |
| None at present (GILS: SOURCES OF DATA) | | Source | The work, either print or electronic, from which this resource is derived, if applicable. |
| language | To indicate to the user the language of the intellectual content of the resource. | Language | Language of the intellectual content of the resource. |
| None at present (GILS: CROSS REFERENCE RELATIONSHIP, CROSS REFERENCE LINKAGE) | | Relation | Relationship to other resources. |
| begin-date, end-date, time-textual, place | To indicate any start or end dates associated with the resource; to indicate the expression of dates and times in words; to indicate the location where the activity occurs. | Coverage | The spatial locations and temporal durations characteristic of the resource. |
| general constraints | To indicate if any access constraints pertain to the use of the resource. | Rights | The content of this element is intended to be a link (a URL or other suitable URI as appropriate) to a copyright notice, a rights-management statement, or perhaps a server that would provide such information in a dynamic way. The intention of specifying this field is to allow providers a means to associate terms and conditions or copyright statements with a resource or collection of resources. No assumptions should be made if such a field is empty or not present. |

Searchable attributes

For the initial implementation the following attributes will be searchable: keyword, subject, name, place and date.

Comments please

The SEAMLESS team would welcome comments on the proposed citizens’ information profile as outlined above from colleagues active in the fields of metadata and interoperability research and from public libraries and other organisations providing information to the public. Please contact either Mary Rowlatt (maryr@essexcc.gov.uk) or the SEAMLESS team (seamless@essexcc.gov.uk).
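As a concrete illustration of the flat profile and its Dublin Core mapping described above, the following Python sketch shows how a partner organisation might tag one web page. It is not the project's software. It assumes, hypothetically, that a record is held as a dictionary keyed by SEAMLESS element name and that tags are written as HTML meta elements with name and content attributes, a syntax the article does not specify. The function names and the sample record are invented for illustration. Only the "se." prefix, the unprefixed keywords and description tags, the DD/MM/YYYY date form and the element mappings come from the text.

```python
from datetime import datetime
from html import escape

# Per the article, keywords and description carry no "se." prefix so that
# Alta Vista robots can read them as well as SEAMLESS.
UNPREFIXED = {"keywords", "description"}

# Elements the profile table describes as "in the form DD/MM/YYYY".
DATE_ELEMENTS = {
    "date-last-modified", "begin-date", "end-date",
    "date-of-publication-structured", "record-review-date",
}

# One-to-one rows of the SEAMLESS -> Dublin Core table. The many-to-one
# Coverage row is left out, and using the structured publication date
# for Date is an assumption made for this sketch.
SEAMLESS_TO_DC = {
    "title": "Title",
    "originator": "Creator",
    "keywords": "Subject",
    "description": "Description",
    "distributor": "Publisher",
    "contributor": "Contributor",
    "date-of-publication-structured": "Date",
    "medium": "Type",
    "linkage": "Identifier",
    "linkage-type": "Format",
    "language": "Language",
    "general-constraint": "Rights",
}


def check_dates(record):
    """Raise ValueError if a structured date does not parse as day/month/year."""
    for name in DATE_ELEMENTS & record.keys():
        datetime.strptime(record[name], "%d/%m/%Y")


def to_meta_tags(record):
    """Render a record (dict of element name -> text) as HTML meta tags."""
    check_dates(record)
    tags = []
    for name, value in record.items():
        tag_name = name if name in UNPREFIXED else "se." + name
        tags.append('<meta name="%s" content="%s">'
                    % (escape(tag_name), escape(value)))
    return tags


def to_dublin_core(record):
    """Keep only elements with a Dublin Core equivalent, renamed accordingly."""
    return {SEAMLESS_TO_DC[k]: v for k, v in record.items()
            if k in SEAMLESS_TO_DC}


if __name__ == "__main__":
    # Hypothetical record, invented purely for illustration.
    example = {
        "title": "Adult evening classes",
        "keywords": "education",
        "date-last-modified": "03/11/1998",
        "originator": "Essex Libraries",
    }
    print("\n".join(to_meta_tags(example)))
    print(to_dublin_core(example))
```

Keeping the record flat, as the profile does, means each element becomes exactly one tag. A nested GILS-style structure would need extra conventions for grouping sub-elements.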
References

[1] New Library - the Peoples’ Network. Library and Information Commission, 1997. Available from: http://www.lic.gov.uk/publications/newlibrary.html
[2] Dempsey, Lorcan and Heery, Rachel. Metadata: a current view of practice and issues. Journal of Documentation, Vol. 54(2), March 1998, p145-172
[3] European Commission, DGXIII-E4. Report of the Metadata Workshop held in Luxembourg, 1st and 2nd December, 1997
[4] Dempsey, Lorcan and Heery, Rachel, with contributions from Martin Hamilton, Debra Hiom, Jon Knight, Traugott Koch, Marianne Peereboom and Andy Powell. A review of metadata: a survey of current resource description formats. DESIRE 1, deliverable 3.2(1), March 1997. Available from: http://www.ukoln.ac.uk/metadata/desire/overview/
[5] Younger, Jennifer A. Resource description in the digital age. Library Trends, Vol. 45(3), Winter 1997, p462-487
[6] Heery, Rachel. Review of metadata formats. Program, Vol. 30(4), October 1996, p345-373
[7] ISO 23950 1998/ANSI/NISO Z39.50 1995. Information retrieval (Z39.50): application service definition and protocol specification. ISO, 1998
[8] Dublin Core/MARC/GILS crosswalk. Network Development and MARC Standards Office, last updated 04/07/97. Available from: http://www.loc.gov/marc/dccrocc.html

Author details

Mary Rowlatt
Information Services Manager and Project Leader for SEAMLESS
maryr@essexcc.gov.uk

Cathy Day
Research Assistant, SEAMLESS project
seamless@essexcc.gov.uk

Jo Morris
Research Assistant, SEAMLESS project
seamless@essexcc.gov.uk

Kevin Atkins
Network Services Consultant, Fretwell-Downing Data Systems Ltd.
katkins@fdgroup.co.uk

Ariadne is published by Loughborough University Library © Ariadne ISSN: 1361-3200.