The workshop "Building and Owning Biotechnology Databases" was
held at the Golden Tulip Hotel, Purmerend, The Netherlands on 22-23 September
1998. It was organised by The Biotechnology Information Strategic Forum,
with support from DGXII of the Commission of the European Communities.
SRS, a key software program and its future -- Thure Etzold, EMBL-EBI
SRS (Sequence Retrieval System) is an indexing and querying system for databanks in text or flat file format. It is increasingly used on servers where more than 50 databanks have been loaded, and is simple to set up and maintain. The databases are linked within the network to allow the user to navigate through them and make sophisticated queries. SRS is presently installed on more than 30 public servers throughout the world and gives access to more than 300 databases.
SRS uses the fact that all molecular biology databanks are available as text files (ASCII being the de facto standard for data exchange). Furthermore, almost all the applications in molecular biology (such as BLAST, PHRAP, TACG, etc.) produce text as their output.
Working with text has a distinct advantage: the data can be passed to other standard software packages, such as word processors or database management systems, with little further work.
Furthermore, SRS can use data in flat file formats. This means that no data are lost after data import and that new databanks can be added to a service in a modular way. The system also behaves well when databanks change and the parser allows one to interpret data in many different ways.
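The idea of parsing a flat file into fields and indexing it can be illustrated with a minimal sketch. It is written in plain Python and is not SRS's own parser. The sample records, the function names `parse_entries` and `build_index`, and the choice of the `KW` line as the indexed field are all assumptions made for illustration. The layout (two-letter line codes, entries terminated by `//`) follows the style of EMBL-like flat files.

```python
from collections import defaultdict

# Invented sample in the style of an EMBL-like flat file: two-letter line
# codes, entries terminated by "//". The records are made up for illustration.
FLAT_FILE = """\
ID   X00001
DE   Example kinase gene
KW   kinase; signalling
//
ID   X00002
DE   Example transporter gene
KW   transporter; membrane
//
"""

def parse_entries(text):
    """Split flat-file text into entries, each a dict of line code -> list of values."""
    entries, current = [], defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            # End of entry: store what has been collected and start afresh.
            if current:
                entries.append(dict(current))
            current = defaultdict(list)
        elif line.strip():
            code, _, value = line.partition(" ")
            current[code].append(value.strip())
    if current:
        entries.append(dict(current))
    return entries

def build_index(entries, field):
    """Map each lower-cased token found in `field` to the IDs of entries containing it."""
    index = defaultdict(set)
    for entry in entries:
        entry_id = entry["ID"][0]
        for value in entry.get(field, []):
            for token in value.replace(";", " ").lower().split():
                index[token].add(entry_id)
    return index

entries = parse_entries(FLAT_FILE)
keyword_index = build_index(entries, "KW")
print(sorted(keyword_index["kinase"]))  # ['X00001']
```

Because the original text is kept intact and only the index is derived from it, a different parser can later interpret the same file in another way without a new import step.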
A key advantage of SRS is that it allows extremely heterogeneous databanks to be integrated, and it scales. Thus SRS has allowed us to unify a number of Human Locus Specific Mutation Databanks by taking some 30 different databases, writing parsers for each, and then identifying 11 data fields that can be supported by all. Unified and specific object loaders were then written so that a unified view of the different databanks was obtained. Another example concerns the Advanced Database Linkages in Biotechnology Project (ADLIB) where SRS is being used to link a wide variety of different databases covering business, literature, culture collections and resources, sequences and other factual science data collections.
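The unification step for the mutation databanks can be sketched as a mapping from each databank's native field names onto a shared set of fields. The sketch below is an assumption-laden illustration, not the loaders that were actually written: the two databanks, their field names, the four common fields (the project identified 11) and the function `unify` are all hypothetical.

```python
# Hypothetical records from two locus-specific mutation databanks that
# describe the same kind of data under different field names.
DATABANK_A = [{"gene": "GENE1", "nt_change": "c.100A>G", "ref": "Smith 1997"}]
DATABANK_B = [{"Locus": "GENE2", "Mutation": "c.250C>T", "Phenotype": "mild"}]

# Per-databank mapping from native field name to an assumed common field name.
FIELD_MAPS = {
    "A": {"gene": "gene", "nt_change": "mutation", "ref": "reference"},
    "B": {"Locus": "gene", "Mutation": "mutation", "Phenotype": "phenotype"},
}

# Assumed common fields; the real project settled on 11.
COMMON_FIELDS = ("gene", "mutation", "reference", "phenotype")

def unify(record, source):
    """Return the record re-keyed to the common fields; missing fields become None."""
    mapping = FIELD_MAPS[source]
    unified = {field: None for field in COMMON_FIELDS}
    for native_name, value in record.items():
        if native_name in mapping:
            unified[mapping[native_name]] = value
    unified["source"] = source  # remember which databank the record came from
    return unified

unified_view = [unify(r, "A") for r in DATABANK_A] + [unify(r, "B") for r in DATABANK_B]
for row in unified_view:
    print(row)
```

Every row in `unified_view` carries the same keys, so a single query can run over all databanks, while the per-databank mapping keeps the databank-specific knowledge in one place.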
SRS also allows different degrees of databank interoperation. The simplest uses hypertext, while a more sophisticated one allows for indexed links. One can also develop composite structures that allow a new data structure to be presented after different parsing actions have been completed.
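One way to picture an indexed link, as opposed to a hypertext jump from a single entry, is a precomputed lookup table that can be followed in either direction for a whole set of entries at once. The sketch below makes that assumption explicit. It does not reproduce SRS's internal link structures, and the databanks, identifiers and the functions `build_link_index` and `follow` are hypothetical.

```python
from collections import defaultdict

# Hypothetical cross-references: each entry in a "sequence" databank lists
# the IDs of entries in a "protein" databank it refers to.
SEQUENCE_XREFS = {
    "SEQ1": ["PROT1"],
    "SEQ2": ["PROT1", "PROT2"],
    "SEQ3": [],
}

def build_link_index(xrefs):
    """Build forward and reverse lookup tables from per-entry cross-references."""
    forward, reverse = defaultdict(set), defaultdict(set)
    for source_id, targets in xrefs.items():
        for target_id in targets:
            forward[source_id].add(target_id)
            reverse[target_id].add(source_id)
    return forward, reverse

def follow(link_table, ids):
    """Return every entry reachable from `ids` through one link table."""
    linked = set()
    for entry_id in ids:
        linked |= link_table.get(entry_id, set())
    return linked

forward, reverse = build_link_index(SEQUENCE_XREFS)
print(sorted(follow(forward, {"SEQ2", "SEQ3"})))  # ['PROT1', 'PROT2']
print(sorted(follow(reverse, {"PROT1"})))         # ['SEQ1', 'SEQ2']
```

The reverse table is what a plain hypertext reference cannot offer: starting from a protein entry, it finds the sequence entries that point to it even though the protein entry itself holds no such reference.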
SRS now allows us to take records from different databanks, for instance, TFGene, TFactor, TFSite, SWISS-PROT, EMBL and EPD, and export the combined results to other platforms/programmes such as CORBA, OEM or XML for further work. Furthermore, SRS will soon be able to offer authentication procedures so that commercial and public-sector databanks can be searched in one session resulting in even more composite and useful findings.
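As a rough illustration of exporting combined results as XML, the sketch below serialises a few records with Python's standard `xml.etree.ElementTree` module. The records, the element and attribute names (`results`, `entry`, `description`) and the function `to_xml` are assumptions and do not reflect the XML layout SRS itself produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical linked records; identifiers and field names are invented.
combined = [
    {"databank": "SWISS-PROT", "id": "P00001", "description": "Example factor"},
    {"databank": "EMBL", "id": "X00001", "description": "Example gene"},
]

def to_xml(records):
    """Serialise records from several databanks into one XML document string."""
    root = ET.Element("results")
    for record in records:
        # Each record becomes an <entry> tagged with its databank of origin.
        entry = ET.SubElement(
            root, "entry", databank=record["databank"], id=record["id"]
        )
        ET.SubElement(entry, "description").text = record["description"]
    return ET.tostring(root, encoding="unicode")

xml_text = to_xml(combined)
print(xml_text)
```

Keeping the databank of origin as an attribute on each entry lets a downstream program tell apart records that were combined in one result set.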
Up to Release 5.1, SRS is owned by EMBL and has been placed in the public domain. From the next release onward, however, the exclusive rights will belong to LION BioScience, which will license it for commercial and academic use. The licence for academic use will be free of charge, but commercial users will be expected to pay a licence fee.
LION is keen to discuss how best to use SRS for the benefit of the scientific
community, and to continue developing the software for the bioinformatics
market.