A Workshop organised by The Biotechnology Information Strategic Forum, with support from DGXII of the Commission of the European Communities, and held at Purmerend, The Netherlands, May 1997
The workshop "Financing Biotechnology Databases" was held at the Golden Tulip Hotel, Purmerend, The Netherlands, in May 1997. It was organised by The Biotechnology Information Strategic Forum, with support from DGXII of the Commission of the European Communities.
Electronic publication of biological databases. Does it pay? -- Peter H Schalk & Wouter Los
Introduction
ETI, the Expert Center for Taxonomic Identification, is an active developer in the field of information technology in relation to biological diversity. Biological diversity refers to taxonomic, genetic, and ecological diversity. It is a hot topic. New opportunities to integrate the various units of biological diversity, and to use information technology to combine and distribute sources worldwide, are producing exciting scientific developments. However, biodiversity is also a political issue, especially since the United Nations Convention on Biological Diversity, signed by nearly 200 countries, aims for international collaboration on the sustained use of biodiversity. This convention also addresses the need for accessible and reliable information, as it is the basis of responsible management of global biological diversity. The domain of information is vast, yet little is available in accessible formats, and even less in an integrated way. There is strong pressure to speed up data gathering with all kinds of cross-references, but also to maintain overall quality. This observation, combined with the fact that broader user groups (in science and society) ask for lower barriers to accessing and using the information, leads to the conclusion that traditional (printed) dissemination of information is no longer sufficient (Heywood, 1995). It is too slow, too expensive, and it cannot cope with large information flows. Making full use of Information & Communication Technology (ICT) provides an answer.
The Linnaeus II software program
ETI was established in 1992 as a global initiative to promote the sharing of knowledge in biological diversity. Founded by UNESCO, the Netherlands research council NWO, and the University of Amsterdam, ETI is now an international not-for-profit organisation. Apart from the head office in Amsterdam, there are self-supporting branches in Japan, Chile and Russia. New branches in the UK, USA and Uruguay are in preparation. The mission of ETI is to promote, preserve and disseminate taxonomic and biodiversity knowledge, and to stimulate international scientific and educational cooperation by applying ICT. ETI focuses on species diversity for scientific user groups.
The main activities of ETI in support of this mission are: ICT tool development, international (human resource) networking, electronic publishing and data dissemination, and providing new-media facilities and training. Creating biodiversity information and expert systems supports the dissemination of knowledge with improved accessibility. The central mechanism to achieve this is the distribution of the Linnaeus II software to participating scientists. This tool kit supports the scientists with various program modules for building multimedia databases, creating interactive identification tools, geographic information systems, etc. These modules are integrated in one interactive, user-friendly environment (Schalk and Los, 1995), offering a broad range of functional tools in one package. The assembled expertise of various scientists, working on related organisms in a defined taxonomic group or geographic region, can easily be transferred to CD-ROMs for electronic publishing and distribution. The Linnaeus II program also simplifies the transfer of existing printed materials into interactive information systems on CD-ROM. Interactive means intuitive point & click navigation through all hyper-linked program parts. From any part of the program the user may enter another part, or let the system analyse a problem, for example the identification of an organism (Schalk and Oosterbroek, 1996).
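As an illustration of what "letting the system analyse a problem" such as an identification can mean, the following is a minimal sketch of character-based matching. It is not the Linnaeus II algorithm, which is not described here. The taxa, characters and the function name `identify` are hypothetical and chosen only for the example.

```python
# Hypothetical reference data: each taxon is described by character states.
# None of these names or values come from Linnaeus II or from the paper.
TAXA = {
    "Species A": {"wings": "two", "antennae": "short", "colour": "black"},
    "Species B": {"wings": "two", "antennae": "long", "colour": "brown"},
    "Species C": {"wings": "four", "antennae": "long", "colour": "black"},
}


def identify(observed, taxa=TAXA):
    """Return the sorted names of taxa compatible with the observed characters.

    A taxon is kept when every observed character that is recorded for it
    has the same state; characters not recorded for a taxon do not exclude it.
    """
    return sorted(
        name
        for name, states in taxa.items()
        if all(states[ch] == st for ch, st in observed.items() if ch in states)
    )


# Each additional observation narrows the list of candidates.
candidates_one = identify({"wings": "two"})
candidates_two = identify({"wings": "two", "antennae": "long"})
print(candidates_one)
print(candidates_two)
```

The user can enter characters in any order, which is the main difference between this kind of multi-access matching and a fixed, printed dichotomous key.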
Problems
Linnaeus II offers simple-to-use yet powerful tools to construct specific information and expert systems or interactive educational systems. But making good systems (with extensive information of high quality) is extremely labour intensive. First of all, good biodiversity databases need input from many specialists around the world. Managing or coordinating these networks, assisting the participating scientists, and validating the results is also extremely time consuming. This implies a capital-intensive pre-production route. The CD-ROM production itself is relatively cheap, as is the entering of data and the technical combination of the contributions of various scientists. The real work is on the R&D side, at the level of the external scientists, the in-house assisting scientists, and the programmers who accommodate the scientists' special software requests. The production of a high-quality biodiversity information/expert system may take up to 48 months, mostly because completing and validating the data takes time. For this reason it is difficult to control the production process and to predict the return on investment.
Secondly, the market and the funding mechanisms for producing electronic systems in this field are not yet well developed. In spite of the increased awareness of the importance of biodiversity, the market for these specialised products is narrow and not very stable. The market is also relatively poor, with mostly scientific institutions and developing countries in the user group. Research councils and other bodies have no defined policies to address these problems with special funding programmes. The private sector is waiting for developments and carefully watches the few entrepreneurial initiatives.
Does it pay?
The answer to the question "does the electronic publication of biological databases pay?" depends on some conditions. The answer is definitely no if all related R&D costs are passed on to the final and total costs of the electronic product. These R&D costs include the research work done by the biological scientists and the software development by the publisher. If the scientists want to recover their invested time and research costs entirely from the sales of electronic products, the cost price of one CD-ROM would be extremely high. In the traditional printed way of publishing, scientists take it for granted that they offer their publications for free and get full publicity for their work in exchange. It is remarkable that the same scientists often expect to get money in return from electronic publications, since this medium is associated with profitable commercial activities.
Costs for software development are also high in this frequently changing and continuously evolving field. The fast sequence of changes in hardware and related operating systems requires adjustment and adaptation of the software tools (upgrades) on a regular basis. Furthermore, the innovations in the field of Bio-informatics are impressive, and both the scientists and the users set higher demands once they have been introduced to ICT and become familiar with it. The interactions between biologists and programmers often result in new ideas and algorithms for a variety of applications, but it is neither possible nor reasonable to fund these developmental activities from the sales of scientific CD-ROMs. If an electronic information product is too expensive because of its research and development costs, the effect will be counter-productive. A relatively low-priced product will be within reach of individual buyers, which also helps to create awareness of the new media tools.
The answer to the question "does it pay?" is yes if only the direct publishing and production costs are represented in the sales price. These costs relate to data entry and validation, debugging the software system, producing the CD-ROM itself, advertising and distribution. This practice is not very different from traditional publishing, and science should consider these CD-ROMs as a regular publishing method.
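The difference between the two answers can be made concrete with a small break-even calculation. The sketch below assumes a simple model in which fixed costs are spread evenly over the copies sold; all figures, cost categories and the function name `unit_cost` are hypothetical and are not taken from ETI's accounts.

```python
def unit_cost(fixed_costs, per_copy_cost, copies_sold):
    """Break-even price per copy: fixed costs spread over sales, plus per-copy cost."""
    if copies_sold <= 0:
        raise ValueError("copies_sold must be positive")
    return sum(fixed_costs.values()) / copies_sold + per_copy_cost


# Hypothetical figures, for illustration only.
rd_costs = {
    "scientist_research_time": 400_000,
    "software_development": 200_000,
}
direct_costs = {
    "data_entry_and_validation": 30_000,
    "debugging": 10_000,
    "advertising_and_distribution": 10_000,
}
per_copy = 5.0   # pressing and packaging one CD-ROM
copies = 1_000   # a narrow, specialised market

# Policy 1: all R&D costs are passed on to the product.
price_with_rd = unit_cost({**rd_costs, **direct_costs}, per_copy, copies)
# Policy 2: only direct publishing and production costs are passed on.
price_direct_only = unit_cost(direct_costs, per_copy, copies)

print(f"Break-even price including R&D: {price_with_rd:.2f}")
print(f"Break-even price, direct costs only: {price_direct_only:.2f}")
```

With these assumed numbers the first policy gives a break-even price more than ten times the second, which is the pattern the argument above relies on: in a narrow market, R&D costs dominate the price of each copy.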
Still, the (science) market is narrow, and to reach it maximally the product price has to be high enough for the sophisticated and specialised CD-ROM product. Of course, the publisher's policy has to strike a balance in the mix of popular and specialist titles. Establishing a subscription system with regular series will improve financial stability.
The approach of ETI
ETI recognised that international collaboration was essential to fulfil its objectives. The willingness of scientists to cooperate in developing electronic systems and to share knowledge and expertise is the basis of the united effort to disseminate biodiversity information world-wide. But this willingness does not come for nothing. That is why the Linnaeus II software package is central to ETI's services. Full support of the users is important. Linnaeus II was developed as a tool for science, and is constantly being improved in response to suggestions from scientists. The software package is used in the first place for the scientists' own purposes, but ETI expects that the expertise entered into the Linnaeus II program will be shared with colleagues by means of publishing in CD-ROM format. Presently there are over one thousand ETI partners, many of whom are working very actively with the software system. The number of ETI partners is growing exponentially, which also assures a growing number of expert products on CD-ROM. As a not-for-profit organisation, ETI presents itself as an activity for and by taxonomists. To coordinate the partner networks, ETI is establishing self-supporting ETI branches in various parts of the world. These branches may rely on ETI-Amsterdam for up-to-date software support, and they provide technical assistance to ETI Partners in the region.
Sharing of expertise is also important to reduce development costs. ETI is open to other initiatives in the ICT field, and promotes cooperation in R&D. In this fast-moving field it is essential to avoid unnecessary competition and duplication of (expensive) effort. For this reason ETI tries to interest companies and other organisations in sponsoring a fellowship programme that will bring together young and innovative scientists in Bio-informatics. Research councils should also establish programmes for the basic funding of (innovative) tool development. Until now, ETI's grant proposals have been rather successful. Nevertheless, a more secure funding policy for the R&D activities is desirable.
The production and distribution of biodiversity informatics CD-ROMs needs to be professional to be feasible. Although ETI is a not-for-profit organisation, it treats the production of scientific and educational CD-ROMs as a commercial organisation would, to promote professional awareness in its staff and to avoid losses. ETI is still improving a code of practice for its software development, its standards, and guaranteed support for experts and users. The professional approach also implies cooperation with other professional (commercial) players who are better at managing parts of the distribution activities. It is important that such cooperation between the not-for-profit scientific producer and the commercial partners respects the low profit margins in this field.
Finally, ETI has invested a lot of time in creating awareness of (the use of) new media products. A Partner Newsletter, attendance at scientific conferences and biodiversity policy meetings, and the distribution of a great deal of information to users worldwide were and still are important for ETI.
Global developments
In the field of genetic, species, and ecological diversity studies, biological informatics is expanding rapidly. The next decade will bring about further innovations and initiatives in biological informatics. One of the challenges is to make full use of the world's biodiversity databases, which are essentially diverse, heterogeneous, and distributed. As the information about biodiversity is a vast domain, tools are needed that make information about Earth's living resources readily available to various user groups. As the Internet will offer increasingly better facilities to transfer huge datasets, the future of global information systems lies in the Internet. This will be supported by advanced searching facilities (data mining) and analysis tools.
In 1993, the OECD Megascience Forum established an international working group in Bio-informatics, thus recognising the megascience nature of this subject. One of the recommendations will be to join forces to construct a Global Biodiversity Information Facility (GBIF). The GBIF is a distributed and virtual facility, growing through a modular approach in the meta-domain of distributed systems. Present and emerging initiatives of a new nature will assist in creating the GBIF. Expert tools such as those developed by ETI, and master reference files (such as Species 2000), will contribute to the framework of the GBIF.
Publishing in this meta-domain will require new policies. For example, a user who wants to run a query or to combine various distributed data will want to see in advance the total expected costs and any restrictions on access to the information. Software tools and publishing policies will have to provide new lines of conduct with respect to copyrights/owner rights and charging systems.
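A minimal sketch of such an advance estimate is given below, assuming each distributed source publishes a fee per record and a simple access class. The `Source` type, the `estimate` function, the source names and the fees are all hypothetical; no existing GBIF or ETI interface is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A hypothetical distributed data source with a simple charging policy."""
    name: str
    fee_per_record: float  # charge per record returned
    access: str            # "open" or "licensed"


def estimate(query_plan, user_licences=frozenset()):
    """Estimate a query before running it.

    query_plan is a list of (Source, expected_record_count) pairs.
    Returns (total_cost, blocked), where blocked lists the names of
    licensed sources the user has no licence for; those add no cost.
    """
    total = 0.0
    blocked = []
    for source, expected_records in query_plan:
        if source.access == "licensed" and source.name not in user_licences:
            blocked.append(source.name)
            continue
        total += source.fee_per_record * expected_records
    return total, blocked


# Hypothetical sources and expected result sizes.
checklist = Source("checklist", 0.0, "open")
specimens = Source("specimens", 0.25, "open")
images = Source("images", 0.5, "licensed")
plan = [(checklist, 200), (specimens, 100), (images, 10)]

cost_without_licence, blocked_without_licence = estimate(plan)
cost_with_licence, blocked_with_licence = estimate(plan, frozenset({"images"}))
print(cost_without_licence, blocked_without_licence)
print(cost_with_licence, blocked_with_licence)
```

Showing the blocked sources alongside the cost lets the user decide whether to obtain a licence, drop a source, or narrow the query before any charge is incurred.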
References
Heywood, V.H. (editor). 1995. Global Biodiversity Assessment. Chapter 9. Data and information management and communication, 607-671. Published for UNEP by Cambridge University Press.
Schalk, P.H. & W. Los. 1995. Applications of interactive multimedia software in taxonomy and biodiversity studies. Global Biodiversity 7: 12-18.
Schalk, P.H. & P. Oosterbroek. 1996. Interactive knowledge systems: meeting the demand for disseminating up-to-date biological information. In: Culture collections to improve the quality of life. Proceedings of the Eighth International Congress for Culture Collections. Centraalbureau voor Schimmelcultures, Baarn: 259-264.