A Workshop organised by The Biotechnology Information Strategic Forum, with support from DGXII of the Commission of the European Communities, and held at Purmerend, The Netherlands, 22-23 September 1998
The workshop, Building and Owning Biotechnology Databases, was
held at the Golden Tulip Hotel, Purmerend, The Netherlands on 22-23 September
1998. It was organised by The Biotechnology Information Strategic Forum,
with support from DGXII of the Commission of the European Communities.
The pedigree of PRINTS: who owns the database? -- Teresa K. Attwood, Dept. Biochemistry and Molecular Biology, University College London
PRINTS is a database of diagnostic protein signatures, or fingerprints, that can be used to diagnose family relationships in newly-determined sequences (especially from genome projects). Fingerprints exploit groups of conserved motifs within sequence alignments to characterise protein families; an uncharacterised sequence matching all motifs can then be readily diagnosed as a true match to a particular family fingerprint.
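The diagnostic idea can be illustrated with a minimal sketch in Python. It is not the PRINTS software: the names `Fingerprint` and `match_fingerprint`, the toy motifs and the toy sequences are all hypothetical. Fixed-length regular expressions stand in for the conserved motifs, which in practice are derived from sequence alignments, and the sketch assumes that the motifs must occur in order and without overlap.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fingerprint:
    """A protein family described by an ordered group of motifs (hypothetical model)."""
    family: str
    motifs: List[str]  # fixed-length regex patterns standing in for conserved motifs


def match_fingerprint(sequence: str, fingerprint: Fingerprint) -> Optional[List[int]]:
    """Return the start position of every motif if ALL motifs are found,
    in order and without overlap; return None if any motif is missing."""
    positions = []
    cursor = 0  # each motif is searched for after the end of the previous hit
    for pattern in fingerprint.motifs:
        hit = re.compile(pattern).search(sequence, cursor)
        if hit is None:
            return None  # one missing motif means no full fingerprint match
        positions.append(hit.start())
        cursor = hit.end()
    return positions


# Toy data, invented purely for illustration.
toy = Fingerprint(family="TOYFAMILY", motifs=["A[YF]IAK", "GDS[LI]", "W.CH"])

full_match = match_fingerprint("MKTAYIAKQRGDSLLWWCHHPEFT", toy)
partial_match = match_fingerprint("MKTAYIAKQRGDSLLPEFT", toy)

print(full_match)     # [3, 10, 15]: all three motifs found, so a true match
print(partial_match)  # None: the third motif is absent
```

Requiring every motif to match is what makes the diagnosis stringent: a sequence that shares only one or two motifs with the family is not reported as a true match in this sketch.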
Pilot work on the fingerprint technique began at Leeds University in 1989. Between 1991 and 1993, 100 fingerprints were compiled into an un-named prototype database. This work was not supported by a designated research grant: it was essentially a "proof of concept" of the new diagnostic method.
In October 1993, I moved with my research, under a 5-year Royal Society University Fellowship, to UCL. This signified the first direct funding to support the database. I marked these significant changes by naming the database PRINTS, formally identifying it with UCL. The first release contained 150 entries and, for a time, two colleagues in Leeds gave some support.
In August 1994, a Web site was established at UCL to serve PRINTS and related databases and tools. Since then, PRINTS has grown steadily and, with some 990 entries, is now virtually equal in size to PROSITE, often considered the gold-standard database in this area. PRINTS is released quarterly.
Like databases, research environments are also evolving. The current trend is for universities to protect their intellectual property very aggressively. Accordingly, Leeds have taken the view that, as the work "originated there", they own it. In response to the European Directive on the legal protection of databases, they placed licence and copyright notices on their web site. UCL have requested their removal, so far to no avail.
Upon consideration, Leeds may have some claim, but to what exactly: the format, the content, the name, the idea? The format of database entries was derived in Leeds, but the name was clearly given at UCL. Equally, Leeds might claim part of the content, but not the added value from my editorial proofing, to which they have made no contribution. The question is, what rights does a contributor have when submitting to a database? The contributor might own the data, but surely not the database, just as a curator might own a database, but not its data.
Before we can decide who owns a database, we have to understand what a database is. For those involved in database creation, it is clear that a database has a format and software to help maintain it. These are recognisably ownable, but they are not the database as such. The essence of a database arises from the particular collection of items assembled in a specific manner by a curator: without their curators, most databases would probably cease to exist. Given the effort required to maintain and to procure funds to support a database such as PRINTS, it is essential that issues of ownership be clear and public.
Looking to the legal protection of databases for guidance (EU Directive
96/9/EC), the small print says that:
"the selection or arrangement of database contents constitute the author’s intellectual creation, which is protectable by copyright;...[it] applies to, and aims to protect, the original selection or arrangement of the contents, not the contents per se;
... the maker or author of the database is the person who creates it and who takes the initiative and risk of investing; and
... any substantial change to the contents (including changes resulting from accumulation of successive additions, deletions or alterations), which would result in the database being considered a substantial new investment shall qualify the database resulting from that investment for its own term of protection."
The maintenance of PRINTS is a 100% UCL-based effort, and has been
for the past 5 years; I am the sole curator. The database is supported
financially by several large pharmaceutical companies, together with European
and UK Research Council grants. Under the new Directive, I believe that
what IP now exists in the database is created by me, and that my work at
UCL constitutes a substantial new investment relative to the project initiated
at Leeds, the database having grown 10-fold since that time.
Adding to the confusion, in 1999 I will move to Manchester University, taking my group and PRINTS with me. My fear is that there I will face similar battles, as UCL defend their interest in the database.
This situation poses a threat, not only to PRINTS, but also to any database in a similar situation. How will such factors affect funding strategies? Who will provide financial support to a database at the centre of an IP battle? How can you privatise a database if you cannot decide who owns it? Copyright-claiming institutes should realise that they own very little if the curator walks out, or indeed if the funds run out (especially if they have made no investment themselves). And, very importantly, they should also realise that they may be opening themselves to unknown liabilities if they claim to own something they do not actually control.
The significant human effort required to maintain a database like PRINTS adds to the struggle. As described elsewhere (SWISS-PROT), the new generation of databases requires a great deal of intellectual input. Twelve-weekly releases of PRINTS are hard to sustain, and will become harder as the amount of data increases and as related resources on the Internet, which must be cross-referenced, proliferate. To derive the maximum benefit from biotechnology information, kindred databases need to be made to "talk" to each other, to share common ground and reduce duplication of effort. With this in mind, PRINTS is currently supported by an EC grant to investigate how it might be integrated with PROSITE, Pfam and other databases via an international collaboration known as InterPro. However, like SWISS-PROT, PROSITE is also now raising funds from the market. It is unclear how this will affect InterPro, if one component is "commercial" and others are not. It may, of course, open the door to more realistic future funding strategies for all partners; but before PRINTS can reasonably entertain that route, the ownership issues must clearly be resolved.
In conclusion, it is obvious that biological databases are essential research tools, but without suitable investment they will wither and die. New forms of financing are therefore essential. If we are to consider some form of privatisation, we must be clear who owns the resource. But, in deciding who owns a database, we must be clear what is owned and what can be legally protected.
In the case of PRINTS, the jury is still out; although morally the answer
seems, to me at least, very simple.