ClinMed NetPrints
Warning: This article has not yet been accepted for publication by a peer reviewed journal. It is presented here mainly for the benefit of fellow researchers. Casual readers should not act on its findings, and journalists should be wary of reporting them.

clinmed/2000110001v1 (November 12, 2000)
Contact author(s) for copyright information


Original Investigation

Evaluation of the usefulness of Internet searches to identify unpublished clinical trials for systematic reviews

Gunther Eysenbach MD
University of Heidelberg, Germany
Head, Research Group on Cybermedicine and Ehealth

Dept. of Clinical Social Medicine (Chairman: Prof. Dr. Diepgen)
Bergheimer Str. 58
69115 Heidelberg

Email: ey{at}

Fax + 49-6221 - 56 55 84
Tel. +49 -6221 56 88 97 or Tel. +49 -6221 56 4742 or + 49 -172 82 49 086

Aspects of this paper have been presented at the VII Cochrane Colloquium, Rome, October 5 to 9, 1999, under the title "Use of the World-Wide-Web to identify unpublished evidence for systematic reviews - the future role of the Internet in improving information identification."



Objectives: To avoid selection and publication bias, systematic reviewers should employ a broad range of search techniques and make efforts to locate unpublished studies. We tried to establish whether searches on the World-Wide-Web are useful to identify additional unpublished and ongoing clinical trials by developing and evaluating a search strategy.

Design: Seven Cochrane systematic reviews, where Internet searches were not mentioned as part of the search strategy, were selected as criterion (gold) standard. Their search strategies were retrospectively adapted for the World-Wide-Web in an attempt to find additional randomised controlled trials. A search strategy with the general pattern "study methodology NEAR intervention NEAR condition" for the Internet search engine AltaVista was evaluated.

Measurements: Search time; recall of Internet searches for published studies; precision (proportion of webpages containing hints to relevant published and unpublished randomised clinical trials); number of additional unpublished or ongoing studies found on the Internet.

Results: We reviewed 429 webpages in 21 hours and found hints to 16 unpublished, ongoing or recently finished trials, of which at least 9 were considered relevant for 4 systematic reviews. The recall of Internet searches for references to published studies ranged between 0% and 43.6%; the precision for hints to published or unpublished studies ranged between 0% and 20.2%.

Conclusion: Information on unpublished and particularly ongoing trials can be found on the Internet. A potential problem is non-peer-reviewed electronic publications of questionable quality. More powerful search tools are needed. An "Open Trial Initiative" is proposed to define a syntax for publishing trials on the web and to ensure interoperability of trial registers, so that special search engines can harvest information on ongoing and completed clinical trials.



Information Storage and Retrieval; Internet; Data Collection; Meta-Analysis; Clinical Trials; Evaluation Studies; Publication bias; Sensitivity and Specificity; Evidence-based medicine


I. Introduction

Systematic reviews and meta-analyses are important methods for synthesising research and evidence-based medicine. An international organization that prepares and maintains systematic reviews in the healthcare field is the Cochrane Collaboration. [1]

Systematic reviews differ from traditional opinion-based, narrative reviews, as they do not represent the views of a selected expert, but aim to be a genuine, objective summary of all of the best evidence of effectiveness of a given intervention, including published and unpublished trial information, in English or in any other language. Systematic reviews have a strict methodology that is stipulated in a published scientific protocol, which sets out the parameters of the research to be undertaken, for example the aims of the review, the search strategy and the types of trials to be looked for.

A major threat to the validity of systematic reviews and meta-analysis is publication bias, [2] i.e. the systematic overestimation of intervention effects resulting from the fact that trials with positive results tend to be published more frequently [3-7] and earlier [8] than those with inconclusive or negative outcomes.[9] To avoid publication bias, a systematic review should use multiple sources and strategies to identify trials and should also take the results of unpublished trials into account. Although it remains somewhat controversial how unpublished data should be evaluated and incorporated into a systematic review, there is consensus that unpublished data should not be systematically excluded and that efforts should be made to identify information beyond searches in bibliographic databases and the published research literature.[10-12]

Traditional methods to identify unpublished studies which are recommended in the Cochrane Collaboration Handbook [13] include searches in trial registers, checking reference lists, hand searching, [14] and personal communication. [15] These methods, as well as the standard search strategy for bibliographic databases,[16] are all well evaluated and have been proven to be useful.

The role of the Internet in information identification is less clear. Although there are anecdotal reports of reviewers indicating that searches on the Internet may be useful to identify studies, [17,18] to date there has been no systematic investigation of the usefulness of the Internet for locating additional evidence, nor have any search strategies been developed or evaluated. The Cochrane Collaboration Handbook and most Cochrane Collaborative Review Groups currently give no recommendation on whether and how the Internet should be used in the process of a systematic review. Also, little or no information is available on the suitability of different Internet search tools and search strategies for systematic reviews. In issue 2/1998 of the Cochrane Library, [19] the authors of only 2 out of 377 complete systematic reviews mentioned that they had conducted a search of the Internet. Consequently, it is also not clear how "incomplete" (and thus potentially biased) systematic reviews that did not include Internet searches are, compared to those that did.

The hypothesis tested in this study was that searching the Internet with a generic search engine can lead to the identification of additional studies which have not been mentioned by authors of systematic reviews who did not perform Internet searches. We also intended to answer some more practical questions useful for reviewers, such as what kind of websites contain hints to unpublished studies, which search tools and strategies are feasible, and whether information can be retrieved and appraised in a reasonable amount of time. Experiences from searching the Internet to answer clinical questions [20] and considerations regarding the quality of Internet information [21,22] raise the question of whether the signal-to-noise ratio of information on the World-Wide-Web is good enough to use it as a serious research tool. A further aim of this paper was to suggest directions for future research on search tools for reviewers.

II. Methods

A. Pilot study to identify suitable search tools and strategies

This study focuses on the usefulness of general Internet searches with robot-based search engines. Search engines are software programs which continuously "crawl" and index the World-Wide-Web. [23] They do not index content in dynamic databases such as PubMed or trial databases.[24]

For this study we first had to identify suitable search engines which were powerful enough to handle complex queries. A complex (Boolean) query string is required for searching for clinical studies on the web because, unlike records in bibliographic databases, web documents have not been indexed with keywords from a controlled vocabulary. Moreover, while bibliographic databases only contain the abstracts of scholarly documents, which increases the a priori likelihood of finding a relevant article, the Internet contains millions of irrelevant documents such as product descriptions or patient information. Therefore, a carefully designed, specific search must be conducted, as otherwise thousands of irrelevant documents will be retrieved. Thus, the ability to link synonyms with an OR operator, to truncate word stems and to do a proximity search with a NEAR operator (specifying that two words must appear in the same sentence or close to each other) is crucial for conducting specific Internet searches.



Figure 1. Search strategy to locate hints to randomized controlled trials on the Internet using AltaVista


B. Search Strategy

As in all information retrieval processes, searching the Internet requires a trade-off between recall and precision. Depending on the manpower available to check web pages and also depending on what retrieval precision can be achieved in the respective topic area, the development of a search strategy is to a much larger degree a subjective process than for example development of a search strategy for a bibliographic database.

Our proposed search strategy connects the three concepts "intervention", "condition" and "trial" with NEAR operators, e.g. "(study or trial or random* or evaluat*) NEAR (intervention OR intervention-synonyms) NEAR (condition OR condition-synonyms)" (Fig. 1). NEAR operators limit the search to documents in which the connected concepts occur close to each other. Using the AND operator instead mostly leads to a large number of "false positive" hits. This search strategy therefore emphasises specificity rather than sensitivity. For topics on which fewer documents can be found on the web, more sensitive searches with AND replacing the NEAR operators may be possible.

For the "intervention" and the "condition" concepts, different synonyms should be added, connected by OR. Word stems with truncation operators should be used, so that documents using words in different grammatical forms, spelling variants or other languages can be found. The maximum length and complexity of a query string is typically tightly limited by the search engine, so very complex queries are not possible or must be replaced by successive independent searches (not done in this study).
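The general pattern "study methodology NEAR intervention NEAR condition" can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the tool used in the study: the function names `or_group` and `build_query` and the `max_length` parameter are hypothetical, and the actual length limit of any given search engine is not specified here.

```python
def or_group(terms):
    """Join the synonyms of one concept with OR; parenthesise if more than one."""
    terms = list(terms)
    if not terms:
        raise ValueError("each concept needs at least one search term")
    if len(terms) == 1:
        return terms[0]
    return "(" + " OR ".join(terms) + ")"


def build_query(study_terms, intervention_terms, condition_terms,
                operator="NEAR", max_length=None):
    """Connect the three concepts with a proximity (or Boolean) operator.

    max_length is a hypothetical guard: engines limit query length, but the
    exact limit is an assumption that the caller has to supply.
    """
    groups = [or_group(t) for t in (study_terms, intervention_terms, condition_terms)]
    query = f" {operator} ".join(groups)
    if max_length is not None and len(query) > max_length:
        raise ValueError("query too long; split it into independent searches")
    return query


# Truncated word stems (with *) cover grammatical forms and spelling variants.
example_query = build_query(
    ["study", "trial", "random*", "evaluat*"],
    ["antihistam*"],
    ["eczem*", "dermatit*"],
)
print(example_query)
```

Passing `operator="AND"` gives the more sensitive variant of the same query.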


1) Narrowing search

Our search protocol specified the following measures to be taken if the initial search led to too many hits. Some of the resulting pages were viewed to establish which search expressions appeared too unspecific. Unspecific search terms were then removed stepwise, synonyms were rephrased, or truncations were deleted if they led to false positive results.

Although not used in our searches, capital letters could in certain cases also narrow the search and avoid undesired results (for example, changing "aids" into "AIDS"; AltaVista searches case-sensitively if the search term is written in capital letters). Similarly, in certain cases a phrase search, for example "Graves disease", or the use of NEAR operators for combining words which belong together (for example, atopic NEAR eczem*) could be considered. As a last resort, irrelevant pages may be filtered out by using the NOT operator.

2) Broadening search

To broaden the search (finding more documents), our search protocol prescribed identifying further synonyms using the Unified Medical Language System (UMLS) and combining them with OR. Spelling variants and terms in foreign languages could be added. However, search engines are extremely limited in their ability to digest long search strings, so the number of synonyms that could be used was restricted.

Although not used in these searches, investigators may further broaden the search by stepwise replacing NEAR by AND, for example "study AND (intervention NEAR condition)" or "study NEAR (condition AND intervention)", finally using AND between all three concepts. For very specific searches it may also be worthwhile to remove the "study" concept and to search only for the "intervention NEAR condition" or even "intervention AND condition" concepts, although this usually leads to a high number of false positive hits.
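The stepwise broadening described above can be written down as an ordered list of query variants. The following is a minimal sketch: the function name `broadening_ladder` is hypothetical, and the position of the two variants without the "study" concept at the end of the list is our assumption, since the text does not fix their order relative to the others.

```python
def broadening_ladder(study, intervention, condition):
    """Return query variants ordered from most specific to most sensitive.

    Each argument is an already assembled concept expression, e.g.
    "(study OR trial OR random*)".
    """
    return [
        # Starting point: all three concepts linked by proximity.
        f"{study} NEAR {intervention} NEAR {condition}",
        # Replace one NEAR by AND at a time.
        f"{study} AND ({intervention} NEAR {condition})",
        f"{study} NEAR ({condition} AND {intervention})",
        # AND between all three concepts.
        f"{study} AND {intervention} AND {condition}",
        # Drop the "study" concept (usually many false positive hits).
        f"{intervention} NEAR {condition}",
        f"{intervention} AND {condition}",
    ]


ladder = broadening_ladder("(study OR trial OR random*)", "penicillami*", "retinopat*")
for step, query in enumerate(ladder, start=1):
    print(step, query)
```

A searcher would move down the list only while the number of hits stays small enough to screen by hand.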

C. Selection of systematic reviews and adaptation of search strategies for the WWW

We randomly selected a convenience sample of seven Cochrane systematic reviews (CSRs) that were marked as "updated" from the Cochrane Library Issue 2, 1998 (Table 2). In none of these CSRs did the reviewers report having used the Internet. In order to adapt their search strategies for the web, we analyzed the bibliographic database search strategies described in the methods sections of these systematic reviews and adapted the search terms for the AltaVista search strategy. If the number of hits exceeded 200, the search strategy was narrowed according to our algorithm presented in Fig. 1. The resulting AltaVista search strategies for the CSRs are presented in Table 2.

After the final search strategy had been determined, Internet searches were conducted in December 1998. Web pages were screened for any references to published information or hints to ongoing or unpublished research on the question addressed by the systematic review ("relevant" here means that we felt that the reviewers should assess the studies in detail to decide whether to include them in or exclude them from the review). Links were followed as appropriate. The time spent on searching and initial appraisal was recorded.

If references to published literature were found on the webpage, these were compared with the references to included or excluded studies or "trials awaiting assessment" cited in the systematic review. Concordant findings were recorded (Table 2, column 6) and used for a recall estimate.

D. Recall and precision estimates

As an estimate of the sensitivity of our searches we calculated the recall of published studies, defined as the number of published studies (references) retrieved on the web divided by the total number of published studies mentioned as "included" or "excluded" in the systematic review (Table 2, column 8). This should not be confused with the actual recall of trials (defined as hints to published and unpublished studies found on the web divided by the total number of relevant published or unpublished studies), as this figure could only be determined if the denominator were known. As this is never the case (we don’t know how many unpublished trials exist for a given question), we used the recall of published trials as an indirect estimate.

The precision of the search was calculated as the number of webpages with hints to published or unpublished (including planned or ongoing) studies possibly relevant for the systematic review divided by the total number of documents retrieved.

E. Critical appraisal

If hints to planned, ongoing or unpublished studies were found, we tried to gather as much information as possible about the study, for example by contacting the authors of the webpage or the investigators of the trial, and by conducting MEDLINE searches to see whether the study was already published.

In order to determine whether the study was potentially relevant for the systematic review, we contacted the CSR authors and asked them to comment on the potential relevance of the studies found. We also asked whether these studies had been known to them.

As a further hint of the relevance of a study identified on the web, we checked MEDLINE and the latest version of the systematic reviews (Cochrane Library Iss. 1/2000) in March 2000, 15 months after the searches were conducted, to see whether the formerly unpublished study had in the meantime been published and/or mentioned in the systematic review as "ongoing", "excluded" or "included" or "study awaiting assessment".


III. Results


A. Results of the pilot study to identify suitable search engines


Table 1 shows the results of a pilot study, first performed at the end of 1998 and repeated and updated in March 2000. We evaluated 9 major generic and 2 medical search engines for their ability to handle complex queries. Only one search engine, AltaVista, turned out to be sufficiently flexible and powerful to handle complex Boolean queries involving truncations and proximity operators. Google and NorthernLight are newer search engines that were not available in 1998.

NorthernLight is noteworthy as it contains an additional database with indexed full texts of journal articles, e.g. from newspapers or medical journals. Two other search engines, Medical World Search [25] and MedHunt,[26] work with an underlying medical thesaurus such as the MeSH (Medical Subject Headings) or UMLS (Unified Medical Language System) and allow concept based searches, i.e. they automatically also find pages which contain synonyms, allow translations into other languages and explosion of search concepts to include more specific terms. However, the total number of web pages indexed by these search engines is very low.

In summary, only AltaVista was found to be suitable for this study, but future systematic reviewers may also use and compare other search engines.


Table 1. Search tool features and test searches in different robot-driven search engines. The comparison of search features is partly based on the "Comparison of Search Engine User Interface Capabilities" by Gillian Westera and the Web Search Expert by Diane Johnson.

Search features: Wildcards/Truncation; Capitalisation recognition; Special features

Benchmark searches:

Simple Boolean AND: atopic eczema (a) [= atopic AND eczema]

Phrase search: "atopic eczema"

Simple Boolean OR: atopic eczema OR atopic dermatitis [= (atopic AND eczema) OR (atopic AND dermatitis)]

Truncation: atop* ecz*

Complex Boolean with truncation: (atopic eczema OR atopic dermatitis) AND antihistamin*

Proximity search: (atopic eczema OR atopic dermatitis) NEAR antihistam*

Complex proximity search: (atopic eczema OR atopic dermatitis) NEAR antihistam* NEAR (study OR trial OR random* OR eval*)

Alta Vista

(Advanced search)

NEAR "" * (eg colo*r*) yes allows wildcards within a word; can influence relevancy ranking in adv. search mode; offers natural language searching; can limit by language and date.









AND NOT (in simple mode; must be in uppercase; overrides Excite's concept-based search mechanism)
no "" no no offers similar terms to help narrow your search (in simple mode); provides automatic synonym searching (concept search); offers a "more like this" option in results; can sort by site. Allows max. 1000 results to be displayed



>1000 (b)

Not possible

Not possible

Not possible

Not possible

FAST Search

no no "" no no is very fast.



Not possible

Not possible

Not possible

Not possible

Not possible


no no (though uses implied NEAR automatically) "" no no includes your search term in context (and in bold) in the results page; automatically ANDs words together, ranks websites according to number of hyperlinks pointing to the site.



Not possible

Not possible

Not possible

Not possible

Not possible


AND/OR/NOT (must select "the Boolean expression") no "" or select search the phrase * only left truncation (eg *man)
(can also use ? for one character only)
only for mixed upper- and lower-case words (eg NeXT) can limit by date and continent; can select page depth within a Web site; offers word stemming; can limit from within a results page.

Returns only up to 1,000 results




Not possible

Not possible

Not possible

Not possible


implied in advanced search mode no "" automatic; also finds word variations automatically (eg mice/mouse) yes can use a pipe | to refine search results(eg dogs|dalmations); automatically searches previous results to narrow search result; can limit by country and top level domain (eg edu, com)




Not possible

Not possible

Not possible

Not possible

ADJ, NEAR, NEAR/X, FAR, BEFORE. "" no no Lycos has moved to concentrate on its subject directory and no longer updates the search engine. Test searches revealed strange effects


1095 (c)

1046 (d)

Not possible

Not possible

Not possible

Not possible

Northern Light

AND/OR/NOT no "" * (eg colo*r*)
(can also use % for one character only)
no Has, in addition to the Web, indexed 6200 trusted, full-text journals, books, magazines, newswires, and reference sources, including full text versions of medical journals such as The Lancet.

Groups results into "custom folders" which enables more efficient locating of relevant information; can limit by date, language and type of Web page (ie edu, org, etc) in power search mode.




6076 (e)


Not possible

Not possible


no "" no no offers a shortcuts option which also lists relevant sites from its directory in your results listing; offers natural language searching.




Not possible

Not possible

Not possible

Not possible

Medical World Search

AND/ OR/ NOT NEAR no no no Concept based search, terms are mapped to UMLS/MeSH and can broaden the search to these synonyms and exploded terms

203 (f)

Not possible


Not possible

40 (g)

21 (h)

7 (i)


AND/OR   ADJ * no Allows mapping to the MeSH and translation into 8 European languages.






Not possible

Not possible

(a) in most search engines an entry like "atopic eczema" would be treated like a Boolean "atopic AND eczema". If this is not the case (some search engines would use an implied OR as default), we used the appropriate syntax instead to force an "AND" (for example "+atopic +eczema")

(b) "We have found that nearly 100% of users never have a need to drill down beyond the 1000th result for a given query. For these reasons, we no longer provide more than 1000 results per query submitted."

(c) Phrase search does not seem to work properly: it returned exactly the same results as an "AND" search.

(d) OR search does not work properly: "(atopic AND eczema) OR (atopic AND dermatitis)" returned 1046 hits, while (atopic AND eczema) returned 1095 hits. Also, the numbers of hits were not reliable.

(e) modified as "atop* AND ecze*", because truncated words must have at least 4 characters

(f) automatically searches for synonyms such as "Atopic Dermatitis, Atopic, Neurodermatitis" etc.

(g) mwsearch does not support truncation, but can explode terms such as "antihistamines", taking into account more specific terms such as methdilazine, Tripelennamine, Promethazine. This search was done by using the search concepts "atopic eczema" [exploded] AND "antihistamine" [exploded]

(h) This search was done by using the search concepts "atopic eczema" [exploded] NEAR "antihistamine" [exploded]

(i) This search was done as "atopic eczema" (exploded) NEAR "antihistamine" (exploded) NEAR (study OR "randomized controlled trial" [exploded] OR trial OR randomized)


B. Hints to published and unpublished studies

For 4 out of the 7 systematic reviews a total of 57 hints to published and 16 hints to unpublished studies were found (Table 2). The remaining 3 systematic reviews (CAF, DPA, DIG), for which we found no hints or references, were all "mini-reviews" with very few randomized controlled trials included in the review.

References to published studies were found in reference lists of web articles, in virtual journal clubs and in tables of contents of printed journals published online.



Table 2. Overview of the Cochrane Systematic Reviews for which we tried to find unpublished or ongoing trials on the Internet










Columns: Cochrane Systematic Review (CSR); gold standard (number of RCTs listed in the CSR); AltaVista advanced search strategy; time needed for search (number of hits); hints to published studies (PS) found; hints to unpublished or ongoing studies (US) found; recall (= published studies found / published studies in the CSR); precision (= pages with hints to published or unpublished RCTs / total hits).

PROM: Kenyon S, Boulvain M. Antibiotics for preterm premature rupture of membranes.
Gold standard: Excluded: 22; Included: 12; Ongoing: 0; Total: 34
Search strategy: (study or trial or investigat* or evaluat*) near ((antibiotic* or antimicrob* or ampic* or erythro* or metronid* or ceph*) near (PROM or pPROM or ((premature or preterm) near (rupture near membrane*))))
Time (hits): 4 hours (43 hits)
PS found: 3
US found: 2
Recall: 3/34
Precision: (11.6%)

ACT: Marshall M, Lockwood A. Assertive Community Treatment for people with severe mental disorders
Gold standard: Included: 20; Excluded: 50; Awaiting: 5; Ongoing: 2; Total references: 109
Search strategy: ((study or trial or random*) near ((assertive near community near treatment) or (training near community near living) or (Madison near model))))
Time (hits): 4 h (168 hits)
PS found: 30
US found: 4
Recall: 30/109

ASTH: Gibson PG. The effects of self-management education and regular practitioner review in adults with asthma
Gold standard: Included: 25; Excluded: 27; Ongoing: 3; Total: 55
Search strategy: (study or trial or random*) near asthma* near (education* or (self near management))
Time (hits): 9 h (159 hits)
PS found: 24
US found: 8
Recall: 24/55

OSA: Wright J, White J. The effectiveness of continuous positive airways pressure for the treatment of obstructive sleep apnoea
Gold standard: Included: 7; Excluded: 24; Ongoing: 0 (authors mention that they are aware of two ongoing trials)
Search strategy: (study or trial or random* or evaluation) near ((sleep or obstruct*) near (hypopn* or apn*)) near (CPAP or (positive near pressure)
Time (hits): 2.5 h (46 hits)
PS found: 0
US found: 2
Recall: 0/31

CAF: Bara AI, Barley EA. The bronchodilator effect of caffeine in asthma
Gold standard: Included: 6; Excluded: 3; Ongoing: 0
Search strategy: (study or trial or random* or evaluation) near asthma* near (caffe* or coffee or tea or chocolate or cola)
Time (hits): < 30 min (8 hits)
PS found: 0
US found: 0
Recall: 0/9

DPA: Phelps DL, Lakatos L, Watts JL. D-Penicillamine to prevent retinopathy of prematurity
Gold standard: Included: 2; Excluded: 2; Ongoing: 0
Search strategy: ((retrolent* near fibroplas*) or (retinopat* near prematur*)) near (penicillami*)
Time (hits): < 30 min (5 hits)
PS found: 0
US found: 0
Recall: 0/4
Precision: 0/5

DIG: Soll RF. Digoxin in the prevention or treatment of respiratory distress syndrome
Gold standard: Included: 2; Excluded: 0; Ongoing: 0
Search strategy: (digoxin or digitalis) near ((respirat* near distress) or RDS)
Time (hits): < 30 min (0 hits)
PS found: 0
US found: 0
Recall: 0/2
Precision: 0/0

TOTAL: 21 hours; 429 hits (av.: 3 min/hit)



including one publication cited on Berg J, Dunbar-Jacob J, Sereika SM. An evaluation of a self-management program for adults with asthma. Clin Nurs Res 1997 Aug;6(3):225-38, which was not mentioned in the CLIB Iss 2/98, but appears in Iss 1/2000 as included study


Table 3 (see appendix) shows on which web pages we found hints to possibly unpublished studies. These hints were on

We did not encounter any web pages of institutional review boards (IRBs) or ethics committees listing approved research projects or publishing protocols on the web.

C. Recall and precision

For three of the seven systematic reviews we found published references on the web (Table 2). The recall for references to published studies in these three reviews was 8.8%, 27.5% and 43.6%, respectively (mean across all seven reviews: 11.4%).

In four of the systematic reviews we found hints to published or unpublished studies. The precision, i.e. the proportion of webpages containing relevant information, in these four searches was 11.6%, 20.2%, 20.1% and 4.3%, respectively (mean across all seven searches: 8%).
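The recall and precision figures can be recomputed from the counts in Table 2. The sketch below rests on two assumptions that are ours, not statements from the study: the number of pages with hints is approximated by the sum of hints to published and unpublished studies, and a search with zero hits is counted as precision 0 when averaging. Under these assumptions the means over all seven reviews come out at about 11.4% (recall) and 8% (precision).

```python
# Counts taken from Table 2:
# (total hits, hints to published studies, hints to unpublished studies,
#  published studies listed in the CSR)
TABLE2 = {
    "PROM": (43, 3, 2, 34),
    "ACT": (168, 30, 4, 109),
    "ASTH": (159, 24, 8, 55),
    "OSA": (46, 0, 2, 31),
    "CAF": (8, 0, 0, 9),
    "DPA": (5, 0, 0, 4),
    "DIG": (0, 0, 0, 2),
}


def recall(published_found, published_in_csr):
    """Recall of published studies: found on the web / listed in the CSR."""
    return published_found / published_in_csr


def precision(pages_with_hints, hits):
    """Pages with hints / total hits; defined as 0 for an empty result set (assumption)."""
    return pages_with_hints / hits if hits else 0.0


recalls = {k: recall(ps, n) for k, (hits, ps, us, n) in TABLE2.items()}
# Assumption: pages with hints is approximated by PS + US.
precisions = {k: precision(ps + us, hits) for k, (hits, ps, us, n) in TABLE2.items()}

mean_recall = sum(recalls.values()) / len(recalls)
mean_precision = sum(precisions.values()) / len(precisions)
total_hits = sum(hits for hits, _, _, _ in TABLE2.values())
minutes_per_hit = 21 * 60 / total_hits  # 21 hours of search time in total

for key in TABLE2:
    print(f"{key}: recall {recalls[key]:.1%}, precision {precisions[key]:.1%}")
print(f"mean recall {mean_recall:.1%}, mean precision {mean_precision:.1%}")
print(f"{total_hits} hits, {minutes_per_hit:.1f} min per hit")
```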

While the recall for published literature on the web is obviously much lower than that of searches in bibliographic databases, the precision of an Internet search can be optimized such that it is comparable to that of sensitive searches in Medline; for example, Dickersin and colleagues also report a mean precision of 8% for their Medline search strategy to locate randomized controlled trials. [16] However, the best Medline searches can reach a precision of up to 52%. [27]

D. Critical appraisal of hints to unpublished studies

In total, we found 16 hints to unpublished studies (Table 3). Where possible and necessary, we contacted the authors of the studies by fax or email and asked whether it was a randomized controlled trial and whether it was published or unpublished.


E. Quality issues

We encountered an interesting quality problem, which may be characteristic of Internet searches. We found an online paper with the title "Buteyko Breathing in Asthma: A Controlled Trial", published on the apparently commercial website "Buteyko New Zealand Ltd." (ASTH3), which promotes videos and courses on a special breathing technique, the Buteyko breathing technique (BBT). We were unable to find this study in Medline, but after contacting the investigators we found out that the trial had just been published in the Medical Journal of Australia in the very same week in which we conducted the search. [29] It is interesting to note that the study published online (now no longer online) was not identical to the article published in the peer-reviewed literature: the version published in the peer-reviewed journal contained a description of the adverse events in both study arms, and the discussion raised the possibility of study contamination, as "some of the BBT subjects who were experiencing difficulties with the technique were contacted frequently by the Buteyko therapist. We did not anticipate this contact, which leaves the study open to the criticism that the BBT group were influenced in ways the control group were not." This information was omitted in the online version.


IV. Discussion

A. Principal findings

To our knowledge, this is the first study evaluating the usefulness of the Internet for conducting systematic reviews and shedding light on the future role of the Internet in synthesizing research. It may even be an example of what Jadad and colleagues call a "needed synergy between the Internet and evidence-based decision making".[34]

This study shows that the proposed search strategy for the Internet has the potential to identify unpublished and especially ongoing studies for systematic reviews. We found 16 web pages containing information about unpublished trials, including at least 9 trials that appeared to be relevant but were not mentioned in the systematic reviews. The reason that these trials were not mentioned in the systematic reviews was however not necessarily failure of the reviewers to identify them, but appeared to be that they were still ongoing or just recently finished studies. We found no evidence for a trial that remained unpublished because of negative results, but our insight into the results of the studies which have not yet been published remains limited as we had no access to the original data and we don’t know which of them will eventually remain unpublished.


B. Potential problems of Internet searches

Today's search engines are subject to considerable limitations. Not only do they cover only a fraction of the web,[24] they are also unable to index information contained in dynamic databases such as trial registers (see textbox) or on password-protected pages, and most are not able to handle the more complex queries necessary to search for trials. The development of an Internet search strategy is not trivial and should be conducted by an experienced Internet searcher. To alleviate this problem, we propose developing specialised search engines, in particular to overcome the language bias and to allow concept-based searches. Such a specialised search engine could also incorporate expert knowledge about the sites on which ongoing studies are published, and could access dynamic databases and meta-trial registers.

Another concern is the quality of the information retrieved. In several instances we encountered outdated web pages (PROM1, ACT1) that pointed to apparently ongoing trials which in reality had already been finished and published. More importantly, appraising studies that have been published online but not in the peer-reviewed literature is difficult and time-consuming. Reviewers must be wary of promotional, non-peer-reviewed material on the web and, even if an article appears to be peer-reviewed, must expect that material published online can differ from the published peer-reviewed article, as the case described above illustrates.


C. Limitations of this study

The primary aim of this study was to develop and evaluate a search strategy for finding trials on the Internet. We did not attempt to demonstrate bias in systematic reviews conducted without Internet searches as compared with systematic reviews in which the Internet was searched. This would have required obtaining the complete results of the studies we found on the Internet and conducting a systematic review in parallel with another group developing a systematic review without Internet searches.

One limitation of this study is that the Internet searchers were not domain experts. Had the Internet searches been conducted by the systematic reviewers themselves, they might have found additional evidence. Conversely, the actual reviewers might have classified as not relevant some of the studies that we judged relevant for their systematic review. In several cases we were not sure whether the hints to studies we located were actually within the scope of a given review (in particular in the asthma education review, e.g. ASTH3, ASTH4). We tried to minimize these limitations by requesting help from the reviewers in appraising the studies found.

We restricted our search to English and German sites; sites in other languages were not evaluated, and the search strategy was optimized to find English documents. More comprehensive Internet searches could be broadened by including synonyms in other languages and by translating retrieved pages (AltaVista has an automatic translation tool for some languages). This could increase the number of retrieved studies.
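As a rough illustration of broadening a search with synonyms in several languages, the following Python sketch assembles one Boolean query from per-language term lists. The term lists, the function names and the generic AND/OR phrase syntax are assumptions made for this example; they are not the search strategy used in this study, and individual search engines differ in the query syntax they accept.

```python
# Illustrative synonym lists; these are assumptions, not the study's search terms.
DESIGN_TERMS = {
    "en": ["randomized trial", "randomised trial"],
    "de": ["randomisierte Studie"],
}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def build_query(topic_terms: dict, design_terms: dict = DESIGN_TERMS) -> str:
    """Combine topic and study-design synonyms across languages into one Boolean query."""
    # Languages are processed in sorted order so the output is deterministic.
    topics = [quote(t) for lang in sorted(topic_terms) for t in topic_terms[lang]]
    designs = [quote(t) for lang in sorted(design_terms) for t in design_terms[lang]]
    return f"({' OR '.join(topics)}) AND ({' OR '.join(designs)})"


if __name__ == "__main__":
    # Hypothetical topic terms for an asthma education search in English and German.
    print(build_query({"en": ["asthma education"], "de": ["Asthmaschulung"]}))
```

Keeping the terms in per-language lists makes it straightforward to add further languages without touching the query logic.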

D. Conclusions for systematic reviewers

We conclude that carefully designed Internet searches may complement "traditional" searches in bibliographic and trial databases as an additional source for identifying evidence for systematic reviews, particularly ongoing trials. By conducting Internet searches and by identifying and prospectively monitoring the progress of ongoing trials, reviewers may reduce the risk of overlooking evidence. In addition, grey literature such as dissertations, and articles published in obscure journals, online journals and e-print servers, may be found using Internet searches; these sources may contain hints to negative trials or trials that have been finished early.

Internet searches with the proposed search strategy proved feasible and yielded search results with reasonable precision. The search strategy also appears to be rather sensitive. For example, the reviewers of the OSA review mentioned in their review that they were aware of two ongoing trials, and we also found two ongoing trials on the web.

However, these possibilities of general web searches have to be balanced against the risks of following false leads and increasing the cost of reviewing. Current search engines are subject to significant limitations in search functionality and are unable to index information contained in dynamic web databases. To decrease the time and costs required to search the Internet, it has been proposed to develop a specialised search engine for locating ongoing and unpublished trials on the web (a specialised Cochrane search engine) that is linked with meta-trial registers.[35]

While the results of this study suggest that Internet searches can in some cases be useful for detecting unpublished and ongoing studies for systematic reviews, and could help avoid publication bias if reviewers follow up the development of ongoing trials identified on the web, we found no firm grounds for suggesting that systematic reviews conducted without Internet searches are necessarily invalid. Still, we recommend conducting Internet searches to complement other efforts to locate unpublished evidence, such as personal communication with experts.

Future research should compare newer search engines such as NorthernLight and Google, as well as different search strategies. As various Internet-accessible trial registers are being launched (see textbox), it will also be interesting to evaluate their use for locating studies and avoiding publication bias in the context of reviews. To facilitate future research, it also seems important that systematic reviewers carefully document their Internet search strategy in reports of systematic reviews (rather than just mentioning that "Internet searches have been performed") so that factors influencing the effectiveness and necessity of Internet searches can be identified.

E. Future role of the Internet in linking trials

To prevent unknowing duplication of clinical research and to detect underreporting of research, the establishment of prospective registers of clinical trials has long been demanded.[36-39] It is, however, unlikely that there will ever be one complete central multinational database. Instead, multiple resources set up by numerous different organizations will exist.[40] Internet technology will play a central role in linking the evidence. As information contained in these trial databases will probably always be incomplete, especially on an international scale, general searches on the World-Wide-Web will remain important. Even if a trial doesn't make it into a trial database, authors of randomised controlled trials will increasingly leave their "digital footprints" on the Internet, in the form of traces left by grant proposals, funding agencies, and the process of recruiting participants. These traces can be found by reviewers, even if the trial remains unpublished. Iain Chalmers reported how in 1984 a group of visitors to a hospital in Zimbabwe by chance discovered that an important unpublished randomized controlled trial had been conducted 7 years before.[41] Today, at least in the industrialized world, researchers may stumble over hints to unpublished trials on the Internet without leaving the building.

We recommend that, already today, funding agencies, institutional ethics committees, researchers, patient organisations and other groups, who all share a common interest in (and the ethical responsibility for) ensuring that clinical research is published,[42] should assist reviewers and researchers in this task by at least publishing trials they have funded, approved, supported or heard of on a robot-accessible webpage. This can be done, for example, by listing trials on a webpage using the standard format "randomized trial on (intervention) in (condition)", together with study details, trial identifier and the name and contact details of the principal investigator, all in English, so that they can be indexed by search engines and found by systematic reviewers.
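To illustrate why a fixed phrase makes such listings easy to find and process automatically, the following Python sketch produces and recognises lines of the form "randomized trial on (intervention) in (condition)". The function names and the regular expression are assumptions made for this example, not part of any agreed standard.

```python
import re

# Pattern for the listing phrase proposed in the text. It accepts both the
# "randomized" and "randomised" spellings and splits at the first " in ",
# so an intervention name that itself contains " in " would be cut short.
LISTING_PATTERN = re.compile(
    r"randomi[sz]ed trial on (?P<intervention>.+?) in (?P<condition>.+)$",
    re.IGNORECASE,
)


def format_listing(intervention: str, condition: str) -> str:
    """Return one listing line using the proposed standard phrase."""
    return f"randomized trial on {intervention} in {condition}"


def parse_listing(line: str):
    """Return (intervention, condition) if the line uses the phrase, else None."""
    match = LISTING_PATTERN.search(line.strip())
    if match is None:
        return None
    return match.group("intervention"), match.group("condition")


if __name__ == "__main__":
    line = format_listing("nasal CPAP", "obstructive sleep apnoea")
    print(line)
    print(parse_listing(line))
```

A harvesting robot could apply a check like `parse_listing` to every line of a retrieved page and keep only the lines that match.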

A better way would be to agree on a common syntax and organizational structure for representing and exchanging information about ongoing and finished trials. XML (the eXtensible Markup Language), for example the syntax used by McCray and Ide,[43] could be used not only to represent information internally in trial registers, but also to publish XML-tagged information on the web, for example on department homepages by investigators themselves or by IRBs, thereby enabling harvesting of this information by specialised search engines. Similarly, trial databases, including databases designed to recruit patients such as Centralwatch, should ideally be interoperable.[44] Much as the Santa Fe Convention of the Open Archives initiative presents a technical and organizational framework designed to facilitate the discovery of content stored in distributed e-print archives, an "Open Trial Registry" initiative is needed to facilitate knowledge discovery about ongoing and finished clinical trials, to ensure interoperability of trial registers and to enable discovery and harvesting of this kind of information from homepages of funding agencies, institutional review boards, individual researchers and department homepages.
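As a minimal sketch of what publishing and harvesting XML-tagged trial information could look like, the following Python example serialises one trial record and reads records back from a harvested document. The element names (trial, title, intervention, condition, status, contact) are hypothetical and chosen only for illustration; they are not the syntax of McCray and Ide or of any existing register.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names for a trial record; not an existing schema.
FIELDS = ("title", "intervention", "condition", "status", "contact")


def trial_to_xml(trial: dict) -> str:
    """Serialise one trial record to an XML string."""
    root = ET.Element("trial", attrib={"id": trial["id"]})
    for field in FIELDS:
        child = ET.SubElement(root, field)
        child.text = trial[field]
    return ET.tostring(root, encoding="unicode")


def harvest_trials(xml_text: str) -> list:
    """Collect every <trial> element found in a harvested XML document."""
    root = ET.fromstring(xml_text)
    records = []
    # iter() also yields the root itself when the root is a <trial> element.
    for node in root.iter("trial"):
        record = {"id": node.get("id")}
        for child in node:
            record[child.tag] = child.text
        records.append(record)
    return records


if __name__ == "__main__":
    # Placeholder values only; this is not a real trial.
    example = {
        "id": "EXAMPLE-001",
        "title": "Example trial title",
        "intervention": "example intervention",
        "condition": "example condition",
        "status": "ongoing",
        "contact": "Principal investigator name and address",
    }
    page = "<registry>" + trial_to_xml(example) + "</registry>"
    print(harvest_trials(page))
```

Because every record carries the same tags, a specialised search engine would not need to guess which part of a page names the intervention or the contact person.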


Textbox: Growing importance of the Internet for information identification

Information about ongoing and completed clinical trials is increasingly being published on the Internet. However, to date there is no standard prescribing how this information should be presented in a computer-readable format.

Researchers use their personal or department homepages to announce their interest in a certain research area or to recruit patients.[45] Journals like the Lancet have begun to publish research protocols on their website [46] and more and more researchers will also publish pre-prints of their findings on the web. [47]

Consumers and patient organisations also have an interest in disseminating information about ongoing trials. For example, the National Alliance of Breast Cancer Organisations ( has established a Clinical Trials Accessibility Working Group "to find ways to make clinical trials more widely accessible to women with breast cancer and women who are at high risk for breast cancer". Government and funding agencies are reacting to this need by establishing trial databases for consumers, for example the recently launched searchable database at of the US National Institutes of Health (NIH),[43] or the AIDS Clinical Trials Information Service for patients ( Also, commercial companies exist that specialise in publishing information from the clinical trials industry online to help researchers recruit patients and to help patients find clinical trials that may be of interest to them (e.g. Although these examples are taken from the US, similar websites exist in other countries, for example the cancer trial registry of the German Cancer Society (

A number of other web-accessible trial databases for researchers exist, for example the NIH's database for investigators (, and the database of Current Controlled Trials Ltd, a new web-based publishing company (, which aims to establish a meta-register allowing simultaneous searches in different registries.

Individual pharmaceutical companies are beginning to recognize that openness and access to information on clinical trials can improve patient care and are part of their social responsibility. For example, Glaxo Wellcome opened a web-based Clinical Trials Register which will provide a comprehensive record of all phase II-IV studies conducted on Glaxo Wellcome's newly registered medicines ( [48] Pharmaceutical industry associations also maintain databases; for example, the Pharmaceutical Research and Manufacturers of America database contains information on pharmaceutical products in the research and testing phase (




I thank the authors of the web pages, the randomized controlled trials and the systematic reviews for providing additional information and commenting on the relevance of the retrieved studies. Jens Tuische helped with the Internet searches.


Reference List


1. Chalmers I. The Cochrane collaboration: preparing, maintaining, and disseminating systematic reviews of the effects of health care. Ann N Y Acad Sci 1993;703:156-63.

2. Egger M, Smith GD. Misleading meta-analysis [editorial] [see comments]. BMJ 1995;310(6982):752-4.

3. Dickersin K, Chan S, Chalmers TC, Sacks HS, Smith HJ. Publication bias and clinical trials. Control Clin Trials 1987;8(4):343-53.

4. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR. Publication bias in clinical research [see comments]. Lancet 1991;337(8746):867-72.

5. Easterbrook PJ, Matthews DR. Fate of research studies. J R Soc Med 1992;85(2):71-6.

6. Dickersin K, Min YI, Meinert CL. Factors influencing publication of research results. Follow-up of applications submitted to two institutional review boards [see comments]. JAMA 1992;267(3):374-8.

7. Scherer RW, Dickersin K, Langenberg P. Full publication of results initially presented in abstracts. A meta-analysis [published erratum appears in JAMA 1994 Nov 9;272(18):1410]. JAMA 1994;272(2):158-62.

8. Stern JM, Simes RJ. Publication bias: evidence of delayed publication in a cohort study of clinical research projects [see comments]. BMJ 1997;315(7109):640-5.

9. Egger M, Smith GD. Bias in location and selection of studies. BMJ 1998;316(7124):61-6.

10. Cook DJ, Guyatt GH, Ryan G, Clifton J, Buckingham L, Willan A, McIlroy W, Oxman AD. Should unpublished data be included in meta-analyses? Current convictions and controversies. JAMA 1993;269(21):2749-53.

11. Sutton AJ, Duval SJ, Tweedie RL, Abrams KR, Jones DR. Empirical assessment of effect of publication bias on meta-analyses. BMJ 2000;320(7249):1574-7.

12. Assendelft WJ, van Tulder MW, Scholten RJ, Bouter LM. [The practice of systematic reviews. II. Searching and selection of studies]. Ned Tijdschr Geneeskd 1999;143(13):656-61.

13. Cochrane Reviewers' Handbook 4.0 [updated July 1999]. In: The Cochrane Library [computer program]. Clarke M & Oxman AD. 4.0. Oxford, England: Update Software / The Cochrane Collaboration; 1999;

14. Jefferson T, Jefferson V. The quest for trials on the efficacy of human vaccines. Results of the handsearch of Vaccine. Vaccine 1996;14(6):461-4.

15. McManus RJ, Wilson S, Delaney BC, Fitzmaurice DA, Hyde CJ, Tobias RS, Jowett S, Hobbs FD. Review of the usefulness of contacting other experts when conducting a literature search for systematic reviews. BMJ 1998;317(7172):1562-3.

16. Dickersin K, Scherer R, Lefebvre C. Identifying relevant studies for systematic reviews [see comments]. BMJ 1994;309(6964):1286-91.

17. Intercessory prayer for the alleviation of ill health (Cochrane Review). In: The Cochrane Library [computer program]. Roberts L, Ahmed I, Hall S et al. 2/98. Oxford, England: Update Software / The Cochrane Collaboration; 1998; database on disk and CDROM.

18. Surfing Systematically - Experience Of Searching The Internet. 1998; Symposium on Systematic Reviews: Beyond the basics. St. John's College, Oxford, UK, 8th and 9th January 1998.

19. The Cochrane Library [computer program]. Mulrow CD & Oxman AD. 2/98. Oxford, England: Update Software / The Cochrane Collaboration; 1998; database on disk and CDROM.

20. Hersh WR, Gorman PN, Sacherek LS. Applicability and quality of information for answering clinical questions on the Web [letter]. JAMA 1998;280(15):1307-8.

21. Allen ES, Burke JM, Welch ME, Rieseberg LH. How reliable is science information on the web? [letter; comment]. Nature 1999;402(6763):722.

22. Eysenbach G, Diepgen TL. Towards quality management of medical information on the internet: evaluation, labelling, and filtering of information [see comments]. BMJ 1998;317(7171):1496-500.

23. Lawrence S, Giles CL. Searching the world wide Web. Science 1998;280(5360):98-100.

24. Lawrence S, Giles CL. Accessibility of information on the web [see comments]. Nature 1999;400(6740):107-9.

25. Suarez HH, Hao X, Chang IF. Searching for information on the Internet using the UMLS and Medical World Search. Proc AMIA Annu Fall Symp 1997;824-8.

26. Baujard O, Baujard V, Aurel S, Boyer C, Appel RD. A multi-agent softbot to retrieve medical information on Internet. Medinfo 1998;9 Pt 1:150-4.

27. Jadad AR, McQuay HJ. A high-yield strategy to identify randomized controlled trials for systematic reviews. Online J Curr Clin Trials 1993;Doc No 33:3973.

28. Ford ME, Havstad SL, Tilley BC, Bolton MB. Health outcomes among African American and Caucasian adults following a randomized trial of an asthma education program. Ethn Health 1997;2(4):329-39.

29. Bowler SD, Green A, Mitchell CA. Buteyko breathing techniques in asthma: a blinded randomised controlled trial [see comments]. Med J Aust 1998;169(11-12):575-8.

30. Jenkinson C, Davies RJ, Mullins R, Stradling JR. Comparison of therapeutic and subtherapeutic nasal continuous positive airway pressure for obstructive sleep apnoea: a randomised prospective parallel trial [see comments]. Lancet 1999;353(9170):2100-5.

31. Salkever D, Domino ME, Burns BJ, Santos AB, Deci PA, Dias J, Wagner HR, Faldowski RA, Paolone J. Assertive community treatment for people with severe mental illness: the effect on hospital use and costs. Health Serv Res 1999;34(2):577-601.

32. Mercer BM, Miodovnik M, Thurnau GR, Goldenberg RL, Das AF, Ramsey RD, Rabello YA, Meis PJ, Moawad AH, Iams JD, et al. Antibiotic therapy for reduction of infant morbidity after preterm premature rupture of the membranes. A randomized controlled trial. National Institute of Child Health and Human Development Maternal-Fetal Medicine Units Network [see comments]. JAMA 1997;278(12):989-95.

33. Herinckx HA, Kinney RF, Clarke GN, Paulson RI. Assertive community treatment versus usual care in engaging and retaining clients with severe mental illness. Psychiatr Serv 1997;48(10):1297-306.

34. Jadad AR, Haynes RB, Hunt DL, Browman GP. The Internet and evidence-based decision-making: a needed synergy for efficient knowledge management in health care. CMAJ 2000;162(3):362-5.

35. Eysenbach G. Use of the World-Wide-Web to identify unpublished evidence for systematic reviews - the future role of the Internet to improve information identification. [Abstract] VII Cochrane Colloquium. Rome, Italy, Oct 5-9, 1999 (Abstracts Book) 1999;18.

36. Simes RJ. Publication bias: the case for an international registry of clinical trials. J Clin Oncol 1986;4(10):1529-41.

37. Chalmers I, Dickersin K, Chalmers TC. Getting to grips with Archie Cochrane's agenda [editorial] [see comments]. BMJ 1992;305(6857):786-8.

38. Chalmers I, Gray M, Sheldon T. Handling scientific fraud. Prospective registration of health care research would help [letter; comment]. BMJ 1995;311(6999):262.

39. Horton R, Smith R. Time to register randomised trials. The case is now unanswerable [editorial]. BMJ 1999;319(7214):865-6.

40. Tonks A. Registering clinical trials. BMJ 1999;319(7224):1565-8.

41. Chalmers I. Underreporting research is scientific misconduct. JAMA 1990;263(10):1405-8.

42. Pearn J. Publication: an ethical imperative [see comments]. BMJ 1995;310(6990):1313-5.

43. McCray AT, Ide NC. Design and implementation of a national clinical trials registry. J Am Med Inform Assoc 2000;7(3):313-23.

44. Sim I. Trial banks: An informatics foundation for evidence-based medicine. 1997; Stanford University, Stanford, CA, USA;

45. Wilmoth MC. Computer networks as a source of research subjects. West J Nurs Res 1995;17(3):335-8.

46. Chalmers I, Altman DG. How can medical journals help prevent poor medical research? Some opportunities presented by electronic publishing [see comments]. Lancet 1999;353(9151):490-3.

47. Delamothe T, Smith R, Keller MA, Sack J, Witscher B. Netprints: the next phase in the evolution of biomedical publishing [editorial]. BMJ 1999;319(7224):1515-6.

48. Sykes R. Being a modern pharmaceutical company: involves making information available on clinical trial programmes [editorial]. BMJ 1998;317(7167):1172.



Appendix - Table 3


Table 3. Overview of pages which contained hints to possibly unpublished randomized controlled trials (RCTs) and their critical appraisal concerning relevance for Cochrane Systematic Reviews (CSR) listed in Table 1.

Columns: (1) Web pages containing evidence about "unpublished" studies; (2) Critical appraisal: relevant for CSR?; (3) Screenshot (for reviewers only; may not be possible or necessary to reproduce in the final manuscript)

department webpage mentions a study as being in progress

Yes. The page indicates the study as being unpublished and in progress, but the study has actually already been published in JAMA (Mercer, 1997) and was included in the CSR.

MRC grant awards July 1998, listing a planned trial "Clinical and economic evaluation of antibiotics in preterm labour and rupture of the membranes (ORACLE Trial)".


Yes. The ORACLE study is being conducted by the CSR authors and has been mentioned as "ongoing" in Issue 1/2000 of the Cochrane Library.

Homepage of James Dias - "Current research projects: Randomized controlled trial will be used to assess the efficacy of the Psychiatric Assertive Community Treatment (PACT) approach to community-based treatment of patients with severe mental illness that are living in a rural setting."

Yes. Randomized trial of PACT in Charleston, South Carolina for 144 patients recruited from August 1989 through July 1991. Subsequently published as Salkever 1999 (CSR not yet updated). CSR author wrote in 1998 "I've heard informally that the trial was going on from a contact in the US".

Presentation at the 124th Annual Meeting of the American Public Health Association, Nov 1996

"First Year Outcomes from a Randomized Trial Comparing Consumer and Non-Consumer Assertive Community Treatment Teams with Usual Care"

Robert I. Paulson, PhD; Greg Clarke, PhD; Heidi Herinckx, MA; Karen Lewis, BS; Evie Oxman, BA

Yes. CSR author hadn't heard of the presentation. According to the RCT authors, this trial has been published as Herinckx, 1997, (which was included in the CSR), but first year outcomes were published only in this abstract thus far. - Screenshot not available -

Variations on Assertive Community Treatment: A Study of Approaches and Client Outcomes of Four Teams in South Eastern Ontario. Principal Investigators: Dr. Shirley Eastabrook & Ms. Terry Krupa

Yes. CSR author hadn't heard of it. Ongoing trial, not yet published.

A Randomized Control Trial of Assertive Community Treatment in a Canadian Inner City Setting. Principal Investigator: Dr. Donald Wasylenki

Yes. CSR author hadn't heard of it. Ongoing trial, not yet published.

Message from the UCSF News Listserver: "The UCSF Asthma Center also conducts a study called the Asthma Education Project for patients. It focuses on the results of teaching asthmatics about their disorder, self-management skills, and correct use of inhalers. For more information about the research studies, contact Lila Glogowsky at the UCSF Asthma Clinical Research Center at (415) 380-9678."



Homepage of Lawren Daltroy: "Other ongoing research includes a trial of patient and community education to reduce asthma morbidity among adult minority- group members in inner-city Boston."


Online paper "Buteyko Breathing in Asthma: A Controlled Trial"

Possibly relevant. RCT has later been published in the MJA (Bowler, 1998).

Webpage mentioned a study about "Computer assisted assessment and management of patients with asthma"

Possibly unpublished RCT on computer-assisted asthma education (Author: C. McCowan) - Screenshot not available -

Foundation (American Lung Association) describing ongoing funded projects, including an asthma education program for teens, finished at the end of 1998, conducted by C.L. Joseph, Henry Ford Health System, Detroit, MI

Possibly relevant, but participants may not meet the inclusion criteria of the CSR (>16 years). Not yet published. Not mentioned in the CSR.

Press release describing a study with "asthma patients at the University of Michigan Health System", who "were taught to take control of their health needs".

No. Principal investigator W. Bria, contacted for details, says that it is not an RCT. Manuscript submitted.

Abstract 1998 AHSR Annual Meeting. Blixen et al "Feasibility of a Nurse-Run Asthma Education Program for Urban African Americans: A Pilot Study"

Randomized controlled trial, with questionable relevance for this CSR. In press at Journal of Asthma

Abstract 1995 AHSR Annual Meeting. M.E. Ford et al. Racial differences in the effects of asthma education on health beliefs and health behavior

Possibly relevant. Has been published in Ethn Health (Ford, 1997). Not mentioned in the CSR

Oxford continuous positive airways pressure therapy trial project information webpage

Yes. Well-conducted RCT carried out 1996-1998, published in 1999 (Jenkinson, 1999). As of 1/2000 not yet included in the CSR.

Webpage lists NHS R&D grants May 1998.

A double-blind randomised control trial of nasal CPAP in patients with obstructive sleep apnoea. Dr Adrian Kendrick, Senior Clinical Scientist, United Bristol Health Care Trust

Probably. Study not yet published. Not mentioned in the CSR.

