Literature Searching

The UGPD group facilitates literature searching for review authors. For each review, the Trials Search Co-ordinator (TSC) usually devises a search strategy with the authors during the development of the review's protocol, based on relevant clinical terms agreed between the author and the TSC.

The search strategy is constructed using a combination of MeSH terms and free-text terms. All reports of randomised controlled trials identified while searching are added to the Group's Specialised Register.
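For illustration, a minimal sketch of how such a strategy is assembled is shown below (Python; the MeSH headings and free-text terms are hypothetical examples, not terms from any actual review):

    # Illustrative only: real strategies use clinical terms agreed per review
    # with the TSC; the headings and text words below are made up.
    mesh_terms = ["exp Stomach Neoplasms/", "exp Pancreatic Neoplasms/"]
    free_text = ["gastric cancer*.tw.", "pancrea* carcinoma*.tw."]

    # Number each line as Ovid would, then OR the synonyms together.
    lines = mesh_terms + free_text
    strategy = [f"{i}. {term}" for i, term in enumerate(lines, start=1)]
    strategy.append(f"{len(lines) + 1}. or/1-{len(lines)}")

    print("\n".join(strategy))

The free-text lines catch records that have not (or not yet) been indexed with the corresponding subject headings, which is why both kinds of term are combined.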

During the development of the review, the TSC executes the search on the Group's core databases. When appropriate, the following additional resources are also searched:

  • LILACS
  • AMED
  • CINAHL
  • PsycINFO
  • Web of Science
  • PDQ
  • Current Contents
  • SIGLE

Occasionally a review team will have sufficient in-house library or information support. In such cases, the TSC works with the in-house team, confirming that the searches are of sufficient quality and supplementing them where access to particular resources is limited.

Search Filters for RCTs

The Embase project

For many years, Cochrane has been feeding reports of trials from PubMed and Embase into the Cochrane Central Register of Controlled Trials (CENTRAL). This has made CENTRAL an incredibly rich and valuable resource for authors and others trying to identify the evidence. The way that Embase records are fed into CENTRAL changed in 2013, when a new model that included crowdsourcing (the Embase Project) was introduced. Records of possible RCTs and quasi-RCTs from Embase are now identified in two ways (a rough sketch in code follows the list):

1. Through an autofeed, and

2. Through human processing/screening (using a ‘crowd’).
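
As a rough sketch of how these two routes fit together (Python; the function, field names and record structure are assumptions for illustration, not Cochrane's actual implementation):

    # Hypothetical sketch of the monthly Embase-to-CENTRAL routing.
    def route_embase_record(record):
        """Decide how a retrieved Embase record reaches CENTRAL."""
        # Route 1: the autofeed. Records already indexed with the relevant
        # Emtree terms go straight into CENTRAL each month.
        indexed = set(record["emtree_terms"])
        if indexed & {"randomized controlled trial", "controlled clinical trial"}:
            return "autofeed -> CENTRAL"
        # Route 2: anything else retrieved by the sensitive search filter
        # is queued for human screening by the crowd.
        return "crowd screening queue"

    example = {"title": "A trial of X versus Y",
               "emtree_terms": ["major clinical study"]}
    print(route_embase_record(example))  # -> crowd screening queue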

The autofeed

Approximately two-thirds of all the reports of RCTs in Embase are indexed with the Emtree term for RCTs or CCTs (controlled clinical trials). Every month (around the 20th) we feed these records directly into CENTRAL. So that’s two-thirds of the records we want in CENTRAL identified already.

The crowd approach

The remaining third is retrieved through a sensitive search strategy developed by Julie Glanville at YHEC. The search (complete strategy available at: http://www.cochranelibrary.com/help/central-creation-details.html) is run every month in Embase via Ovid SP, and the resulting records are then screened by a crowd. Anyone can join the crowd and start screening: when someone signs up, they complete a brief, interactive training module before being able to screen ‘live’ records.

How do we ensure quality in this process?

To be included in CENTRAL, a record must be assessed by at least two different screeners. We have evaluated this method, and the results show very high levels of accuracy in the crowd’s ability both to identify the records we want in CENTRAL and to reject the records we don’t.
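
A minimal sketch of this dual-screening rule (Python; the handling of disagreements shown below is an assumption, since the text only states that at least two screeners assess each record):

    from collections import Counter

    def screening_decision(votes):
        """votes: 'include'/'reject' assessments from individual screeners."""
        if len(votes) < 2:
            return "pending"  # every record needs at least two assessments
        tally = Counter(votes)
        if tally["include"] >= 2:
            return "include in CENTRAL"
        if tally["reject"] >= 2:
            return "reject"
        return "refer to a further screener"  # assumed disagreement handling

    print(screening_decision(["include", "include"]))  # include in CENTRAL
    print(screening_decision(["include", "reject"]))   # refer to a further screener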

Progress to date

Our vision is that in the future authors and information specialists will only need to search CENTRAL to find relevant reports of randomised and quasi-randomised trials. We are much closer to this goal now as far as Embase is concerned. We have established the new crowd model and evaluated its accuracy, and we have cleared several years’ worth of records. As of mid-December 2015, the crowd were screening records added to Embase in October 2015. The number of records needing human screening roughly doubled in the last year with the introduction of conference records into the crowd process, but despite this we are closing the small time lag between the date of publication in Embase and publication in CENTRAL.

Does this mean review author teams won’t need to search Embase anymore?

Not quite, or at least not completely. If you only searched CENTRAL at the time of writing, you would potentially miss RCTs added to Embase in October, November and the first couple of weeks of December 2015. For the time being, you will need to run a ‘top-up’ search in Embase covering the last few months to be completely up to date.
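
To make the gap concrete, the sketch below works out which Embase months a top-up search must cover, using the mid-December 2015 example above (Python; the helper is illustrative, not an official procedure):

    from datetime import date

    def months_to_top_up(last_cleared, today):
        """(year, month) pairs from the month after last_cleared up to today."""
        months, y, m = [], last_cleared.year, last_cleared.month
        while (y, m) < (today.year, today.month):
            m += 1
            if m > 12:
                y, m = y + 1, 1
            months.append((y, m))
        return months

    # Crowd current to September 2015; 'today' is mid-December 2015.
    print(months_to_top_up(date(2015, 9, 30), date(2015, 12, 15)))
    # -> [(2015, 10), (2015, 11), (2015, 12)]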

Looking ahead

Now that we have established our method as robust, we are focusing on improving efficiency by closing the time lag, and on investigating the feasibility of bringing other databases into the crowd model. Work on a ‘centralised search service’ is at an early stage. There are challenges ahead, particularly the issue of getting permission to republish records from other commercial databases. However, we’re working right now on a way of identifying and including RCT reports from CT.gov. The approach outlined above won’t be right for every database, but using a combination of highly sensitive search filter development, crowdsourcing and machine learning, we’re pressing ahead with the project!

Where can I find out more?

For any other questions or queries, contact Anna Noel-Storr (anna.noel-storr@rdm.ox.ac.uk), Gordon Dooley (Gordon@metaxis.com) or Ruth Foxlee (rfoxlee@cochrane.org).