
How to Be A PubMed Historian

By Neuroskeptic, May 18, 2010 8:05 PM


Quite a lot of people seem to like those graphs I sometimes make showing the number of papers published about a certain topic in any given year, based on the number of PubMed hits.

But how do I do it? Surely I don't sit there manually searching PubMed for each term, for each year, right? That would mean dozens, maybe hundreds, of manual searches. Well, unfortunately, that is exactly how I've done it in the past. I really am that cool, see.


Actually it doesn't take very long once you get into the swing of it, but I've now worked out a better way. See below for a bash script which repeatedly searches PubMed for each year in a given range, downloads the first page of the results, picks out the bit where it tells you how many hits you got, and puts it all into a single output text file ready to be pasted into Excel or whatever. This comes with no guarantees whatsoever, but it seems to work. Enjoy...

Edit 29/06/2010: Vastly improved version that searches for multiple different terms sequentially, accepts terms that include spaces, and outputs the data into a sensible format. The search term text file should be a plain text file containing one search term per line, e.g.:

serotonin depression
dopamine depression
GABA depression

This would search for each of those terms and output the data for each year into a single text file (with three data columns in this case), which is good for comparing the relative popularity of many different terms across time.
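To make the output format concrete, here is a minimal sketch of how the header line of that file comes out for a given list of terms. It is an illustration only, not part of the script itself: the function name `make_header` is made up, and it uses `sed` rather than bash's own substitution so that it also runs under a plain POSIX shell.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): print the
# tab-separated header line for a file of search terms.
# Spaces inside a term become %20, so each term stays a single column name.
make_header() {
    printf 'YEAR\t'
    while IFS= read -r subject; do
        printf '%s\t' "$(printf '%s' "$subject" | sed 's/ /%20/g')"
    done < "$1"
    printf '\n'
}

# Example: build a terms file (one search term per line) and show its header.
terms_file=$(mktemp)
printf '%s\n' 'serotonin depression' 'dopamine depression' 'GABA depression' > "$terms_file"
make_header "$terms_file"
rm -f "$terms_file"
```

Each later line of the output file follows the same layout: the year first, then one hit count per term, all separated by tabs.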

---

#!/bin/bash
# 29.06.2010
# PubMedHistory script by Neuroskeptic http://neuroskeptic.blogspot.com
# Script to find out how many PubMed hits there are for a certain string in a given year range.
# usage: script (search term text file) (start year) (end year) (output file)
# e.g. script list_of_terms.txt 2000 2005 dope.txt

# First, print the HEADER line of the output file.
printf "YEAR\t" > $4
cat $1 | while read subject
do
    # Pre-format the subject to remove spaces.
    # (// replaces every space, not just the first one.)
    ffa=${subject// /%20}
    echo -n "$ffa" >> $4
    printf "\t" >> $4
done
# ...and a newline.
printf "\n" >> $4

# Now the real thing. The main loop is a YEAR loop:
for (( yearz=$2; yearz<=$3; yearz++ ))
do
    # For each year, create a temporary file t.txt containing the output for this line.
    # First, the year, then a tab.
    printf "$yearz\t" > t.txt

    # Now, a second loop to go through the list of searches.
    cat $1 | while read subject
    do
        one=${subject// /%20}
        wget -O $yearz.txt http://www.ncbi.nlm.nih.gov/sites/entrez?term="$one"+"$yearz"'[Publication Date]'

        # Find the line in the output with what we're interested in.
        output=`cat $yearz.txt | grep ncbi_resultcount | awk '{print}'`
        # Now, change it to get rid of the bit containing the search term,
        # as this will screw up the next step if it contains spaces!
        output=${output/content*publication/LOL}
        # Print to a temp file.
        echo $output > temp$one$2$3$4.txt
        # Find the bit we want using awk.
        output=`awk '{ print $22 }' temp$one$2$3$4.txt`
        rm temp$one$2$3$4.txt
        rm $yearz.txt
        # Trim the output.
        trimmedout=${output#content=\"}
        trimmedoutB=${trimmedout%\"}
        # Replace "false" with 0 because that's what "false" means.
        trimmedoutC=${trimmedoutB/'false'/0}
        echo in year $yearz , I got $trimmedoutC. Saving to temp file t.txt
        # Write the result, and a tab, to the TEMPORARY output file.
        printf "$trimmedoutC\t" >> t.txt
    done
    # Now we've done all the search terms for this YEAR, so send the temporary data to the final file...
    cat t.txt >> $4
    # ...and give it a newline.
    printf "\n" >> $4
done
rm t.txt
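
The script above picks the hit count out of the HTML of the results page, so it depends on that page's layout staying the same. As a hedged alternative, and not the method used above, the same number can be requested from NCBI's E-utilities `esearch` service, which returns a small XML document with a `<Count>` element. The sketch below assumes that `curl` is installed and that `esearch` accepts `rettype=count` and the `[dp]` (publication date) field tag. The function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: ask NCBI E-utilities for a hit count instead of scraping HTML.
# Assumptions: curl is available; the endpoint and its parameters
# (db, term, rettype=count) behave as described in NCBI's documentation.

# Build the esearch URL for one search term ($1) and one year ($2).
# Spaces in the term become "+", and [dp] is percent-encoded as %5Bdp%5D.
build_url() {
    term=$(printf '%s' "$1" | sed 's/ /+/g')
    printf '%s?db=pubmed&rettype=count&term=%s+AND+%s%%5Bdp%%5D\n' \
        'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi' "$term" "$2"
}

# Read esearch XML on stdin and print the number inside the first <Count> line.
extract_count() {
    sed -n 's:.*<Count>\([0-9][0-9]*\)</Count>.*:\1:p' | head -n 1
}

# Fetch the hit count for one term and one year (needs network access).
pubmed_count() {
    curl -s "$(build_url "$1" "$2")" | extract_count
}
```

Calling `pubmed_count 'serotonin depression' 2005` should then print a single number, which could be dropped into the year loop in place of the wget, grep and awk steps.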
