Retraction Watch reports on a strange case of alleged plagiarism. In February 2016, F1000Research published a paper called "How blockchain-timestamped protocols could improve the trustworthiness of medical science". The authors, Greg Irving and John Holden, demonstrated the use of the bitcoin blockchain as a way of publicly verifying the existence of a certain document at a certain point in time. This approach, they say, could be used to make preregistered research protocols more secure. A problem with preregistration is that it requires a trusted central authority to securely store the protocols. To overcome this, Irving and Holden suggested using the distributed bitcoin network to timestamp documents.
The method involves hashing the document containing the protocol, and then using the hash value as a password (private key) to create a new bitcoin account. By transferring a nominal sum of bitcoins into the new account, a permanent data trail is created, all across the worldwide bitcoin network, which anyone can later use to verify that the hash value was used on the network at that particular time. Because the hash value is effectively unique to a particular document (even a change of one character would totally change the hash), this serves as a tamper-proof way of verifying preregistration. It's a clever idea - repurposing the bitcoin network to help make science more rigorous.
But it turns out that it wasn't Irving and Holden's idea. Back in August 2014, a blogger called Benjamin Gregory Carlisle wrote a post called "Proof of prespecified endpoints in medical research with the bitcoin blockchain". In this piece, Carlisle proposed the hash document/create bitcoin account/transfer nominal sum system as a way of verifying preregistration in science. He provided a step-by-step guide to how to do it. Yet Irving and Holden didn't cite or acknowledge Carlisle's post at all. In fact, they implied that the idea was theirs: for example, they wrote that "we propose" the blockchain scheme.
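To make the hashing step of the scheme concrete, here is a minimal Python sketch. It assumes SHA-256 as the hash function (the description above doesn't name one) and only checks that the digest falls in the valid range for a bitcoin private key. It does not derive an address or send any bitcoins, and the function names and sample protocol text are hypothetical, invented purely for illustration.

```python
import hashlib

# Order of the secp256k1 curve used by bitcoin. A 256-bit number k can serve
# as a private key only if 1 <= k < SECP256K1_ORDER.
SECP256K1_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def protocol_digest(document: bytes) -> str:
    """Return the SHA-256 digest of the protocol document as 64 hex characters.

    Assumption: SHA-256 is the hash in use; the scheme only needs a
    256-bit cryptographic hash.
    """
    return hashlib.sha256(document).hexdigest()


def is_valid_private_key(hex_digest: str) -> bool:
    """Check that the digest, read as an integer, is a usable private key."""
    return 1 <= int(hex_digest, 16) < SECP256K1_ORDER


def matches_registered(document: bytes, registered_digest: str) -> bool:
    """Re-hash a document and compare it with the digest that was timestamped."""
    return protocol_digest(document) == registered_digest


# Hypothetical protocol text, and a copy with a single character changed.
original = b"Primary endpoint: all-cause mortality at 12 months.\n"
tampered = b"Primary endpoint: all-cause mortality at 18 months.\n"

registered = protocol_digest(original)

print("digest usable as private key:", is_valid_private_key(registered))
print("original verifies:", matches_registered(original, registered))
print("tampered verifies:", matches_registered(tampered, registered))
```

Anyone holding the original file can repeat the hash and arrive at the same key, which is what lets a third party check the protocol against the public transaction record. The edited copy produces an unrelated digest and so fails the check.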
Reading both documents makes it clear that intellectually speaking, the F1000Research paper is very closely based on Carlisle's blog post. The main difference is that Carlisle simply proposed the idea, while Irving and Holden actually tried it out in practice - but what they tried was 100% Carlisle's idea. Also, in terms of the text, the paper contained some passages which are strikingly similar to Carlisle's post. For example, here's the post:
Bitcoin uses a distributed, permanent, timestamped, public ledger of all transactions (called a “blockchain”) to establish which addresses have been credited with how many bitcoins. The blockchain indirectly provides a method for establishing the existence of a document at particular time that can be independently verified by any interested party...
And here's the paper:
A blockchain is a distributed, permanent, timestamped public ledger of transactions. In doing so it provides a method for establishing the existence of a document at a particular time that can be independently verified by any interested party.
Hmm. I did my own plagiarism check and I found another possible issue with the paper. Here is some text from a Wall Street Journal article published on 2nd February 2016:
Once a block of data is recorded on the blockchain ledger, it’s extremely difficult to change or remove. When someone wants to add to it, participants in the network — all of which have copies of the existing blockchain — run algorithms to evaluate and verify the proposed transaction. If a majority of nodes agree that the transaction looks valid — that is, identifying information matches the blockchain’s history — then the new transaction will be approved and a new block added to the chain.
Compare this to this passage from the paper, published 26th February 2016:
When someone wishes to add to it, participants in the network – all of whom have copies of the existing blockchain – run algorithms to evaluate and verify the proposed action. Once the majority of ‘nodes’ confirm that a transaction is valid i.e. matches the blockchain history then the new transaction will be approved and added to the chain. Once a block of data is recorded on a blockchain ledger it is extremely difficult to change or remove it as doing so would require changing the record on many thousands computers worldwide.
According to Retraction Watch, Carlisle wrote to F1000Research to protest about the similarities. Irving and Holden subsequently uploaded a revised version of their paper. In the new version, they do cite Carlisle as the originator of the blockchain preregistration idea, and the textual similarities are reduced (the WSJ-like passage remains, however). But is this enough? Should the original paper have been retracted for plagiarism? Carlisle is not happy with the revisions, telling Retraction Watch:
First, ex post facto citation would not undo the misconduct of plagiarism, if it were deemed to have occurred. According to COPE, corrections are only warranted with small passages (e.g. a few sentences in the discussion) of unattributed parallel text. The COPE guidelines also say “Publications should be retracted as soon as possible after the journal editor is convinced that the publication is seriously flawed and misleading (or is redundant or plagiarised).”
In my view, allowing authors to simply correct their papers when accused of plagiarism is tantamount to condoning the practice. It's easy to "correct" a plagiarized article after the fact by adding citations and modifying text. If authors know that the worst possible outcome of plagiarism is that they'll have to modify their paper, why not plagiarize? Only if there is a real prospect of a sanction (i.e. retraction) will people think twice before committing plagiarism.
But if F1000Research's decision to allow the paper to be corrected is questionable, some of the statements by the editors and peer reviewers who approved the paper are downright bizarre. Retraction Watch quotes one reviewer as saying that "blog posting is just equivalent to idea sharing and brainstorming", seemingly implying that blogs don't count when it comes to plagiarism, and don't need to be cited. In fact, plagiarism is plagiarism whatever the source, and people cite blogs all the time: here are some citations that my blog has received this year. Another reviewer claimed that "The blog contains no supplementary materials and yet the research paper does", as if this proves that the paper wasn't based on the blog post. It's concerning to see such misunderstandings on the part of academics.
Irving G, & Holden J (2016). How blockchain-timestamped protocols could improve the trustworthiness of medical science. F1000Research, 5. PMID: 27239273