Randomising the word order of a text

When sharing texts with others, it is occasionally necessary to obfuscate what was written so that it is harder to follow what a person has said. The following approach randomly changes the word order within each text. This makes the person’s writing hard to read, which is a step towards anonymisation, although the individual words themselves remain in the text. Of course, this is only useful if word order does not matter for your text analysis, which is often the case.

I use the free and open-source statistical program R to achieve this goal:

# example data frame:
d = data.frame(
  ID = c(1, 2, 3),
  CLASS = c(0, 2, 3),
  text = c(
    "I read your paper Towards Automated Content Analysis of Discussion Transcripts with great interest",
    "I find this topic very interesting especially as I am working on the automated detection of a related construct reflective thinking",
    "The paper announces that the complete dataset for the study and the source code of the implementation are available"
  )
)


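# shuffle the words of each text; scan() splits the string on whitespace
d$text = sapply(d$text, function(x) {
  words = scan(text = as.character(x), what = "character", quote = "", quiet = TRUE)
  paste(sample(words), collapse = " ")
}, USE.NAMES = FALSE)  # do not keep the original texts as element names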



Because the shuffle is random, the result differs on every run. One possible result for d$text:

[1] "Towards Transcripts interest Discussion great I with Content paper your Automated read Analysis of"
[2] "this related very I interesting of automated am I on as working construct the thinking reflective detection topic especially find a"
[3] "study code that complete implementation paper the the the are available the of for and The source announces dataset"
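
The claim that shuffling is harmless when word order does not matter can be checked directly. The following is a minimal sketch, not part of the approach above: shuffle_words is a hypothetical helper name, it uses strsplit() in place of scan(), and the seed value is arbitrary. It shuffles one sentence and then confirms that the set of words, including their frequencies, is the same before and after.

```r
# Hypothetical helper (not from the original post): shuffle the words of one string.
shuffle_words <- function(x) {
  words <- strsplit(as.character(x), "\\s+")[[1]]  # split on runs of whitespace
  words <- words[words != ""]                      # drop empty pieces from leading spaces
  paste(sample(words), collapse = " ")
}

set.seed(42)  # arbitrary seed, only here to make the shuffle repeatable

original <- "I read your paper with great interest"
shuffled <- shuffle_words(original)

# Compare the sorted word lists: identical lists mean identical word counts.
same_bag <- identical(
  sort(strsplit(original, "\\s+")[[1]]),
  sort(strsplit(shuffled, "\\s+")[[1]])
)
same_bag
```

If same_bag is TRUE, any analysis based purely on word counts (for example a bag-of-words model) sees the same input as before the shuffle. Setting a seed is optional; without it, every run produces a different order.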