
Appendix to JAWWS: An Incrementally Rewritten Paragraph

Yesterday, I published a post describing an idea to improve scientific style by rewriting papers as part of a new science journal. I originally wanted to conclude the post with a demonstration of how the rewriting could be done, but I didn’t want to add too much length. Here it is as an appendix.

We start with a paragraph taken more or less at random from a biology paper titled “Shedding light on the ‘dark side’ of phylogenetic comparative methods”, published by Cooper et al. in 2016. Then, in five steps, we’ll incrementally improve it — at least according to my preferences! Let me know if it fits your own idea of good scientific writing as well.

1. Original

Most models of trait evolution are based on the Brownian motion model (Cavalli-Sforza & Edwards 1967; Felsenstein 1973). The Ornstein–Uhlenbeck (OU) model can be thought of as a modification of the Brownian model with an additional parameter that measures the strength of return towards a theoretical optimum shared across a clade or subset of species (Hansen 1997; Butler & King 2004). OU models have become increasingly popular as they tend to fit the data better than Brownian motion models, and have attractive biological interpretations (Cooper et al. 2016b). For example, fit to an OU model has been seen as evidence of evolutionary constraints, stabilising selection, niche conservatism and selective regimes (Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013). However, the OU model has several well-known caveats (see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014). For example, it is frequently incorrectly favoured over simpler models when using likelihood ratio tests, particularly for small data sets that are commonly used in these analyses (the median number of taxa used for OU studies is 58; Cooper et al. 2016b). Additionally, very small amounts of error in data sets can result in an OU model being favoured over Brownian motion simply because OU can accommodate more variance towards the tips of the phylogeny, rather than due to any interesting biological process (Boettiger, Coop & Ralph 2012; Pennell et al. 2015). Finally, the literature describing the OU model is clear that a simple explanation of clade-wide stabilising selection is unlikely to account for data fitting an OU model (e.g. Hansen 1997; Hansen & Orzack 2005), but users of the model often state that this is the case. Unfortunately, these limitations are rarely taken into account in empirical studies.

Okay, first things first: let’s banish all those horrendous inline citations to footnotes.

2. With footnotes

Most models of trait evolution are based on the Brownian motion model.[1] The Ornstein–Uhlenbeck (OU) model can be thought of as a modification of the Brownian model with an additional parameter that measures the strength of return towards a theoretical optimum shared across a clade or subset of species.[2] OU models have become increasingly popular as they tend to fit the data better than Brownian motion models, and have attractive biological interpretations.[3] For example, fit to an OU model has been seen as evidence of evolutionary constraints, stabilising selection, niche conservatism and selective regimes.[4] However, the OU model has several well-known caveats.[5] For example, it is frequently incorrectly favoured over simpler models when using likelihood ratio tests, particularly for small data sets that are commonly used in these analyses.[6] Additionally, very small amounts of error in data sets can result in an OU model being favoured over Brownian motion simply because OU can accommodate more variance towards the tips of the phylogeny, rather than due to any interesting biological process.[7] Finally, the literature describing the OU model is clear that a simple explanation of clade-wide stabilising selection is unlikely to account for data fitting an OU model,[8] but users of the model often state that this is the case. Unfortunately, these limitations are rarely taken into account in empirical studies.

[1] Cavalli-Sforza & Edwards 1967; Felsenstein 1973
[2] Hansen 1997; Butler & King 2004
[3] Cooper et al. 2016b
[4] Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013
[5] see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014
[6] the median number of taxa used for OU studies is 58; Cooper et al. 2016b
[7] Boettiger, Coop & Ralph 2012; Pennell et al. 2015
[8] e.g. Hansen 1997; Hansen & Orzack 2005

Much better.

Does this need to be a single paragraph? No, it doesn’t. Let’s not go overboard with cutting it up, but I think a three-fold division makes sense.

3. Multiple paragraphs

Most models of trait evolution are based on the Brownian motion model.[9]

The Ornstein–Uhlenbeck (OU) model can be thought of as a modification of the Brownian model with an additional parameter that measures the strength of return towards a theoretical optimum shared across a clade or subset of species.[10] OU models have become increasingly popular as they tend to fit the data better than Brownian motion models, and have attractive biological interpretations.[11] For example, fit to an OU model has been seen as evidence of evolutionary constraints, stabilising selection, niche conservatism and selective regimes.[12]

However, the OU model has several well-known caveats.[13] For example, it is frequently incorrectly favoured over simpler models when using likelihood ratio tests, particularly for small data sets that are commonly used in these analyses.[14] Additionally, very small amounts of error in data sets can result in an OU model being favoured over Brownian motion simply because OU can accommodate more variance towards the tips of the phylogeny, rather than due to any interesting biological process.[15] Finally, the literature describing the OU model is clear that a simple explanation of clade-wide stabilising selection is unlikely to account for data fitting an OU model,[16] but users of the model often state that this is the case. Unfortunately, these limitations are rarely taken into account in empirical studies.

[9] Cavalli-Sforza & Edwards 1967; Felsenstein 1973
[10] Hansen 1997; Butler & King 2004
[11] Cooper et al. 2016b
[12] Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013
[13] see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014
[14] the median number of taxa used for OU studies is 58; Cooper et al. 2016b
[15] Boettiger, Coop & Ralph 2012; Pennell et al. 2015
[16] e.g. Hansen 1997; Hansen & Orzack 2005

We haven’t rewritten anything yet — the changes so far are really low-hanging fruit! Let’s see if we can improve the text more with some rephrasing. This is trickier, because there’s a risk I change the original meaning, but it’s not impossible.

4. Some rephrasing

Most models of trait evolution are based on the Brownian motion model, in which traits evolve randomly and accrue variance over time.[17]

What if we add a parameter to measure how much the trait motion returns to a theoretical optimum for a given clade or set of species? Then we get a family of models called Ornstein-Uhlenbeck,[18] first developed as a way to describe friction in the Brownian motion of a particle. These models have become increasingly popular, both because they tend to fit the data better than simple Brownian motion, and because they have attractive biological interpretations.[19] For example, fit to an Ornstein-Uhlenbeck model has been seen as evidence of evolutionary constraints, stabilising selection, niche conservatism and selective regimes.[20]

However, Ornstein-Uhlenbeck models have several well-known caveats.[21] For example, they are frequently — and incorrectly — favoured over simpler Brownian models. This occurs with likelihood ratio tests, particularly for the small data sets that are commonly used in these analyses.[22] It also happens when there is error in the data set, even very small amounts of error, simply because Ornstein-Uhlenbeck models accommodate more variance towards the tips of the phylogeny — therefore suggesting an interesting biological process where there is none.[23] Additionally, users of Ornstein-Uhlenbeck models often state that clade-wide stabilising selection accounts for data fitting the model, even though the literature describing the model warns that such a simple explanation is unlikely.[24] Unfortunately, these limitations are rarely taken into account in empirical studies.

[17] Cavalli-Sforza & Edwards 1967; Felsenstein 1973
[18] Hansen 1997; Butler & King 2004
[19] Cooper et al. 2016b
[20] Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013
[21] see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014
[22] the median number of taxa used for Ornstein-Uhlenbeck studies is 58; Cooper et al. 2016b
[23] Boettiger, Coop & Ralph 2012; Pennell et al. 2015
[24] e.g. Hansen 1997; Hansen & Orzack 2005

What did I do here? First, I completely got rid of the “OU” acronym. Acronyms may look like they simplify the writing, but in fact they often demand more cognitive resources from the reader, who has to constantly remember that OU means Ornstein-Uhlenbeck.

Then I rephrased several sentences to make them flow better, at least according to my taste.

I also added a short explanation of what Brownian and Ornstein-Uhlenbeck models are. That might not be necessary, but it’s always good to make life easier for the reader. Even if you defined the terms earlier in the paper, repetition is useful to avoid asking the reader to make an effort to remember. And even if everyone reading your paper is expected to know what Brownian motion is, there’ll be some student somewhere thanking you for reminding them.[25]

[25] I considered doing this with the “evolutionary constraints, stabilising selection, niche conservatism and selective regimes” enumeration too, but these are mere examples, less critical to the main idea of the section. Adding definitions would make the sentence quite long and detract from the main flow. Also I don’t know what the definitions are and don’t feel like researching lol.

This is already pretty good, and still close enough to the original. What if I try to go further?

5. More rephrasing

Most models of trait evolution are based on the Brownian motion model.[26] Brownian motion was originally used to describe the random movement of a particle through space. In the context of trait evolution, it assumes that a trait (say, beak size in some group of bird species) changes randomly, with some species evolving a larger beak, some a smaller one, and so on. Brownian motion implies that variance in beak size, across the group of species, increases over time.

This is a very simple model. What if we refined it by adding a parameter? Suppose there is a theoretical optimal beak size for this group of species. The new parameter measures how much the trait tends to return to this optimum. This gives us a type of model called Ornstein-Uhlenbeck,[27] first developed as a way to add friction to the Brownian motion of a particle.

Ornstein-Uhlenbeck models have become increasingly popular in trait evolution, for two reasons.[28] First, they tend to fit the data better than simple Brownian motion. Second, they have attractive biological interpretations. For example, fit to an Ornstein-Uhlenbeck model has been seen as evidence of a number of processes, including evolutionary constraints, stabilising selection, niche conservatism and selective regimes.[29]

Despite this, Ornstein-Uhlenbeck models are not perfect, and have several well-known caveats.[30] Sometimes you really should use a simpler model! It is common, but incorrect, to favour an Ornstein-Uhlenbeck model over a Brownian model after performing likelihood ratio tests, particularly for the small data sets that are often used in these analyses.[31] Then there is the issue of error in data sets. Even a very small amount of error can lead researchers to pick an Ornstein-Uhlenbeck model, simply because such models accommodate more variance towards the tips of the phylogeny — therefore suggesting interesting biological processes where there are none.[32]

Additionally, users of Ornstein-Uhlenbeck models often state that the reason their data fits the model is clade-wide stabilising selection (for instance, selection for intermediate beak sizes, rather than extreme ones, across the group of birds). Yet the literature describing the model warns that such simple explanations are unlikely.[33]

Unfortunately, these limitations are rarely taken into account in empirical studies.

[26] Cavalli-Sforza & Edwards 1967; Felsenstein 1973
[27] Hansen 1997; Butler & King 2004
[28] Cooper et al. 2016b
[29] Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013
[30] see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014
[31] the median number of taxa used for Ornstein-Uhlenbeck studies is 58; Cooper et al. 2016b
[32] Boettiger, Coop & Ralph 2012; Pennell et al. 2015
[33] e.g. Hansen 1997; Hansen & Orzack 2005

Okay, many things to notice here. First, I added an example, bird beak size. I’m not 100% sure I understand the topic well enough for my example to be particularly good, but I think it’s decent. I also added more explanation of what Brownian models are in trait evolution. Then I rephrased other sentences to make the tone less formal.

As a result, this version is longer than the previous ones. It seemed justified to cut it up into more paragraphs to accommodate the extra length. It’s plausible that the authors originally tried to include too much content in too few words, perhaps to satisfy a length constraint imposed by the journal.

Let’s do one more round…

6. Rephrasing, extreme edition

Suppose you want to model the evolution of beak size in some fictional family of birds. There are 20 bird species in the family, all with different average beak sizes. You want to create a model of how their beaks changed over time, so you can reimagine the beak of the family’s ancestor and understand what happened exactly.

Most people who try to model the evolution of a biological trait use some sort of Brownian motion model.[34] Brownian motion, originally, refers to the random movement of a particle in a liquid or gas. The mathematical analogy here is that beak size evolves randomly: it becomes very large in some species, very small in others, with various degrees of intermediate forms between the extremes. Therefore, across the 20 species, the variance in beak size increases over time.

Brownian motion is a very simple model. What if we add a parameter to get a slightly more complicated one? Let’s assume there’s a theoretical optimal beak size for our family of birds — maybe because the seeds they eat have a constant average diameter. The new parameter measures how much beak size tends to return to the optimum during its evolution. This gives us a type of model called Ornstein-Uhlenbeck,[35] first developed as a way to add friction to the Brownian motion of a particle. We can imagine the “friction” to be the resistance against deviating from the optimum.

Ornstein-Uhlenbeck models have become increasingly popular, for two reasons.[36] First, they often fit real-life data better than simple Brownian motion. Second, they are easy to interpret biologically. For example, maybe our birds don’t have as extreme beak sizes as we’d expect from a Brownian model, so it makes sense to assume there’s some force pulling the trait towards an intermediate optimum. That force might be an evolutionary constraint, stabilising selection (i.e. selection against extremes), niche conservatism (the tendency to keep ancestral traits), or selective regimes. Studies using Ornstein-Uhlenbeck models have been seen as evidence for each of these patterns.[37]

Of course, Ornstein-Uhlenbeck models aren’t perfect, and in fact have several well-known caveats.[38] For example, simpler models are sometimes better. It’s common for researchers to incorrectly choose Ornstein-Uhlenbeck instead of Brownian motion when using likelihood ratio tests to compare models, a problem especially present due to the small data sets that are often used in these analyses.[39] Then there is the issue of error in data sets (e.g. when your beak size data isn’t fully accurate). Even a very small amount of error can lead researchers to pick an Ornstein-Uhlenbeck model, simply because it’s better at accommodating variance among closely related species at the tips of a phylogenetic tree. This can suggest interesting biological processes where there are none.[40]

One particular mistake that users of Ornstein-Uhlenbeck models often make is to assume that their data fits the model due to clade-wide stabilising selection (e.g. selection for intermediate beak sizes, rather than extreme ones, across the family of birds). Yet the literature warns against exactly that — according to the papers describing the models, such simple explanations are unlikely.[41]

Unfortunately, these limitations are rarely taken into account in empirical studies.

[34] Cavalli-Sforza & Edwards 1967; Felsenstein 1973
[35] Hansen 1997; Butler & King 2004
[36] Cooper et al. 2016b
[37] Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013
[38] see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014
[39] the median number of taxa used for Ornstein-Uhlenbeck studies is 58; Cooper et al. 2016b
[40] Boettiger, Coop & Ralph 2012; Pennell et al. 2015
[41] e.g. Hansen 1997; Hansen & Orzack 2005

This is longer still than the previous version! At this point I’m convinced the original paragraph was artificially short. That is, it packed far more information than a text of its size normally should.

This is a common problem in science writing. Whenever you write something, there’s a tradeoff between brevity, clarity, amount of information, and complexity: you can only maximize three of them. Since science papers often deal with a lot of complex information, and have word limits, clarity often gets the short end of the stick.

Version 6 is a good example of sacrificing brevity to get more clarity. In this case it’s important to keep the amount of information constant, because I don’t want to change what the original authors were saying. It is possible that they were saying too many things. On the other hand, this is only one paragraph in a longer paper, so maybe it made sense to simply mention some ideas without developing them.

I tried a Version 7 in which I aimed for a shorter paragraph, on the scale of the original one, but I failed. To be able to keep all the information, I would have to sacrifice the extra explanations and the bird beak example, and we’d be back to square one. This suggests that the original paragraph and my rewritten version sit at different points on the tradeoff curve. The original is brief, information-rich, and complex; my version is information-rich, complex, and clear. To get brief and clear would require taking some information out, which I can’t do as a rewriter.

It is my opinion that sacrificing clarity is the worst possible choice, at least in most contexts. We could then rephrase my project as attempting to emphasize clarity above all else — after all, brevity, information richness and complexity serve no purpose if the text fails to communicate what it is meant to.
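
For readers who would like to see the two models from the rewritten paragraph in concrete terms, here is a minimal Python sketch. It is not taken from Cooper et al. or from any phylogenetics package: the function name `simulate_trait` and every parameter value are made up for illustration. It also simplifies heavily, since each lineage evolves independently from the same starting value, with no shared phylogenetic tree. Setting `alpha` to zero gives plain Brownian motion, and a positive `alpha` adds the Ornstein-Uhlenbeck pull towards an optimum.

```python
import random
import statistics


def simulate_trait(n_lineages, n_steps, dt=0.01, sigma=1.0, alpha=0.0,
                   optimum=0.0, start=0.0, seed=0):
    """Simulate one trait value per lineage and return the final values.

    alpha = 0 gives plain Brownian motion; alpha > 0 adds the
    Ornstein-Uhlenbeck pull back towards `optimum`.
    Lineages are independent here (an illustrative simplification).
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_lineages):
        x = start
        for _ in range(n_steps):
            pull = alpha * (optimum - x) * dt          # zero under Brownian motion
            noise = sigma * rng.gauss(0.0, dt ** 0.5)  # random change in the trait
            x += pull + noise
        finals.append(x)
    return finals


# Hypothetical settings: 500 lineages, 1000 small time steps each.
brownian = simulate_trait(n_lineages=500, n_steps=1000)
ou = simulate_trait(n_lineages=500, n_steps=1000, alpha=2.0)

var_bm = statistics.variance(brownian)
var_ou = statistics.variance(ou)
print(f"Brownian variance across lineages: {var_bm:.2f}")
print(f"Ornstein-Uhlenbeck variance across lineages: {var_ou:.2f}")
```

With these assumed settings, the spread of final trait values should come out much smaller in the Ornstein-Uhlenbeck run than in the Brownian one. That is the “friction” from Version 6: the pull towards the optimum keeps the variance from growing without limit.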


The Journal of Actually Well-Written Science

Once upon a time, I was a master’s student in evolutionary biology, on track towards a PhD and an academic research career.

Some gloomy day (it was autumn and it was Sweden), a professor suggested that we organize a journal club — a weekly gathering to discuss a scientific paper — as an optional addition to regular coursework. I immediately thought, “Reading science papers sucks, so obviously I’m not going to do more of that just for fun.” But all my classmates enthusiastically signed up for it, so I caved in and joined too. And so, every week, I went to the journal club and tried to hide the fact that I had barely skimmed the assigned paper.

I am no longer on track towards a PhD and an academic research career.

There were, of course, many reasons to leave the field after my master’s degree, some better than others. “I hate reading science papers” doesn’t sound like a very serious reason — but if I’m honest with myself, it was a true motivation to quit.

And I think that generalizes far beyond my personal experience.

Science papers are boring. They’re boring even when they should be interesting. They’re awful at communicating their contents. They’re a chore to read. They’re work.

In a way, that’s expected — papers aren’t meant to be entertainment — but over time, I’ve grown convinced that the pervasiveness of bad writing is a major problem in science. It requires a lot of researchers’ precious time and energy. It keeps the public out, including people who disseminate knowledge, such as teachers and journalists, and those who take decisions about scientific matters, such as politicians and business leaders. It discourages wannabe scientists. In short, it makes science harder than it needs to be.

The quality of the writing is, of course, only one of countless problems with current academic publishing. Others include access,[1] peer review,[2] labor exploitation,[3] the failure to report negative results,[4] fraud, and so on. These issues are important, but they are not the focus of this essay. The focus here is to examine and suggest a solution to a question that sounds petty and unserious, but is actually a genuine problem: the fact that science papers are incredibly tiresome.

[1] most papers are gated by journals and very expensive to get access to
[2] a very bad system in which anonymous scientists must review your paper before it gets published, and may arbitrarily reject your work, especially if they are in competition with you, or ask you to perform more experiments
[3] scientists don’t get paid for writing papers, or for reviewing them, and journals take all the financial upside
[4] which are less exciting than positive results


This post contains three main sections:

If you’re short on time, please read the third one, which includes the sketch of a plan to improve scientific style. The other two sections provide background and justification for the plan.

Additionally, I published an appendix in which I rewrite a paragraph multiple times as a demonstration.


What makes scientific papers difficult to read?

Three reasons: topic, content, and style.

Boring topics

Science today is hyperspecialized. To make a new contribution, you need to be hyperspecialized in some topic, and read hyperspecialized papers, and write hyperspecialized ones. It’s unavoidable — science is too big and complex to allow people to make sweeping general discoveries all the time.

As a result, any hyperspecialized paper in a field that isn’t your own isn’t going to be super interesting to you. Consider these headlines:[5]

[5] These are a few titles taken at random from the journal Nature, all published on 30 June 2021.

I could see myself maybe skimming the third one because I’ve been interested in covid vaccines to some superficial extent, but none of them strike me as fun reading. But if you work in superconductors, maybe the Wigner crystal one (whatever that is) sounds appealing to you.

One of the reasons I quit biology is that I eventually figured out that I wasn’t sufficiently interested in the field. Surely that also contributed to my lack of eagerness to read papers. But that isn’t the whole story. There were scientific questions I was genuinely curious about, and for which I should have been enthusiastic about reading the latest research. Yet that almost never happened.

Just like you’re sometimes attracted to a novel or movie because of its premise, only to be disappointed in the actual execution — there are papers that should be interesting due to their topic, but still fail due to their contents or style.

Tedious content

The primary goal of a scientific paper is to communicate science. Surprisingly, we tend to forget this, because, as I said, papers are also a measure of work output. But still, they’re supposed to contain useful information. A good science paper should answer a question and allow another scientist to understand and perhaps replicate the methods.

That means that, sometimes, there is stuff that must be there even though it’s not interesting. A paper might contain a lengthy description of an experimental setup or statistical methods which, no matter what you do, will probably never be particularly compelling.

Besides, it might be very technical and complicated. It’s possible to write complex material that is engaging, but that’s a harder bar to clear.

And then sometimes your results just aren’t that interesting. Maybe they disprove the cool hypothesis you wanted to prove. Maybe you merely found a weak statistical correlation. Maybe “more research is needed.” It’s important to publish results even if they’re negative or unimpressive, but of course that means your paper will have a hard time generating excitement.

So there’s not much we can do in general about content. All scientists try to do the most engaging and life-changing research they can, but only a few will succeed, and that’s okay. (And some scientists adopt a strategy of publishing wrong or misleading content in order to generate excitement, which, well, is a rather obvious bad idea.)

Awful style

Style is somehow both the least important and the most important part of writing.

It’s the least important because it rarely is the reason we read anything. Except for some entertainment,[6] we pick what to read based on the contents, whether we expect to learn new things or be emotionally moved. Good style makes it easier to get the stuff, but it’s just a vehicle for the content.

[6] And even then! There’s some intellectual pleasure to be gleaned from looking at the form of a poem, but it rarely is the top reason we like poetry and songs.

And yet style is incredibly important because without good style (or, as per the transportation analogy, without a functioning vehicle), a piece of writing will never get anywhere. You could have the most amazing topic with excellent content — if it’s badly written, if it’s a chore to read, then very few people will read it.

Scientific papers suck at style.

(Quick disclaimer: As we’re going to discuss below, this isn’t the fault of any individual scientist. It’s a question of culture and social norms.)

Anyone who’s ever read anything knows that long, dense paragraphs aren’t enjoyed by anyone. Yet scientific papers somehow consist of nothing but long and dense paragraphs.[7] Within the paragraphs, too many sentences are long and winding. The first person point of view is often eschewed in favor of some neutral-sounding (but not actually neutral, and very stiff) third person passive voice. The vocabulary tends to be full of jargon. The text is commonly sprinkled with an overabundance of AAAs,[8] even though they are rarely justified as a way to save space in this age where most papers are published digitally. Citations, which are of course a necessity, are inserted everywhere, impeding the flow of sentences.

[7] That’s not to say giant paragraphs are always bad; they serve a purpose, which is to make a coherent whole out of several ideas, and they can be written well. But often they aren’t written well, and sometimes they’re messy at the level of ideas. As a result, they often make reading harder, for no gain.
[8] Acronyms And Abbreviations, an acronym I just made up for illustrative purposes.

Here’s an example, selected at random from an old folder of PDFs from one of my master’s projects back in the day. Ironically, it discusses the fact that some methods in evolutionary biology are applied incorrectly because… it’s hard to extract the info from long, technical papers.[9]

[9] Here’s the original paper, which, by a stroke of luck for me, is open-access and shared with a Creative Commons license.

Don’t actually read it closely! This is just for illustration. Skim it and scroll down to the end to keep reading my essay.

Most models of trait evolution are based on the Brownian motion model (Cavalli-Sforza & Edwards 1967; Felsenstein 1973). The Ornstein–Uhlenbeck (OU) model can be thought of as a modification of the Brownian model with an additional parameter that measures the strength of return towards a theoretical optimum shared across a clade or subset of species (Hansen 1997; Butler & King 2004). OU models have become increasingly popular as they tend to fit the data better than Brownian motion models, and have attractive biological interpretations (Cooper et al. 2016b). For example, fit to an OU model has been seen as evidence of evolutionary constraints, stabilising selection, niche conservatism and selective regimes (Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013). However, the OU model has several well-known caveats (see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014). For example, it is frequently incorrectly favoured over simpler models when using likelihood ratio tests, particularly for small data sets that are commonly used in these analyses (the median number of taxa used for OU studies is 58; Cooper et al. 2016b). Additionally, very small amounts of error in data sets can result in an OU model being favoured over Brownian motion simply because OU can accommodate more variance towards the tips of the phylogeny, rather than due to any interesting biological process (Boettiger, Coop & Ralph 2012; Pennell et al. 2015). Finally, the literature describing the OU model is clear that a simple explanation of clade-wide stabilising selection is unlikely to account for data fitting an OU model (e.g. Hansen 1997; Hansen & Orzack 2005), but users of the model often state that this is the case. Unfortunately, these limitations are rarely taken into account in empirical studies.

This paragraph is not good writing by any stretch of the imagination.

First, it’s a giant paragraph.[10] It contains two related but distinct ideas, which are that (1) the Ornstein–Uhlenbeck model can be useful, and that (2) it has caveats. Why not split it? Speaking of which, the repetition of the “OU” acronym is jarring. It doesn’t even seem to serve a purpose other than to shorten the text a little bit. It’d be better to spell “Ornstein-Uhlenbeck” out each time, and try to avoid repeating it so much.

[10] Remarkably, it is the sole paragraph in a subsection titled “Ornstein-Uhlenbeck (Single Stationary Peak) Models of Traits Evolution,” which means that the paragraph’s property of saying “hey, these ideas go together” isn’t even used; the title would suffice.

The paragraph also contains inline citations to an absurd degree. Yes, I’m sure they’re all relevant, and you do need to show your sources, but this is incredibly distracting. Did you notice the following sentence when reading or skimming?

However, the OU model has several well-known caveats.

It’s a key sentence to understand the structure of the paragraph, indicating a transition from idea (1) to idea (2), but it is inelegantly sandwiched between two long enumerations of references:

(Wiens et al. 2010; Beaulieu et al. 2012; Christin et al. 2013; Mahler et al. 2013). However, the OU model has several well-known caveats (see Ives & Garland 2010; Boettiger, Coop & Ralph 2012; Hansen & Bartoszek 2012; Ho & Ané 2013, 2014).

Any normal human will just gloss over these lines and fail to grasp the structure of the paragraph. Not ideal.[11]

[11] The ideal format for citations in scientific writing is actually a matter of some debate, and depends to some extent on personal preference. As a friend said: “The numbered citation style (like in Science or Nature) is really nice because it doesn’t interrupt paragraphs, especially when there are a lot of citations. But many people also like to see which paper/work you are referencing without flipping to the end of the article to the references section.”

I admit I am biased towards prioritizing reading flow, but it’s true that having to match numbers to references at the end of a paper can be tedious. In print and PDFs, I’d be in favor of true footnotes (as opposed to endnotes), so that you don’t have to turn a page to read it. In digital formats, I’d go with collapsible footnotes (like the one you’re reading right now if you’re on my blog). Notes in the margin can also work, either in print or online. Alexey Guzey’s blog is a good example.

And if mentioning a reference is useful to understand the text, the writer should simply spell it out directly in the sentence.

Finally, there is quite a bit of specialized vocabulary that will make no sense to most readers, such as “niche conservatism” or “clade-wide stabilising selection.” That may be fine, depending on the intended audience; knowing what is or isn’t obvious to your audience is a difficult problem. I tend to err on the side of not including a term if a general lay audience wouldn’t understand it, but that’s debatable and dependent on the circumstances.

Now, I don’t mean to pick on this example or its authors in particular. In fact, it isn’t even a particularly egregious example. [Footnote 12: Interestingly, the more I examined the paragraph in depth, the less I thought it was bad writing. This is because, I think, becoming familiar with something makes us see it in a more favorable light. In fact this is why authors are often blind to the flaws in their own writing. But by definition a paper is written for people who aren’t familiar with it.] Many papers are worse! But as we saw, it’s far from being a breeze to read. Bad, boring style is so widespread that even “good” papers aren’t much fun.

Yet science can definitely be fun. Some Scott Alexander blog posts manage to make me read thousands of (rigorous!) words about psychiatric drugs, thanks to his use of microhumor. And then, of course, there’s an entire genre devoted to “translating” scientific papers into pleasant prose: popular science. Science popularizers follow different incentives than scientists: their goal is to attract clicks, so they have to write in a compelling way. They take tedious papers as input, and then produce fun stories as output.

There is no fundamental reason why scientists couldn’t write directly in the style of science popularizers. I’m not saying they should copy that exactly — there are problems with popular science too, like sensationalism and inaccuracies — but scientists could at least aim at making their scientific results accessible and enjoyable to interested and educated laypeople, or to undergraduate students in their discipline. I don’t think we absolutely need a layer of people who interpret the work of scientists for the rest of us, in a way akin to the Ted Chiang story about the future of human science.

Topic and content are hard to solve as a general problem. But I think we can improve style. We can create better norms. I have a crazy idea to do that, which we’ll get into at the end of the post, but first, we need to discuss the reasons behind the dismal state of current scientific style.

Why is scientific style so bad?

There are many reasons why science papers suck at style. One is that the people writing them, scientists, aren’t selected for their writing ability. They have a lot on their plate already, from designing experiments to performing them to applying for funding to teaching classes. Writing is an integral part of the process of science, but it’s only one part, unlike in fields such as journalism or literature.

Another problem is language proficiency. Almost all science (at least in the more technical fields) today is published in English, and since native English speakers are a small minority of the world’s population, it follows that most papers are written by people who have only partial mastery over the language. You can’t exactly expect stellar style from a French or Russian or Chinese scientist who is forced to publish their work in a language that isn’t their own.

Both these reasons are totally valid! There’s no point blaming scientists for not being good writers. It’d be great if all scientists suddenly became masters of English prose, but we all know that’s not going to happen.

The third and most important reason for bad style is social norms.

Imagine being a science grad student, and having to write your first Real Science Paper that will be submitted to a Legit Journal. You’ve written science stuff before, for classes, for your undergrad thesis maybe, but this is the real deal. You really want it to be published. So you try to understand what exactly makes a science paper publishable. Fortunately, you’ve read tons of papers, so you have absorbed a lot of the style. You set out to write it… and reproduce the same crappy style as all the science papers before you.

Or maybe you don’t, and you try to write in an original, lively manner… until your thesis supervisor reads your draft and tells you you must rewrite it all in the passive voice and adopt a more formal style and avoid the verb “to sparkle” because it is “non-scientific.” [Footnote 13: The “sparkle” example happened to a friend of mine recently.]

Or maybe you have permissive supervisors, so you submit your paper written in an unconventional style… and the journal’s editors reject it. Or they shrug and send it to peer review, from whence it comes back with lots of comments by Reviewer 2 telling you your work is interesting but the paper must be completely rewritten in the proper style.

Who decides what style is proper? No one, and everyone. Social norms self-perpetuate as people copy other people. For this reason, they are extremely difficult to change.

As a scientist friend, Erik Hoel, told me on Twitter:

There is definitely a training period where grad students are learning to write papers (basically a “literary” art like learning how to write short stories) wherein you are constantly being told that things need to be rephrased to be more scientific

And of course there is. Newbie scientists have to learn the norms and conventions of their field. Not doing so would be costly for their careers.

The problem isn’t that norms exist. The problem is that the current norms are bad. In developing its own culture, with its traditions and rituals and “ways we do things,” science managed to get stuck with this horrible style that everyone is somehow convinced is the only way you can write and publish science papers, forever.

It wasn’t always like this. If you go back and look at science papers from the 19th century, for instance, you’ll find a rather different style, and, dare I say, a more pleasant one.

I know this thanks to a workshop I went to in undergrad biology, almost a decade ago. Prof. Linda Cooper of McGill University (now retired, as I have found out when trying to contact her during the writing of this post) showed us a recent physics paper, and a paper written in 1859 by Carlo Matteucci about neurophysiology experiments in frogs, titled Note on some new experiments in electro-physiology. [Footnote 14: At least I think this is it; my memory of the workshop is very dim. Dr. David Green, local frog expert, helped me find this paper, and it fits all the details I can remember.] You might expect very old papers to be difficult to parse, but no! It’s crystal clear and in fact rather delightful. Here’s a screenshot of the introduction:

It isn’t quite clickbait, but there’s an elegant quality to it. First, it’s told in first person. Second, there’s very little jargon. Third, we quickly get to the point; there’s no lengthy introduction that only serves as proof that you know your stuff. Fourth, there are no citations. Okay, again, we do want citations, but at least we see here that avoiding them can help the writing flow better. (No citations also means that you can’t leave something unexplained by directing the reader to some reference they would prefer not to read. Cite to give credit, but not as a way to avoid writing a clear explanation.)

By contrast, the contemporary physics paper shown at the workshop was basically non-human-readable. I can’t remember what it was, which is probably a good thing for all parties involved.

In the past 150 years, science has undoubtedly progressed in a thousand ways; yet in the quality of the writing, we are hardly better than the scientists of old.

I want to be somewhat charitable, though, so let’s point out that some things are currently done well. For example, I think the basic IMRaD structure (introduction, methods, results, and discussion) is sound. [Footnote 15: Although one could argue that IMRaD is perhaps too often followed without thought, like a recipe.] The systematic use of abstracts, and the growing tendency to split them into multiple paragraphs, is an excellent development.

There’s been a little bit of progress — but we should be embarrassed that we haven’t improved more.

What happened? It’s hard to say. Some plausible hypotheses, all of which might be true:

  • In the absence of a clear incentive to maximize the number of readers, good style doesn’t develop. The dry and boring style that currently dominates is simply the default.
  • Everyone has their own idea of what good scientific writing should be, and we’ve naturally converged onto a safe middle ground that no one particularly loves, but that people don’t hate enough to change.
  • The current style is favored because it is seen as a mark of positive qualities in science such as objectivity, rigor, or detachment.
  • The style serves as an in-group signal for serious scientists to recognize other serious scientists. Put differently, it is a form of elitism. This might mean that for the people in the in-group, poor style is a feature, not a bug. [Footnote 16: Just like unpleasant bureaucracy acts as a filter so that only the most motivated people manage to pass through the system.]
  • Science is too globalized and anglicized. There is only one scientific culture, so if it gets stuck on poor norms, there isn’t an alternative culture that can come to the rescue by doing its own thing and stumbling upon better norms.

It’s possible that these forces are too powerful for anyone to successfully change the current norms. Maybe most scientists would think I’m a fool for wanting to improve them. But it does seem to me that we should at least try.

How can we forge better norms?

First, I want to emphasize that the primary goal of scientific writing is communication among researchers, not between researchers and the public. Facilitating this communication, and lowering the barriers to entry into hyperspecialized fields [Footnote 17: For students, and for scientists in adjacent fields], are the things I want to optimize for.

However, I do think there are benefits to making science more accessible to non-specialists — scientists in very different fields, academics outside science, journalists, teachers, politicians, etc. — without having to rely on the layer of popular science. So while we won’t optimize for this directly, it’s worth improving it along the way if we can.

With that in mind, how can we improve the social norms for style across all of scientific writing?

Here’s one recipe for failure. Come up with a new style guide, and share it with grad students and professors. Publish op-eds and give conferences on your new approach. Teach writing classes. In short, try to convince individual scientists. Then watch as they just write in the old style because it’s all they know and there’s no point in making it harder for themselves to publish their papers and get recognition.

Science is an insanely competitive field. Most scientists, especially grad students, postdocs and junior professors, are caught in a rat race. They will not want to reduce their chances of publication, even if they privately agree that scientific style should be improved.

(Not to mention, many have been reading and writing in that style for so long that they don’t even see it as problematic anymore.)

By definition, social norms are borderline impossible to change if you’re subject to them. That means that the impulse to change must come from someone who’s not subject to them. Either an extremely well established person, i.e. somebody famous enough to get away with norm-defying behavior, or an outsider — i.e. somebody who just doesn’t care.

Well, I don’t have a Nobel Prize, but I gave up on science years ago and I have zero attachment to current scientific norms, so I think I qualify as an outsider.

But what can an outsider do, if you can’t convince scientists to change? The answer is: do the work for them. Create something new, better, that scientists have an incentive to copy.

Here’s a sketch of how that could be done. Mind you, it’s very much at the stage of “crazy idea”; I don’t know if it would work. But I think there’s at least a plausible path.

The Plan

1. Found a new journal

Let’s call it the Journal of Actually Well-Written Science. I’ll make an exception to my anti-abbreviation stance and call it JAWWS because I just realized it’s a pretty cool and memorable one.

The journal would have precise writing guidelines. Those guidelines are the new norms we’ll try to get established. They would be dependent on personal taste to some extent, but I think it’s possible to come up with a set of guidelines that make sense.

Here’s some of what I have in mind:

  • If it’s a choice between clarity and brevity, prioritize clarity.
  • Split long paragraphs into shorter ones.
  • Use examples. Avoid expressing abstract ideas without supporting them with concrete examples.
  • Whenever possible, place the example before the abstract idea to draw the reader in.
  • Avoid abbreviations and acronyms unless they’re already well-known (e.g. DNA). If you must use or create one, make sure it’s effortless for the reader to remember what it means.
  • Allow as little space as possible for references while still citing appropriately. Of course, it’s fine to write a reference in full if you want to draw attention to it. Also, don’t use a citation as a way to avoid explaining something.
  • Write in the first person, even in the introduction and discussion. Your paper is being written by you, a human being, not by the incorporeal spirit of science.
  • Don’t hesitate to use microhumor; it is often the difference between competent and great writing. My mention of the incorporeal spirit of science is an example of that.
  • Avoid systematic use of the passive voice.
  • Avoid ornamental writing for its own sake. Occasionally, a good metaphor can clarify a thought, but be mindful that it’s easy to overuse them.
  • Remember that the primary goal of your paper is to communicate methods or results. Always keep the reader in mind. And make that imaginary reader an educated nonspecialist, i.e. you whenever you read papers not directly relevant to your field.

In the appendix, I show a multistep application of this to the paragraph I quoted above as an example.

Again, we’re not trying to reinvent popular science writing. We will borrow techniques and ideas from it, and try to emulate it insofar as it’s good at communicating its content. But the end goal is very different — JAWWS is intended not to entertain, but to publish full, rigorous methods and results that can be cited by researchers. I want it to be a new kind of scientific journal, but a scientific journal nonetheless.

2. Hire great writers

JAWWS will eventually accept direct submissions by researchers. But as a new journal, it will have approximately zero credibility at first. So we will start by republishing existing papers that have gone through a process of rewriting by highly competent science communicators.

Finding those communicators might be the hardest part. We need people who can understand scientific papers in their current dreadful state, but who haven’t already accepted the current style as inevitable. And we need them to be excellent at their job. If we rewrite a paper into something that’s no better than the original — or, worse, if we introduce mistakes — then the whole project falls apart.

On the other hand, tons of people want to be writers in general and science writers in particular, so there is some hope.

3. Pick papers to rewrite

It’s unclear how many science papers are published each year, but a reasonable estimation is quite a lot. I saw the 2,000,000 per year figure somewhere; I have no idea if it’s accurate, but even if it’s off by an order of magnitude or two, that’s still a lot.

How should JAWWS select the papers it rewrites?

I’m guessing that one criterion will be copyright status. I’m no intellectual property specialist, so I have no idea if it’s legal to rewrite an entire article that’s protected by copyright. Fortunately, there are many papers that are released with licenses allowing people to adapt them, so I suggest we start with those. Another avenue is to rewrite papers by scientists who like this project and grant us permission to use their work.

Then there are open questions. Should JAWWS focus on a particular field at first? Should it rewrite top papers? Neglected papers? Particularly difficult papers? Randomly selected papers? Should it focus more on literature reviews, experimental studies, meta-analyses, or methods papers? Should it accept applications by scientists who’d like our help? We can settle these questions in due time.

Crucially, the authors of a JAWWS rewritten paper will be the same as those of the paper it is based on. When people cite it, they’ll give credit to the original authors, not the rewriter, whose name should be mentioned separately. This also means that the original authors should approve the rewritten paper, since it’ll be published under their names. [Footnote 18: My friend Caroline Nguyen makes an important point: the process must involve very little extra work for scientists who are already burdened with many tasks. Their approval could therefore be optional, i.e. they can veto, but by default we assume that they approve.] It might also be possible to involve a writer earlier in the research process, so that they are in close contact with a team of scientists and are able to publish a JAWWS paper at the same time as the scientists publish a traditional one. In all cases, we can expect the first participating researchers to be the ones who agree with the aims of our project and trust that JAWWS is a good initiative.

4. Build prestige over time

If the rewritten papers are done well, then they’ll be pleasant to read. If they’re pleasant to read, more people will read them. If more people read them, then they’re likely to get cited more. If they get cited more, then they will have more impact. If JAWWS publishes a lot of high-impact papers, then JAWWS will become prestigious.

There’s no point in aiming low: we should try making JAWWS as prestigious as, if not more prestigious than, top journals like Nature, Science, or Cell. [Footnote 19: Is this a good goal? Wouldn’t it be better to just try to build something different? Well, I see this project kind of like Tesla for cars: Tesla isn’t trying to replace cars with something else, it’s just trying to make cars much better. So I would like JAWWS to be taken as seriously as the prestigious journals, while being an improvement over them. The danger in building a new thing is that you just create your little island of people who care about style while the rest of science is still busy competing for a paper in prestigious journals. That wouldn’t be a good outcome.]

Of course, that won’t happen overnight. But I don’t see why it wouldn’t be an achievable goal. And even if we don’t quite get there, the “aim for the moon, if you fail you’ll fall among the stars” principle comes into play. JAWWS can have a positive influence even if it doesn’t become a top journal.

Along the way, JAWWS will become able to accept direct submissions and publish original science papers. It might also split into several specialized journals. At this point we’ll be a major publishing business!

5. Profit!

I don’t know a lot about the business side of academic publishing, but my understanding is that there are two main models:

  • Paywall: researchers/institutions pay to access the contents of the journal.
  • Open-access: researchers/institutions pay to publish content that is then made accessible to everyone.

For JAWWS, a paywall model might make sense, since the potential audience would be larger than just scientists. But it would run contrary to the ideal of making science accessible to as many people as possible. Open-access seems more promising, and it feels appropriate to ask for a publication fee as compensation for the work needed to rewrite a paper. But that might be hard to set up in the beginning when we haven’t proven ourselves yet.

Maybe some sort of freemium model is conceivable, e.g. make papers accessible on a website but provide PDFs and other options to subscribers only.

Another route would be to set up JAWWS as a non-profit organization. An example of a journal that is also a non-profit is eLife. This might help with gaining respectability within some circles, but my general feeling is that profitability is better for the long-term survival of the project.

6. Improve science permanently

No, “profit” is not the last step in the plan. Making money is great, but we can and should think bigger. The end goal of this project is to improve science writing norms forever.

If JAWWS becomes a reasonably established journal, then other publications might copy its style. That would be very good and highly encouraged. But more importantly, it would show that it’s possible to change the norms for the better. Other journals will feel more free to experiment with different formats. Scientists will gain freedom in the way they share their work. Maybe we can even get rid of other problems like the ones associated with peer review while we’re at it.

One dark-side outcome I can imagine is that the norms are simply destroyed, we lose the coherence that science currently has, and then it becomes harder to find reliable information. To which I respond… that I’m not sure that it would be worse than the present situation. But anyway, it seems unlikely to happen. There will always be norms. There will always be prestigious people and publications that you can copy to make sure you write in the most prestigious style. We are a very mimetic bunch, after all.

And if we succeed… then science becomes fun again.

Fewer young researchers will drop out (like I did). Random curious people will read science directly instead of sensationalist popularizers. It’ll be easier for the public (who pays for most of science, after all) to keep informed about the latest research. Maybe it’ll even encourage more kids to get into the field. If everything goes well, we’ll get one step closer to a new golden age of humanity.

Okay, maybe I’m getting ahead of myself. But then again, like I said, there’s no point in aiming low.


To repeat, this is still a crazy idea. It did get less crazy after I finished writing the above plan, though. I have a feeling it might really work.

But it’s very possible I’m wrong. Maybe there are some major problems I haven’t foreseen. Maybe the entire scientific establishment will hate me for trying to change their norms. Maybe it’s just too ambitious a project, and it will fail if somebody doesn’t devote themselves to it. I don’t know if I should devote myself to it.

So, I’d really love for this post to be shared widely and for readers (whether professional scientists, writers, students, science communicators, or really anyone who’s interested in science somehow) to let me know what they think. Like science as a whole, this should be a collaborative effort.

 


Thanks to Khalis Afnan, Dan Stern, Caroline Nguyen, Mahwash Jamy, Daniel Golliher, and Ulkar Aghayeva for feedback on this piece.