Credit where credit is due

There has been a lot of tension in the psychological community recently. Replications are becoming more prevalent and many of them are finding much smaller effects, or none at all. This raises a lot of uncomfortable questions: is the studied effect real? How was it achieved in the first place? Were less-than-honest methods used (p-hacking etc.)? The original researchers can sometimes feel that these questions go beyond valid criticism to full-blown attacks on their integrity and/or their abilities as scientists. This has led to heated exchanges and some choice pejoratives being thrown about by both “sides”1.

This blog post isn’t here to pass judgement on those who are defending the original studies or the replications (and all the associated behaviour and comments). Instead, it is here to celebrate the behaviour of a researcher whose example I think many of us should follow.

Dana Carney was one of the original researchers who investigated a phenomenon called “power posing” (Carney, Cuddy, & Yap, 2010). They supposedly found that “high-power nonverbal displays” affected hormone levels and feelings of confidence. But after a large-scale failed replication (Ranehill et al., 2015) and a re-analysis of the evidence (Simmons & Simonsohn, 2015), it appears there is no effect.

So, as one of the main researchers behind this effect, what should you do when faced with this evidence? All the incentive structures currently in place2 would encourage you to hand-wave away these issues: the replication wasn’t conducted properly, there are hidden moderators that were not present in the replication, the replicators were looking to find no effect, etc. But Carney has written an article stating that she does not “believe that ‘power pose’ effects are real”. She goes into further detail about the problems with the original study, admitting to using “researcher degrees of freedom” to fish for a significant result, and to analysing “subjects in chunks” and stopping when a significant result was found.

I find this honesty commendable and wish all researchers whose work is shown to be false would be able to admit past mistakes and wrongdoing. Psychology cannot improve as a science unless we update our beliefs in the face of new evidence. As someone who is quite early in their scientific career, I’ve never had the experience of someone failing to replicate a finding of mine, but I imagine it is quite hard to take (for more detail I recommend this post by Daniel Lakens). Admitting that something you have discovered isn’t real, whilst difficult, helps us have a clearer picture of reality. Hopefully this acknowledgement will encourage others to be more honest about their work.

But there’s a reason why few have taken this step. The potential negative consequences can be quite daunting: loss of credibility due to admissions of p-hacking, and the undermining of key publications (which may have an impact on job and tenure applications), to name a few. I understand why it is so rare, and am (slightly) sympathetic. This is why I like Tal Yarkoni’s suggestion of an “amnesty paper”, where authors could freely admit that they have lost confidence in a finding of theirs and explain why. They could do so without fear of repercussions and, because many others would be doing the same, it would be less daunting. Until journals are willing to publish these kinds of articles, I would suggest creating a website/repository dedicated to them, so that there is a publicly available record of each expression of doubt about a researcher’s finding. I also think it is important to celebrate those who do decide to publicly disown one of their findings, as it should encourage scientists to see this admission as a sign of strength, not weakness. This will help change the culture around how we interpret failed replications and past findings, which will hopefully encourage scientists to write these articles expressing doubts about their past work, and journals to publish them. I believe psychology will only improve if this behaviour becomes the norm.

Notes: I contacted Dana Carney prior to publication and she had no corrections to make.

References:

Carney, D.R.; Cuddy, A.J.C.; & Yap, A.J. (2010). Power Posing: Brief Nonverbal Displays Affect Neuroendocrine Levels and Risk Tolerance. Psychological Science, 21(10), 1363–1368.

Lakens, D. (2016). Why scientific criticism sometimes needs to hurt [online] Available at: http://daniellakens.blogspot.co.uk/2016/09/why-scientific-criticism-sometimes.html

Ranehill, E.; Dreber, A.; Johannesson, M.; Leiberg, S.; Sul, S.; & Weber, R.A. (2015). Assessing the Robustness of Power Posing: No Effect on Hormones and Risk Tolerance in a Large Sample of Men and Women. Psychological Science, 26(4), 653–656.

Simmons, J. & Simonsohn, U. (2015). [37] Power Posing: Reassessing The Evidence Behind The Most Popular TED Talk [online] Available at: http://datacolada.org/37

Notes on Paul Meehl’s “Philosophical Psychology Session” #05

These are the notes I made whilst watching the video recordings of Paul Meehl’s philosophy of science lectures. This is the fifth episode (a list of all the videos can be found here). Please note that these posts are not designed to replace or be used instead of the actual videos (I highly recommend you watch them); they are meant to be read alongside them, to help you understand what was said. I also do not include everything that was said (just the main/most complex points).

  • Operationism states that all admissible concepts in a scientific theory must be operationally defined in observable predicates, BUT that’s incorrect: you don’t need all theoretical postulates to map onto observable predicates.
  • You don’t need the constants to be able to use the functions and see if the components are correct. Given the function forms you can know the parameters (the ideal case is to derive the parameters). Weaker version: I can’t say what a, b, and c are, but I know they are transferable, or that a tends to be twice as big as b. If the theory permits that, it’s a risky prediction (it could be shown to be wrong). Theories are lexically organised (from higher to lower parts): you don’t ask questions about the lower points before answering the higher-up ones, in a way that makes theories comparable. If two theories have the same entities arranged in the same structure with the same connections, with the same functions describing the connections between them, and the parameters are the same, then t1 and t2 are empirically the same theory. If we can compare two theories, we can compare our theory (t1) to omniscient Jones’ theory (tOJ) and see the verisimilitude of our theory (how much it corresponds with tOJ).
  • People can become wedded to theories or methods. This results in demonising the “enemy” & an unwillingness to give up that theory/method.

 

  • Lakatosian defence (general model of defending a theory): (t ∧ At ∧ Cp ∧ Ai ∧ Cn) ⊢ (o1 → o2), where ⊢ is the strict turnstile of deducibility and → is “if, then”

AND, absent the theory, P(o2 | o1, bk) is small, where bk is background knowledge

– this extension allows you to say you have corroborated the theory by the facts (without this small prior the inference is formally invalid). When P is very small, it meets Salmon’s criterion for a damn strange coincidence

∧ = conjunction

t = theory we are interested in

At = theoretical auxiliaries we’ve tied to our initial theory (almost always more than one)

Cp = ceteris paribus clause (all other things being equal). No systematic other factors (they have been randomised/controlled for), but there will be individual differences.

Ai = instrumental auxiliaries: theories about the controlling or measuring instruments. You distinguish between At and Ai by which field the auxiliary belongs to (if it’s in the same science then it’s an At).

Cn = conditions: the experimenter’s description of what they did, i.e. the particulars of the method (often incompletely described).

In other words: if the theory is true, the auxiliaries are true, the ceteris paribus clause holds, the instruments are accurate, and you did what you said you did, then it follows deductively that if you observe o1 you will observe o2.
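For reference, here is the schema in compact form; this is my own minimal LaTeX sketch using the symbols from the key above, not Meehl’s rendering:

```latex
% Lakatosian defence schema: the conjunction on the left deductively entails
% the observational conditional on the right.
\[
  (t \wedge A_t \wedge C_p \wedge A_i \wedge C_n) \;\vdash\; (o_1 \rightarrow o_2)
\]
% Corroboration additionally requires that, absent the theory, the conditional is
% improbable on background knowledge alone (Salmon's "damn strange coincidence"):
\[
  P(o_2 \mid o_1, b_k) \ll 1
\]
```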

  • This only works left to right; you can never deduce the scientific theory from the facts (see the sketch after this list).
  • Sometimes you can’t assume the main theory to test the auxiliary theories; you are testing both of them. So if it’s corroborated, then you’ve corroborated both.
  • Can be validating a theory and validating a test at the same time. Only works if the conjunction of the two leads to a damn strange coincidence.
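The point that the deduction only runs left to right is just the invalidity of affirming the consequent; a small sketch of the two inference patterns (again my rendering, using t for the theory and o for an observation):

```latex
% Affirming the consequent (INVALID): a confirmed prediction does not prove the theory.
\[
  \frac{t \rightarrow o \qquad o}{\therefore\ t} \quad \text{(invalid)}
\]
% Modus tollens (VALID): a failed prediction does refute the premise.
\[
  \frac{t \rightarrow o \qquad \neg o}{\therefore\ \neg t} \quad \text{(valid)}
\]
```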

 

  • Strong use of predictions = to refute the theory.
  • Suppose we observe (o1 ∧ ~o2). Modus tollens: P → Q, ~Q, therefore ~P.
  • Lakatosian criticism: modus tollens only tells us that the whole of the left side is false, not which specific part is.
  • To deny (p ∧ q ∧ r ∧ s) is equivalent to saying: p is false, or q is false, or r is false, or s is false.
  • The formal equivalent of a ~ over a conjunction is a disjunction of the denied statements on the left.
  • Short form: the denial of a conjunction is a disjunction of the denials of the conjuncts.
  • So when we falsify the right side in the lab, we falsify the left side, but because it’s a conjunction this only tells us that something on the left is wrong. We are testing t, so we want to know whether t specifically is false (see the sketch after this list).
  • Randomness is essential for Fisherian statistics.
  • In soft psychology, probability that Cp is literally true is incredibly small.
  • If you start distributing confidence levels to the different conjuncts you can work towards “robustness”: you can see how, and by how much, Cp is false.
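As flagged in the list above, here is a small sketch (my notation, same symbols as before) of how a falsified prediction spreads across the conjunction via De Morgan’s law:

```latex
% Observing (o1 and not-o2) falsifies the conditional, so by modus tollens the whole
% left-hand conjunction must be false:
\[
  (o_1 \wedge \neg o_2) \;\Rightarrow\; \neg(t \wedge A_t \wedge C_p \wedge A_i \wedge C_n)
\]
% De Morgan: the denial of a conjunction is the disjunction of the denials of the
% conjuncts, so the experiment alone cannot say which conjunct is the false one:
\[
  \neg(t \wedge A_t \wedge C_p \wedge A_i \wedge C_n)
  \;\equiv\;
  (\neg t \vee \neg A_t \vee \neg C_p \vee \neg A_i \vee \neg C_n)
\]
```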

 

  • Often you can’t tell (from an experiment) whether a finding is due to what is reported or to a confounding variable. You have to consider all potential confounding variables and escape from the logically invalid third figure of the syllogism (affirming the consequent, sketched earlier) by exploring all of them.
  • Different methods result in different Cp’s & At’s, something not often considered.
  • Lakatosian defence of theory is only worthwhile if it has something going for it; it has been falsified in a literal sense but has enough verisimilitude that it’s worth sticking with.
  • When examining part of the conjunct, look at Cn first. Can say: “Let’s wait to see if it replicates”.
  • Ai isn’t a great place to start for psychologists.
  • Cp is good (you can almost assume it’s false). When we have different types of experiments over different qualitative domains of data, challenging Cp in one experiment doesn’t threaten the theory’s success in other domains.
  • If you challenge At, and that auxiliary plays a role in the derivation chains to experiments in other domains, then trying to fix up failed experiments by challenging auxiliaries will wreck all the derivation chains that worked in the past (because you’re undermining one of the links). Cp is more likely to be domain-specific (violated in different ways in different settings).
  • Can modify Cp by adding a postulate (as you don’t want to fiddle with At) because you may have changed subjects or environment etc.
  • Progressive movement: Can turn falsifier into a corroborator by adding auxiliaries that allow you to predict new (previously un-thought of) experiments. Not just post-hoc rationalisation of a falsifier (ad-hoc 1). Ad-hoc 2: when you post-hoc rationalise a falsifier by adding auxiliaries & make new predictions but those predictions are then falsified.
  • Honest ad-hocery: ad-hoc rationalisations that give new predictions (which are found to be correct) that are risky and are a damn strange coincidence.

References:

Yonce, J. L., 2016. Philosophical Psychology Seminar (1989) Videos & Audio, [online] (Last updated 05/25/2016) Available at: http://meehl.umn.edu/video [Accessed on: 06/06/2016]