Monthly Archives: July 2013

Totally Underwhelming

What do you get when you cross dozens of New Zealand’s best scientists with a myriad of Ministry officials?  The answer is the underwhelming reports from the 10 June workshops of the National Science Challenges. If ever there was a need for proof that science by committee does not work, here it is.  Each report consists of a series of PowerPoint slides with dot points. About 3 slides for each challenge pertain to confirming or changing the pre-workshop goals/themes (yes, there were pre-workshop meetings in May to put these together), and then about 4 or 5 slides on “Indicative research programs.”  This handful of slides was the output of, on average, 44 people per group consisting of scientists, industry or other “user group” representatives, and ministry representatives.  The people I know who attended a workshop were senior and very, very competent people.  The problem is not the people, but the process.  I saw nothing in the reports to inspire, and nothing that couldn’t have been cobbled together by one person after receiving emails with suggestions.  Most of the “indicative research programs” were simply restatements of the obvious questions in the field.  There was no meat. There was only one concrete proposal (High Value Nutrition proposed establishing a “Biomarker Centre”).

Recall that the 10 challenges will have $133.5M to spend over 4 years, about $3.34M per challenge per year.  The June meetings, I estimate, cost about $0.5M in salaries for the day, overheads on those salaries, travel, etc.  For that $0.5M we have been given about 60 PowerPoint slides, most of which could have been produced in half an hour or so by one or two of the scientists from each group – “Mr Speaker, would the Minister of MBIE please explain why one PowerPoint slide costs $8,300?”  Indeed, I have no doubt that if the $3.34M were used to support half a dozen scientists and they were told, “Here’s the field (name of the challenge), you have $3.34M for each of the next 4 years, do some good science for the country in this field,” then it would be done.  Furthermore, the outcome would be at least as valuable, if not more so, than the multitude of small projects that are likely to emerge from the June workshops (and only after much more time and money has been spent on more meetings, development of requests for proposals, and a grant funding process that will take up many more millions and waste the time of 90+% of applicants; much as happens now with other government funding models).  The Great Science Challenge for New Zealand is not how to define the projects, but how to provide long-term sustainable funds for scientists who already know the projects.

Prostate cancer and omega 3

The media is in a feeding frenzy with reports of a link between Omega 3 and Prostate Cancer.  Here’s a sample:

Link Between Omega-3 Fatty Acids and Increased Prostate Cancer Risk Confirmed (Science Daily)
Omega-3 supplements ‘could raise prostate cancer risk’ (Telegraph)
Omega-3 supplements linked to prostate cancer (Fox)
Omega 3 could increase cancer risk (TV3)

So, what’s the fuss?  The fuss is about a study published online yesterday in the Journal of the National Cancer Institute:

Brasky, T. M., Darke, A. K., Song, X., Tangen, C. M., Goodman, P. J., Thompson, I. M., et al. (2013). Plasma phospholipid fatty acids and prostate cancer risk in the SELECT trial. Journal of the National Cancer Institute, 1–10. doi:10.1093/jnci/djt174

The article is behind a paywall, so I’m not sure how many of the journalists have bothered to read it instead of relying on press releases.  I’ve access to the paper through my university, so here is a synopsis for the lay reader (bearing in mind I am not an expert in either omega 3 or cancer).

The thinking in the general public: Prostate cancer bad, Omega 3 good; therefore Omega 3 may prevent/delay prostate cancer.

The thinking of the scientists: Is there a link between phospholipids (including omega 3) and prostate cancer?

The subjects studied:  Participants were enrolled in a trial of Vitamin E supplementation versus placebo.  They were all male, from the US, Canada or Puerto Rico, aged 50+ if black (the medical literature uses this description) or 55+ if not, had no history of prostate cancer, and had a PSA (prostate-specific antigen) test of <4ng/ml at the start of the study.  They were enrolled between July 2001 and May 2004.  While 35,533 men were enrolled in the trial, only 2273 were studied here.  These consisted of 834 patients who had prostate cancer diagnosed prior to 1 January 2008 and 1364 “matched” subjects who had no prostate cancer diagnosed in that time.  This is called a case-control study.  The “matching” is a statistical process whereby they make sure the two groups being compared (those with and without cancer) have certain demographic features in common on average.  In this case the groups had similar age ranges and similar ethnicities.  The cancer group was further divided into those with low and those with high grade cancers.

The methods:  Blood samples were taken when participants were recruited, and the total fatty acid content, along with 4 types of Omega-3 fatty acids, 2 types of Omega-6 fatty acids, and 3 types of trans-fatty acids, was measured. The mean (average) proportion of each type of fatty acid (relative to total fatty acid) was compared between the No cancer and the Prostate cancer groups.

The results:  Those with cancer had, on average, a greater proportion of three of the four kinds of Omega-3 fatty acids than those without cancer.  The p values were 0.03, <0.001, and 0.006 (see here for an explanation of p values).  The p values for the two Omega-6s were higher (therefore more likely to have arisen by chance) at 0.17 each.  The trans-fat p values were 0.048, 0.08, and 0.002. At this point it is very important to remember that not all those with cancer had high proportions of Omega-3 – it was the average that was higher.  An analysis comparing the 25% of subjects with the lowest Omega-3 values (a combination of the three Omega-3s) with the 25% with the highest showed that the risk of prostate cancer was between 9 and 88% greater (with 95% confidence that this was not just by chance), ie a Hazard Ratio of 1.43 (95%CI 1.09 to 1.88).  Considering only those with the highest grade of cancer, the Hazard Ratio was 1.71 (95%CI 1.0 to 2.94).

The authors performed a multivariable analysis; that is, they checked to see whether other factors might be influencing the results.  For Omega-3 they say:

The continuous multi-variable-adjusted hazard ratios predicting total, …prostate cancer risk, [was] 1.16 (95% CI = 0.98 to 1.36),

This means that when other factors (not stated) are accounted for, Omega-3 proportions changed the risk of getting prostate cancer by between a 2% decrease (100*(1-0.98)) and a 36% increase (100*(1.36-1)).  This is what the 95% CI (Confidence Interval) suggests.  The 1.16 is merely somewhere near the middle of the change in risk (16% higher).  It is the confidence interval that matters.  When it crosses 1, as it does here, the result is not normally considered very important (ie not “statistically significant”, as is often said).
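As a side note, this percent-change arithmetic is easy to script. Here is a minimal Python sketch (mine, not from the paper) that turns a ratio and its confidence bounds into percent changes in risk:

```python
# Convert a hazard/risk ratio into a percent change in risk.
# The numbers are the multivariable-adjusted HR and 95% CI quoted above.
def pct_change(ratio):
    return 100 * (ratio - 1)

hr, ci_low, ci_high = 1.16, 0.98, 1.36

print(f"Point estimate: {pct_change(hr):+.0f}%")                              # +16%
print(f"95% CI: {pct_change(ci_low):+.0f}% to {pct_change(ci_high):+.0f}%")   # -2% to +36%
```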

The authors then conducted a meta-analysis of the Relative Risk of getting prostate cancer for two types of Omega-3 (DHA and EPA) and for total Omega-3 fatty acid.  A meta-analysis is where they gather up all the studies and combine the results.  In this case there were 7 studies (including the present one) which reported DHA and EPA, and 4 which reported totals.  The results were:

EPA:  RR = 1.07 (95%CI 0.95 to 1.21)
DHA:  RR = 1.16 (95%CI 1.03 to 1.31)
Total: RR = 1.14 (95%CI 0.99 to 1.32)

Remember, it is the 95% CI that is most important.  In this case, only for DHA does the lower bound of the 95% CI creep above 1.  Remember also that RR (Relative Risk) here compares the rates of cancer between those whose Omega-3 levels were among the lowest 20% and those whose levels were among the highest 20%.
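For readers curious about the mechanics, here is a sketch of the standard fixed-effect, inverse-variance way of pooling relative risks on the log scale. The study-level numbers are made up for illustration (the post does not list the individual studies’ results), and I am not claiming this is the exact method the authors used:

```python
# Fixed-effect, inverse-variance pooling of relative risks on the log scale.
# Study-level (RR, CI low, CI high) values below are hypothetical.
import math

studies = [(1.20, 0.90, 1.60), (1.05, 0.85, 1.30), (1.25, 1.00, 1.56)]

weights, log_rrs = [], []
for rr, lo, hi in studies:
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE recovered from the 95% CI
    weights.append(1 / se**2)                        # inverse-variance weight
    log_rrs.append(math.log(rr))

pooled_log = sum(w * x for w, x in zip(weights, log_rrs)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled RR = {math.exp(pooled_log):.2f} "
      f"(95%CI {math.exp(pooled_log - 1.96 * pooled_se):.2f} "
      f"to {math.exp(pooled_log + 1.96 * pooled_se):.2f})")
```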

The Conclusions:  The authors conclude

…these findings contradict the expectation that high consumption of long-chain ω-3 fatty acids and low consumption of ω-6 fatty acids would reduce the risk of prostate cancer.

This sounds reasonable under the assumption that consuming omega-3 (eg in supplements) actually increases the proportion of omega-3 in the blood.  They also state

It is unclear why high levels of long-chain ω-3 PUFA would increase prostate cancer risk,

What the media said:  TV3, borrowing from Sky, had a graphic with the word “Supplements” prominent, and they talked of a 71% increased risk of high grade prostate cancer and a 43% increased risk overall.  As we’ve seen, these numbers are not what is relevant – the confidence intervals are – and the intervals add a lot more uncertainty to the results (but not such good TV).  Also, they ignored the meta-analysis entirely (numbers not so big or interesting). They said nothing about the age range etc.  Finally, and most importantly, the study was not a study of supplements!  We have no idea why some participants had higher Omega-3 than others.  Some may have been because of supplements, some because of fish eating, some simply because of their own body composition and metabolism.

My conclusion:  The study did not show that supplementation with Omega-3 is risky.  Nor did it show that supplementation is beneficial. It simply was not a study of supplementation. It did show that elevated proportions of Omega-3 fatty acids are possibly associated with increased risk of prostate cancer in men 50+ (black) and 55+ (non-black). Remember, too, that this is talking about relative risk.  The overall prostate cancer risk during the study period was just 2.35%.  If I’ve done my math right, then those in the top 25% of Omega-3 have an absolute risk of 2.77% (95%CI 2.12% to 3.65%).
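For what it’s worth, here is one back-of-envelope way to get from the relative numbers to an absolute risk. The assumption that the overall 2.35% risk sits midway between the lowest- and highest-quartile risks is mine (the post does not say how the 2.77% was derived), though it happens to land on the same point estimate:

```python
# Back-of-envelope: convert the quartile-4 vs quartile-1 hazard ratio into an
# absolute risk. ASSUMPTION (mine, not the author's): the overall 2.35% risk
# is the midpoint of the lowest- and highest-quartile risks.
overall_risk = 2.35   # % of the 35,533 enrolled men diagnosed during the study
hr_q4_vs_q1 = 1.43    # highest vs lowest Omega-3 quartile

# Solve (r + hr*r) / 2 = overall_risk for the lowest-quartile risk r
r_q1 = 2 * overall_risk / (1 + hr_q4_vs_q1)
r_q4 = hr_q4_vs_q1 * r_q1

print(f"Lowest-quartile risk ~ {r_q1:.2f}%, highest-quartile risk ~ {r_q4:.2f}%")
# -> highest-quartile risk ~ 2.77%, matching the post's point estimate
```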


The legend of Chris Martin: Part II

Chris Martin was Not out a remarkable 50% of the time – 52 times out of 104 innings.  Is this a record?  I didn’t know the answer, so I sent off an email to the gurus at Cricinfo to see if they do.  Michael Jones replied that, yes, it is, for batsmen with over 100 innings (see here)!  Well done Chris! This raises the possibility of working out whether it was better for an incoming batsman to swing and hope to score a few runs before Chris was out, or whether they should just play normally. For this we must first consider what to do with the innings in which both Chris and the other batsman were Not out.  In such circumstances the choice is to include those innings on both sides or to exclude them. I’ve chosen to exclude, as I think this leaves the least room for bias.

Now let us apply my Rule #1 and visualise the data (see previous Chris Martin post).

Chris Martin’s Partnerships. Data source: Cricinfo

Plot A is a histogram in which I have grouped each of the two sets of data (the partnership scores when Chris was Out and the scores when he was Not out) into bins.  Each bin is 5 runs wide except for the first: the first bin is from 0 to 2.5 (really to 2), the second from 2.5 to 7.5, etc.   What can be seen from this is that there appear to be more very low partnerships when Chris was Out than when the other batsman was Out.  However, don’t be fooled by histograms like this.  Remember, there were not the same number of innings in which he was Out (52) as innings in which the other batsman was Out (49).  This may distort the graph.

Plot B is better, but harder to read.  Each black or red dot is a score.  The coloured boxes show the range called the “interquartile range”: 25% of the scores are below the box, and 25% are above.  The line in the middle of the box is the median – 50% of scores are below it and 50% above. The “whiskers” (lines above and below the box) show the full range of scores.

Plot C is less often used in the medical literature (at least), but is really very useful.  It plots cumulatively the percentage of scores at or below a particular score for each of the two sets of data.  For example, we can read off the graph that about 27% of the partnership scores when Chris Martin was out were zero.  If we look at the dashed line at 50% and where it intersects the blue line, we see that 50% of the scores when Chris Martin was out were 2 or below.  This is a bit more informative than plot B.
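For anyone wanting to try this at home, here is a sketch of how plots like A, B and C can be drawn in Python with matplotlib. The two score lists are placeholders; the real partnership scores would come from Cricinfo:

```python
# Sketch of plots A (histogram), B (box plot) and C (cumulative %).
# The score lists below are hypothetical placeholders.
import numpy as np
import matplotlib.pyplot as plt

chris_out = [0, 0, 2, 5, 1, 0, 12, 3, 0, 25]   # hypothetical scores
other_out = [8, 1, 15, 0, 22, 4, 9, 2, 35, 6]  # hypothetical scores

fig, (ax_a, ax_b, ax_c) = plt.subplots(1, 3, figsize=(12, 4))

# Plot A: histograms with a narrow first bin (0-2.5), then 5-run-wide bins
bins = [0, 2.5] + list(np.arange(7.5, 60, 5))
ax_a.hist([chris_out, other_out], bins=bins, label=["Chris out", "Other out"])
ax_a.set_title("A: Histogram"); ax_a.legend()

# Plot B: box-and-whisker plots (median, interquartile range, whiskers)
ax_b.boxplot([chris_out, other_out], labels=["Chris out", "Other out"])
ax_b.set_title("B: Box plot")

# Plot C: empirical cumulative distributions (% of scores at or below x)
for data, name in [(chris_out, "Chris out"), (other_out, "Other out")]:
    x = np.sort(data)
    y = 100 * np.arange(1, len(x) + 1) / len(x)
    ax_c.step(x, y, where="post", label=name)
ax_c.axhline(50, linestyle="--")  # the dashed 50% line mentioned above
ax_c.set_title("C: Cumulative %"); ax_c.legend()

plt.tight_layout()
plt.show()
```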

What all the plots show is that the distribution of scores in both data sets is highly skewed.  That is, there are many more scores at one end of plot A than the other, and the lines in plot C are not straight.  This is very important because it tells us which tests we cannot use and how we should not present the data.  Quite often, both when I referee papers and in papers I read, I see averages (means) presented for data like this.  This is wrong.  They are presented like:

Chris Out:  8.4±13.9

Chris Not Out: 10.8±11.8

The first number is the mean (ie add all the scores and divide by the number of innings).  The second number, after the “plus-minus” symbol, is called the standard deviation.  It is a measure of the spread of the numbers around the mean.  In this case the standard deviation is large compared to the mean.  Indeed, anything more than half the size of the mean is a bit of a giveaway that the distribution is highly skewed and that presenting the numbers this way is totally meaningless.  We should be able to look at the mean and standard deviation and conclude that about 95% of the scores lie between two standard deviations below the mean and two above.  However, two below (8.4 – 2*13.9) is a negative score!  Not possible.

What should be presented is the median with interquartile range (ie the range between the values below which 25% and 75% of the scores fall).

Chris Out:  2.0 (0-12.8)

Chris Not Out: 8 (1-16.5)
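Here is a short Python sketch showing both summaries side by side, along with the rule of thumb above; the scores are placeholders rather than the real Cricinfo data:

```python
# Mean +/- SD versus median (IQR) for a skewed set of scores.
# The scores list is a hypothetical placeholder.
import numpy as np

scores = np.array([0, 0, 2, 5, 1, 0, 12, 3, 0, 25, 7, 2])

mean, sd = scores.mean(), scores.std(ddof=1)
median = np.median(scores)
q25, q75 = np.percentile(scores, [25, 75])

print(f"Mean ± SD:    {mean:.1f} ± {sd:.1f}")      # misleading for skewed data
print(f"Median (IQR): {median:.1f} ({q25:.1f}-{q75:.1f})")

# Rule of thumb from the post: an SD more than half the mean hints at skew
if sd > mean / 2:
    print("SD > mean/2: distribution is probably skewed; prefer median (IQR)")
```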

We are now ready to apply a statistical test, found in most statistical packages, to see whether Chris being out or the other batsman being out was better for the partnership.  The test we apply is called the Mann-Whitney U test (or the Kruskal-Wallis test if we were comparing 3 or more data sets).  Some people say this compares the medians – it does not; it compares the whole of the two data sets.  If you don’t believe me, see http://udel.edu/~mcdonald/statkruskalwallis.html.
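In Python, the test is one line via SciPy; again, the score lists here are placeholders for the real Cricinfo data:

```python
# Mann-Whitney U test comparing the two sets of partnership scores.
from scipy.stats import mannwhitneyu

chris_out = [0, 0, 2, 5, 1, 0, 12, 3, 0, 25]    # hypothetical scores
chris_not_out = [8, 1, 15, 0, 22, 4, 9, 2, 35]  # hypothetical scores

stat, p = mannwhitneyu(chris_out, chris_not_out, alternative="two-sided")
print(f"U = {stat:.0f}, p = {p:.3f}")
```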

So, I apply the test and it gives me the number p=0.12.  What does this mean?  It means that if Chris Martin were to bat in another 104 innings, and another, and another, etc, then 12% of the time we would see a difference between the Out and Not Out partnerships at least as large as the one we actually see (see significantly p’d for more explanation of p).  12% for a statistician is quite large, and so we would conclude that there is no overall difference in partnerships whether Chris Martin was Out or Not Out.

Alas, Chris Martin’s playing days are over and we have the entire “population” of his scores with which to assess his batting prowess.  The kind of statistical test I’ve presented is only really useful when we are looking at a sample from a much greater population.  However, in the hope that Chris may make a return to Test cricket one day, what is presented here should give pause for thought to the next batsman who goes out to bat with him… perhaps there is not a lot to gain by swinging wildly, thereby increasing their chances of getting out; they are probably not improving the chances of the team.

Nelson Mandela is on dialysis

CNN is reporting that Nelson Mandela is on dialysis. http://t.co/HZTIlmGrtO.  This means he is suffering from Acute Kidney Injury, the disease I study.  Having to have dialysis is very serious. Unfortunately, survival rates are only about 50% by this stage, and less in the very elderly.  Dialysis is not a treatment, merely a support for the kidneys – a means to remove toxins from the body and to try to give the kidneys time to recover function on their own.


The legend of Chris Martin: Part I

His innings may be over, but the legend lives on.  Chris Martin retired this week from international cricket. He was a legend with the ball and he was a legend with the bat, for quite different reasons.  His Test batting average of 2.36 was the worst ever for an international cricketer who batted in more than 15 innings.  But his average does not tell the whole story.  Indeed, the legend of Chris Martin’s batting is a long tale which will require several blog posts to tell.  We need to answer some important questions: “What was his best average?”, “Was it better for his partners to slog, or should they have respected his abilities more?”  Along the way I hope that you will pick up some techniques which will help you interpret those pesky statistics, or present your own data.

Rule #1:  Always visualise your data

Chris Martin’s batting innings by innings. Data source: Cricinfo

The best place to begin any quest is with a graph.  Here is a graph showing all 104 of Chris’s innings in chronological order.  On it are represented the scores when he was Out (red lines) and the scores when he was Not Out (blue lines).  Funnily enough, he was Out and Not Out exactly 52 times each.  We can see immediately that the peak of his batting performance was a score of 12 Not Out, which occurred approximately half-way through his career.  His best form seems to be innings 30 to 34, where he went undefeated in 5 successive innings, scoring 17 runs.  On the other hand, he had several bad runs where he was Out for zero (red marks below the zero line).

One of the interesting things is that his first 4 innings may have given a false impression of his batting prowess.  In his first innings he scored 7, well above his eventual average of 2.36.  In his 2nd and 4th innings he was 0 Not Out.  In between he was 5 Not Out.  This coincided with his peak average ever, 12 (orange triangles).

This allows us to note an important feature of statistics.  Let us pretend for a moment that the average of 2.36 was “built in” to Chris Martin from the beginning.  This means that it was inevitable that after many innings he would end up with that average.  But it is not inevitable that any one innings taken at random equals that mean.  Importantly, with only a few samples (ie the first few innings) the average at that point can be a long way from the “real” average.  This is a phenomenon caused by sampling from a larger population.  It is why we have to be very cautious about conclusions drawn from a small sample.

For example, if General Practitioners throughout the country see on average 5 new leukemia cases a year, but we sample only three General Practitioners from Christchurch who saw 8, 9 and 14, then we would be quite wrong to conclude that Christchurch has a higher average leukemia rate than other regions.  We need a much larger sample from Christchurch to get a reasonable estimate of Christchurch’s average.  There are statistical techniques for deciding what proportion of General Practitioners should be sampled and what the uncertainty is in the average we arrive at.  Graphs also help… we can see with Chris that after only 10% of his innings he is within 1 of his average and stays that way throughout the rest of his career (orange triangles).
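To see the small-sample effect in action, here is a sketch that simulates 104 innings with a fixed long-run mean and plots the running average; the scores are randomly generated, not Chris’s real ones (and, for simplicity, it ignores the runs-per-dismissal subtlety of a true batting average):

```python
# Running average of simulated innings: with few samples the cumulative
# average wanders far from the long-run value, then settles down.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
scores = rng.poisson(2.36, size=104)  # 104 innings with a long-run mean of 2.36

running_avg = np.cumsum(scores) / np.arange(1, len(scores) + 1)

plt.plot(running_avg, label="Cumulative average")
plt.axhline(scores.mean(), linestyle="--", label="Final average")
plt.xlabel("Innings"); plt.ylabel("Average"); plt.legend()
plt.show()
```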

That’s it for today.  More on the legend of Chris Martin in the weeks ahead.

Open letter for a professional association regarding impact factors

Dr John Pickering
Department of Medicine
University of Otago Christchurch
Christchurch
New Zealand

Dr Peter Kerr
Editor
Nephrology
Journal of the Asia Pacific Society of Nephrology

3 July 2013

Re:  Open letter regarding the Nephrology journal’s use of impact factors

Dear Dr Kerr

As an occasional referee for Nephrology and a member of the ANZSN, which is affiliated with the APSN, I write concerning Nephrology’s use of the Thomson Reuters Impact Factor and the journal rankings based on it.  Specifically, I urge that Nephrology remove the Impact Factor and ranking from the journal web site.  This is because the continued use of Impact Factors reflects poorly on the integrity of the journal and the APSN.  My reasons are:

(i) Regularly when refereeing I have to ask authors to present medians and interquartile ranges rather than means and standard deviations when the distribution of the variable they are measuring is not normally distributed.  The Impact Factor is the mean of a very highly skewed distribution and, as such, is a nonsense statistical metric.

(ii) Rankings on the basis of the mean of a skewed distribution are similarly a nonsense metric.

(iii) Impact Factors are open to manipulation.  See:

Nature 2013: http://blogs.nature.com/news/2013/06/new-record-66-journals-banned-for-boosting-impact-factor-with-self-citations.html

Science Feb 2012: http://www.sciencemag.org/content/335/6068/542.summary

(iv) Professional associations have begun to recognise the inherent flaws in how research is assessed.  In particular, a world-wide movement initiated by the American Society for Cell Biology, namely the San Francisco Declaration on Research Assessment (DORA), has identified some much-needed standards to maintain integrity for scientists and associated professional associations.  See http://am.ascb.org/dora/ and http://www.sciencemag.org/content/340/6134/787.long

Regards,

John

cc. Dr Yasuhiko Tomino, President Asia Pacific Society of Nephrology (APSN)
Dr Rowan Walker, President Australia New Zealand Society of Nephrology (ANZSN)
Any interested party may read this letter through https://100dialysis.wordpress.com or http://sciblogs.co.nz/kidney-punch/