Category Archives: Keeping it simple

A vision of kiwi kidneys

Sick of writing boring text reports? Take a leaf out of Christchurch nephrologist Dr Suetonia Palmer’s (@SuetoniaPalmer) book and make a visual abstract report. Here are two she has created recently based on data collected about organ donation and end stage renal failure by ANZDATA (@ANZDATARegistry). Enjoy.

Suetonia C-18RfJXUAApRcU

Suetonia C-16lBZXsAERoeM

ps. The featured image is of the Kidney Brothers.  Check out the great educational resources at The OrganWiseGuys.

Cheesecake files: A little something for World Kidney Day

Today is World Kidney Day, so I shall let you in on a little secret. There is a new tool for predicting if a transplant is going to be problematic to get working properly.

Nephrologists call a transplant a “graft” and when the new kidney is not really filtering as well as hoped after a week they call it “Delayed Graft Function.” Rather than waiting a week, the nephrologist would like to know in the first few hours after the transplant if the new kidney is going to be one of these “problematic” transplants or not. A lot of money has been spent on developing some fancy new urinary biomarkers and they may well have their place. At this stage none are terribly good at predicting delayed graft function.

A while ago I helped develop a new tool – simply the ratio of a measurement of the rate at which a particular substance is being peed out of the body to an estimate of how much the body is producing in the first place. If the ratio is 1 then the kidney is in a steady state. If not, then either the kidneys are not performing well (ie not keeping up with the production), or they have improved enough after a problem and are getting rid of the “excess” of the substance from the body. This ratio is simple and easy to calculate and doesn’t require extra expense or specialist equipment.
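For the programmers among you, here is a minimal sketch of that ratio in Python. The function names, and the tolerance used to decide what counts as “close enough to 1”, are made up for illustration; they are not taken from the published tool, and the two rates just need to be in the same units.

```python
def excretion_to_production_ratio(excretion_rate, production_rate):
    """Measured excretion rate divided by estimated production rate (same units)."""
    if production_rate <= 0:
        raise ValueError("production_rate must be positive")
    return excretion_rate / production_rate


def interpret_ratio(ratio, tolerance=0.1):
    """Rough reading of the ratio; the tolerance is a hypothetical choice."""
    if abs(ratio - 1.0) <= tolerance:
        return "steady state"
    if ratio < 1.0:
        return "kidney not keeping up with production"
    return "kidney clearing an accumulated excess"


if __name__ == "__main__":
    # Illustrative numbers only, not patient data.
    for excreted, produced in [(1.0, 1.0), (0.5, 1.0), (1.6, 1.0)]:
        r = excretion_to_production_ratio(excreted, produced)
        print(r, interpret_ratio(r))
```

In practice the cut-off for a “problematic” transplant would have to come from data, not from a tolerance picked by hand as it is here.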

A few months ago, I persuaded a colleague in Australia to check if this ratio could be used soon after transplant to predict delayed graft function. As it turns out, in the small study we ran it can, and it adds value to a risk prediction model based on the normal stuff nephrologists measure! I’m quite chuffed about this. Sometimes, the simple works. Maybe something will become of it and ultimately some transplants will work better and others will not fail. Anyway, it’s nice to bring a measure of hope on World Kidney Day.

This was published a couple of weeks ago in the journal Nephron.


Christchurch has breast cancer research hub

Guest post by: Kim Thomas, Communications Manager at the University of Otago, Christchurch

Research Radar UOC

A team of specialist cancer researchers have joined forces to focus on the impact of obesity on breast cancer.

The researchers all work at the University of Otago, Christchurch’s Mackenzie Cancer Research Group. The Group is headed by Canterbury District Health Board oncologist Professor Bridget Robinson, a breast cancer expert.

Researchers Associate Professor Gabi Dachs, Dr Margaret Currie and Dr Logan Walker have previously investigated various aspects of cancer but decided to team up and focus on the significant health issue of obesity.

Associate Professor Dachs says that international studies have shown breast cancer patients who were obese before or after diagnosis are less likely to survive than patients with normal BMI. Risk of dying from breast cancer increases by a third for every increment of 5 kg/m² in BMI.


From left to right: A/Prof Gabi Dachs, Dr Margaret Currie, Dr Logan Walker

The three researchers are investigating different aspects of obesity and breast cancer:

  • Associate Professor Dachs is looking at molecular factors associated with obesity in cancer, particularly how fat cells communicate with cancer cells and negatively affect them.
  • Dr Margaret Currie is putting fat and breast cancer cells together to see how the fat cells make tumours more resistant to treatment. She suspects the fat cells provide ‘an extra energy hit’ to cancer cells by providing lipids, or fats, in addition to glucose.
  • Geneticist Dr Logan Walker will investigate whether the obesity-related gene responsible for the amylase enzyme in saliva (AMY1) contributes to breast cancer development. He will also explore the role of key genes that behave differently in breast tumours from obese women.

The researchers’ work is funded by the NZ Breast Cancer Foundation, the Cancer Society of New Zealand, the Canterbury and West Coast Division of the Cancer Society NZ, the Mackenzie Charitable Foundation and the University of Otago.


My 10 Commandments of a Data Culture

Thou shalt have no data but ethical data.

Thou shalt protect the identity of thy subjects with all thy heart, soul, mind and body.

Thou shalt back-up.

Thou shalt honour thy data and tell its story, not thy own.

Thou shalt always visualise thy data before testing.

Thou shalt share thy results even if negative.

Thou shalt not torture thy data (but thou may interrogate it).

Thou shalt not bow down to P<0.05 nor claim significance unless it is clinically so.

Thou shalt not present skewed data as mean±SD.

Thou shalt not covet thy neighbour’s P value.

Significantly p’d

I may be a pee scientist, but today is brought to you by the letter “P” not the product.  “P” is something all journalists, all lay readers of science articles, teachers, medical practitioners, and all scientists should know about.  Alas, in my experience many don’t and as a consequence “P” is abused. Hence this post.  Even more abused is the word “significant” often associated with P; more about that later.

P is short for probability. Stop! – don’t stop reading just because statistics was a bit boring at school; understanding may be the difference between saving lives and losing them. If nothing so dramatic, it may save you from making a fool of yourself.

P is a probability. It is normally reported as a fraction (eg 0.03) rather than a percentage (3%). You will be familiar with it when tossing a coin. You know there is a 50% or one half or 0.5 chance of obtaining a heads with any one toss. If you work out all the possible combinations of two tosses then you will see that there are four possibilities, one of which is two heads in a row. So the prior (to tossing) probability of two heads in a row is 1 out of 4 or P=0.25. You will see P in press releases from research institutes, blog posts, abstracts, and research articles. This is from today:

“..there was significant improvement in sexual desire among those on  testosterone (P=0.05)” [link]
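Going back to the coin for a moment: if you would rather let a computer list the combinations, this short Python sketch counts them by brute force. The function name is my own invention, and it assumes a fair coin.

```python
from fractions import Fraction
from itertools import product


def probability_all_heads(n_tosses):
    """Enumerate every sequence of tosses of a fair coin and count the all-heads ones."""
    outcomes = list(product("HT", repeat=n_tosses))
    all_heads = [o for o in outcomes if all(side == "H" for side in o)]
    return Fraction(len(all_heads), len(outcomes))


if __name__ == "__main__":
    print(probability_all_heads(2))         # 1/4
    print(float(probability_all_heads(2)))  # 0.25
```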

So, P is easy, but interpreting P depends on the context.  This is hugely important.  What I am going to concentrate on is the typical medical study that is reported.  There is also a lesson for a classroom.

One kind of study reporting a P value is a trial where one group of patients is compared with another. Usually one group of patients has received an intervention (eg a new drug) and the other receives regular treatment or a placebo (eg a sugar pill). If the study is done properly a primary outcome should have been decided beforehand. The primary outcome must measure something – perhaps the number of deaths in a one year period, or the mean change in concentration of a particular protein in the blood. The primary outcome is how what is measured differs between the group getting the new intervention and the group not getting it. Associated with it is a P value, eg:

“CoQ10 treated patients had significantly lower cardiovascular mortality (p=0.02)” [link]

To interpret the P we must first understand what the study was about and, in particular, understand the “null hypothesis.” The null hypothesis is simply the idea the study was trying to test (the hypothesis) expressed in a particular way. In this case, the idea is that CoQ10 may reduce the risk of cardiovascular mortality. Expressed as a null hypothesis we don’t assume that it could only decrease rates, but we allow for the possibility that it may increase them as well (this does happen with some trials!). So, we express the hypothesis in a neutral fashion. Here that would be something like: the risk of cardiovascular death is the same in the population of patients who take CoQ10 and in the population which does not take CoQ10. If we think about it for a minute, then if the proportion of patients who died of a cardiovascular event was exactly the same in the two groups then the risk ratio (the CoQ10 group proportion divided by the non CoQ10 group proportion) would be exactly 1. The P value, then, answers the question:

If the risk of cardiovascular death was the same in both groups (ie the null hypothesis was true), what is the probability (ie P) that the measured risk ratio would differ from 1 by as much as was observed simply by chance?

The “by chance” is because when the patients were selected for the trial there is a chance that they don’t fairly represent the true population of every patient in the world (with whatever condition is being studied) either in their basic characteristics or their reaction to the treatment. Because not every patient in the population can be studied, a sample must be taken.  We hope that it is “random” and representative, but it is not always.  For teachers, you may like to do the lesson at the bottom of the page to explain this to children.  Back to our example, some numbers may help.

Suppose we have 1000 patients receiving Drug X and 2000 receiving a placebo. If, say, 100 patients in the Drug X group die in 1 year, then we say the risk of dying in 1 year is 100/1000 or 0.1 (or 10%). If in the placebo group 500 patients die in 1 year, then the risk is 500/2000 or 0.25 (25%). The risk ratio is 0.1/0.25 = 0.4. The difference between this and 1 is 0.6. What is the probability that we arrived at 0.6 simply by chance? I did the calculation and got P<0.0001. This means that, if the null hypothesis were true, there is less than a 1 in 10,000 chance of seeing a difference this large. Another way of thinking of this is that if we did the study 10,000 times, and the null hypothesis were true, we’d expect to see the result we saw at most about once. What is crucial to realise is that the P value depends on the number of subjects in each group. If instead of 1000 and 2000 we had 10 and 20, and instead of 100 and 500 deaths we had 1 and 5, then the risks and risk ratio would be the same, but the P value is 0.63, which is very high (a 63% chance of observing a difference at least as large as the one we observed when there is really no difference). Another way of thinking about P is as the probability that we will state there is a difference of at least the size we see when there is really no difference at all. If studies are reported without P values then at best take them with a grain of salt. Better, ignore them totally.
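If you want to try the numbers yourself, here is a sketch in Python. I have not said above which test I used, so treat the choice here as an assumption: the code uses a two-sided Fisher’s exact test written out by hand (the function names are made up for this example). With that assumption the small study comes out at about 0.63 and the large one far below 0.0001.

```python
from math import comb


def risk_ratio(deaths_a, n_a, deaths_b, n_b):
    """Risk in group A divided by risk in group B."""
    return (deaths_a / n_a) / (deaths_b / n_b)


def fisher_exact_two_sided(deaths_a, n_a, deaths_b, n_b):
    """Two-sided Fisher's exact P value for a 2x2 table (assumed test).

    Under the null hypothesis the deaths are shared between the groups at
    random, so the number landing in group A is hypergeometric. We add up the
    probabilities of every split that is no more likely than the observed one.
    """
    total = n_a + n_b
    total_deaths = deaths_a + deaths_b

    def weight(k):
        # Number of ways to have k of the deaths fall in group A (exact integer).
        return comb(total_deaths, k) * comb(total - total_deaths, n_a - k)

    lowest = max(0, n_a - (total - total_deaths))
    highest = min(total_deaths, n_a)
    observed = weight(deaths_a)
    as_extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return as_extreme / comb(total, n_a)


if __name__ == "__main__":
    print(risk_ratio(100, 1000, 500, 2000))              # large study
    print(fisher_exact_two_sided(100, 1000, 500, 2000))  # very small
    print(risk_ratio(1, 10, 5, 20))                      # same ratio, small study
    print(round(fisher_exact_two_sided(1, 10, 5, 20), 2))
```

The same risk ratio gives wildly different P values purely because of the number of patients, which is the point of the paragraph above.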

It is also important to realise that within any one study that if they measure lots of things and compare them between two groups then simply because of random sampling (by chance) some of the P values will be low.  This leads me to my next point…
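To put a rough number on this, here is a small Python sketch. It assumes the comparisons are independent of each other and that there is truly no difference for any of them, which is a simplification; the function name is hypothetical.

```python
def chance_of_any_low_p(n_comparisons, threshold=0.05):
    """Chance that at least one of n independent null comparisons gives P below the threshold."""
    return 1.0 - (1.0 - threshold) ** n_comparisons


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, round(chance_of_any_low_p(n), 2))
    # 1 0.05
    # 5 0.23
    # 20 0.64
```

So with twenty things measured, under these assumptions it is more likely than not that at least one P value dips below 0.05 by chance alone.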

The myth of significance

You will often see the word “significant” used with respect to studies, for example:

“Researchers found there was a significant increase in brain activity while talking on a hands-free device compared with the control condition.” [Link]

This is a wrong interpretation:  “The increase in brain activity while talking on a hands-free device is important.” or  “The increase in brain activity while talking on a hands-free device is meaningful.”

“Significant” does not equal “Meaningful” in this context. All it means is that the P value from testing the null hypothesis is less than 0.05. If I had my way I’d ban the word significant. It is simply a lazy habit of researchers to use this shorthand for P<0.05. It has come about simply because someone somewhere started to do it (and call it “significance testing”) and the sheep have followed. As I say to my students, “Simply state the P value, that has meaning.”*



For the teachers

Materials needed:

  • Coins
  • Paper
  • The ability to count and divide

Ask the children what the chances of getting a “Heads” are.  Have a discussion and try and get them to think that there are two possible outcomes each equally probable.

Get each child to toss their coin 4 times and get them to write down whether they got a head or tail each time.

Collate the number of heads in a table like this:

#heads             #children getting this number of heads

0                      ?

1                      ?

2                      ?

3                      ?

4                      ?

If your classroom size is 24 or larger then you may well have someone with 4 heads or 0 (4 tails).
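For teachers who want to know what to expect before the lesson, this Python sketch works out the numbers. It assumes fair coins and children who toss independently; the function names are my own.

```python
from math import comb


def heads_distribution(n_tosses=4):
    """Probability of each possible number of heads in n tosses of a fair coin."""
    return {k: comb(n_tosses, k) / 2 ** n_tosses for k in range(n_tosses + 1)}


def chance_someone_gets_all_same(class_size, n_tosses=4):
    """Chance that at least one child gets all heads or all tails."""
    p_one_child = 2 / 2 ** n_tosses  # all heads or all tails
    return 1 - (1 - p_one_child) ** class_size


if __name__ == "__main__":
    class_size = 24
    for heads, p in heads_distribution().items():
        # Expected number of children with this many heads.
        print(heads, p, class_size * p)
    print(round(chance_someone_gets_all_same(class_size), 3))
```

For a class of 24 the expected table is 1.5, 6, 9, 6 and 1.5 children for 0 to 4 heads, and the chance that at least one child gets four of a kind is about 96%.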

Ask the children if they think this is amazing or accidental?

Then, get the children to continue tossing their coins until they get either 4 heads or 4 tails in a row.  Perhaps make it a competition to see how fast they can get there.  They need to continue to write down each head and tail.

You may then get them to add up all their heads and all their tails. Then get them to work out the proportion of heads (divide the number of heads by the total number of tosses). If you like, go one step further and collate all the data. The proportion of heads should be approaching 0.5.

Discuss the idea that getting 4 heads or 4 tails in a row was simply due to chance (randomness).

For more advanced classes, you may talk about statistics in medicine and in the media. You may want to use some specific examples of one-off trials that appeared to show a difference which, when the trial was repeated later, was found to be accidental.


*For the pedantic.  In a controlled trial the numbers in the trial are selected on the basis of pre-specifying a (hopefully) meaningful difference in the outcome between the case and control arms and a probability of Type I (alpha) and Type II (beta)  errors.  The alpha is often 0.05.  In this specific situation if the P<0.05 then it may be reasonable to talk about a significant difference because the alpha was pre-specified and used to calculate the number of participants in the study.

Cheesecake files: Ponting’s last innings

“Only as good as your last match” goes the cliché. This is true for Ricky Ponting and here is why. I recently published an article(1) (Open Access :)) on some new techniques being used in medical research which determine if making an additional measurement improves what we call “risk stratification.” In other words – does measuring substance X help us to rule in or rule out whether someone has a disease? I got a bit bored with talking about “biomarkers” and medical stuff, so when it came to presenting this at the Australian and New Zealand Society of Nephrology’s annual conference I looked to answer the very important question: “Does Ricky Ponting’s last innings matter?”, or in Australian cricket jargon “Ponting, humph, he’s only as good as his last innings, mate.”

How did I do it?

  1. I chose Australia winning a one-day international when chasing runs as an outcome (Win or Loss).
  2. Using data available from Cricinfo I determined which of the following on its own predicts if Australia will win (ie which predicts the outcome better than just flipping a coin): (1) Who won the toss, (2) whether it is a day or night match, (3) whether it is a home or away match, (4) how many runs the opposition scored.
  3. As it turned out if Australia lost the toss they were more likely to win (!), and, not surprisingly, the fewer runs the opposition scored the more likely they were to win.  I then built a mathematical model.  All this means is that I came up with an equation where the inputs were the winning or losing of the toss and the number of runs and the output was the probability of winning.  This is called a “reference model.”
  4. I added to this model Ricky Ponting’s last innings score and recalculated the probability of Australia winning.
  5. I then could calculate some numbers which told me that by adding Ricky Ponting’s last innings to the model I improved the model’s ability to predict a win and to predict a loss.  Below is a graph which I came up with to illustrate this.  I call this a Risk Assessment Plot.
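The steps above can be sketched in Python. Everything specific here is an assumption made for illustration: the logistic form of the model, the coefficients, and the handful of predicted probabilities are all invented and are not the Cricinfo data or the fitted model. The “improvement” shown is just the change in average predicted probability of a win, taken separately for matches won and matches lost, which is one plausible way to get numbers of the kind described in step 5.

```python
from math import exp
from statistics import mean

# Hypothetical coefficients, chosen only so the example runs.
HYPOTHETICAL_COEF = {"intercept": 5.0, "lost_toss": 0.4, "runs": -0.02}


def win_probability(lost_toss, opposition_runs, coef=HYPOTHETICAL_COEF):
    """Reference model (assumed logistic): inputs in, probability of a win out."""
    z = (coef["intercept"]
         + coef["lost_toss"] * lost_toss
         + coef["runs"] * opposition_runs)
    return 1.0 / (1.0 + exp(-z))


def improvement(outcomes, p_reference, p_new):
    """Change in mean predicted win probability, for wins and for losses.

    For wins a rise is good (ideal prediction is 1); for losses a fall is
    good (ideal prediction is 0). Both are returned as positive-is-better.
    """
    wins = [i for i, won in enumerate(outcomes) if won == 1]
    losses = [i for i, won in enumerate(outcomes) if won == 0]
    gain_wins = mean(p_new[i] for i in wins) - mean(p_reference[i] for i in wins)
    gain_losses = mean(p_reference[i] for i in losses) - mean(p_new[i] for i in losses)
    return gain_wins, gain_losses


if __name__ == "__main__":
    print(win_probability(lost_toss=1, opposition_runs=250))
    # Invented predictions for six matches: 1 = Australia won, 0 = lost.
    outcomes = [1, 1, 1, 0, 0, 0]
    p_reference = [0.6, 0.7, 0.5, 0.4, 0.5, 0.3]
    p_with_ponting = [0.7, 0.8, 0.6, 0.3, 0.4, 0.2]
    print(improvement(outcomes, p_reference, p_with_ponting))
```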

So, when the shrimp hit the barbie, the beers are in the esky, and your mate sends down a flipper you can smack him over the fence for you now know that when Ricky Ponting scored well in his last innings, Australia are more likely to win.

The middle bit is the Risk Assessment Plot. The dotted lines tell us about the reference model. The solid lines tell us about the reference model + Ricky Ponting. The further apart the red and blue lines are the better. The red lines are derived from when Australia won, the blue lines from when they lost. If you follow the black lines with arrows you can see that by adding Ricky Ponting’s last innings to the model the predicted probability (risk) of a win increases when Australia went on to win (a perfect model would have all these predictions equal to 1). Similarly the predicted probability of a win gets smaller when Australia did lose (ideally all these predictions would equal 0).

  1. Pickering JW, Endre ZH. New Metrics for Assessing Diagnostic Potential of Candidate Biomarkers. Clin J Am Soc Nephrol 2012;7:1355–64.

The Hunting of the SNARF

Some of you may know Lewis Carroll’s classic nonsense poem “The Hunting of the Snark”. A crew of ten set off with a blank map to find the mythical Snark.

 And the Banker, inspired with a courage so new
          It was matter for general remark,
     Rushed madly ahead and was lost to their view
          In his zeal to discover the Snark

Snarks were dangerous creatures, however

 “For, although common Snarks do no manner of harm,
          Yet, I feel it my duty to say,
     Some are Boojums—”

I dwell in a world where, inspired by the new, many have rushed on ahead to discover the SNARF (SigNals of Acute Renal Failure). The hunting of the SNARF has followed contours familiarly trodden and graphically illustrated by a Hype cycle(1).


The Hunting of the SNARF: A Hype Cycle of the hunt for the perfect biomarker of Acute Kidney Injury

It was kickstarted by new technologies called proteomics and genomics which gave the hope that a rapid, accurate, and, most importantly, early biomarker of Acute Renal Failure (later renamed Acute Kidney Injury, AKI) would soon be discovered. This was the beginning of the hype that was driven in no small part by some fantastic early results. A paper published in the Lancet in 2005 was an important driver in the hype that followed(2). As with many early studies this involved children and cardiac surgery. Importantly the biomarker involved almost perfectly distinguished between those who had the disease and those who didn’t (ie almost no false negatives or false positives). As the field progressed and more and more studies were conducted across a more diverse range of patient groups and potential AKI causes the ability to discriminate between those with and without the disease became much more modest. It became apparent that one biomarker to rule them all was not going to be the solution – rather a panel of biomarkers whereby the clinician would choose which biomarkers, if any, to use according to the timing and suspected etiology of the renal injury, the baseline renal function and specific illness of the patient. We do not yet have such a panel, nor have we conducted sufficient investigations to find if an AKI biomarker(s) adds value to what the clinician can already deduce. That is partly my job and these are the greater challenges that must drive us up the slope of enlightenment to reach the plateau of productivity where finally we may capture the SNARF.

(1)    Jackie Fenn, “When to Leap on the Hype Cycle,” Gartner Group, January 1, 1995

(2)   Mishra J, Dent CL, Tarabishi R, et al. Neutrophil gelatinase-associated lipocalin (NGAL) as a biomarker for acute renal injury after cardiac surgery. Lancet 2005;365(9466):1231–8.