Talk of The Villages Florida - View Single Post - The China Study: Nutritional effects are the same for all cancers
04-13-2015, 11:35 PM
jimbo2012

B. The use of 'raw' univariate correlations.

In a study like this survey in China (ecologic, cross-sectional), univariate correlations represent one-to-one associations of two variables, one perhaps cause, the other perhaps effect. These correlations (about 100,000 in this database) should be used only with caution, that is, being careful not to infer one-to-one causal associations. Even though this project provided impressive and unique experimental features, using univariate correlations to identify specific food vs. specific disease associations is not one of these redeeming features, for several reasons. First, a variable may reflect the effects of other factors that change along with the variable under study. This requires adjustment for confounding factors--mostly, this was not done by Denise. Second, for a variable to have information of value (as in making a correlation), it must exhibit a sufficient range. If, for example, a variable is measured in 65 counties (as in China), there must be a distribution of values over a sufficiently broad range for it to be useful. Third, the variables should represent exposures representative of prior years when the diseases in question are developing. I see little or no indication that Denise systematically considered each of these requirements.
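To make the first point concrete, here is a minimal sketch using simulated numbers only (not China survey data). It assumes a hypothetical confounder z that drives both an exposure x and an outcome y, while x has no direct effect on y at all. The helper names (pearson, partial_corr, simulate) and the noise levels are illustrative assumptions; only the figure of 65 units is taken from the text above.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def simulate(n_units=65, seed=0):
    """Hypothetical data: z drives both x and y; x does not affect y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n_units)]
    x = [zi + rng.gauss(0, 0.5) for zi in z]  # exposure tracks z
    y = [zi + rng.gauss(0, 0.5) for zi in z]  # outcome tracks z, not x
    return x, y, z

if __name__ == "__main__":
    x, y, z = simulate()
    print(f"raw r(x, y)         = {pearson(x, y):.2f}")
    print(f"partial r(x, y | z) = {partial_corr(x, y, z):.2f}")
```

Under these assumptions the raw correlation between x and y comes out strongly positive, while the correlation adjusted for z should sit close to zero, which is the sense in which an unadjusted univariate correlation can mislead.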
I should point out that when we were deciding to publish these data in the original monograph, we decided to do something highly unusual in science--to publish the uninterpreted raw correlations, hoping that future researchers would know how to use or not use them. We felt that this decision was necessary because we were wary of those in the West who might have doubted the validity of data collected in China--we had several experiences that led us to suspect this. But also, we believe that research should be as transparent as possible, simply for the sake of transparency, thus minimizing suspicion of hidden agendas. We knew that taking this approach was a risk because there could be those who, knowing little or nothing about experimentation of this type, might wish to use the data for their own questionable purposes. Nonetheless, we decided to be generous and, in order to advise future users of these data, we added our words of caution--written about 1988--as part of our 894-page monograph. I have also repeated this caution in other publications of mine. It seems that Denise missed reading this material in the monograph.
As I was writing this, I discovered this comment from a self-described professional epidemiologist (PhD, cancer epidemiology) on one of the blogs (A Cancer Epidemiologist refutes Denise Mingers China Study Claims due to incorrect data analysis - 30 Bananas a Day!)--a comment that is relevant to the point that I am now addressing in this response.
I do not know this person but did find her comment interesting. After reviewing Denise's critique, she wrote the following for her (Denise's) blog, only then to see it quickly and mysteriously disappear.
"Your analysis is completely OVER-SIMPLIFIED. Every good epidemiologist/statistician will tell you that a correlation does NOT equal an association. By running a series of correlations, you've merely pointed out linear, non-directional, and unadjusted relationships between two factors. I suggest you pick up a basic biostatistics book, download a free copy of "R" (an open-source statistical software program), and learn how to analyze data properly. I'm a PhD cancer epidemiologist, and would be happy to help you do this properly. While I'm impressed by your crude, and - at best - preliminary analyses, it is quite irresponsible of you to draw conclusions based on these results alone. At the very least, you need to model the data using regression analyses so that you can account for multiple factors at one time."
This blogger is making the same point that I am making, but I am puzzled: why was it deleted from Denise's blog?
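As a rough illustration of what "account for multiple factors at one time" means, the sketch below fits a regression with two predictors on simulated data (again, not China survey data). It assumes a hypothetical outcome y that depends only on x2, with x1 merely moving in parallel with x2. All function names, coefficients, and noise levels are made up for the example.

```python
import random

def simple_slope(y, x):
    """Slope of the one-variable least-squares fit y ~ a + b*x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    return sxy / sxx

def ols_two_predictors(y, x1, x2):
    """Least-squares slopes (b1, b2) for y ~ b0 + b1*x1 + b2*x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations directly.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def simulate_counties(n_units=65, seed=0):
    """Hypothetical data: y depends on x2 only; x1 just tracks x2."""
    rng = random.Random(seed)
    x2 = [rng.gauss(0, 1) for _ in range(n_units)]
    x1 = [c + rng.gauss(0, 0.5) for c in x2]
    y = [2.0 * c + rng.gauss(0, 0.5) for c in x2]
    return y, x1, x2

if __name__ == "__main__":
    y, x1, x2 = simulate_counties()
    print(f"unadjusted slope for x1: {simple_slope(y, x1):.2f}")
    b1, b2 = ols_two_predictors(y, x1, x2)
    print(f"adjusted slopes: x1 = {b1:.2f}, x2 = {b2:.2f}")
```

In this made-up setting the unadjusted slope for x1 looks large, but once x2 is in the model the slope for x1 should shrink toward zero and the effect is attributed to x2. Real survey data would of course need many more covariates and some thought about which ones are biologically plausible.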
Lest it be forgotten, the main value of the China data set is its descriptive nature, thus providing a baseline against which other data sets can be broadly compared, either over time or over geographic space. I must emphasize: the correlations published in our monograph CANNOT be blindly used to infer causality--at least for specific cause-effect associations having no biological plausibility. Nonetheless, they do offer a rich trove of opportunities to generate interesting hypotheses, relatively few of which have been explored to date. In contrast, I started with models representing biological plausibility, as determined from prior research, and simply wanted to see if they were consistent with the China survey data.

For the sake of understanding the downside risk of using univariate correlations, I'll use this imaginary conversation involving a few correlations that Denise thought were relevant to her personal allergy to wheat, although many other examples from Denise's treatise could serve the same purpose.
Denise makes a point concerning a highly significant (but unadjusted) univariate correlation between wheat flour consumption and two cardiovascular diseases plus a couple of other diseases. In doing so, she infers that wheat flour causes these cardiovascular diseases. She also makes the point that "none of these correlations appear to be tangled with any risk-heightening variables, either." And further, she implies that I ignored this potentially important correlation, perhaps intentionally, because of my alleged bias against meat. I use this particular example here because others who very much dislike my views have pointed out on the Internet that this example cited by Denise represents evidence of my lack of integrity.
The conversation goes like this, after Denise reminds me of these univariate correlations.
"Denise, that correlation of wheat flour and heart disease is interesting but I am not aware of any prior and biologically plausible and convincing evidence to support an hypothesis that wheat causes these diseases, as you infer."
"Did you, by any chance, look for evidence whether there might be other variables confounding the wheat flour correlation, variables that change in parallel with wheat flour consumption? I presume you did because you said that 'none of these correlations appear to be tangled with any risk-heightening variables.'"
"But just a minute, I found some, and they're all highly statistically significant (p<0.01 to p<0.001)."
"Higher wheat flour consumption, for example, is correlated, as univariate correlations, with
lower green vegetable consumption (many of these people live in northern, arid regions where they often consume meat-based diets with little or no consumption of vegetables). [By the way, Tuoli county data, to which you refer as my 'sin of omission', were intentionally excluded from virtually all our analyses on meat consumption because this county ranked very high when meat consumption was documented at survey time, but much lower when responding to the questionnaire on frequency of meat consumption. That is, these nomadic people migrate for part of the year to valleys, where they consume more vegetables and fruits.]
lower serum levels of monounsaturated fats (possibly increasing risk of heart disease?)
higher serum levels of urea (a biomarker of protein consumption)
greater body weight (higher risk of heart disease?)"
"You might be interested to know that all of these variables are known from prior knowledge, i.e., biological plausibility, to associate with higher risk for heart disease."
"Denise, this is quite an oversight that could suggest the opposite conclusion from the one that you intended to convey. Or was this bias reflecting your personal preference for eating raw meat and avoiding wheat flour? Any thoughts?"
"Why did you highlight this relationship as a key example of my 'sin of omission', being even more 'troubling than the distorted facts in The China Study and the details that (I) leave out'?"
Incidentally, aside from Denise's claiming there were no confounding factors, I might have taken her seriously when she posed a possible effect of wheat flour on heart disease, because it may be possible to gather prior evidence that could be considered as supporting the opposite point of view. In fact, this would be a proper use of univariate correlations, simply searching for those correlations that might hint of supporting evidence for such an hypothesis. If sufficiently convincing, then we could design a more analytical type of study. This exercise is called hypothesis generation, which is one of the virtues of the China data set. But Denise is doing something different, coming very close to almost randomly inferring causality without adjusting for confounding factors, without scanning the variables for analytical authenticity and without--to my knowledge--having prior evidence of biological plausibility for such an hypothesis.
Then, she uses this example as evidence of a "sin of omission" and a "distorted fact" on my part. Using these rather inflammatory words implies serious personal indiscretion on my part. Does she really mean this?
There are different ways of using univariate correlations in a database like this. It is not that these correlations are useless and should be ignored. Rather, it is a question of using them intelligently. By this, I mean first adjusting these correlations for confounding factors (if and when possible) then examining the individual variables of the correlations for authenticity. Depending on the reliability of these correlations, they may be used to guide whether a hypothetical, cause-effect model, perhaps having preliminary evidence of biological plausibility, is on the right track. The most critical expertise needed for their use is knowing the underlying biology, which is so often missing among trained statisticians.
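One check on an individual variable, raised earlier in this post, is whether it spans a sufficient range across the units surveyed. The sketch below uses simulated numbers only and assumes a hypothetical outcome that rises by 0.5 per unit of exposure plus random noise. The function names, the slope, and the two ranges are made up for illustration.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)

def correlation_for_range(low, high, n_units=65, seed=0):
    """Correlation seen when exposure is spread uniformly over [low, high].

    The underlying relationship (slope 0.5, noise sd 1) is identical in
    every call; only the spread of the exposure changes.
    """
    rng = random.Random(seed)
    x = [rng.uniform(low, high) for _ in range(n_units)]
    y = [0.5 * xi + rng.gauss(0, 1) for xi in x]
    return pearson(x, y)

if __name__ == "__main__":
    print(f"wide range (0 to 10):     r = {correlation_for_range(0.0, 10.0):.2f}")
    print(f"narrow range (4.5 to 5.5): r = {correlation_for_range(4.5, 5.5):.2f}")
```

With the same true relationship in both cases, the narrow-range sample would be expected to show a much weaker and less stable correlation, so a variable that barely varies across counties carries little information either way.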
The six models to which I referred in our book are those evaluated in this manner. Yes, when possible, I also used univariate correlations (along with statistical significance) in support of these models but only after we had preliminary supportive data for the model (only briefly summarized in the book). Here are a few representative publications of those supportive data for the six models that we explored in our book:

Breast cancer (Marshall JR, Qu Y, Chen J, Parpia B, Campbell TC. Additional ecologic evidence: lipids and breast cancer mortality among women age 55 and over in China. Europ. J. Cancer 1991;28A:1720-1727; Key TJA, Chen J, Wang DY, Pike MC, Boreham J. Sex hormones in women in rural China and in Britain. Brit. J. Cancer 1990;62:631-636.)

Liver cancer (Campbell TC, Chen J, Liu C, Li J, Parpia B. Non-association of aflatoxin with primary liver cancer in a cross-sectional ecologic survey in the People's Republic of China. Cancer Res. 1990;50:6882-6893; Youngman LD, Campbell TC. Inhibition of aflatoxin B1-induced gamma-glutamyl transpeptidase positive (GGT+) hepatic preneoplastic foci and tumors by low protein diets: evidence that altered GGT+ foci indicate neoplastic potential. Carcinogenesis 1992;13:1607-1613).

Energy utilization (Horio F, Youngman LD, Bell RC, Campbell TC. Thermogenesis, low-protein diets, and decreased development of AFB1-induced preneoplastic foci in rat liver. Nutr. Cancer 1991;16:31-41; Campbell TC. Energy balance: interpretation of data from rural China. Toxicological Sciences 1999;52:87-94).

Colon cancer (Campbell, T.C., Wang G., Chen J., Robertson, J., Chao, Z. and Parpia, B. Dietary fiber intake and colon cancer mortality in The People's Republic of China. In: Dietary Fiber, Chemistry Physiology and Health Effects, (Ed. Kritchevsky, D., Bonfield, C., Anderson, W.), Plenum Press, New York, 473-480, 1990).

Affluent-Poverty Diseases (Campbell TC, Chen J, Brun T, et al. China: from diseases of poverty to diseases of affluence. Policy implications of the epidemiological transition. Ecol. Food Nutr. 1992;27:133-144).

Protein-growth rate (Campbell TC, Chen J. Diet and chronic degenerative diseases: a summary of results from an ecologic study in rural China. In: Temple NJ, Burkitt DP, eds. Western diseases: their dietary prevention and reversibility. Totowa, NJ: Humana Press, 1994:67-118; Campbell TC, Junshi C. Diet and chronic degenerative diseases: perspectives from China. Am. J. Clin. Nutr. 1994;59:1153S-1161S).