Talk of The Villages Florida - View Single Post - accurate covid -19 data
10-07-2020, 06:50 AM
CoachKandSportsguy
Test basis for statistics

OP,

Your question is valid, but the data can be calculated in more than one way, which is the root cause of your question. Without understanding the basis of calculation behind each viewpoint, you are staring at numbers without enough context. It is very nuanced, so read the fine print on the basis. The following discussion is not meant to discredit the prior poster's graphics and data, but to understand the basis and the use/abuse of the data presented.

1) Calculate the positivity per test. This seems valid, but it may not represent the population. Why? What if the tests are only of health care workers, or only of long-term care patients? Raw test data will also include the same person tested over and over, which introduces a time dimension and a blend of repeat-tested versus never-tested people. The best answer is to test the entire population, but that is neither possible nor practical, especially at the very beginning, when testing covered small, symptomatic-only populations and so showed a much higher positivity rate. Statistical sampling will infer that rate to the population, but if it is not bias-adjusted for how the sample was defined, a poor or wrong inference will result (hence the data confusion).

2) Count only the individuals tested. This may be better, but it has the same population bias. Then there is the question of what to do with people who test positive. Stop counting them? That makes sense, but then as the virus spreads, your denominator of tested individuals will continue to shrink, and with a smaller denominator the positivity rate will start to creep higher.

3) Inferential stats are based on random sampling. The covid testing here is generally not random. Specific populations get tested, and the positivity rate of college students probably won't represent the positivity rate of Florida's over-65 population. One can population-adjust the testing stats, but if the populations share no common attributes, as with applying college student positivity to The Villages, the information has no validity.
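
To make the three points above concrete, here is a rough Python sketch. Everything in it is made up for illustration: the test records, the group names ("student", "over65"), the population shares, and the function names are all assumptions, not real covid data or anyone's official method. It compares positivity per test with positivity per person, then applies a simple population adjustment (weighting each group's rate by its share of the target population).

```python
from collections import defaultdict

# Hypothetical test records: (person_id, group, is_positive). Made-up numbers.
tests = [
    ("a", "student", True),
    ("a", "student", True),   # same person retested
    ("a", "student", True),
    ("a", "student", True),
    ("b", "student", False),
    ("c", "student", True),
    ("d", "over65", False),
    ("d", "over65", False),   # same person retested
    ("e", "over65", False),
    ("f", "over65", True),
]

def per_test_positivity(records):
    """Positive tests divided by all tests; repeat tests count every time."""
    return sum(1 for _, _, pos in records if pos) / len(records)

def per_person_status(records):
    """Map each person to (group, ever_positive)."""
    status = {}
    for pid, group, pos in records:
        already_positive = status.get(pid, (group, False))[1]
        status[pid] = (group, already_positive or pos)
    return status

def per_person_positivity(records):
    """People with at least one positive test divided by people tested."""
    status = per_person_status(records)
    return sum(1 for _, pos in status.values() if pos) / len(status)

def group_rates(records):
    """Per-person positivity within each tested group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, people]
    for group, pos in per_person_status(records).values():
        counts[group][0] += int(pos)
        counts[group][1] += 1
    return {g: p / n for g, (p, n) in counts.items()}

def population_adjusted_rate(records, population_shares):
    """Weight each group's rate by its (assumed) share of the target population."""
    rates = group_rates(records)
    return sum(share * rates[g] for g, share in population_shares.items())

# Assumed target population: 10% students, 90% over 65 (illustrative only).
shares = {"student": 0.1, "over65": 0.9}

print(per_test_positivity(tests))               # repeat tests inflate this one
print(per_person_positivity(tests))
print(population_adjusted_rate(tests, shares))
```

With this toy data, the per-test rate comes out higher than the per-person rate because one positive person was retested several times, and the adjusted rate moves again once the groups are reweighted. Note that if the target population contains a group that was never tested, `population_adjusted_rate` fails with a `KeyError`, because there is no rate to weight. That is the same problem as point 3: you cannot adjust your way to a population you never sampled.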

So these types of issues plague (pun intended) data analytics, especially in a population as large and diverse as the US. That is why I stated in an earlier post about what we have learned about the virus: a lot, but information/knowledge is asymmetrical and increases over time, moving from very wrong conclusions to a more accurate but never complete understanding, because the entire population can never be tested, let alone routinely tested.

I hope this helps you understand that the fine print must be read to understand the relevance of the population and the frequency, and any transformations, such as an exponentially weighted average to smooth out daily noise.
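
For anyone curious what an exponentially weighted average does to a daily series, here is a minimal Python sketch. The function name `ewma`, the smoothing factor `alpha`, and the daily numbers are all assumptions for illustration. Seeding the average with the first observation is one common convention, not the only one, and a published chart may use a different one, which is exactly the kind of fine print to check.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average.

    Each smoothed value is alpha * today's value plus (1 - alpha) * yesterday's
    smoothed value. The series is seeded with the first observation.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for v in values:
        if not smoothed:
            smoothed.append(float(v))
        else:
            smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed

# Made-up daily positivity percentages with one noisy spike.
daily = [10, 20, 10]
print(ewma(daily, alpha=0.5))  # [10.0, 15.0, 12.5]
```

A smaller `alpha` smooths more heavily but reacts more slowly to a real change in trend, so two charts of the same raw data can look quite different depending on the value chosen.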

sportsguy