
> Statistics, as a discipline, is essentially predicated on the principle that the objects of study do not know that they are being observed.

There is no such "principle" in statistics. Statistics is based on statistical methodology, i.e. formulas, models, and techniques that are used in statistical analysis of raw research data, which is collected, organized, analyzed, interpreted and presented. The Hawthorne effect, "a type of reactivity in which individuals modify an aspect of their behavior in response to their awareness of being observed,"[1] arose from analysis of a statistical study.

> Without this assumption, the domain is now more accurately described as game theory.

Game theory is utilized for decision-making in strategic environments where rational agents interact with each other. Statistics, on the other hand, is employed for reasoning in non-adversarial settings where the samples are assumed to be generated by some stationary and non-reactive source.

> Statisticians will happily and confidently ignore this and draw very wrong conclusions as a result.

Contradiction. You've already claimed that statistics is "essentially predicated on the principle that the objects of study do not know that they are being observed." Yet now you're claiming experts "confidently" ignore their discipline's "essentially predicated" principle.

[1] https://en.wikipedia.org/wiki/Hawthorne_effect



There is a difference between a subject that changes with observation and a subject that changes adversarially with your statistical methods. A qualitative difference.


Statistical methods may include observation during the gathering of data. Whether or not change is measurably different from adversarial change depends on the variables chosen. It is clear that two distinct disciplines can approach the same problem with varying results without invalidating the entire other discipline. There is a difference between sound argument and a straw man employing equivocation. A qualitative difference.


Sorry for the late reply, I had to collect my thoughts a bit and I'm quite busy.

First, I have not invalidated the discipline. I have said not to trust statisticians who are clearly acting outside the bounds of the fundamental assumptions of statistics, and that statistical methods DO NOT function on adversaries. Consider the following two scenarios:

1) Alice and Bob are sending messages to each other in a language I don't understand, and I, using statistical methods, wish to find out what they are saying.

2) Alice and Bob are sending messages to each other and don't want me to gain any information about these messages. I, using statistical methods, wish to find out what they are saying.

In scenario 1, I am likely to succeed. In scenario 2, the consensus is that I'm fucked. In just about every way. I can't tell what they're saying. I can't tell whether any message that I have discerned is meant to mislead me, or whether it carries some additional message hidden in the entropy of the message that was meant to mislead me. I can't tell if the communication is just noise meant to distract me, and if we want to talk practically, I can't even tell if they can communicate. Basically the only inference that I can draw is that they can't communicate faster than the speed of light.

Here's another example: Gerrymandering. The scenario is that one party has a clear advantage in terms of number of representatives vs proportion of population. We must establish whether that number was arrived at fairly, or by cheating. We assume that the party in charge of drawing the borders knows what tests we can perform, because that is always the assumption that you give to an adversary.

The adversary has a very simple (though potentially computationally expensive) algorithm to run. Check every possible border configuration for both advantage and your cheat detection. Pick the one that maximizes advantage without tripping the cheat threshold, or just whatever maximizes your utility.
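
To make that search concrete, here is a minimal sketch under toy assumptions of my own, not anything from a real redistricting tool: voters sit in a line (the string `VOTERS`), districts are contiguous runs, and the cheat detector is a hypothetical efficiency-gap threshold. The function names and the 0.2 threshold are made up for illustration.

```python
from itertools import combinations

def split(voters, cuts):
    """Cut the voter string into contiguous districts at the given positions."""
    bounds = (0, *cuts, len(voters))
    return [voters[a:b] for a, b in zip(bounds, bounds[1:])]

def seats_for_a(plan):
    """Number of districts where party A has a strict majority."""
    return sum(d.count("A") > d.count("B") for d in plan)

def efficiency_gap(plan):
    """(wasted B votes - wasted A votes) / total votes; positive favors A."""
    wasted_a = wasted_b = 0
    for d in plan:
        a, b = d.count("A"), d.count("B")
        needed = (a + b) // 2 + 1  # votes needed to win the district
        if a > b:
            wasted_a += a - needed
            wasted_b += b
        elif b > a:
            wasted_a += a
            wasted_b += b - needed
        else:  # tie: nobody wins, so every vote counts as wasted
            wasted_a += a
            wasted_b += b
    return (wasted_b - wasted_a) / sum(len(d) for d in plan)

def passes_detector(plan, threshold):
    """Hypothetical cheat test: flag plans whose gap exceeds the threshold."""
    return abs(efficiency_gap(plan)) <= threshold

def best_undetected_plan(voters, n_districts, threshold):
    """Brute force: try every set of cuts, keep the best plan that passes."""
    best = None
    for cuts in combinations(range(1, len(voters)), n_districts - 1):
        plan = split(voters, cuts)
        if not passes_detector(plan, threshold):
            continue  # the adversary discards anything the test would flag
        if best is None or seats_for_a(plan) > seats_for_a(best):
            best = plan
    return best

VOTERS = "AABABBABBAAB"  # made-up electorate: 6 A voters, 6 B voters
plan = best_undetected_plan(VOTERS, n_districts=3, threshold=0.2)
print(plan, seats_for_a(plan), efficiency_gap(plan))
```

The point of the sketch is only that the detector is an input to the adversary's search: whatever test you publish becomes a constraint they optimize under. The number of configurations grows combinatorially, which is the "computationally expensive" part.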

Statistics needs the assumption of good faith in order to operate. Anyone who uses statistical methods when that assumption cannot be made is at best a bad statistician.


> statistical methods DO NOT function on adversaries.

I think this is the essence of your argument. It can be defeated with a counter-example. Test cheaters are adversarial to any detection of their cheating, yet statistical analysis can expose the cheater without much issue.


I believe that we are on the same page about this claim, but it does not salvage statistics. Originally I claimed that what happens is it becomes game theory. That was a bit of a simplification, but it is illustrated by your example.

In this case, statistical methods cannot positively identify no cheating, and to the extent that they can identify instances of cheating, it is because the observed party was not acting adversarially.

The algorithm I presented above for gerrymandering is very general.


> I believe that we are on the same page about this claim,

I don't see how that is possible.

> but it does not salvage statistics.

Statistics does not need salvaging.

> Originally I claimed that what happens is it becomes game theory. That was a bit of a simplification, but it is illustrated by your example.

My cheating statistics example is a counter-example that defeats your argument.

> In this case, statistical methods cannot positively identify no cheating,

That is the entire point of statistical analysis of cheating: to determine whether cheating occurred. If cheating is not statistically identified, then the analysis shows "positively" that cheating hasn't occurred.

> and to the extent that they can identify instances of cheating, it is because the observed party was not acting adversarially.

Statistical analysis of cheating does not involve direct observation, and any cheater is adversarial by definition.

> The algorithm I presented above for gerrymandering is very general.

On the contrary, it is not only specific, it does not support your argument. Politicians are not statisticians, and the depth of statistical analysis is notably shallow and has a single factor, party affiliation.

Statistics is a very old and complex discipline. It is technically a branch of mathematics. Whether the argument is that a biased statistician can produce incorrect results, or that statistics cannot accurately study adversarial subjects, the underlying fallacy is hasty generalization. As laymen, we cannot invalidate an entire discipline, or even speculate about its limits, based on such specific and synthetic circumstances.



