F-Score and Accuracy

From Eyewire
Revision as of 12:58, 1 May 2016 by Wm05055


In Eyewire, accuracy is based on your F-score. The F-score is a statistical combination of two measures called precision and recall. Put simply, the admins use the F-score to gauge your accuracy by measuring what you added and what you missed. The traditional formula for the F-score is:

<math>F = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}</math>

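Before we can calculate your final F-score, we first need to calculate your precision and recall. When you play a cube, there are four possible outcomes: a correctly added result (true positive, tp), an incorrectly added result (false positive, fp), an incorrectly left-out result (false negative, fn), and a correctly left-out result (true negative, tn). A tp means the player added a segment that should have been added. An fp means the player added a segment that should not have been added. An fn means the player did not add a segment that should have been added. A tn means the player left out a segment that should not have been added. (Put simply, tp and tn are correct, while fp and fn are mistakes.) The figure below shows an example.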


[Figure: NewFScoreEyeWire.png]
To the left is an example of a branch submitted by a player. In this example the red and the green segments are what the player submitted, while the purple segment was left out.


The red segment here is a false positive and the purple segment is a false negative. The player mistakenly added the red segment when they should have added the purple segment instead. The green segment is correct.
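
The four outcomes can be illustrated with plain set operations. This is only a minimal sketch: the function name `classify_segments`, the segment labels, and the extra "gray" segment (something neither added nor correct) are hypothetical and are not taken from Eyewire's actual scoring code.

```python
def classify_segments(submitted, correct, candidates):
    """Sort segments into the four outcomes described above.

    submitted:  segments the player added
    correct:    segments that should have been added
    candidates: every segment that could have been added (assumed known)
    """
    submitted, correct, candidates = set(submitted), set(correct), set(candidates)
    return {
        "tp": submitted & correct,               # added, and should have been
        "fp": submitted - correct,               # added, but should not have been
        "fn": correct - submitted,               # should have been added, but was not
        "tn": candidates - submitted - correct,  # correctly left alone
    }


# The branch from the figure, plus one hypothetical untouched segment ("gray").
outcome = classify_segments(
    submitted={"red", "green"},
    correct={"green", "purple"},
    candidates={"red", "green", "purple", "gray"},
)
print(outcome)
```

Here green lands in tp, red in fp, purple in fn, and the hypothetical gray segment in tn, matching the description of the figure.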


This brings us to precision. Precision is the fraction of the volume a player added that was added correctly. For example, if Player A has a precision of 0.9221, about 92% of what Player A added was correct and about 8% should not have been added. To determine a player’s precision we use their true positive (tp) results (correctly added) and their false positive (fp) results (incorrectly added) in this formula:
<math>\text{precision} = \frac{tp}{tp + fp}</math>


Recall measures how much of the correct volume a player found, and therefore how much was missed. Let’s say Player A has a recall of 0.9409. That means Player A found about 94% of the correct segments in the cubes they worked on and missed about 6%. To determine a player’s recall we use their true positive (tp) results (correctly added) and their false negative (fn) results (incorrectly missed) in this formula:
<math>\text{recall} = \frac{tp}{tp + fn}</math>
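
As a rough illustration of the two ratios, the sketch below computes precision and recall from simple tallies. The numbers are made up, the function names are hypothetical, and returning 0.0 when nothing was added (or nothing was correct) is an assumed convention. Eyewire measures volume, so the real inputs are presumably volumes, not segment counts.

```python
def precision(tp: float, fp: float) -> float:
    """Share of what the player added that should have been added."""
    added = tp + fp
    return tp / added if added else 0.0  # assumed convention for "nothing added"


def recall(tp: float, fn: float) -> float:
    """Share of what should have been added that the player actually added."""
    should_add = tp + fn
    return tp / should_add if should_add else 0.0  # assumed convention


# Hypothetical tallies: 90 correctly added, 10 wrongly added, 5 missed.
tp, fp, fn = 90, 10, 5
p = precision(tp, fp)  # 90 / 100
r = recall(tp, fn)     # 90 / 95
print(f"precision={p:.4f} recall={r:.4f}")
```

Note that tn does not appear in either ratio: correctly leaving a segment alone does not raise or lower precision or recall.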


Now we take the results of both formulas and plug them into the F-score formula above to get a player’s F-score. Another way to look at it: we take the harmonic mean of a player’s precision and recall to get their overall accuracy rating.
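
To make the last step concrete, here is a small sketch that combines Player A's example values from this page (precision 0.9221, recall 0.9409) into an F-score. The function name `f_score` and the zero-handling are assumptions for illustration. The cross-check uses `statistics.harmonic_mean` from Python's standard library.

```python
from statistics import harmonic_mean


def f_score(precision: float, recall: float) -> float:
    """F = 2 * precision * recall / (precision + recall)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0  # assumed convention


# Player A's example values from the text above.
player_a = f_score(0.9221, 0.9409)
print(round(player_a, 4))  # 0.9314

# The same number, computed as a harmonic mean of the two values.
check = harmonic_mean([0.9221, 0.9409])
print(round(check, 4))  # 0.9314
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a player cannot reach a high F-score by maximizing only one of precision or recall.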

How Accurate are F-Scores?

One question we get a lot is how we know what is correct and what isn’t. What counts as correct is determined by combining the GrimReaper’s corrections with the EyeWirer consensus. If a cube does not have a GrimReaper correction, we just use the EyeWirer consensus. EyeWirer consensuses have proven to be quite accurate. However, there is still a small chance that a consensus contains a wrong piece, so an F-score is not guaranteed to reflect a player’s true accuracy in every case. Still, F-scores are accurate enough that we feel confident using them as a player guide.