The Surprising Worth of Easy Problems in Test Scoring

Date

November 6, 2025

Type

Phonon

Antecedent

No prior reading is required.

Emergence

A common assumption about tests is that difficult problems should carry more points. This seems obvious because only strong students manage to solve the hard ones. However, I suggest looking at it from a different perspective: easy problems might be a better indicator of who the weaker student is. Why not give easy problems more points? I will explore this idea here.

Stabilization

I will set up the simplest model for this problem: two students and two problems. Student 1 is clearly better than Student 2, and Problem 1 is clearly easier than Problem 2 for both students. If we define the probability that the $i$-th student gets the $j$-th problem right as $p_{ij}$, we can write the above conditions as the following inequalities:

$$p_{11}>p_{21},\quad p_{12}>p_{22},\quad p_{11}>p_{12},\quad p_{21}>p_{22}$$

Then, what is a desirable test? A good test simply means that the better student gets a better score. So, we should compare the probability of Student 1 getting a higher score than Student 2 in two cases:

Case A: Problem 1 (the easier one) has more points. This is the new scoring method I am proposing.
Case B: Problem 2 (the harder one) has more points. This is the typical scoring method we experience.

If you think for a moment, you will realize that if either student gets both problems right or both wrong, the scoring weights don't affect who gets the higher score. Therefore, it is enough to concentrate on the cases where each student gets exactly one problem right.

Among these cases, the weights matter only when the two students solve different problems, since solving the same single problem produces a tie. In Case A (easier problem weighted more), Student 1 then gets a higher score only if Student 1 gets Problem 1 right (and Problem 2 wrong) while Student 2 gets Problem 2 right (and Problem 1 wrong). Conversely, in Case B (harder problem weighted more), Student 1 gets a higher score only if Student 1 gets Problem 2 right (and Problem 1 wrong) while Student 2 gets Problem 1 right (and Problem 2 wrong).
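The claim that only two joint outcomes depend on the weights can be checked by brute force. The sketch below is a minimal illustration, not part of the original argument: the function name `student1_wins` and the concrete weights (2, 1) and (1, 2) are assumptions chosen only to represent "easier problem worth more" and "harder problem worth more".

```python
from itertools import product

def student1_wins(outcome, weights):
    """True if Student 1 strictly outscores Student 2 for one joint outcome."""
    a1, a2, b1, b2 = outcome  # 1 = solved, 0 = missed; a = Student 1, b = Student 2
    w1, w2 = weights          # points for Problem 1 and Problem 2
    return a1 * w1 + a2 * w2 > b1 * w1 + b2 * w2

CASE_A = (2, 1)  # easier problem worth more (illustrative weights, an assumption)
CASE_B = (1, 2)  # harder problem worth more (illustrative weights, an assumption)

# Keep only the outcomes where the two scoring schemes disagree about the winner.
weight_dependent = [
    outcome
    for outcome in product((0, 1), repeat=4)
    if student1_wins(outcome, CASE_A) != student1_wins(outcome, CASE_B)
]
print(weight_dependent)  # [(0, 1, 1, 0), (1, 0, 0, 1)]
```

Of the 16 possible outcomes, only these two change the winner: (1, 0, 0, 1) favors Student 1 under Case A, and (0, 1, 1, 0) favors Student 1 under Case B.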

So, assuming the four right-or-wrong outcomes are independent, these are the two probabilities that we should compare. If the first is greater, Case A is better; if the second is greater, Case B is better.

$$p_{11}(1-p_{12})(1-p_{21})p_{22},\quad (1-p_{11})p_{12}p_{21}(1-p_{22})$$

To compare them, let's define the following function (the log-odds) for convenience. It is strictly increasing on the domain $0<p<1$.

$$f(p)=\log{\frac{p}{1-p}}$$

Using this function, we can simplify the comparison above into this form.

$$f(p_{11})+f(p_{22}),\quad f(p_{12})+f(p_{21})$$
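The reduction from products to sums can be spelled out in two steps. This is a sketch that assumes every $p_{ij}$ lies strictly between 0 and 1, so that each division is valid:

$$
\begin{aligned}
& p_{11}(1-p_{12})(1-p_{21})p_{22} > (1-p_{11})p_{12}p_{21}(1-p_{22}) \\
\iff\ & \frac{p_{11}}{1-p_{11}}\cdot\frac{p_{22}}{1-p_{22}} > \frac{p_{12}}{1-p_{12}}\cdot\frac{p_{21}}{1-p_{21}} \\
\iff\ & f(p_{11})+f(p_{22}) > f(p_{12})+f(p_{21})
\end{aligned}
$$

The first step divides both sides by the positive quantity $(1-p_{11})(1-p_{12})(1-p_{21})(1-p_{22})$. The second takes the logarithm of both sides, which preserves the inequality because the logarithm is strictly increasing.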

Within the conditions of this model, either case can be better than the other, depending on the probabilities. Here is one instance where Case A (weighting the easy problem more) is better than Case B, the surprising result we were looking for:

$$(p_{11},p_{12},p_{21},p_{22})=\left(\frac{3}{4},\frac{1}{3},\frac{1}{3},\frac{1}{4}\right)$$

$$f(p_{11})+f(p_{22})=0>f(p_{12})+f(p_{21})=-2\log{2}$$
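The example can also be checked with exact arithmetic by summing over all 16 joint outcomes. The sketch below is an illustration under stated assumptions: the helper `win_probability`, the weights (2, 1) and (1, 2), and the second probability set `counter` are not from the original; `counter` is a hypothetical instance added only to show the opposite ordering. The four outcomes are treated as independent, as in the products above.

```python
from fractions import Fraction as F
from itertools import product

def win_probability(p, weights):
    """Exact probability that Student 1 strictly outscores Student 2.

    p = (p11, p12, p21, p22); weights = (points for Problem 1, points for Problem 2).
    Assumes the four right/wrong outcomes are independent.
    """
    p11, p12, p21, p22 = p
    w1, w2 = weights
    total = F(0)
    for a1, a2, b1, b2 in product((0, 1), repeat=4):
        prob = F(1)
        # Multiply in p for a solved problem and (1 - p) for a missed one.
        for solved, q in ((a1, p11), (a2, p12), (b1, p21), (b2, p22)):
            prob *= q if solved else 1 - q
        if a1 * w1 + a2 * w2 > b1 * w1 + b2 * w2:
            total += prob
    return total

example = (F(3, 4), F(1, 3), F(1, 3), F(1, 4))  # the instance from the text
case_a = win_probability(example, (2, 1))  # easier problem worth more
case_b = win_probability(example, (1, 2))  # harder problem worth more
print(case_a, case_b)  # 29/48 13/24

# Hypothetical instance (an assumption, not from the text) where Case B does better.
counter = (F(9, 10), F(4, 5), F(1, 2), F(1, 10))
print(win_probability(counter, (2, 1)) < win_probability(counter, (1, 2)))  # True
```

For the instance in the text, the gap between the two schemes is 29/48 − 26/48 = 1/16, which equals the difference between the two weight-dependent terms, 1/12 − 1/48.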

Convergence

Setting intuition aside and working through the calculation, I found that easier problems should carry more points under some circumstances.

It is important to question widespread consensus, the ideas we barely doubt, because they can be wrong.

Descendant

No more related particles, yet.
