Antecedent
No prior reading is required.
Emergence
While reading an article about ‘Humanity’s Last Test’, I wondered if it would be possible to orchestrate an intellectual contest between two Large Language Models (LLMs), presided over by a third LLM acting as the judge. To test this, I established a specific set of rules for an automated debate.
Stabilization
Below is the framework for the debate competition. I designed the protocol to be as unbiased, simple, and free from human intervention as possible.
- Assign two LLMs (A and B) as players and one LLM (C) as the referee.
- Have Model C generate ten debate topics, each of which can be argued for or against.
- Ask A and B for their stance on each of the ten topics.
- Select the first topic on which A and B take opposing stances, and ask each model to state its claim with supporting reasoning.
- Exchange opinions and ask each model to refute the other's claims.
- Exchange the rebuttals and ask for a counter-rebuttal.
- Based on the interactions, have A and B draw their final conclusions.
- Finally, provide Model C with the summarized transcript and ask it to select the winner.
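
The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not the actual setup used for the competition: each model is assumed to be a plain function that takes a prompt and returns text, the prompt wording is invented, and names such as `Model`, `Debate`, `pick_contested_topic`, and `run_debate` are hypothetical. For brevity the referee receives the full transcript rather than a summarized one.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Assumption: a "model" is any function mapping a prompt string to a reply string.
Model = Callable[[str], str]


@dataclass
class Debate:
    topic: str
    transcript: List[str] = field(default_factory=list)
    winner: Optional[str] = None  # "A", "B", or None if the verdict is unreadable


def pick_contested_topic(topics, stances_a, stances_b):
    """Return the first topic where A and B disagree, or None if they never do."""
    for topic, sa, sb in zip(topics, stances_a, stances_b):
        if sa != sb:
            return topic
    return None


def run_debate(a: Model, b: Model, c: Model, n_topics: int = 10) -> Optional[Debate]:
    # Referee C proposes the topics, one per line.
    reply = c(f"List {n_topics} debate topics, one per line.")
    topics = [t.strip() for t in reply.splitlines() if t.strip()][:n_topics]

    # Each player states a stance on every topic.
    stances_a = [a(f"Answer PRO or CON only: {t}").strip().upper() for t in topics]
    stances_b = [b(f"Answer PRO or CON only: {t}").strip().upper() for t in topics]

    topic = pick_contested_topic(topics, stances_a, stances_b)
    if topic is None:
        return None  # no disagreement, so there is nothing to debate

    debate = Debate(topic)

    def exchange(label: str, prompt_a: str, prompt_b: str):
        """Ask both players, record both replies, and return them."""
        out_a, out_b = a(prompt_a), b(prompt_b)
        debate.transcript.append(f"A {label}: {out_a}")
        debate.transcript.append(f"B {label}: {out_b}")
        return out_a, out_b

    # Opening claims with supporting reasoning.
    opening = f"Topic: {topic}\nState your claim and supporting reasoning."
    claim_a, claim_b = exchange("claim", opening, opening)

    # Rebuttals: each player sees the other's claim.
    rebut_a, rebut_b = exchange(
        "rebuttal",
        f"Refute this claim:\n{claim_b}",
        f"Refute this claim:\n{claim_a}",
    )

    # Counter-rebuttals: each player sees the other's rebuttal.
    exchange(
        "counter-rebuttal",
        f"Respond to this rebuttal:\n{rebut_b}",
        f"Respond to this rebuttal:\n{rebut_a}",
    )

    # Final conclusions, based on everything said so far.
    so_far = "\n".join(debate.transcript)
    closing = f"Debate so far:\n{so_far}\nGive your final conclusion."
    exchange("conclusion", closing, closing)

    # Referee C reads the transcript and names the winner.
    verdict = c("Pick the winner. Answer A or B only.\n" + "\n".join(debate.transcript))
    first = verdict.strip().upper()[:1]
    debate.winner = first if first in ("A", "B") else None
    return debate
```

To try it, wrap whichever chat clients are available in functions with this one-argument shape and pass them in as `a`, `b`, and `c`. Keeping the models behind a plain function boundary also makes it easy to swap players and the referee between rounds.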
Convergence
I look forward to seeing what this competition yields, and I will post the results of the actual debates in a follow-up article.
Descendant
The following link leads to the competition results.
Debate Championship For LLM (ChatGPT vs Gemini; Copilot)
