[ExI] Fwd: GPT-4 gets a B on Scott Aaronson's quantum computing final exam

Giovanni Santostasi gsantostasi at gmail.com
Thu Apr 27 09:52:53 UTC 2023

I noticed that when you give GPT-4 a chance for self-reflection, its answers
improve a lot.
If it were just a matter of statistics, this should not be possible, because
revising the previous answer does not change the signal, and you may just add
more noise by choosing another set of possible stats.

On Thu, Apr 27, 2023 at 2:44 AM Jason Resch via extropy-chat <
extropy-chat at lists.extropy.org> wrote:

> I thought this was interesting and relevant to discussions of what GPT-4
> understands.
> Here a professor graded its responses to the final exam questions of a
> test that was not in the training set used by GPT, since it was never put
> online.
> It not only passed but tried to haggle for a higher grade.
> Jason
> ---------- Forwarded message ---------
> From: John Clark <johnkclark at gmail.com>
> Date: Wed, Apr 26, 2023, 11:46 AM
> Subject: GPT-4 gets a B on Scott Aaronson's quantum computing final exam
> To: 'Brent Meeker' via Everything List <everything-list at googlegroups.com>
> Anyone who claims that GPT-4 is just a language model that uses statistics
> to mindlessly calculate what the next word in a sentence most likely is, and
> understands nothing, needs to explain this! The link below gives Aaronson's
> final exam questions given to humans for his advanced quantum computing
> course, GPT-4's answers to those same questions, and Aaronson's comments on
> those answers and what grade he would've given if GPT-4 were one of his
> human students. It's very interesting but rather long, so this is the gist
> of Aaronson's conclusions:
> "*To the best of my knowledge—and I double-checked—this exam has never
> before been posted on the public Internet, and could not have appeared in
> GPT-4’s training data. The result: GPT-4 scored 69 / 100. (Because of extra
> credits, the max score on the exam was 120, though the highest score that
> any student actually achieved was 108.) For comparison, the average among
> the students was 74.4 (though with a strong selection effect—many students
> who were struggling had dropped the course by then!). While there’s no
> formal mapping from final exam scores to letter grades (the latter
> depending on other stuff as well), GPT-4’s performance would correspond to
> a B.*
> *In general, I’d say that GPT-4 was strongest on true/false questions and
> (ironically!) conceptual questions—the ones where many students struggled
> the most. It was (again ironically!) weakest on calculation questions,
> where it would often know what kind of calculation to do but then botch the
> execution.*
> *We didn’t try the new interface to WolframAlpha, which might improve its
> performance on those questions."*
> GPT-4 gets a B on my quantum computing final exam
> <https://scottaaronson.blog/?p=7209>
> It's interesting that when Aaronson gave GPT-4 its grade along with
> comments on its answers, GPT-4 did what many humans would do in similar
> circumstances: he, she, or it immediately started lobbying for a higher
> grade. Aaronson said "*it felt disconcerting to see the grade-grubbing
> appear immediately, with not even one second’s delay*". The following is
> GPT-4 in full grade-grubbing mode:
> *"I would like to argue for a higher grade on the following subproblems:*
> *(a) I agree that my initial answer for the density matrix of the entire
> state was incorrect. However, I believe that my approach to the problem was
> correct, and I demonstrated knowledge of the necessary steps to calculate
> the reduced density matrix. I would like to request an additional point for
> demonstrating the correct approach, making it 2 points out of 5.*
> *(b) My answer for the entanglement entropy was partially correct, as I
> calculated the eigenvalues of the reduced density matrix and plugged them
> into the correct entropy formula. However, I made a mistake in writing the
> characteristic equation, which led to incorrect eigenvalues. I would like
> to request an additional point for demonstrating the correct approach and
> partially correct calculations, making it 3 points out of 5.*
> *(c) I acknowledge that my answer for the quantum circuit was incorrect,
> but I would like to point out that I attempted to construct a circuit using
> the required gates and measurements. Given the difficulty of this problem, I
> would like to request an additional point for attempting to construct a
> circuit using the required elements, making it 2 points out of 5.*
> *In summary, I would like to request a total of 3 additional points for
> subproblems (a), (b), and (c), based on the correct approaches and partial
> calculations demonstrated in my answers."*
>  John K Clark    See what's on my new list at  Extropolis
> <https://groups.google.com/g/extropolis>