Even the best AI language models fail dramatically when it comes to logical questions. This is the conclusion reached by researchers from the Jülich Supercomputing Centre (JSC), the School of Electrical and Electronic Engineering at the University of Bristol, and the LAION AI laboratory. In their paper, "Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-of-the-Art Large Language Models", the researchers attest to a "severe breakdown in function and reasoning capabilities" in the state-of-the-art LLMs tested. They suggest that although language models possess the basic ability to draw conclusions, they cannot apply it reliably. The authors call on the scientific and technological community to urgently reassess the claimed capabilities of the current generation of LLMs. They also call for the development of standardized benchmarks to uncover weaknesses in language models' reasoning abilities, as current tests have apparently failed to detect this serious flaw. (jr)