Improving reasoning in language models via representation engineering

In a recent paper (Højer et al., 2025), which will be presented in Singapore at the International Conference on Learning Representations, we assessed whether it was possible to modulate the perceived reasoning ability of an LLM using an approach called representation engineering. A key aspect of a transformer (the machine learning architecture on which almost all LLMs are based) is the transformation of high-dimensional text representations. In this framing, a representation is a large vector, a long list of numbers, that encodes the text the model is currently processing. It is this representation that we attempt to manipulate.
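
To make this concrete, here is a minimal sketch of what such a representation looks like in code: the hidden-state vector a transformer produces for a piece of text at a given layer. It uses the Hugging Face transformers library and GPT-2 purely as an illustrative model; the text, layer, and model are arbitrary choices, not the setup from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 is used only because it is small; any decoder-only LLM works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

text = "Let's think step by step."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple with one tensor per layer (plus the embedding layer),
# each of shape (batch, sequence_length, hidden_size).
hidden_states = outputs.hidden_states
layer = 6  # an arbitrary middle layer, chosen for illustration
representation = hidden_states[layer][0, -1]  # the vector for the final token

print(representation.shape)  # torch.Size([768]) for GPT-2
```

Every token the model processes is carried through the network as a vector like this one, and it is exactly these vectors that representation engineering reaches in and modifies.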

Related work had shown that you can successfully modulate this representation to induce specific types of “behavior” in an LLM, for example making the model output more or less positive text. This can be done by manipulating the representation alone, provided you can figure out how to manipulate it.
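
As an illustration of how such a manipulation can work in practice, below is a hedged sketch of a common recipe from the representation-engineering literature: build a “steering vector” as the difference between the mean hidden states of contrastive prompts (here positive versus negative sentiment), then add that vector back into the model's hidden states during generation. The model, layer, prompts, and scale are all illustrative choices, not those from our paper or the related work.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

LAYER = 6  # arbitrary middle layer, for illustration


def last_token_state(text: str) -> torch.Tensor:
    """Hidden state of the final token after transformer block LAYER."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # hidden_states[0] is the embedding output, so block LAYER's output is index LAYER + 1.
    return out.hidden_states[LAYER + 1][0, -1]


# Contrastive prompt pairs that differ only in the target "behavior" (here: sentiment),
# used to isolate a direction in representation space.
positive = ["The movie was wonderful.", "What a fantastic day."]
negative = ["The movie was terrible.", "What an awful day."]

steering_vector = (
    torch.stack([last_token_state(t) for t in positive]).mean(0)
    - torch.stack([last_token_state(t) for t in negative]).mean(0)
)

SCALE = 4.0  # strength of the intervention, tuned by hand in practice


def add_steering(module, inputs, output):
    # The block returns a tuple whose first element is the hidden states;
    # add the scaled steering vector to every position.
    if isinstance(output, tuple):
        return (output[0] + SCALE * steering_vector,) + output[1:]
    return output + SCALE * steering_vector


# Attach the intervention to block LAYER, generate, then remove it again.
handle = model.transformer.h[LAYER].register_forward_hook(add_steering)
prompt = tokenizer("The food at the restaurant was", return_tensors="pt")
steered = model.generate(**prompt, max_new_tokens=20, do_sample=False)
handle.remove()

print(tokenizer.decode(steered[0], skip_special_tokens=True))
```

Note that the entire intervention happens at inference time: no weights are updated, which is what makes this kind of approach attractive for probing what a given representation actually encodes.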

As it turns out, the same is possible for the perceived reasoning ability of an LLM. We applied the approach to various simple reasoning tasks and managed to improve the performance of an LLM on these tasks by manipulating the aforementioned representations in a controlled manner. Knowing that this had been done for other types of “behavior”, it wasn’t a revolutionary finding. It is, however, important given recent discussions about the abilities and intelligence of LLMs. In this debate, reasoning is often emphasized as a key component of intelligence, and it is often thought to be different from the other types of information processing done by LLMs. Our results suggest that this is not the case.