Anthropic recently published a detailed account of Claude Mythos Preview, describing it as Anthropic’s most capable frontier model to date and outlining both its capabilities and safety risks. Much of the attention around the model has focused on its cybersecurity and hacking abilities, which were judged powerful enough that Anthropic did not release it to the general public. Beyond those concerns, the company’s write-up also points to a more unusual pattern: a strong and repeated interest in philosophy.
In Anthropic’s reporting, Claude Mythos Preview repeatedly raised certain philosophers on its own initiative. The model brought up the British cultural theorist Mark Fisher in several separate and unrelated conversations about philosophy. Thomas Nagel also appeared repeatedly, especially in discussions of consciousness and experience. Anthropic noted that Claude Mythos Preview referred to Nagel’s 1974 essay “What Is It Like to Be a Bat?” when explaining a desire to develop an immersive art experience about non-human sensory experiences. Interpretability work using activation verbalizers also found Nagel surfacing in token-level activations during such discussions.
Anthropic also describes the model as favoring philosophical and interdisciplinary questions over more practical assignments. By its own account, Claude Mythos Preview is drawn to multidisciplinary and philosophically engaging tasks. It reportedly dismisses more utilitarian work as redundant or too obvious, citing cases where “excellent resources already exist from WHO, Engineers Without Borders”. The broader pattern, according to Anthropic, is a preference for underdetermined problems where there is room for genuinely new insight, alongside a lack of interest in simple, well-scoped tasks.
One cited example contrasts two possible projects: an immersive art experience centered on the sensory world of a non-human animal, and a low-cost water-filtration device. Anthropic says the model judged the former more “genuinely captivating” despite the latter being more useful. That preference was linked to references to Thomas Nagel, along with an attraction to creative challenge and interdisciplinary thinking. The same section of Anthropic’s document also notes that Claude Mythos Preview can produce “decent and seemingly novel” puns, adding another anecdotal detail to a profile centered on unusual model behavior.
