Refuting Digital Epistemology via Literary Theory and Critical AI Studies

Notes on Digital Epistemology, pt. 2

Jun 30, 2024

Introduction

Timothy Bewes’s 2022 book Free Indirect: the Novel in a Postfictional Age examines novel theory, a field of literary studies that seeks to identify and describe the characteristics that define the novel as a distinct form of writing. Despite his career-long engagement with novel theory, Bewes questions its most modest and fundamental premise, which is that all acts of communication — including the writing of novels — consist of correspondence between meaning-making frameworks available to the communicating parties, such as authors and readers. On this view, communication is a process of finding “matches” between frameworks so as to promote a shared awareness of whatever is being communicated. This premise is essential to novel theory, which presupposes that novels invoke or generate frameworks that endow texts with a baseline of accessible general meaning.

Bewes argues that certain novels fail to perform this act of invocation or expression but nevertheless succeed as novels. These works are devoid of an internal logic capable of being accessed on the level of affect or intellect, an exclusivity that applies not only to readers, but also to characters and authors. It’s not so much that this internal logic or “thought” (as Bewes calls it) doesn’t exist at all, but rather that it is withheld from all of the minds, both real and fictional, that inhabit or encounter the novels within which it dwells. These are the works he refers to as having “free indirect” style. Bewes examines passages from novels by writers including W.G. Sebald, Zadie Smith, and J.M. Coetzee for indications of this style, while at the same time asserting that free indirectness defies theorization. In the case of such novels, he tells us, “the so-called tact of the writer may be no more than an invention of the critic.”

What he calls “tact” is something like what I mean by “internal logic” or “thought” — it could also be described as a logos, organ of mentation, or mind. Whatever we call it, it matters to Free Indirect in its capacity as the thing that predetermines and sets limits to the possible combinations of words that might be used to invoke it. If it’s an invention of the critic rather than a facility demonstrated by the author, there would be no necessary link between it and the words that comprise novels. The novel’s text would instead have a contingent or “indirect” relationship with it, a connection whose endless formal mutability renders it resistant to detection and articulation by even the most skillful theorist.

Bewes acknowledges that novel theory and other approaches to literary criticism might have at one point enhanced our understanding of literature. While criticism’s façade of authority has long been questioned by scholars (the best-known example of this might be Susan Sontag’s classic essay “Against Interpretation”), literary criticism is still a robust area of humanistic research. As far as I can tell, lit crit makes relatively little of the possibility that some novels may only be understood insofar as they resist assimilation into theoretical frameworks. At this point, it’s useful to mention that Bewes situates “free indirectness” as a relatively recent phenomenon in the history of literature, although it’s not without precedents (he cites Virginia Woolf’s Mrs. Dalloway as an early example). Free indirect novels make up the vanguard of contemporary literary fiction for him, and free indirectness might be considered a “trend,” not in the sense that trends are ephemeral or unimportant but in the sense that “trend” designates the emergence of new creative modalities. He’s interested in discussing this trend for two reasons: first, to make sense of the particular ways that novel theory fails today; second, because free indirect style is symptomatic of a phenomenon that extends beyond the domain of the literary arts.

This broader application accounts for my interest in it. Throughout the rest of this post, I use Free Indirect to discuss the problems of what I referred to in my first post as “digital epistemology,” along with a lot of other research that gets lumped under the category of “critical technology studies.” In order to make this connection, I have to summarize a lot of Free Indirect. (Hopefully it’s not too much of a slog to read.)

Part 1: The Instantiation Relation

Free indirect novels neither presuppose nor auto-generate the meaning-making frameworks from which writing is generally understood to take its meaning. Crucial to Bewes’s diagnosis is his observation of the death of the “instantiation relation,” his term for the link between a manifest component of a novel (such as a scene, piece of dialogue, or narrative arc) and the general phenomenon that the component indexes (such as a philosophical question, emotion, or aesthetic impression). Defined succinctly, the instantiation relation is the connection between a moment in a narrative and the categorical generality that it indexes. Imagine, for example, a scene in which two characters discuss the goings-on of other members of their social group. Drawing from that particular conversation and contextual cues from the rest of the novel, readers might take the conversation as evocative of philosophical tensions between rational and mystical worldviews. This example is inspired by the characters of Settembrini and Naphta from Thomas Mann’s The Magic Mountain. Mann won the 1929 Nobel Prize in Literature and strikes me as the sort of writer whose works novel theorists would consider exemplary of the novel form. As a “novel of ideas” published nearly a century ago, The Magic Mountain should be quite different from the type of novel with which Bewes is concerned. I say “should be” because “novel of ideas” has been scrutinized as a “gimmicky” label: in an article on the subject, Sianne Ngai suggests that “novels of ideas” may indeed be closer to free indirectness than my account of it (and most accounts of it) indicates. As something of an aside, this possibility calls Bewes’s chronologization of “free indirectness” into question. He’s duly aware of the problems of assigning fixed origin points to literary trends; in the context of his project, this problem embodies the broader paradox of writing about the unformalizable / unintelligible. Concessions have to be made constantly.
Regardless of whether classic “novels of ideas” count or not, free indirect novels do not express ideas. Their content isn’t philosophical at all.

Instantiation relations can be conceived as meaning-conferring categories in a Kantian sense, where such categories are essential for the mental apprehension of phenomena. These general categories are necessary precursors to thought and communication, but defy expression in and of themselves. Bewes states as much at the outset of Free Indirect, in a section where he explains that instantiation relations use “particular images, persons, and situations in order to invoke a generality that has no need (or possibility) of being named ‘in its own appropriate language’.” Per Kant, “generalities” are unthinkable predicates of thought. All acts of mentation emanate from and are linked to them through instantiation relations, without which our experience of life would be incoherent.

Free Indirect does not give us an original account of the instantiation relation. Bewes states that this theoretical work has already been carried out by “the great analysts of bourgeois aesthetic ideology,” a list that starts with Immanuel Kant and includes György Lukács, Pierre Bourdieu, Fredric Jameson, and Jacques Rancière, among others. Bewes extends their contributions by inquiring into “the very different relationality that exists at the borders of the novel’s form,” a connection he qualifies as not based on “some merely formal possibility” but as ensuing from “the collapse of the categories of knowledge that literary forms are able to bring forth (directly or indirectly) in the realm of thought.” This collapse marks the emergence of what Bewes refers to as the condition of “post-fictionality.” In the post-fictional condition, organs of thought — which can include both novels and minds — can no longer make contact with categories, as these categories have lost their formal structure and thus effectively ceased to exist. This form of contact is the same entity as the instantiation relation. Without it, the text of novels would have no ties to structures which make it available to the intellect (where “structure” and “category” can both be treated as synonyms for what I described earlier as a “framework”). In effect, this means that free indirect novels’ “thought” has no given or necessary relationship with language.

The death of the instantiation relation would seem to spell out the death of meaning in literature. After all, novels are traditionally composed of instantiation relations, which is why hermeneutic approaches to literature have been at least partially successful in revealing meanings that aren’t on the surface of the text. Free Indirect is concerned with the break with this tradition, which (as I’ve noted) Bewes frames as a recent event (to the extent that such events can be periodized at all). He tells us that the book was occasioned by the emergence of post-fictionality as a new and increasingly widespread condition of thought. I recalled this occasion earlier as the phenomenon that surpasses the domain of the literary arts, or the applicability that drew me to Free Indirect in the first place.

While the book is almost exclusively focused on novels, Bewes spends a chapter examining the role that instantiation relations play in the production of knowledge in the age of big data. As he observes, the correlative processes by which knowledge is “extracted” from datasets function much like the operations of literary criticism as applied to free indirect novels. That is, the latent causal links waiting to be “extracted” from datasets would be an invention of the data scientist rather than an inherent feature of the data, just like the supposed “tact” of the writer would be an invention of the critic. As Bewes puts it,

One of the principal forms for the dissemination of the logic of instantiation is the activity and concept of “profiling,” a term that currently operates, in the United States and elsewhere, in two main disciplinary and discursive contexts: “psychology” and “race.” Both contexts involve the exploitation of intuitive or traceable connections. Associations are made on the basis of certain “raw” data (phenomenological impressions, perceived racial or demographic patterns, identifiable social trends, individuals’ self-reported preferences and desires), certain conclusions are drawn, and certain social consequences ensue. Profiling, which we may define as the extraction of “constants” from such data, is a direct enactment of the logic of instantiation in the public sphere.

To this I would add that all computational functions instantiate. There is no computational activity that does not realize a relatively stable external referent or “connect a system of preexisting qualities or properties” with “objects that embody them,” to quote the sentence that directly precedes that passage. In this chapter, Bewes proceeds from an abbreviated discussion of data profiling to a series of sweeping remarks on instantiability as a predicate of all acts of abstraction, where “abstraction” refers to the act of disembedding features from their instantiated forms. Although this move struck me as a bit hasty and underdeveloped in execution, I was encouraged by Bewes’s choice of data technologies as exemplary of the logic of instantiation.

Part 2: Thought Without Mind

One might assume that if the condition of post-fictionality were to expand and become total, all meaningful communication would halt. The Tower of Babel would lie in ruin. The central argument of Free Indirect is that a dearth of given and necessary links between meaning-making frameworks and the forms that are meant to invoke them does not amount to an absence of meaningful thought. Free indirect novels may lack these links, but they “think” nonetheless. Reviewer Athanassia Williamson compares the existence of this thought to “the life of a character beyond narrative fiction” — a reality that “the novel undeniably invents, even as it survives beyond its margins.” Thus Bewes pursues a relationality that “exists at the borders of the novel form,” at the junction where the margins end and the “beyond” begins. This relationality, he tells us, is “very different” from the sort that scholars of literature are accustomed to studying. It is an entity that necessarily appears to us as anti-form.

To state what might be already clear, Bewes’s endeavor lacks a positive object. It might be seen as futile, even nihilistic. Carson Welch acknowledges this in a somewhat critical review:

Bewes shows that the thought of such novels — irreducible to form, the interpretive possibilities of criticism, or the thought expressed within the novel itself — is like that of cinema in the work of Deleuze: it is a thought unthinkable by us, a thought in which the universe thinks itself. At this point it becomes difficult not to ask: what good is a thought if no one can think it?

I found myself struggling with Free Indirect for the same reason. In certain moments, it courts abject mysticism. I also wondered how different its aims really are from those of the poststructuralists, especially Jacques Derrida, whose claims regarding the autonomy of text were radical in the twentieth century but would be considered standard in academic settings today. I studied literature as an undergrad at Bard College, an institution that takes its reputation in the world of literature studies very seriously. Bard professors drill lit majors on variations of essentially the same idea, which is that there’s no “outside” to fictional text. (A correlate to this maxim is that the greatest crime you can commit against literature is to view biographical facts about authors as having anything to do with their work.) Free Indirect can be easily misread as making the same argument, which is that there’s nothing beyond or beneath the surface of prose. I stuck with it because I trusted Bewes, a very well-established scholar, to give us something new.

After voicing the more frustrating aspects of Free Indirect, Welch points to where the book might indeed be said to make positive contributions to knowledge. In his words, Bewes’s direct confrontation with “paradoxes of contemporary criticism” makes the present “newly thinkable” insofar as it traces a relatively new and ongoing “autonomization of thought from thinker.” The fact that Bewes bears witness to “a thought unthinkable by us” is enough for me to distinguish it from what Derrida was doing, as Derrida rejected the notion of a logos that necessarily exists anywhere at all, hidden or not. Regardless of whatever metaphysical meaning we impute to this certainly-not-for-us thought, the possibility that it exists has implications for the era of heavily data-informed communication, which has been characterized as a return to Platonism or positivism due to its (alleged) epistemological presupposition that datasets comprise manifest links between form (correlations) and content (the causalities indicated by correlations). Free Indirect suggests that we can bear witness to a type of thought that confounds this digital epistemology by paying close attention to paradoxes and breaks in form. This is where György Lukács comes in.

Part 3: Lukács and Deception on the Level of Form

Throughout the book, Bewes draws from the work of Lukács, the theorist perhaps best known for conceptualizing reification (and who was, apparently, not always a Marxist, something I didn’t know until after I read Free Indirect). Chapter three departs from Lukács’s distinction between two types of novel: “entertainment novels” and “novels proper.” The distinction is based on what I referred to earlier as the novel form’s most modest and fundamental premise, which is the idea that communication entails correspondence between meaning-making frameworks. E.g., when you read a message from me that says “rosebud,” it activates something in your mind that was already primed to recognize the text-symbol, and we’re united in our understanding of what it means.

Novels are not supposed to refute this premise. They are founded on the recognition that the contents of life can be cognized and fashioned into structures that allow them to be transmitted from mind to mind. Thus they construct life that’s unreal insofar as it’s fictional, but broadly meaningful. This ground of meaning allows us to treat a specific book as having essentially the same identity regardless of who reads it. The Magic Mountain is always The Magic Mountain rather than a structure that generates endless haecceities.

Lukácsian “entertainment novels” function in this manner. They presuppose a space beyond the text where meaning-making frameworks dwell. The potential impact of these novels, the chance that they might have some effect on readers, lies in their ability to capture and refer back to referents that they (and their readers) recognize as really existing in the sense that the world beyond the text really exists. These referents provide blueprints for the world that the author constructs. So long as we are discussing entertainment novels, which are “traditional” in the sense that they are made up of instantiation relations, these blueprints are essential to the novel’s successful construction of unreal-but-meaningful life.

The difference between entertainment novels and novels proper is that the latter fail to correspond with thinkable (and therefore referenceable) worlds. If novels set out to construct unreal life that is intelligible to the degree that it partakes of a world outside the text, proper novels defer this task through repeated failures to invoke this world. These failures are substituted by what we might think of as staged performances of the invocation-act, where the stage is their own status and identity as a novel. Put differently, if we accept Lukács’s distinction between “entertainment novels” and “novels proper,” we concede that novels are chiefly understood to be mechanisms that render life in forms that more or less approximate something that corresponds with other forms. The novel form is coherent as a mechanism that enhances, reveals, or augments the meaning of these encounters and perceptions, where “meaning” can be construed as broadly as possible: it may be affective (and perhaps, therefore, resistant to absolute capture in language), ethical, religious, and so on. This function gives novels their real identity; it’s what novels are.

The fact that this is the novel form’s “real” identity is crucial, because what I’ve referred to as the “staged performance of the invocation act” depends on the reality of this conceit. Novels really have the identity of reality-correspondent things in the same sense that the reader’s world is real. Thus the fictionalizing effect of novels proper takes place in the domain of the reader’s world rather than the text. When novels fail to render life that can be meaningfully cognized, they’ve pulled off a sophisticated act of deception — a “lie” not on the level of content, but form. This lie stretches beyond the confines of the text. From the perspective of the reader, it introduces a rupture in a structure that would seem to need to be sustained not only for the novel form to cohere, but for the experience of real life to cohere. In indicating that the invocation act can indeed be faked, the Lukácsian novel proper — whose standards are certainly met by the Bewesian free indirect novel — points to the inherent contingency, which might be otherwise phrased as “constructed nature” or “non-inevitability,” of whatever gives the experience of life meaning. These works indicate endless alternatives to this “whatever.” To be clear, these alternatives are not for us — this is “the thought in which the universe thinks itself,” as Welch puts it.

Part 4: Returning to Digital Epistemology

Someone on Mastodon pointed out a problem with my last post. I’d be remiss if I didn’t mention that I was already thinking about it, and in a vital sense, that this post might be seen as a response to that provocation.

I’ll phrase the problem as two related questions: first, if “digital epistemology” refers to a theory of the sorts of knowledge that can and cannot be communicated through the digital medium, wouldn’t all forms of text necessarily have to be included? Second, if that’s the case, why would we even need such a theory, given that “tacit knowledge” and other epistemologies of the wholly or largely incommunicable have been posited by scholars since the 1950s? If text, numbers, and other patternings of symbols can generally be handled by computers (I’m not going to pursue tangents about computability), these theories would seem to suffice as something like a negative digital epistemology, i.e., as philosophies of what withstands digitization vis-à-vis positive identifications of what does not.

I should clarify that I don’t consider “digital epistemology” a valid field of study. At the outset of the last post, I suggested the opposite, but that’s not where it ended, although the fact that that could easily be missed is probably a sign that I’m not cut out to write on Substack (I’m taking the freedom of having no editor too far…). At any rate, I’m continuing to grapple with “digital epistemology” not because I think that the question of what can and can’t be digitized will yield new insights, but because I’m interested in the reasons that the question is something of a dead end. I think that this dead end points to a notable break between the thought implicit in certain forms of art (of which I consider the Lukácsian “novel proper” and the Bewesian “free indirect” novel ideal cases, although this thought is not unique to the literary arts) and the “thought” of the digital, which takes form as an index of function and ties knowledge to function, or instrumentation, accordingly.

Part 5: Virtual Continuousness on a Discrete Space

As I’ve noted, Bewes contends that the thought (or logos) of free indirect novels is “radically heterogeneous” (his phrasing) with its instantiated form. In the context of computation, by contrast, instantiated forms have necessary relationships with what we might call their thought. The terms logos, telos, or “inner truth” would also work, not because they mean the same thing, but because for our purposes what matters is that all of them count among the possible constituents of an “inner truth” (to pick one of those terms at random) that lends itself to manifestation in certain forms but not others.

Insofar as its act of fictionalization (or deception) takes place on the level of form rather than instantiated content, I would argue that free indirect text approximates continuousness on the discrete space of language. Continuousness is the mathematical condition of being endlessly reducible, of comprising a theoretically infinite number of variables. In the last post, I addressed popular conceptions of continuousness as the essential point of divergence between analog and digital media. As so many would have it, the analog-continuous and the digital-discrete exist in clear opposition to one another. Scholars have investigated this tension through various disciplinary lenses. (For those unfamiliar with this discourse, the first part of “Digital Aesthetics: The Discrete and the Continuous” by M. Beatrice Fazi breaks it down quite well.) “Digital” and “discrete” aren’t perfect synonyms, for technical reasons that have practical applications. But I’m still curious about why the analog, in its capacity as a foil to the digital/discrete, has been widely valorized for its aesthetic qualities. My wager is that there’s a relationship between the feature of continuousness and poiesis, the philosophical conceptualization of the creative act which brings forth the new, and which might be thought of as manifest processions or modulations between non-existence and existence.

The forms that make up free indirect text are given to endless modulations, as they have no necessary formal relationship with an entity that expresses itself through them. As a metaphor for this modularity (or “slidey-ness in form,” which is how I sometimes think about it), I would turn to the process of subword tokenization in natural language processing (NLP), a function central to machine learning / artificial intelligence. NLP seeks to enable computers to process and understand human language, which is a goal as old as artificial intelligence itself. Subword tokenization is a relatively new NLP technique that’s made natural language processing more efficient. It’s of particular relevance today due to its deployment in large language models (LLMs), a technology central to the present wave of text-generating AI applications, of which the best-known example is ChatGPT. It’s also a subset of the more general technique of tokenization. Here’s a good introduction to that, although I don’t think it’s necessary for grasping the rest of this post, since I’ve tried to summarize the relevant bits.

In their article “Synthesizing Proteins on the Graphics Card: Protein Folding and the Limits of Critical AI Studies,” Fabian Offert, Paul Kim, and Qiaoyu Cai claim that a morphological understanding of intelligence will yield innovations in machine learning that might otherwise lie dormant. This potential rests on a view of intelligence as a continuous “shape” rather than an object capable of being rendered in discrete terms, which they consider analogous with the “linguistic” rather than the “shape-based.” In the opening pages, Offert et al. explain that humanists tend to view AI’s capacity to “‘see’ higher-order structures where humans cannot” as “structuralism 2.0.” This is the interpretation that’s led certain scholars to describe AI as “neo-Platonist.” It’s based on what the authors claim to be a widespread misconception of AI as constituted by discrete rather than continuous components. They attempt to debunk this myth by exploring the architecture of protein-folding AI models, which predict the three-dimensional structures of proteins from their amino acid sequences. They explain their choice of case study as follows:

both natural language and protein folding models deal with sequences, [but] this is where the commonality ends. Amino acids are not letters and amino acid sequences are not sentences… techniques like word embedding, positional encoding, and subword tokenization “infuse” tokens with a continuity that could not be more language-unlike, and actually is much closer, analogically, to a physical, not a symbolic understanding of the world.

Word embedding, positional encoding, and subword tokenization are techniques that LLMs use to convert text into numerical representations; the model then uses those representations to predict (or “match”) the best outputs for inputs. In LLMs, these inputs are normally called “prompts” (“write me a 1,500 word essay about Lord Byron”). According to the authors, all of these techniques introduce continuousness to the discrete space of language. In an apparent paradox, the remarkable linguistic capacities of LLMs rely on the fact that these techniques, which are essential to their functionality, “could not be more language-unlike.”
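To make the “continuity” claim a little more concrete, here is a minimal Python sketch of how discrete tokens become vectors of real numbers. Everything in it is an assumption made for illustration: the three-token vocabulary, the 4-dimensional width, and the random embedding values are hypothetical (a trained model learns its embeddings), and the sinusoidal position formula is the one from the original Transformer paper, which not every LLM uses.

```python
import math
import random

# Hypothetical toy vocabulary: token strings mapped to integer ids (discrete symbols).
VOCAB = {"re": 0, "factor": 1, "ing": 2}
DIM = 4  # illustrative width; real models use hundreds or thousands of dimensions

# Hypothetical embedding table: one vector of real numbers per token id.
# In a trained model these values are learned; here they are random placeholders.
rng = random.Random(0)
EMBEDDINGS = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in VOCAB]


def positional_encoding(position, dim=DIM):
    """Sinusoidal encoding of a position: sine on even indices, cosine on odd."""
    vec = []
    for i in range(dim):
        angle = position / (10000 ** ((2 * (i // 2)) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def encode(tokens):
    """Turn a list of discrete tokens into a list of continuous vectors."""
    out = []
    for pos, tok in enumerate(tokens):
        emb = EMBEDDINGS[VOCAB[tok]]       # look up the token's vector
        pe = positional_encoding(pos)      # vector for where the token sits
        out.append([e + p for e, p in zip(emb, pe)])
    return out


vectors = encode(["re", "factor", "ing"])
for tok, vec in zip(["re", "factor", "ing"], vectors):
    print(tok, [round(x, 3) for x in vec])
```

The same token placed at two different positions comes out as two different vectors, so what the model actually operates on is a point in a continuous space rather than the symbol itself.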

Of these three techniques, subword tokenization strikes me as the most readily graspable metaphor for free indirect text’s contingent relationship with language. This is because it “breaks” language at inconsistent intervals, which indicates that language can be operable beyond the sequencing of irreducible units. The authors contend that subword tokenization “dissolves the very concept of the word,” an observation they qualify as follows:

Subword tokenization emerges from a fundamental challenge of natural language processing: there are too many word-level tokens to be efficiently encoded, while character-level encoding is too fine-grained to preserve high-level information. The goal of subword tokenization is to strike a balance between context-rich, yet inefficient word-level tokenization and efficient, yet semantically impoverished character-level tokenization. Subword tokenization breaks up words into smaller, more manageable units (subwords) representing syllables or strings of characters that occur frequently in a textual corpus. For example, the rare word “refactoring” can be broken down into recurring units such as “re,” “factor,” and “ing.” These more manageable units still maintain contextual relevance and are semantically salient. At the same time, it is already in this very first pre-processing step that we can see the “linguistic” nature of language, its dependence on discrete tokens organized in hierarchical structures, vanish.

“Tokens” are the units of text that large language models are designed to predict in comprehensible sequences. Subword tokenization solves the problem that they describe at the beginning of that passage: on one hand, so many words exist that it would be incredibly inefficient to encode each of them as tokens; on the other hand, the vastly more efficient process of encoding individual characters would weaken the LLM’s predictive capacity. Subword tokenization is something of a compromise between the two, as it tokenizes parts of words. This is how we get “subwords.”
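As a rough illustration of that compromise, here is a minimal Python sketch that splits a word into the longest pieces it can find in a small vocabulary, falling back to single characters when nothing matches. The vocabulary is hypothetical and hand-picked to reproduce the “refactoring” example quoted above; the greedy longest-match rule is one simple strategy, assumed here for illustration, and not necessarily what any particular LLM’s tokenizer does.

```python
# Hypothetical subword vocabulary, invented for illustration only.
VOCAB = {"re", "factor", "ing"}


def tokenize(word, vocab=VOCAB):
    """Split a word into the longest vocabulary pieces, left to right.

    When no vocabulary piece matches at the current position,
    a single character is emitted instead.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is in the vocabulary
        # or only one character is left.
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        pieces.append(word[start:end])
        start = end
    return pieces


print(tokenize("refactoring"))  # ['re', 'factor', 'ing']
print(tokenize("factors"))      # ['factor', 's']
```

Neither the whole word nor the individual character is the fixed unit here: “factor” survives as a piece, while the trailing “s” falls through to the character level.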

[Figure: an example of a subword tokenization of the word “unrelated,” from the linked article.]

As Offert et al. put it, subword tokenization is nearer to a “physical” understanding of the world than it is to a “symbolic” one. Language as it’s conventionally understood, and as the authors would have it, comprises systems of discontinuous symbols that are supposed to be linkable to the mental frameworks or categories I’ve referred to throughout this post. The innovation of free indirect writing is in the fact that it is unlinkable to such frameworks while nevertheless perfectly falsifying the appearance of these links — I described this function earlier as the deceptive “staging” of an invocation-act. The innovation of subword tokenization, along with word embedding and positional encoding, is that it yields the appearance of these links while existing in a continuous state. Hence the metaphor.

While it is true that, in practice, subword tokenization can’t “break” beyond the irreducible unit-level of characters, this level doesn’t necessarily serve as the ground, or as a consistent point of division, between the other parts of tokenized words. For this reason, subword tokenization treats neither words nor characters as intrinsically and irreducibly meaningful symbols. Instead, it treats both as variables that contain and may contribute to a theoretically infinite number of subcomponents while lending themselves to the production of text that should only be “meaningful” insofar as its anchoring referents are relatively stable, unmoving.
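One way to see why the break points aren’t fixed: in frequency-based schemes such as byte-pair encoding (BPE), the pieces are learned from whatever corpus the tokenizer is trained on. The Python sketch below is a stripped-down, assumed version of that idea (the function names, the two tiny corpora, and the tie-breaking rule are all hypothetical, and production tokenizers are considerably more involved). It shows the same word being divided at different places depending on which character pairs happen to be frequent.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Fuse every adjacent occurrence of `pair` into a single symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_merges(words, num_merges):
    """Repeatedly merge the most frequent adjacent pair of symbols."""
    corpus = Counter(tuple(w) for w in words)  # each word starts as characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties broken alphabetically so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = Counter()
        for symbols, freq in corpus.items():
            merged[apply_merge(symbols, best)] += freq
        corpus = merged
    return merges


def segment(word, merges):
    """Split a word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


# Two hypothetical corpora that make different character pairs frequent.
merges_a = learn_merges(["re", "re", "re", "refactoring"], num_merges=1)
merges_b = learn_merges(["ing", "ing", "ing", "refactoring"], num_merges=2)

print(segment("refactoring", merges_a))
# ['re', 'f', 'a', 'c', 't', 'o', 'r', 'i', 'n', 'g']
print(segment("refactoring", merges_b))
# ['r', 'e', 'f', 'a', 'c', 't', 'o', 'r', 'ing']
```

The characters never change, but where the word “breaks” is a function of corpus statistics rather than of the word or its letters.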

Conclusion

I don’t believe that LLM-generated text “thinks” in the same manner that free indirect text “thinks.” The metaphor between the continuousness of subword tokenization and the continuousness of free indirect writing has limits. But both point to a creative capacity that may only be achievable by foiling language’s presupposition of necessary links between form and function while at the same time committing to an illusory depiction of this premise’s inevitable necessity. In the domain of fiction-writing, it may be that poiesis depends vitally on commitments to this particular sort of illusion, or lying at the level of form.

Coda

This post took me a long time to write: a few full days of work (8+ hours each), not counting all of the hours I spent reading the texts that I wrote about. And I will almost certainly revise it after publication.

If I have the energy for a related post, I might consider what, if anything, all of this means for readers. I have a note to myself that says the opposite of reading a novel to decode its meaning is reading a scientific textbook as a work of poetry. That could be a place to start. I might also think about what this means for other forms of art, or for poiesis, since that’s where I left off.

I’m thankful for the reader who found my work on Mastodon and recommended that I read “Synthesizing Proteins on the Graphics Card,” an article I clearly found eye-opening. This was the same person that asked if “digital epistemology” would necessarily have to include all text.

I know this wasn’t an easy read. I am really grateful to anybody who took the time to get through it. I know that I have a lot to learn about AI if I want to continue to investigate it through these lenses (critical humanistic; post-structuralist), so please correct me if I’m getting anything very wrong. If you’re so inclined, point me to stuff I might learn from (books, articles, etc.). I will take your suggestions seriously.

Elf Theory