1 Introduction

In today’s society, human interaction with the natural world and other humans is increasingly and pervasively mediated by networked computing. Human intelligence and capacities to act on that intelligence have been magnified by computing, though the elemental operations of computing machines are exceedingly simple: the capacity to generate signals based on a binary distinction represented as an on/off electrical charge. This is called the digital age, though if our times are to be reduced to a signature technology, these computing machines should be characterized as binary rather than digital.

The motivation for this paper has been questions that have arisen for us in a series of research and development projects funded by the US Government and private philanthropies in the areas of cybersecurity, medical informatics, and learning analytics. We became involved in these diverse projects based on our expertise in the areas of digital literacy, AI-supported e-learning platforms and pedagogies, and the theoretical extensions of semiotics required to analyze participation in computer-mediated meaning.

In this paper, we theorize more widely and deeply than was possible within the immediate scope of these projects. Our approach is principally historical and philosophical, taking the form of a cross-disciplinary literature review that weaves between canonical texts and experience gained in our own, more grounded research. In this necessarily synoptic argument, we reference other works we have written that develop the historical and theoretical case at greater length and with additional referencing of sources (Fig. 1).

Fig. 1 Overview of the paper

2 Mechanical intelligence since Ada Lovelace

The development of the field of AI—a recent terminological innovation—needs to be set in the broader historical context of the evolution of notions and practices of “mechanical intelligence”. German philosopher Gottfried Leibniz (1646–1716) is credited as the earliest exponent of the modern idea of computability ([1]: 170–72). Leibniz, however, had sought his inspiration in earlier times, identifying a precursor in the Chinese I Ching or “Book of Changes.” Leibniz took the yin and yang of Shao Yong (1012–1077) to represent the elemental binary, zero and one ([2]: 433–36, [3]). If the elemental truths of the world could be represented in mathematical notation, said Leibniz, reasoning about the world’s intricate complexity could proceed computationally. Then, to determine the truth, “Let us calculate” ([4]: 50–51) (Fig. 2).

Fig. 2 Gottfried Leibniz: “Let us calculate”; Ada Lovelace: “algebraical patterns”; Alan Turing: “Can machines think?”

It was not until the industrial revolution that the first programmable machines pointed to the potentials of mechanized calculation. Two nineteenth century mathematical geniuses, Ada Lovelace and her mother Anne Byron, had visited English factories using the Jacquard loom to manufacture finely patterned fabric ([1]: 156–59). On these machines, intricately woven patterns were manufactured whose designs had been programmed with punched cards. In 1833, Lovelace and Byron were invited to attend a high-society event at the London home of the inventor, Charles Babbage. On display in the drawing room was Babbage’s “Difference Engine,” a mechanical calculator. Babbage was planning a more advanced version, the “Analytical Engine.” Lovelace realized that the punched cards of the Jacquard loom could be used to program calculating machines like Babbage’s. The two became intellectual partners.

Lovelace and Babbage’s collaboration culminated in 1843 with Lovelace’s publication of a 20,000-word journal article on Babbage’s designs for the proposed Analytical Engine. Though the machine was never built, Lovelace was speaking to its concept when she concluded that the Analytical Engine “weaves algebraical patterns just as the Jacquard-loom weaves flowers and leaves” in the patterns of fabric. In these ways, “not only the mental and the material, but the theoretical and the practical in the mathematical world, are brought into more intimate and effective connexion with each other.” By such means, it may be possible to “express the great facts of the natural world, and those unceasing changes of mutual relationship which, visibly or invisibly, consciously or unconsciously to our immediate physical perceptions, are interminably going on in the agencies of the creation we live amidst” ([5]: 693–97). Lovelace’s genius was to connect the programmability of the Jacquard loom and Babbage’s engine with representations of the world mediated by calculation. These were founding insights into the potential of machines to process intelligible meaning.

In 1950, Alan Turing returned to Lovelace’s by-then largely forgotten paper ([1]: 159–69). “Can machines think?” Turing asked in dialogue with Lovelace in his pathbreaking article “Computing Machinery and Intelligence” [6]. Famously, Turing had proposed that, in theory, a single universal machine could compute any sequence that is computable [7]. He had written a report for the British National Physical Laboratory on the possibility of creating intelligent machinery [8]. Now he was at the University of Manchester working on one of the world’s first computers, the Manchester Mark I.

“How did this happen?” scrawled Turing, after circling some numerals on a teletype printout from the Manchester Mark I. This was perhaps the first time a machine had thought something that was beyond normal human capacity. By 1949, announced The Times of London, “the mechanical brain” in Manchester had done something that was practically impossible to achieve on paper. It had found some previously undiscovered, extremely large prime numbers ([9]: 212–17). Of computing machines generally, Turing wrote, “[a]t my present rate of working I produce about a thousand digits of programme a day, so that about sixty workers, working steadily through the fifty years might accomplish the job, if nothing went into the waste-paper basket.” Meanwhile, “[p]arts of modern machines which can be regarded as analogues of nerve cells work about a thousand times faster than the latter… Machines take me by surprise with great frequency…, largely because I do not do sufficient calculation” ([6]: 455; 450).

When, asked Turing, would we know that a machine could think? For this, he devised a test, now known as the “Turing Test.” A person and a computer are behind a screen. Without knowing the identity of the responder, another person facing the screen asks questions alternately of the machine and the person. These are answered on a teleprinter in order to hide the identity of the answerer. If the questioner cannot tell the difference between the computer’s response and the human’s, it must be because the machine has been able to think in ways that make it indistinguishable from a human ([6]: 433–34).

The philosopher of language, John R. Searle, later developed a critique of the Turing Test by posing the hypothetical example of a Chinese room. A questioner asks a non-Chinese speaker equipped with a Chinese dictionary and a Chinese speaker, both behind the Turing screen, the meaning of Chinese words. Both can answer correctly. A computer is like the dictionary user—it too could look up and give correct answers using a Chinese dictionary, but this does not mean that the computer knows Chinese, or at least, not in the way a speaker knows Chinese [10].

But Turing never said this was artificial intelligence, as if it were the same kind of thing as human intelligence [11]. The phrase he used was “machine intelligence,” and the most such a machine could do was to play an “imitation game” ([6]: 249). He had a wry sense of humor. The Turing Test is not designed to find out whether a machine can think like a person, but whether a computer could fool a person into believing that it does. The joke, as it transpires, is on Searle.

The idea that meanings could be represented by on/off switches in electrical circuits can be traced to the work of Claude Shannon, in which he applied the logic of the nineteenth-century mathematical philosopher George Boole ([1]: 162–63). When a circuit is closed, Shannon said, a proposition could be considered false; when it is open, it could be considered true [12] (Fig. 3).
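Shannon’s insight can be made concrete in a few lines of code. What follows is a minimal sketch in Python (our illustration, not Shannon’s own notation): propositions as switch states, composed into a half-adder, the elemental circuit by which mere switching becomes arithmetic.

```python
def AND(a: bool, b: bool) -> bool:
    """A series circuit: current flows only if both switches are closed."""
    return a and b

def XOR(a: bool, b: bool) -> bool:
    """Exclusive-or, itself composable from and/or/not switching."""
    return a != b

def half_adder(a: bool, b: bool) -> tuple[bool, bool]:
    """Add two one-bit numbers: returns (sum, carry)."""
    return XOR(a, b), AND(a, b)

for a in (False, True):
    for b in (False, True):
        s, c = half_adder(a, b)
        print(f"{int(a)} + {int(b)} = carry {int(c)}, sum {int(s)}")
```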

Fig. 3 Claude Shannon and Boolean algebra; Warren McCulloch and electronic analogues of neurons; John von Neumann and binary computing

The connection between binary calculation and the binary state of neurons in the human brain was first made by Warren McCulloch in an article jointly written with a student, Walter Pitts, published in an obscure scholarly journal, The Bulletin of Mathematical Biophysics ([13]: 3–7). McCulloch and Pitts said that neurons could be excited or not, and each of these states could represent a proposition such as yes/no or true/false. The activity of the mind could then be mapped in terms of a mathematical logic of propositions. They were careful to say, however, that the physiological brain was much more than this [14], a qualification that has frequently been lost in subsequent discussions of artificial intelligence.
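The 1943 model can be stated in a few lines. Here is a minimal sketch, with weights and thresholds of our own illustrative choosing, of a McCulloch and Pitts style neuron: an all-or-none unit that fires when the weighted sum of its binary inputs reaches a threshold, and which can thereby stand for a logical proposition.

```python
def neuron(inputs: list[int], weights: list[int], threshold: int) -> int:
    """Fire (1) if the weighted sum of binary inputs reaches the threshold; else stay quiescent (0)."""
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

# The same all-or-none unit computes different propositions as the threshold varies:
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "AND:", neuron(x, [1, 1], 2), "OR:", neuron(x, [1, 1], 1))
```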

John von Neumann, a professor at the Institute for Advanced Study in Princeton, chanced upon the McCulloch and Pitts article. Among the other people working at or visiting the Institute in those early years of computing were some of the field’s most inventive greats, including Alan Turing, Claude Shannon and Alonzo Church. The article proved to be an inspiration for von Neumann’s design for EDVAC, the Electronic Discrete Variable Automatic Computer. Neurons “have all-or-none character, that is two states: quiescent and excited,” said von Neumann. Until then, digital computers had worked with base ten notation. But the design now proposed by von Neumann was binary, using base two notation: “It is easy to see that these simplified neuron functions can be imitated by telegraph relays and vacuum tubes” ([15]: 12–13). Von Neumann’s design was to become the foundational architecture of modern computers.

The term “artificial intelligence” was coined by John McCarthy for a small workshop held at Dartmouth College in the summer of 1956, in which Claude Shannon was one of the key participants ([13]: 28–30). McCarthy defined artificial intelligence as “making a machine behave in ways that would be called intelligent if a human were so behaving” ([16]: 11). Turing had said that computers might be able to achieve something like this, but only as a trick. The brain/binary computing parallel is the reason why artificial intelligence became the basis of so many overblown claims, whether fueled by techno-enthusiasm or fear.

Our purpose in this brief retelling of the origins of binary computing is to highlight the sources of overgeneralizations and oversimplifications that conflate human and machine intelligence, as if the one can be a version of the other. So, our question as we move into the next stages of our argument: how is the machine intelligence of binary computing machines different from human intelligence? Sometimes these machines perform feats that would not be humanly possible. At other times they fall far short of human intelligence, and will forever do so. Either way, the two “intelligences” are so incomparably different that they hardly warrant the same word.

3 The scope of binary computing

On the subject of human intelligence, we know surprisingly little about how brains work, although we do know that they are complicated in ways neuroscientists have as yet barely figured out. Certainly, they are immensely more complicated than the on/off switches proposed in neural-computer analogies [17].

In any event, brains are only the beginning of human intelligence, because they are connected in an inseparably reflexive relationship to our seeing, hearing, smelling, tasting and touching bodies ([18]: 123–39). Bodies and brains are an integrated, bio-physical system whose operation is barely comparable to electronic switching. Intelligence is a joint production of minds and the human sensorium. Together, our bodies and minds have feelings, emotions, intuitions, instincts, needs and desires that are bigger and deeper than any thinking that might be reducible to instrumental reason, less still to binary calculation [19,20,21,22,23].

More than this, human intelligence extends beyond our bodies. It is in our surroundings as well. Our intelligence in practice is also in the meanings that pervade the material world, part humanly made, part natural—the meanings framed by kitchens, material objects, cityscapes or forests, for instance. Our meanings are as much artifactual as they are mental and embodied. Then, connecting the material and the mental, there is no mental meaning whose provenance is not at least in part material; and no material meaning which cannot potentially be drawn into our personal and social meaning systems ([1]: 271–81). Humans are bio-physical learning systems, whose intelligence is in part inscribed externally to their bodies and minds in our physical environments, the tools we use, and our social settings.

We want to propose that computers are machines for the manufacture of meanings that extend human capacities, sometimes in ways that would have been impossible without them. Computers have affordances. But they work in ways that are irreducibly different from human meaning-making processes. In fact, computers become an externalized supplement to our intelligence—and their value is in their profound differences rather than any functional parallelism, or even analogy. The question, then: what are the differences, and how do we put machines to use when at an elemental level they can do no more than process on/off signals?

We answer this question with the notion of “transposition.” Computers transpose human meanings into zeros and ones, process these in base two (Leibniz was right about calculability, and Lovelace about the potentials of machine calculation), then present their calculations back to us in intelligible form (Fig. 4). We want to describe these transpositions by using the language metaphor of “grammar,” in a theory we have termed “transpositional grammar” [1, 18, 24].

Fig. 4 Computers are machines that transpose human meanings through binary notation

There are, we suggest, just six transpositions of which computers are capable, just six kinds of things that can be squeezed into zeros and ones, processed by the machine, then meaningfully squeezed out again:

1. referencing meanings (with proper versus common nouns, and verbs that have themselves been transposed into nouns);

2. determining properties (finely grained adjectival or adverbial processes);

3. positioning in contexts of time and space (in the fashion of prepositions and tense);

4. enacting procedures (a process that in sentences is expressed by transitivity);

5. materializing meanings in media, or the pragmatics of rendering (which is only grammatical to the extent that media constitutes part of the message, or form affords shape to function).

Only now do we arrive at computation proper:

6. quantifying things on the basis of the distinction between one (grammatically: singular), more-than-one (plural), and zero (ellipsis). Here, among many other functions, computers engage in machine learning, by means of which statistical patterns are identified in the order of more-than-one.

Incidentally, before we elaborate, we caution against prioritizing language. We use a grammar metaphor and offer linguistic parallels, but we do not want to privilege language. For meanings can readily be materialized in other forms, across the range of text, image, space, object, body, sound, and speech. Language, moreover, is a fraught category because it aggregates such unlike media: written text (prioritizing space and closely aligned with image and space) is very different from oral/aural speech (which prioritizes time and is closely aligned with sound and body) [24]. We are simply using “grammar” as a familiar metaphor in order to point to some underlying semantic primitives and to consider how these are processed by binary computing machines.

(1) Referencing Meanings: Spontaneous speech relies on a vocabulary that is limited by the capacity of long-term memory. Natural language, moreover, is fraught with ambiguity and context-dependencies, relying much of the time on meanings (intelligence, if you like) that are referenced by shared environments, external to language and mind.

In natural language, proper nouns reference single instances—“Mary Kalantzis” is grammatically a proper noun, pointing to her singularity, but there are others on earth with the same name. Computers handle this problem of referencing instances by applying identifiers, where Mary can be distinguished from any other person by a number of alphanumeric proper nouns that definitively name her and validate that naming by cross-referencing: a passport number, a mobile phone number, an email address. These alphanumeric forms of reference can be aligned also in machine- or human-readable visual recognition with barcodes, photographs and biometrics, all of which also serve as proper nouns, grammatically speaking ([1]: 83–99).

Computers can reference instances with near-absolute reliability. The processes of reading and validation can be mechanized, so referenced entities can speak their names with barcodes, QR codes and RFID chips. Today, we live in a world of billions upon billions of unambiguously referenced instances: persons, web pages by means of URLs, serial numbers, and objects in the Internet of Things. Many of these identifiers are human-readable in their alphanumerical form, but rarely speakable from memory in spontaneous speech. Some are for practical purposes not humanly readable at all, and even those that might be read by humans are unreadable in their underlying binary notation. In this respect, computers mechanize that very ordinary part of natural language, the proper noun. This is a meaning-making process that we might call, if we are to move away from language metaphors towards underlying semantics, “instantiating.” Computers extend our capacity to instantiate, in a universal, translingual vocabulary of unique identifiers.
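As one widely used illustration of such mechanized proper nouns, a universally unique identifier (UUID) can be generated in a line of code. Its alphanumeric form is barely speakable from memory, and its underlying binary notation is not humanly readable at all.

```python
import uuid

identifier = uuid.uuid4()                # a fresh 128-bit identifier, for practical purposes unique
print(identifier)                        # alphanumeric form: readable, but hardly speakable
print(format(identifier.int, "0128b"))   # the underlying binary notation: unreadable to humans
```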

A world just of instances, however, would be bedlam. Children in their early years develop thinking that assembles things that are habitually juxtaposed—Vygotsky calls this thinking in complexes [25]. In our everyday lives we remain mostly complex thinkers, bricoleurs of sorts (Lévi-Strauss’s term), putting odds and ends together because they happen for practical purposes to be juxtaposed [26]. But as children grow older and when science is applied to life, a more analytical process of sorting develops in which things are classified by their criterial features [27]. The cognitive rudiments of this process can be found in conceptualization [28]. In grammar, we make the instance/concept distinction in the contrast between a proper noun (instantiating Mary Kalantzis) and a common noun (classifying her as a person).

At their most systematic, domains of practice develop standardized conceptual vocabularies for the classification of instances, with systems of conceptual relations ordered into ontologies ([1]: 101–19). This is how, in today’s semantic web, the interoperability of concepts is organized [29]. Unicode is a universal record of graphemes—phonemes referencing kinds of sound, and ideographs referencing ideas. When the medical profession talks about our bodies, it uses the shared rubric of the International Statistical Classification of Diseases and Related Health Problems. Jobs are classified by occupation (SOC: Standard Occupational Classification) [30]. Products for sale are identified by article numbers (EAN: International Article Number) and sorted into kinds of product. Languages are classified in Ethnologue. Chemical Markup Language names basic chemistry and the Drug Ontology names pharmaceuticals. Systematized ontologies cover billions of the most useful and important things that can be meant, assembling these into an edifice of standardized concepts. They make the common nouns of natural language pale into conceptual inadequacy ([1]: 301–307).
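A toy sketch can suggest how such classification works computationally: instances carry unique identifiers (proper nouns), while concepts classify them by criterial features (common nouns). All the identifiers and features below are hypothetical, invented for illustration.

```python
# A toy ontology: concepts defined by criterial features. All names are made up.
ontology = {
    "analgesic": {"relieves_pain"},
    "antibiotic": {"kills_bacteria"},
}

# Instances: unique identifiers with observed features.
instances = {
    "DRUG:0000123": {"relieves_pain", "reduces_fever"},
    "DRUG:0000456": {"kills_bacteria"},
}

# Classify each instance under every concept whose criterial features it satisfies.
for instance_id, features in instances.items():
    concepts = [c for c, criteria in ontology.items() if criteria <= features]
    print(instance_id, "->", concepts)
```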

Verbs have become secondary in this newly dominant universe of nouns, via a characteristic habit of science to “nominalize,” where actions are represented as states of becoming, or the consequences of becoming are captured as a concluding state [31]. This process is as old as Newtonian science ([1]: 239–41). Not that verbs can ever go away, because the action leading to states must at least be implied [32, 33]. Or verbs have been turned into quasi-nouns—in grammatical terms, infinitives, participles or gerunds. In any event, the ready transposability of nouns and verbs means that nowadays there is little loss from nominalization.

For these referencing functions, “computing” is perhaps the wrong word, for nothing is being calculated. Zeros and ones are just being used to name things, instantiated in their singularity or classified conceptually by their commonality. You cannot meaningfully add or subtract the zero and one notation applied in the precise naming of instances or concepts. Reduced to zeros and ones, computers know nothing of what they have represented, but they have nevertheless radically expanded our human capacities to instantiate and conceptualize by means of this unified, translingual semantic system. Going far beyond the capacities of our human minds and natural language, binary computers have become our cognitive prostheses, a powerfully unnatural extension of our primitive grammatical capacities to mean by referencing the stuff of the world.

(2) Measuring Properties: To stay with our grammatical analogy for human semantic primitives, computers can perform the job of adjectives, describing the properties of instances or the criterial features of concepts. And to the extent that verbs transpose into nouns via processes of nominalization, adjectives can secondarily do the job of adverbs as well. The world is experienced as qualia, the descriptive properties of sensuous experience: colors, temperatures, humidity levels, chemistries, smells/tastes and other bodily sensations ([1]: 135–53). Computers can add subtlety and nuance to our assessment of some qualia, but not others. They are limited to the capacities of sensors at the beginning of the transposition process and rendering devices at the end.

Computers can measure properties in two ways—direct reduction to binaries via computerized sensors, and indirect assessments mediated by language. Thermometers turn temperatures into a cline of ordinal numbers (adjectives, in effect) that are much more finely graded than natural language assessments relying on the human sensorium—“it’s quite warm today” or “brrr, it’s freezing.” Humidistats accurately measure relative humidity. Spectrophotometers identify colors for accurate color matching. Wearable devices measure body functions. Sensors like these generate precise alphanumeric adjectives and adverbs, more precise than we could ever be in ordinary language when assessing the qualities of everyday experience.
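The direct route can be sketched in a few lines: an analog-to-digital converter maps a continuous reading onto a scale of discrete binary levels. The range and resolution below are illustrative assumptions, not any particular device’s specification.

```python
def quantize(value: float, low: float, high: float, bits: int = 10) -> int:
    """Map a reading in [low, high] onto 2**bits - 1 discrete levels, as an analog-to-digital converter does."""
    levels = 2 ** bits - 1
    clamped = min(max(value, low), high)
    return round((clamped - low) / (high - low) * levels)

reading = quantize(21.37, low=-40.0, high=85.0)   # degrees Celsius, an illustrative sensor range
print(reading, format(reading, "010b"))           # a far finer cline than "it's quite warm today"
```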

Computer sensors, however, have their limits. When sensors cannot directly produce results across a numberable gradient, we must resort to indirect human measurement and data entry. Sensors cannot measure the taste of one wine compared to another, for instance. A sommelier may have words for the differences, though for anyone other than a dedicated believer in the rituals of wine, these are frequently baffling. When it comes to human judgments and feelings, we have to rely indirectly on judgments made by humans and expressed in language. Language, however, is notoriously vague. On a scale of one to five, a survey may ask, are you very unsatisfied, unsatisfied, partly satisfied, satisfied, or very satisfied? But one person’s satisfaction may be another’s dissatisfaction. Natural language processing may seek out words that refer to feelings or emotions, jumping to conclusions in what is technically termed “sentiment analysis.” But meanings can be as varied as humans and their situations. The limits of any such analyses are the ambiguities and context-dependencies of natural language descriptors. Computers in this case can do no better than natural language—which often means, not very well.

(3) Positioning in Context: In natural language, intelligible meanings are not just in the minds or words of the interlocutors. Their meanings are also in their circumstantial and frequently unremarked position in time and space ([18]: 63–90). Linguists and semioticians call this deixis, where the meaning of “this” or “tomorrow” is not helpfully in the words but in the situation to which the words might be pointing. The meaning is material and experiential; it is outside the minds and bodies of persons. Time and space have, since Kant, been recognized as foundational dimensions of contextual experience, though in Kant’s case as a cognitive imposition on experience. Language manages space grammatically with prepositions and time with tense—in both cases with notorious vagueness.

Computers have standardized space on earth with geocoordinates, and definitively disambiguated their natural language referents with GeoNames. They bring time and place together as events in iCal. Nowadays, our devices stamp time and place incidentally to experience—to be processed for dynamic computer maps, fitness trackers, personal calendars and suchlike. The measures of time and space, and the relations of time to space, are available for on-the-fly calculation and visualization with an accuracy of microseconds—for practical purposes far surpassing most human needs, and certainly beyond the thresholds of ordinary human experience. The incidental recording is massively redundant, a historical record of retrospective value only when a future need-to-know arises. The irony of our supposed age of post-Einsteinian relativity is that, in the contextual grounding of time and space by binary computing, our practical experience of positioning is universal and absolute. Kant has been proved wrong. Now every human is marching to the same time, and their place can be definitively identified according to geospatial coordinates. This is no mere cognitive imposition of the relative self, as it was in the era of natural language when our meanings were defined by adjectival constructs relative to the self (“left,” “right,” “near,” “far”), tenses (“now” or “then”), or prepositions (“here” or “there”). It is a newly definitive ontological frame.
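The calculability of position can be illustrated with the standard haversine formula for the great-circle distance between two geocoordinates, stamped with universal time. The coordinates below are illustrative examples of our own.

```python
from datetime import datetime, timezone
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (latitude, longitude) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))   # mean Earth radius of about 6371 km

print(datetime.now(timezone.utc).isoformat())            # a universal, absolute time stamp
print(haversine_km(40.11, -88.21, 51.51, -0.13), "km")   # e.g. Urbana to London
```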

Incidentally, across these first three transpositions—reference to things, reference to properties, and contextualization in time and space—we witness a move of enormous historical significance: from cultures relying on personally-framed natural language to a society dependent, in many but not all domains of meaning, on the unnatural, multimodal, translingual, social language of binary notation. The groundings of this language are material and in a pragmatic sense absolute, not personal and relative.

(4) Automating Agency: In sentences, transitivity connects agents with their actions, subjects with their predicates, and causes with their effects ([1]: 123–26). Computers can automate steps in chains of activity. A person can sell a ticket for a concert; a computer can be programmed to do this too. Computers do this with zeros and ones that have been arranged, after Claude Shannon’s insight, into logical decision paths based on Boolean algebra. Electrical switching translates into a framework of operators: “and” (a conjunction, for instance, when two or more instances share a concept); “or” (a disjunction, for instance, where an instance can be classified by one concept only if it is not classified by another); and “not” (a negation, for instance, where an instance is not present in a concept).

Because computers only have zeros and ones to work with, their logic, when broken into its elemental components, is limited. Computers become smart because they can assemble many such simple steps into many-branched decision trees, taking some boringly low-level aspects of human action off our hands. But their limitation is that they can only take action based on these tiny yes/no, true/false, and/or/not procedural steps. Computers only become somewhat smarter when they perform extended sequences of these simple procedural steps, or anticipate logical chains of action that, over a sequence of many simple branches in the decision tree, offer a wide range of alternatives. Unlike the affordances of computers, human action can cross a wide range of conditions, suggesting mere possibility (such as “perhaps I might,” or “it is possible that”). But elemental Boolean procedures are of one kind—they are commands: if “yes,” then “must.” Again, binary computing has much to offer in the automation of procedure and multi-branched possibility, but beyond the elemental binaries, it is of little help.
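A toy example shows how such command-like yes/no branches assemble into a decision path. Every condition and name here is hypothetical, invented for illustration.

```python
def sell_ticket(seats_left: int, payment_ok: bool, age: int, restricted: bool) -> str:
    """A ticket-selling procedure built entirely from elemental yes/no branches."""
    if seats_left <= 0:                   # "not": no seats, no sale
        return "refuse: sold out"
    if restricted and not age >= 18:      # "and"/"not": a conjunction of conditions
        return "refuse: age restriction"
    if not payment_ok:
        return "refuse: payment declined"
    return "issue ticket"                 # if "yes," then "must"

print(sell_ticket(seats_left=12, payment_ok=True, age=17, restricted=True))   # refused
print(sell_ticket(seats_left=12, payment_ok=True, age=34, restricted=True))   # issued
```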

(5) Materializing Meanings in Media: Human meanings are materialized in text, image, space, object, body, sound and speech ([18]: 90–177). Meanings processed by binary computing are practically unintelligible until materialized in renderings of one kind or another. The technologies of text and image converge in the era of binary computing, both arrayed spatially on a two-dimensional plane divided into pixels. For text, Unicode represents the definitive and universal character set for every human language and major graphemic symbology—in its current version, 144,697 graphemes. JPEG and its interoperable variants can, in Red–Green–Blue combination and 24-bit encoding, represent 16,777,216 color alternatives in any one pixel, even though the human eye is able to distinguish at most only ten million color contrasts. After multiplying x and y axes, megapixel counts on most screens make visual distinctions far finer than the capacity of the human eye. Multiply the color alternatives by the pixel counts by the screen refresh rate, and you find a massive amount of painfully laborious but not-very-intelligent counting. Technologies for the processing and rendering of sound and speech also converge in the digital era. If text and image are arrayed in space, sound and speech are sequenced in time. Sound is recorded as binary numbers at a sampling rate of 44.1 kHz. The human ear can discriminate sound only up to about 20 kHz, and since a signal must be sampled at roughly twice its highest frequency to be faithfully captured, this sampling rate is as good as our hearing can get. Video simultaneously captures the space of image/text and the time of sound/speech. Binary computing can perform transpositions in media that, for sight and hearing, exceed the limits of human sensuous capacities. There is no meaningful point in going any further.

However, between our capacities to see and hear, there is a lot of meaning that is not open directly to digitization—smells and tastes, for instance, while haptics can be digitized only crudely. In the cases of space and object as forms of meaning, computer-aided design is not really three-dimensional, at best coordinating two-dimensional plan perspectives across different planes. 3D visualizations apply the standard tricks of realism in two-dimensional rendering, principally linear perspective and framing. In other words, in reality, computer-aided design can only be rendered two-dimensionally. When it comes to object and space, 3D printing is cumbersome, slow and limited by the liquid qualities of its materials. There is no prospect of rendering real-time 3D experience in the foreseeable future. Meanwhile, tastes, smells, embodied feelings and other central aspects of intelligent human experience can only be communicated in language, so in the machine these remain mediated by, and captive to, its vagaries.

Virtual reality and the immersive experience of the “metaverse,” despite the hype, rely on just another visual trick played on our binocular vision, and an old one at that—to project a pair of two-dimensional images separately, one to each eye. Nineteenth century stereoscopes performed the same trick: a camera with two lenses, separated by the distance between human eyes, was used to take a pair of pictures with a single release of the shutter. When the viewer directs each eye to each image separately, the effect of three-dimensional viewing is created—but it is no more than an optical illusion. Each of the pair of source images is still two-dimensional, and the principal effect of three-dimensionality is created in the same way it always has been in two-dimensional images: by linear perspective.

Nevertheless, the fidelity of representation of meanings in binary computing is remarkable, in sound/speech and image/text at least. Its effect is “telepresence.” Other media have achieved telepresence before—writing, painting, gramophone records, the telephone, to name a few. But universally networked binary computing takes this process to new heights in a single technology, helping us to transcend our mortal limits, crossing time and space in magical and terrible ways. Computer-mediated meaning has its losses as well as its gains, and the medium finds its way into our messages. For all the breathless talk of an impending metaverse, the virtual is still only that: a simulation of only such meaning as can be transposed through binary notation.

We have now passed through five transpositions, and still there has been almost no computation or algorithmic work, or at least it has been computationally trivial (Fig. 5)—counting pixels, recording names, enumerating properties, or enacting the most elementary of procedural logics. Only now do we compute.

(6) Quantifying Relations: In natural language, one instance is represented by the singular, more-than-one by the plural, and meaningful absence by ellipsis. To be less than vague, more-than-one requires a specialized adjective: number. In binary computers, more-than-one can be counted and calculated in base two. Computable relations can be managed by algorithms over datapoints defined by referencing meanings, measuring properties, positioning in context, automating agency, and materializing media. Hence quantifications, for instance: totals of instances per concept, distances, speeds, and counts of computer-mediated actions or items manufactured. Quantification is a secondary process, analysis ex post facto (Fig. 6).

Fig. 5 A trivial machine—Norbert Wiener’s ancient Greek steersman interacting with his pair of steering oars

Fig. 6 Six transpositions through binary notation, with examples: reference (a URL); properties (a fitness tracker); context (a geolocation pin); agency (diagramming computer-supported ticket selling in Unified Modeling Language, with or without a clerk); media (pixelated rendering of a Unicode character); quantity (mathematical formulae, for instance for statistical machine learning)

4 From artificial intelligence to cyber-social systems

When “artificial intelligence” is defined narrowly—as it mostly is nowadays—it refers to the purely quantitative and algorithmic processes of machine learning. Confined to the sixth of the transpositional capacities of binary computing machines, it consists of one of several processes of pattern matching.

One such process is supervised machine learning, where an image or text, for example, is tagged (as an instance) or classified (with a concept) using labels applied by human “trainers.” Statistical methods are used to find similar patterns in new instances of image or text, and the label is then applied by the machine. When text is analyzed by the machine, it is not meanings that are processed but collocations of characters, defined by a limited number of rules that reduce words to morphemes: the spacing of characters, stemming ([18]: 226) and the removal of “stop words” that are too context-dependent to be relevant to analysis. In natural language processing, semantics is at best “latent” [34]—back to Searle’s Chinese room and Turing’s imitation game. When an image is analyzed, it is statistical patterns across the plane of pixels that have been correlated with human labels in natural language ([18]: 158–60).
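A toy sketch suggests how little of meaning survives this reduction: texts are stripped of stop words, crudely stemmed into collocations of characters, and a new text is labeled by its pattern overlap with human-labeled examples. The examples and labels below are made up; real systems apply statistical models over just such reduced features.

```python
STOP_WORDS = {"the", "a", "is", "of", "and", "to"}

def reduce_text(text: str) -> set[str]:
    """Reduce a text to a bag of crudely stemmed content words: characters, not meanings."""
    words = text.lower().split()
    return {w.rstrip("s").removesuffix("ing") for w in words if w not in STOP_WORDS}

# Human "trainers" have labeled these examples (supervised learning's starting point).
labeled = {
    "sports": reduce_text("the players win the match"),
    "finance": reduce_text("the markets fall and prices rise"),
}

# The machine applies the label whose pattern best overlaps the new text.
new_text = reduce_text("players winning every match")
scores = {label: len(new_text & features) for label, features in labeled.items()}
print(max(scores, key=scores.get), scores)
```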

Unsupervised machine learning is another approach, where statistical patterns are called out by the computer, and human trainers are asked to label the text or images where these patterns occur, perhaps surprisingly, or unsurprisingly ([35]: 34–36). In either case, whether it be supervised or unsupervised machine learning, the power of the reasoning is captive to the adequacy of the labels. These are limited by natural language when the tags are ad hoc and folksonomic [36]. Or they are radically extended by means of the processes of referencing as instances and concepts and ordering into ontologies. In the latter case, the power of the reasoning lies in unnaturally granular specification that has become a business model in networked computing—an incidental practice rather than an intrinsic capacity of computers.

“Deep learning” and “neural nets” are multilayered statistical sequences, identifying patterns in patterns [37, 38]. To work, they require vast amounts of data and computing power. Multiple layers of network analysis produce results that are less intuitively explicable than the single-layer patterns of first-order machine learning—in deep learning, in the second and subsequent layers of reasoning, the machine is teaching itself by recalculating its own calculations. As a consequence, the reasoning behind the results becomes “lost in the math” [39].
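The layering can be suggested in miniature: a first layer computes weighted patterns of the input, and a second layer recalculates patterns of those patterns. The weights below are arbitrary illustrative numbers; real networks learn them from vast data.

```python
def layer(inputs: list[float], weights: list[list[float]]) -> list[float]:
    """One layer: weighted sums passed through a simple threshold-like nonlinearity (ReLU)."""
    return [max(0.0, sum(i * w for i, w in zip(inputs, row))) for row in weights]

x = [0.2, 0.7, 0.1]                                        # input features, e.g. reduced text or pixels
hidden = layer(x, [[0.5, -0.4, 0.9], [0.3, 0.8, -0.2]])    # first-order patterns
output = layer(hidden, [[0.6, -0.7]])                      # patterns in those patterns
print(hidden, output)   # the intermediate numbers are where the reasoning gets "lost in the math"
```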

Quantum computing applies metaphors from quantum mechanics, so that the bits of 0 and 1 are replaced with qubits, where 0 and 1 are superposed and determinable as probabilities rather than definite numbers [40, 41]. Given the continued reliance on binary mechanics, quantum computing should be understood as an extension of probabilistic statistics—though many existing machine learning algorithms are already probabilistic.

While the comparison with the human brain is seductive, the laborious, procedural sequencing of binary calculation in machine learning is nothing like the simultaneous neurochemistry of the embodied, environment-sensing, environment-leveraging brain. The most profound intelligence of computers arises not in the sixth of our transpositions but in their peculiar capacities to extend natural language with only the most rudimentary of mechanical calculation, captured in the other five transpositions. In the narrow definition, artificial intelligence is just a statistical subset of the sixth transposition. It gives too much credit to the math, and too little to the semantic peculiarities that binary computing has achieved in the other transpositions.

The pivotal point of our argument is that “artificial intelligence” focuses too narrowly on calculation, as if calculability, the sixth transposition, can act as a substitute or a proxy for intelligibility, as if data can speak for itself without theorization [42], and as if algorithms can be smart without ontologies. Restoring the five neglected transpositions, we want to propose a dialectical relation between binary computing machines and persons, capturing this in a notion we will call “cyber-social systems” [13] and the peculiar human–machine relations of “cyber-social intelligence.”

Now, a definition: cyber-social intelligence is a recursive relationship between peculiarly machine intelligence (the six feats of transposition of meaning through binary processing, frequently reaching beyond unmechanized human possibility) and peculiarly human intelligence (the as-yet unfathomably complex biophysics of our brains, plus the feelings of our bodies, plus the meanings-for-us in our material and social contexts).

Norbert Wiener initiated the “cyber” idea in his 1948 book, Cybernetics, or Control and Communication in the Animal and the Machine [43]. The kubernētēs or steersman adjusts his pair of steering oars, one way then another, to maintain the direction of the boat depending on wind and sea currents ([44]: 322). The boat, its oarsmen and its steersman are a human–machine feedback and environmental learning system (Fig. 5).

Cybernetics remained a prominent and highly productive paradigm for the analysis of human–machine relations until the third quarter of the twentieth century, though eventually, with the rise of artificial intelligence, it fell from fashion [45,46,47,48]. John McCarthy said he deliberately coined the term “artificial intelligence” to escape association with cybernetics ([49]: 78). The meaning is different, as well as the terminology.

While we do not propose to revive the word “cybernetics,” in suggesting “cyber-social systems” we wish to take from cybernetics a number of things lost in the narrow understanding of artificial intelligence. These include: its critique of linear causality, replacing this with the idea of recursive feedback; its critique of directive agency, replacing this with the idea that agency is a dialectical play between the parts of a system and their environment; its critique of the idea that tools are simply instruments in the service of humans, replacing this with the idea that humans establish relations with tools in which the tools become integral to their embodied and cognitive experience, indispensable to their humanity; its critique of hierarchical systems of control, replacing this with the proposition that systems, even the ones insisting on control, are heterarchical and depend for their systematicity on the distributed enactment of protocols; its critique of behaviorism, replacing decontextualized stimulus and presumptively uniform response with the notion that actors bring schemas of experience, identity and history to every new encounter; its critique of individualistic notions of cognition, replacing this with a concept of meaning distributed conversationally among minds and their collective, sensorimotor experience of the material world; and its critique of statistically confined versions of artificial intelligence, supplementing these with semantic understandings of what is being calculated, the representable and irreducibly complex qualities of the world [13].

The main distinction we wish to make between an artificial intelligence paradigm and a cyber-social systems paradigm is this: while artificial intelligence is presented as a substitute for human intelligence, of the same elemental form and replicating it, cyber-social intelligence addresses the complementary relation of two entirely different kinds of intelligence. Binary computing machines perform actions that it would be practically impossible for humans to undertake and futile to attempt: arraying pixels, applying unreadable identifiers, sequencing minimalist procedures, and the other transpositions we have outlined in this paper. In cyber-social systems, human and machine intelligence complement each other in their radical difference rather than substituting for each other.

5 Situating cyber-social systems in a modern socio-technical history

Binary computing is the signature technology of our age. What is its world-historical frame of reference? What are its social consequences?

Nowadays, we have taken to counting historical developments in technology and society in the fashion of software versions. The idea of Industry 4.0 has been popularized by Klaus Schwab, whose World Economic Forum meets each year in Davos, an exclusive Swiss ski resort. Here, selected representatives of the global business and political elite presume to plot the future for us all. Industry 4.0 captures the emerging moment of artificial intelligence, automation/robotics, bio-informatics, the internet of things, and blockchain. Its contrasting predecessor was Industry 3.0, dominated by a first wave of computing: mainframe computers (1960s), then personal computers (1980s) and the internet (1990s). Before that, Industry 2.0 was marked by the advent of electricity and the assembly line at the turn of the twentieth century. And before that again came Industry 1.0 when, from the late eighteenth century, steam-driven machinery and transportation began to replace the muscle power of people and animals [50].

The development of the internet has also been popularly periodized by numbered versions. Tim O’Reilly promoted the phrase “Web 2.0” in a 2004 conference presentation, the year Facebook was launched. In retrospect, Web 1.0 was the original architecture of the web, a hub-and-spoke arrangement of static, server-based web pages. O’Reilly wanted to emphasize the participatory nature of Web 2.0, marked by the rise of platform-based systems open to social participation and interaction, notably social media and the widespread uptake of self-publishing blogs and video sites [51]. The label “Web 3.0” was coined in 2014 by Gavin Wood, co-founder of the cryptocurrency and smart contract platform Ethereum [52]. In contrast to a centralized, platform-based internet, Web 3.0 applications are distributed across multiple sites—blockchain is the archetypical example, with its distributed ledger for cryptocurrencies and non-fungible tokens. The meaning of Web 3.0 has since expanded to include a wider range of emerging features of the web, including artificial intelligence, the semantic web, and the internet of things.

We want to align and reconfigure these schemes into three techno-social systems that we term “industrial” (putting together Schwab’s Industry 1.0 and 2.0), “informational” (Schwab’s Industry 3.0 and O’Reilly’s Web 1.0), and “cyber-social” (Schwab’s Industry 4.0, and putting together O’Reilly’s Web 2.0 and Wood’s Web 3.0).

Techno-Social System 1: Industrial. Wiener applied the description “cyber” to machines or natural processes that used self-regulating feedback processes or servomechanisms. One of the earliest such machines of the industrial era was the Boulton and Watt steam engine, used among other things to pump water from mines. The governor (labeled Q in Fig. 7) consists of a pair of spinning weights. When the machine runs too fast, the balls spin wide and reduce the steam input; when it runs too slow, they fall inward, thus increasing the input of steam. The governor constantly adjusts the amount of steam delivered to the machine. It is a rudimentary learning machine: too much steam and it adjusts to provide less; too little and it provides more. The cybernetician Heinz von Foerster made a distinction here between a trivial machine, which acts on human command, and a non-trivial, learning machine, capable of changing its internal state based on the relation of new input to its previous internal state ([45]: 194–96).
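The governor’s feedback logic can be simulated in a few lines. This is our sketch with invented constants, not a model of the actual engine: too slow and more steam is admitted, too fast and less, so the speed settles toward a set point.

```python
set_point = 100.0   # desired engine speed, in arbitrary units
speed = 60.0
steam = 50.0

for step in range(12):
    error = set_point - speed
    steam += 0.3 * error                # the governor widens or narrows the steam valve
    speed += 0.2 * (steam - speed)      # a toy engine whose speed follows the steam input
    print(f"step {step:2d}: speed {speed:6.1f}, steam {steam:6.1f}")
```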

Fig. 7 A non-trivial machine—the governor (circled) on the Boulton and Watt steam engine acts as a servomechanism

The industrial techno-social system developed rapidly from the late eighteenth to the early twentieth centuries without changing its underlying systems logic: steam then electric machines of increasing scale and significance working at the command of humans. Command of machines was supplemented by command of persons and the strict structures of hierarchical control required by the fine division of labor into elementary actions ([53]: 4–5). Henry Ford created a production line to manufacture his Model T motor car, requiring eight thousand finely differentiated steps. Engineers like Frederick Winslow Taylor designed processes that optimized the productivity of the relations between bodies and machines [54]. Economies of scale were achieved with mass production, a corollary of which was mass consumption. “Any color you like as long as it is black,” said Ford of his cars and the interests of their consumers [55].

The literature frequently calls this system Fordism or Taylorism, after these founders [56]. If the society of the market outside the organization was premised on individual choice and the gradual emergence of democracy, the organization was still patterned internally around command structures that originated in the era of kings and feudal lords. The industrial techno-social system became a site of system-threatening social conflict and, in the twentieth century, an engine of horrific industrial warfare.

Techno-Social System 2: Information. After the crises of the first half of the twentieth century, the command systems of the industrial age were loosened, a process greatly assisted by computing technologies—mainframe computers at first, then server-based local area networks of microcomputers. By these means of information access and communication, it was possible to delegate control to small work teams, or even to workers themselves once they had been inculcated in the organization’s mission and trained to take greater responsibility for their work. The deskilling effects of the minutely differentiated division of labor were reduced when multiskilled workers regulated their own work in self-managing teams. In the literature, this system is called post-Fordism or flexible specialization [56].

Though computer information and communication systems are integral to this change, they are still largely configured in a hub-and-spoke, transmission configuration—mainframes to terminals, or local area networks. They mainly remain what von Foerster would term “trivial machines”—machines acting on command, or operationalizing human communication on demand. Meanwhile, in the wider social realm, mass media, mass production and mass consumer culture remain dominant. Centralized servers transmit information to consumers of knowledge and culture whose relation to transmitted content is essentially passive, notwithstanding the expansion of their browsing options on the web. Media remain vulnerable to propaganda because content creation is centralized in the hands of media megacorporations ([18]: 215–19, [57]).

Techno-Social System 3: Cyber-Social. Even when using the simplest of tools or trivial machines, all human–machine relations are “cyber” in the sense that they involve a reflexive, learning relationship between persons and the instruments that support their actions—hence Wiener’s steersman. But in the industrial and information systems, non-trivial machines that automate feedback and human control were rare—Watt’s steam engine governor was the exception rather than the rule. We want to reserve the term “cyber-social” for an era in which cyber systems of feedback are increasingly automated. In a fully-fledged cyber-social system, computers support our work and lives in an integrated, recursive system of mutual learning, automating key aspects of human relations in cyber-physical systems [58] and social relations with each other. More than this, today’s internet has brought (nearly) every computing device into a single, universal, integrated system of cyber-social relations. No longer a command-and-control system, this is a decentered system of distributed agents. If people comply, it is because they adopt shared technical and social protocols [59].

Stafford Beer was a twentieth century techno-social visionary who, at the beginning of the computing revolution, envisaged some of what would soon be possible. “Our whole concept of control is naïve,” he said in his 1959 book, Cybernetics and Management. It “is primitive and ridden with an almost retributive idea of causality. Control to most people (and what a reflection this is upon a sophisticated society!) is a crude process of coercion” ([60]: 21). A cyber-social system, by contrast, is distributed and adaptive, where the mechanisms of control are those of the self-organizing system. Such a system relies on the interaction of people and machines in systems of collective intelligence.

Comparing organizations in a cyber-social system with those of the industrial and informational systems: as Beer recognized long ago, the strength and intelligence of an organization is not in its center or its management. It is in the system, and the system is distributed as well as coordinated. Effective management plays a less commanding role, and more of a background, coordinating role. Change is pervasive, incremental and fast, with rapid cyber-social feedback loops. The whole organization needs to work in ways that would today be characterized in terms of agile software development [61]. In social terms, agile means democratic, where every person can contribute and that contribution counts. As the old cliché goes, the whole becomes greater than the sum of its parts. Organizations are also increasingly integrated into supply chains, and persons into network-mediated affinity groups. This is an unprecedented situation of total cyber-social integration. The cyber-social system is simultaneously a distributed and centralized system of collective intelligence.

Moving to society: if the media of the older Information techno-social system were based on a transmission, hub-and-spoke model of culture, the new media are participatory. We were relatively passive players in the old media of the Information society—watching television, listening to the radio, or reading a newspaper, for instance. We were culture consumers more than we were culture producers, and we stayed quite passive when Web 1.0 entered our lives. Since Web 2.0, however, we spend as much time in cultural creation as we do in cultural consumption (Fig. 8).

Fig. 8 From industrial, to informational, to cyber-social systems

6 Cyber-social prospects: from risk to trust

It has been a long-cherished hope that technological progress would bring with it social progress. However, just as previous techno-social systems were accompanied by crisis and conflict, so the cyber-social system is riven with risk and danger. None of these techno-social systems is stable, though the basis of the instability of each is quite different.

The degree of technical and social integration in a universal cyber-social system is one source of danger. Such a system is vulnerable to disruption by bad actors—thieves, terrorists, and ideologues pushing narrow or destructive agendas, for instance. Organizations are vulnerable to malevolent intruders through the cyber connections of public-facing networks that are necessarily porous, open to potential customers, supply-chain collaborators and other stakeholders. These are the roots of current concerns under the rubrics of “cybersecurity” [62] and “cyber warfare” [63].

Then there is the problem of platforms. Notwithstanding the logic of distributed agency, the system is coordinated by platforms. In the era of “cloud computing” [64], these platforms are becoming fewer: the top ten cloud providers supply as much as eighty per cent of the world’s platform computing power. Moreover, a platform-based system that houses so much information about our personal lives requires strict protocols to limit unwarranted surveillance and ensure privacy.

There are also new problems with work. If the industrial system exploited labor by processes of deskilling, regimentation and discipline, and if the informational system depended on the internalization and activation of corporate ideologies, the cyber-social economy today brings with it a new precarity of employment. The “gig economy” and self-employed, platform-coordinated work are driven by dynamics of fear and uncertainty, sometimes motivating self-exploitation and overwork, while at other times casting people into underwork. Platforms also depend for their profitability on the “reputational” economy, absorbing enormous amounts of unpaid labor [65]. The reasonably paid work of journalists in a previous era, for instance, is being replaced by a mass of unpaid news reporters, even when that news is just our latest meal or a half-substantiated political opinion.

In the social realm, a system so dependent on cyber-social trust can also breed warranted mistrust. Is this news fake? Is this email phishing? Is this website legitimate? And a social paradox: the more we come together in an integrated cyber-social system, the more, it seems, we are falling apart around the fissures of fragmenting social identities and widening inequalities.

Finally, a crisis of environment looms large as an imminent threat to the cyber-social system [66]. Against the impression that this is an immaterial economy whose fuel is mere information, the platform providers own, as their means of production, heavy-duty, factory-based industrial infrastructures that use huge amounts of electricity. They generate so much heat that a large part of the energy expense goes to cooling them to prevent overheating. Blockchain is a clear example, where an apparently distributed technology is in fact platform-centered, using enormous amounts of electricity to “mine” its ledgers ([18]: 287–91). A literature of imminent collapse foretells great danger in this cyber-social system [67, 68].

“Man has… become a kind of prosthetic God,” said Sigmund Freud. His “auxiliary organs [are] truly magnificent,” but these organs “give him much trouble at times…; present-day man does not feel happy in his Godlike character” ([69]: 219). If the system we have now is, as we have been arguing in this paper, deeply cyber-social, how do we address its endemic crises? Our suggestion is to work to extend the system’s principal underlying strength, its unprecedented sociability. For this, we need to move from a paradigm of narrowly defensive “cybersecurity” to a paradigm of cyber-social transparency and trust.

7 Conclusion

When John McCarthy coined the term “artificial intelligence” in 1956, it was because he wanted to get away from Norbert Wiener’s, Heinz von Foerster’s and Stafford Beer’s notion of “cyber.” We have come back to this notion and reformulated it in the concept of the “cyber-social system.” The reason for our return to this idea is that “artificial intelligence” has become too narrowly focused on statistical processes, to the neglect of the frameworks of meaning to which they are integrally connected. It is also because we have at times allowed the machine to assume in our minds a life of its own (over-enthusiastically, or over-fearfully), when the machine is never more than a complement to, and extension of, human intelligence.

Computability is the transposability of meanings into binary notation. The limits of mechanized intelligence are the limits of the transposability of human meanings into binary notation. We have in this paper proposed six vectors of transposition of meaning through binary computing. Our case has been that the collective intelligence of our times lies less in the statistical feats of “artificial intelligence” (a subset of our transposition 6, quantification), and more in the sheer amount of finely referenced data consisting of: uniquely identified names classified into ontologies (transposition 1); standardized measures of properties (transposition 2); absolute and incidentally recorded contextual grounds of time and space (transposition 3); the automation of micro-processes in the manner of von Foerster’s non-trivial machines (transposition 4); and the standardized forms of recording and rendering text, image, space, object, body, sound, and speech (transposition 5). These are the core functions of today’s cyber-social system.

Binary computing is the signature technology of today’s cyber-social system, driven by recursive feedback relations between machines and people, ubiquitously mediated these days by computing. As a counterpoint to artificial intelligence, we propose cyber-social intelligence. This we define as the recursive relationship between peculiarly machine intelligence (the super-human albeit painfully laborious feats of computation that can be achieved by binary computing machines) and peculiarly human intelligence (the as-yet unfathomably complex biophysics of our brains, connected with the feelings of our bodies, connected into the meanings-for-us in our environments). If computers are useful to humans, it is because their capacities for mechanized intelligence are completely different from, and complementary to, our human, sensuous capacities.