Introduction

Over the past two decades, the description of languages other than English in general, and language typology in particular, has attracted growing interest in systemic functional linguistics (SFL). The objective of the present study is to give a cartographical overview of the growing literature in this field. The focus of the study is on languages other than English. The practical reason for this is to narrow the survey down to a manageable scope, given that descriptions of English have already produced a large volume of material (for a review of studies on English grammar, see Matthiessen 2007a).

The primary motivation behind the study is two-fold. The first is to systematically locate and profile the resources available in the extant literature, in terms of theoretical guidelines and methodological procedures, in order to guide new research endeavours in this area. The second is to profile developments in systemic language description and typology since the 1960s in order to show the research areas that have been covered, identify limitations and challenges, and point to gaps for further research. Thus, the approach we adopt here is a meta-analysis rather than an attempt to make typological generalisations (for typological generalisations, see e.g. Matthiessen and Christian 2004; Teruya et al. 2007; Matthiessen et al. 2008). In several places, the paper also discusses other functional approaches to typology and shows how these interact with and contribute to the growing body of studies in systemic functional language typology (hereafter, ‘systemic typology’).

The rest of the paper is organised into four sections. The first section discusses theoretical developments in SFL language description and typology, and the second section describes the methods and procedures employed in compiling and analysing empirical studies for the survey. In the third section, we present the meta-analysis of studies in SFL language description and typology since the 1960s. The final section concludes the paper.

Theoretical developments in systemic language description and typology

As mentioned above, typological research is becoming increasingly prominent as a descriptive component of SFL. Over the past decade, a few scholars have mapped out the theoretical and methodological tenets of systemic typology (e.g. Caffarel et al. 2004: Ch. 1; Teruya et al. 2007; Matthiessen et al. 2008; Teruya and Matthiessen 2015). In this section, we outline some of the theoretical issues that have been discussed. We first discuss the conception of the relationship between linguistic theory, language description and application, and then examine the characteristics of systemic typology and how it interacts with and is influenced by other functional typological approaches.

Linguistic theory, language description and application

Systemic functional linguistics was meant to be a holistic theory of language from the very start. In the 1950s, Halliday (1957) noted the need for a ‘general linguistic theory’ that would be holistic enough to guide empirical research in the broad discipline of linguistic science:

… the need for a general theory of description, as opposed to a universal scheme of descriptive categories, has long been apparent, if often unformulated, in the description of all languages (Halliday 1957: 54; emphasis in original).

If we consider general linguistics to be the body of theory, which guides and controls the procedures of the various branches of linguistic science, then any linguistic study, historical or descriptive, particular or comparative, draws on and contributes to the principles of general linguistics (Halliday 1957: 55).

This call for a general theory was echoed by other scholars, notably the American psycholinguist Charles Osgood (cf. Osgood 1966). Halliday (1957) aligned the principles of such a theory with the system and structure framework that was being developed by J. R. Firth (see e.g. Firth 1957). As Fig. 1 shows, the demands on the theory he envisaged broadly cover the core aims of linguistic science (i.e. the horizontal dimension) and the scope of the material typically covered in different linguistic investigations (i.e. the vertical dimension).

Fig. 1 Empirical scope of General Linguistic theory (Halliday 1957: 56)

There were doubts during the 1950s as to whether the universal dimension of language was ‘on the agenda’ of linguistic investigation (Halliday 1957: 56). In the 1960s, however, the historic Conference on Language Universals put this firmly on the agenda (cf. the contributions in Greenberg 1966a). Recent studies have further opened up new frontiers of linguistic research on the evolution of human language (e.g. Heine and Kuteva 2002a, 2007).

Halliday’s (1961) Categories of the Theory of Grammar was a response to the need for a general theory of language (also see Dixon’s (1963) formulation of it in logical terms). Following J. R. Firth (e.g. Firth 1957), he named the framework sketched in this article ‘General Linguistic theory’ and it gave birth to what became known as scale and category grammar and, subsequently, systemic functional linguistics (cf. Matthiessen 2007b). Further following insights from Hockett’s (1965) notions of ‘deep grammar’ and ‘surface grammar’ and partly in response to transformational-generative grammar (e.g. Chomsky 1966), Halliday (1966a) emphasised the natural relationship between grammatical meaning and grammatical structure and considered meaning as the underlying essence of language, which structure realises (see also Hopper (1987, 1996); Bybee et al. (1994) for similar views).

With this perspective, priority was given to the paradigmatic organisation of language as the primary focus of linguistic description, with structure analysed subsequently as a realisation of features (i.e. grammatical meanings such as ‘present’, ‘past’ and ‘future’) in systems (such as TENSE and TRANSITIVITY). This theoretical development is very important to language typology. By giving primacy to meaning, the theory is free from the constraints of the structure of any one language (Matthiessen 2014). Linguists are then able to explore the differences and similarities across languages in their realisation of various grammatical meanings, since languages tend to be more similar in the range of meanings they construe than in their structural realisation of these meanings. The need for such a meaning-oriented approach to language, particularly in multilingual research, has also been articulated by experienced functional typologists since the 1960s (cf. Jakobson 1966; Croft 1990; Haspelmath 2010a, b).

The basic theoretical formulations Halliday (1961) outlines were partly informed by his description of Chinese (e.g. Halliday 1959; Halliday 2005a) and became the basis for the descriptions of English (e.g. Halliday 1967a, b, 1968, 1974, 1984a, b; Halliday and Matthiessen 2014) and other languages (e.g. Huddleston and Uren 1969; Hudson 1973). From the very beginning, therefore, theory and description have been kept apart as two complementary resources for linguistic research and its application (Halliday 1996, 2008; Caffarel et al. 2004: Ch. 1; Matthiessen 2007b). The formulation of the relationship between theory and description is presented in Fig. 2.

Fig. 2 Analysis, description, comparison and theory in relation to one another (Matthiessen 2013a: 141), used by permission of ©Equinox Publishing Ltd. 2013

Theory is a designed system, consisting of concepts that are systematically organised towards achieving potentially explicit goals (Halliday 1996, 2009). Specifically, SFL theory is designed as an enabling resource to guide particular descriptions, either of individual languages or a number of languages. The theory does not posit universal linguistic categories or structures for language; rather it provides a road map for identifying, describing and profiling categories and structures of particular languages, or any number of languages, in a systematic manner. For instance, although the theory posits that every language organises its lexicogrammatical resources into a fixed, identifiable number of ranks, it does not claim the universality of specific ranks such as clause, phrase/group, word and morpheme (Caffarel et al. 2004: Ch. 1). Every language is considered as a unique manifestation of the semiotic system called language and ‘categorial’ (or class) and structural labels must emerge from the context of the actual description of a language, although these categories are often based on a transfer comparison from existing descriptions of other languages (cf. Caffarel et al. 2004: Ch. 1). Likewise, the extent to which languages are different and/or similar in terms of lexicogrammar and any aspect of language, for that matter, must emerge from the context of typology-oriented descriptions, descriptions that are essentially informed by a comparison of different linguistic systems.

This differentiation of a general theory of language from universal regularities of linguistic phenomena corroborates Osgood’s (1966: 300) conception of theory as “a higher level description”, defined as “a set of principles, which economically and elegantly, encompasses the whole set of functions” displayed by human languages. In this regard, Osgood (1966: 301–302) emphasised the need to distinguish between two kinds of universals in linguistic research as follows:

U 1. Phenotypes: empirical generalizations that hold for all languages. […]

U 2. Genotypes: theoretical generalizations, principles in a theory of language behaviour, that hold for all languages … the fundamental laws governing the production of semantic regularities, the production of grammatical regularities, the source of language change (emphasis in original).

Systemic functional theory maps out the ‘genotypic’ constitution of language. It is in this sense that it is a ‘general theory of language’ (cf. Halliday 1961, 2009). As Fig. 2 shows, the two generalisations, ‘phenotype’ and ‘genotype’, are, however, not unrelated; theory is an abstraction from various descriptions, or, in other words, observable regularities across languages (see also Caffarel et al. 2004: Ch. 1; Matthiessen 2007b, 2013a).

When we move a step further down from theory, we observe that description itself is an abstraction of, or rather a generalisation from, the analysis of particular text instances in language (see Fig. 2) and, in the context of typology, a generalisation from descriptions of various languages. Language is observable as text, defined as spoken or written discourse (and, by extension, other semiotic resources such as sign language or image). Text, therefore, serves as the entry point for investigators into the linguistic system they want to describe. A comprehensive description will be based on texts across different registers in the relevant speech community. However, a description may also be limited to some register (i.e. a sub-system) in the community (see e.g. Patpong (2006a) for a description of Thai folktales). In either case, the description needs to be informed by the analysis of particular discourses, and the analyst will have to shuttle between developing general categories and features (i.e. description) and testing them on text instances (i.e. analysis).

Descriptions are considered as resources that can be applied in solving problems that are linguistic in nature. SFL descriptions have mainly been used in the context of education (e.g. Martin et al. 1987; Torsello et al. 2005; Rose and Martin 2012). They have also been applied in translation (cf. Steiner 2002; Kunz et al. 2014 and references therein), literary stylistics (cf. Lukin and Webster 2005; Simpson 2014; Mwinlaaru 2014; Lukin 2015), computational linguistics (cf. Fawcett 1981; Henrici 1981; Matthiessen and Bateman 1991; Fawcett et al. 1993; O’Donnell and Bateman 2005), healthcare (e.g. Armstrong et al. 2005; Slade et al. 2008, 2015; Matthiessen 2013b), judicial contexts (e.g. Martin et al. 2013), (critical) discourse studies (e.g. Cloran et al. 2007; Edu-Buandoh and Mwinlaaru 2013) and biblical studies (e.g. Xue 2015).

It is important to distinguish between descriptions and their applications in different contexts (cf. Matthiessen 2007b, 2013a; Thompson 2013). This is because some descriptions are particularly aimed at solving problems in specific contexts and these have the tendency of being narrow and constrained in such a way that they may limit our understanding of the language described. This observation does not undermine the relevance of context-specific descriptions. It rather draws attention to the need to differentiate between what is described, language, and the phenomenon for which the description is done (Halliday 2009). An investigator who does not make this distinction may end up describing the former as though it were the latter.

One area in which this distinction has proved useful in the SFL context is computational application and artificial intelligence. In these areas, descriptive categories have had to be converted into mathematical formalisations and abstractions in order to make them explicit and machine-friendly (cf. Henrici 1981; Matthiessen and Bateman 1991; Fawcett et al. 1993; O’Donnell and Bateman 2005). These abstract formalisations render the description most effective for application in this context. However, they cannot be taken as the best descriptive representations of language. O’Donnell and Bateman (2005), for instance, note two competing motivations in the computational applications of systemic functional grammar. While text generation is richly enhanced by the complex functional diversity in SFL descriptions, this complexity poses a challenge to computational parsing, which favours a less complex, syntagmatically oriented grammar. Applications in this latter context would, therefore, have to simplify the grammar of natural language by pushing it towards form (or syntagm), by focusing on one functional dimension of structure such as the modal structure of the clause (which, for English, is Subject + Finite + Predicator + Complement + Adjunct), or even by some combination of the two.

It must be noted, however, that applications are a testing ground for the power, or the reliability and validity, of language descriptions. They inform descriptions and guide descriptive linguists in revising their analysis. One thing the various applications of systemic grammars have taught us is to make our description of categories and their identifying characteristics as explicit as possible, although explicitness is achievable only to the extent that the fuzzy nature of language allows. As an appliable theory, SFL takes language in its social context as its primary object of inquiry while being sensitive to the other systems language interacts with (i.e. physical, biological and social systems) as well as to the potential uses of descriptions.

Systemic functional language typology

In this section, we proceed to discuss the characteristics and scope of systemic typology. Due to space constraints, the discussion will mainly be based on issues related to language typology rather than on SFL theory itself (for a detailed discussion of the theory, see e.g. Halliday 1966a, b, 2003, 2005b, 2008; Martin 1987, 2013, 2016; Matthiessen 1992, 2007, 2015a; Halliday and Matthiessen 1999; Matthiessen and Halliday 2009). Works that are directly related to our present discussion are Caffarel et al. (2004: Ch. 1); Matthiessen et al. (2008); Matthiessen (2014, 2015b); Teich (2002) and Teruya and Matthiessen (2015).

We should mention from the outset that systemic typology is not a separate sub-discipline of linguistics but, rather, a theoretical approach to the broad research agenda of contemporary language typology as developed by Joseph Greenberg (e.g. Greenberg 1966b) and extended by Talmy Givón (e.g. Givón 1983), Bernard Comrie (e.g. Comrie 1976) and others (e.g. Hopper and Thompson 1980; Heine et al. 1991; Dryer and Haspelmath 2013). Systemic typology essentially belongs to this functional-typological tradition (see further below for details). In other words, it is the study of language typology from the perspective of the theoretical dimensions of language recognised in SFL.

This means that, first, research in systemic typology is sensitive to the metafunctional diversity in language (see Fig. 3). That is, both linguistic meanings and their realisations by form or structure simultaneously embody three core functions, namely ideational (i.e. the representation of experience and logical relations), interpersonal (the enactment of social identities, roles and relationships in discourse) and textual (the packaging of information flow into a consumable text). Second, it recognises that the strata of language are related as a hierarchy of realisation, comprising, in their respective order of realisation, semantics (including pragmatic meaning), lexicogrammar, phonology and phonetics, all embedded in social context. In addition, it considers the internal organisation of each stratum of language as a rank scale of units related by constituency (e.g. clause, phrase/group, word and morpheme for English lexicogrammar). Further, it organises the realisation of linguistic systems (e.g. MOOD, INFORMATION FOCUS) into a scale of delicacy such that the description of features within one system tends to be cumulative (e.g. indicative: declarative: non-exclamative/exclamative for the English MOOD system). Finally, it takes description to be a generalisation of the linguistic characteristics of text instances or, in other words, the regularities of the uses of linguistic forms in discourse.
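The scale of delicacy mentioned above can be pictured as a small tree of choices. The following is a minimal sketch in Python, not a representation used in the SFL literature cited here: the names `MOOD`, `selection_expressions` and `delicacy` are hypothetical, and the network is a deliberately reduced fragment of the English MOOD system. Only the path indicative: declarative: exclamative/non-exclamative comes from the text; the features imperative and interrogative are added from standard accounts of English purely to complete the illustration.

```python
# Toy fragment of a system network (illustrative only).
# Each feature maps to the more delicate features it opens up;
# an empty dict marks a feature with no further choices in this sketch.
MOOD = {
    "indicative": {
        "declarative": {"exclamative": {}, "non-exclamative": {}},
        "interrogative": {},
    },
    "imperative": {},
}


def selection_expressions(network, path=()):
    """Yield every complete path from the least to the most delicate feature."""
    for feature, subsystem in network.items():
        new_path = path + (feature,)
        if subsystem:
            # The feature is an entry condition for a more delicate system.
            yield from selection_expressions(subsystem, new_path)
        else:
            yield new_path


def delicacy(expression):
    """Number of choices made along the scale of delicacy."""
    return len(expression)


expressions = list(selection_expressions(MOOD))
for expr in expressions:
    # Features accumulate as delicacy increases.
    print(": ".join(expr), "| delicacy =", delicacy(expr))
```

Each printed line is a selection expression in which the more delicate feature presupposes all the features to its left, which is the sense in which description within one system is cumulative.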

Fig. 3 The hierarchy of stratification showing metafunctional diversity and the rank scale of English lexicogrammar

SFL considers language typology to be part of a broader research area, namely multilingual studies (see Fig. 4). In addition to language typology, multilingual studies comprise contrastive analysis, comparative linguistics and, as a limiting case, the description of individual languages other than English (Caffarel et al. 2004: Ch. 1; Matthiessen et al. 2008; Teruya and Matthiessen 2015).

Fig. 4 Description, contrastive analysis, comparison and typology in relation to sample size and instantiation (Matthiessen et al. 2008: 149), used by permission of ©Bloomsbury Publishing Plc.

As Fig. 4 shows, these different domains of multilingual studies can be situated along a cline, with typology and the description of single languages at the extreme poles of the continuum (cf. Matthiessen et al. 2008). The relationship between the two extremes is analogous to that between ‘text analysis’ and ‘language description’ as discussed under “Linguistic theory, language description and application” above. Typology starts with the systematic description of the linguistic systems of individual languages, and these descriptions are then compared to identify cross-linguistic or universal tendencies for generalisations about language. In essence, the description of individual languages normally has the added objective of making typological statements about the language described (Caffarel et al. 2004: Ch. 1). Languages that have been described from this perspective include, inter alia, Arabic (Bardi 2008), Bajjika (Kumar 2008), Chinese (Tam 2004; Li 2007), French (Caffarel 2004, 2006), German (Steiner and Teich 2004), Japanese (Teruya 2007), Oko (Akerejola 2005), Pitjantjatjara (Rose 2004a, b), Spanish (Quiroz 2008, 2013), Tagalog (Martin 1996a, b) and Thai (Patpong 2006a).

Systemic linguistics defines the main goal of language typology itself as identifying cross-linguistic regularities in systems (e.g. MOOD, MODALITY) and their realisations. This approach follows the tradition exemplified by Comrie’s (1976) study on aspect and Hopper and Thompson’s (1980) famous study on transitivity in discourse. Within SFL, notable examples of this aspect of typology are Matthiessen (2004); Teruya et al. (2007); Matthiessen et al. (2008); Wang and Xu (2013) and Teruya and Matthiessen (2015). Besides systems, however, systemic typology also profiles, among other things, grammatical units across languages (e.g. Sutjaja 1988; Mock 1969; Boxwell 1995; Ochi and Lam 2010; Matthiessen et al. 2016) and the lexicogrammatical resources that construe particular domains of experience such as motion, emotion and space (e.g. Matthiessen and Kashyap 2014; Matthiessen et al. 2015).

Between the typological and language-specific poles of linguistic description, we can locate contrastive analysis and comparative linguistics (see Fig. 4). Contrastive analysis dates back to the work of Robert Lado in the 1950s. It was originally introduced to support the development of materials and techniques for second language education (Lado 1957). In simple terms, it systematically compares a description of the native language of a second or foreign language learner with a description of the target language to be learned in order to predict areas of familiarity and difficulty for the learner. This approach was heavily criticised by transformational-generative theorists and was replaced by error analysis (Corder 1967) and interlanguage studies (e.g. Hyltenstam 1977).

As Matthiessen (2015b) notes, however, over the last two decades there has been a renewed interest in contrastive analysis, particularly in the contexts of translation studies (including machine translation) and multilingual language description. This resurgence of contrastive analysis is of interest to SFL as an appliable theory of language. Contrastive analysis is integrated into SFL’s descriptive scope as part of its multilingual project and, so far, studies have contrasted English with languages such as Spanish (e.g. Lavid et al. 2010), Swedish (Holmberg and Karlsson 2006) and French (Caffarel 2006). There is also a new interest in applying SFL contrastive analysis in developing materials for language education. An example of this is the ongoing Bilingual Academic Language Development (BALD) project on English and Spanish by Andrés Ramírez (Florida Atlantic University) and Luciana de Oliveira (University of Miami) (cf. Ramírez 2015, personal communication).

While contrastive analysis often involves two languages, ‘language comparison’ often involves more than two languages. As Fig. 4 shows, two approaches to language comparison can be identified in linguistic science, namely ‘comparative linguistics’ and ‘cross-cultural pragmatics’ (cf. Matthiessen et al. 2008; Teruya and Matthiessen 2015).

Comparative linguistics began to develop in the 18th and 19th centuries and became popular in the 1960s and 1970s as an approach to historical linguistics. Its objective is to compare a group of languages that are either genetically or geographically related in order to reconstruct their diachronic development and/or a common ancestor. It has been important in clarifying the grouping of languages into families. Examples are Givón’s (1971a; 1971b; 1975; 1979) studies on Bantu languages. To date, no comparative linguistic study, in its traditional sense, has been done within the context of SFL (but see Teich (2002) for a comparative study of five European languages in the context of translation). However, as studies on languages within the same language family or region are growing, this could be a rich research area for applying systemic functional theory. For instance, a comparative description of systems and regularities in their realisations, as exemplified by Matthiessen (2004), could be an important contribution to testing and validating the classification of languages into families.

Although language comparison has traditionally been associated with ‘comparative linguistics’, studies in cross-cultural pragmatics have revealed a rich body of knowledge on semantic (including pragmatic meaning) and lexicogrammatical regularities in enacting relationships and identities across languages and cultures (see Matthiessen et al. (2008) for a detailed discussion). Classic research in this area includes Brown and Gilman (1960) and Brown and Ford (1961) on address terms, and Brown and Levinson (1987) on politeness. The preoccupation with interpersonal meaning and lexicogrammatical resources in this tradition shares much with the attention given to MOOD and resources for negotiation and modal assessment across languages in systemic research (cf. Matthiessen 2004; Teruya et al. 2007; Quiroz 2013).

It should be emphasised that, whatever approach is adopted, multilingual research (including typology) in SFL emphasises a holistic or ecologically sensitive approach to the language(s) described. This can be understood in two ways. First, description should ideally be based on (or, at least, be sensitive to) whole systems (e.g. ASPECT, EVIDENTIALITY, PROJECTION), units (e.g. noun class, verbal group, clause) or experiential domains (e.g. motion, possession, space), to mention just a few. In this way, the description of individual lexicogrammatical items is meaningfully related to their systemic environment rather than being presented as isolated fragments. This avoids reducing description to, in Weinreich’s (1966: 144) term, the ‘atomization’ of linguistic items in want of explanation.

The second way is to have a ‘trinocular’ perspective on the forms described: from above, from below and from roundabout (Halliday 1996). On the dimension of stratification and with the clause as our unit of analysis, this means examining: (i) the meaning (e.g. speech function) realised by the configuration of elements such as Subject and Finite in the English clause; (ii) the realisation of these elements by lower units such as group/phrase and word, or phonologically; and (iii) how the elements of the clause itself are patterned for the realisation of the particular meaning. On the dimension of instantiation, a trinocular vision means examining whether a grammatical or linguistic phenomenon is: (i) systemic, i.e. it applies to the whole language as a system; (ii) instantial, i.e. it is limited to a particular context of situation; or (iii) register or genre specific, i.e. it is a characteristic of a cultural institution or a situation type. Finally, on the dimension of axis, a trinocular vision requires the analyst to identify: (i) the systemic feature a particular structure or element realises or, in Saussurean terms, its valeur; (ii) the morphological and/or phonological realisation of the element (or elements) in a structure; and (iii) other features in the systemic environment. The purpose of a trinocular vision is to ensure a fuller and more reliable description.

Systemic typology in relation to other approaches

SFL, in general, and systemic typology, in particular, have developed in interaction with many different approaches to language study. We limit our discussion here to connections with a few other functional approaches that interact with and/or influence typology research in systemic linguistics.

The immediate context for the concern with and development of language typology in SFL is the multilingual orientation of Firthian linguistics in the 1950s and 1960s (cf. Firth 1957) as well as Michael Halliday’s initial preoccupation with the description of Chinese (e.g. Halliday 1956, 1959).

The basic outline of the research agenda for language typology in SFL is, however, provided by Prague school typology. We can identify a few currents that research in systemic typology inherits from the Prague school. These are the importance placed on theory as an enriching and empowering resource in typological generalisations; emphasis on the functional or meaning-oriented interpretation of language phenomena; the orientation towards paradigmatic organisation (i.e. systems); the conception of language typology as a multidimensional mapping of languages; and the emphasis placed on empirically based typology (cf. Jakobson 1966; Teruya et al. 2007). In a classic paper published in the landmark report of the Conference on Language Universals, Jakobson (1966) stated what can be regarded as the primary goal of typology in SFL and is worth quoting here:

A cautious and unremitting search for the intralingual and therewith interlingual semantic invariants in the correlations of such grammatical categories as, for example, verbal aspects, tenses, voices, and moods becomes indeed an imperative and perfectly attainable goal in present-day linguistic science. This inquiry will enable us to identify equivalent grammatical opposition [i.e. systems] within “languages of differing structure” and seek the universal rules of implication which connect some of these oppositions [i.e. systems] with one another (glosses ours, p. 272).

He continues:

We most urgently need a systematic world-wide mapping of linguistic structural properties: distinctive features, inherent and prosodic–their types of concurrence and concatenation; grammatical concepts [i.e. grammatical features, in SFL terms] and the principles of their expression [i.e. realisation, in SFL terms]. (glosses ours, p. 274)

The pursuit of the typological aspect of these goals in systemic linguistics has, however, been slow, partly due to practical reasons (see Caffarel et al. 2004: Ch. 1 for details), and partly because description of individual languages antedates language typology of the kind described here.

Matthiessen’s (2004) classic paper, which maps out systemic generalisations and ‘implicational motifs’ (rather than ‘rules’ as in the quote from Jakobson (1966) above), drawing on at least 160 languages, is an enormous contribution to this goal. Matthiessen (2004: 544–553) provides an alternative analysis to Greenberg’s (1966b) word-order typology. He shows that the variations in the syntagmatic sequence of elements in the clause reflect different strategies by which languages manage metafunctional diversity in the clause. This insightful account resonates with Casagrande’s (1966) proposal for the study of cross-linguistic regularities in the sequencing of clause elements as ‘universal alternatives’, which he describes as “a limited set of alternative solutions to a problem, one or more of which may be used in a particular language” (Casagrande 1966: 290; see also Fillmore 1968: 2 for a related position). In our context, the semiotic problem at hand with regard to word order is the need to manage interaction and enact negotiation in the clause.

Matthiessen (2004), among other things, reveals that the placement of the Predicator in the clause tends to correspond with or suggest the position in the clause where interpersonal meaning is prominent. In languages where the Predicator typically occurs at the beginning, such as Arabic and Tagalog, the initial position of the clause is normally interpersonally prominent and, for that matter, interpersonal resources such as finiteness markers and negotiation particles are also typically placed in this position. In languages such as Japanese where the Predicator is final, interpersonal meaning is prominent in final position. Languages like English present a rather complex phenomenon. The Predicator occurs in medial position, while the Finite element either precedes (Finite ^ Subject ^ Predicator) or follows (Subject ^ Finite ^ Predicator) the Subject to show the mood contrast between interrogative and declarative clauses respectively.
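The English pattern just described can be stated as a simple check on the order of functional elements. The sketch below is illustrative only and is not an analytical procedure from Matthiessen (2004): the function names `predicator_position` and `english_mood` are hypothetical, clauses are assumed to be already analysed into lists of function labels, and only the declarative/interrogative contrast stated above is covered (it ignores, for example, interrogatives in which the Subject precedes the Finite).

```python
def predicator_position(elements):
    """Classify the placement of the Predicator as initial, medial or final."""
    i = elements.index("Predicator")
    if i == 0:
        return "initial"
    if i == len(elements) - 1:
        return "final"
    return "medial"


def english_mood(elements):
    """Read the mood contrast off the relative order of Finite and Subject.

    Finite ^ Subject -> interrogative; Subject ^ Finite -> declarative.
    """
    if elements.index("Finite") < elements.index("Subject"):
        return "interrogative"
    return "declarative"


# Hypothetical pre-analysed clauses, written as sequences of function labels.
declarative = ["Subject", "Finite", "Predicator", "Complement"]
interrogative = ["Finite", "Subject", "Predicator", "Complement"]

for clause in (declarative, interrogative):
    print(" ^ ".join(clause), "->", english_mood(clause),
          "| Predicator:", predicator_position(clause))
```

In both toy clauses the Predicator is medial, while the mood contrast is carried entirely by the order of Subject and Finite, which is the point made in the paragraph above.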

Outside SFL, the World Atlas of Language Structures (WALS), developed by a team of 55 scholars and maintained by the Max Planck Institute for Evolutionary Anthropology, is an enormous contribution to the semiotic mapping of languages as envisaged by Jakobson (1966) (cf. Dryer and Haspelmath 2013). This is a database of phonological, grammatical and lexical properties of a large number of languages based on reference grammars and other descriptive materials (see http://wals.info). It also includes rich typological information on a wide range of the world’s languages. Besides being a great scholarly contribution in its own right, WALS serves as a rich databank for research in language typology.

Systemic typology, in its current development, is part of the functional-typological tradition that developed from the work of Joseph Greenberg and follows the historic Conference on Language Universals (cf. Greenberg 1966a). This functional-typological tradition, in its loose sense, includes typological research from different functional approaches to language, such as cognitive grammar (including construction grammar), Functional Discourse Grammar (FDG) and Role and Reference Grammar (RRG). There is however a relatively distinctive research group within this tradition that is normally referred to as West Coast Functionalism (WCF), exemplified by the work of Talmy Givón, Paul Hopper, Sandra Thompson, John Haiman, Bernd Heine and their team of researchers (also see Butler 2003a). This research group has been at the forefront of language typology since the 1980’s. Table 1 outlines some key characteristics of studies in language typology within WCF, FDG and RRG in relation to SFL oriented typology.

SFL theory shares many theoretical principles with West Coast Functionalism, such as the lexis-grammar continuum, the polyfunctionality of linguistic items, the relatively non-arbitrary relationship between grammar and meaning, the organisation of language on the paradigmatic and syntagmatic axes, and the evolving nature of language, in general, and, in particular, the fact that grammar emerges out of language use. These similarities are hardly surprising since both approaches have roots in Prague school linguistics, especially through connections with Roman Jakobson.

Table 1 SFL in relation to other functional approaches to language typology

In terms of typology, both frameworks emphasise empirical typology based on large samples of languages; qualitative and quantitative analysis of texts from comparable registers across languages; and the importance of grammaticalisation in the study of languages (Teruya and Matthiessen 2015). It should, however, be noted that grammaticalisation has not been given enough attention in descriptive and typological research in SFL (cf. Taverniers 2015). An SFL contribution in this area would be a system-oriented study of grammaticalisation within and across languages. That is, how do the different features in the grammatical systems of languages, and their realisations, evolve? (See, however, Cummings’ (2015) analysis of the evolution of tense in English from an SFL perspective.)

One difference between SFL and WCF is SFL’s commitment not only to the meanings and functions of lexicogrammatical resources but also to a systematic analysis of the structure of clauses and other grammatical units (and, indeed, the units of other strata such as phonology) that realise these meanings. Such a structural analysis not only contributes to a fuller understanding of language form and function, but is also necessary for the description to be useful for various purposes of application.

But this interest in structure is one of the characteristics that connect SFL with FDG and RRG (see Table 1). Butler (2003a, b) highlights this connection by classifying the three frameworks as “structural-functional” within the functional approaches of linguistic science. There is, however, a difference between the conception and analysis of structure in SFL, on the one hand, and FDG and RRG, on the other hand, at least at the level of the clause. SFL analysis prioritises systems such as MOOD, ASPECT, and TRANSITIVITY and the delicate meanings (or features) that realise them (e.g. declarative and imperative for MOOD). Structure is then analysed as a realisation of a particular feature or a cluster of features, that is, its ‘reflex in form’. For example, in analysing the English clause in terms of MOOD, the first stage of the analysis shows a distinction between the indicative clause and the imperative clause by the presence of the elements Subject and Finite (i.e. a tense/modality verbal operator) in the indicative and the absence of these elements in the imperative, except in the marked realisation of you as imperative Subject. The indicative shows a further distinction in delicacy through the variation in the order of the clause elements in the declarative (i.e. Subject ^ Finite) and the interrogative (i.e. Finite ^ Subject), except, again, where the Subject of the interrogative clause is a Wh-item (as in What is your name?). On the other hand, both FDG and RRG enter the description through structure or the syntagmatic organisation of the clause rather than systems (see e.g. Van Valin 2000, 2007; Hengeveld and Mackenzie 2008).
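The system-first reasoning described above can be illustrated with a minimal sketch. It is not an implementation from the SFL literature: the function name `classify_mood`, the `wh_subject` flag and the list-of-labels input are hypothetical conveniences, and the sketch covers only the two steps in delicacy mentioned in the text for English (indicative versus imperative, then declarative versus interrogative).

```python
from typing import List


def classify_mood(elements: List[str], wh_subject: bool = False) -> List[str]:
    """Return the MOOD features of an English clause, least delicate first.

    `elements` is the left-to-right sequence of clause functions, e.g.
    ["Subject", "Finite", "Predicator"]. `wh_subject` (a hypothetical flag)
    marks interrogatives whose Subject is a Wh-item.
    """
    has_subject = "Subject" in elements
    has_finite = "Finite" in elements

    # First step in delicacy: the indicative has both Subject and Finite;
    # anything else is treated here as imperative (this also covers the
    # marked imperative with 'you' as Subject but no Finite).
    if not (has_subject and has_finite):
        return ["imperative"]

    # Second step: the order of Subject and Finite separates declarative
    # (Subject ^ Finite) from interrogative (Finite ^ Subject).
    subject_first = elements.index("Subject") < elements.index("Finite")
    if subject_first and not wh_subject:
        return ["indicative", "declarative"]
    return ["indicative", "interrogative"]


print(classify_mood(["Subject", "Finite", "Predicator"]))
print(classify_mood(["Finite", "Subject", "Predicator"]))
print(classify_mood(["Predicator", "Complement"]))
```

The point of the sketch is the direction of the analysis: features are chosen in a system, and the sequence of elements is read as the realisation of those features, rather than the structure being the entry point of the description.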

In addition, FDG and RRG aim to identify structural elements and develop a model of grammatical structure that is universally valid for all languages (see Haspelmath (2009) for a critique of this approach). SFL theory, on the other hand, does not claim universality for grammatical elements such as Subject, Actor or Theme (and even systems such as ASPECT, TENSE and MODALITY) nor does it claim universality in the order of elements in the clause or any linguistic unit, for that matter, as part of the theory of language. These linguistic phenomena and their universality or regularities across languages are posited as empirical questions and any claims thereby are descriptive motifs and generalisations based on descriptive materials (see e.g. Matthiessen 2004). A detailed comparison of these approaches is beyond the purpose of this study and interested readers are pointed to Butler’s (2003a, b) comprehensive review.

We should, however, indicate that the differences among the approaches identified in Table 1 largely emanate from the different goals of typology within the frameworks as well as differences in the definition of ‘theory’ and the relationship between theory and language description and/or typology. One danger of setting up comparisons like this is that it tends to consolidate differences and prevent useful cross-fertilisation. Our objective here is, however, to encourage reflection on approaches and foster collaboration to advance the common interest of linguistic science.

Design of the survey of empirical studies on systemic language description and typology

As mentioned earlier, the aim of this study is to examine theoretical developments and empirical studies in systemic language description and typology. Having discussed the theoretical issues in the preceding section, we proceed, in this section, to describe the methods and procedures employed in compiling the database for our survey of empirical studies. We first identify the specific objectives of the survey and then describe the sources of the studies in our database and the guiding criteria for selecting them.

Objectives

The survey covers two main aspects of systemic studies on language description and typology. First, it examines general characteristics of the field, comprising its historical progression, mode of publication of research, and the geo-linguistic scope the studies cover. Second, it examines the methodological procedures that the studies have adopted, including sampling across registers and analytical techniques such as quantitative and qualitative analysis.

The database of studies: sources and guiding criteria

The survey is based on a database of 131 studies published between 1969 and July 2014. The compilation of the database follows the methodology developed by Norris and Ortega (2000, 2006) for research synthesis and meta-analysis. The guiding principle of this methodology is to retrieve the relevant literature for the survey in a replicable and systematic manner and as widely as possible.

The studies were retrieved from four main sources. The first source is the library search engine of The Hong Kong Polytechnic University, OneSearch. In addition to using the interface of the search engine, we also consulted key databases it provides, comprising Scopus, Web of Science, ProQuest Dissertation and Theses Collection and ERIC. Second, Google and Google Scholar were also used. Notable keywords that were used in the search queries include ‘systemic functional linguistics AND language description’; ‘SFL AND functional typology’; ‘functional linguistics/grammar AND language description’ and their variants. Further, we purposively searched targeted journals, including Functions of Language, Functional Linguistics, Languages in Contrast, Linguistics and the Human Sciences, Language Sciences, Language Typology, and Studies in Language. Finally, we consulted the web pages of particular experts and researchers in the field and contacted some other scholars through personal communication.

In selecting the studies, we did not take into consideration the prestige of the source of publication, such as journal ranking or the perceived status of the publisher. The criteria for including or excluding a study from the database are outlined in Table 2. These criteria are only a strategy to keep our database to a manageable size by focusing on studies that are central to our topic. Particular comment needs to be made on the publication types that were included in the database. The general guiding principle was to include studies that are publicly accessible, which we defined as published materials and those available in online databases. This criterion excluded unpublished conference presentations. Two manuscripts that are in circulation among systemic researchers (Matthiessen 1987a, b) were included in the database. Halliday’s studies on Chinese in the 1950’s (e.g. Halliday and Ellis 1951; Halliday 1956, 1959) are excluded since we take Categories of the Theory of Grammar (Halliday 1961) as the beginning of systemic functional linguistics proper (cf. Matthiessen 2007b; Martin 2016).

Table 2 Guiding criteria for compiling the database

In spite of the efforts made to retrieve as many publications as possible, the database is certainly not exhaustive. For instance, although a few studies published in languages other than English are included, it was generally difficult to retrieve non-English medium publications, which was done mainly through expert consultation and personal contacts. In addition, it is difficult to keep track of the increasing number of publications in systemic language description and typology around the world, especially since the turn of the 21st century. Besides these two uncontrollable constraints, there was also the need to keep our references at a reasonable length. Thus, we adopted the qualitative research principle of saturation point, a strategy where data collection and analysis end at the point where further analysis does not yield new patterns or categories. Our saturation point was where the extra studies retrieved neither described languages other than those in our database nor made changes to our quantitative profile of studies across the different variables of our analysis (see Figs. 5, 6, 7, 8 and 9). The studies excluded in this way are mainly studies on Chinese and Spanish and are mostly published between 2004 and 2008, the period with the highest number of studies in our database (see Fig. 5). In fact, we found that there is a large body of systemic typology literature on Chinese and Spanish, much of which is published in these languages rather than in English, and we recommend further reviews on the coverage of studies on these languages in terms of systems and SFL dimensions of language (cf. Matthiessen 2007b). In spite of these constraints, we believe our data set is still representative of the languages covered in systemic language description, and our findings should be considered in light of the limitations indicated here.
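The saturation criterion described above can be sketched as a simple check. This is only one possible reading, under stated assumptions: the record layout (dictionaries with a `language` key plus one profile variable), the function names, and the decision to treat "no change to the quantitative profile" as "no change in the rank order of categories" are all hypothetical, and the records below are invented placeholders rather than entries from the database.

```python
from collections import Counter
from typing import Dict, List


def rank_order(counts: Dict[str, int]) -> List[str]:
    """Categories from most to least frequent (ties broken alphabetically)."""
    return sorted(counts, key=lambda k: (-counts[k], k))


def is_saturated(database: List[Dict[str, str]],
                 extra: List[Dict[str, str]],
                 variable: str) -> bool:
    """True if the extra studies add no new language and leave the
    rank order of categories for `variable` unchanged (an assumption)."""
    known = {study["language"] for study in database}
    if any(study["language"] not in known for study in extra):
        return False
    before = Counter(study[variable] for study in database)
    after = before + Counter(study[variable] for study in extra)
    return rank_order(before) == rank_order(after)


# Invented placeholder records, for illustration only.
database = [
    {"language": "Chinese", "mode": "book chapter"},
    {"language": "Chinese", "mode": "book chapter"},
    {"language": "Spanish", "mode": "journal article"},
]
extra = [{"language": "Spanish", "mode": "book chapter"}]
print(is_saturated(database, extra, "mode"))
```

A stricter reading would compare percentage shares within a tolerance rather than rank order; the check above is deliberately the simplest version that captures both conditions named in the text.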

Fig. 5 Number of publications from 1969 to 2014 across groups of five-year periods (total number = 131)

Fig. 6 Frequency distribution of studies across publication mode (total number of studies = 131)

Fig. 7 Frequency distribution of studies across languages (total number of studies = 131; total number of languages = 38, excluding the ‘Various’ category). Contrastive studies are counted twice, provided none of the languages described is English

Fig. 8 Percentage distribution of languages described by regions (total number of languages = 38)

Fig. 9 Number of languages described per phylum (total number = 38)

The findings of the survey are discussed in subsequent sections below, in accordance with the research objectives outlined under “Objectives” above.

Meta-analysis of studies

This section examines the characteristics of studies in SFL language description and typology. We first discuss the historical development of the field and then consider the avenues through which the studies are disseminated. We will then examine the descriptive coverage of the field by discussing the areal and genetic representativeness of the languages described and the research output on the individual languages. Finally, we will summarise the methodological procedures employed in the studies and highlight some available resources in this area. Our discussion will be supported by quantitative profiles in terms of frequency counts and percentage distribution.

Historical development

Figure 5 presents the frequency distribution of the publications in five-year intervals (see Appendix 1 for a list of the studies). It can be observed that systemic linguistics has been engaged with languages other than English since the late 1960’s. However, in the first fifteen-year period, from 1969 to 1983, only ten (7.6 %) studies are recorded. The number of studies only begins to rise around the mid 1980’s and reaches its peak between 2004 and 2008, a period which accounts for 44 (33.6 %) of the 131 studies in our database. Since 2009, however, there has been a decrease in research output although, compared to the earliest periods, a considerable number of studies (23, 17.6 %) are still recorded for this period. These findings generally show that, whereas systemic linguistics has been multilingual from the very start, it was not until a little over a decade ago that systemic linguists gave much attention to language typology and the description of languages other than English.
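The percentages quoted above follow directly from the counts and the database total of 131. A minimal sketch of the arithmetic is given below; the helper name `share` and the dictionary layout are illustrative choices, and only the three counts stated in the paragraph are used.

```python
TOTAL_STUDIES = 131  # size of the database reported in the survey


def share(count: int, total: int = TOTAL_STUDIES) -> float:
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100 * count / total, 1)


# Counts as stated in the text for three of the periods.
period_counts = {"1969-1983": 10, "2004-2008": 44, "2009-2014": 23}

for period, count in period_counts.items():
    print(f"{period}: {count} studies ({share(count)} %)")
```

The same helper applies to the other frequency profiles in the survey, since every percentage over studies is reported against the same total.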

We place this historical configuration of studies within the context of other developments within SFL since the 1960’s. As mentioned earlier, Michael Halliday’s description of Chinese in the 1940’s and 1950’s (e.g. Halliday and Ellis 1951; Halliday 1956, 1959) as well as the multilingual approach to language study J. R. Firth had established in London set the stage for the evolution of systemic functional theory, in general, and systemic multilingual studies, in particular. The 1960’s saw very important publications. Halliday (1961) outlines many of the dimensions of language and, by the mid to late 1960’s, the relationship between the paradigmatic and syntagmatic organisation of language had been presented in greater detail (cf. Halliday 1966a). This framework was further applied to the description of the grammar of TRANSITIVITY and THEME in English (Halliday 1967a, b, 1968), providing a model for the description of the lexicogrammar of other languages.

These developments find corresponding resonances in the descriptions of languages such as French (Huddleston and Uren 1969), Mbembe (Barnwell 1969) and Nzema (Mock 1969). These studies show how grammatical systems and/or grammatical units are organised systemically. Huddleston and Uren (1969), for instance, examine the system of MOOD in French, giving a sketch of declarative, interrogative, and imperative clauses and laying the foundation for later comprehensive descriptions of French lexicogrammar (e.g. Caffarel 2004, 2006). Mock’s (1969) study on Nzema (Niger-Congo: Kwa) also focuses on identifying and analysing grammatical units.

One important theoretical development in the 1970’s and into the 1980’s was the explicit articulation of the dimension of metafunction within the theory, although functional diversity was already present in the form of the analysis of the English clause (see Halliday 1961, 1966a). Halliday (1973) expounds the theory of metafunctions and shows the ontogenetic process in its manifestation in the language of the individual meaner. Also in the 1970’s and into the 1980’s, seminal publications on cohesion (Halliday and Hasan 1976) and the interpersonal systems of English (Halliday 1984) appeared. These contributions complemented the analysis of transitivity and theme in Halliday (1967a, b, 1968) and finally culminated in the publication of the first edition of IFG (Halliday 1985), the first comprehensive account of the lexicogrammar of any language using systemic functional theory. Other notable contributions in this period and earlier in the late 1960’s are studies on English phonology, particularly on intonation (e.g. Halliday 1967a, b), Halliday and Martin’s (1981) collection of systemic papers on key theoretical developments and descriptions and Halliday and Fawcett’s (1987) follow-up two volumes of contributions from key systemic scholars.

These developments together gave impetus to work on other languages, corresponding with a fair increase in typological work between 1984 and 1993 (see Fig. 5). Notable studies in grammar in this period include transitivity in Chinese (Long 1981), the study of transitivity (Indah 1985) and the nominal group (Sutjaja 1988) in Indonesian, and cohesion in Arabic (Aziz 1988). Key studies in phonology are studies on the phonological ranks and prosodic systems of Akan (Niger-Congo: Kwa), Telugu (Dravidian: South Central) and Zapotec (Oto-Manguean) by Matthiessen (1987a), Prakasam (1972) and Mock (1985) respectively.

In the 1990’s, further developments in SFL theory and descriptions of English continued to impact on multilingual and typological research. Halliday’s (1994) second edition of IFG and two other important works, namely, Halliday and Matthiessen’s (1995) elaborate account of ideational semantics and Matthiessen’s (1995) Lexicogrammatical Cartography of English Systems (affectionately called LexiCart), were published around this time. Halliday (1994) and Matthiessen (1995) particularly give us a rich complementary account of the lexicogrammar of English and a more explicit model for research on other languages. While Halliday (1994) focuses on the metafunctional organisation of lexicogrammar, Matthiessen (1995) uses system networks to demonstrate how lexicogrammatical resources are organised systemically and extend in delicacy.

Multilingual studies in this period correspondingly became more extensive, encompassing more language families, namely, Philippine (Martin 1990), Sinitic (McDonald 1992; Huang and Fawcett 1996), Germanic (Degand 1996), Romance (Caffarel 1992; Downing 1996; McCabe 1999), Finno-Ugric (Shore 1992, 1996), Bunuban (McGregor 1990, 1992a, 1996), Western Desert (Rose 1993, 1996, 1998) and Papuan (Boxwell 1990, 1995). Huang and Fawcett (1996) and Downing (1996) are important as some of the earliest studies to adopt a contrastive approach, involving Chinese and English, and Spanish and English, respectively.

This extensive work is reflected in the increase in the frequency of studies from 1990 into the early 2000’s, as shown in Fig. 5. However, in terms of content, many of these studies are still not as elaborate or holistic as the descriptions provided in Halliday (1985, 1994) and Matthiessen (1995). Many studies still focused on transitivity and theme, as in the 1960’s and into the 1980’s. Boxwell (1990, 1995), however, gives a detailed analysis of cohesion in Weri (Trans-New Guinea: Southeast Papuan), specifically on co-referentiality and ellipsis. He shows how the Weri data differ from Halliday and Hasan’s (1976) account of English. Some scholars (e.g. Martin 1990; Caffarel 1995) had also begun to give attention to the interpersonal resources of the lexicogrammar. Caffarel (1996) gives a comprehensive overview of French grammar rooted in discourse. Studies also appear on tense in French (Caffarel 1990, 1992) and Spanish (Downing 1996) in this period.

Notably, the year 1992 records the publication of the first edited volume on systemic phonology, which is a landmark for typological studies in phonology (cf. Tench 1992). This volume includes descriptions of six languages other than English: Arabic (Eddaikra & Tench 1992), Chinese (Halliday 1992; Lock 1992), Gooniyandi (McGregor 1992a), Telugu (Prakasam 1992), Swahili (Maw 1992) and Welsh (Kelly 1992), excluding Oladeji’s (1992) phonostylistic analysis of Yoruba lullabies.

As indicated earlier, the period between 2004 and 2008 is the peak of systemic typology and language description, recording 44 (33.6 %) of the studies in our database. This rise is a result of the interest in typological work in the preceding decade and can be attributed to the publication of the notable contributions on lexicogrammatical profiles in Caffarel et al. (2004), which includes descriptions of nine individual languages and a chapter on a cross-linguistic survey of typological motifs and generalisations (Matthiessen 2004). In addition to Caffarel et al. (2004), other notable publications appear in the same year, including Rose (2004b), Lavid and Arús (2004), Arús (2004), and Andersen (2004). Lavid and Arús (2004) and Arús (2004) are particularly important for also contrasting Spanish with English, a further development on Downing (1996).

The period after Caffarel et al. (2004) continues to record a steady stream of studies on systemic typology and language description. Twenty-nine (22.1 %) studies are recorded in the four-year period from 2005 to 2008, and 23 (17.6 %) studies are recorded from 2009 to 2014.

Whereas some of the output in the 21st century continues to be produced by scholars previously engaged in the field, new researchers have also emerged (e.g. Akerejola 2005, 2008; Teruya 2007; Kim 2007; Yang 2007; Patpong 2006a; Bardi 2008; Banks 2010; Quiroz 2008; Choi 2013) and new languages, mostly Asian, are described, such as Japanese (e.g. Teruya 2007; Thomson and Armour 2013a, 2013b), Korean (Kim 2007; Choi 2013; Park 2013), Oko (Akerejola 2005), Bajjika (Kumar 2009), Thai (Patpong 2006a), Telugu (Prakasam 2004), German (Steiner and Teich 2004), Arabic (Bardi 2008) and Swedish (Holmberg and Karlsson 2006).

We proceed, in the next section, to examine the mode of dissemination of these studies within the academic community.

Mode of publication

The genre for disseminating studies in systemic language description and typology is very important. First, it indicates the depth and comprehensiveness of the descriptions of the languages. SFL places importance on descriptions that not only contribute to intellectual findings on language typology but are also comprehensive enough to provide useful language material for application in critical contexts of the community life of the language users, such as education, translation, computational applications, forensic applications, and discourse analysis. In addition, the nature of the outlet of publications affects the extent to which they are visible to the community of linguistic science.

Six publication types are identified in our database, namely, books, book chapters, conference papers, journal articles, manuscripts and theses. As Fig. 6 indicates, the top three outlets of systemic language description and typology are book chapters (50, 38.1 %), journal articles (35, 26.7 %), and theses (24, 18.3 %).

Research articles, apart from being the most prestigious academic literature, are perhaps the best way to widely disseminate research findings. Book collections or edited volumes also allow writers to bring together their descriptions and make them accessible to readers in one document. The space constraints that articles and chapters in edited volumes pose, however, do not allow a comprehensive description of languages. Many research articles and book chapters in our database are, nonetheless, summaries and preliminary findings of more comprehensive descriptions (see e.g. the contributions in Caffarel et al. 2004: Ch. 1). It is, however, still useful for systemicists to publish more aspects of their descriptions in journals in order to make them more visible, especially to scholars working outside SFL.

One way by which systemic language description and typology has expanded and developed is through postgraduate research. Several scholars, notably Michael Halliday, Christian Matthiessen and Jim Martin, have helped produce a number of descriptions through PhD supervision. Following the multilingual tradition established by J. R. Firth, Michael Halliday supervised PhD theses on languages other than English in the 1960’s, including Barnwell’s (1969) account of Mbembe and Mock’s (1969) analysis of grammatical units in Nzema (see Teruya & Matthiessen for further discussion). These studies became some of the earliest systemic studies on the description of languages other than English.

In the 1990’s and especially at the turn of the 21st century, Christian Matthiessen supervised a number of PhD theses on several languages at Macquarie University (and now at The Hong Kong Polytechnic University), including, among others, Caffarel (1996) on French, Teruya (1998) on Japanese, Thai (1998) on Vietnamese, Tam (2004) and Li (2007) on Chinese, Akerejola (2005) on Oko, Patpong (2006a) on Thai, Kim (2007) on Korean, Bardi (2008) on Arabic, and Kumar (2009) on Bajjika. Many of these are comprehensive accounts of these languages, using Halliday’s Introduction to Functional Grammar as a model.

Some of these PhD theses have gone on to yield books on systemic functional grammar. One of the early works is McGregor’s (1990) description of the grammar of Gooniyandi. In 2006, Continuum (now part of Bloomsbury Publishing Plc) launched a book series on systemic functional grammars and has since published descriptions of several languages, notably Caffarel (2006), Li (2007), Teruya (2007) and Lavid et al. (2010). Other book-length contributions include Andersen et al.’s (2001) description of Danish, Holmberg and Karlsson’s (2006) grammar of Swedish and Rose’s (2001) study on Pitjantjatjara.

As noted by Caffarel et al. (2004: Ch. 1), one challenge of systemic typology is the fact that it takes a great deal of time and effort to produce a comprehensive, book-length description, and a lot of collaboration is needed in this area. Comprehensive descriptions, as mentioned earlier, are crucial for application in discourse and text analysis and critical contexts such as language education.

Representativeness: areal and genetic coverage

One important consideration in language typology is a representative sample of languages. This is important since the goal of typology is to make descriptive and theoretical generalisations about language. Representativeness in language typology can be conceived of in three aspects: (1) the sample size should be large enough to allow generalisations about language (i.e. numerical representativeness), (2) the sample should include languages of different regions of the world (i.e. areal representativeness), and (3) the sample should include languages from every language family or, more feasibly, a variety of language families (i.e. genetic representativeness). Each of these three aspects will be addressed in relation to the linguistic coverage of studies in our database.
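The three aspects of representativeness listed above can be read off a language sample with a simple tally. The sketch below is illustrative only: the tuple layout and the function name `coverage` are hypothetical, the six languages and their family labels are taken from the short descriptions in Teruya et al. (2007) as reported later in this paper, and the region labels are our own assumption for the purposes of the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (language, region, family); region labels are assumptions for illustration.
SAMPLE: List[Tuple[str, str, str]] = [
    ("Danish", "Europe", "Germanic"),
    ("French", "Europe", "Romance"),
    ("Spanish", "Europe", "Romance"),
    ("Japanese", "Asia", "Japanese"),
    ("Thai", "Asia", "Tai-Kadai"),
    ("Oko", "Africa", "Niger-Congo"),
]


def coverage(sample: List[Tuple[str, str, str]]) -> Dict[str, object]:
    """Summarise numerical, areal and genetic coverage of a sample."""
    regions = Counter(region for _, region, _ in sample)
    families = Counter(family for _, _, family in sample)
    return {
        "size": len(sample),        # numerical representativeness
        "regions": dict(regions),   # areal representativeness
        "families": dict(families), # genetic representativeness
    }


summary = coverage(SAMPLE)
print(summary["size"], summary["regions"], summary["families"])
```

A tally of this kind only describes a sample; whether the sample is large or varied enough to support generalisations remains a judgement that depends on the typological question being asked.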

Figure 7 lists all the languages covered by studies on the description of individual languages in our database and indicates the number of studies on each of them (typology-oriented studies are labelled ‘Various’). In all, 38 languages are covered by studies that describe individual languages. Chinese (21, 16 %) has the highest research output, closely followed by Spanish (15, 11.5 %) and then by Japanese (12, 9.2 %), French (11, 8.4 %) and, finally, Pitjantjatjara (6, 4.6 %) and Telugu (6, 4.6 %). Although these figures cannot be used to judge the relative comprehensiveness of the descriptions of the languages, they indicate that some languages are more productive or more engaged by researchers than others. Out of the 38 languages, only 12 (31.6 %) have comprehensive descriptions, namely Arabic, Bajjika, Chinese, French, Gooniyandi, Japanese, Korean, Oko, Pitjantjatjara, Spanish, Swedish and Thai. We define a comprehensive description as a book-length (or PhD thesis-length) account of grammatical systems of at least the three metafunctions of language: ideational (either experiential or logical, or both), interpersonal and textual.

Given that there are about 6000 languages spoken in the world today (Lewis et al. 2015; Hammarström 2015), the quantitative coverage of SFL language description is infinitesimal and needs to be expanded. The need to expand the description base of linguistic science is, however, not limited to SFL. It has been noted in the literature that many of the world’s languages are either undescribed or have not been sufficiently described to warrant inclusion in samples for language comparison and typology (see Caffarel et al. 2004: 60–61, Teruya and Matthiessen 2015 for discussions on language sampling). The need to describe more languages has become even more important since many minority languages are increasingly being displaced by dominant and regional ones (see Gikandi 2015; Hammarström 2015). Comprehensive language description, particularly from a functional linguistic perspective of the kind pursued by systemic linguists, is crucial for application in linguistic revitalisation. The inadequacy of formal syntax and documentary linguistics in revitalising critically endangered languages has been lamented (Austin 2015). SFL’s text-oriented approach to language, defined as a semiotic resource, can produce useful resources on endangered languages for revitalisation projects.

Figure 8 presents a percentage distribution of the areal coverage of the 38 languages covered in SFL language description while Fig. 9 gives a frequency distribution of the languages across phyla. As Fig. 8 shows, there is an uneven distribution of languages across regions (see also Teruya et al. 2007). The largest shares are from Asia (12, 31 %) and Europe (11, 28 %), followed by Africa (6, 16 %) and Oceania (6, 16 %). Languages spoken in the Americas and the Middle East are the least represented. It is obvious that a lot more research and collaboration are needed in order to approach anything close to areal representativeness in SFL description. Strictly speaking, none of the regions is adequately represented.

In addition, with the exception of Oko (Niger-Congo: Benue-Congo; cf. Akerejola 2005), studies on African and American languages cover small aspects of the languages rather than being comprehensive descriptions (see Atoyebi 2010 for a non-SFL reference grammar of Oko). Given that one third of the world’s languages (about 2,000) are spoken in Africa, many languages in this region need to be represented (cf. Heine and Nurse 2000; Lewis et al. 2015). Also, studies on Weri (Trans-New Guinea: South Papuan) are so far limited to cohesion and nominal constructions (Boxwell 1990, 1995). The need for areal representativeness calls for increased collaboration between systemic typologists and areal specialists or native speaker linguists around the world, including those working outside SFL.

The distribution of the 38 languages across language families yields 17 phyla (see Fig. 9). Most language families are represented by only one language. The exceptions are Germanic (i.e. German, Danish, Swedish and Dutch), Romance (i.e. French and Spanish) and languages of the Niger-Congo phylum (i.e. Akan, Beja, Mbembe, Nzema, Oko, Swahili). As indicated earlier, however, studies on Akan, Beja, Mbembe, Nzema and Swahili are phonological and/or grammatical sketches that need to be expanded.

Nonetheless, the figures show that there has been a growing effort among systemicists in describing languages other than English in the quest for typological generalisations about language.

Recent years have seen a number of studies in typological generalisations, which is a natural follow-up to the descriptive work of previous decades. It will be useful to discuss notable studies in this area in more detail. As mentioned earlier, Matthiessen’s (2004) book chapter gives a metafunctional profile of grammatical systems across languages and the different ways in which these systems are realized structurally. It spans 151 pages and makes reference to at least 160 languages, as well as a passing reference to about 27 language families. It is a classic contribution to the science of language, building on Greenberg (1966b) and research in functional language typology ever since (see Footnote 3). Matthiessen draws linguistic data from systemic language descriptions since the 1960’s and mainly from research in other functional traditions such as WCF, FDG and RRG. The chapter also includes an interesting discussion of aspects of Akan lexicogrammar, presented for the first time.

Following this, Teruya et al. (2007), Matthiessen et al. (2008) and Teruya and Matthiessen (2015) continue to give typological generalisations, focusing on interpersonal systems such as MOOD and MODALITY. Teruya et al. (2007) include short descriptions of six languages (Danish, French, Japanese, Oko, Spanish and Thai) across five families (Germanic, Romance, Japanese, Niger-Congo: Kwa, Tai-Kadai: Tai) as well as references to 24 other languages from secondary sources. One unique characteristic of this study is its combination of discourse analysis with ‘universal’ grammar. It also provides a good example of how the view from individual languages and that from the typology pole can be balanced in language description.

Wang and Xu (2013) provide a complementary account by focusing on the experiential metafunction. Specifically, they give a cross-linguistic account of existential and relational clauses (possessive and circumstantial types). While acknowledging that Michael Halliday’s IFG is meant to be a description of English, Wang and Xu (2013) problematise the universal applicability of dividing clauses that construe location into two different process types, relational and existential. Based on their cross-linguistic data (examples are given from about 21 languages), they argue for a universal classification of existential clauses as a sub-type of relational processes. We interpret their account as highlighting the competing interests of describing individual languages in their own right and describing languages as a manifestation of the one human semiotic system called language.

One limitation of Wang and Xu’s (2013) study, however, is that it limits the discussion of process types in the systemic literature to Michael Halliday’s account of English. It does not take into consideration the many descriptions of other languages that have been produced in the past few decades (see e.g. Caffarel et al. 2004), particularly Halliday and McDonald (2004) and Martin (2004), who similarly identify existential clauses as a sub-type of relational clauses in Chinese and Tagalog respectively. Matthiessen’s (2004: 580) generalisation may be worth quoting in this direction:

‘existential’ processes are variable; for example, while they can be described as one of the primary process types in English, there are reasons for treating them as a subtype of ‘relational’ processes in Chinese …

For future research similar to Wang and Xu (2013), therefore, it may be useful to investigate how languages divide up the different domains of experience in their grammar; for instance, how is location grammaticalized and divided up differently across languages?

Finally, Teich (2002) exemplifies an interface between linguistic description and application. Her data are translations of instructional texts from English into four other Indo-European languages: French (Romance), German (Germanic), Bulgarian and Russian (Slavonic). Her presentation of linguistic data in most parts is, however, more illustrative than descriptive and the general goal is to construct a theoretical framework for multilingual studies. From this latter perspective, it predates theoretical discussion in Matthiessen et al. (2008) and Teruya and Matthiessen (2015). Together, these typology-oriented studies complement the description of individual languages and contribute to the typological power of SFL theory.

In the next section, we will examine some methods and procedures adopted in systemic language description and typology, together with proposals that have been made in this regard.

Methodological considerations

We proceed to first discuss general issues on research design and then examine data sources (Section 6.1) and analytical techniques (Section 6.2). For convenience, the discussion in this section will give a summary of methodological issues identified in the literature rather than a profile of the studies based on particular methods and approaches.

Matthiessen (2015d) identifies five strategies in designing a study in language description and typology as follows (also see Haspelmath (2009) on theory and typological guidance):

  1. theoretical guidance
  2. typological guidance
  3. transfer comparison
  4. analysis of registerially informed sample of texts in context
  5. use of language consultants.

Figure 10 shows how the first four strategies interact in contributing to the description of a ‘new’ language. These strategies (or a combination of some of them) have been used by systemic researchers in the description of various languages. Theory provides a road map that guides the linguist in investigating the meaning potential of a speech community. As a general theory of language, SFL offers researchers the different dimensions of language as a complex system of meaning. As we mentioned earlier, systemic language description and typology have developed hand in hand with theory building. Studies become more comprehensive as the theory expands to cover many dimensions of language.

Fig. 10 Strategies in designing the description of a “new” language (this figure is taken from a keynote presentation by Matthiessen (2015d), available to us through personal communication)

In addition to theory, one crucial resource is descriptive generalisations of languages in typological studies. Here, the researcher deploys attested cross-linguistic tendencies in language as a guide to the analysis and interpretation of the language or languages under description. It also means that any descriptive statements made for a particular language must be typologically valid; that is, they should make sense in terms of what is known about human languages as attested by typological investigations.

Related to typological guidance is the technique of transfer comparison (cf. Caffarel et al. 2004, Ch. 1). Here, the analyst may identify model descriptions that serve as a window into the new language. This is often a feasible strategy to manage the enormity of work involved in language description. As discussed earlier, early SFL descriptions were modelled on Halliday’s description of Chinese and English in the 1950’s and 1960’s. In recent years, the various editions of IFG (Halliday 1985, 1994; Halliday and Matthiessen 2004, 2014) and Matthiessen’s (1995) LexiCart have provided comprehensive models and guides for the description of languages other than English. As descriptions of many more languages continue to emerge, it is best to work with models from a number of languages in order to avoid the possibility of imposing the categories of one language upon another, a recurrent albeit unfortunate tendency among linguists, even in contemporary times. Apart from descriptive models in the SFL tradition, it is also important to consult descriptions from non-SFL perspectives. These may include earlier descriptions or sketches of the language under consideration, of genetically and areally related languages, and of languages from other regions and families. This approach maximises the reliability of the description.

The fourth strategy, the analysis of a registerially informed sample of texts, leads us to the next section.

Data sources

One key characteristic of SFL is the value it places on the analysis of texts in context. Following Halliday’s (1959) study, The Language of the Chinese ‘Secret History of the Mongols’, SFL description has mainly been based on naturally occurring texts from the very start. SFL bases description on ethnographic methods of data collection, analysis and interpretation. Thus, SFL descriptive and typological work has been based on extensive analysis of texts in their social contexts. In order to maximise reliable comparisons and generalisations, Christian Matthiessen and Kazuhiro Teruya (e.g. Matthiessen 2015c) have developed a typology of registers based on the contextual parameter of field of activity to support typological work (following Jean Ure’s text typology). This semiotic map consists of eight broad socio-semiotic processes, albeit with fuzzy boundaries (see Matthiessen and Teruya (2016) on hybridity in this registerial typology). These socio-semiotic processes are an economic generalisation of the acts of meaning (or semiotic activities) that folk members engage in within the speech fellowship (see Table 3).

Table 3 Register typology based on socio-semiotic processes (field of activity)

Since the 1990’s, this registerial typology has served as a resource for many systemicists in developing text archives for language description. It is normally combined with other contextual variables, namely, medium (written/spoken) and turn (monologic/dialogic), in sampling texts for analysis. Notable applications include work on Japanese (Teruya 1998, 2007), Oko (Akerejola 2005), Arabic (Bardi 2008) and Bajjika (Kumar 2009). Some studies have also focused on particular registers. Patpong (2006a), for instance, focused on folktales (recreating: narrating) in her study of Thai lexicogrammar. The contributions in Thomson and Armour (2013) on Japanese lexicogrammar also focus on particular text types such as news reports (reporting: chronicling), academic texts (expounding) (Thomson and Armour 2013), and textbooks (enabling: instructing) (Thomson 2013; Hayakawa 2013). These contributions, however, do not explicitly deploy this text typology.
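
The sampling logic described above can be sketched in a few lines of code. The following is only a minimal illustration, not a procedure taken from any of the studies cited: the function name `sample_by_register`, the record fields and the tiny archive are all hypothetical, and only four of the eight socio-semiotic processes are used. It draws up to a fixed number of texts from each combination of field of activity, medium and turn.

```python
import random
from collections import defaultdict

def sample_by_register(archive, per_cell, seed=0):
    """Draw up to `per_cell` texts from each (field, medium, turn) cell.

    `archive` is a list of dicts; the keys used here are hypothetical.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    cells = defaultdict(list)
    for text in archive:
        cells[(text["field"], text["medium"], text["turn"])].append(text)
    sample = []
    for key in sorted(cells):  # deterministic order over register cells
        texts = cells[key]
        sample.extend(rng.sample(texts, min(per_cell, len(texts))))
    return sample

# Hypothetical text archive (illustrative records only).
archive = [
    {"id": "t1", "field": "recreating", "medium": "spoken", "turn": "monologic"},
    {"id": "t2", "field": "recreating", "medium": "spoken", "turn": "monologic"},
    {"id": "t3", "field": "reporting", "medium": "written", "turn": "monologic"},
    {"id": "t4", "field": "enabling", "medium": "written", "turn": "monologic"},
    {"id": "t5", "field": "expounding", "medium": "written", "turn": "monologic"},
    {"id": "t6", "field": "reporting", "medium": "written", "turn": "monologic"},
]

sample = sample_by_register(archive, per_cell=1)
print([text["id"] for text in sample])
```

Sampling per cell rather than from the archive as a whole keeps sparsely attested registers from being crowded out by well-documented ones, which is the point of a registerially informed sample.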

One notable consideration is the use of computerised corpus data. For major world languages such as Japanese that have existing corpora, these corpus data have been used in various descriptions. Fukui’s (2013) study, for instance, combines data from a spoken corpus, a transcript of a conversation and a children’s story, while Teruya’s (2007) description is mainly based on a self-compiled corpus. However, given the lack of existing corpora for many of the languages described, in addition to other practical challenges, the feasibility of using a large corpus for systemic language description and typology has been very limited. An alternative is a text archive, which Halliday and Matthiessen (2014: 70–71) describe as follows:

The difference between a corpus and a text archive is not a sharp one; but the general principle is that a corpus represents a systematic sample of text according to clearly established criteria whereas a text archive is assembled in a more opportunistic fashion …

Apart from naturally occurring texts, other sources of data include elicited examples from native speakers (i.e. language consultants), illustrative examples from secondary sources, as well as constructed examples and paradigms. These are normally used as supporting resources for the analysis of discourse data.

Method of analysis: qualitative versus quantitative

Halliday’s research on Chinese in the 1950’s (e.g. Halliday 1956, 1959) laid a foundation for SFL’s concern with both qualitative and quantitative aspects of language. As mentioned earlier, the analysis of text instances is the fundamental or basic activity in the study of language. In language description, this means shunting between instance and system on the instantiation dimension of language. The objective of analysing discourse data is to find general patterns and systems and test these patterns on texts in new contexts. One tool that has proved very useful in generalising features and their realisations is the system network. The system network is a local map or a drawing board that allows the analyst to move between text and system (i.e. language as system) in describing the grammar of particular systems (such as MOOD). Martin (1987, 2013) presents very illuminating guidance on how to draw and evaluate system networks.

Generally, however, descriptions in our dataset rarely include quantitative analysis. Exceptions are Thomson’s (2013) description of THEME in Japanese, whose data consist of 1,105 clauses from textbooks, and Patpong’s (2006b) study of conjunction in Thai. This absence of quantitative profiles is unsurprising given the enormity of work involved in the kind of description SFL theory demands. However, since grammars have already been produced for many languages in the last two decades, it should be possible, and it is indeed necessary, to develop quantitative profiles for systems in these languages.
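
As a rough illustration of what such a quantitative profile might look like in practice, the sketch below computes the relative frequency of each feature of a system from clause-level annotations. It is a minimal sketch under stated assumptions: the function name `system_profile`, the annotation format and the four sample clauses are hypothetical and are not drawn from any of the descriptions surveyed here.

```python
from collections import Counter

def system_profile(clauses, system):
    """Return the relative frequency of each feature selected in `system`.

    `clauses` is a list of dicts mapping system names to the feature chosen
    in that clause (a hypothetical annotation format).
    """
    counts = Counter(c[system] for c in clauses if system in c)
    total = sum(counts.values())
    if total == 0:
        return {}  # no clause is annotated for this system
    return {feature: n / total for feature, n in counts.items()}

# Hypothetical annotated clauses (illustrative only).
clauses = [
    {"MOOD": "declarative"},
    {"MOOD": "declarative"},
    {"MOOD": "interrogative"},
    {"MOOD": "imperative"},
]

profile = system_profile(clauses, "MOOD")
print(profile)
```

Computed separately for each register and each language, profiles of this kind could then be set side by side for multilingual comparison.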

Conclusion

In summary, this study has examined theoretical and empirical developments in language description and typology within SFL. It has shown that a multilingual perspective on language, in general, and language typology, in particular, has been a central concern of the theory from the very start. It was, however, not until the beginning of the 21st century that language typology gathered empirical momentum. It should, therefore, be emphasised that, contrary to a popular view outside SFL, systemic functional theory has, since its very beginning, been deployed in describing different languages, with earlier descriptions of languages such as Chinese, English, French, Beja, Nzema and Zapotec empowering the theory and enriching later descriptions of some of these languages as well as new languages. In more recent times, the need for a more systematic sampling of texts that are comparable across languages for descriptions and other multilingual research activities has led to further development of the context stratum (see e.g. Matthiessen (2015c) on socio-semiotic processes).

The paper also shows that systemic typology is a contribution to the broad goals of language typology, using the general dimensions of language that SFL theory offers. In this light, it takes its goals from Prague school typology and interacts closely with other functional approaches to typology that descend from the Greenbergian tradition of the 1960’s.

The study also reveals a few areas that should be considered in future research. First, systemic typology needs to take into account the phenomenon of grammaticalisation and the systematic analysis of grammatical units below the clause (but see Matthiessen et al. (2016) on a typology of verbal units). In terms of methodology, rigorous attention will have to be paid to the quantitative profiles of systems across languages and, as a prerequisite to this, the development of multilingual corpora or a database of comparable registers. The qualitative analysis of many more individual languages, apart from being an important endeavour in itself, is necessary for multilingual comparison, both qualitative and quantitative. Another area where research energy needs to be channelled is the description of phonology, both typologically and for individual languages.

One related aspect that is outside the scope of this study is a typology of semiotic systems in general. What generalisations can be made about language and the different semiotic systems that serve as resources for human communication and the success of social life? Jakobson (1966) put this on the typological agenda and it has been a key motif in Michael Halliday’s writings. Many semiotic systems other than language have been investigated both within and outside SFL since the 1980’s (e.g. Kress & van Leeuwen 2001). It is imperative for contemporary linguistic science to investigate the semantic generalisations and motifs across these semiotic systems. The results will be useful for linguistics and fields such as anthropology, psychology and the applied disciplines of design, the visual arts and communication and media studies. In this light, we echo the observation by the anthropologist Casagrande (1966) that it behoves general anthropologists “to attend to what the linguist has to say, and to ask linguists what light their studies can throw on the nature of man, and especially on man as a symbol maker and user” (p. 280).