Semantic mapping of discourse and activity, using Habermas’s theory of communicative action to analyze process

Quality & Quantity


Our primary objective is to evaluate the quality of a process, which we address through semantic mapping of that process. This complements the primacy usually given to output results or products. We use goal-oriented discourse as a case study, drawing on what the social and political theorist Jürgen Habermas termed “communicative action”. One orientation in Habermas’s work that we adopt is the analysis of communication or discourse, for which we take Twitter social media. In our case study, we map the discourse semantically, using correspondence analysis as the platform for such latent semantic analysis. This permits both qualitative and quantitative analytics. The case study is a set of eight carefully planned Twitter campaigns relating to environmental issues. The aim of these campaigns was to increase environmental awareness and to influence behaviour. Each campaign was launched by an initiating tweet. Using the data gathered in these campaigns, we sought to map them, and hence to track the flow of the Twitter discourse. This mapping was achieved through semantic embedding. The semantic distance between an initiating act and the aggregate semantic outcome is used as a measure of process effectiveness.


References


  • Barker, M., Barker, D.I., Bormann, N.F., Neher, K.E.: Social Media Marketing. A Strategic Approach. Cengage Learning, Andover (2012)

  • Bakliwal, A., Foster, J., van der Puil, J., O’Brien, R., Tounsi, L., Hughes, M.: Sentiment analysis of political tweets: towards an accurate classifier. In: Proceedings of the Workshop on Language in Social Media, LASM 2013, Association for Computational Linguistics, pp. 49–58 (2013)

  • Benzécri, J.-P.: L’Analyse des Données, Tome I Taxinomie, Tome II Correspondances, 2nd edn. Dunod, Paris (1979)

  • Benzécri, J.-P.: Correspondence Analysis Handbook. Dekker, Basel (1994)

  • Blake, J.: Overcoming the “Value-Action Gap” in environmental policy: tensions between national policy and local experience. Local Environ. 4(3), 257–278 (1999)

  • Blasius, J., Greenacre, M. (Eds.): Visualization and Verbalization of Data. Chapman & Hall/CRC Press, Boca Raton, FL (2014)

  • Bull, R., Petts, J., Evans, J.: Social learning from public engagement: dreaming the impossible? J. Environ. Plann. Manag. 51(5), 701–716 (2008)

  • Bull, R., Petts, J., Evans, J.: The importance of context for effective public engagement: learning from the governance of waste. J. Environ. Plann. Manag. 53(8), 991–1009 (2010)

  • Chew, C., Eysenbach, G.: Pandemics in the age of Twitter: content analysis of tweets during the 2009 H1N1 outbreak. PLoS One 5(11), e14118 (2010)

  • Collins, J., Thomas, G., Willis, R., Wilsdon, J.: Carrots, sticks and sermons: influencing public behaviour for environmental goals. A Demos/Green Alliance report produced for DEFRA, Demos/Green Alliance, pp. 55 (2003). Retrieved 13 April 2014

  • Finnis, J., Chan, S., Clements, R.: Let’s get real. How to evaluate online success? Report from the culture 24 action research project. Brighton, pp. 40 (2011). Retrieved 13 April 2014

  • Futerra: The Rules of the Game: The principles of climate change communication. Futerra Sustainability Communications Ltd., London, pp. 5 (2005). Retrieved 13 April 2014

  • Social Media Metrics for Federal Agencies, U.S. General Services Administration. (2013). Retrieved 13 April 2014

  • Hutto, C.J., Gilbert, E.: VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: ICWSM 2014, Proceedings of the Eighth International Conference on Weblogs and Social Media. Ann Arbor, Michigan, June 1–4 (2014)

  • Le Roux, B., Rouanet, H.: Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis. Kluwer Academic, Dordrecht (2004)

  • Lorenzoni, I., Nicholson-Cole, S., Whitmarsh, L.: Barriers perceived to engaging with climate change among the UK public and their policy implications. Glob. Environ. Change 17(3–4), 445–459 (2007)

  • Matušík, M.B.: Jürgen Habermas, philosophy and social theory. Retrieved 21 March 2014

  • McKee, R.: Story: Substance, Structure, Style, and the Principles of Screenwriting. Methuen, London (1999)

  • Murtagh, F.: Correspondence Analysis and Data Coding with R and Java. Chapman & Hall/CRC, Boca Raton, FL (2005)

  • Murtagh, F., Ganz, A.: Pattern recognition in narrative: tracking emotional expression in context. Preprint (2015)

  • Murtagh, F., Contreras, P.: Big data scaling through metric mapping: exploiting the remarkable simplicity of very high dimensional spaces using Correspondence Analysis, in preparation (2015)

  • Pearce, W., Holmberg, K., Hellsten, I., Nerlich, B.: Climate change on Twitter: topics, communities and conversations about the 2013 IPCC Working Group 1 report. PLoS One 9(4), e94785 (2014)

  • Pennebaker, J.W.: The Secret Life of Pronouns: What Our Words Say About Us. Bloomsbury Press, New York (2011)

  • Pianosi, M., Bull, R., Rieser, M.: Impact, influence and reach: lessons in measuring the impact of social media, preprint, pp. 36 (2013)

  • Séguéla, J., Saporta, G.: A comparison between latent semantic analysis and correspondence analysis. Presentation, CARME, Correspondence Analysis and Related Methods Conference, Rennes (2011).

  • Shove, E.: Beyond the ABC: climate change policy and theories of social change. Environ. Plann. 42, 1273–1285 (2010)

  • Verplanken, B., Walker, I., Davis, A., Jurasek, M.: Context change and travel mode choice: combining the habit discontinuity and self-activation hypotheses. J. Environ. Psychol. 28(2), 121–127 (2008)

Author information

Corresponding author

Correspondence to Fionn Murtagh.


Appendix 1: Our 8 campaign initiating tweets

The following are the campaign initiating tweets, in full. For campaign 4, the two initiating tweets were merged together. DMU stands for De Montfort University.

  1. Campaign 1:

    Introducing #climatechange! Is the climate changing? What are the observed changes? Are humans causing it? Discuss #dmuCC

  2. Campaign 2:

    Do you feel #climatechange is a distant issue? Read and listen to the climate witnesses in the UK

  3. Campaign 3:

    Goodmorning #DMU!! How was your weekend? Did you participate in the #marathon? We are talking about electricity this week! #dmuelectricity

  4. Campaign 4:

    Goodmorning #DMU!! How was your weekend? We are talking about gas and heating this week! #dmuenergy Wishing you all a nice #ecomonday!

  5. Campaign 4:

    Connect with us to discover what #DMU is already doing to cut its #gas use and tell us what you think we could all do to make it better!

  6. Campaign 5:

    Goodmorning #DMU!! We talk about #sustainable food this week. We have a question for you! What do you think does Sustainable Food mean?

  7. Campaign 6:

    Here I am, fueled with caffeine! This week we will be talking in particular of #transport. How do you get from home to #DMU? #dmutransport

  8. Campaign 7:

    New post! #Sustainable #Water | Are you familiar with the concept of #WaterSecurity? #DMU #climate #sustainabledmu

  9. Campaign 8:

    @SustainableDMU #MeatFreeMonday seems to have latched itself into my brain! Not a big meat eater but like having a dedicated veggie day!

As discussed in Sects. 4.1 and 4.2, a set of 339 terms was ultimately selected as the set of all employable words used in the discourse. For campaigns 1 through 8, the summed frequencies of occurrence of the retained terms are, respectively, 4, 4, 7, 14, 10, 6, 7 and 5. The terms retained for these particular initiating tweets, with frequency of occurrence, are as follows.

  1. Campaign 1:

    climate climatechange dmucc http [1 occurrence each]

  2. Campaign 2:

    climate climatechange http read [1 occurrence each]

  3. Campaign 3:

    dmu electricity goodmorning participate talking week weekend [1 occurrence each]

  4. Campaign 4:

    cut dmu dmuenergy ecomonday gas goodmorning heating nice talking tell week weekend [dmu, gas: 2 occurrences; otherwise 1 occurrence]

  5. Campaign 5:

    dmu food goodmorning mean question sustainable talk week [food, sustainable: 2 occurrences; otherwise 1 occurrence]

  6. Campaign 6:

    dmu dmutransport home talking transport week [1 occurrence each]

  7. Campaign 7:

    climate dmu http post sustainable sustainabledmu water [1 occurrence each]

  8. Campaign 8:

    day meat meatfreemonday sustainabledmu veggie [1 occurrence each]

The campaign 4 tweet was a merged one (from original tweets 303, 304). In campaign 4, the term “gas” is both word and hashtag. It is easy to go back to the original tweets and see the hashtags, or the tweeters. We keep the “http” part of the URL since it informs us that a web address is in the tweet.
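The summed frequencies quoted above can be checked directly from the term lists. The following is a minimal sketch in Python; the helper name `term_counts` and the dictionary layout are illustrative assumptions, and only the terms and counts themselves are taken from the lists above.

```python
from collections import Counter

def term_counts(words, twice=()):
    """Count each listed term once, plus one extra for terms noted as occurring twice."""
    counts = Counter(words.split())
    counts.update(twice)  # adds one more occurrence for each term in `twice`
    return counts

# Retained terms per campaign, transcribed from the lists above.
campaign_terms = {
    1: term_counts("climate climatechange dmucc http"),
    2: term_counts("climate climatechange http read"),
    3: term_counts("dmu electricity goodmorning participate talking week weekend"),
    4: term_counts("cut dmu dmuenergy ecomonday gas goodmorning heating nice "
                   "talking tell week weekend", twice=("dmu", "gas")),
    5: term_counts("dmu food goodmorning mean question sustainable talk week",
                   twice=("food", "sustainable")),
    6: term_counts("dmu dmutransport home talking transport week"),
    7: term_counts("climate dmu http post sustainable sustainabledmu water"),
    8: term_counts("day meat meatfreemonday sustainabledmu veggie"),
}

# Summed frequency of occurrence per campaign, in campaign order.
summed = [sum(campaign_terms[k].values()) for k in sorted(campaign_terms)]
print(summed)  # [4, 4, 7, 14, 10, 6, 7, 5]
```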

Appendix 2: Correspondence analysis

Correspondence Analysis provides access to the semantics of information expressed by the data. It does this by defining semantically each observation (a tweet here), or row vector, as the average of all attributes (terms here) that are related to it. Similarly, it defines semantically each attribute, or column vector, as the average of all observations that are related to it.

This semantic mapping analysis is as follows:

  1. The starting point is an observations-by-attributes matrix that cross-tabulates the dependencies, e.g. frequencies of joint occurrence.

  2. By endowing the cross-tabulation matrix with the \(\chi ^2\) (chi squared) metric on both the observation set (rows) and the attribute set (columns), we can map observations and attributes into the same space, endowed with the Euclidean metric.

  3. Interpretation is through (i) projections of observations and attributes onto factors; (ii) contributions by observations and attributes to the inertia of the factors; and (iii) correlations of observations and attributes with the factors. The factors are ordered by decreasing importance.

Correspondence analysis is not unlike principal components analysis in its underlying geometrical bases. While principal components analysis is particularly suitable for quantitative data, correspondence analysis is appropriate for the following types of (non-negative valued) input data: frequencies, contingency tables, probabilities, categorical data, and mixed qualitative/categorical data.

The factors are defined by a new orthogonal coordinate system endowed with the Euclidean distance. The factors are determined from the eigenvectors of a positive semi-definite matrix (hence with non-negative eigenvalues). This matrix, which is diagonalized (in practice, through a singular value decomposition), encapsulates the requirement for the new coordinates to successively best fit the given data.

The “standardizing” inherent in correspondence analysis (a consequence of the \(\chi ^2\) distance) treats rows and columns in a symmetric manner. One byproduct is that the row and column projections in the new space may both be plotted on the same output graphic presentations (the principal factor plane given by the factor 1 and factor 2 coordinates; or other pairs of factors).

From frequencies of occurrence to clouds of profiles, each profile with an associated mass

From the initial frequencies data matrix, a set of probability data, \(f_{ij}\), is defined by dividing each value by the grand total of all elements in the matrix. In Correspondence Analysis, each row (or column) point is considered to have an associated weight. The row weight is the row sum, divided by the overall data matrix total. The column weight is the column sum, divided by the overall data matrix total.

Next, row profiles are defined as the row frequencies divided by the row weight (also termed the mass). Similarly, we have column profiles. The \(\chi ^2\) distance between profiles is a weighted Euclidean distance. It is an appropriate distance for what are, initially here, categorical data.
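As a minimal sketch of these definitions, the following NumPy code computes masses, profiles and a \(\chi ^2\) distance for a small made-up tweets-by-terms table. The matrix `N`, the function `chi2_distance` and all variable names are illustrative assumptions, not data or code from the study.

```python
import numpy as np

# Hypothetical frequency table: 4 tweets (rows) by 3 terms (columns).
N = np.array([[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4],
              [1, 0, 2]], dtype=float)

# Probabilities f_ij: each value divided by the grand total.
F = N / N.sum()

# Masses (weights): row sums and column sums of F.
row_mass = F.sum(axis=1)
col_mass = F.sum(axis=0)

# Profiles: row frequencies divided by the row mass (and likewise for columns).
row_profiles = F / row_mass[:, None]   # each row sums to 1
col_profiles = F / col_mass[None, :]   # each column sums to 1

def chi2_distance(p, q, weights):
    """Weighted Euclidean distance: squared differences are divided by the weights."""
    return np.sqrt(np.sum((p - q) ** 2 / weights))

# Chi-squared distance between tweets 0 and 1: row profiles are compared
# coordinate by coordinate, each coordinate weighted by the inverse column mass.
d01 = chi2_distance(row_profiles[0], row_profiles[1], col_mass)
print(d01)
```

Distances between column profiles follow the same pattern, with the row masses as weights.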

We thus look on our row points (our tweets) as a cloud of points endowed with the \(\chi ^2\) distance. Similarly our column points (our words) are a cloud of points that are also endowed with the \(\chi ^2\) distance.

Just as in classical mechanics, we consider the inertia of these clouds. To begin with, we have their total inertia, that is, the inertia about their centre of gravity. The centre of gravity is the weighted mean. The cloud of row points and the cloud of column points are defined in such a way that their inertias are identical.
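The equality of the two inertias can be seen by writing them out. The notation \(f_{i\cdot }\) and \(f_{\cdot j}\) for the row and column masses is an assumption introduced here for illustration; only \(f_{ij}\) is defined in the text above. The centre of gravity of the row profiles is the vector of column masses, and vice versa, so that

\[
I_{\text{rows}} = \sum _i f_{i\cdot } \sum _j \frac{1}{f_{\cdot j}} \left( \frac{f_{ij}}{f_{i\cdot }} - f_{\cdot j} \right) ^2 = \sum _i \sum _j \frac{(f_{ij} - f_{i\cdot } f_{\cdot j})^2}{f_{i\cdot } f_{\cdot j}},
\]

\[
I_{\text{columns}} = \sum _j f_{\cdot j} \sum _i \frac{1}{f_{i\cdot }} \left( \frac{f_{ij}}{f_{\cdot j}} - f_{i\cdot } \right) ^2 = \sum _i \sum _j \frac{(f_{ij} - f_{i\cdot } f_{\cdot j})^2}{f_{i\cdot } f_{\cdot j}}.
\]

The right-hand sides are the same expression, symmetric in rows and columns, which is why the two clouds have identical total inertia.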

Output: cloud of points endowed with the Euclidean metric in factor space

Decomposing the moment of inertia of the cloud of row points (the cloud of tweets) and the cloud of column points (the cloud of words) furnishes the principal axes of inertia, defined from a singular value decomposition. The inertia about the principal axes is given by the eigenvalues. The principal axes themselves are defined from the eigenvectors. The principal axes are termed factors. Latent variables, or latent semantic axes, are also terms that can be used.

There is the following invariance relationship. The \(\chi ^2\) distance between two rows (two tweets), or between two columns (two words), is identical to the Euclidean distance between the two rows, or respectively the two columns, in the factor space. The latter, the factor space, allows us to display the data.
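The decomposition and the invariance relationship can be illustrated numerically. The sketch below follows one common formulation of correspondence analysis (a singular value decomposition of the matrix of standardized residuals); this formulation, the toy matrix `N` and all variable names are assumptions for illustration rather than the authors' own implementation.

```python
import numpy as np

# Hypothetical frequency table: 4 tweets (rows) by 3 terms (columns).
N = np.array([[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4],
              [1, 0, 2]], dtype=float)

F = N / N.sum()
r = F.sum(axis=1)   # row masses
c = F.sum(axis=0)   # column masses

# Standardized residuals; their sum of squares is the total inertia.
S = (F - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# Drop the trivial axis (numerically zero singular value).
keep = sing > 1e-12
U, sing, Vt = U[:, keep], sing[keep], Vt[keep, :]

# Inertia about each principal axis, in decreasing order.
eigenvalues = sing ** 2

# Factor coordinates (projections) of rows and of columns.
row_coords = (U * sing) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]

# Invariance check for tweets 0 and 1: chi-squared distance between profiles
# versus Euclidean distance between their factor coordinates.
row_profiles = F / r[:, None]
chi2_01 = np.sqrt(np.sum((row_profiles[0] - row_profiles[1]) ** 2 / c))
eucl_01 = np.linalg.norm(row_coords[0] - row_coords[1])
print(chi2_01, eucl_01)

# The eigenvalues sum to the total inertia of the cloud.
total_inertia = (S ** 2).sum()
print(eigenvalues.sum(), total_inertia)
```

The equality holds when all non-trivial factors are retained; a display restricted to the first two factors approximates the \(\chi ^2\) distances.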

The projections of row points and column points on the factors express the information. The total information content of either row set, or column set, is the cloud inertia. Associated with the factors is the information in our data, arranged by decreasing importance. The information importance is measured by inertia about the axes, or factors.

In addition to projections on the factorial axes, in Correspondence Analysis, we also consider the contributions to the inertia, and the correlations (of rows, or of columns, with the factors).
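A short sketch of how contributions and correlations might be computed for the row points is given below. It assumes the usual conventions in this literature: the contribution of a point to a factor is its mass times its squared coordinate, divided by the factor's eigenvalue, and the "correlation" is the squared cosine between the point and the factor. The toy matrix and names are hypothetical.

```python
import numpy as np

# Hypothetical frequency table: 4 tweets (rows) by 3 terms (columns).
N = np.array([[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4],
              [1, 0, 2]], dtype=float)

F = N / N.sum()
r, c = F.sum(axis=1), F.sum(axis=0)

# Factor coordinates of the rows via SVD of the standardized residuals.
S = (F - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)
keep = sing > 1e-12                      # drop the trivial axis
U, sing = U[:, keep], sing[keep]
eigenvalues = sing ** 2
row_coords = (U * sing) / np.sqrt(r)[:, None]

# Contributions: share of each factor's inertia due to each row.
# Each column of `contributions` sums to 1.
contributions = r[:, None] * row_coords ** 2 / eigenvalues

# Correlations (squared cosines): share of each row's squared distance
# to the centre of gravity that is accounted for by each factor.
# Each row of `correlations` sums to 1 when all factors are kept.
sq_dist = (row_coords ** 2).sum(axis=1, keepdims=True)
correlations = row_coords ** 2 / sq_dist

print(np.round(contributions, 3))
print(np.round(correlations, 3))
```

The same quantities for the column points are obtained by swapping the roles of rows and columns.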

Analysis of the dual spaces, and supplementary elements

The factors in the two spaces, of rows/observations and of columns/attributes, are inherently related. Each row (tweet) coordinate in the factor space is defined by the barycentre (or centre of gravity) of the column (word) coordinates; and vice versa. Not only can we pass from one cloud to the other, but the two clouds (of rows, and of columns) are displayable on the same graphic output. This is because the two clouds, which are endowed with the \(\chi ^2\) distance to start with, are projected into (or embedded in) the factor space. The factor space, as noted above, is endowed with the Euclidean distance. The Euclidean distance is particularly appropriate for display or visualization.
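The barycentric relationship between the two clouds can be checked numerically. In the standard transition formulas, the barycentre is rescaled on each factor by the inverse of the singular value (the square root of the eigenvalue); that rescaling, the toy matrix and the variable names below are assumptions for illustration.

```python
import numpy as np

# Hypothetical frequency table: 4 tweets (rows) by 3 terms (columns).
N = np.array([[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4],
              [1, 0, 2]], dtype=float)

F = N / N.sum()
r, c = F.sum(axis=1), F.sum(axis=0)

S = (F - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)
keep = sing > 1e-12                      # drop the trivial axis
U, sing, Vt = U[:, keep], sing[keep], Vt[keep, :]

row_coords = (U * sing) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]

row_profiles = F / r[:, None]            # rows sum to 1
col_profiles = F / c[None, :]            # columns sum to 1

# Each tweet is the profile-weighted average (barycentre) of the word
# coordinates, rescaled per factor by 1 / singular value.
rows_from_cols = row_profiles @ col_coords / sing

# Each word is likewise the barycentre of the tweet coordinates.
cols_from_rows = col_profiles.T @ row_coords / sing

print(np.allclose(rows_from_cols, row_coords))
print(np.allclose(cols_from_rows, col_coords))
```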

Qualitatively different elements (i.e. row or column profiles), or ancillary characterization or descriptive elements may be placed as supplementary elements. This means that they are given zero mass in the analysis, and their projections are determined using the transition formulas. This amounts to carrying out a correspondence analysis first, without these elements, and then projecting them into the factor space following the determination of all properties of this space.
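Projecting a supplementary row can be sketched as follows: the analysis is carried out on the active rows only, and the supplementary row's profile is then passed through the transition formula. The function `project_supplementary_row`, the toy matrix and the supplementary counts are hypothetical and are not taken from the study.

```python
import numpy as np

# Hypothetical active table: 4 tweets (rows) by 3 terms (columns).
N = np.array([[2, 1, 0],
              [1, 3, 1],
              [0, 1, 4],
              [1, 0, 2]], dtype=float)

F = N / N.sum()
r, c = F.sum(axis=1), F.sum(axis=0)

# Correspondence analysis of the active rows only.
S = (F - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sing, Vt = np.linalg.svd(S, full_matrices=False)
keep = sing > 1e-12                      # drop the trivial axis
U, sing, Vt = U[:, keep], sing[keep], Vt[keep, :]
row_coords = (U * sing) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]

def project_supplementary_row(counts, col_coords, sing):
    """Place a zero-mass row in the factor space via the transition formula."""
    profile = counts / counts.sum()
    return profile @ col_coords / sing

# A made-up supplementary tweet: it does not influence the factors,
# it is only positioned relative to them.
supp_counts = np.array([1.0, 0.0, 3.0])
supp_coords = project_supplementary_row(supp_counts, col_coords, sing)
print(supp_coords)

# Sanity check: an active row projected this way lands on its own coordinates.
print(np.allclose(project_supplementary_row(N[0], col_coords, sing), row_coords[0]))
```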

In summary

Correspondence analysis is thus the inertial decomposition of the dual clouds of weighted points. It is a latent semantic decomposition, where the role of the term frequency and inverse document frequency (TF-IDF) weighting scheme is instead played by the use of (i) profiles and masses, together with (ii) the \(\chi ^2\) distance. See Séguéla and Saporta (2011) for a discussion of both methods, correspondence analysis and latent semantic analysis. Further background description on correspondence analysis can be found in (Benzécri 1979, 1994; Le Roux and Rouanet 2004; Murtagh 2005).

About this article

Cite this article

Murtagh, F., Pianosi, M. & Bull, R. Semantic mapping of discourse and activity, using Habermas’s theory of communicative action to analyze process. Qual Quant 50, 1675–1694 (2016).
