Reflections on study abroad: a computational linguistics approach

  • Peter Grajzl
  • Cindy Irby
Research Article


Study abroad and the associated sociocultural experience have been a subject of substantial interest to social science scholars and university administrators. Shedding novel light on the phenomenon, we draw on a corpus of student-authored reflective essays and apply machine learning methods for analysis of text-as-data to examine the features and the determinants of salient themes emphasized by students in their study abroad reflections. Our analysis identifies 18 different topics spanning the domains of distinctly cultural cognition, interaction with people, physical environment, and personal change. Specifics of the experience such as duration and location, timing of reflections, and observable student characteristics including gender, major, academic performance, extracurricular involvement, and socioeconomic status are all important determinants of students’ reflections. Different factors, however, matter differently with respect to students’ emphases on particular topics, a finding indicative of the complex nature of the study abroad experience.


Study abroad; Reflections; Culture; Text-as-data; Machine learning; Structural topic model


The demand for international education and the volume of international student mobility have been steadily increasing over the past decades and are currently at an all-time high [18]. Studying abroad is nowadays deemed a one-of-a-kind opportunity for cultural immersion and an ideal preparation for future productive engagement in the increasingly intercultural and globally interdependent world. Consequently, study abroad has become an integral part of higher education both in the USA and internationally.

Study abroad, however, is an expensive endeavor. Spurred by internal budgetary pressures as well as external requirements imposed by higher education accreditation bodies, administrators and scholars across college and university campuses have become increasingly interested in systematic appraisals of study abroad programs and their effects on students’ cultural, academic, and personal competencies (see, e.g., Anderson and Lawton [1]). As a result, a growing volume of research scattered across social sciences, humanities, and policy literature on international education has attempted to characterize and assess students’ study abroad experiences.1

With regard to substantive findings, the resultant copious yet still largely inchoate body of research is strongly suggestive of the potential of the study abroad experience to exert an effect on a wide range of student outcomes such as intercultural communication skills, foreign language proficiency, appreciation of cross-cultural differences, and interest in international affairs [2, 3, 6, 7, 32, 33, 37]. In addition, time spent studying abroad has been shown to be associated with an elevated interest in scholarly pursuits, increased self-awareness, enhanced ability to cope with challenges, greater willingness to engage in prosocial behavior, and better informed career choice considerations [15, 25, 26]. As a further step in this line of inquiry, the literature has attempted to investigate how particular student outcome measures correlate with study abroad experience-related and student-specific variables, such as study abroad duration and student characteristics [3, 9, 37]. While there in general exist few settled findings, and even fewer comprehensive analyses of the role of potentially pertinent factors, existing evidence indicates that both experience- and student-specific traits can matter for study abroad outcomes.

Methodologically, the existing study abroad literature has predominantly relied on data generated on the basis of closed-ended surveys (among many others, see, e.g., Carlson and Widaman [6], Basow and Gaugler [3], Terzuolo [37]) or qualitative interviews (e.g., Mendelson [22], Dolby [8], Walsh [39]). The resulting analyses, combined with increasingly careful application of statistical methods, have undoubtedly improved our understanding of students’ study abroad experiences. Yet, reliance on closed-ended surveys and interview-based methods also renders the resulting studies inherently vulnerable to the well-known methodological drawbacks of those methods. Closed-ended surveys, for example, are sensitive to researchers’ subjective decisions concerning specific topical emphases as reflected in the design of the survey questions and can be plagued by participants’ response biases. Qualitative interviews, on the other hand, are resource-intensive and consequently often result in small sample sizes with adverse implications for the statistical power of ensuing studies.

Therefore, in addition to further addressing challenges associated with the use of existing methods, scholarship on the study abroad experience and its ramifications would benefit from incorporating new methodological tools for analysis of large textual datasets that are gradually becoming an integral part of the methodological toolkit of social scientists and humanities scholars [11, 14, 24, 31]. The application of such tools in the context of study abroad research holds promise to, on the one hand, alleviate (at least some of) the concerns with current methodological approaches and, at the same time, complement the existing research in terms of the nature of generated insights.

In the present paper, we follow this line of reasoning in an attempt to provide a hitherto unexplored perspective on students’ study abroad experience and cultural immersion. Instead of relying on closed-ended survey questionnaires or small-scale interviews, we use novel computational methods for analysis of text-as-data to quantitatively examine a comparatively large textual corpus of mandatory, open-ended reflection essays authored by a group of students who are enrolled at a selective US liberal arts college and who recently studied abroad. Our analysis first identifies groupings of salient themes, referred to as ‘topics’, about the study abroad experience as emphasized by students in their own words. The topics are uncovered by an unsupervised machine learning algorithm. They are thus devoid of human researchers’ preconceived notions about the study abroad experience that may affect the design of closed-ended survey questionnaires and the conduct of qualitative interviews. We then use the estimated topics in a regression-like framework to investigate whether, and if so how, students’ emphases on particular topics vary systematically with a comparatively broad range of observable experience and student characteristics available in our data. The resulting analysis casts a novel light on the question of what factors shape study abroad experiences. More generally, our investigation illustrates how machine learning-based computational techniques applied to text-as-data can be utilized to investigate the features and the determinants of cultural, societal, and personal reflections.


To examine students’ study abroad reflections, we estimate a topic model, a machine learning-based statistical tool for analysis of large textual corpora. As a class of generative probability models, topic models require a researcher to postulate a model of the data generating process and then use the data to determine the most likely values for the parameters within the model. To estimate the parameter values, topic models view texts as ‘bags of words’ and then apply an unsupervised machine learning algorithm that exploits the co-occurrence of words across documents to classify groups of words that tend to co-occur [4].

The resulting ‘topics’ are formally conceptualized as probability distributions over corpus vocabulary, while documents (chunks of text, for us student essays) are modeled as mixtures of topics.2 The name of each topic is assigned by the researcher upon scrutiny of key words most closely associated with the topic and after reading of the documents that feature a given topic particularly prominently. The topics themselves, however, are solely a product of model estimation. They are not obtained by matching words and documents to concrete thematic issues specified by the researcher prior to estimation (as they would be in a supervised model).
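The idea that topics are distributions over vocabulary and documents are mixtures of topics can be made concrete with a small generative sketch. The example below is illustrative only: the vocabulary, the two topics, and their probabilities are hypothetical values chosen by hand, not estimates from the corpus, and the code shows the data generating story rather than the estimation procedure.

```python
import random

# Hypothetical toy vocabulary and two hand-made topics (illustrative only,
# not estimates from the corpus). Each topic is a probability distribution
# over the vocabulary.
VOCAB = ["cultur", "food", "meal", "class", "professor", "learn"]
TOPICS = {
    "food_culture": [0.30, 0.30, 0.25, 0.05, 0.00, 0.10],
    "academics":    [0.10, 0.00, 0.05, 0.35, 0.25, 0.25],
}


def generate_document(topic_shares, n_words, rng):
    """Draw a bag of words from a mixture of topics.

    topic_shares maps topic name -> share of the document (shares sum to 1).
    """
    names = list(topic_shares)
    weights = [topic_shares[name] for name in names]
    words = []
    for _ in range(n_words):
        # Step 1: pick the topic responsible for this word token.
        topic = rng.choices(names, weights=weights, k=1)[0]
        # Step 2: pick a word from that topic's distribution over vocabulary.
        word = rng.choices(VOCAB, weights=TOPICS[topic], k=1)[0]
        words.append(word)
    return words


def word_counts(words):
    """Collapse a token sequence into order-free counts (the bag of words)."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)
    essay = generate_document({"food_culture": 0.7, "academics": 0.3}, 50, rng)
    print(word_counts(essay))
```

Estimation runs this story in reverse: given only the word counts of many documents, the model infers the topic distributions and each document's topic shares.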

Topic modeling is of course not a substitute for careful reading and nuanced interpretation of text. As a complement to conventional textual analysis, topic models are particularly suitable for analyses of large textual corpora when the principal aim of the analysis is to provide a macroscopic guide to the salient themes emphasized in a corpus. With the emergence of ‘big data’ and researchers’ interest in text-as-data methods, the use of topic models has become increasingly common across a broad range of academic disciplines. The Latent Dirichlet Allocation (LDA) model [5] in particular has been fruitfully applied by both social scientists and humanities scholars [14, 16, 17, 23]. We use the Structural Topic Model (STM; Roberts et al. [27, 28]), a recent innovation that, unlike the LDA model, integrates document-level data into the analysis (see, e.g., Lucas et al. [20], Farrell [10], Law [19], Grajzl and Murrell [13]).3 The incorporation of metadata into the analysis not only produces the best available estimates of the topics but also, importantly, allows the researcher to examine the effect of metadata covariates on topical prevalence.4

In our analysis, we first estimate a set of topics, the principal themes in the corpus as identified by the unsupervised machine learning algorithm and not readily apparent to a human reader of many disparate documents (student essays). We then make use of the defining characteristic of the STM, the inclusion of metadata covariates, to investigate the effect of metadata covariates on students’ emphases on particular topics in a regression-like framework. The resulting statistical analysis allows us to address a central question for assessment of study abroad programs: What experience- and student-specific factors determine students’ perceptions of their study abroad experience, and in what way?


The corpus

Our textual corpus consists of study abroad reflective essays written by students at a selective private liberal arts college, located in the South Atlantic region of the USA. About 22% of students at the university under our consideration choose to study abroad at some point during their undergraduate career.5 Upon expressing their intention to study abroad, students complete an internal application and meet with a study abroad advisor who assists them with selecting a suitable program. Prior to departing abroad, approved students undergo a comprehensive online as well as in-person orientation about studying and living abroad.

Each deployed study abroad student is formally required to produce two reflective essays about their study abroad experience. (Our analysis is therefore not subject to the selective response bias that often plagues studies utilizing data collected on the basis of voluntary surveys or interviews.) The essay requirements are made clear to students prior to departing for study abroad. The students are mandated to turn in their essays soon after returning to campus.

The first essay (early reflection) must be completed during the first week after the arrival at the study abroad location. Students are expected to write a brief (approximately 250–500-word) commentary on any notable cultural and related experiences that they underwent upon their arrival at the study abroad location, taking into account the cultural goals they had set for themselves. The second essay (ex-post reflection) is a longer (roughly 1000-word) reflection to be completed no sooner than during the last 2 weeks of students’ study abroad. In that reflection, students are asked to review and reflect upon the first essay (early reflection) and comment on how the time away might have changed their initial cultural and other impressions. Students are further requested to comment on their cultural goals, how their time abroad changed them, and how their experience abroad is expected to affect their future on-campus experience.

For purposes of our analysis, we have collected 340 reflective essays written by the complete set of 170 students (each student in the sample completed both the early and the ex-post reflection essay) who studied abroad between winter 2017 and winter 2018.6 The total word count for our pre-processed corpus is 243,084 words. Table 1 provides the basic descriptive statistics for the length of our reflection essays.
Table 1 Length of reflection essays, in words

Rows: Early reflection essays; Ex-post reflection essays; All reflection essays. Columns include Std. Dev.

We imported the corpus into R for pre-estimation processing using R’s stm package [29]. We processed the corpus using the textProcessor function to convert the text to lowercase, apply the Porter stemming algorithm, and remove stop words (common natural language words with very little meaning, such as ‘a’, ‘and’, ‘the’, etc.) as well as numbers and punctuation. The resulting dataset consists of 71,042 word tokens.
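The effect of these pre-processing steps can be illustrated with a short sketch. The processing described above was carried out in R with the stm package; the Python code below is only an approximation written for illustration. It assumes a tiny, hypothetical stop-word list and omits Porter stemming, so tokens stay unstemmed (for example, ‘culture’ rather than ‘cultur’).

```python
import re

# A tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "i", "my", "was", "is"}


def preprocess(text):
    """Lowercase, drop numbers and punctuation, and remove stop words.

    Stemming is deliberately omitted in this sketch.
    """
    lowered = text.lower()
    # Keep only runs of ASCII letters, so digits and punctuation act as separators.
    tokens = re.findall(r"[a-z]+", lowered)
    return [t for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    sample = "In my first 2 weeks, the culture and the food surprised me!"
    print(preprocess(sample))
```

Because the regular expression keeps ASCII letters only, accented characters and apostrophes also act as separators here; a production pipeline would handle them more carefully.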

Figure 1 presents a first glance at the data in the form of a word cloud. Notably, the most prominently emphasized words (or rather their stems) in the corpus are ‘cultur’, ‘peopl’, ‘time’, ‘learn’, ‘differ’, ‘experi’, and ‘abroad’. This suggests that cultural cognition, broadly construed, lies at the heart of students’ study abroad experience. In our subsequent analysis, we provide a more nuanced, macroscopic view of students’ reflections by applying computational methods that make use of an unsupervised machine learning algorithm to identify the salient themes emphasized by the students.
Fig. 1 Word cloud for corpus as a whole


With student essays conceptualized as mixtures of topics, the prevalence of a particular topic will tend to vary across essays and students. Because different essays pertain to different timing of reflections and are authored by different students, who come from heterogeneous backgrounds and who studied abroad in different locations and for varied lengths of time, one would like the data generating process that underpins the computational identification of the topics to let topical prevalence vary with available essay- and student-level metadata. This is exactly what STM allows for, enabling the researcher to incorporate metadata into the estimation of topics and subsequently assess the relationship between topical prevalence and the metadata.
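To illustrate what it means for topical prevalence to vary with metadata, the sketch below maps an essay's covariates to expected topic shares through a linear predictor followed by a softmax. This is a simplification in the spirit of the STM prevalence model, not the estimation procedure itself: it ignores the random variation around the expected shares, and the three topics' coefficients are made-up numbers, not estimates from our data.

```python
import math


def softmax(scores):
    """Map real-valued scores to positive shares that sum to one."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients (intercept, ex_post, female) for three topics.
# These are made-up numbers for illustration, not estimates from the paper.
COEFFICIENTS = {
    "Arrival & First Impressions": [0.5, -1.2, 0.0],
    "Personal Growth":             [0.0, 1.0, 0.1],
    "Academics":                   [0.2, 0.0, -0.3],
}


def expected_topic_shares(ex_post, female):
    """Expected topic shares for an essay with the given metadata."""
    x = [1.0, float(ex_post), float(female)]  # leading 1.0 is the intercept
    names = list(COEFFICIENTS)
    scores = [sum(b * v for b, v in zip(COEFFICIENTS[n], x)) for n in names]
    return dict(zip(names, softmax(scores)))


if __name__ == "__main__":
    print(expected_topic_shares(ex_post=0, female=1))
    print(expected_topic_shares(ex_post=1, female=1))
```

With these hypothetical coefficients, switching an essay from early to ex-post lowers the expected share of the arrival topic and raises that of the personal growth topic, which is the kind of covariate effect the subsequent analysis examines.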

Table 2 lists essay- and student-level metadata variables that we utilize in our analysis. In the STM estimation (see below), we model topical prevalence as a simple linear function of metadata covariates (see Roberts et al. [29]).7 In what follows, we briefly describe the composition of our sample based on the values of available metadata variables.
Table 2

Essay level
 Reflection essay timing (timing of student reflections): Early [170], ex-post [170]

Student level
 Year of study abroad^a (calendar year of studying abroad): 2017 [120], 2018 [50]
 Length of study abroad (number of days spent studying abroad): Min. 40, max. 251, mean 114, std. dev. 18
 Country of study abroad^a: Argentina [3], Australia [23], Brazil [1], Chile [4], China [7], Costa Rica [1], Czech Republic [6], Denmark [10], France [8], Germany [2], Greece [3], Hungary [2], India [1], Indonesia [1], Ireland [2], Italy [26], Japan [1], Jordan [3], Nepal [2], New Zealand [6], Peru [1], Russia [1], Rwanda [1], Singapore [1], South Africa [1], Spain [31], Sweden [1], UAE [4], Uganda [1], UK [21]
 World region of study abroad: Africa [5], Asia [20], Eastern Europe [10], Western Europe [103], Central America [1], South America [8], Oceania [29]
 Student’s gender: Male [55], female [115]
 Part of university where the student has declared the choice of (first) major: Commerce school [106], the college [64]
 Second major (binary variable for whether the student has a second major or not): Has second major [53], does not have second major [117]
 Cumulative GPA prior to studying abroad: Min. 2.97, max. 4.00, mean 3.52, std. dev. 0.25
 Greek life affiliation (binary variable for whether the student is affiliated with Greek life organizations or an independent): Greek [137], independent [33]
 Varsity athlete status (binary variable for whether the student is a varsity athlete or not): Varsity athlete [30], not varsity athlete [140]
 Foreign/domestic status (binary variable for whether the student is a domestic (US) student or a foreign student (on an F-1 visa or a resident alien)): Foreign [20], domestic [150]
 Financial aid (binary variable for whether the student is a recipient of financial aid from the university or not): On financial aid [70], not on financial aid [100]
 QuestBridge student (binary variable for whether the student is a QuestBridge student or not): QuestBridge student [3], not QuestBridge student [167]

The numbers in square brackets are frequency counts for specific values of categorical variables. For the essay-level variable, frequency counts refer to the number of essays. For student-level variables, frequency counts refer to the number of students. For variables country and region, the sum of frequency counts exceeds 170 because some students studied abroad in multiple countries in multiple regions.

^a Variables included in the prevalence equation when estimating the topics, but whose effects we do not highlight when assessing the effect of metadata covariates on topical prevalence. The effect of year of study abroad on topical prevalence yields no noteworthy results while the effect of study abroad location is better captured by region than by country.

With each student in the sample having completed both required essays, exactly half of the essays are early reflections and the other half ex-post reflections. The mean length of time that student-authors spent studying abroad is 114 days, with the minimum and the maximum equal to 40 and 251 days, respectively. 71% of the students in our sample studied abroad in 2017 and the rest studied abroad in 2018.

The students studied in 31 different countries covering all of the world’s major regions. The countries where students studied most commonly are Spain, Italy, Australia, UK, and Denmark. Western Europe is thus the most widely visited world region. One percent (two students) studied in multiple world regions.

Consistent with several prior study abroad studies (see, e.g., Stroud [34], Dotta [36]), a disproportionately large share of students (68%) in our sample are female. 62% are majoring in business, accounting, economics, or politics (majors offered in the commerce school) and the remaining students in humanities, sciences, the arts, or social sciences other than economics and politics (majors offered in the college). 31% of students have a second major. The mean cumulative GPA as measured prior to studying abroad is 3.52, with standard deviation equal to 0.25. 81% of the students are affiliated with Greek life organizations (fraternities and sororities), a proportion broadly consistent with the overall membership in Greek life organizations on campus as a whole. 18% of students are varsity athletes, a percentage lower than the percentage for campus as a whole.

12% of students are on a student visa or have resident alien status, a number that slightly exceeds the campus-wide proportion. 41% of students are recipients of some amount of financial aid and 2% (three students) are recipients of the QuestBridge scholarship for exceptional students from low-income families. The sample proportions of students on financial aid and QuestBridge scholars are lower than the campus-wide proportions of financial aid recipients and QuestBridge students, respectively.
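The sample shares quoted above follow directly from the frequency counts in Table 2. As a small worked check, the sketch below recomputes them; the dictionary labels are our own shorthand for the Table 2 categories, and the counts are copied from the table.

```python
# Frequency counts copied from Table 2 (number of students per category).
N_STUDENTS = 170
COUNTS = {
    "studied abroad in 2017": 120,
    "female": 115,
    "commerce school major": 106,
    "has second major": 53,
    "Greek life affiliated": 137,
    "varsity athlete": 30,
    "foreign": 20,
    "on financial aid": 70,
    "QuestBridge student": 3,
}


def share_percent(count, total=N_STUDENTS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


if __name__ == "__main__":
    for label, count in COUNTS.items():
        print(f"{label}: {share_percent(count):.1f}%")
```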

The complete dataset (anonymized textual corpus and corresponding metadata) that we utilize in our analysis is not publicly available. For replication and further research purposes, it is available from the authors upon reasonable request.

What do students emphasize in their reflections?

Choosing the number of topics

An important modeling decision in estimating an STM is the choice of the number of topics to be estimated. There exists no definitive approach to determining the optimal number of topics for a corpus. The literature advocates the use of both computational statistical measures and human judgment (Roberts et al. [27:1068–1070; 29]). We first estimated models featuring between 5 and 30 topics. We compared the resulting models based on measures of goodness of fit (in particular held-out likelihood and size of residuals). We then narrowed our focus to the subset of estimated models that fit the data particularly well [29, 35, 38]. We contrasted these models based on their average scores on semantic coherence (a measure of the internal consistency of topics) and exclusivity (a measure of the extent to which topics can be differentiated from one another). This allowed us to identify models that are not strictly dominated by other models based on the average semantic coherence and exclusivity scores. We then carefully inspected the estimated topics for a small set of models on the resulting semantic coherence-exclusivity frontier [27]. We ended up selecting the model with 18 topics.
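The step of identifying models that are not strictly dominated on semantic coherence and exclusivity amounts to computing a Pareto frontier. The sketch below shows one way to do this; the scores are hypothetical placeholders rather than the values obtained for our corpus, and the function name is our own.

```python
def pareto_frontier(models):
    """Return models not strictly dominated on (coherence, exclusivity).

    models maps a label to a (semantic_coherence, exclusivity) pair, where
    larger is better on both. A model is strictly dominated if another model
    is at least as good on both scores and strictly better on at least one.
    """
    frontier = {}
    for name, (coh, exc) in models.items():
        dominated = any(
            (c >= coh and e >= exc) and (c > coh or e > exc)
            for other, (c, e) in models.items()
            if other != name
        )
        if not dominated:
            frontier[name] = (coh, exc)
    return frontier


if __name__ == "__main__":
    # Hypothetical average scores for models with K topics (made-up numbers).
    scores = {
        "K=10": (-55.0, 9.2),
        "K=14": (-60.0, 9.5),
        "K=18": (-62.0, 9.7),
        "K=22": (-66.0, 9.6),  # worse than K=18 on both scores
        "K=26": (-70.0, 9.8),
    }
    print(sorted(pareto_frontier(scores)))
```

The frontier typically contains several models, since coherence and exclusivity tend to trade off against each other; the final choice among them rests on human inspection of the estimated topics.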

The topics

Table 3 summarizes the detailed results in the form of word lists for the resulting 18-topic STM. (Recall that a topic is formally a distribution over vocabulary.) The 18 topics and associated word lists presented in Table 3 have been identified by an unsupervised machine learning algorithm as opposed to being somehow pre-determined by the researcher. In this sense, the identification of topics is devoid of any kind of preconceived notions and biases held by the researcher. The assignment of specific names to topics, however, reflects our (the researchers’) interpretation of ideas expressed in each topic. For each topic, each of us independently examined the words most associated with the topic and read through exemplar documents (essays) that feature the topic most prominently. (Recall that documents are formally modeled as mixtures of topics.) For each topic, we discussed our respective naming suggestions and eventually selected a mutually agreed-upon label.
Table 3

Topics and top words for estimated 18-topic STM

1. Comparing Cultures

Highest Prob cultur, london, peopl, differ, also, american, much, countri, citi, experi, learn, mani, british, one, interest, time, danish, class, semest, student, live, studi, environment, state, interact, divers, like, howev, can, understand

FREX london, environment, german, germani, british, danish, divers, applic, influenc, psycholog, varieti, freiburg, recycl, behavior, viewpoint, event, anticip, sustain, urban, exhibit, interact, berlin, reaction, westminst, design, respect, dessert, emphas, york, issu

2. Food Culture

Highest Prob spanish, spain, time, cultur, differ, eat, peopl, speak, also, sevilla, languag, day, famili, live, citi, meal, first, week, host, use, much, learn, dinner, get, spaniard, food, even, realli, think, abroad

FREX spaniard, spain, spanish, madrid, sevilla, lunch, sevill, eat, siesta, schedul, host, meal, dinner, mom, lifestyl, mother, andalusian, carmen, bread, tapa, semana, toledo, speak, santa, slower, argentina, dialect, feria, famili, late

3. Social Habits

Highest Prob peopl, time, danish, dane, denmark, can, get, cultur, much, like, also, realli, first, citi, way, bike, copenhagen, see, one, still, around, will, just, day, know, thing, famili, make, differ, state

FREX dane, denmark, danish, bike, copenhagen, smoke, babi, birthday, hygg, welfar, tax, host, nightlif, fashion, metro, children, young, age, drunk, belong, implement, drink, lane, happiest, januari, cozi, destin, dress, guarante, trust

4. Immersing in New Culture

Highest Prob rome, peopl, differ, cultur, life, time, italian, way, thing, new, place, experi, live, home, realiz, much, like, will, itali, first, citi, take, one, now, can, week, get, languag, american, learn

FREX rome, itali, roman, italian, trastever, chilean, lifestyl, sandwich, memori, shop, superfici, cabot, john, wash, siena, groceri, lack, european, acquir, money, miss, eastern, sunset, oppos, simpl, rack, piazza, simplic, valu, adjust

5. Work Culture & Experience

Highest Prob work, cultur, peopl, time, experi, learn, australia, class, differ, australian, think, also, one, day, abroad, will, student, first, understand, studi, life, new, mani, like, much, lot, countri, realli, way, see

FREX australia, rwanda, internship, australian, aborigin, collabor, irish, work, project, costa, late, team, assign, ngo, rica, lab, cathol, infrastructur, environ, compani, surf, colleg, rwandan, corpor, cowork, indigen, group, relax, creat, campus

6. Indigenous People & Land

Highest Prob new, zealand, cultur, time, maori, place, learn, differ, life, home, ive, much, peopl, abroad, like, make, studi, way, countri, state, see, back, student, mani, take, feel, one, howev, citi, think

FREX maori, zealand, dunedin, kiwi, auckland, island, otago, pacif, geolog, flat, land, farm, laid-back, outdoor, meat, new, cook, pakeha, refresh, recognit, landscap, protect, popul, sourc, mara, earth, groceri, respect, degre, healthi

7. Social Divides

Highest Prob dubai, peopl, differ, time, countri, uae, cultur, student, south, one, class, also, live, friend, howev, experi, arab, middl, women, life, jordan, see, american, mani, studi, new, abroad, first, like, east

FREX uae, dubai, arab, jordan, islam, segreg, durban, african, gender, east, south, middl, uganda, cape, aud, africa, debbi, femal, triniti, color, irish, women, lebanon, traffic, men, homestay, market, cast, kampala, ireland

8. Arrival & First Impressions

Highest Prob first, one, arriv, week, time, cultur, street, differ, citi, feel, even, day, walk, italian, american, experi, apart, student, scotland, new, florenc, town, seem, peopl, just, thing, immedi, abroad, almost, andrew

FREX valencia, scotland, duomo, andrew, confront, immedi, pull, crowd, varanasi, driver, town, tree, florenc, scottish, thirti, foot, rural, hot, anxieti, board, nyu, apart, arriv, highland, kathmandu, bus, bother, began, saturday, space

9. History & Art

Highest Prob class, cultur, histori, learn, experi, also, art, studi, abl, abroad, travel, differ, citi, time, scotland, andrew, florenc, univers, new, student, countri, peopl, understand, made, opportun, one, take, much, place, mani

FREX andrew, scotland, art, scottish, medic, colosseum, florenc, histori, highland, ancient, edinburgh, medicin, artist, modern, mosqu, wlu, museum, connect, entranc, sevilla, monument, lectur, scenic, debat, scienc, undergradu, castl, ruin, cathedr, knowledg

10. Institutions & Prosperity

Highest Prob chines, china, cultur, citi, differ, first, shanghai, one, languag, experi, time, peopl, week, mani, also, roommat, western, state, unit, expect, countri, nation, life, understand, live, econom, learn, feel, even, can

FREX shanghai, chines, china, singapor, beij, singaporean, western, roommat, inde, econom, skin, edit, cet, commerci, growth, toilet, undoubt, strict, incom, bed, whiten, govern, freedom, subway, achiev, room, prosper, selfi, nation, million

11. Exploring the Surroundings

Highest Prob sydney, australia, australian, time, student, differ, experi, cultur, much, univers, week, citi, one, first, get, live, like, abroad, new, studi, travel, day, peopl, mani, american, also, come, walk, know, will

FREX sydney, australian, australia, cairn, reef, aussi, bag, thailand, starbuck, univers, melbourn, mountain, beach, rainforest, hub, startup, coffe, surf, bay, opera, hall, hostel, outdoor, dive, music, backpack, uni, snorkel, although, intern

12. Travel

Highest Prob thing, realli, time, experi, one, get, peopl, like, citi, just, abroad, much, home, friend, new, will, think, travel, place, learn, class, differ, live, first, day, way, studi, even, also, back

FREX sweden, flight, florenc, term, honest, cancel, amsterdam, dublin, plane, realli, sinc, sat, pretti, stockholm, homesick, switzerland, love, glad, weekend, didnt, ice, figur, harder, anyth, fun, got, boss, cool, airport, nervous

13. Academics

Highest Prob class, student, learn, like, differ, citi, get, live, first, cultur, week, studi, realli, peopl, program, professor, much, abroad, also, one, school, feel, thing, even, sinc, can, mani, new, time, milan

FREX milan, math, budapest, hungari, hungarian, engin, edinburgh, low, professor, unsw, station, transport, navig, milanes, materi, scienc, third, nutrit, letter, greek, admit, argentina, mathemat, athen, sinc, librari, teach, min, program, unlik

14. Discovering Society & Friends

Highest Prob czech, first, peopl, pragu, friend, one, differ, learn, week, experi, languag, program, get, realli, time, cultur, will, even, citi, though, day, like, new, make, abl, also, much, place, use, food

FREX czech, pragu, republ, communism, buddi, bara, teacher, check, intens, tram, pari, card, nazi, post-communist, bed, program, cet, charl, regim, michael, alarm, server, orient, camp, strang, czechoslovakia, though, restaur, ethnic, communist

15. Conversing with People

Highest Prob time, cultur, peopl, will, much, russian, one, can, languag, learn, friend, back, differ, citi, even, experi, also, polit, like, world, feel, semest, abroad, convers, countri, japan, studi, made, greec, state

FREX russian, russia, japan, japanes, templ, greec, waiter, greek, preserv, athen, restroom, wont, west, custom, servic, surfac, literatur, partner, convers, tip, barcelona, waitress, water, afraid, hill, mediterranean, architectur, olya, rubl, polit

16. Relating to People

Highest Prob peopl, time, differ, life, thing, cultur, live, will, much, experi, like, feel, know, place, learn, way, one, can, think, just, come, want, countri, friend, greec, mani, see, world, also, understand

FREX chile, greec, india, soccer, game, cusco, aborigin, barcelonan, poverti, indian, peru, play, career, generous, sport, barcelona, cuba, catalonia, catalan, sexual, kid, refer, stadium, seen, path, movement, relationship, happier, context, valu

17. Personal Growth

Highest Prob abroad, time, experi, learn, studi, cultur, will, abl, differ, new, one, friend, first, french, student, semest, life, also, countri, travel, much, washington, lee, take, back, way, mani, class, goal, visit

FREX thank, pari, independ, abroad, climb, franc, busi, lee, washington, teach, classroom, french, aix, european, matur, bless, engag, intern, achiev, term, econom, latin, lesson, reward, alon, contin, union, set, polit, enorm

18. Coping with Challenges

Highest Prob french, like, peopl, time, cultur, one, learn, experi, way, differ, week, much, semest, walk, say, mani, franc, languag, feel, american, abroad, made, will, food, first, situat, friend, take, part, live

FREX nant, housem, french, bath, indonesia, mistak, franc, balines, bali, homeless, romain, oxford, thesi, hello, mother, tutor, honor, phase, terror, bakeri, linguist, smoke, macron, brief, croissant, verb, relish, situat, unlock, porch

We present two distinct word lists of the 30 most important words for each topic. The highest probability (highest prob) words are those that are most frequent for a given topic, but also non-exclusive and hence may be featured as highest probability words for multiple topics (e.g., ‘cultur’, ‘abroad’, ‘experi’). In contrast, FREX words reflect two criteria: they are on the one hand frequent for a given topic (as highest probability words) and at the same time relatively exclusive to that topic, with our choice of relative weights assigned to the former and the latter criteria equal to 0.25 and 0.75, respectively. As such, FREX words are particularly informative for purposes of identifying topic names and distinguishing between topics.8
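The contrast between highest probability and FREX words can be illustrated with a small sketch. It assumes one common formulation of FREX, namely a weighted harmonic mean of a word's within-topic rank (empirical CDF) on exclusivity and on frequency, with weight 0.75 on exclusivity and 0.25 on frequency as stated above. The two topics, the vocabulary, and all probabilities are hypothetical, and the function names are our own.

```python
def ecdf_ranks(values):
    """Empirical CDF value of each entry: share of entries <= that entry."""
    n = len(values)
    return [sum(1 for other in values if other <= v) / n for v in values]


def frex_scores(topic_word_probs, topic, exclusivity_weight=0.75):
    """FREX score of every word for one topic.

    topic_word_probs maps topic -> list of word probabilities (same word
    order for all topics). The score is a weighted harmonic mean of two
    within-topic ranks: exclusivity and frequency.
    """
    probs = topic_word_probs[topic]
    n_words = len(probs)
    # Exclusivity: the topic's share of each word's total probability mass.
    exclusivity = [
        probs[w] / sum(p[w] for p in topic_word_probs.values())
        for w in range(n_words)
    ]
    excl_rank = ecdf_ranks(exclusivity)
    freq_rank = ecdf_ranks(probs)
    w = exclusivity_weight
    return [1.0 / (w / e + (1.0 - w) / f) for e, f in zip(excl_rank, freq_rank)]


# Hypothetical toy example: two topics over a four-word vocabulary.
VOCAB = ["cultur", "spanish", "siesta", "london"]
PROBS = {
    "food": [0.50, 0.30, 0.15, 0.05],
    "compare": [0.55, 0.05, 0.01, 0.39],
}

if __name__ == "__main__":
    scores = frex_scores(PROBS, "food")
    ranked = sorted(zip(VOCAB, scores), key=lambda pair: pair[1], reverse=True)
    print(ranked)
```

In this toy example, ‘cultur’ is the most probable word in the hypothetical food topic but is shared with the other topic, so a more exclusive word such as ‘siesta’ receives a higher FREX score.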

The topics summarized in Table 3 therefore provide a macroscopic, statistical, machine learning-based overview of the central ideas emphasized by the students in their study abroad reflections. Inspection of Table 3 suggests that the emphasized reflections entail one or more of the following broad groups of themes: commentaries entailing distinctly cultural perceptions, thoughts on interactions with people, observations on the physical environment, and considerations of a more personal nature. Further scrutiny of Table 3 reveals that, congruent with Fig. 1, cultural elements permeate nearly all of the topics; indeed, ‘cultur’ is among the top 30 highest probability words for 17 out of 18 topics. In what follows, we briefly discuss each topic in turn, justifying the assigned topic names. The Supplementary Appendix further justifies the choice of topic names and illustrates the underlying ideas expressed by the students by providing sample quotes from the essays that feature a particular topic most prominently.

The first among seven topics that are distinctly centered on cultural reflections is Comparing Cultures. This topic entails comparative discussions of cultural diversity, variety, and different outlooks on life (FREX words include ‘divers’, ‘varieti’, ‘viewpoint’) in a wide range of local contexts (‘german’, ‘british’, ‘danish’), often with reference to either American culture (‘american’ is among top-ranked highest prob words) or to other cultures. The city of London is highlighted as a particularly prominent example of a place rich in cultural diversity. In contrast, Denmark and Germany are perceived as countries with a distinct environmentally oriented culture (‘environment’, ‘recycl’, ‘sustain’), especially when contrasted with the USA.

Food Culture encapsulates students’ reflections on food, eating habits, and corresponding lifestyle that they have observed during their study abroad (top-ranked highest prob words include ‘cultur’, ‘eat’, ‘food’, ‘meal’). Documents featuring this topic most prominently discuss food in a Spanish and Latin American context. Students reflect on eating schedules, food choices, as well as lifestyle and traditions that include gatherings of the entire family for meals during important holidays (FREX words include ‘siesta’, ‘famili’, ‘santa’, ‘semana’, ‘feria’). A number of reflections appear in the context of students’ interaction with their host families (FREX words include ‘host’, ‘mother’).

Social Habits features a description of local social habits and customs that readily stood out in students’ perceptions of the prevailing culture at their study abroad location. Many of the essays featuring this topic most prominently reflect on Denmark, but top-ranked essays also include observations from other countries and world regions. Consistent with the top-ranked FREX words for this topic, students discuss the use of bicycles as a common means of transportation, the approach to attending to young children, the tradition of celebrating birthdays, the lack of conversations among people on the metro, the practice of spending a lot of time with one’s family, local smoking and drinking habits, the vibrant city nightlife, and fashion styles.

The next topic, Immersing in New Culture, encapsulates reflective discussions of how an initially unfamiliar place with its distinct cultural norms and customs eventually became the students’ new home, resulting in an embrace of the local lifestyle (‘home’, ‘new’, ‘lifestyle’, ‘life’ are featured among the top-ranked highest prob or FREX words) and immersion into the local culture. In essays featuring this topic most prominently, students reflect on choosing to “live as the Romans do”, “adjust as opposed to stick to the old ways”, and “assimilate into a new culture” with respect to a variety of contexts, including shopping, ordering of meals, strolling through local markets (piazzas), having to dry clothes on a rack (rather than rely on a dryer), using local currency (as opposed to credit cards), and gradually gaining an understanding of local values, such as the appeal of slowing down one’s lifestyle.

Work Culture & Experience features students’ reflections on work culture at places they visited and on work experience that they gained during their study abroad; ‘work’, ‘cultur’, and ‘experi’ are all among the top-ranked highest prob words. This topic encapsulates students’ cultural insights based on work conducted in both in-class and out-of-class settings, such as during internships and in a lab. Reflections center on how the different people and collaborators with whom students interacted approach completing assignments and engage in group and team work.

Indigenous People & Land is heavily focused on indigenous, in particular Maori, culture (FREX words include ‘maori’ and ‘mara’ as the stem of marae, the focal point of the Maori village community) and on the relationship between indigenous groups and the local population (FREX words include ‘pakeha’, the Maori word for a white New Zealander of European descent, as well as ‘respect’ and ‘recognit’). The discussion of indigenous culture is often linked to reflections on the importance of land and preservation (‘land’, ‘landscape’, ‘protect’ are among the top FREX words). Multiple students tied their thoughts expressed in this topic to the US treatment of American Indian populations.

Social Divides is about the many dimensions of social cleavage that the students have witnessed during their study abroad. Students comment and reflect on the manifestations of observed social segregation with respect to race, gender, caste, and social class (highest prob and FREX words include ‘segreg’, ‘gender’, ‘color’, ‘cast’). Many of the contexts that feature this topic prominently are from Islamic nations, hence ‘islam’ is among the top-ranked FREX words. Discussions of the (unequal) treatment of men and women, however, are also prominently featured in multiple essays discussing European countries.

The next six topics entail observations on the physical environment and the society. Cultural reflections are featured in a number of these topics, although often less evidently so than in the first seven topics. Arrival & First Impressions encompasses reflections upon arrival at the study abroad location (highest prob words include ‘first’, ‘arriv’, ‘immedi’) and the corresponding observations of the physical and social environment (FREX words include ‘driver’, ‘town’, ‘rural’, ‘bus’, ‘walk’, ‘crowd’, as well as notable tourist locations such as ‘duomo’). Students further reflect on settling in at their new place (‘apart’ among FREX words is stem of apartment) as well as the associated emotional responses (highest prob words include ‘feel’ and FREX words include ‘anxieti’).

History & Art is unmistakably about reflections on the history and the art of the places where the students were deployed (‘histori’ and ‘art’ are among both highest prob and FREX words). These observations stem both from guided tours and students’ independent visits to notable locations (including museums) and from the history and history of art classes that they took in their study abroad program (while ‘medic’ among FREX words refers to the Medici dynasty from Florence, ‘medicin’ refers to observations made by pre-med students).

Institutions & Prosperity entails observations about societal institutions and economic prosperity, often in a comparative context vis-à-vis the Western world (hence ‘western’ among highest prob and FREX words; ‘skin’ and ‘edit’ refer to Chinese youth’s desire to edit selfies by whitening their skin in order to appear western). Multiple top-ranked highest prob and FREX words depict the functioning and prosperity of the economy (‘econom’, ‘commerc’, ‘growth’, ‘income’, ‘prosper’), as well as the existence of an advanced metro system (‘subway’) and types of toilets (‘toilet’) as indicators of economic development. At the same time, the students comment on the role and involvement of the government (‘govern’) and the conceptualizations of freedom.

Exploring the Surroundings contains comments based on students’ exploration of the city and the wider surroundings of their study abroad location. The availability of coffee shops is one example of such exploration (hence ‘coffe’ and ‘starbuck’ among FREX words). This topic further features students’ description of the many outdoor activities they had the opportunity to participate in (e.g., backpacking, diving, snorkeling) and the places they were able to visit (e.g., rainforest, beach, mountains, coral reef, opera hall).

The topic we named Travel is about students’ reflections on the logistics and the feelings associated with travel (‘travel’ is among highest prob words and FREX words include ‘flight’, ‘cancel’, ‘plane’, ‘airport’, ‘fun’, ‘love’, ‘nervous’, ‘homesick’). Students further emphasize the unique opportunity during their study abroad to travel to multiple locations over long weekends (hence ‘weekend’ among FREX words). Importantly, Travel is the only topic for which ‘cultur’ is not among the top 30 highest prob words.

Academics is unambiguously about students’ academic experience, narrowly defined. Top-ranked highest prob and FREX words for this topic include ‘class’, ‘student’, ‘learn’, ‘program’, ‘school’, ‘professor’, as well as ‘math’, ‘scienc’, ‘engin’, ‘librari’. This topic encapsulates students’ reflections on the nuances of academic programs they participated in and the academic culture as it pertains to the conduct of classes, professors’ teaching styles, use of facilities, and academic interactions with fellow students.

The following three topics are about different dimensions of students’ interaction with people. Discovering Society & Friends is a topic where students reflect on their early learning about the society of their study abroad location. Such early learning (‘orient’ as the stem of orientation is among top-ranked FREX words) often took place upon discovering an initial set of friends soon after arrival at the study abroad location. [‘friend’ is among top-ranked highest prob words and the stem of buddy (‘buddi’) is among the top-ranked FREX words; further FREX words entail first names of persons (Bara and Michael).] These individuals introduced students to the basic characteristics of their society, such as the Czech Republic’s pre-communist, communist, and post-communist history and ethnic makeup. Essays that feature this topic prominently further refer to study abroad experiences in Spain, France, and Germany.

Conversing with People depicts the many acts of holding conversations, a key channel of interaction with people. This is the only topic where ‘convers’ is among both highest prob and FREX words. This topic entails students’ descriptions of interactions with staff in the restaurants and customer service (‘waiter’, ‘waitress’ are among FREX words), as well as with and among new acquaintances and friends (‘friend’ is among highest prob words and ‘olya’ among FREX words captures the name of a person, Olya, with whom a student has had many conversations). The necessity to understand and speak the local language facilitates such interaction (‘languag’ is among top highest prob words and ‘russian’, ‘japanes’, ‘greek’ are among top-ranked FREX words) and allows one to learn about the society and culture.

In contrast, Relating to People captures ways of connecting with people at a deeper level, emphasizing the forming of relationships. Indeed, while ‘peopl’ is the top-ranked highest prob word, ‘relationship’ is among top-ranked FREX words. This topic is not focused on language and conversation per se, but rather on the many contexts within which students related to different individuals they have met. A commonly mentioned context is sports, with students commenting on how they played soccer or other games with local children or visited a sports match (FREX words include ‘soccer’, ‘game’, ‘play’, ‘stadium’, ‘kid’). It is through relating to people that students also learned about social and cultural issues such as poverty and the importance of local identity, respectively (‘poverti’, ‘catalan’ and ‘barcelonan’ are all among FREX words).

The final two topics in the corpus are students’ personal reflections on how the study abroad experience affected them as individuals. Personal Growth encapsulates students’ thoughts on ways in which studying abroad allowed them to mature and become more independent as well as reflect on lessons learned, goals achieved, and life more broadly (highest prob and FREX words include ‘abroad’, ‘time’, ‘experi’, ‘learn’, ‘independ’, ‘matur’, ‘life’, ‘lesson’, ‘reward’). In describing their growth as a person, many students express gratitude to individuals whom they met and who facilitated their experience (‘thank’ and ‘bless’ are among FREX words), as well as comment on how their personal growth will facilitate their on-campus engagement upon their return.

The last topic, Coping with Challenges, refers to the variety of challenges that the students faced and the approaches they took in striving to overcome them. This is the only topic where ‘situat’ (the stem of situation) is among both highest prob and FREX words. A further noteworthy FREX word is ‘mistak’. Essays featuring this topic most prominently highlight the initial lack of mastery of the local language (hence ‘french’ is among both highest prob and FREX words) that presented a challenge in both academic and social settings. The notion of experience (‘experi’ is among highest prob words) in this topic can have either a negative connotation, such as when a student was harassed by a homeless person (in one context), or a positive connotation, such as when interaction with a homeless person (in another context) allowed a student to overcome grief due to the passing of a parent. Students further illuminate stressful situations such as having to resolve differences with a housemate (hence ‘romain’ and ‘smoke’ among FREX words), the lack of comfort in Indonesian bathrooms (hence ‘bali’ and ‘indonesia’ among FREX words), concerns about terrorism and political unrest (hence ‘terror’ and ‘macron’ among FREX words), and even the challenges involved in trying to figure out the local custom when shopping in a bakery (hence ‘bakeri’ and ‘croissant’ among FREX words).

Figure 2 ranks topics based on their relative importance in the corpus. The corpus is relatively balanced when it comes to the prominence of different categories of topics with regard to the broad subject matter of reflections. The top six most prominently featured topics encompass each of the four general categories of reflections: on culture (Food Culture; Immersing in New Culture), the physical environment (Travel; Exploring the Surroundings), interaction with people (Relating to People), and personal change (Personal Growth). The only purely academic topic (Academics) is not among the most prominent topics in the corpus. This is evidence in favor of the argument that students view studying abroad primarily as a means of exposure to different cultures, people, and physical environments, as well as an opportunity for personal growth, rather than as an academic experience per se.
Fig. 2

Ranking of topics based on expected topic proportions
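As a rough illustration of how such a ranking can be obtained, the sketch below averages document-level topic proportions across essays and sorts topics by that average. It assumes that the expected topic proportion of a topic is the corpus-wide mean of its document-level proportions. The function name `rank_topics`, the placeholder topic labels, and the toy matrix are hypothetical and do not reproduce the values behind Fig. 2.

```python
import numpy as np

def rank_topics(theta, names):
    """Rank topics by their mean proportion across documents.

    theta: documents x topics matrix of topic proportions (each row sums to 1).
    names: topic labels, one per column of theta.
    Returns a list of (name, expected proportion), most prominent first.
    """
    theta = np.asarray(theta, dtype=float)
    expected = theta.mean(axis=0)      # average share of each topic in the corpus
    order = np.argsort(-expected)      # indices sorted by descending proportion
    return [(names[k], float(expected[k])) for k in order]

# Toy example with three essays and three placeholder topics (made-up numbers).
toy_theta = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3],
                      [0.1, 0.6, 0.3]])
toy_ranking = rank_topics(toy_theta, ["Topic A", "Topic B", "Topic C"])
```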

Assessing the role of experience- and student-specific factors

Empirical approach

We next make use of the defining characteristic of the STM, namely the inclusion of metadata covariates into the estimation of topics, to examine the association between topical prevalence (a measure of the degree of authors’ emphasis on particular topics) and metadata covariates. We are thereby able to address the question of whether, and if so how, the students’ emphases on specific topics, summarized in Table 3, vary systematically with observable experience-specific factors and student-level characteristics.

Multiple theoretical perspectives on the processes of sociocultural adjustment, social learning, and development of intercultural competence suggest that study abroad-related perceptions, experiences, and outcomes may vary with participant- and experience-specific characteristics [3, 21, 32, 37]. We would thus expect both the student’s background and the nature of his or her study abroad experience to influence the student’s perceptions of the study abroad experience, and hence the extent of emphasis on particular topics. In the absence of comparable empirical studies we, however, refrain from articulating specific ex ante hypotheses concerning the relationship between particular metadata covariates and students’ emphases on individual topics. Instead, we let the data speak for themselves and then rely on an inductive approach to summarize the gist of our findings based on the obtained empirical evidence.

To conduct the analysis, upon estimating our 18-topic STM, we make use of the estimateEffect function available in R’s stm package to estimate regressions with the essay-level proportion devoted to a topic as the outcome and metadata variables as covariates (see Roberts et al. [29, 30]). We present our results in the form of figures. Specifically, we plot mean differences in estimated topic proportions for two different values (a ‘treatment’ and a ‘control’) of a given document-level covariate of interest.9 We display the point estimates and the corresponding 95% confidence intervals. For easier readability, we customize the horizontal axis for each figure.
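The logic of such a comparison can be sketched with an ordinary least squares regression of a topic's essay-level proportion on a binary covariate and optional controls, reporting the coefficient on the binary covariate with a normal-approximation 95% confidence interval. This is a simplified stand-in written in Python, not the estimateEffect procedure itself: it treats the topic proportions as observed data and therefore ignores the estimation uncertainty in those proportions. The function name `treatment_difference` and the toy data are hypothetical.

```python
import numpy as np

def treatment_difference(theta_k, treated, controls=None):
    """OLS estimate of the difference in a topic's proportion between
    'treatment' (treated == 1) and 'control' (treated == 0) documents.

    theta_k:  essay-level proportions devoted to one topic.
    treated:  0/1 indicator for the covariate value of interest.
    controls: optional array of further covariates (one row per essay).
    Returns (point estimate, (lower, upper)) with a 95% normal-approximation CI.
    """
    y = np.asarray(theta_k, dtype=float)
    d = np.asarray(treated, dtype=float)
    cols = [np.ones_like(y), d]                      # intercept and indicator
    if controls is not None:
        cols.extend(np.asarray(controls, dtype=float).reshape(len(y), -1).T)
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # least squares fit
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                        # residual degrees of freedom
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)            # classical OLS covariance
    est = float(coef[1])
    se = float(np.sqrt(cov[1, 1]))
    return est, (est - 1.96 * se, est + 1.96 * se)

# Toy example: four essays, two per group (made-up proportions).
toy_est, toy_ci = treatment_difference([0.1, 0.2, 0.3, 0.4], [0, 0, 1, 1])
```

Without controls, the point estimate reduces to the difference in group means; a confidence interval that excludes zero corresponds to a statistically significant difference at the 5% level under the sketch's assumptions.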

The resulting machine learning-based statistical analysis is informative of what factors, and in what way, may be influencing students’ perceptions of their study abroad experience. However, we caution against readily interpreting our results as purely causal in nature. Despite the wide range of covariates that we include in our analysis, it is possible that there exist omitted variables that are on the one hand correlated with our metadata covariates and at the same time exert an effect on students’ perceptions of various study abroad experiences. Moreover, the sample of students who choose to study abroad is likely a non-random sample of all students (see, e.g., Goldstein and Kim [12], Stroud [34]). (Indeed, as noted in the description of metadata above, the composition of our sample with respect to various student characteristics does not fully reflect the composition of the entire student body at the university of our consideration.) If so, the unobservables that influence students’ perceptions of their study abroad experience may be correlated with unobservables that determine whether students opt to study abroad in the first place (an example may be student-specific extent of extraversion), a scenario leading to classic sample selection bias.

With these caveats in mind, we proceed as follows. We first illuminate the role of the experience-specific covariates. We then turn to examining the role of student-specific socio-demographic factors and other observable student characteristics.

Timing of reflections

We first explore whether students’ emphasis on specific topics varies systematically with the timing of their study abroad reflections. Figure 3 demonstrates that it does, at least for a subset of the topics. All else equal (i.e., controlling for all other observable metadata), essays that reflect on students’ experience ex-post (nearing or after completion of their study abroad) comparatively emphasize the topics Work Culture & Experience, Relating to People, History & Art, and Personal Growth, that is, themes that one can coherently reflect on only after adequate exposure abroad. On the other hand, essays that express students’ early reflections comparatively emphasize the topics Social Habits, Arrival & First Impressions, and Discovering Society & Friends, that is, themes featuring reflections that one is able to form relatively soon after arrival at the study abroad destination.
Fig. 3

The effect of timing of reflections

Length of time spent studying abroad

We next examine whether, and if so how, the length of time spent abroad matters for students’ emphases on specific topics. The scarce existing literature on the subject suggests that the length of time may exert an effect on a variety of study abroad outcomes (see, e.g., Dwyer [9]). Because students were expected to record early reflections during their first week abroad, early reflections should not be influenced by the total length of time spent abroad. In exploring the effect of the length of time spent abroad, we therefore condition the analysis only on students’ ex-post reflection essays. We model the length of time abroad with a simple binary variable that splits the sample of student-authors into a subsample of authors for whom the length of time abroad exceeds the sample median value (114 days) and a subsample for whom the length of time abroad is smaller than the sample median value.
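A median split of this kind can be sketched as follows. The function name `long_stay_indicator` and the toy durations are hypothetical, and assigning essays that fall exactly at the median to the shorter-stay group is an assumption made only for this illustration.

```python
import numpy as np

def long_stay_indicator(days_abroad):
    """Return a 0/1 indicator for stays longer than the sample median,
    together with the median itself.

    Assumption: observations exactly at the median are coded 0.
    """
    days = np.asarray(days_abroad, dtype=float)
    median = float(np.median(days))
    return (days > median).astype(int), median

# Toy example with made-up durations in days.
toy_indicator, toy_median = long_stay_indicator([21, 90, 110, 125, 140])
```

The resulting indicator can then serve as the ‘treatment’ versus ‘control’ covariate when comparing topic proportions across the two subsamples.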

Figure 4 summarizes the results. We find that the length of time students spend abroad indeed matters for what students emphasize when reflecting ex-post on their study abroad experience. Conditional on ex-post reflections, essays authored by students who spent comparatively more time abroad feature statistically significantly more prominently the topics Immersing in New Culture, Social Habits, and Exploring the Surroundings, that is, themes that either involve relatively more advanced cultural cognition (e.g., Immersing in New Culture) or that one can reflect on ex-post particularly well after spending a sufficient amount of time abroad (e.g., Exploring the Surroundings). On the other hand, essays written by students who spent comparatively less time abroad feature more prominently the topic Food Culture, a theme highlighting a set of relatively more easily perceptible cultural phenomena that students are able to register even if studying abroad for a relatively short period of time.
Fig. 4

The effect of the length of time spent studying abroad for ex-post essays

Study abroad location

Are students’ study abroad reflections highly location-specific or, alternatively, do students emphasize particular reflections to the same extent regardless of which part of the world they studied in? Our results demonstrate that study abroad reflections are considerably location-specific. Furthermore, different study abroad locations exhibit different effects on the prevalence of different topics. Controlling for other metadata covariates, the topic Social Divides is emphasized comparatively more in essays authored by students who studied in Africa and in Asia (Figs. 5, 6), as well as by students who experienced studying abroad in multiple regions of the world (Fig. 7). Studying in Asia is further associated with increased emphasis on the topic Institutions & Prosperity (Fig. 6).
Fig. 5

The effect of study abroad location: Africa

Fig. 6

The effect of study abroad location: Asia

Fig. 7

The effect of study abroad location: multiple world regions

In contrast, studying in Eastern Europe is associated with comparative emphasis on Academics and Discovering Society & Friends (Fig. 8), while reflections written by students who studied abroad in Western Europe comparatively emphasize History & Art, Immersing in New Culture, and Comparing Cultures, and de-emphasize Work Culture & Experience, Indigenous People & Land, Institutions & Prosperity, and Exploring the Surroundings (Fig. 9). Studying in South America is associated with increased emphasis on Food Culture (Fig. 10). Finally, studying in Oceania is on the one hand associated with increased emphasis on the topics Work Culture & Experience, Indigenous People & Land, and Exploring the Surroundings and on the other hand associated with a decreased emphasis on the topics Social Habits, Immersing in New Culture, and Personal Growth (Fig. 11).
Fig. 8

The effect of study abroad location: Eastern Europe

Fig. 9

The effect of study abroad location: Western Europe

Fig. 10

The effect of study abroad location: South America

Fig. 11

The effect of study abroad location: Oceania

As noted in the discussion of our empirical approach, the non-random character of students’ choices with respect to study abroad programs and locations prevents us from ascertaining to what extent these associations capture the causal effect of the study abroad location versus the effect of unobserved student characteristics that determine where a student chooses to study abroad. In other words, it is certainly possible that studying in less developed regions of the world renders a student comparatively more attentive to the many manifestations of the social cleavages that are on average less apparent in other parts of the world. Alternatively, however, students who are inherently relatively more receptive to social issues may be particularly eager to study abroad in locations where societal divides are especially ubiquitous.

Student characteristics

We investigate the effect of multiple student characteristics. Student’s gender exhibits a statistically insignificant effect on topical prevalence for all but one of the 18 estimated topics: all else equal, reflections authored by male students, interestingly, emphasize Immersing in New Culture comparatively more than reflections authored by female students (Fig. 12). Academic performance as measured by the grade point average (GPA) prior to studying abroad exerts an effect on students’ emphasis on two topics. Reflections by students in the bottom half of the sample GPA distribution (median GPA equals 3.53) comparatively emphasize the topic Arrival & First Impressions. On the other hand, reflections by students in the top half of the sample GPA distribution comparatively emphasize the topic Relating to People (Fig. 13).
Fig. 12

The effect of student’s gender

Fig. 13

The effect of student’s GPA

The students’ broad choice of (first) major exhibits an effect on the prevalence of multiple topics (Fig. 14). Essays authored by students who have chosen to major in the school of commerce, economics, and politics (commerce school) comparatively emphasize the topics Exploring the Surroundings, Discovering Society & Friends, and Personal Growth. In contrast, essays written by students majoring in the part of the university that hosts sciences, humanities, the arts, and social sciences other than economics and politics (the college) comparatively emphasize the topics Indigenous People & Land, History & Art, and Coping with Challenges. The decision to pursue more than one major exhibits an effect as well, albeit for fewer topics and for a different set of topics than the broad choice of first major. Reflections by students with a second major comparatively emphasize Institutions & Prosperity, while the topics Indigenous People & Land and Personal Growth are comparatively emphasized by students pursuing a single major (Fig. 15).
Fig. 14

The effect of student’s major

Fig. 15

The effect of student having a second major

Reflections authored by varsity athletes, relative to reflections authored by students who are not varsity athletes, do not comparatively emphasize or de-emphasize any of the topics (Fig. 16). This finding is perhaps not surprising in light of the fact that student-athletes at the university of our consideration compete in NCAA Division III, where academics has (de facto) priority over sports. Student involvement (or lack thereof) in Greek life, however, does have an effect. Interestingly, essays authored by students who are unaffiliated with Greek life organizations (the ‘independents’) comparatively emphasize Personal Growth (Fig. 17). In contrast, essays authored by active members of Greek life organizations comparatively emphasize the topic Work Culture & Experience.
Fig. 16

The effect of student’s status as varsity athlete

Fig. 17

The effect of student’s affiliation with Greek life organizations

The student’s status as a domestic or foreign student matters as well. The topic Exploring the Surroundings is all else equal emphasized comparatively more in reflections by domestic students than in reflections by students with an F-1 visa or resident alien status (Fig. 18). One possible explanation for this finding is that domestic students have on average had comparatively less exposure to geographic scenery and physical manifestations of culture outside of the USA and thus studying abroad particularly draws their attention to those topics.
Fig. 18

The effect of student’s status as foreign vs. domestic student

The final set of figures summarizes the results based on the examination of the role of student’s socioeconomic status as proxied by whether the student is a recipient of financial aid from the university (Fig. 19) or a QuestBridge student (Fig. 20). Student’s financial aid status matters for topical prevalence in the case of three different topics. Reflection essays authored by students on financial aid comparatively emphasize the topic History & Art, while reflection essays authored by full fee-paying students comparatively emphasize Personal Growth. Reflections by QuestBridge students distinctly emphasize the topic Social Divides; the resulting effect is large in magnitude and statistically significant despite wide confidence intervals (there are only three QuestBridge students in our sample). Mindful of potential omitted variables and the non-experimental nature of our research design, we interpret these findings as suggestive evidence that a student’s socioeconomic background influences his or her perceptions of the study abroad experience.
Fig. 19

The effect of student’s status as recipient of financial aid

Fig. 20

The effect of student’s status as QuestBridge student


Four broad conclusions can be drawn on the basis of the above-surveyed evidence on the effect of metadata covariates on topical prevalence. First and foremost, the timing of reflections, experience-specific factors, and student characteristics in general clearly exhibit an effect on students’ study abroad reflections. In other words, our empirical results suggest that students’ study abroad experience is critically shaped both by the study abroad environment and by the student’s individual background.

Second, different experience- and student-specific factors affect the emphasis on particular topics differently. For example, the emphasis on observations about the many dimensions of societal cleavages (Social Divides) is significantly influenced both by study abroad location and by student’s socioeconomic status. Neither study abroad location nor student’s socioeconomic status, however, exhibits an effect on topical prevalence of Arrival & First Impressions. The prevalence of the latter topic is instead significantly determined by the timing of reflections, a metadata covariate that, together with having studied in Western Europe, exhibits a statistically significant effect on topical prevalence for the largest number of topics (altogether seven).

Third, the particular observable student characteristic that exhibits an effect on topical prevalence for the largest number of topics (altogether six) is, interestingly, student’s broad choice of academic major. In the absence of more detailed student-level controls, this finding is at least in part likely explained by the existence of unobservable student characteristics that shape both the student’s choice of major and his or her perceptions while studying abroad.

Fourth, the prevalence of all but two among the 18 estimated topics is statistically significantly shaped by at least some experience-related or student-specific factor. Only the prevalence of the topics Conversing with People and Travel is statistically insignificantly related to specific values of any of the metadata covariates that we explored. The emphasis on two further topics, Comparing Cultures and Coping with Challenges, is statistically significantly influenced by a single respective covariate. These findings are consistent with the interpretation that especially travel and holding conversations with people, but also engaging in cultural comparisons and encountering challenges, are ubiquitous elements of virtually any study abroad experience, regardless of the location, timing of reflections, and student’s background. Among these topics, Travel in particular is the second most prominent topic in the corpus (see Fig. 2).


In this paper, we have taken a new route to analyzing students’ study abroad experiences, a subject of direct interest to scholars across multiple social science disciplines as well as international education practitioners. Drawing on a corpus of mandatory essays authored by students at a selective, private liberal arts college in the USA in order to reflect on their study abroad experiences, we have applied tools for quantitative analysis of text-as-data to characterize the salient themes emphasized in students’ reflections.

Our analysis uncovers 18 different topics that span multiple domains, including reflections on culture, observations on the physical environment, interaction with people, and comments on personal challenges and change. We then demonstrate that both the specifics of the study abroad experience, such as the length of time spent abroad and the study abroad location, and the student’s background, including his or her socioeconomic status, are important determinants of the student’s emphasis on particular topics and, thus, define his or her study abroad experience. Furthermore, different experience- and student-specific factors affect students’ emphases on particular topics differently. Our analysis thereby provides a unique insight into the complex nature of the study abroad experience and the web of factors that influence it.

Future work should attempt to address issues of causality, as well as examine to what extent our findings apply to study abroad students from other universities and colleges. We hope that our application of computational methods for analysis of text-as-data as a thus far unexplored lens for investigation of study abroad experiences will stimulate further research on study abroad, international education, and intercultural immersion more generally.


  1. A Scopus search on 'study abroad' appearing in the title, abstract, or keywords identifies more than 2000 published contributions, with the vast majority of publications dated after year 2008. An analogous search using Google Scholar reveals many more works. For a necessarily limited set of sample contributions and further references, see, e.g., Carlson and Widaman [6], Ryan and Twibell [33], Dwyer [9], Rundstrom Williams [32], Hadis [15], Anderson et al. [2], Paige et al. [26], Collentine [7], Norris and Gillespie [25], Basow and Gaugler [3], and Terzuolo [37].

  2. For example, the words 'dog' and 'bark' will appear more often in a topic about dogs, 'cat' and 'purr' in a topic about cats, while 'pet' and 'vet' may appear roughly equally in both. Documents feature multiple topics in different proportions. A document that is 20% about cats and 80% about dogs will tend to feature four times as many dog words as cat words.

  3. See for an updated list of published applications of STM.

  4. For an exposition of the formal statistical structure of the STM and computational aspects of estimation, see Roberts et al. [28].

  5. This figure does not include shorter spells abroad as part of regular coursework offered by resident faculty.

  6. We dropped four essays of students who by the time of completion of our data collection had not yet turned in both the early and the ex-post reflection essay in the required format.

  7. The STM also allows for the possibility to model topical content as a function of metadata (see, e.g., Roberts et al. [27, 28]). We do not utilize this feature of STM.

  8. In studying the word lists, it is important to keep in mind that STM-based estimates of topics are driven by correlations across documents in the occurrence of words. Thus, estimated word lists will also contain words that are on their own not particularly informative about the core ideas underlying a topic. (For example, 'can' and 'will' are among the highest-probability words for several topics.) Indeed, this is an aspect of STM that human readers cannot easily match. An author's use of a topic might rely on specific combinations of words in patterns that a human reader might find hard to discern.

  9. Roberts et al. [30: 12] note that to implement the regressions, "…the topic model should contain at least all the covariates contained in the estimateEffect regression". Accordingly, the set of metadata variables that we utilize to estimate the effect of specific metadata covariates on topical prevalence conceptually coincides with the set of metadata covariates that we utilize to estimate the topics. Practically, to estimate the effects associated with categorical and numeric variables that take on multiple values (such as, e.g., Region and GPA; see Table 2), we define and utilize in the analysis corresponding binary variables that highlight the effects of interest (e.g., Africa vs. other regions; above median GPA vs. below median GPA).
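
The mixture arithmetic in note 2 can be made concrete with a small sketch. The Python snippet below is only an illustration under stated assumptions: the six-word vocabulary, the per-topic word probabilities, and the function name expected_word_shares are hypothetical and are not taken from the estimated model or from the stm package. It computes the expected share of each word in a document that is 80% about dogs and 20% about cats.

```python
# Illustrative sketch only. The vocabulary and probabilities below are
# assumed toy values, not estimates from the paper's corpus.

# Hypothetical word distributions for two topics; each sums to 1.
# 'pet' and 'vet' are equally likely under both topics.
TOPIC_WORDS = {
    "dogs": {"dog": 0.4, "bark": 0.3, "pet": 0.2, "vet": 0.1, "cat": 0.0, "purr": 0.0},
    "cats": {"cat": 0.4, "purr": 0.3, "pet": 0.2, "vet": 0.1, "dog": 0.0, "bark": 0.0},
}


def expected_word_shares(topic_proportions, topic_words=TOPIC_WORDS):
    """Expected share of each word in a document with the given topic mix.

    The share of a word is the sum over topics of
    (topic proportion in the document) * (word probability within the topic).
    """
    shares = {}
    for topic, weight in topic_proportions.items():
        for word, prob in topic_words[topic].items():
            shares[word] = shares.get(word, 0.0) + weight * prob
    return shares


# A document that is 80% about dogs and 20% about cats.
shares = expected_word_shares({"dogs": 0.8, "cats": 0.2})

dog_share = shares["dog"] + shares["bark"]     # topic-specific dog words
cat_share = shares["cat"] + shares["purr"]     # topic-specific cat words
shared_share = shares["pet"] + shares["vet"]   # words common to both topics
ratio = dog_share / cat_share

print(f"dog words: {dog_share:.2f}, cat words: {cat_share:.2f}, "
      f"shared words: {shared_share:.2f}, dog/cat ratio: {ratio:.1f}")
```

Under these assumed symmetric topics, the topic-specific dog words are expected four times as often as the cat words, while the shared words 'pet' and 'vet' keep the same expected share whatever the topic mix.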



We are grateful to Mark Rush and Marc Conner for making this project possible. Griffin Noe provided excellent research assistance. An anonymous reviewer offered valuable comments and suggestions on an earlier draft of the manuscript.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Supplementary material

42001_2019_38_MOESM1_ESM.pdf (157 kb)
Supplementary material 1 (PDF 157 kb)


  1. Anderson, P. H., & Lawton, L. (2011). Intercultural development: study abroad vs. on-campus study. Frontiers: The Interdisciplinary Journal of Study Abroad, 21, 86–108.
  2. Anderson, P. H., Lawton, L., Rexeisen, R. J., & Hubbard, A. C. (2006). Short term study abroad and intercultural sensitivity: A pilot study. International Journal of Intercultural Relations, 30, 457–469.
  3. Basow, S. A., & Gaugler, T. (2017). Predicting adjustment of U.S. college students studying abroad: Beyond the multicultural personality. International Journal of Intercultural Relations, 56, 39–51.
  4. Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77–84.
  5. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
  6. Carlson, J. S., & Widaman, K. F. (1988). The effects of study abroad during college on attitudes toward other cultures. International Journal of Intercultural Relations, 12, 1–17.
  7. Collentine, J. (2009). Study abroad research: Findings, implications, and future directions. In M. H. Long & C. J. Doughty (Eds.), The Handbook of Language Teaching (pp. 218–233). Hoboken, NJ: Wiley-Blackwell.
  8. Dolby, N. (2004). Encountering an American self: study abroad and national identity. Comparative Education Review, 48(2), 150–173.
  9. Dwyer, M. M. (2004). More is better: The impact of study abroad program duration. Frontiers: The Interdisciplinary Journal of Study Abroad, 10, 151–163.
  10. Farrell, J. (2016). Corporate funding and ideological polarization about climate change. Proceedings of the National Academy of Sciences, 113(1), 92–97.
  11. Gentzkow, M., Kelly, B. T., & Taddy, M. (2017). Text as data. Journal of Economic Literature. (forthcoming).
  12. Goldstein, S. B., & Kim, R. I. (2006). Predictors of U.S. college students’ participation in study abroad programs: A longitudinal study. International Journal of Intercultural Relations, 30, 507–521.
  13. Grajzl, P., & Murrell, P. (2018). Toward understanding 17th century English culture: A structural topic model of Francis Bacon’s ideas. Journal of Comparative Economics. (forthcoming).
  14. Grimmer, J., & Stewart, B. M. (2013). Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis, 21(3), 267–297.
  15. Hadis, B. F. (2005). Why are they better students when they come back? Determinants of academic focusing gains in the study abroad experience. Frontiers: The Interdisciplinary Journal of Study Abroad, 11, 57–70.
  16. Hansen, S., & McMahon, M. (2016). Shocking language: Understanding the macroeconomic effects of central bank communication. Journal of International Economics, 99, S114–S133.
  17. Hansen, S., McMahon, M., & Prat, A. (2018). Transparency and deliberation within the FOMC: A computational linguistics approach. Quarterly Journal of Economics, 133(2), 801–870.
  18. Institute for International Education. (2018). A World on the Move: Trends in Global Student Mobility. New York, NY: Institute for International Education (IIE).
  19. Law, D. S. (2016). Constitutional archetypes. Texas Law Review, 95, 153–243.
  20. Lucas, C., Nielsen, R. A., Roberts, M. E., Stewart, B. M., Storer, A., & Tingley, D. (2015). Computer-assisted text analysis for comparative politics. Political Analysis, 23(2), 254–277.
  21. McLeod, M., & Wainwright, P. (2009). Researching the study abroad experience. Journal of Studies in International Education, 13(1), 66–71.
  22. Mendelson, V. G. (2004). ‘Hindsight is 20/20’: student perceptions of language learning and the study abroad experience. Frontiers: The Interdisciplinary Journal of Study Abroad, 10, 43–63.
  23. Mohr, J. W., & Bogdanov, P. (2013). Introduction-topic models: What they are and why they matter. Poetics, 41, 545–569.
  24. Mützel, S. (2015). Facing big data: Making sociology relevant. Big Data & Society, 2, 1–4.
  25. Norris, E. M., & Gillespie, J. (2009). How study abroad shapes global careers: Evidence from the United States. Journal of Studies in International Education, 13(3), 382–397.
  26. Paige, R. M., Fry, G. W., Stallman, E. M., Josić, J., & Jon, J. E. (2009). Study abroad for global engagement: The long-term impact of mobility experiences. Intercultural Education, 20(sup1), S29–S44.
  27. Roberts, M. E., Stewart, B. M., Tingley, D., Lucas, C., Leder-Luis, J., Kushner Gadarian, S. (2014). Structural topic models for open-ended survey responses. American Journal of Political Science, 58(4), 1064–1082.
  28. Roberts, M. E., Stewart, B. M., & Airoldi, E. M. (2016). A model of text for experimentation in the social sciences. Journal of the American Statistical Association, 111(515), 988–1003.
  29. Roberts, M. E., Stewart, B. M., & Tingley, D. (2016). stm: R package for structural topic models. Journal of Statistical Software. (forthcoming).
  30. Roberts, M. E., Stewart, B. M., Tingley, D., Benoit, K. (2018). Package ‘stm’. Reference manual, version January 28, 2018.
  31. Rockwell, G., & Berendt, B. (2016). On big data and text mining in the humanities. In S. ElAtia, D. Ipperciel, & O. R. Zaïane (Eds.), Data mining and learning analytics: Applications in educational research (pp. 29–40). Hoboken, NJ: Wiley.
  32. Rundstrom Williams, T. (2005). Exploring the impact of study abroad on students’ intercultural communication skills: Adaptability and sensitivity. Journal of Studies in International Education, 9(4), 356–371.
  33. Ryan, M. E., & Twibell, R. S. (2000). Concerns, values, stress, coping, health and educational outcomes of college students who studied abroad. International Journal of Intercultural Relations, 24, 409–435.
  34. Stroud, A. H. (2010). Who plans (not) to study abroad? An examination of U.S. student intent. Journal of Studies in International Education, 14(5), 491–507.
  35. Taddy, M. A. (2012). On estimation and selection for topic models. In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics, pp. 1184–1193.
  36. Terra Dotta. (n.d.). Tackling the Gender Gap in Study Abroad.
  37. Terzuolo, E. R. (2018). Intercultural development in study abroad: influence of student and program characteristics. International Journal of Intercultural Relations, 65, 86–95.
  38. Wallach, H. M., Murray, I., Salakhutdinov, R., & Mimno, D. (2009). Evaluation methods for topic models. In ICML ’09 Proceedings of the 26th Annual International Conference on Machine Learning, pp. 1105–1112.
  39. Walsh, R., & Walsh, M. (2018). In their own words: American students’ perspectives on study abroad experiences. Humanistic Psychologist, 46(2), 129–146.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Economics, The Williams School of Commerce, Economics, and Politics, Washington and Lee University, Lexington, USA
  2. CESifo, Munich, Germany
  3. Center for International Education, Washington and Lee University, Lexington, USA
