1 Introduction

1.1 Background of this study

In an era of globalization, the Ministry of Education, Culture, Sports, Science and Technology in Japan (MEXT 2011) made ‘Foreign Language Activities’ a compulsory class from fifth grade, based on the new courses of study introduced in April 2011. ‘Foreign Language Activities’ aim to familiarize children with English by focusing on intonation and pronunciation, listening and speaking. MEXT (2014) has considered starting ‘Foreign Language Activities’ in third grade, with the aim of improving ‘listening’ and ‘speaking’ skills, and introducing an ‘English Course’ in fifth grade, which aims to improve not only ‘listening’ and ‘speaking’ skills but also ‘reading’ and ‘writing’ skills.

They have also proposed adopting effective ICT materials in order to help children recognize alphabetical letters and notice differences in intonation, characteristics, and structure between Japanese and English as a guide for the teaching support material. Many researchers study and survey teaching support materials used for teaching elementary school children.

Although many studies on tools used for early English education have been conducted, there are few studies on English education that have analysed longitudinal learning data of children using tools and modelled the effectiveness of the tools in consideration of individual differences.

1.2 Review of previous studies related to the use of technology in English language learning and to modelling educational effectiveness in consideration of individual differences using the Linear Mixed-Effect Model

According to Pourhosein Gilakjani (2017), technology helps learners adjust their own learning process and gives them access to a great deal of information that their teachers are not able to provide. Parvin and Salam (2015) reported that, by using technology, learners get the chance to increase their exposure to language in a meaningful context and construct their own knowledge. Pourhosein Gilakjani (2014) maintained that using technology can create a learning atmosphere centered around the learner rather than the teacher, which in turn creates positive changes.

However, their papers did not analyse longitudinal learning data of children using tools and model the effectiveness of the tools in consideration of individual differences.

On the other hand, many educational studies have used the Linear Mixed-Effect Model (LME) to take individual differences into account when modelling educational effectiveness. In Japan, Kawaguchi (2009) proposed using an LME to analyse school effects. His models include school-level and child-level variables as fixed effects, together with random effects accompanying the fixed effects. In other countries, Xu, Yuan, Xu, and Xu (2014) analysed Chinese high school students’ time management with regard to their math homework using the LME. Their models include class-level and student-level variables such as ‘Motivation’, ‘Arranging environment’, ‘Family homework help’, and ‘Gender’, and they analysed these factors in detail. Hsu and Kuan (2013) explored the factors that influence ICT integration by elementary and junior high school teachers in Taiwan by analysing a detailed model at the school and teacher levels. Roman and Murillo (2012) used the model to analyse elementary school achievement in math and language in Latin America according to factors at the country, school, and family socio-economic levels. Kwok, Lai, Tong, Lara-Alecio, Irby, Yoon and Yeh (2018) analysed complex longitudinal data from the English Language and Literacy Acquisition (ELLA) project in educational research.

Although many educational studies using the LME have been conducted, few have analysed longitudinal data with detailed modelling of children’s individual differences based on their detailed experience of English education.

1.3 The purpose of the study

Therefore, the purpose of this study is to analyse longitudinal data in order to propose a model of the effectiveness of a speaking-pen in support of four English skills (reading, writing, listening, and speaking) in consideration of individual differences depending on the detailed experiences of English learning. The learning conducted during the investigation used two tools to enhance English learning, namely a speaking-pen and an audio CD. Using a speaking-pen, children can learn English in much the same way as they use a pencil, without prior knowledge of, or preparation for, a PC. This method differs from learning with CALL materials, which are generally used for English education. The speaking-pen adopted in this study has a distinctive function: in addition to its conventional function of playing pre-recorded English pronunciations, it can record and play back users’ voices. Users can therefore compare their own voices with the pre-recorded English pronunciations (the speaking-pen was made by Gridmark Inc.).

In section 2, the investigation method of this study is explained. In subsections 2.1, 2.2, and 2.3, we describe the experimental design, the construction of the test, and the construction of the textbook, respectively. In section 3, we propose models of the effectiveness of a speaking-pen in support of four English skills in consideration of individual differences depending on the detailed experiences of English learning. In section 3.1, the model of the total score is shown, and in section 3.2, the models of each of the four skills are shown. In section 4, the results on the effectiveness of the speaking-pen based on the models of the total score and the four skills are shown and discussed.

2 Investigation method and constructions of implemented test and textbook

2.1 Investigation method

The research for this study was conducted from October 2013 to March 2014. Ninety second-grade students at Shukutoku elementary school, a private school, participated in this study with the consent of their guardians. In this school, children learn English twice a week beginning in first grade, focusing on conversation skills with a native English teacher. Therefore, they have high proficiency in spoken English, which increases their motivation to continue learning. This differs from children in general public elementary schools. A two-period (2 × 2) cross-over design was adopted as the experimental design: the children were divided into two groups, and both groups used both a speaking-pen and an audio CD, in different periods. A pre-questionnaire was implemented before the research experiment began. It included items for investigating the children’s English learning background, which can be found in Section 3. Based on the pre-questionnaire, the children were classified into eight categories according to their responses to three items: ‘Experience of English Learning’, ‘Practice of Home Learning’ and ‘Experience of Using a Speaking-Pen’. The children in each category were then allocated to the two groups using a Bernoulli trial with the probability of success set at 0.5, so that there would be no differences in the children’s backgrounds between the two groups. As a result, 45 children were assigned to Group 1 and the other 45 children were assigned to Group 2 (Fig. 1).
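The allocation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name `allocate`, the seed, and the toy category labels are hypothetical. Independent Bernoulli draws with probability 0.5 do not by themselves guarantee equal group sizes, so the sketch does not in general reproduce the exact 45/45 split reported in the text.

```python
import random
from collections import defaultdict

def allocate(children, p=0.5, seed=0):
    """Assign each child to Group 1 or Group 2 with one Bernoulli(p) draw.

    `children` is a list of (child_id, category) pairs, where `category`
    stands for one of the eight background categories.
    Returns {child_id: group} and the group counts within each category.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = {}
    counts = defaultdict(lambda: {1: 0, 2: 0})
    for child_id, category in children:
        group = 1 if rng.random() < p else 2  # success -> Group 1
        groups[child_id] = group
        counts[category][group] += 1
    return groups, dict(counts)

# Toy input: 90 hypothetical children spread over 8 hypothetical categories.
children = [(i, i % 8) for i in range(90)]
groups, counts = allocate(children)
```

Inspecting `counts` shows how evenly each background category is split between the two groups for a given seed.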

Fig. 1 Allocation method

The children in Group 1 learned at home using a speaking-pen during the first six weeks of the investigation (the first period), and using an audio CD during the last six weeks (the second period). The two six-week periods were separated by a four-week inactive term. The children in Group 2 learned at home using an audio CD during the first period and a speaking-pen during the second period. In terms of home learning during the investigation, the children were not forced to use either a speaking-pen or an audio CD. Their learning conditions, including the frequency, time, and means of use, depended on their own volition to learn. Four achievement tests measured their initial skills or the improvements in their learning. The first test was implemented before the first period, the second after the first period, the third before the second period, and the fourth after the second period. A post-questionnaire was administered to the children after the research experiment was complete. It included items for investigating the timing, frequency and hours of use, which can be seen in Section 3.

Fig. 2 Experimental design of investigation

2.2 Construction of achievement test

The four tests were constructed in a similar way, and this paper uses the second test for the explanation. All of the tests are composed of six parts, and their total scores add up to 100. For the first test, we refer to the previous study conducted by Tsubaki, Gonda, Kato and Maeda (2015). In Part 1 of the test, after the children read the spelling of a word and see the accompanying pictures, they connect the word and the matching picture with a line. This is considered to be an appropriate test for measuring a child’s reading ability. This part has fifteen questions and one point is given for each accurate combination, so that fifteen points can be earned in total. In Part 2 of the test, after the children see a picture, they fill in the blank with one letter for each question. This is considered an appropriate test for measuring a child’s writing ability. This part has ten questions, and two points are given for each correct answer. In Part 3 of the test, after the children read a question and see a picture, they choose the correct answer. This is considered to be an appropriate test for measuring a child’s reading ability. This part has five questions, and two points are given for each correct answer. In Part 4 of the test, after the children hear a question, they choose an appropriate answer sentence. This is considered an appropriate measurement of a child’s listening ability. This part has five questions, and four points are given for each correct answer. In Part 5 of the test, after the children listen to a sentence that contains one blank in the place of a missing word, they fill in the blank with a letter. This is considered an appropriate measurement of a child’s listening and writing abilities. This part has five questions, and four points are given for each correct answer. In Part 6 of the test, a native English teacher asks each child three questions in English, and each child answers the questions in English. This is considered an appropriate measurement of a child’s speaking ability. Some examples of questions are ‘Is this a dog?’ (accompanied by a picture of a dog); ‘What colour is this?’ (accompanied by a picture of a yellow cat); and ‘What’s this?’ (accompanied by a picture of an umbrella) (Fig. 3).
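The point allocation described above can be checked with a short Python sketch. The numbers of questions and points for Parts 1 to 5 and the total of 100 are taken from the text; the maximum for Part 6 is not stated and is only inferred here as the remainder. The names `PARTS`, `TOTAL` and `max_points` are illustrative.

```python
# (number of questions, points per question) for Parts 1-5, as stated in the text.
PARTS = {
    1: (15, 1),  # reading: connect words and pictures
    2: (10, 2),  # writing: fill in one letter
    3: (5, 2),   # reading: choose the correct answer
    4: (5, 4),   # listening: choose the answer sentence
    5: (5, 4),   # listening and writing: fill in the blank
}
TOTAL = 100  # stated total over all six parts

def max_points(parts):
    """Maximum score obtainable in each part."""
    return {part: n * pts for part, (n, pts) in parts.items()}

written_max = max_points(PARTS)
written_total = sum(written_max.values())
# Inference, not stated in the text: the points left over from the total
# of 100 must belong to Part 6 (speaking, three questions).
part6_max = TOTAL - written_total
```

Under these figures Parts 1 to 5 account for 85 points, which leaves 15 points for the three speaking questions; how those points are split across the questions is not stated.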

Fig. 3 Construction of achievement test

2.3 Construction of textbook

The textbook is composed of four units, and each unit is composed of seven sections. In the first period, the children study from the first and second units of the textbook, and in the second period, the children study from the third and fourth units. In this section, we refer to the second unit in order to describe the components of the textbook. One may refer to the study by Tsubaki et al. (2015) for a further understanding of the first unit. The components of the four units in the textbook are similar.

In Section 1, the children learn basic conversational phrases that align with the theme of the unit. The children read and listen to conversational questions and their corresponding answers, such as ‘What colour is this?’; followed by the response: ‘It’s blue.’ They can compare their pronunciation with the native English speaker’s when they use the speaking-pen to record their pronunciation.

In Section 2, the children learn a set of words that corresponds with the theme of this unit. The theme of the second unit is colour. The children learn pronunciations of colour words, such as ‘red’ and ‘yellow’. The themes of the other units are animals, food, and a birthday party. In this section, children can practice their pronunciation as they did in Section 1.

In Section 3, the children can listen to question sentences that correspond with the theme of the unit and choose correct answers after seeing a set of pictures. In this unit, the children can learn the words of colours by listening to their names. In the other units, the children can learn their numbers, as well as how to answer ‘yes’ or ‘no’.

In Section 4, after the children listen to a word, they use the speaking-pen to spell the word. This section is only available to the children who use the speaking-pen. For example, the children in Group 1 can practice spelling words during the first period, and the children in Group 2 can practice spelling words during the second period.

In Section 5, after the children see pictures of objects and listen to their corresponding names, they can practice writing the correct spelling of the words.

In Section 6, after the children read question sentences and listen to questions using a speaking-pen, they can practice choosing correct answers. This section contains two questions and is only available to the children who use the speaking-pen.

In Section 7, after the children listen to a group of words that align with the theme of this section, they can practice spelling the words. In terms of using the speaking-pen, they can learn the correct pronunciations of words by comparing their own pronunciations with those of a native English speaker.

The Common European Framework of Reference for Languages (CEFR) is a set of guidelines used to describe the achievements of learners of foreign languages throughout Europe. In Japan, CEFR-J, which is based on CEFR and adjusted for Japanese learners of English, has been proposed by Touno et al. (2010, 2012a, b). The CEFR-J descriptions corresponding to each unit of the test and textbook in this study are shown in Table 1. In the first column, reading, writing, and listening are denoted as R, W, and L, respectively. Speaking is divided into two ability categories: S1 refers to ‘Spoken interaction’, and S2 refers to ‘Spoken production’. In the second column, proficiency levels are arranged in order. For example, if we consider listening, PreA1 corresponds to ‘perception of pronunciation with which Japanese are familiar as a katakana word (loan word)’; A1.1 corresponds to ‘greeting, name, date, day of the week, numbers, words, and expressions which are used in daily life’; A1.2 corresponds to ‘words, short sentences, question, familiar and personal requests and preferences (like or dislike, route guiding, etc.)’; and A1.3 corresponds to ‘informal speech in daily conversation (personal questions, daily instructions, requests, etc.)’. In the third and fourth columns, the numbers correspond to the section numbers in the test and the textbook, respectively.

Table 1 Correspondence between Sections and CEFR-J Levels

As Table 1 shows, the sections of the test and textbook principally focus on the A1 level, that of ‘a beginner who just began learning English’.

3 Modelling the effectiveness of speaking-pen in consideration of individual differences using a linear mixed-effect model

In this section, we construct and propose models that can analyse the effectiveness of learning based on variables given in the pre- and post-questionnaire data, variables of time and variables of tools, such as the speaking-pen and audio CD.

In this study, we are interested in the effect of child i, the effect of time j, the effect of tool k, and the interaction effect between time j and tool k for the test score.

We therefore model the test score yij(k) of child i at time j with tool k by the parameter δi of each child i, the effect βj of time j, the effect γk of tool k, the interaction βγjk between time j and tool k, and the error εij(k), as shown in the first part of Table 2.

Table 2 Variables and parameters of linear mixed-effect models

Further, we are interested in how gender and the experiences of English learning affect the parameter δi of each child i. We therefore model δi by the parameter μ of ‘Mean over individual’, the fixed effect α1 of ‘Gender’, the fixed effects depending on the experiences of English learning, and the parameter ωi of ‘Individual Differences’, as shown in the second part of Table 2. The fixed effects depending on the experiences of English learning are α2 of ‘Private English School’ ((1) in Table 3), α3 of ‘Tutor’ ((2) in Table 3), α4 of ‘kindergarten with English Lesson’ ((4) in Table 3), α5 of ‘Parents Speaking English Very Well’ ((5) in Table 3), α6 of ‘Speaking-pen Experiences’ ((6) in Table 3), α7 of ‘Home Learning’ ((7) in Table 3), α8 of ‘Homework from Private English School’ ((8) in Table 3), and α9 of ‘Favour’ ((9) in Table 3). These item numbers give the correspondence between each fixed effect αm and the items of the pre-questionnaire in Table 3, which was implemented before the research experiment began in order to investigate the children’s English learning background. We are interested in the effects of the children’s English learning backgrounds on the test scores.

Table 3 Items on Pre-questionnaire

We also model the effect βj of time j by the parameter πj of ‘Mean of Time j’, the interaction α1j of ‘Gender effect at Time j’, and the interaction effects between time j and the variables depending on the experiences of English learning (α2j - α8j), as shown in the third part of Table 2.

Furthermore, the effect γk of tool k is modelled by the parameter οk of ‘Mean of Tool k’, the interaction α1k of ‘Gender effect using Tool k’, and the interaction effects between tool k and the variables depending on the experiences of English learning (α2k - α8k), as shown in the fourth part of Table 2.

Finally, we model the interaction effect βγjk by the parameter ξjk of ‘Mean of Time j × Tool k’, the fixed effect α1jk of ‘Gender effect in Time j × Tool k’, and the interaction effects between time j, tool k and the variables depending on the experiences of English learning (α2jk - α10jk), as shown in the last part of Table 2. A post-questionnaire was administered to the children after the research experiment was complete. It included items for investigating the timing, frequency and hours of use, which are listed in Table 4. The fixed effect α10jk represents the interaction Frequency × Time j × Tool k in Table 2. We are interested in these interactions.

Table 4 Items of Post-questionnaire

We then analysed the variables using a one-way analysis of variance (one-way ANOVA) in order to choose effective variables among all 69 variables. In the one-way ANOVA, the total test score and the scores of the four English skills on the first test were set as response variables. We adopted the variables that showed clear, significant trends and used them to construct the Linear Mixed-Effect Models. Furthermore, in the first period, each child’s improvement was calculated by subtracting the score on the first test from the score on the second test. In the second period, each child’s improvement was calculated by subtracting the score on the third test from the score on the fourth test. The variables chosen using the one-way ANOVA were included in the models as effects of time, tool, and the interaction between time and tool in Sections 3.1 and 3.2.
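The screening step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the scores are made up, the grouping is hypothetical, and the function names `improvement` and `one_way_anova_f` are illustrative. The sketch computes only the F statistic and its degrees of freedom; a p value would additionally require the F distribution.

```python
def improvement(before, after):
    """Per-child improvement: later test score minus earlier test score."""
    return [a - b for b, a in zip(before, after)]

def one_way_anova_f(groups):
    """F statistic and degrees of freedom of a one-way ANOVA.

    `groups` is a list of lists, one list of scores per level of the factor.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up scores for illustration only (not data from this study).
first_test = [40, 55, 60, 42, 58, 65]
second_test = [48, 57, 70, 45, 66, 80]
gains = improvement(first_test, second_test)
# Hypothetical split of the six children by a two-level questionnaire item.
f_stat, df1, df2 = one_way_anova_f([gains[:3], gains[3:]])
```

A variable would be kept for the mixed-effect model when its F statistic is significant at the chosen level.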

Table 2 represents the variables and parameters of five models (Total score model and four English skills models) proposed in this section.

3.1 Total score modelling

In this section, we propose a Model (T) of the total score.

The total score is modelled as follows:

$$ {\displaystyle \begin{array}{l}{y}_{ij(k)}={\delta}_i+{\beta}_j+{\gamma}_k+\beta {\gamma}_{jk}+{\varepsilon}_{ij(k)}\\ {}\kern14em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right)\end{array}} $$
(1)

yij(k) is the total test score of the child i at time j with tool k. δi is defined as the parameter of each child i, βj as the effect of time, γk as the effect of the tool, and βγjk as the interaction between time j and tool k. εij(k) is the error.

Furthermore, the parameter δi of each child i in Eq. (1) is modelled by the variables that showed clear, significant trends in a one-way ANOVA of the total score on the first test.

$$ {\displaystyle \begin{array}{l}{\delta}_i=\mu +{\alpha}_1+{\alpha}_2+{\alpha}_3+{\alpha}_6+{\alpha}_7+{\alpha}_8+{\alpha}_9+{\omega}_i\\ {}\kern14em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right)\end{array}} $$
(2)

The parameter μ is defined as ‘Mean over individual’, α1 as fixed effect of ‘Gender’, α2 as fixed effect of ‘Private English School’, α3 as fixed effect of ‘Tutor’, α6 as fixed effect of ‘Speaking-pen Experiences’, α7 as fixed effect of ‘Home Learning’, α8 as fixed effect of ‘Homework from Private English School’ and α9 as fixed effect of ‘Favour’. The parameter ωi expresses ‘Individual Differences’ and is assumed to be \( {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right) \).

The effect βj of time j in Eq. (1) is modelled by the variables that showed clear, significant trends in a one-way ANOVA of the improvements in the total test score.

$$ {\beta}_j={\pi}_j+{\alpha}_{1j} $$
(3)

The parameter πj is defined as ‘Mean of Time j’ and α1j as ‘Gender × Time j’, which means the fixed effect of ‘Gender effect at Time j’.

The effect γk of tool k in Eq. (1) is modelled by the variables that showed clear, significant trends in a one-way ANOVA of the improvements in the total test score.

$$ {\gamma}_k={o}_k+{\alpha}_{1k} $$
(4)

The parameter οk is defined as ‘Mean of Tool k’ and α1k as ‘Gender × Tool k’, which means the fixed effect of ‘Gender effect using Tool k’.

The effect βγjk in Eq. (1) is modelled by the variables that showed clear, significant trends in a one-way ANOVA of the improvements in the total test score.

The result of the one-way ANOVA showed that ‘Frequency’ was significant for the improvement in the first period, and ‘Gender’ was significant for the improvement in the second period. We considered this to be the result of interactions between these variables and the effect of time. Accordingly, these interactions are included in the model.

$$ \beta {\gamma}_{jk}={\xi}_{jk}+{\alpha}_{1 jk}+{\alpha}_{10 jk} $$
(5)

The parameter ξjk is defined as ‘Mean of Time j × Tool k’, α1jk as ‘Gender × Time j × Tool k’, which means the fixed effect of ‘Gender effect in Time j × Tool k’, and α10jk as ‘Frequency × Time j × Tool k’, which means the fixed effect of ‘Frequency effect in Time j × Tool k’.
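For reference, substituting Eqs. (2) to (5) into Eq. (1) gives Model (T) in a single line. This is only a restatement of the equations above, with the terms grouped under the component they come from:

$$ {y}_{ij(k)}=\underbrace{\mu +{\alpha}_1+{\alpha}_2+{\alpha}_3+{\alpha}_6+{\alpha}_7+{\alpha}_8+{\alpha}_9+{\omega}_i}_{\delta_i}+\underbrace{{\pi}_j+{\alpha}_{1j}}_{\beta_j}+\underbrace{{o}_k+{\alpha}_{1k}}_{\gamma_k}+\underbrace{{\xi}_{jk}+{\alpha}_{1 jk}+{\alpha}_{10 jk}}_{\beta {\gamma}_{jk}}+{\varepsilon}_{ij(k)},\kern1em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right),\kern0.5em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right) $$

In this form the only random terms are ωi and εij(k); every other term is a fixed effect.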

3.2 Four English skills modelling

In this section, we propose the models of four English skills. The four models are constructed in the same way as Model (T).

(1) The Model of reading score, Model (R)

Reading score is modelled using parameters in Table 2 as follows:

$$ {\displaystyle \begin{array}{l}{y}_{ij(k)}={\delta}_i+{\beta}_j+{\gamma}_k+\beta {\gamma}_{jk}+{\varepsilon}_{ij(k)}\\ {}\kern14em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right)\end{array}} $$
(6)
$$ {\displaystyle \begin{array}{l}{\delta}_i=\mu +{\alpha}_1+{\alpha}_2+{\alpha}_3+{\alpha}_6+{\alpha}_7+{\alpha}_8+{\alpha}_9+{\omega}_i\\ {}\kern14em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right)\end{array}} $$
(7)
$$ {\beta}_j={\pi}_j+{\alpha}_{5j}+{\alpha}_{7j}+{\alpha}_{8j} $$
(8)
$$ {\gamma}_k={o}_k+{\alpha}_{5k}+{\alpha}_{7k}+{\alpha}_{8k} $$
(9)
$$ \beta {\gamma}_{jk}={\xi}_{jk}+{\alpha}_{5 jk}+{\alpha}_{7 jk}+{\alpha}_{8 jk}+{\alpha}_{10 jk} $$
(10)

(2) The Model of writing score, Model (W)

Writing score is modelled using parameters in Table 2 as follows:

$$ {\displaystyle \begin{array}{l}{y}_{ij(k)}={\delta}_i+{\beta}_j+{\gamma}_k+\beta {\gamma}_{jk}+{\varepsilon}_{ij(k)}\\ {}\kern14em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right)\end{array}} $$
(11)
$$ {\displaystyle \begin{array}{l}{\delta}_i=\mu +{\alpha}_2+{\alpha}_3+{\alpha}_7+{\alpha}_8+{\alpha}_9+{\omega}_i\\ {}\kern14em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right)\end{array}} $$
(12)
$$ {\beta}_j={\pi}_j+{\alpha}_{1j} $$
(13)
$$ {\gamma}_k={o}_k+{\alpha}_{1k} $$
(14)
$$ \beta {\gamma}_{jk}={\xi}_{jk}+{\alpha}_{1 jk} $$
(15)

(3) The Model of listening score, Model (L)

Listening score is modelled using parameters in Table 2 as follows:

$$ {\displaystyle \begin{array}{l}{y}_{ij(k)}={\delta}_i+{\beta}_j+{\gamma}_k+\beta {\gamma}_{jk}+{\varepsilon}_{ij(k)}\\ {}\kern14em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right)\end{array}} $$
(16)
$$ {\displaystyle \begin{array}{l}{\delta}_i=\mu +{\alpha}_1+{\alpha}_2+{\alpha}_3+{\alpha}_8+{\alpha}_9+{\omega}_i\\ {}\kern14em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right)\end{array}} $$
(17)
$$ {\beta}_j={\pi}_j+{\alpha}_{1j}+{\alpha}_{2j} $$
(18)
$$ {\gamma}_k={o}_k+{\alpha}_{1k}+{\alpha}_{2k} $$
(19)
$$ \beta {\gamma}_{jk}={\xi}_{jk}+{\alpha}_{1 jk}+{\alpha}_{2 jk} $$
(20)

(4) The Model of speaking score, Model (S)

Speaking score is modelled using parameters in Table 2 as follows:

$$ {\displaystyle \begin{array}{l}{y}_{ij(k)}={\delta}_i+{\beta}_j+{\gamma}_k+\beta {\gamma}_{jk}+{\varepsilon}_{ij(k)}\\ {}\kern14em {\varepsilon}_{ij(k)}\sim N\left(0,{\sigma}^2\right)\end{array}} $$
(21)
$$ {\displaystyle \begin{array}{l}{\delta}_i=\mu +{\alpha}_1+{\alpha}_2+{\alpha}_3+{\alpha}_4+{\alpha}_5+{\alpha}_6+{\alpha}_7+{\alpha}_9+{\omega}_i\\ {}\kern14em {\omega}_i\sim N\left(0,{\sigma}_{\delta}^2\right)\end{array}} $$
(22)
$$ {\beta}_j={\pi}_j+{\alpha}_{2j}+{\alpha}_{4j}+{\alpha}_{5j} $$
(23)
$$ {\gamma}_k={o}_k+{\alpha}_{2k}+{\alpha}_{4k}+{\alpha}_{5k} $$
(24)
$$ \beta {\gamma}_{jk}={\xi}_{jk}+{\alpha}_{2 jk}+{\alpha}_{4 jk}+{\alpha}_{5 jk} $$
(25)

Model (R), Model (W), Model (L), and Model (S) were selected from all the possible models formed from every combination of the variables chosen using the one-way ANOVA. The AIC values of all the possible models were compared, and the models with the minimum AIC were adopted.
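The selection rule described above, fitting every combination of candidate variables and keeping the one with the smallest AIC, can be sketched in Python as follows. This is an illustration under stated assumptions: `toy_fit` is a hypothetical stand-in with made-up log-likelihoods, not a real mixed-model fit, and the variable names are placeholders.

```python
from itertools import combinations

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def all_subsets(variables):
    """Every combination of the candidate variables, including the empty set."""
    for r in range(len(variables) + 1):
        for subset in combinations(variables, r):
            yield subset

def select_min_aic(variables, fit):
    """Return (best_subset, best_aic).

    `fit(subset)` must return (log_likelihood, n_params) for the model
    that uses exactly the variables in `subset`.
    """
    scored = ((subset, aic(*fit(subset))) for subset in all_subsets(variables))
    return min(scored, key=lambda pair: pair[1])

# Hypothetical stand-in for a real model fit: made-up log-likelihoods in
# which 'gender' and 'private_school' help and 'tutor' barely does.
def toy_fit(subset):
    gain = {"gender": 5.0, "private_school": 3.0, "tutor": 0.2}
    log_lik = -100.0 + sum(gain[v] for v in subset)
    return log_lik, 2 + len(subset)  # 2 = intercept + residual variance

best_subset, best_aic = select_min_aic(
    ["gender", "private_school", "tutor"], toy_fit
)
```

With these made-up numbers, a variable is retained only when it raises the log-likelihood by more than one unit per added parameter, which is the trade-off that AIC encodes.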

3.3 Discussions for the models

The variables and parameters composing all the models described in Sections 3.1 and 3.2 are shown in Table 5. The variables and parameters included in the models are represented as ‘1’ and those not included in the models are left blank.

Table 5 Variables and parameters of linear mixed-effect models

In all five models, the effects of ‘Private English School’ α2, ‘Tutor’ α3 and ‘Favour’ α9 were modelled in the parameter δi as fixed effects. The distinct feature of Model (T) is that it models the interactions of ‘Gender’ with ‘Time’ (α1j), ‘Tool’ (α1k), and ‘Time × Tool’ (α1jk). In addition, Model (T) includes the effect of ‘Frequency’ α10jk. The distinct feature of Model (R) is that it models the interactions of the children’s backgrounds, such as ‘Parents’ (α5j, α5k, α5jk), ‘Home Learning’ (α7j, α7k, α7jk), and ‘Homework from Private English School’ (α8j, α8k, α8jk), with ‘Time’, ‘Tool’, and ‘Time × Tool’ respectively. Model (W) contains the fewest effects of all the models. As can be seen in Table 1, the level of writing in the textbook was only PreA1, which seems to be why Model (W) became the simplest of the models. The distinct feature of Model (L) and Model (S) is that they model the interactions of ‘Private English School’ with ‘Time’ (α2j), ‘Tool’ (α2k), and ‘Time × Tool’ (α2jk). Note that the interactions with ‘Gender’ (α1j, α1k, α1jk) were modelled in Model (L) and Model (W) but not in Model (S).

4 Results and discussions for all the models

The analysis in this study was carried out by restricted maximum likelihood (REML) estimation using the MIXED procedure of SAS. For each variable, one level was fixed at 0 so that the values of the other levels could be estimated. For example, for the variable ‘gender’, the value of ‘girl’ is estimated on the condition that the value of ‘boy’ (level 0) is set at 0.
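The reference-level convention described above can be illustrated with a short Python sketch. The function name `effect_of_level` and the numeric estimate are hypothetical and chosen only for illustration; they are not values from Table 6.

```python
def effect_of_level(level, estimates, reference):
    """Contribution of a categorical level under reference-level coding.

    The reference level is fixed at 0; the estimate of every other level
    is its difference from the reference level.
    """
    if level == reference:
        return 0.0
    return estimates[level]

# Hypothetical estimate for illustration only (not a value from Table 6).
gender_estimates = {"girl": 1.5}
boy_effect = effect_of_level("boy", gender_estimates, reference="boy")
girl_effect = effect_of_level("girl", gender_estimates, reference="boy")
```

Read this way, a positive estimate for ‘girl’ means a higher expected score than for ‘boy’ with the other terms held equal, and a negative estimate means a lower one.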

Table 6 presents the significant estimated values of each model. In Table 6, ‘Model (T)’, ‘Model (R)’, ‘Model (W)’, ‘Model (L)’, and ‘Model (S)’ are denoted ‘T’, ‘R’, ‘W’, ‘L’, and ‘S’ respectively. Table 7 explains the marks used for the p values in Table 6.

Table 6 The estimated significant values of five models
Table 7 Mark of the p value

4.1 Results of model (T)

The interaction (8.781) of ‘Time (j = 2) × Tool (k = 2: speaking-pen)’ was significantly positive. The results show that the children who learned using a speaking-pen during the first period were, on average, able to improve their own overall ability more than the children who used an audio CD. It seems that the speaking-pen is more effective than an audio CD in the early stage of English learning.

In terms of variances, the estimated variance of ‘Individual Differences’ (283.92) was larger than that of ‘Error’ (86.42). It seems that individual differences were a large source of variance in Model (T).

4.2 Discussion for four English skills models

4.2.1 Results of model (R)

The effect (3.719) of ‘Frequency (five-seven times / week)’ in the interaction ‘Time (j = 1) × Tool (k = 2: speaking-pen)’ was significantly positive. The result shows that the children who used a speaking-pen five to seven times a week tended to score higher on the first test than those who seldom used a speaking-pen.

In terms of variances, the estimated variance of ‘Individual Differences’ (24.43) was larger than that of ‘Error’ (12.39). It seems that individual differences were a large source of variance in Model (R).

4.2.2 Results of model (W)

Fewer significant effects were observed in Model (W) than in the other models. It seems that the structure of writing is simpler than that of the other skills. No significant advantage of the speaking-pen over the audio CD was observed.

In terms of variances, the estimated variance of ‘Individual Differences’ (42.14) was larger than that of ‘Error’ (19.07). It seems that individual differences were a large source of variance in Model (W).

4.2.3 Results of model (L)

The interaction (3.598) between ‘Time (j = 2)’ and ‘Tool (k = 2: speaking-pen)’ was significantly positive. The listening abilities of the children who used a speaking-pen during the first period tended to increase more than those of the children who used an audio CD. The speaking-pen gives the children an opportunity to listen to the same words repeatedly, while an audio CD requires one to listen to the entire track. Thus, children can listen to pronunciations slowly and clearly using the speaking-pen, which may account for these results.

The effect (−6.667) of ‘Gender (girl)’ in the interaction between ‘Time (j = 2)’ and ‘Tool (k = 2: speaking-pen)’ was significantly negative. Therefore, the boys were able to improve their own listening ability more than the girls in the first period.

In terms of variances, the two estimated values, ‘Individual Differences’ (14.13) and ‘Error’ (13.40), were at about the same level. It seems that the individual differences (14.13) were relatively small in listening since the tools focused on improving listening ability.

4.2.4 Results of model (S)

The interactions between ‘Tool (k = 2: speaking-pen)’, ‘Private English School (past)’ and each of the ‘Time (j = 2)’ (9.441), ‘Time (j = 3)’ (6.601), and ‘Time (j = 4)’ (5.455) factors were significantly positive. It seems that the children could recall the knowledge or experiences they had gained when they learned at a private English school.

In terms of variances, the estimated value of the variance of ‘Individual Differences’ (4.38) was smaller than ‘Error’ (7.67). It seems that the children could consistently acquire speaking skills using tools such as a speaking-pen and an audio CD.

5 Summary of the study

In this study, set in an era of globalization, we modelled and analysed the effects of learning tools such as a speaking-pen and an audio CD, in consideration of the children’s backgrounds and individual differences, using the Linear Mixed-Effect Model in order to investigate how the children’s experiences of English learning affect their potential English skills and improve their learning.

In section 1, we noted that, in an era of globalization, it is important for Japanese learners to adopt ICT tools in order to learn the four skills (reading, writing, listening, speaking) of English. The purpose of the study was explained after a review of previous studies related to the use of technology in English education.

In section 2, the investigation method was illustrated and detailed explanations of the test and the textbook were provided based on the four English skills to be assessed.

In section 3, we modelled the effects of the children’s backgrounds based on the pre-questionnaire and ‘Time’ and ‘Tool’ by using a Linear Mixed-Effect Model after the effective variables, whose clear trends were observed using a one-way analysis of variance, were selected.

In section 4, the estimated values of each model were presented, and the significant interaction effects between the use of the speaking-pen, the children’s backgrounds, and ‘Time’ were discussed.

We found the following four significant results, which suggest that the students studied more with the pen.

The interaction (8.781) of ‘Time (j = 2) × Tool (k = 2: speaking-pen)’ in Model (T) and the interaction (3.598) of ‘Time (j = 2) × Tool (k = 2: speaking-pen)’ in Model (L) were significantly positive. The results show that a speaking-pen was effective for improving overall skills, and particularly listening ability, in the first period.

In addition, it should be noted that differences in effects depending on the children’s individual backgrounds were observed in each skill. In terms of reading, the effect (3.719) of ‘Frequency (five-seven times / week)’ in the interaction ‘Time (j = 1) × Tool (k = 2: speaking-pen)’ was significantly positive. The result shows that the children who used a speaking-pen five to seven times a week tended to score higher on the first test than those who seldom used a speaking-pen.

In terms of listening, the speaking-pen was effective for boys who had learned in the first period, because the effect (−6.667) of ‘Gender (girl)’ in the interaction between ‘Time (j = 2)’ and ‘Tool (k = 2: speaking-pen)’ was significantly negative.

In terms of speaking, the speaking-pen was effective for the children who had previously learned at a private English school, because the interactions between ‘Tool (k = 2: speaking-pen)’, ‘Private English School (past)’ and each of the ‘Time (j = 2)’ (9.441), ‘Time (j = 3)’ (6.601), and ‘Time (j = 4)’ (5.455) factors were significantly positive.

In terms of variances, the individual-difference variances of the total model (283.92), reading model (24.43), and writing model (42.14) were clearly larger than the corresponding error variances (86.42, 12.39, and 19.07), whereas that of the listening model (14.13) was at about the same level as its error variance (13.40) and that of the speaking model (4.38) was smaller than its error variance (7.67). It seems that the tools, such as the speaking-pen and the audio CD, provide stable effects on listening and speaking.
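One way to put this variance comparison on a common scale is the share of the total variance that is due to individual differences, σδ² / (σδ² + σ²), often called the intraclass correlation. This quantity is not reported in the original analysis; the Python sketch below simply computes it from the variance estimates quoted in Section 4, and the names `VARIANCES` and `individual_share` are illustrative.

```python
# (individual-difference variance, error variance) as quoted in Section 4.
VARIANCES = {
    "T": (283.92, 86.42),
    "R": (24.43, 12.39),
    "W": (42.14, 19.07),
    "L": (14.13, 13.40),
    "S": (4.38, 7.67),
}

def individual_share(var_individual, var_error):
    """Share of the total variance due to individual differences."""
    return var_individual / (var_individual + var_error)

shares = {model: individual_share(*v) for model, v in VARIANCES.items()}
for model, share in shares.items():
    print(f"Model ({model}): {share:.3f}")
```

With these figures the share is above one half for Models (T), (R), (W) and, narrowly, (L), and below one half only for Model (S).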