A Study on DNA Sequence of Rice Using Scoring Matrix Method and ANOVA Technique

Dutta, Anamika; Das, Kishore K.

doi:10.1007/978-981-13-1223-6_2

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 244)

Included in the following conference series:

Platinum Jubilee International Conference on Applications of Statistics

Abstract

In this paper, 12 rice accessions have been used. The accession numbers have been taken from the article by Cho et al., where they were already used for other studies. The DNA sequences of these accessions, written in the letters A, C, G and T together with the gap character (–), have been converted into an alignment matrix with 5 rows and 7473 columns. The alignment has been done using the ClustalX software. The 7473 columns have been divided into 5 parts with different dimensions, and each part has been scored separately. The highest scores from all 5 parts have been noted. To reduce the data, the regions common to these 5 parts have been taken into consideration. A one-way ANOVA (Huck and McLean in Psychological Bulletin, 82(4), 511–518, 1975; Mukhopadhyay in Applied statistics. Books and Allied (P) Ltd., Kolkata, 2011) has then been constructed, and conclusions are drawn accordingly.

References

Cho, Y. G., Ishii, T., Temnykh, S., Chen, X., Lipovich, L., McCouch, R. S., Park, D. W., Ayres, N., & Cartinhour, S. (2000). Diversity of microsatellites derived from genomic libraries and GenBank sequences in rice (Oryza sativa L.). Theor Appl Genet, 100, 713–722. Springer-Verlag.
Hertz, Z. G., & Stormo, D. G. (1999). Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics, 15(7/8), 563–577.
Huck, W. S., & McLean, A. R. (1975). Using a repeated measures ANOVA to analyze the data from a pretest-posttest design: A potentially confusing task. Psychological Bulletin, 82(4), 511–518.
Mukhopadhyay, P. (2011). Applied statistics. Books and Allied (P) Ltd.
Pei, J. (2008). Multiple protein sequence alignment. In Current opinion in structural biology (Vol. 18, pp. 382–386). Elsevier.
Shu, J. J., Yong, Y. K., & Chang, K. W. (2012). An improved scoring matrix for multiple sequence alignment. In Mathematical problems in engineering (Vol. 2012, no. 490649, pp. 1–9).
Wallace, M. I., Blackshields, G., & Higgins, G. D. (2005). Multiple sequence alignments. In Current opinion in structural biology (Vol. 15, pp. 261–266). Elsevier.
Williams, J. L., & Abdi, H. (2010). Fisher’s least significant difference (LSD) test. In N. Salkind (Ed.), Encyclopedia of research design (pp. 1–6).

Acknowledgements

The author Miss Anamika Dutta thanks the Department of Science and Technology (DST), India, for providing financial assistance for carrying out this work as an INSPIRE Fellow. We also thank the reviewer for the thorough review and highly appreciate the comments and suggestions, which substantially contributed to improving the quality of the paper.

Author information

Authors and Affiliations

Department of Statistics, Gauhati University, Guwahati, India
Anamika Dutta & Kishore K. Das

Corresponding author

Correspondence to Anamika Dutta.

Editor information

Editors and Affiliations

Department of Statistics, University of Calcutta, Kolkata, West Bengal, India
Asis Kumar Chattopadhyay
Department of Statistics, University of Calcutta, Kolkata, West Bengal, India
Gaurangadeb Chattopadhyay

Appendix

The alignment matrix (Hertz and Stormo 1999) is illustrated here with an example. Consider the following DNA sequences, which have different lengths before alignment:

A – A C G T T C C
A C A C G T A C A
G C A A G A T – C
A C A C G T T C C

The gap character (–) appears because ClustalX inserts gaps during multiple sequence alignment.

The above alignment has been created with the ClustalX software. From these aligned sequences the alignment matrix can be formed; each entry counts how often the row symbol (–, A, C, G or T) occurs in the corresponding column of the alignment:

$$ \left[ {\begin{array}{*{20}c} \text{-} & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ A & 3 & 0 & 4 & 1 & 0 & 1 & 1 & 0 & 1 \\ C & 0 & 3 & 0 & 3 & 0 & 0 & 0 & 3 & 3 \\ G & 1 & 0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 \\ T & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 0 & 0 \\ \end{array} } \right] $$
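The counting step can be sketched in a few lines of Python. This is only an illustration of how the matrix above is obtained from the four aligned sequences, not the authors' own code; the function name `alignment_matrix` is hypothetical, and the gap character is written as an ASCII hyphen for convenience.

```python
# Aligned sequences from the example above; "-" stands for the gap character.
sequences = [
    "A-ACGTTCC",
    "ACACGTACA",
    "GCAAGAT-C",
    "ACACGTTCC",
]

ALPHABET = "-ACGT"  # row order used in the alignment matrix


def alignment_matrix(seqs, alphabet=ALPHABET):
    """Count how often each symbol occurs in each column of the alignment."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    return {
        sym: [sum(s[j] == sym for s in seqs) for j in range(length)]
        for sym in alphabet
    }


counts = alignment_matrix(sequences)
for sym in ALPHABET:
    print(sym, counts[sym])
```

Each printed row corresponds to one row of the alignment matrix, and every column sums to 4, the number of sequences.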

The weight matrix for the above example is given by:

$$ \left[ {\begin{array}{*{20}c} \text{-} & { - 3.912} & { - 1.040} & { - 3.912} & { - 3.912} & { - 3.912} & { - 3.912} & { - 3.912} & { - 1.040} & { - 3.912} \\ A & {0.759} & { - 1.609} & {1.023} & { - 0.168} & { - 1.609} & { - 0.168} & { - 0.168} & { - 1.609} & { - 0.168} \\ C & { - 1.609} & {0.702} & { - 1.609} & {0.702} & { - 1.609} & { - 1.609} & { - 1.609} & {0.702} & {0.702} \\ G & {0.488} & { - 1.609} & { - 1.609} & { - 1.609} & {1.777} & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} \\ T & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} & {1.374} & {1.374} & { - 1.609} & { - 1.609} \\ \end{array} } \right] $$
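The page does not state how the weights were computed. As a rough check, the A, C, G and T rows are numerically consistent with the weight of Hertz and Stormo (1999), $\ln\left[\frac{(n_{i,j} + p_i)/(N+1)}{p_i}\right]$, where $n_{i,j}$ is the count of letter $i$ in column $j$, $N = 4$ is the number of sequences and $p_i$ is the a priori probability of letter $i$. The priors used below are assumptions, back-fitted from the printed weights rather than taken from the source. The gap row is left out: under this formula a zero count always gives $\ln(1/5) \approx -1.609$, so the printed value of $-3.912$ must come from a different treatment of gaps.

```python
import math

N = 4  # number of aligned sequences in the example

# ASSUMPTION: a priori letter probabilities back-fitted from the printed
# weight matrix; they are not given in the source.
assumed_priors = {"A": 0.31, "C": 0.33, "G": 0.14, "T": 0.16}

# Nucleotide rows of the alignment (count) matrix from the example.
counts = {
    "A": [3, 0, 4, 1, 0, 1, 1, 0, 1],
    "C": [0, 3, 0, 3, 0, 0, 0, 3, 3],
    "G": [1, 0, 0, 0, 4, 0, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 3, 3, 0, 0],
}


def weight(n, p, total=N):
    """Hertz-Stormo style weight: ln of the smoothed frequency over the prior."""
    return math.log((n + p) / (total + 1) / p)


weights = {
    sym: [round(weight(n, assumed_priors[sym]), 3) for n in row]
    for sym, row in counts.items()
}

for sym, row in weights.items():
    print(sym, row)
```

With these assumed priors the rounded values agree with the four nucleotide rows of the printed weight matrix; a different choice of priors would give different weights.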

The highest weight in each column of the above weight matrix is:

$$ \left[ {\begin{array}{*{20}c} {0.759} & {0.702} & {1.023} & {0.702} & {1.777} & {1.374} & {1.374} & {0.702} & {0.702} \\ \end{array} } \right] $$

Hence the score of the above matrix, obtained by adding these column maxima, is:

$$ 0.759 + 0.702 + 1.023 + 0.702 + 1.777 + 1.374 + 1.374 + 0.702 + 0.702 = 9.115 $$
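The scoring step can be reproduced directly from the printed weight matrix. The sketch below simply copies the weights from the example, takes the maximum of each column and adds the maxima; the variable names are illustrative only.

```python
# Weight matrix copied from the example (rows: gap, A, C, G, T).
weight_matrix = [
    [-3.912, -1.040, -3.912, -3.912, -3.912, -3.912, -3.912, -1.040, -3.912],
    [0.759, -1.609, 1.023, -0.168, -1.609, -0.168, -0.168, -1.609, -0.168],
    [-1.609, 0.702, -1.609, 0.702, -1.609, -1.609, -1.609, 0.702, 0.702],
    [0.488, -1.609, -1.609, -1.609, 1.777, -1.609, -1.609, -1.609, -1.609],
    [-1.609, -1.609, -1.609, -1.609, -1.609, 1.374, 1.374, -1.609, -1.609],
]

# zip(*rows) iterates over columns; take the largest weight in each one.
column_maxima = [max(column) for column in zip(*weight_matrix)]

# The score is the sum of the column maxima.
score = sum(column_maxima)

print(column_maxima)
print(round(score, 3))  # 9.115
```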

This was an illustrative example of the alignment matrix and the weight matrix.

Cite this paper

Dutta, A., Das, K.K. (2018). A Study on DNA Sequence of Rice Using Scoring Matrix Method and ANOVA Technique. In: Chattopadhyay, A., Chattopadhyay, G. (eds) Statistics and its Applications. PJICAS 2016. Springer Proceedings in Mathematics & Statistics, vol 244. Springer, Singapore. https://doi.org/10.1007/978-981-13-1223-6_2

DOI: https://doi.org/10.1007/978-981-13-1223-6_2
Published: 17 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1222-9
Online ISBN: 978-981-13-1223-6
eBook Packages: Mathematics and Statistics (R0)
