A Study on DNA Sequence of Rice Using Scoring Matrix Method and ANOVA Technique

  • Conference paper
Statistics and its Applications (PJICAS 2016)

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 244))

Abstract

In this paper, 12 accession numbers of rice have been used. The accession numbers have been taken from the article by Cho et al., where they have already been used for other studies. The DNA sequences of these accessions, made up of the characters A, C, G and T along with the gap character (–), have been converted into an alignment matrix with 5 rows and 7473 columns. The alignment has been done using ClustalX software. The 7473 columns have been divided into 5 parts with different dimensions, and scoring has then been done separately for each part. The highest scores from all 5 parts have been noted down. To reduce the data, the common regions between these 5 parts have been taken into consideration. Finally, a one-way ANOVA (Huck and McLean in Psychological Bulletin, 82(4), 511–518, 1975; Mukhopadhyay in Applied statistics. Books and Allied (P) Ltd., Kolkata, 2011) has been carried out and conclusions are drawn accordingly.

References

  • Cho, Y. G., Ishii, T., Temnykh, S., Chen, X., Lipovich, L., McCouch, R. S., Park, D. W., Ayres, N., & Cartinhour, S. (2000). Diversity of microsatellites derived from genomic libraries and GenBank sequences in rice (Oryza sativa L.). Theoretical and Applied Genetics, 100, 713–722.

  • Hertz, Z. G., & Stormo, D. G. (1999). Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics, 15(7/8), 563–577.

  • Huck, W. S., & McLean, A. R. (1975). Using a repeated measures ANOVA to analyze the data from a pretest-posttest design: A potentially confusing task. Psychological Bulletin, 82(4), 511–518.

  • Pei, J. (2008). Multiple protein sequence alignment. Current Opinion in Structural Biology, 18, 382–386.

  • Shu, J. J., Yong, Y. K., & Chang, K. W. (2012). An improved scoring matrix for multiple sequence alignment. Mathematical Problems in Engineering, 2012, Article 490649, 1–9.

  • Mukhopadhyay, P. (2011). Applied statistics. Books and Allied (P) Ltd.

  • Wallace, M. I., Blackshields, G., & Higgins, G. D. (2005). Multiple sequence alignments. Current Opinion in Structural Biology, 15, 261–266.

  • Williams, J. L., & Abdi, H. (2010). Fisher’s least significant difference (LSD) test. In N. Salkind (ed.), Encyclopedia of research design (pp. 1–6).

Acknowledgements

The author Miss Anamika Dutta thanks the Department of Science and Technology (DST), India, for providing financial assistance for carrying out this work as an INSPIRE Fellow. We also thank the reviewer for the thorough review and highly appreciate the comments and suggestions, which substantially contributed to improving the quality of the paper.

Corresponding author

Correspondence to Anamika Dutta.

Appendix

The alignment matrix (Hertz and Stormo 1999) is illustrated here with an example. Let us take some DNA sequences of different lengths, shown here in aligned form:

  • A – A C G T T C C

  • A C A C G T A C A

  • G C A A G A T – C

  • A C A C G T T C C

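The gap character (–) appears when ClustalX software is used; it is introduced by the multiple sequence alignment so that all sequences have the same length.

The above alignment has been created by ClustalX software. From these aligned sequences, the alignment matrix can be formed, in which each entry counts how often the character of that row occurs in the corresponding column of the alignment: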

$$ \left[ {\begin{array}{*{20}c} \text{-} & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ A & 3 & 0 & 4 & 1 & 0 & 1 & 1 & 0 & 1 \\ C & 0 & 3 & 0 & 3 & 0 & 0 & 0 & 3 & 3 \\ G & 1 & 0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 \\ T & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 0 & 0 \\ \end{array} } \right] $$
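The counting step can be sketched in a few lines of Python. This is only an illustration of how the matrix above follows from the four aligned sequences, not the code used in the study; the names `ALIGNED`, `ALPHABET` and `alignment_matrix` are hypothetical, and the ASCII hyphen stands in for the gap character.

```python
from collections import Counter

# The four aligned sequences from the example; "-" stands for the gap character.
ALIGNED = [
    "A-ACGTTCC",
    "ACACGTACA",
    "GCAAGAT-C",
    "ACACGTTCC",
]
ALPHABET = "-ACGT"  # row order used in the alignment matrix


def alignment_matrix(sequences, alphabet=ALPHABET):
    """Return one row of per-column counts for each character in `alphabet`."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must all have the same length")
    # zip(*sequences) walks the alignment column by column.
    columns = [Counter(column) for column in zip(*sequences)]
    return [[col[ch] for col in columns] for ch in alphabet]


counts = alignment_matrix(ALIGNED)
for ch, row in zip(ALPHABET, counts):
    print(ch, row)
```

Each printed row corresponds to one row of the alignment matrix, in the order gap, A, C, G, T.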

The weight matrix for the above example is given by:

$$ \left[ {\begin{array}{*{20}c} \text{-} & { - 3.912} & { - 1.040} & { - 3.912} & { - 3.912} & { - 3.912} & { - 3.912} & { - 3.912} & { - 1.040} & { - 3.912} \\ A & {0.759} & { - 1.609} & {1.023} & { - 0.168} & { - 1.609} & { - 0.168} & { - 0.168} & { - 1.609} & { - 0.168} \\ C & { - 1.609} & {0.702} & { - 1.609} & {0.702} & { - 1.609} & { - 1.609} & { - 1.609} & {0.702} & {0.702} \\ G & {0.488} & { - 1.609} & { - 1.609} & { - 1.609} & {1.777} & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} \\ T & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} & { - 1.609} & {1.374} & {1.374} & { - 1.609} & { - 1.609} \\ \end{array} } \right] $$
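The text does not say how the weights were obtained. As a hedged sketch, the nucleotide rows are consistent with the weight of Hertz and Stormo (1999), ln[((n + p)/(N + 1))/p], where n is the count from the alignment matrix, N = 4 is the number of sequences and p is an a priori probability. The values in `PRIORS` below are assumptions, chosen only because they reproduce the printed A, C, G and T rows. The gap row is left out, since the treatment of gaps is not described.

```python
import math

# Counts for the nucleotide rows of the alignment matrix in the example.
COUNTS = {
    "A": [3, 0, 4, 1, 0, 1, 1, 0, 1],
    "C": [0, 3, 0, 3, 0, 0, 0, 3, 3],
    "G": [1, 0, 0, 0, 4, 0, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 3, 3, 0, 0],
}
N_SEQUENCES = 4

# ASSUMPTION: a priori probabilities are not given in the text; these values
# are inferred because they reproduce the printed nucleotide weights.
PRIORS = {"A": 0.31, "C": 0.33, "G": 0.14, "T": 0.16}


def weight(count, prior, n_sequences=N_SEQUENCES):
    """Log-ratio weight ln[((count + prior) / (n_sequences + 1)) / prior]."""
    return math.log((count + prior) / (n_sequences + 1) / prior)


weights = {
    base: [round(weight(n, PRIORS[base]), 3) for n in row]
    for base, row in COUNTS.items()
}
for base, row in weights.items():
    print(base, row)
```

Under this assumption a count of zero always gives ln(1/5) ≈ −1.609, whatever the prior, which matches the most frequent entry in the nucleotide rows.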

The highest weight in each column of the above weight matrix is:

$$ \left[ {\begin{array}{*{20}c} {0.759} & {0.702} & {1.023} & {0.702} & {1.777} & {1.374} & {1.374} & {0.702} & {0.702} \\ \end{array} } \right] $$

Hence the score of the above matrix, obtained by summing these highest weights, is:

$$ 0.759 + 0.702 + 1.023 + 0.702 + 1.777 + 1.374 + 1.374 + 0.702 + 0.702 = 9.115 $$
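The scoring step can be checked with a short Python sketch that takes the printed weight matrix as input, picks the largest weight in each column and adds them up. The names `WEIGHTS`, `column_maxima` and `matrix_score` are hypothetical and serve only this illustration.

```python
# Weight matrix from the example, one row per character (gap, A, C, G, T).
WEIGHTS = {
    "-": [-3.912, -1.040, -3.912, -3.912, -3.912, -3.912, -3.912, -1.040, -3.912],
    "A": [0.759, -1.609, 1.023, -0.168, -1.609, -0.168, -0.168, -1.609, -0.168],
    "C": [-1.609, 0.702, -1.609, 0.702, -1.609, -1.609, -1.609, 0.702, 0.702],
    "G": [0.488, -1.609, -1.609, -1.609, 1.777, -1.609, -1.609, -1.609, -1.609],
    "T": [-1.609, -1.609, -1.609, -1.609, -1.609, 1.374, 1.374, -1.609, -1.609],
}


def column_maxima(weights):
    """Return the highest weight found in each column of the matrix."""
    return [max(column) for column in zip(*weights.values())]


def matrix_score(weights):
    """Score of the matrix: the sum of its column maxima."""
    return sum(column_maxima(weights))


maxima = column_maxima(WEIGHTS)
score = round(matrix_score(WEIGHTS), 3)
print(maxima)
print(score)  # 9.115
```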

This was an illustrative example of the alignment and weight matrices.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

Cite this paper

Dutta, A., Das, K.K. (2018). A Study on DNA Sequence of Rice Using Scoring Matrix Method and ANOVA Technique. In: Chattopadhyay, A., Chattopadhyay, G. (eds) Statistics and its Applications. PJICAS 2016. Springer Proceedings in Mathematics & Statistics, vol 244. Springer, Singapore. https://doi.org/10.1007/978-981-13-1223-6_2
