Divide and Conquer Machine Learning for a Genomics Analogy Problem

(Progress Report)
  • Ming Ouyang
  • John Case
  • Joan Burnside
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2226)

Abstract

Genomic strings are not of fixed length. They provide one-dimensional spatial data that do not divide, for conquering by machine learning, into manageable fixed-size chunks obeying Dietterich's independent and identically distributed assumption. We nonetheless need to divide genomic strings for conquering by machine learning, in this case for genomic prediction. Orthologs are genomic strings derived from a common ancestor and having the same biological function. Ortholog detection is biologically interesting since it informs us about protein divergence through evolution and, in the present context, also has important agricultural applications. The present paper indicates a means of obtaining an associated (fixed-size) attribute vector for genomic string data, and of dividing and conquering the machine learning problem of ortholog detection, herein seen as an analogy problem. The attributes are based both on the typical string similarity measures of bioinformatics and on a large number of differential metrics, many new to bioinformatics. Many of the differential metrics are based on evolutionary considerations, both theoretical and empirically observed, in some cases observed by the authors. C5.0 with AdaBoosting activated was employed, and the preliminary results reported herein on complete cDNA strings are very encouraging for eventually and usefully employing the techniques described for ortholog detection on the more readily available EST (incomplete) genomic data.
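As a rough illustration of the idea of mapping a pair of variable-length genomic strings to a fixed-size attribute vector, the following minimal sketch may help. It is not the authors' attribute set: the four attributes, the function names (`gc_content`, `kmer_set`, `pair_attributes`), and the choice of k = 3 are all assumptions made for illustration, and the classifier step (C5.0 with boosting) is not shown.

```python
from difflib import SequenceMatcher


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_set(seq: str, k: int = 3) -> set:
    """Set of all length-k substrings of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def pair_attributes(a: str, b: str, k: int = 3) -> list:
    """Hypothetical fixed-size attribute vector for a pair of strings.

    The vector always has four entries, whatever the input lengths:
    one similarity measure and three simple "differential" measures.
    These attributes are illustrative assumptions, not those of the paper.
    """
    a, b = a.upper(), b.upper()
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    jaccard = len(ka & kb) / len(union) if union else 0.0
    return [
        SequenceMatcher(None, a, b, autojunk=False).ratio(),  # string similarity
        jaccard,                                              # shared k-mer fraction
        abs(len(a) - len(b)),                                 # length difference
        abs(gc_content(a) - gc_content(b)),                   # GC-content difference
    ]
```

Each candidate pair of strings yields one such vector, and a labelled collection of these vectors (ortholog or not) is the kind of fixed-width input that a decision-tree learner can accept.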

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Ming Ouyang (1)
  • John Case (2)
  • Joan Burnside (3)
  1. Environmental and Occupational Health Sciences Institute, UMDNJ Robert Wood Johnson Medical School and Rutgers, The State University of New Jersey, Piscataway, USA
  2. Department of CIS, University of Delaware, Newark, USA
  3. Department of Animal & Food Sciences, University of Delaware, Newark, USA
