Kugaonkar MS defense: Finding Associations among SNPs for Prostate Cancer
MS Thesis Defense
Finding associations among SNPs for
prostate cancer using collaborative filtering
Rohit Kugaonkar
9:00am Wed. 18 July 2012, Room ITE 325b
Prostate cancer is the second leading cause of cancer-related deaths among men. Because prostate cancer grows slowly, less aggressive cancers sometimes do not require surgical treatment. Recent debates over prostate-specific antigen (PSA) screening have drawn new attention to the disease. Because prostate cancer is complex, studying the entire genome is essential to find genomic traits. Because studying all Single Nucleotide Polymorphisms (SNPs) is costly, it is essential to find tag SNPs that can represent other SNPs. Earlier methods that find tag SNPs from associations between SNPs either use SNP location information or are based on data with very few SNP markers in each sample. Our study is based on 2300 samples with 550,000 SNPs each. We did not use SNP location information or any predefined standard cut-offs to find tag SNPs. Our approach uses collaborative filtering methods to find pairwise associations among SNPs and thus list the top-N tag SNPs. We found 25 tag SNPs that have the highest similarities to other SNPs. In addition, we found 16 more SNPs that are highly correlated with the known high-risk SNPs associated with prostate cancer. We used some of these newly found SNPs with 5 different classification algorithms and observed some improvement in prediction accuracy over using the original known high-risk SNPs. The classifier can inform a decision to perform further testing when it returns a "yes" answer.
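To make the idea of ranking tag SNPs by pairwise similarity concrete, here is a minimal Python sketch in the style of item-based collaborative filtering. It is an illustration under stated assumptions, not the method used in the thesis: the abstract does not name the similarity measure, so the sketch assumes centered cosine similarity (equivalent to Pearson correlation), assumes genotypes are encoded as minor-allele counts 0/1/2, and uses a hypothetical function name `top_n_tag_snps`.

```python
import numpy as np

def top_n_tag_snps(genotypes, n):
    """Rank SNPs by mean absolute similarity to all other SNPs.

    genotypes: (samples x SNPs) array; 0/1/2 allele counts are an
    assumed encoding, not taken from the thesis.
    Returns (indices of the n highest-scoring SNPs, score per SNP).
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)            # center each SNP column
    norms = np.linalg.norm(g, axis=0)
    norms[norms == 0] = 1.0           # constant SNPs end up with zero similarity
    unit = g / norms                  # unit-length SNP columns
    sim = unit.T @ unit               # SNP x SNP centered cosine similarity
    np.fill_diagonal(sim, 0.0)        # ignore self-similarity
    # Average absolute similarity of each SNP to every other SNP
    score = np.abs(sim).sum(axis=1) / (sim.shape[0] - 1)
    order = np.argsort(-score)[:n]    # highest scores first
    return order, score
```

A dense SNP-by-SNP matrix is only practical for small inputs. At the scale described above (550,000 SNPs), the similarity computation would have to be done block by block rather than in a single matrix product.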
Committee: Drs. Yelena Yesha (chair), Anupam Joshi, Aryya Gangopadhyay and Michael Grasso.
Posted: July 11, 2012, 5:30 PM