Haplotyping Populations: Complexity and Approximations

Lancia, Giuseppe and Pinotti, Maria Cristina and Rizzi, Romeo (2002) Haplotyping Populations: Complexity and Approximations. UNSPECIFIED. (Unpublished)

Abstract

We study the computational complexity of the following haplotyping problem: given a set of genotypes G, find a minimum-cardinality set of haplotypes that explains G. Here, a genotype g is an n-ary string over the alphabet {A,B,-} and a haplotype h is an n-ary string over the alphabet {A,B}. A set of haplotypes H is said to explain G if for every g in G there are h_1, h_2 in H such that h_1 + h_2 = g. The position-wise sum h_1 + h_2 is the genotype that has a '-' in the positions where h_1 and h_2 disagree, and the common value of h_1 and h_2 in the positions where they agree. We show that the problem is APX-hard even when the number of '-' symbols is at most 3 for every g in G. We give a $\sqrt{|G|}$-approximation algorithm for the general case, and a $2^{k-1}$-approximation algorithm for the case in which the number of '-' symbols is at most k for every g in G.
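
The definitions in the abstract can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the report: the function names `combine` and `explains` are hypothetical, strings stand in for haplotypes and genotypes, and it assumes that h_1 and h_2 may be the same haplotype (so a genotype with no '-' is explained by a single haplotype). It only checks whether a given set H explains G; it does not search for a minimum-cardinality set.

```python
from itertools import combinations_with_replacement


def combine(h1: str, h2: str) -> str:
    """Position-wise sum h1 + h2: keep the symbol where they agree, '-' where they differ."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return "".join(a if a == b else "-" for a, b in zip(h1, h2))


def explains(haplotypes, genotypes) -> bool:
    """Return True if every genotype equals h1 + h2 for some h1, h2 in haplotypes.

    Assumption (not stated in the abstract): h1 and h2 may coincide.
    """
    distinct = sorted(set(haplotypes))
    # All genotypes obtainable from unordered pairs, including (h, h).
    obtainable = {
        combine(h1, h2)
        for h1, h2 in combinations_with_replacement(distinct, 2)
    }
    return all(g in obtainable for g in genotypes)


if __name__ == "__main__":
    H = {"AA", "AB", "BB"}
    G = ["A-", "-B", "--"]
    print(combine("AA", "AB"))
    print(explains(H, G))
```

In this toy instance the three haplotypes AA, AB, and BB explain all three genotypes, since AA + AB = A-, AB + BB = -B, and AA + BB = --.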

Item Type:	Departmental Technical Report
Department or Research center:	Information Engineering and Computer Science
Subjects:	Q Science > QA Mathematics > QA075 Electronic computers. Computer science
Uncontrolled Keywords:	Computational biology, SNPs, haplotyping, approximation algorithms.
Report Number:	DIT-02-080
Repository staff approval on:	12 Dec 2002

Università degli Studi di Trento

Unitn-eprints.PhD
