The invention comprises software-implemented algorithms for "structured combinatorial queries" that may be used to analyze relatedness and information content in any textual information, and especially in biological sequences. The invention also includes experimental methods for isolating and comparing DNA fragments ("Structured Query Fragments" or SQFs) obtained using site-specific cleavage effectors acting on substrate DNA that is asymmetrically end-immobilized on a solid support. A small, structured array of such cleavage effectors may be used in a combinatorial fashion to generate progressively expanding sets of asymmetrically end-immobilized, double-stranded DNA. This ultimately yields extremely large numbers of SQFs, which typically have lengths in the range of 100-700 nucleotides (and are termed ranged SQFs). Each SQF is thus defined by a method (the specific combinatorial pathway required to isolate it) and one or more properties (typically its length). Together, these attributes provide sufficient information to identify ranged SQFs and assign them automatically to specific locations in known sequences using the disclosed software. The invention shows how millions of individual ranged SQFs distributed throughout the human genome may be unambiguously identified at nucleotide resolution using a fragment analysis instrument. Accordingly, the invention provides a flexible and efficient computational method for comparing large amounts of textual information (typically biological sequence data), and a unique laboratory strategy that emulates the computational method and provides a highly scalable approach for physical analyses of polynucleotides. This laboratory strategy allows large numbers of specific SQFs of interest to be analyzed and isolated without the use of cloning techniques or of polynucleotide amplification protocols that require locus-specific primers.
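The idea that a ranged SQF can be identified from its pathway and its length alone may be illustrated with a minimal Python sketch. It is not the disclosed software. It assumes a deliberately simplified model in which each cleavage effector is represented by a plain recognition string, a pathway is an ordered pair of such strings, and a fragment runs from an occurrence of the first string to the nearest downstream occurrence of the second. The function names (`find_sites`, `build_sqf_index`, `locate`), the example recognition strings, and the short length window used in the usage example are all hypothetical.

```python
from itertools import permutations


def find_sites(sequence, site):
    """Return every 0-based start position of `site` in `sequence`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)  # step by 1 so overlaps are found
    return positions


def build_sqf_index(sequence, sites, min_len=100, max_len=700):
    """Map (pathway, length) to the start positions of matching fragments.

    Assumed toy model: a pathway is an ordered pair (first, second) of
    recognition strings, and a fragment spans from an occurrence of `first`
    to the nearest downstream occurrence of `second`. Only fragments whose
    length lies in [min_len, max_len] are kept ("ranged" fragments).
    """
    index = {}
    for first, second in permutations(sites, 2):
        second_positions = find_sites(sequence, second)
        for start in find_sites(sequence, first):
            downstream = [p for p in second_positions if p > start]
            if not downstream:
                continue  # no closing site downstream of this start
            length = downstream[0] - start
            if min_len <= length <= max_len:
                index.setdefault(((first, second), length), []).append(start)
    return index


def locate(index, pathway, length):
    """Return the unique start position for (pathway, length), else None.

    None is returned both when no fragment matches and when several do,
    i.e. whenever the assignment would be ambiguous.
    """
    hits = index.get((pathway, length), [])
    return hits[0] if len(hits) == 1 else None


# Usage with two hypothetical recognition strings and a short toy sequence.
SITE_A, SITE_B = "GAATTC", "GGATCC"
toy = SITE_A + "A" * 10 + SITE_B + "T" * 4 + SITE_A + "C" * 20 + SITE_B
toy_index = build_sqf_index(toy, [SITE_A, SITE_B], min_len=5, max_len=50)
print(locate(toy_index, (SITE_A, SITE_B), 26))
```

In this toy model, two fragments that share both pathway and length are reported as ambiguous rather than assigned to a location. The default window of 100 to 700 mirrors the length range stated above, while the usage example narrows it only because the toy sequence is short.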