Close and finish your microbial genome with two libraries
- Flexible insert sizes: gel-based 20 kb and gel-free < 8 kb
- Enable microbial genome closure using 20 kb protocol with your existing Illumina sequencer
- Ideal for NGS de novo genome assembly, closure and finishing, chromosomal rearrangement detection, haplotyping, and BAC sequencing
- Increased N50s and larger scaffolds from your assembly in Next Gen Sequencing
Table of Contents
- More accurate assemblies to close and finish your genome
- How does it work?
- Chimera Code™ sequences
- Insert size flexibility
- Indices for multiplexing libraries
- NxSeq Scripts and Sample Data Set
- NxSeq Genome Closure Services
More accurate assemblies to close and finish your genome
Fragment libraries alone are not sufficient to fully assemble genomes, because repetitive elements make the correct order and orientation of contigs impossible to determine. Long span read data (obtained through mate pair libraries, jumping libraries, or informative mate pairs) can be combined with fragment libraries to assemble next generation sequencing data into large scaffolds, enabling easier genome closure and finishing. Adding mate pair libraries can make cost-effective genome closure a reality, with limited manual sequencing required for small genomes.
Figure 1: De novo assembly and closing Thermus aquaticus
Metric | JGI Permanent Draft Genome | Fragment library | NxSeq® Mate Pairs + fragment library | Manually finished Thermus aquaticus genome |
# Contigs > 500 bp | 22 | 54 | 22 | NA |
Contig N50 | 106 kb | 79 kb | 144 kb | NA |
Max contig achieved | 343,213 | 167,687 | 260,181 | NA |
Genome scaffolds > 5kb | 0 | 0 | 1 | 1 |
Max scaffold achieved | 343,213 | 0 | 2,161,678 | 2,158,963 |
Genome size | 2,338,193 | 2,256,923 | 2,161,678 | 2,338,240 |
Plasmid scaffolds | ? | 0 | 2 | 4 |
Plasmid sizes | ? | NA | 14.5 kb, 70.3 kb | 14,047 bp, 16,597 bp, 78,727 bp, and 69,906 bp |
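The N50 values reported in the table follow the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. A minimal Python sketch of that calculation is given here for illustration. The function name `n50` and the toy contig lengths are assumptions made for this example; they are not taken from the NxSeq scripts or from the assemblies above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not from the assemblies above).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

Joining contigs into fewer, longer scaffolds moves more of the total length into the largest sequences, which is why adding mate pair data tends to raise N50.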
Fragment library
5,001,861 reads (4,656,638 mappable). 2.5 M reads were assembled in SPAdes with a k-mer size of 45 (K45).
- Contigs > 1kb: 163
- Contigs > 500 bp: 184
- N50: 56,903
- Max contig: 179,213 bp
8 kb NxSeq library
Megaruptor sheared
- Raw reads: 6,909,356
- True mate pairs: 3,288,275 (48% of raw)
- Mate pair distance: 8,310 bp
De novo assembly of E. coli K12 genome. 2.5M fragment reads were assembled de novo into 163 contigs over 1 kb by SPAdes 3.1. Scaffolding was performed with commercial software using 3.2M 8 kb mate pairs. The single scaffold was compared to a reference genome with Mauve 2.3.1.
Figure 3: Assembly of Repeat-Rich Mouse BACs. Assembly Forms One 171 kb Scaffold
Sequence assembly for two repeat-rich mouse BACs. The sequences were assembled with DNAStar software using Ion Torrent 400 bp fragments and 5 kb NxSeq sequence data. Despite having over 50% repeat sequence, two BACs were each assembled into single scaffolds of 171 kb (shown) and 143 kb (not shown).
How does it work?
Lucigen has created a new paradigm in long span read technology through highly efficient mate pair library preparation. Genomic DNA is sheared to the desired size (2-8 kb for bead-based methods and 10-20 kb for gel-based sizing methods), end repaired, A-tailed, and ligated to barcode adaptors prior to size selection. The insert is ligated to a unique multiplex coupler with encrypted Chimera Code™ sequences. Samples are then treated with exonuclease to remove unwanted DNA, and finally digested with a selection of endonucleases to produce correctly sized di-tags. Biotin capture allows for the removal of unwanted DNA fragments prior to the addition of a Junction Code adaptor and re-circularization. Libraries are then PCR amplified and sequenced on an Illumina sequencer.
Figure 4. NxSeq Long Mate Pair Library Workflow

Chimera Code™ Sequences
Lucigen's patent-pending Chimera Code sequences are the key to achieving ultra-high frequencies of true mate pairs, ensuring the most accurate assembly possible. Software analysis of final sequences filters out false mate pairs formed by chimeras during the library prep process. As a result, most libraries achieve >90% true mate pair efficiency.
Figure 5. Chimeric Read Detection
Figure 6.
Metric | E. coli DH10B 2 kb | E. coli DH10B 5 kb | E. coli DH10B 8 kb |
Raw Reads | 6,377,792 | 5,995,974 | 6,851,682 |
Total Mates | 2,167,286 | 2,242,930 | 3,091,359 |
True Mate Pairs | 2,071,267 (96%) | 2,094,413 (93%) | 2,938,426 (95%) |
Chimeric Reads | 96,019 (4%) | 148,517 (7%) | 152,933 (5%) |
Avg. Read Length (after split) | 170 b | 161 b | 159 b |
Total Mate Pair Bases | 352,115,390 | 337,200,493 | 467,209,734 |
Mapped Mate Pair Distance | 2,543 | 5,145 | 6,191 |
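The percentages in the table can be reproduced from the raw counts. The short Python sketch below recomputes the true mate pair and chimeric fractions as shares of Total Mates. It also shows that the Total Mate Pair Bases figures match True Mate Pairs multiplied by the average read length, which is an observation from the numbers rather than a documented formula. The dictionary keys and the `summarize` helper are hypothetical names chosen for this example.

```python
# Counts copied from the table above; key names are hypothetical labels.
LIBRARIES = {
    "2 kb": {"total_mates": 2_167_286, "true_pairs": 2_071_267,
             "chimeric": 96_019, "avg_read_len": 170},
    "5 kb": {"total_mates": 2_242_930, "true_pairs": 2_094_413,
             "chimeric": 148_517, "avg_read_len": 161},
    "8 kb": {"total_mates": 3_091_359, "true_pairs": 2_938_426,
             "chimeric": 152_933, "avg_read_len": 159},
}


def summarize(lib):
    """Recompute the derived rows of the table for one library."""
    total = lib["total_mates"]
    return {
        # Percentages are expressed relative to Total Mates, not Raw Reads.
        "true_pct": round(100 * lib["true_pairs"] / total),
        "chimeric_pct": round(100 * lib["chimeric"] / total),
        # Assumption: bases = true mate pairs x average read length.
        "mate_pair_bases": lib["true_pairs"] * lib["avg_read_len"],
    }


for name, lib in LIBRARIES.items():
    print(name, summarize(lib))
```

In each library, true mate pairs plus chimeric reads add up to Total Mates, so the two percentages sum to 100%.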
Insert Size Flexibility – You Choose Your Library Size
The NxSeq Long Mate Pair Library Kit can accommodate a wide range of insert sizes to fit your needs. Bead-based, gel-free fragment sizing protocols enable libraries up to 8 kb insert size, while gel-based sizing protocols will accommodate 10-20 kb insert size. The result is tight sizing of your mate pairs, enabling accurate and complete bioinformatic assembly.
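As a planning aid, the size ranges above can be written as a small lookup. The Python sketch below is illustrative only: the function name `sizing_protocol` is hypothetical, the thresholds are the ones stated in the text (up to 8 kb gel-free, 10-20 kb gel-based), and sizes the text does not cover return `None` rather than a guess.

```python
def sizing_protocol(insert_kb):
    """Map a target insert size (in kb) to the sizing approach named in the text.

    Returns None for sizes the text does not describe
    (between 8 and 10 kb, or above 20 kb).
    """
    if insert_kb <= 0:
        raise ValueError("insert size must be positive")
    if insert_kb <= 8:
        return "bead-based, gel-free"
    if 10 <= insert_kb <= 20:
        return "gel-based"
    return None


for size in (5, 8, 9, 20):
    print(size, "kb ->", sizing_protocol(size))
```

For sizes that return `None`, consult the user manual instead of extrapolating from this sketch.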
Figure 7. Long Mate Pair Libraries

An 8 kb NxSeq Long Mate Pair library was constructed using bead-based, gel-free methods, and a 10-20 kb mate pair library was constructed using gel isolation. Resulting true mate pairs were mapped against the respective reference genome to determine the resulting mate pair distances.
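A mate pair distance of this kind can be estimated from the mapped coordinates of the two reads in each pair. The Python sketch below is a simplified illustration, not the method used to produce the figure. It assumes 0-based start positions on a single reference sequence, ignores read length and orientation, and treats the genome as circular (as most bacterial chromosomes are), so that pairs spanning the origin are measured the short way around. The function name, genome length, and coordinates are hypothetical.

```python
from statistics import median


def mate_pair_distance(pos_a, pos_b, genome_length, circular=True):
    """Distance between two mapped read positions on one reference sequence."""
    for pos in (pos_a, pos_b):
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the reference")
    direct = abs(pos_a - pos_b)
    if circular:
        # On a circular genome, the pair may span the origin.
        return min(direct, genome_length - direct)
    return direct


# Toy example: hypothetical 1 Mb circular reference and three mapped pairs.
GENOME_LENGTH = 1_000_000
toy_pairs = [(1_000, 9_300), (999_000, 7_200), (500_000, 508_400)]

distances = [mate_pair_distance(a, b, GENOME_LENGTH) for a, b in toy_pairs]
print(distances, median(distances))
```

A tight distribution of these distances around the target insert size is what allows a scaffolder to order and orient contigs with confidence.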
Indices for Multiplexing Libraries
Want to multiplex up to 12 libraries at one time? Lucigen offers the NxSeq Long Mate Pair Library Index kit with 12 different indexed amplification primer sets (Illumina compatible). See the ordering information tab for more details.
NxSeq Scripts & Sample Data Set
To perform bioinformatic analysis of your Illumina runs, scripts must be run to confirm the Chimera Code and Junction Code sequences and to filter out these sequences prior to final assembly. These scripts, along with a sample data set for trial analysis, can be found here.
NxSeq Genome Closure Services
Would you like to have a NxSeq Long Mate Pair library, but don't want to do it yourself? Contact our Custom Genomic Services group and we'll provide a no-obligation quote for a range of services offered by Lucigen.
ORDER INFORMATION
For a full list of reagents and components included in this product, refer to the user manual. The NxSeq Long Mate Pair Library kit includes two boxes, each of which can be ordered separately. Box 1 contains all reagents necessary for end repair and tailing of fragmented DNA, ligase, and an internal adapter sequence. Box 2 contains reagents for ligation to the coupler and Junction Code™ sequence, exonuclease digestion, biotin capture, and amplification.
The NxSeq Long Mate Pair Library Index kit contains 5 reactions each of 12 separate index primer sets, for a total of 60 index reactions. The kit may be ordered in combination with the library kit or as a separate item.
For research use only. Not for human or diagnostic use.
1 This kit contains the reagents necessary to generate 10 libraries of 8 kb or less. Larger libraries (10-20 kb) use more reagents and yield fewer libraries per kit. Instructions for generating mate pair libraries with 10-20 kb inserts are given in SP001: NxSeq 20 kb Mate Pair Protocol.