The Basics and Beyond of PacBio SMRT Sequencing
Posted by kikoetgarcia from the Health category at 25 Sep 2023 09:08:44 am.
The PacBio sequencing platform is a single-molecule, real-time sequencing technology developed by Pacific Biosciences. PacBio sequencing features long read lengths, high accuracy, uniform genome coverage, and detection of base modifications during sequencing.
How does PacBio sequencing work?
PacBio sequencing is based on the principle of sequencing by synthesis, with read lengths of up to 30 kb and throughput of up to 20 Gb. It captures sequence information while the target DNA molecule is being replicated. The template, called a SMRTbell, is a closed, single-stranded circular DNA created by ligating hairpin adapters to both ends of a target double-stranded DNA (dsDNA) molecule.
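To make the SMRTbell structure concrete, here is a minimal Python sketch of what a polymerase would read as it travels around the circular template. It is a toy model under stated assumptions: the function names are hypothetical, the adapter is a placeholder string rather than a real PacBio adapter sequence, and sequencing errors are ignored.

```python
from itertools import cycle, islice

# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def smrtbell_circle(insert: str, adapter: str) -> str:
    """One lap around the closed template (hypothetical helper).

    The circle is: one strand of the insert, a hairpin adapter,
    the other strand of the insert, and the second (identical) adapter.
    """
    return insert + adapter + reverse_complement(insert) + adapter


def polymerase_read(insert: str, adapter: str, n_bases: int) -> str:
    """Bases read after n_bases of error-free synthesis around the circle."""
    circle = smrtbell_circle(insert, adapter)
    return "".join(islice(cycle(circle), n_bases))


# "NN" is a placeholder adapter, not a real adapter sequence.
print(polymerase_read("ACGTT", "NN", 20))
```

Because the template is closed, a long enough read passes over the insert more than once, alternating between the two strands with an adapter between each pass.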
What is SMRT sequencing?
Single Molecule, Real-Time sequencing, which we abbreviate as SMRT sequencing, is a technology introduced by Pacific Biosciences of California, Inc. (PacBio).
SMRT sequencing uses four-color fluorescently labeled dNTPs and zero-mode waveguide (ZMW) wells to sequence a single DNA molecule at a time. In each ZMW well, a single DNA template is bound to a primer and a DNA polymerase, and the resulting complex is immobilized at the bottom of the well. When the four fluorescently labeled dNTPs are added and DNA synthesis begins, a dNTP that is being incorporated stays at the bottom of the ZMW for longer because it is base-paired with the template. While it is held there, excitation makes it emit a fluorescent signal whose color identifies the base, and the returned signal is recorded as a characteristic pulse. Because the fluorophore is attached to the phosphate group of the dNTP, it is shed automatically once the nucleotide has been incorporated, so synthesis and detection continue without interruption. Combined with a high-resolution optical detection system, this allows real-time detection at a synthesis rate of 3 bases per second.
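The pulse-to-base idea can be sketched in a few lines of Python. This is an illustration only, not PacBio's base-calling algorithm: the channel-to-base mapping, the threshold, and the minimum pulse width are all assumed values, and the input is an idealized list of per-frame intensities for four color channels.

```python
from itertools import groupby

# Hypothetical mapping from color channel index to base.
CHANNEL_TO_BASE = {0: "A", 1: "C", 2: "G", 3: "T"}


def call_pulses(frames, threshold=0.5, min_width=2):
    """Turn per-frame channel intensities into a base sequence.

    frames: list of 4-tuples, one intensity per color channel.
    A run of consecutive frames dominated by the same channel is one
    pulse. Runs shorter than min_width are ignored, which mimics the
    point that an incorporated nucleotide dwells longer in the well
    than one that only diffuses through.
    """
    labels = []
    for frame in frames:
        channel = max(range(4), key=lambda c: frame[c])
        # Frames below the threshold count as baseline (no signal).
        labels.append(channel if frame[channel] >= threshold else None)

    bases = []
    for channel, run in groupby(labels):
        width = len(list(run))
        if channel is not None and width >= min_width:
            bases.append(CHANNEL_TO_BASE[channel])
    return "".join(bases)


baseline = (0.0, 0.0, 0.0, 0.0)
a = (0.9, 0.1, 0.0, 0.0)
g = (0.0, 0.0, 0.8, 0.0)
t = (0.0, 0.1, 0.0, 0.9)

# Two A pulses, a one-frame G blip (too short to count), then a T pulse.
trace = [baseline, a, a, a, baseline, a, a, baseline, g, baseline, t, t, baseline]
print(call_pulses(trace))  # prints "AAT"
```

Note that two consecutive identical bases are only separated here if a baseline frame falls between their pulses, which is one of several simplifications in this sketch.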
What is a HiFi read? How does PacBio HiFi work?
HiFi reads, short for High Fidelity long reads, are produced in Circular Consensus Sequencing (CCS) mode and combine long read length (10-20 kb) with high accuracy (>99%).
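Accuracy figures like these are often quoted on the Phred scale, where Q = -10 * log10(1 - accuracy). That conversion is a general convention rather than something specific to this article, and the helper names below are hypothetical. As a rough sketch, it shows what 99% accuracy means for a read in the 10-20 kb range:

```python
import math


def phred_quality(accuracy: float) -> float:
    """Convert per-base accuracy (0 <= accuracy < 1) to a Phred Q score."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


def expected_errors(read_length: int, accuracy: float) -> float:
    """Expected number of wrong bases, assuming a uniform error rate."""
    return read_length * (1.0 - accuracy)


# 99% accuracy corresponds to about Q20; 99.9% to about Q30.
print(round(phred_quality(0.99), 2))
print(round(phred_quality(0.999), 2))

# A 15 kb read (illustrative length) at 99% accuracy.
print(round(expected_errors(15_000, 0.99), 1))
```

The uniform error rate is an assumption made for simplicity; in real data, errors are not spread evenly along a read.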
PacBio HiFi reads are currently a data type of choice for a variety of genomic applications. In this sequencing mode, the polymerase read length is typically larger than the insert length, so the polymerase travels around the circular template repeatedly (rolling-circle sequencing) and the insert is sequenced multiple times. Random errors made in any single pass can then be corrected by building a consensus across passes, resulting in highly accurate HiFi reads.
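The error-correction idea can be illustrated with a simple per-position majority vote across passes. This is a deliberately simplified stand-in, not the actual CCS algorithm: it assumes the passes are already aligned, are all the same length, and contain only substitution errors, whereas real data also contains insertions and deletions. The function name is hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Per-position majority vote over equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")
    # zip(*passes) yields one column of bases per position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*passes)
    )


# Five passes over the same insert, four of them with one random error each.
passes = [
    "ACGTACGT",
    "ACCTACGT",  # error at position 2
    "ACGTACGA",  # error at position 7
    "TCGTACGT",  # error at position 0
    "ACGTAGGT",  # error at position 5
]
print(majority_consensus(passes))  # prints "ACGTACGT"
```

Because the errors in different passes fall at different positions, each one is outvoted by the other passes, which is the intuition behind why more passes give a more accurate consensus.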
What are the differences between Illumina and PacBio?
• The sequencing principles are different. Illumina short-read (next-generation) sequencing is a high-throughput technology that relies on PCR-based amplification and reversible terminator chemistry to sequence by synthesis, while PacBio long-read sequencing reads single molecules in real time, without amplification or terminators.
• The library preparation processes differ. Taking genomic DNA as an example (6-20 kb libraries), the process is similar to that of short-read massively parallel sequencing: in both cases the genome is fragmented and specific adapter sequences are added to both ends of the fragmented DNA. The main difference is that the final SMRTbell library must then be combined with a universal sequencing primer and a strand-displacing DNA polymerase, forming a template-primer-polymerase complex that is loaded onto the sequencing chip.
• The adapter structures differ. Unlike Illumina libraries, SMRTbell libraries use single-stranded hairpin adapters, and the adapters attached to both ends of the insert are identical (except for the barcode sequences in dual-index designs). This follows from the principle of rolling-circle sequencing, which does not need a partially complementary, partially free adapter structure like that of Illumina or MGI adapters to support PCR and paired-end sequencing.
• PacBio provides two types of adapters: the familiar A/T sticky-end adapter and a blunt-end adapter. The blunt-end adapter is used when the insert fragment is larger than 250 bp, and the sticky-end adapter otherwise (which of course means that the blunt-end adapter is used in 99% of cases).
• The fluorescent labels are attached differently. Unlike existing second-generation sequencing technologies, which attach the fluorescent moiety to the base of the nucleotide, SMRT sequencing labels the dNTP on its phosphate group. The label is shed when the nucleotide is incorporated, so the synthesized strand is the same as a natural DNA synthesis product, which supports the extra-long read lengths.
• The read lengths differ. Illumina focuses on short-read sequencing, producing read lengths of 50-300 base pairs (bp), while PacBio HiFi sequencing can produce read lengths in excess of 10,000 bp. Compared with the time- and labor-intensive assembly of short-read data, long reads make assembly easier to carry through to completion and have advantages in identifying rare variants and large structural variants.
What is the difference between PacBio and Nanopore?
Although both are long-read sequencing technologies, Nanopore and PacBio differ in many ways. Nanopore reads can be much longer than PacBio reads: they can reach 330 kbp in length, and one report describes reads exceeding 2 Mb. Yield per cell is 245 Gb. Nanopore can be used for both DNA and RNA (without reverse transcription), and it can read methylated bases (and other modifications) directly.