Data analysis guide
Standard Curve Analysis:
First you need to analyze your data. Don't worry about labeling the wells using the program. We're going to eyeball the results first. Here is a good picture of a standard curve.
Now I will explain how to read this picture. The cycle numbers are printed across the bottom. Real-time PCR runs typically go for 40 cycles. On the y axis is the deltaRn: the normalized reporter fluorescence (Rn) with the baseline fluorescence subtracted. Note that the y axis is in the log scale. The machine is capable of detecting fluorescence differences over several orders of magnitude. In this case, the deltaRn axis runs from 10^-3 to 10.
The red line is called the threshold. In a given tube, if the sample is being amplified, there is a doubling of fluorescence every cycle. Once the fluorescence is high enough, it crosses the threshold. The fractional cycle number at which that sample crosses the threshold is called the Ct, or threshold cycle. This Ct value is later converted to relative concentration of product.
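To make the idea of a fractional Ct concrete, here is a minimal sketch that interpolates between the two cycles that bracket the threshold. It is only an illustration: the function name fractional_ct and the log-linear interpolation are my assumptions, not the method the instrument software actually uses.

```python
import math

def fractional_ct(delta_rn, threshold):
    """Return the fractional cycle at which the signal first crosses the threshold.

    delta_rn[i] is the signal measured at cycle i + 1.
    Returns None if the signal never rises through the threshold
    (for example, a clean water blank).
    """
    for i in range(1, len(delta_rn)):
        below, above = delta_rn[i - 1], delta_rn[i]
        if 0 < below < threshold <= above:
            # Interpolate on the log scale, where exponential growth is a straight line.
            frac = (math.log10(threshold) - math.log10(below)) / (
                math.log10(above) - math.log10(below)
            )
            return i + frac  # 'below' was measured at cycle i (1-based)
    return None

# Hypothetical signal that doubles every cycle for 40 cycles
signal = [0.001 * 2 ** cycle for cycle in range(1, 41)]
print(fractional_ct(signal, threshold=0.1))
```

With this made-up signal the crossing falls between cycles 6 and 7, so the Ct is a non-integer value between the two.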
The baseline is the area in the graph where the samples are below the level of detection by the machine. Even though the products and fluorescence are doubling each step of the PCR, in this case, you cannot see this until cycle 18. The program default is setting the baseline from cycles 3 to 15. See the little red marks above the word cycle? These are the 'nubs' that you can move to adjust the baseline. See this website for a guide to placing the 'nubs' in the proper place.
A word about PCR... In a given PCR reaction, the one limiting reagent is (hopefully) the starting cDNA. All other compounds are in excess, including nucleotides, DNA polymerase, and primers. As long as these are in excess, the PCR should occur with 100% efficiency: the product should double each cycle. When one of the reagents starts to run out, the efficiency of the reaction is no longer 100%. In the above chart, this starts to occur for the most-concentrated standard at around cycle 27 or so. Finally, the doubling reaction levels off, or plateaus, and becomes more linear as the reagents run out, as in cycle 33 or so.
Therefore, the best time to analyze how much starting material you had is during the exponential phase. The exponential phase occurs from cycle 1 until the plateau portion, but the machine can only detect it, in this case, from about cycle 20 onward. The threshold bar is placed at a point where the lines are parallel. Note how evenly the lines are spaced at that point. Also, since we diluted the samples 1:2 for the standard curve, these should be exactly one Ct value apart from one another, assuming 100% efficiency of the reaction. By knowing how our samples are diluted, we can calculate the efficiency of the reaction. But before we go on to calculations, I'll first talk about some of the other methods for examining your results.
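As a rough illustration of the spacing argument above, the average Ct gap between neighboring standards can be turned into an efficiency estimate. This is a sketch under stated assumptions: the function name and the Ct values are hypothetical, and it assumes the standards are listed from most to least concentrated.

```python
def efficiency_from_spacing(ct_values, dilution_factor=2.0):
    """Estimate efficiency from the Ct values of a serial dilution.

    ct_values: Cts ordered from most to least concentrated standard.
    Returns efficiency as a fraction (1.0 means perfect doubling per cycle).
    """
    gaps = [later - earlier for earlier, later in zip(ct_values, ct_values[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # The per-cycle amplification factor A must make up one dilution step
    # in mean_gap cycles: A ** mean_gap == dilution_factor.
    amplification = dilution_factor ** (1.0 / mean_gap)
    return amplification - 1.0

# Hypothetical 1:2 standards spaced exactly one cycle apart
print(efficiency_from_spacing([20.0, 21.0, 22.0, 23.0]))  # 1.0, i.e. 100%
```

If the standards are spaced more than one cycle apart for a 1:2 dilution, the estimate drops below 100%.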
The following picture is of a standard curve that is not as good as the last one.
This is also a standard curve plot. While the first few samples in the standard curve look good, the last few get closer together. The fifth standard almost overlaps the fourth. This indicates a problem with the reaction, particularly with the lower concentrations. Low concentrations of template may lower the efficiency of the PCR below 100%.
For this standard curve, I would examine the samples first. As long as the samples are to the left of the standard curve, then the PCR is OK.
In both of the above examples, the water blank never crosses the threshold. This indicates that primer dimers are not contributing significantly to the signal. If a water blank sample does amplify, you can still use your data, as long as the water blank crosses the threshold at least seven cycles later than your samples. This is due to the 'rule of seven'.
The rule of seven. If aberrant amplification occurs in the water blank or in the RNA standard control (perhaps due to genomic DNA contamination), check how many cycles its Ct is from the Ct values of your actual samples. If it is more than seven cycles later, its contribution to the fluorescence is less than 1% (seven doublings is a 128-fold difference).
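The arithmetic behind the rule of seven can be checked in a couple of lines. This is a small sketch, assuming perfect doubling by default; the function name is mine, and the result is the contaminant's signal relative to the sample's.

```python
def contaminant_fraction(delta_ct, efficiency=1.0):
    """Signal of a contaminant relative to a sample that crosses
    the threshold delta_ct cycles earlier.

    efficiency=1.0 means the product doubles every cycle.
    """
    return (1.0 + efficiency) ** (-delta_ct)

print(contaminant_fraction(7))  # 0.0078125, under 1%
print(contaminant_fraction(6))  # 0.015625, over 1%
```

Six cycles of separation is not quite enough to get under 1%, which is why the cutoff sits at seven.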
Dissociation curve (SYBR green only)
The dissociation curve is a protocol added on to the end of your sample run. The purpose is to determine the melting temperature of the product(s) in your reaction. At the end of the regular PCR run, the samples are slowly heated from 60C to 95C. Since SYBR green binds to double-stranded DNA, there will be a drop in fluorescence as strands separate. For any given piece of DNA, there is a very small temperature range at which the strands will separate (depending on G/C content, length, and other factors). At that temperature, there will be a sharp drop in fluorescence. If the primers are good, there should only be one product, and therefore only one major drop in fluorescence. Here's an example:
In this example, there is a sharp drop in fluorescence at about 85C. On the Y axis is the fluorescence, and the X axis is temperature. It's easier to look at this data by plotting the y axis as the change in fluorescence (the first derivative of the fluorescence). At the 85C range, the change in fluorescence is greatest. Here's a picture of that example:
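The first-derivative view can be reproduced from exported melt data with simple finite differences. The sketch below uses synthetic data (a smooth drop centered near 85C) rather than real readings, and the function names are hypothetical; it plots nothing, it only locates the temperature where the fluorescence falls fastest.

```python
import math

def melt_derivative(temps, fluor):
    """Return interval midpoints and -dF/dT between consecutive readings."""
    mids, rates = [], []
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        mids.append((temps[i] + temps[i + 1]) / 2.0)
        # Negated so that a drop in fluorescence shows up as a positive peak.
        rates.append(-(fluor[i + 1] - fluor[i]) / dt)
    return mids, rates

def melting_temperature(temps, fluor):
    """Temperature at which the fluorescence drops fastest."""
    mids, rates = melt_derivative(temps, fluor)
    return mids[rates.index(max(rates))]

# Synthetic melt curve: readings every 0.5C from 60C to 95C
temps = [60.0 + 0.5 * i for i in range(71)]
fluor = [1.0 / (1.0 + math.exp(t - 85.0)) for t in temps]
print(melting_temperature(temps, fluor))
```

A second product, such as primer dimers, would show up as an additional peak in the rates list, usually at a lower temperature.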
This strongly suggests that there is only one product in the reaction. It is possible (but unlikely) that a contaminating product is identical in melting temperature to our product of interest. Without checking on an agarose gel, there is no way to tell.
Here is an example of what normal amplification looks like overlaid on the standard curve.
In this case, there is one bad amplification. See how the one purple curve crosses over the other curves. This indicates a problem with that reaction, and therefore that data point can't be used in this analysis.
RNA sample - genomic DNA control
A second control is to run a sample to which no reverse transcriptase was added during the cDNA reaction. This sample tells us whether there is genomic DNA contamination. Again, follow the rule of 7 with the genomic DNA control. It is difficult to entirely remove the genomic DNA. With Qiagen's RNeasy kit, they say that nearly all genomic DNA is removed during the purification process. They do offer an on-column RNase-free DNase for removing contaminating DNA from your RNA sample. There are other methods for removing this contaminant. They will be reviewed later.
Data analysis using SDS software
Once you are convinced that your standards and controls are OK, you generate the standard curve. In the Applied Biosystems version of the analysis software, you can tell the program which samples are standards. Since we did a standard curve, I typically use the following number scheme for plugging in quantities.
It doesn't really matter what quantity you give the first sample, so long as the second is given half of the first, the third is given half of the second, and so on. It is important that you don't give a value of 0 to your water blank. Since the water blank never crosses the threshold, it is never assigned a Ct value. Once the analysis software knows which samples are standards, it will quantify the rest of your samples, assigning them numbers that correspond to the standard curve. Once again, it is NOT important that your samples fall within this standard curve, since the standard curve only tells us the efficiency of the reaction. Problems with efficiency occur more at low concentrations than at high ones. The standard curve therefore provides a good test for how good the primers are, and for how good the reaction is.
Data analysis - manual
If you would prefer to perform data analysis manually, it is fairly easy to do. First, analyze the data (the green arrow button), then export your data. Open in Excel, and find your standard curve Ct value data points. Take the log of your curve quantities. For example, see below:
|Ct values||Curve Quantities||Log of Curve Quantities|
Using a statistics program, calculate the linear regression of y = Ct values and x = log of curve quantities. Note the p value of your standard curve to determine if any points are outliers. In this example,
the formula is y = 26.8205 - 3.21336x.
To determine your sample values, solve this formula for x and take the antilog:
10^((26.8205 - Ct)/3.21336), where Ct is your Ct value for a given sample.
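If you would rather script this step than use Excel and a statistics program, the regression and the back-calculation fit in a few lines of plain Python. This is a sketch, not the software's own routine: the function names are mine, the standard quantities and Ct values are hypothetical, and the line is an ordinary least-squares fit of Ct against log10 of quantity. Solving Ct = intercept + slope * log10(quantity) for the quantity gives 10^((Ct - intercept) / slope).

```python
import math

def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct (y) against log10 of quantity (x).

    Returns (slope, intercept).
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantity_from_ct(ct, slope, intercept):
    """Invert the fitted line: quantity = 10 ** ((Ct - intercept) / slope)."""
    return 10 ** ((ct - intercept) / slope)

def efficiency_from_slope(slope):
    """Efficiency as a fraction; a slope near -3.32 corresponds to about 100%."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical 1:2 standards, each given half the quantity of the previous one
quantities = [16.0, 8.0, 4.0, 2.0, 1.0]
cts = [20.0, 21.0, 22.0, 23.0, 24.0]
slope, intercept = fit_standard_curve(quantities, cts)
print(slope, intercept, efficiency_from_slope(slope))

# Using the coefficients from the example in the text
print(quantity_from_ct(25.0, slope=-3.21336, intercept=26.8205))
```

The water blank is left out of the fit because it has no Ct value, which matches the advice not to assign it a quantity of 0.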
Once the analysis software has converted the Ct values to values in your standard curve, the data can be exported. Reopen the file in Microsoft Excel to do the next phase of analysis.
For each gene that you are doing, you will have a separate standard curve. I copy the numbers for each gene into separate columns.
Next, divide the number for your gene of interest by the number for a control gene, such as actin or GAPDH, from the same sample. This controls for loading differences, RNA quantitation differences, or cDNA generation differences. Also, using two controls allows you to double-check a given sample.
*** Important side-note: If you are averaging your samples (you ran them in triplicate), make sure you average your samples AFTER you control your gene against the housekeeping gene. The housekeeping gene control is specific to that exact sample.
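Here is a small sketch of why the order matters, using made-up numbers. The function name is hypothetical; each position in the two lists is assumed to be the same replicate, measured once for the gene of interest and once for the housekeeping gene.

```python
def normalized_mean(target, reference):
    """Divide each replicate by its own housekeeping value, then average."""
    ratios = [t / r for t, r in zip(target, reference)]
    return sum(ratios) / len(ratios)

# Hypothetical triplicate quantities read off the standard curves
gene_of_interest = [2.0, 4.0, 6.0]
housekeeping = [1.0, 2.0, 4.0]

# Recommended order: normalize each replicate, then average
print(normalized_mean(gene_of_interest, housekeeping))

# Averaging first and dividing afterwards gives a different answer
mean_target = sum(gene_of_interest) / len(gene_of_interest)
mean_reference = sum(housekeeping) / len(housekeeping)
print(mean_target / mean_reference)
```

The two results differ because the housekeeping value varies from replicate to replicate, and normalizing first pairs each measurement with its own control.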
After this, plot your data normally. You have analyzed your real-time PCR reaction!