Addressing the Variability of Molecular Assays
Presenter: Jonathan Frampton
JF: Horizon Discovery is a translational company with expertise in genome editing, discovery services and diagnostic reagents, and the diagnostic reagents are what I'm going to be focusing on today.
So the question we asked ourselves two to two and a half years ago, when the idea for the business unit was in its infancy, was: how do you know that your molecular assay is working today? The common response is that the positive control you're running gives you the expected result and, likewise, the negative control you're running gives you a negative or blank result. What we then asked next was: what are the controls you are using, where are they from, what is their suitability and how were they validated? Because it's often the case that the kit controls you use have been manufactured specifically with that kit and have gone through all the developmental stages, so by the time you're using that kit, the true purpose of that control is to show that the components of the kit work. You can't use those kit controls to validate that your samples are compatible with the kit you're running. And that's where reference materials and reference standards really play an important role.
So here's a simple schematic of where a tumor sample goes: it goes through histology for diagnosis and therapy, or some FFPE sections will go to the molecular pathology or molecular diagnostic lab, where the DNA will be extracted and genotyped. Throughout this whole process there's opportunity for variability to creep in: sample quality, sample heterogeneity, DNA quality, platform variability and mutation characteristics. And what I'm going to do is talk through each of these points over the next five or six slides.
So, tumor sample heterogeneity. We all know that human tumors display a startling amount of heterogeneity, whether driven morphologically, by epigenetics or, in large part, through genetics and variation at the genetic level. Here's a schematic running from a wild-type sample all the way through to a tumor sample with 100% tumor cells. At the phenotype level you may see 100% tumor, but at the genotype level, whether the mutation is homozygous or heterozygous governs whether the allelic frequency you can detect is 100% or 50%. If you're working with a sample that has 10% tumor cells carrying a heterozygous mutation, you'll require a limit of detection of at least 5% to be able to detect it.
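[Editor's note: a minimal sketch of that arithmetic, assuming diploid cells and a single mutated locus; this is illustrative and not part of the talk.]

```python
# Illustrative sketch: allelic frequency seen by the assay for a given
# tumor-cell fraction, assuming diploid cells and one mutated locus.

def allelic_frequency(tumor_fraction: float, homozygous: bool = False) -> float:
    """Fraction of mutant alleles in the extracted DNA.

    A heterozygous mutation contributes 1 of the 2 alleles in each tumor
    cell; a homozygous mutation contributes both.
    """
    mutant_alleles_per_tumor_cell = 2 if homozygous else 1
    return tumor_fraction * mutant_alleles_per_tumor_cell / 2

# A sample with 10% tumor cells and a heterozygous mutation:
print(allelic_frequency(0.10))        # 0.05 -> the assay needs a ~5% LoD
print(allelic_frequency(0.10, True))  # 0.10 if the mutation were homozygous
```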
The next variability that needs to be considered is sample quality, and this is largely governed by the formalin fixation duration. If the biopsy or resection was taken on a Friday afternoon, it could be fixed for a full 24 hours, or even over the weekend for a 72-hour fixation, if it isn't processed to FFPE straight away. That makes the DNA, or RNA if that's what you're looking for, more difficult to extract, and it influences the quantity, concentration and purity of the DNA you can extract from your sample. On top of this, there are a number of different DNA extraction methods available: you can get kits from Qiagen or Promega, or use homebrew methods, and these kits all have slightly different chemistries that provide DNA of slightly different quality. One thing we've seen that is often overlooked, but is a very important step to consider, is DNA quantification. The most common method for measuring DNA content is the Nanodrop, which is accurate when the concentration is 10 ng/μL or above. However, when the concentration is below 10 ng/μL, the gold-standard methods for measuring concentration are PicoGreen or QuantiFluor. I've put together an example using data from our lab: we generated a sample at 2.5 ng/μL, and what we wanted to show was that if you had a genotyping assay requiring 20 ng, at a true concentration of 2.5 ng/μL you would need to add 8 μL to have the required amount of DNA for your assay. However, the Nanodrop quantified this sample at 6 ng/μL; taking that value, you would only add 3.3 μL, which would give you a total of just 8.3 ng for the assay. This would make the assay suboptimal and, if you were at your lower limit of detection, could give you a false negative result. So it's critical that the DNA quantification prior to loading your sample into the assay is accurate.
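[Editor's note: a minimal sketch of the loading arithmetic in that example; the numbers are from the talk, and the calculation itself is just proportions.]

```python
# Sketch of the quantification example: an inflated Nanodrop reading
# leads to under-loading the assay.

required_ng = 20.0      # DNA input the genotyping assay needs
true_conc = 2.5         # ng/uL, per PicoGreen/QuantiFluor
nanodrop_conc = 6.0     # ng/uL, the inflated Nanodrop reading

loaded_uL = required_ng / nanodrop_conc   # volume you think you need: ~3.3 uL
actual_ng = loaded_uL * true_conc         # DNA actually loaded: ~8.3 ng

print(f"Loaded {loaded_uL:.1f} uL -> {actual_ng:.1f} ng instead of {required_ng:.0f} ng")
print(f"Correct volume would have been {required_ng / true_conc:.0f} uL")  # 8 uL
```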
Now, as I'm sure you are aware, there are multiple assays you can use. There are the homebrew methods, which are often quite sensitive; there's the Qiagen Therascreen; Entrogen has a number of KRAS and BRAF kits; and there's the ViennaLab StripAssay. For these, the data we've seen from people we're working with show a limit of detection in the range of (audio cuts out). Sequenom, Roche Cobas and pyrosequencing systems have a limit of detection of around 5%, and Sanger sequencing is typically accepted as between 15% and 25%.
Now, on this slide, what was interesting was when we sent some of our standards out to one of our partner labs who run Sanger sequencing. It's actually a nice dataset highlighting that different mutations and different primer sets will give you different limits of detection. What we were able to highlight was that, for KRAS G12A, PIK3CA H1047R and BRAF V600E, they had good sensitivities of around 5-25%, but for PIK3CA E545K they had a limit of detection of around 25%. That would mean they would need at least 50% tumor content to be able to detect this mutation if it was heterozygous. In this case, once we highlighted this, they were able to redesign the primers and set up a much more sensitive assay.
Coming back to what I asked at the start of the talk: with all of these sources of variability coming into the process, it's absolutely vital that the controls you're using are well validated and suitable. We've spoken a lot to the proficiency schemes and testing groups, and have very good relationships with them, and they continue to tell us that reference materials are very difficult to source; they are often undefined and can be very variable. You've got synthetic oligos, which will always be very easy to detect because they are so short; for many of these assays a simple PCR will see them. Cell line mixtures have been very popular in the past, and in the next couple of slides I'll explain why they need to be used with some caution. And then, obviously, primary tissue samples will at some point always need to be involved in the validation of a molecular assay. It's just that you don't want to be using primary tissue samples for the initial set-up and development, especially if it's a rare mutation and the tissue samples are very difficult to come by. What you want to achieve is to get your assay up and running, understand the limit of detection, and then run tissue samples through the system to show you can accept real patient samples with your molecular assay.
So here's some data from one of our partner labs. They came to us and told us that they had been using an EGFR E746-A750 deletion cell line and, using Sanger sequencing, were actually achieving a limit of detection of 2%, which is incredible and much lower than you would expect. So we asked them to send the samples to us so we could validate them on our digital PCR platform. At this point you'll have to take my word for it that our digital PCR platform is very sensitive and gives incredibly accurate results. So what we did was take their dilutions – they thought they had made 1%, 2%, 5%, 7.5%, 10% and 15% dilutions, plus the undiluted cell line – and we ran them through the system. And what we found was that what they claimed was a 2% dilution was in fact a 14% dilution. That explained why they were able to detect what they thought was 2% using a Sanger sequencing method.
When we looked into this further and investigated the cell line, we took their cell line, which was HCC827 – let's call it Source 1 – and we also took the wild type they were using, HeLa, and ran a copy number assay to see how many EGFR copies there were. Starting with the mutant, there were 14 copies of EGFR, which explains the much higher allelic frequency than expected. We then brought in another cell line classified as HCC827 from a different source (Source 2), and when we tested its copy number it came out as 38. So not only can cell lines give you mutations with unknown copy numbers, the source of the cell line can introduce yet another variable: one cell line may be characterized by one lab, but unless you get that actual cell line from them – if you get the same cell line from a different source – there may be variability drawn into the assay.
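[Editor's note: a minimal sketch of why amplification skews such dilutions. The copy numbers are from the talk, but the simplifying assumption that every amplified EGFR copy carries the deletion is the editor's, so the output illustrates the direction and rough size of the skew rather than reproducing the measured 14%.]

```python
# Sketch: mutant-allele fraction in DNA from a cell mixture when the
# mutant line carries amplified EGFR. Assumes (for simplicity) that all
# amplified copies carry the deletion; the real copy structure is messier.

def observed_allele_fraction(mutant_cell_frac: float,
                             mutant_copies: float = 14.0,   # HCC827, Source 1
                             wildtype_copies: float = 2.6   # HeLa
                             ) -> float:
    mutant_alleles = mutant_cell_frac * mutant_copies
    wildtype_alleles = (1 - mutant_cell_frac) * wildtype_copies
    return mutant_alleles / (mutant_alleles + wildtype_alleles)

# A mixture intended to be "2% mutant" by cell count:
print(f"{observed_allele_fraction(0.02):.1%}")  # ~9.9%, far above the intended 2%
```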
What was also interesting was that the HeLa they gave us had a copy number of 2.6, and a second HeLa from Source 2 had 2.7. This is indicative of a heterogeneous population, where you have a number of HeLa cells with a copy number of 2 for EGFR and then more cells with a copy number of 3 across the locus. So even if the copy number of the mutant had been correct, mixing it into a heterogeneous population would have introduced errors into the dilution curve anyway. So there were a number of factors impacting why this dilution curve was, essentially, so messy.
So what we set about doing in January 2011, and what we continue to strive for, is to make reference standards that contain a defined copy number and a defined allele burden, that show heterogeneity, that are available as DNA or FFPE, and that mimic tumor genetics. What I'm going to talk through for the rest of the talk is how we set about achieving this goal.
This is probably a good time to define what we feel a reference standard is. It's an independently or externally validated reference material that allows you to understand the sensitivity and/or the limit of detection of your assay. It allows you to monitor and maintain the reproducibility of your assay and to confirm it is suitable for your application, and for kit developers it can be used for research and development. And this comes back to what I asked at the start: reference standards will tell you your assay is working today. And that's the most important part of a reference standard.
At Horizon Discovery we have expertise in gene editing, and we used our gene editing platform, Genesis, to develop all of our reference standards. So what do we do along this process? This is a very simplified schematic: we take a wild-type cell line, from ATCC or other sources, and do a single-cell dilution to ensure we have a clonal wild-type cell line banked. We then do extensive cell line validation on this wild type. Once the cell line validation is complete, we do the gene engineering and gene editing event, which ultimately creates a clonal mutant cell line, and again this goes through the same cell line validation process.
It's important to mention that with our Genesis technology we always create a heterozygous mutant cell line. If you would like a bit more information, or would like to know more about the mechanics of the technology, it can be found on our website, where we have a webinar with a full hour devoted to how the technology works.
So, cell line validation. We confirm the identity of the parent cell line by STR profiling as well as by running SNP 6.0 arrays. We then confirm that the mutation has been integrated at the correct locus using genomic DNA PCR and Sanger sequencing; this comprises somewhere between 8 and 12 different PCR reactions to confirm the integration. We then confirm expression of the modified allele by running cDNA PCR and Sanger sequencing. This is where we took a big leap forward last January, when we brought in our digital PCR platform; we now use it routinely to confirm the clonality of our cell lines as well as the gene copy number. So the end point of the engineering is that we have the original clonal wild-type cell line as well as our clonal heterozygous mutant cell line, and essentially these two cell lines are identical except for the one mutation we've introduced at the endogenous site.
This is a list of all the mutations we've currently engineered for our molecular reagents and our quantitative molecular reference standards range. In total, Horizon Discovery has engineered somewhere in the region of 450-500 cell lines.
So we have the cell lines – mutant and wild-type cell lines that are isogenic – and we went about generating two different types of standard. One is our FFPE reference standard, the idea of which is to let you monitor and evaluate the DNA extraction process as well as genotype the material through the workflow. We also have genomic reference standards, which can be used to understand sensitivity and for batch testing of platform performance. This is a simple schematic of the different ways we make the standards. For the genomic reference standards, we grow the cell lines, extract the DNA and either provide the DNA as 100% mutant or 100% wild type, or alternatively create mixtures – mixtures of DNA that we verify using digital PCR, ranging from 1% and 5% to 10% depending on the need for the test.
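[Editor's note: a minimal sketch of the mixing arithmetic, assuming the mutant DNA sits at exactly 50% allelic frequency (heterozygous line) and both isogenic lines contribute the same number of genome copies per nanogram; illustrative, not Horizon's actual protocol.]

```python
# Sketch: mass fraction of mutant-line DNA needed to hit a target allelic
# frequency, assuming a heterozygous (50% AF) mutant line and equal genome
# copies per ng in the isogenic mutant and wild-type DNA.

def mutant_dna_fraction(target_af: float, mutant_line_af: float = 0.5) -> float:
    return target_af / mutant_line_af

for target in (0.10, 0.05, 0.01):
    f = mutant_dna_fraction(target)
    print(f"{target:.0%} AF -> {f:.0%} mutant DNA + {1 - f:.0%} wild-type DNA")
```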
On this side is how we make the FFPE: we grow our cell lines, pellet and formalin-fix them, then go through the FFPE processing steps. If we want to generate a mixed FFPE block, we mix the cells prior to formalin fixation and end up with mixed sections.
This slide summarizes very nicely what we've done with our genomic DNA reference standards, using the digital PCR platform, the Bio-Rad QX100. What we've done here is take the mutant vial, which is at 50% allelic frequency, and dilute it with the matched original wild type used to generate the mutation. Here we've taken BRAF V600K and made 50%, 10%, 5%, 1%, 0.5%, 0.1% and 0.05% dilutions. On the left graph, the x-axis is the allelic frequency predicted or expected from the dilutions we've made, and the y-axis is the actual BRAF V600K frequency the digital PCR called from the system. What you see is a very tight linear regression, where the 50% is called at 50% all the way down to the 0.05% called at 0.05%. Now, for reference standards in your hands, 1% is probably as far as you would need to go right now, as the clinical relevance below 1% is not yet known. But for us it is a critical part of our process to show that the accuracy of our validation methods holds down at this level. Likewise, on the right-hand side we've taken KRAS G12D and done similar dilutions from 50% to 0.1%, and again you see the same linear regression. So this gives us great confidence that with the genomic DNA standards, not only are we able to generate standards at 50%, we are also able to generate precise standards with defined allelic frequencies.
Moving on to FFPE block production. We've developed proprietary FFPE technology that allows us to control very accurately the number of cells and cores we include in each block. It also allows us to generate a block with a homogeneous cell distribution, so every single section we generate has the same number of cells within it. We can tweak the allelic frequency, and we can control the section thickness and, of course, the DNA content. Here is the workflow we would use to generate a BRAF V600K 5% block and sections: we would grow up the cells, make a 1 in 10 dilution, prepare the block and sections, cut the sections and evaluate them using digital PCR, and we would then have our sections at 5%.
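[Editor's note: a quick sanity check on that workflow's dilution arithmetic. A heterozygous line sits at 50% allelic frequency, so a 1 in 10 cell dilution into the isogenic wild type lands at 5%, assuming equal copy number per cell across the pair.]

```python
# Sketch: 1-in-10 cell dilution of a heterozygous (50% AF) mutant line
# into its isogenic wild type, assuming equal copy number per cell.

mutant_cell_fraction = 1 / 10
mutant_line_af = 0.5
block_af = mutant_cell_fraction * mutant_line_af
print(f"{block_af:.1%}")  # 5.0%, the target allelic frequency for the block
```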
To demonstrate the consistency of our blocks, here are 8 independent blocks we've made, with representative H&E stains from within each block. We've then taken 20 sections throughout each block and run DNA extraction on all of them independently using a Promega Maxwell platform. What we are very pleased with – we think it's fantastic – is the level of consistency between the blocks; the error bars are very tight, so within a block each section contains an almost identical number of cells and DNA content.
We then took sections from throughout the block. What we've done here is make 4 different blocks for BRAF V600E at 25%, 5%, 3.5% and 1% allelic frequency. We ran these through digital PCR, and what you can see is that the 25% block comes out dead on 25%, the 5% is slightly higher, rounding up to 6%, the 3.5% was dead on 3.5%, and the 1% comes out as 1.4%. What was really encouraging for us, and gives us great confidence, is that once we've made a block, the allelic frequency throughout the block is always consistent and maintained. Likewise, for BRAF V600K we made 3 blocks here, at 50%, 5% and 1%: the 50% hits 50%, the 5% is on 5%, and the 1% comes out as 0.8%, which for us is close enough to 1%. So we're really able to demonstrate that the FFPE blocks we make contain the same number of cells per section, and hence the same amount of extractable DNA, and that the allelic frequency we define is consistent throughout the block. Here's another couple of examples for KRAS: we've got 50% and 5% KRAS blocks, and the allelic frequencies come out dead on what they should be.
What we thought would be an interesting project to run was to develop an FFPE block with a low cell number, where each section has sufficient cells to extract around 200-250 ng of DNA. We always find that the theoretical DNA content is roughly twice what you can actually achieve, due to the formalin fixation and the resulting decrease in DNA extraction efficiency. When we looked at different kits, we found that different kits give you different total yields from the same sections, so the DNA concentration can be impacted by the kit you use. What I think is important and useful for molecular labs is to always understand the properties of the kits. What we find in our lab is that the column-based methods, like the Qiagen column-based method, are more variable between different people but very consistent for the same person running the same kit.
Now moving on to some external data from our partners. This is a diagnostic company we're working with: they took 3 of our BRAF V600E FFPE blocks – we generated a 25% V600E block, a 3.5% and a 6% – and ran them on their 454 platform. This is an independent lab with an independent system, and they got dead on 25% when they ran the 25%, the 3.5% came out as 3.5%, and the 6% as 6%. What is really nice is that you can distinguish the 3.5% from the 6%, so this type of data confirms that the degree of consistency and homogeneity we claim using our own methods can also be validated using other people's methods.
Here's another nice example, where we're working with MolecularMD. They took our EGFR T790M DNA starter pack – this is where we provide a mutant vial and a wild-type vial, and you can combine the two to generate a dilution curve from 50% all the way down to 1%, which is exactly what MolecularMD did. They tested it using their digital PCR assay, and if you compare the predicted number of EGFR T790M copies per μL to what they actually read on the system, it's incredibly similar all the way down. So it's very nice to see that the stoichiometric dilutions we achieve in house can be reproduced using an independent assay in an independent external lab.
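[Editor's note: a minimal sketch of how predicted copies per μL can be derived for such a dilution series. The 10 ng/μL concentration and the ~3.3 pg haploid genome mass are illustrative assumptions, not MolecularMD's actual figures.]

```python
# Sketch: expected mutant-allele copies per uL in a genomic DNA dilution,
# assuming ~3.3 pg per haploid human genome. Numbers are illustrative.

PG_PER_HAPLOID_GENOME = 3.3

def mutant_copies_per_uL(dna_ng_per_uL: float, allelic_freq: float) -> float:
    alleles_per_uL = dna_ng_per_uL * 1000.0 / PG_PER_HAPLOID_GENOME
    return alleles_per_uL * allelic_freq

for af in (0.50, 0.10, 0.05, 0.01):
    print(f"{af:>5.0%} AF at 10 ng/uL -> ~{mutant_copies_per_uL(10, af):,.0f} mutant copies/uL")
```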
This is a nice study that Entrogen undertook using their kits – they have KRAS, BRAF and AKT1 assay kits available. They took our FFPE samples, made a dilution curve using them, and showed that all of their assays have a limit of detection down to 0.5%. I should mention that all of our application data is available on our website in the downloads section, so if you want to learn a little more about this study you can go there and download it.
Now, I mentioned this at the start: we've worked very closely with the proficiency schemes to really understand where the gaps are in molecular assay testing. We're officially working with UK NEQAS and the European Molecular Genetics Quality Network, and we're working unofficially with a major US proficiency scheme provider as well as three others around the world. We are always open to helping and providing samples to additional schemes, so if any of you are part of a scheme, please feel free to contact me afterwards and we can discuss whether we can make custom blocks or generate custom cell lines to help you in your proficiency schemes.
Now I'm going to change gear for the last part of my talk and introduce you to some of the newer standards in our portfolio. The quantitative multiplex DNA reference standard is our latest product, released at the end of February. What we've done is take 8 of our genetically defined cell lines and mix them together in precise DNA dilutions – we mixed them at the DNA level to make a quantitative multiplex DNA reference standard, then ran digital PCR on it. We initially covered 11 mutations across the genes BRAF, cKIT, EGFR, KRAS, NRAS and PIK3CA, with allelic burdens ranging from 1% all the way to 24%. The idea is that you can now use this standard to start addressing the variability and reproducibility of multiplex assays such as NGS or Sequenom. Our first beta-testing labs took our standard and ran it on their Ion Torrent platforms. Here's the data we obtained on our digital PCR platform for all of the mutations, and here are our three partners using Ion Torrent and AmpliSeq panels. What's interesting, and what gives us great confidence in this reference standard, is that there is a lot of consistency between the panels for a lot of the mutations. There are a few interesting results, which were expected. For example, the EGFR deletion is often not detected using some of the next generation sequencing platforms; at the 2% level it wasn't detected using the Hotspot panel but was detected on the Cancer panel. Similarly, the T790M at this low 1% level couldn't be detected using these panels. There are a few others: this lab here wasn't able to detect the G12D mutation, though they weren't sure why, and another mutation that gives a slightly unusual result is the PIK3CA E545K down here, which is likely attributable to strand bias. The feedback we got, and what we are going to do next, is that now we have this standard we can start to understand where the variability is in these assays, and use an external digital system to show where tweaks and improvements may need to be made to specific panels in order to detect these mutations.
With this initial offering we validated 11 mutations. What we are going to do next: we know there are a number of SNPs and indels across these genes in our cell lines, and we will now validate those indels and SNPs. Once we've done that, we can add these additional features to the 11 already validated, so we will be building up a multiplex standard containing somewhere in the region of 30 features. After that there are another 20 we will focus on, so the aim is a multiplex standard with up to 50 features.
As I bring the talk to an end, I'll briefly introduce the other reference standards we are looking to develop – and we are always open to new collaborations or ideas for reference standards or reference materials. We've got in situ hybridization reference slides coming in March 2013. Initially we focused on translocations for liquid tumors, where we've developed interphase spreads on glass slides, and we've also used our FFPE technology to generate FFPE sections mounted on slides for a PTEN deletion, a PAX3/FOXO1 translocation and an EML4/ALK translocation.
Here's some preliminary data to give you a flavor: we're able to use our standards to test how well different probes work on the different translocations, and the feedback from our beta sites is that these slides will be ideal development and training tools for any new FISH probes being developed, as well as being usable for batch-to-batch testing to make sure the probe, and the user, have run the assay correctly on that particular day.
And of course, because we've generated the EML4/ALK FFPE slides, it's pushed us toward looking at how we can apply that to immunohistochemistry. So what we are now doing is developing a method to provide a reproducible and finely tuned range of staining intensities for a number of genes and a number of other targets.
So, just to summarize, this is a list of almost 40 mutations that we now have available as quantitative molecular reference standards, and we can precisely manufacture allelic frequencies from 1% up to 50%.
This is a slide we often include at the start of a talk, but I thought it more appropriate towards the end. Building the quantitative molecular reference standards that we have allows kit developers and manufacturers to optimize new kits for the targets we've spoken about. It also gives molecular pathology and molecular diagnostic labs the ability to test the limit of detection of their assays in validation studies, as well as to test batch-to-batch runs to ensure molecular assays are working on the day they are run. As a team at Horizon Discovery, we believe that if we can eliminate molecular assay variability, this will ultimately lead to better patient outcomes.
So I'd like to thank you very much for your time. Please feel free to email me any questions after this talk and I'd be very happy to answer them.