
Optimizing Research Outcomes

A high-quality sequencing service is built on several pillars: process reliability, partnership support, and technical excellence. Process reliability ensures projects run smoothly, with realistic timelines, standardized workflows, proactive communication, and contingency plans to handle any unexpected issues. Equally important is the partnership aspect: a high-quality service provider offers consultative guidance for choosing a test, ensures well-organized data delivery, and provides post-delivery support. Blueprint Genetics is investing in long-term collaboration, helping research groups maximize outcomes and avoid common pitfalls. Blueprint Genetics WGS and WES sequencing services for research use meet the highest standards in quality and performance to maximize the potential for successful research outcomes. In this webinar, Ville Kytölä, Director of Bioinformatics Engineering, and Mikko Muona, PhD, Senior Bioinformatics Engineering Manager at Blueprint Genetics, will discuss how to evaluate WGS and WES services and which key technical and operational criteria help ensure that sequencing projects achieve high-quality, reproducible results.

Webinar objectives

  • Understand the key pillars of a high-quality sequencing service, including process reliability, partnership support, and technical excellence 
  • Learn how to evaluate whole genome sequencing (WGS) and whole exome sequencing (WES) services to ensure high-quality and reproducible results for research projects

About the speakers

Ville Kytölä

Ville Kytölä, MS, works as Director of Bioinformatics Engineering at Blueprint Genetics, overseeing the bioinformatics product development process, with a focus on designing, building, and delivering new diagnostic tests and services. Ville has a mixed industry and research background in bioinformatics, with a strong engineering focus and mindset from a technical university.

Mikko Muona

Mikko Muona, PhD, is a Senior Bioinformatics Engineering Manager at Blueprint Genetics. He leads the clinical bioinformatics operations team in the company and serves as the technical and general supervisor of bioinformatics. Mikko has several years of experience in clinical genetics research and diagnostics and is an expert on the application of bioinformatics methods to identify novel or known genetic causes for Mendelian disorders.

Hello everyone, and welcome to today's educational webinar, titled Optimizing Research Outcomes: Whole Genome Sequencing and Whole Exome Sequencing, in partnership with Blueprint Genetics. The webinar is brought to you by Blueprint Genetics, a genetic knowledge company committed to providing an innovative approach to genetic testing. My name is Taina Vuopio and I have the privilege of hosting today's webinar. Please submit any questions you may have in the Q&A box. You can submit them throughout the webinar and we will answer as many as possible at the end. We are very excited to have Ville Kytölä and Mikko Muona as our presenters today. Ville Kytölä, Master of Science, works as Director of Bioinformatics Engineering at Blueprint Genetics, overseeing the bioinformatics product development process, focusing on designing, building and delivering new diagnostic tests and services. Mikko Muona, PhD, is a Senior Bioinformatics Engineering Manager at Blueprint Genetics. He leads the clinical bioinformatics operations team and serves as the technical and general supervisor of bioinformatics. Thank you for being here today. Please, go ahead.
All right, thank you so much, Taina, for that introduction. On my behalf also, welcome everybody. I appreciate you taking the time to join and listen to our webinar about our new sequencing service. I'll be getting us started here. I will cover the first half of the webinar and then we will switch to Mikko. Let's get started right away.
The objectives for today are twofold. First of all, we wanted to talk through the key pillars of what we think constitutes a high-quality sequencing service. We think that should include things like process reliability, collaborative support, technical excellence, and data stewardship and compliance.
On the other hand, we also wanted to discuss the differences between whole genome sequencing and whole exome sequencing, which are both part of the service that Blueprint is providing, and how these might translate into something that benefits your research.
Starting with the latter, first just a few general words about whole exome sequencing and whole genome sequencing. If we start on the left, whole exome is a very widely used and well-known approach for characterizing human DNA. If we think about the benefits of the whole exome approach, it is a cost-effective way of running DNA sequencing. Another factor is that by focusing on the coding regions, the whole exome typically achieves a much higher sequencing depth over the targeted regions, which can have a significant impact for specific types of variants or specific research use cases. Again, like I said, it's a very familiar assay. There is a large body of publicly available whole exome sequencing data, and also derivatives such as population frequencies from around the world. There are also a lot of guidelines, whether for clinical use or from a large body of research work. Another factor to consider is the smaller data volume. Whole exome sequencing data, whether we talk about the raw data or the actual variants, is much smaller and maybe easier to handle. It doesn't require a similar amount of compute, which might make it easier for some research use cases, even with large numbers of samples.
On the whole genome side, of course, that's definitely where the field is heading. It offers more unbiased genome-wide coverage, coding and non-coding, capturing the entire whole genome. Maybe "whole genome" still belongs in quotation marks, because, assuming we talk about short-read sequencing methods, there are always areas that will be difficult to cover using short reads. If going PCR-free is an option, then that will also help with the uniformity of coverage, reducing certain biases in the sequencing that will always be present, for example, in capture-based whole exome methods. Another somewhat related feature is that whole genome might react a bit differently to the sample material. Some materials, like saliva-based samples, might behave very differently with whole exome and whole genome, which is an interesting factor to consider. On the whole exome side, with capture-based methods, we are able to capture the human-derived portions of the DNA, at least in those areas that don't have a lot of homology between a source of contamination and the human genome. On the whole genome side, we end up sequencing all the DNA that is in the sample, for better or worse.
On the whole genome side, higher CNV (copy number variant) and structural variant sensitivity is a definite plus, if that is relevant to the research use case. Given the uniform, mostly uninterrupted coverage, we have much better chances of capturing copy number variations, with much better resolution and using better algorithms than the ones we typically have to use on the whole exome side. And for structural variants and more complex rearrangements of the genome, we have better chances of capturing those.
That is still with the short-read sequencing limitations in mind. On the whole genome side, one could argue it's also more future proof. We fully expect that the research community will continue to rapidly discover new things about the human genome and its relevance to, for example, disease and various other use cases. So a whole genome data cohort might be much more future proof for upcoming research use cases too. And the uniformity of coverage, as I said, enables various things. On the other hand, it's good to keep in mind that the typical whole genome sequencing depth of 30x may not be sufficient for all use cases. If we know what we're going for, we know it is covered on the whole exome side, and we require deeper sequencing coverage, then whole exome is definitely an option worth considering.
All right, switching gears and moving forward. Let's now imagine what sort of pillars would create a high-quality sequencing service. Let's approach this first from a more generic point of view, thinking about what we would expect to see if we wanted to get sequencing data and bioinformatic analysis for research use cases. We decided to break this down into four areas: process reliability, collaborative support, technical excellence, and data stewardship and compliance. Let's dive in.
Process reliability, first of all. If we imagine that we are setting out to partner with a sequencing provider for our research, it's essential that we can rely on the partner and rest assured that we are going to get accurate results in a timely manner, that they are reproducible, and that they are of uniform quality across batches, maybe even across projects. There are several sub-factors to consider here. Sample handling, for sure: I would expect to see clear processes for receiving, storing and processing specimens.
And then of course I would expect the appropriate quality controls to be in place, for example for input DNA, library prep, sequencing, and the bioinformatics analysis. All of these would require their own quality controls and reviews. I would want to see standardized, widely used assays. We could also go into more technical matters, but we can do that on the next slide. It is definitely critical to keep to the turnaround time commitments, and I'd be interested to hear about potential redundancy or fail-safes. What if something goes wrong? Something eventually always goes wrong, so how is the partner prepared to handle that sort of situation?
On the other hand, then, there is the collaborative support. If I imagine again using a partner to support my research project, I would definitely be looking for a collaborative partner who is not just treating this as a transaction. Not saying that would be bad. Sometimes, of course, I might know exactly what I want and I might not require a lot of setup or discussion to get started with the project. But other times there are definitely cases where it would be beneficial to have a discussion with the experts. Maybe I don't run a lot of sequencing projects, or maybe not at the scale I have in mind right now, and I would appreciate being able to discuss with the experts and maybe design the project together. That could include things like consultative project design, transparent communication, dedicated scientific support, and flexible engagement, and in general staying informed throughout the project and the processing of the batches, and feeling that the communication and the information flow are working. Potentially even education and access to resources could be useful, especially if I'm doing something new here.
The third pillar is technical excellence.
Naturally, I would expect to get a high-quality product, because in the end that's the thing I would be purchasing here. So, high-quality sequencing data, with state-of-the-art sequencing platforms, and algorithms too if I were interested in using ready-analyzed results. That could mean discussions about the assay and the protocols, and potentially also, if we talk about whole exome, whether the partner is running off-the-shelf capture kits. Have they customized those in any way after seeing that it results in better yield, or in covering more of the typically relevant genomic regions on the whole exome side? I'd want to hear about the bioinformatics, and whether there is anything that sets them apart on the analysis side. And then, of course, I'd want to understand more about the partner in general: what they are typically running, what kind of business they are in, what kind of sample flows they typically handle, their validation benchmarks, and also the scalability, especially if I were planning a larger study. Are they going to be able to support my use case in terms of scale?
Lastly, it's important to consider data stewardship and compliance. Especially when we are discussing human samples, or sensitive samples in general, it's essential to feel comfortable and convinced that the data storage is secure, that the data transfer is secure and appropriately encrypted, and that the partner is complying with the laws and regulations and knows what they are doing. That might mean things like following GDPR or HIPAA, depending on where one is located, and understanding the relevant local laws and what they mean. And finally, there is the ethical responsibility: being clear on the purpose of the use of the samples. How are they going to handle data retention?
Who is going to own my data during and after the project? Will the partner retain a copy of the data and use it for their own purposes, or am I going to be the owner of the data? Those sorts of questions would be really important.
All right. Moving forward from the considerations one could use when assessing different sequencing service partners, let's turn to the Blueprint Genetics sequencing service. First, a few words about who we are and what we do. Blueprint is a quality-focused and transparent genetic testing company. We have over 13 years of experience doing clinical genetic diagnostics. That's the bread and butter of what we do. Of course, a core part of running a quality-focused clinical diagnostics business is having a very high-quality and reliable sequencing lab and the related bioinformatics. So the intent here is that with the whole genome and whole exome sequencing service, Blueprint is able to support a wider range of use cases with good-quality data and with the processes we already implement at our company. Essentially, the sequencing services use similar workflows to the accredited clinical products, and they are run in the same laboratory. However, it may be worth noting that the sequencing service is a research use only product, so in that sense it's not in the scope of our clinical accreditations. But, as I said, similar processes, similar workflows, and the same lab are being used, so we are quite happy about the quality of the products.
Furthermore, we have a dedicated team that collaborates closely with the customers throughout the project, ranging from setting things up, to supporting the design of the study, to helping get the sample batches flowing, and, if needed, helping with acquiring the data afterwards.
And, of course, simply being available for general questions at all times. There is also a committed project coordinator overseeing each project, which we find really important.
Here are some names from our in-house team. I could start from the left with Nellie, who is our client services project lead. She is acting as the project coordinator in many projects at the moment, and she's perfectly positioned and skilled to lead communications, help customers set up new projects, and connect them to the right people when questions arise. Pertteli Salmenperä is our Senior Scientific Director, overseeing the wet lab processes and the R&D of our assays in the lab. He has been with the company almost from the very start and has a tremendous amount of experience setting up and scaling the different sequencing assays that we use. [Name unclear] is one of our bioinformatics engineers, working hands-on with things like data delivery, and is able to answer a wide range of questions when it comes to bioinformatics and the data. And Juha Koskenvuo is our Medical Executive Director and our clinical laboratory director. He has a research background and he's our go-to person for questions related to research with this sequencing data.
All right. Moving on to the performance of our assays, this is a good point to switch over to Mikko, who can tell more about the performance and give a few more details about the service and process flow.
Thank you, Ville, and welcome everybody to the webinar. I'm Mikko Muona. I manage the team that runs the bioinformatics analysis pipeline that processes the next-generation sequencing data. I will guide you through our sequencing services and go into detail on the components of our services.
We have talked a lot here about quality, and next I will show the performance characteristics of both assays that we operate as part of the sequencing services. Our goal in developing the sequencing services has been to deliver high-quality sequencing data that maximizes the probability of identifying something meaningful from the patient samples. Maybe the single most important aspect is uniform sequencing coverage, and that is our ultimate goal in both assays. In both whole genome sequencing and whole exome sequencing, we operate a high-throughput Illumina sequencing instrument that produces 150 base pair paired-end reads. In whole genome sequencing, we by default target 30x average coverage throughout the whole genome. This slide also presents the minimum acceptance criteria that are used in our sample fail policy. If those criteria are not met, the samples will be resequenced. We really aim to achieve a minimum acceptable quality in each and every sample, because we know every sample is a very precious one. In whole exome sequencing, we have on average 99.6% of the bases covered by a minimum of 20 sequence reads, and that translates to an average sequencing depth of 154 sequence reads. As already mentioned, we use a similar workflow for whole exome sequencing as for any accredited clinical samples, so the quality is very similar to that of any clinical samples we run through our accredited laboratory. On the next slide, I have a bit of evidence of our high quality in the whole exome sequencing service.
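The two coverage figures quoted above (average depth, and the share of bases covered by at least 20 reads) can be computed from per-base depth values. The following is a minimal sketch, not Blueprint Genetics' pipeline: the function name `coverage_summary` and the toy depth values are assumptions made for illustration, and in practice the depths would come from a tool that reports per-position depth over the target regions.

```python
from statistics import mean


def coverage_summary(depths, min_depth=20):
    """Return (mean depth, percent of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return mean(depths), 100.0 * covered / len(depths)


# Toy per-base depths for ten target positions (illustrative values only,
# not real sequencing data).
toy_depths = [150, 160, 155, 18, 140, 170, 165, 152, 19, 158]
avg_depth, pct_at_20x = coverage_summary(toy_depths)
print(f"mean depth: {avg_depth:.1f}x, bases at >=20x: {pct_at_20x:.1f}%")
```

A high mean depth alone does not guarantee uniform coverage, which is why the share of bases above a minimum depth is reported alongside it.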
We participate in various proficiency testing programs in our laboratory. One of them is organized by EMQN, the European Molecular Genetics Quality Network, where hundreds of labs throughout Europe send their sequencing data to EMQN, who then evaluates the quality of that data. EMQN ships the samples to the labs, the labs send the data back to EMQN, and we have submitted our whole exome sequencing data there. In these box plots, I want to highlight the one in the middle, plot D, which shows the Blueprint Genetics data as a black dot. It is about the sequencing coverage: the percentage of target regions covered with more than 20 reads. There we are clearly above the median value and at a higher level than most other laboratories. The same high performance is seen in the other quality metrics as well. So it's nice to have this kind of independent validation of our claims about high quality.
On the next slide, I will go a bit into the actual output of the sequencing services. We obviously share the actual sequencing reads in either FASTQ format or CRAM or BAM format, so the unaligned or the aligned sequence reads. Some customers may want to use their own pipeline to call the sequence variants from the raw data. But we also provide variant calls in a few formats, which vary a little between the whole genome sequencing and whole exome sequencing services. In whole genome sequencing, we provide single nucleotide variant calls, insertions and deletions, copy number variants, structural variant calls, as well as basic repeat expansions. Currently, in whole exome sequencing, we have a somewhat more limited output: single nucleotide variants and indels.
To ensure data integrity during the download, we provide MD5 checksums that can be used to confirm that the downloaded files have not been corrupted in transfer. One key component of Blueprint Genetics is data transparency. We provide quality control files describing the quality performance of each sample, so anyone can then evaluate the sample-specific performance. Data safety and privacy are really important for us. We are located in the EU, and both the laboratory and the data reside in the European Union region at all times. We have really stringent protocols ensuring that the data remains just between the customer and us.
Next, I will briefly touch on our bioinformatics analysis pipeline. Of course, the laboratory part is critical to ensure that the data is of high quality, but we also need to analyze the actual sequencing data and maximize the potential of that data. In the bioinformatics analysis pipelines, we apply either the DRAGEN tool in whole genome sequencing or the Sentieon tool in whole exome sequencing workflows. We have tested and validated both in house and have been satisfied with their performance. We have used reference samples to assess the sensitivity and specificity of our workflows. Both are also tools that have been demonstrated as top performers in the precisionFDA Truth Challenge programs, so their variant calling performance has been among the best of the participating software. So, combining the lab performance, meaning the sequencing coverage, with the bioinformatics analysis pipeline, we believe we are able to provide the highest-quality data overall. Next, I will also show a snapshot of our quality reports in whole genome sequencing.
The quality reports are quite thorough and give good insight into the sequencing coverage. One can assess any biases in, for example, the coverage as a function of GC percentage, and there is also some visualization of the copy numbers across chromosomes. We believe those are very useful for anyone dealing with the data.
Finally, I will walk us through the ordering process. The key there is our order portal, Nucleus, where the orders are placed. But before that, we have dedicated people to help anybody set up the project, and we have a coordinator for each project. Once the project is set up, the orders can be placed and the samples can be shipped to us with labels, and the Nucleus portal, of which I will show a snapshot soon, can be used to follow the sample and batch status. Once the samples are received by the lab, they are sequenced and analyzed through the bioinformatics analysis pipeline. After that is done, the sequencing data is available for 30 days to be downloaded to your preferred environment. Next, I have a screenshot of our Nucleus order portal. It is a very simple view that shows clearly the status of each batch that has been submitted by a customer. And finally, we have the download page. We have a couple of options for downloading the data: either directly from the browser, or using a script that can be executed from the command line. That's it. It's not a super complex process: getting samples in, getting them sequenced and analyzed, and then downloading the data. We are very happy to take any questions from you, and thank you for your participation.
Thank you, Ville and Mikko, for an excellent overview. We have received a few questions. Starting with the first one: does Blueprint have long-read sequencing?
Great question. We don't currently have a commercial offering based on any long-read sequencing method.
That being said, we are very aware of the developments there and are always looking into emerging technologies. I suppose with long reads it often comes down to the cost and the utility of the approach, and I think that is starting to get to the right place at the moment. But our current commercial offering, whether clinical or sequencing service, is still focused on short-read sequencing.
OK, thank you. Then the next one: how do you handle mitochondrial genome sequencing?
We have good sequencing depth there in both whole genome sequencing and whole exome sequencing. The variant calls for mitochondrial DNA are not among the variant call files in whole exome sequencing, but they are in whole genome sequencing. In both of our services, though, the data provide a very good possibility to investigate even low-level heteroplasmic variants.
Right. So the raw sequencing data for whole exome sequencing will also contain the mitochondrial sequencing reads, and the researcher may use whatever methods they choose to analyze the data further, depending on the use case. Maybe it's just mitochondrial variant calling, or maybe it's something more specific. As Mikko mentioned, low levels of heteroplasmy are relevant for some use cases, or one might even go for deletion calling or something like that. Using the raw data, these research use cases are possible.
So, how to contact Blueprint?
I assume that is for ordering. I would go to our website and approach us that way. But of course we have really good and collaborative sales teams around the world, so I'm sure we'd be more than happy to be in touch. There is a contact form on our web page, which is one useful way of getting in touch.
I don't mind anybody here reaching out directly to any of the three of us, so that's also an option.
OK. Why is WES not able to provide CNV detection?
That's a very good question, and actually quite a commonly asked one. In whole genome sequencing, copy number variant detection does not require other samples. But in whole exome sequencing, copy number variant detection in our workflow is done so that we have other reference samples in each sequencing run. So the variant calling is context dependent, and against that background it has been a bit problematic to share copy number variant calls for whole exome sequencing samples without the overall context from the sequencing run. That is one important explanation for why we are not at this point able to share copy number variant calls as part of our sequencing services.
Exactly. For whole exome in the clinical offering, we are indeed detecting copy number variants and routinely using them in diagnostics. So it's not that the whole exome assay could not be used to detect copy number variants. However, there are definitely complications there because of the range of algorithms we have available and because of the targeted nature of whole exome sequencing. Also, for our clinical offering we utilize several, maybe we could call them proprietary or self-developed, ways of filtering down the data, in ways that would require access to reference data, like Mikko was saying. Maybe those are the main reasons.
So there is a question: is it securely detected by WGS?
Yeah, it's both. I mean, both are good. We have validated copy number variant detection in our whole exome sequencing and the performance is good.
But the interpretation of the data requires the data context and the reference samples from the same run. That's the reason. So performance and sensitivity are not the issue in whole exome sequencing; it is the interpretation of the outputs.
Right. The whole exome side is using a so-called coverage-based method for detecting copy number variants. There we compare any given genomic region to the reference sample pool to detect whether we are at the same level, above, or below. But with the whole genome, like Mikko was saying, on a single-sample level we are able to discover where in the reference genome, let's say, a duplication or a deletion starts. Potentially we are able to locate the exact breakpoints of those events, which is mostly not possible with whole exome sequencing, because they often reside in intronic or intergenic regions, which are not captured by the whole exome due to the nature of the assay.
And then an additional question: is this possible with gene panels?
Yeah, it's the same answer as for whole exome sequencing. In our gene panels we do provide copy number variant detection, but from the data sharing point of view, we are limiting the output to only single nucleotide variants and indels.
OK. So how do you guarantee data security during transfer and storage?
Great question. I could say that we use multiple layers of protection in the environments where we process and store our data. First of all, we always ensure that our data is encrypted at rest and in transit. We only use encrypted storage to store our sequencing data, and all so-called production-grade data is access controlled.
We have specific personnel who are authorized to access those production environments and have access to the data. Then for transit, the transfer of the files, it's the same thing. We ensure that all of the HTTPS traffic over the Internet is TLS encrypted, which is the usual industry standard. We also provide separate download credentials and ensure that every file download has to be authenticated. So it's not possible to share a link such that anyone who gets access to the link could download all the data. You need to have credentials and authenticate to be able to download any of the result files, so that we can make sure that it's really the intended person who is getting access to the result data.
OK, that was all. Thank you both, Ville and Mikko, for an excellent presentation. I want to mention that a recording of this webinar will be available, and we will send it by e-mail when it's ready. So thank you again, and I hope you have a great rest of the evening, or day for those who are in other time zones. Thank you.
Thank you very much.
Thank you very much.
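The coverage-based copy number comparison described in the Q&A (comparing each region of a sample against a pool of reference samples from the same run) can be illustrated with a small sketch. This is not Blueprint Genetics' method: the median normalization, the log2 ratio, the thresholds of -0.4 and 0.3, and all function names and depth values below are assumptions chosen for illustration.

```python
import math
from statistics import median


def normalize(depths):
    """Scale per-region depths by the sample's own median depth."""
    mid = median(depths)
    if mid <= 0:
        raise ValueError("median depth must be positive")
    return [d / mid for d in depths]


def log2_ratios(sample_depths, reference_pool):
    """Per-region log2 ratio of the sample against the pool median."""
    sample_norm = normalize(sample_depths)
    pool_norm = [normalize(ref) for ref in reference_pool]
    ratios = []
    for i, value in enumerate(sample_norm):
        baseline = median(ref[i] for ref in pool_norm)
        if value > 0 and baseline > 0:
            ratios.append(math.log2(value / baseline))
        else:
            ratios.append(float("-inf"))  # no usable signal in this region
    return ratios


def classify(ratio, loss=-0.4, gain=0.3):
    """Label a region using illustrative (assumed) log2 thresholds."""
    if ratio <= loss:
        return "loss"
    if ratio >= gain:
        return "gain"
    return "neutral"


# Toy depths for five target regions (illustrative values only).
sample = [100, 102, 50, 98, 150]
pool = [[100] * 5, [200] * 5, [90] * 5]
calls = [classify(r) for r in log2_ratios(sample, pool)]
print(calls)
```

The sketch also shows why such calls depend on context: the baseline comes entirely from the reference pool, so the same sample depths can produce different calls against a different pool.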

Webinar information

Date:           October 1, 2025

Time:           5:00 PM EDT / 7:00 PM CEST

Duration:     1 Hour

C.E.U:           —
