Genomic Data Biobank
At Zettagene we believe that the analysis of genomic data should be straightforward and efficient. We do not want our users to have to worry about security, performance, data storage, or the integrity of their computing environment.
After several research projects in the genomic data field, we came up with a solution that addresses the most important challenges that genomics poses for computer science. It uses the latest advances in Big Data technologies and cloud computing, and it can be deployed on premises or in the cloud.
Our design principles
Genomics and distributed data processing technologies are both evolving rapidly thanks to ongoing innovation and research. For diagnostic purposes, however, you need enterprise-ready software that is reliable, properly tested, and extended with audit capabilities.
The secret of user satisfaction is to provide the right granularity of information through an adequate interface for every group of business users: medical researchers, clinicians, and physicians. We give users a high degree of flexibility.
Genomic Data Biobank is built on an open architecture that provides a high level of interoperability, so it can draw on external services and variant databases to enhance the diagnostic process. Data can be accessed through a standard database interface or a high-performance API, so it can be used by external systems. We leverage the power of open source tools to meet enterprise IT requirements.
The solution is designed to store a large volume of WES/WGS genomic data enriched with variant databases and, at the same time, to process and re-process data in parallel sessions so that different groups of users can access the information they need. In the future, the biobank will be extended to store data sets for proteomics, metabolomics, and transcriptomics.
We are building the solution on security-by-design principles and to be compliant with HIPAA and GDPR. After all, it stores the most personal information there is, so it has to be secure!
Artificial Intelligence / Statistics
Analysis of patient data is the most crucial part of the diagnostic process. It can require ad hoc access to specific data, comparison of a sample with a larger database, or unsupervised searches that combine sets of variants, sets of genomic intervals, and phenotypic information. We are prepared for all of these usage scenarios and aim to speed up analysis by using AI.
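To make one of these scenarios concrete, here is a minimal Python sketch of combining a set of variants with a set of genomic intervals, for example keeping only the variants that fall inside a gene panel. It is an illustration only and does not show the Genomic Data Biobank API: the names Variant, Interval, build_index and variants_in_intervals are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the product's data model)."""
    chrom: str
    pos: int   # assumed 1-based position
    ref: str
    alt: str


class Interval(NamedTuple):
    """Hypothetical genomic interval, assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int


def build_index(intervals):
    """Group intervals by chromosome and merge the overlapping ones.

    Returns {chrom: (starts, ends)} where both lists are sorted and the
    merged intervals on each chromosome do not overlap.
    """
    by_chrom = defaultdict(list)
    for iv in intervals:
        by_chrom[iv.chrom].append((iv.start, iv.end))

    index = {}
    for chrom, spans in by_chrom.items():
        merged = []
        for start, end in sorted(spans):
            if merged and start <= merged[-1][1]:
                # Overlaps the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def variants_in_intervals(variants, intervals):
    """Return the variants whose position lies inside any interval."""
    index = build_index(intervals)
    hits = []
    for v in variants:
        if v.chrom not in index:
            continue
        starts, ends = index[v.chrom]
        # Last merged interval that starts at or before the variant.
        i = bisect_right(starts, v.pos) - 1
        if i >= 0 and v.pos <= ends[i]:
            hits.append(v)
    return hits
```

Because the intervals are merged and sorted once, each variant is checked with a single binary search. At biobank scale the same idea would normally be delegated to a distributed query engine rather than run in a single process.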