This is how Temus employed large-scale genomic data processing and cloud infrastructure to build a pipeline that processes 100,000+ genomes in days rather than weeks
Singapore’s National Precision Medicine Programme (NPM), led by Precision Health Research, Singapore (PRECISE), is one of the region’s most ambitious population genomics programmes. The PRECISE Genomics Solutions Unit partnered with Temus to build data infrastructure capable of processing these 100,000+ genomes (~150TB total data volume) whilst dramatically reducing processing time from weeks to days and preparing for petabyte-scale data management.
At the heart of this work lay an opportunity to optimise a sophisticated data infrastructure, one that demanded specialised expertise in genomics, cloud computing, and large-scale data processing from the technology enabler. Temus was selected as the technology partner to design and implement the critical data pipeline infrastructure that would enable PRECISE to process, analyse, and share genomic data at unprecedented scale. Throughout the development process, Temus worked closely with PRECISE’s genomics experts to ensure the pipeline met stringent scientific and regulatory requirements whilst maintaining the flexibility needed for diverse research applications.
As of August 2025, we have built a private, scalable pipeline comprising multiple bioinformatics tools that processed the genomes of about 100,000 people into summary statistics in just two weeks, enabling population-specific insights that could uncover causal biology. This achievement was made possible through the tight integration of Temus’s technical capabilities with PRECISE’s deep genomics expertise, creating a truly collaborative approach to solving complex computational challenges in precision medicine.
Photo below: Asst Prof Max Lam (standing) sharing thoughts on the PRECISE initiative with the wider Temus AI and Data and Health teams in a brown bag session.
Large-scale genomic data processing involves the systematic analysis of DNA sequences from thousands to hundreds of thousands of individuals simultaneously. Each human genome contains approximately 3 billion base pairs, generating 100-200 gigabytes of raw data per individual. At population scale, this creates massive computational challenges, demanding parallel processing, intelligent resource allocation, and automated workflow orchestration.
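To get a feel for the scale, a quick back-of-envelope calculation helps. The Python sketch below uses the per-genome figures quoted above together with the programme’s stated ~150TB total; reading that 150TB as the pipeline’s working dataset, distinct from raw sequencing output, is our assumption rather than something stated by the programme.

```python
# Back-of-envelope scale estimate for a 100,000-genome programme.
# Inputs: ~100-200 GB of raw sequencing data per genome (the typical
# range quoted above) and the ~150 TB total data volume stated here.

COHORT_SIZE = 100_000
RAW_GB_PER_GENOME = (100, 200)   # raw sequencing output per individual
WORKING_SET_TB = 150             # the pipeline's stated total data volume

# Raw reads across the whole cohort, converted GB -> PB.
raw_low_pb = COHORT_SIZE * RAW_GB_PER_GENOME[0] / 1_000_000
raw_high_pb = COHORT_SIZE * RAW_GB_PER_GENOME[1] / 1_000_000

print(f"Raw sequencing data across the cohort: {raw_low_pb:.0f}-{raw_high_pb:.0f} PB")
print(f"Working dataset in this pipeline: {WORKING_SET_TB} TB")
print(f"Implied per-genome working footprint: {WORKING_SET_TB * 1_000 / COHORT_SIZE:.1f} GB")
```

The gap between tens of petabytes of raw reads and a ~150TB working set is precisely why the programme pairs fast processing with preparation for petabyte-scale data management.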
Large-scale genomic data processing is revolutionising healthcare and medical research by enabling precision medicine, accelerating drug discovery, and supporting population health initiatives, including preventive care. For Singapore specifically, processing 100,000 genomes provides unprecedented insights into Asian genetic diversity, supporting the development of treatments optimised for local populations and establishing the foundation for personalised healthcare systems.
Temus brought the specialised cloud computing and large-scale data processing expertise we needed to build infrastructure capable of handling population-level genomics. Their team worked closely with our genomics experts to design and build a pipeline that met both our stringent scientific requirements and regulatory standards. The result speaks for itself: we’ve dramatically reduced processing time from weeks to days for 100,000 genomes—approximately 150TB of data—unlocking population-specific insights into Asian genetic diversity that are critical to Singapore’s National Precision Medicine Programme. This integration of technical capabilities and our deep genomics expertise exemplifies the kind of public-private collaboration essential to advancing precision medicine innovation and translating genomic research into tangible healthcare benefits for Singaporeans.
Asst Prof Max Lam
Chief Technology Officer
PRECISE
Temus engineered and deployed scalable data pipelines leveraging AWS’s processing capabilities: parallel processing across the cohort, intelligent resource allocation, and automated workflow orchestration. A sketch of the fan-out pattern follows below.
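The article does not name the specific AWS services involved, so the following Python sketch is illustrative only: it assumes AWS Batch as the scheduler, a pre-registered job definition wrapping the bioinformatics container, and a sample manifest staged in S3. The queue, job definition, and bucket names are hypothetical.

```python
"""Illustrative fan-out of per-genome work as AWS Batch array jobs.

Assumptions (not confirmed by the article): AWS Batch as the scheduler,
a registered job definition wrapping the bioinformatics container, and
sample inputs listed in an S3-hosted manifest. All names are hypothetical.
"""
import boto3

batch = boto3.client("batch", region_name="ap-southeast-1")

COHORT_SIZE = 100_000
ARRAY_LIMIT = 10_000  # AWS Batch caps a single array job at 10,000 children

job_ids = []
for chunk_start in range(0, COHORT_SIZE, ARRAY_LIMIT):
    chunk_size = min(ARRAY_LIMIT, COHORT_SIZE - chunk_start)
    response = batch.submit_job(
        jobName=f"npm-genomes-{chunk_start}",
        jobQueue="genomics-queue",             # hypothetical queue name
        jobDefinition="genome-pipeline:1",     # hypothetical job definition
        arrayProperties={"size": chunk_size},  # one child container per genome
        containerOverrides={
            "environment": [
                # Each child adds its AWS_BATCH_JOB_ARRAY_INDEX to CHUNK_START
                # to select its sample from the manifest.
                {"name": "CHUNK_START", "value": str(chunk_start)},
                {"name": "MANIFEST_URI", "value": "s3://npm-cohort/manifest.csv"},
            ]
        },
    )
    job_ids.append(response["jobId"])

print(f"Submitted {len(job_ids)} array jobs covering {COHORT_SIZE:,} genomes")
```

Submitting the cohort as array jobs lets the scheduler pack tens of thousands of per-genome containers onto compute capacity as it becomes available, which is what turns a weeks-long sequential workload into days.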
Having successfully reached the 100,000-genome milestone, we’re actively developing next-generation optimisations.
This work helps establish Singapore as the genomic data processing hub for Southeast Asia, with opportunities for regional expansion supporting pan-Asian genomic initiatives and establishing new standards for population-scale genomic research.
We can apply this same hyperscale processing architecture to your industry’s most demanding data challenges, whether that means processing millions of financial transactions for real-time fraud detection, analysing vast manufacturing sensor data for predictive maintenance, managing global supply chain logistics, or processing satellite imagery for precision agriculture. Our proven ability to handle petabyte-scale datasets with parallel processing, intelligent resource allocation, and automated workflow orchestration can reduce your processing time from weeks to hours whilst lowering computational costs and enabling real-time decision-making.