
Bio-Tech - Cloud Biobank and Analysis Platform
This research group managed thousands of biospecimens at multiple centres across Europe using spreadsheets, lab notebooks and ad-hoc trackers. Sample IDs, storage locations and handling steps were hard to reconcile, and audit history was fragmented. Generating cohorts for studies took days and required a great deal of manual intervention.
The work involved sending human tissue samples from various hospitals and research centres to the main centre in London. These samples had to be tracked in accordance with the Human Tissue Act. On receipt, each sample had to be tagged and stored in special freezers, and every removal from storage had to be recorded.
The team also needed to ingest high-volume mass-spectrometry data running into terabytes and make it available to bioinformaticians without slowing day-to-day sample operations. Security, traceability and role-based access were mandatory, along with a hosting model that could scale with growing datasets.
We worked with scientists, doctors and data managers to map the end-to-end sequence from sample creation to analysis and reporting. From that we produced a signed-off specification that defined the data model, permitted process steps, roles and permissions, upload templates, reports and SLAs.
Delivery was phased. Stage 1 delivered the biobank core: unique IDs, shipping and receipt checks, QC flags, storage hierarchies, movement history, search and reporting.
Stage 2 added a pipeline to ingest mass-spectrometry data at multi-terabyte scale, with direct-to-cloud upload, integrity checks and controlled access for analysis.
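The integrity check in Stage 2 can be pictured with a minimal sketch. It assumes the instrument or uploader supplies a SHA-256 digest alongside each file; the function names and the 8 MiB chunk size are illustrative and not taken from the delivered platform.

```python
import hashlib
from pathlib import Path

# Assumed chunk size: read 8 MiB at a time so multi-gigabyte files never sit fully in memory.
CHUNK_SIZE = 8 * 1024 * 1024


def sha256_of_file(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the hex SHA-256 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_upload(path: Path, expected_sha256: str) -> bool:
    """Compare the computed digest with the one supplied by the uploader (hypothetical manifest value)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A file whose digest does not match would be rejected or re-requested before it is released for analysis.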
Wireframes evolved into prototype screens for intake, storage moves, audits and reporting, which were tested and reviewed before build. We implemented validation at upload to enforce controlled vocabularies and quarantine rows with issues.
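As an illustration of validation at upload, the sketch below checks each row of an upload template against controlled vocabularies and sets aside failing rows with the reasons attached. The field names and vocabulary values are hypothetical; the real lists would come from the signed-off specification.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabularies and required fields, for illustration only.
CONTROLLED_VOCABULARIES = {
    "tissue_type": {"liver", "kidney", "plasma"},
    "qc_status": {"pass", "fail", "pending"},
}
REQUIRED_FIELDS = ("sample_id", "tissue_type", "qc_status")


@dataclass
class ValidationResult:
    accepted: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)  # (row, reasons) pairs


def validate_rows(rows):
    """Split uploaded rows into accepted rows and quarantined rows with reasons."""
    result = ValidationResult()
    seen_ids = set()
    for row in rows:
        reasons = []
        # Required fields must be present and non-blank.
        for name in REQUIRED_FIELDS:
            if not str(row.get(name) or "").strip():
                reasons.append(f"missing {name}")
        # Non-blank values must belong to the controlled vocabulary (case-insensitive).
        for name, allowed in CONTROLLED_VOCABULARIES.items():
            value = str(row.get(name) or "").strip().lower()
            if value and value not in allowed:
                reasons.append(f"{name} '{value}' not in controlled vocabulary")
        # Sample IDs must be unique within the upload.
        sample_id = str(row.get("sample_id") or "").strip()
        if sample_id and sample_id in seen_ids:
            reasons.append(f"duplicate sample_id {sample_id}")
        seen_ids.add(sample_id)
        if reasons:
            result.quarantined.append((row, reasons))
        else:
            result.accepted.append(row)
    return result
```

Keeping the reasons next to each quarantined row lets a data manager correct the template and resubmit without blocking the rows that passed.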
To handle multi-terabyte mass-spectrometry outputs cost-effectively, we used the Amazon S3 Glacier storage classes. Uploaded data lands in S3 Standard for rapid validation and early analysis, then transitions by tag or prefix to S3 Glacier Instant Retrieval for infrequent reads, and later to Glacier Flexible Retrieval or Deep Archive for long-term retention.
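The tiering described above maps naturally onto an S3 lifecycle rule. The sketch below builds one such rule as a plain dictionary, filtered by prefix; the prefix and the 30, 180 and 365 day thresholds are assumptions chosen for illustration, not the values used on the project.

```python
def build_lifecycle_rule(prefix, instant_after_days=30,
                         flexible_after_days=180, deep_archive_after_days=365):
    """Build one S3 lifecycle rule that moves objects under `prefix` to colder tiers over time."""
    if not (0 < instant_after_days < flexible_after_days < deep_archive_after_days):
        raise ValueError("transition days must be positive and strictly increasing")
    return {
        "ID": f"tier-{prefix.strip('/').replace('/', '-')}",
        "Filter": {"Prefix": prefix},  # a tag filter could be used here instead
        "Status": "Enabled",
        "Transitions": [
            # Storage class identifiers as used in S3 lifecycle configurations.
            {"Days": instant_after_days, "StorageClass": "GLACIER_IR"},         # Glacier Instant Retrieval
            {"Days": flexible_after_days, "StorageClass": "GLACIER"},           # Glacier Flexible Retrieval
            {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},  # Glacier Deep Archive
        ],
    }


# Hypothetical prefix for raw instrument output.
rule = build_lifecycle_rule("mass-spec/raw/")
lifecycle_configuration = {"Rules": [rule]}
```

A configuration of this shape can be applied with boto3's `put_bucket_lifecycle_configuration`, passing it as the `LifecycleConfiguration` argument together with the bucket name.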
Key features


The platform was delivered on time and on budget and is now the single source of truth for samples, storage and history. Cohorts that once took days to assemble can be produced in minutes, while mass-spec data lands reliably for analysis without disrupting biobank operations. Governance is clearer, audits are faster and the system scales as studies grow.
“RD Research did an outstanding job and delivered the best bio-tech database we have ever used. This team really knows what they are doing and I have no hesitation in recommending them.”
Pier, Senior Project Manager