CPU/Mobo Need advice on server configuration for my university

kvn95ss

Adept
Hello guys, hope you all are doing well. We have the opportunity to upgrade our server infrastructure. It will be used predominantly for bioinformatics data processing (exome + genome data). (No ML yet, but we would like to keep our options open.)
There is no fixed budget to work with - generally, the lower the better. For the sake of this post, let's consider 10 lakh INR the maximum budget.

Usage - There are 3 bioinformaticians who use the server, mainly for processing research WES/WGS data, but also for running analyses with various tools. We host an HTTP server and a Streamlit app through which our lab mates (~15 people) can access files and data. The server runs RHEL7; the main tools we use include bwa-mem2, the GATK suite, ANNOVAR, etc.
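For a sense of the workload, a typical per-sample step looks roughly like the following (thread counts, read group, and file names are placeholders, not our exact pipeline):

    # Align one sample and sort the output (thread counts and file names are placeholders)
    bwa-mem2 mem -t 32 -R '@RG\tID:S1\tSM:S1' ref.fa S1_R1.fastq.gz S1_R2.fastq.gz \
        | samtools sort -@ 8 -o S1.sorted.bam -
    samtools index S1.sorted.bam
    # Per-sample variant calling in GVCF mode
    gatk HaplotypeCaller -R ref.fa -I S1.sorted.bam -O S1.g.vcf.gz -ERC GVCF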

We are currently using a server with the following specification -
Dell 2u server (Bought in 2016)
Intel Xeon E5-2660 v4 x 2 (14 cores / 28 threads each), 56 threads total
RAM - 378 GB
Storage - 72 TB mounted as 2 equal-sized partitions, RAID 5 (6 TB HDD x 13)
Gigabit networking
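(For reference, the RAID 5 math checks out: 13 x 6 TB = 78 TB raw, and with one drive's worth of parity the usable capacity is (13 - 1) x 6 TB = 72 TB.)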
We have roughly 5 TB of storage remaining. We have ~3000 exome samples and ~40 WGS samples (all samples have BAMs + fastq.gz files), plus a bunch of software and resources. We try to remove intermediate files as soon as possible, but the server tends to be in use most of the time. We use OneDrive to back up the gVCF files (the university email comes with 1 TB of OneDrive storage, but this is just a stopgap); the BAM files aren't backed up anywhere. We plan to delete older data, e.g. anything 3 or more years old. However, even such files are accessed occasionally during analysis.
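For anyone curious how the OneDrive leg can be scripted: a minimal sketch using rclone, assuming a remote named onedrive: has already been set up with rclone config (the remote name and paths are placeholders):

    # Push gVCFs to OneDrive (remote name and paths are placeholders)
    rclone copy /data/gvcf onedrive:gvcf-backup --progress
    # Compare local files against what is on the remote
    rclone check /data/gvcf onedrive:gvcf-backup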

As of now, there is essentially one working copy of the data. Due to budget constraints, the plan is to copy the raw data (fastq files) to 4 TB external hard drives, catalogue the drives with VVV (good software), and retrieve the relevant drive whenever we need any fastq files. I've been told I can delete the backed-up fastq files from the server, but I would rather not: we are in a coastal region, and there is no telling how the external hard drives might (or might not) hold up in that climate.
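A minimal sketch of the copy-and-verify step I have in mind for those drives (all paths are placeholders):

    # Copy one batch of raw fastqs to an external 4 TB drive
    rsync -avh --progress /data/fastq/batch01/ /mnt/ext4tb/batch01/
    # Record checksums next to the data so the drive can be verified years later
    ( cd /mnt/ext4tb/batch01 && find . -type f ! -name MD5SUMS -exec md5sum {} + > MD5SUMS )
    # Before trusting an old drive, re-verify it
    ( cd /mnt/ext4tb/batch01 && md5sum -c MD5SUMS )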

My ideas are as follows -
  1. Use a NAS to store archived BAM and fastq files
  2. If possible, upgrade server
There is a NAS model from Dell which might suit our purpose, but it was quoted at 4 lakh INR with decked-out storage (12 TB x 4).
So instead of just upgrading our storage, I'm thinking of retiring the old server to storage duty + some experimental work, and buying a new server instead.
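If we do go the NAS route, the working idea is to mount it on the compute server over NFS and move cold BAMs there; a minimal sketch (hostname, export, and paths are placeholders):

    # Mount the NAS export on the compute server (hostname and paths are placeholders)
    sudo mkdir -p /mnt/archive
    sudo mount -t nfs nas01:/archive /mnt/archive
    # Move completed BAMs off the working RAID array
    mkdir -p /mnt/archive/bam
    mv /data/bam/finished_project/*.bam /mnt/archive/bam/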

The configuration I've decided on is as follows -
  • PowerEdge R750xs
  • Intel Xeon Silver 4314 (16C, 32T) x 2
  • 256 GB RAM (which is eye-wateringly expensive!)
  • 1.2 TB HDD for the OS, as even a 480 GB SSD was quoted at ~600 USD
  • Roughly 12 TB HDD storage

Guess I'm looking to vent here as well. Our university IT doesn't see anything beyond their pre-approved list, and for a 'special use case' like ours they have left the configuration to us. I tried speaking with a few people, but the discussion didn't seem to lead anywhere.
 
Have you thought about using cloud like Azure or AWS? You can scale on demand and turn it off when not needed. You can get an initial assessment of Azure cloud savings using the link below if you have a broad idea of the server config you would need. You can also check whether you can save by using your existing servers.

 
It doesn't work for our purpose directly. We tried to optimize but ended up deciding on a local system. Money spent on cloud leaves us with only the data in hand, whereas buying a server becomes an asset.
 