Some general prerequisites:
- BUSINESS broadband (NOT home broadband!) - it may be more expensive, but the ISP is contractually required to deliver the agreed speed 99.9% of the time. If they fail to deliver you may be entitled to damages, but it almost never comes to that. THIS IS VERY IMPORTANT, especially where you need symmetric upload/download speeds, or need LARGER upload than download speeds to serve your pool of clients.
- A good server + array expander (I recommend HP, they have great enterprise support and their products are really good IMO)
- If you would rather build your own, you can. The parts below are what you will need.
- Lots and lots of enterprise-class disks. Unless you are building an array measured in PB rather than TB, it's better to stick with smaller drives because you are looking at a lower AFR (annualized failure rate). Most enterprise drives have an NRE (non-recoverable read error) rate of 10^-15 per bit read; avoid consumer drives, which have an NRE rate of 10^-14. If you've got the cash, opt for drives with an NRE rate of 10^-16 or better, but these are generally insanely expensive and rare. Better still, if you've got deeper pockets, you can go for an all-SSD array. Realistically, if capacity is what matters to you, enterprise-class HDDs will suffice. Speeds are not significantly different between 7.2k and 10k RPM in a RAID 5 setting, but 15k makes a difference. (I doubt they make 15k drives anymore; mainstream enterprise drives are rated at 7.2k RPM, which gives you a per-drive speed of 160MB/s at best.)
- A good RAID card. I wouldn't recommend LSI if you are working with a large number of disks. Get an HP RAID card. I am a little behind the times, using the P410i, but it still gets the job done.
- A SAS expander - these are life savers when you need to add more disks. The current HP SAS 3 Expander supports up to 48 drives: 12 SFF-8087 ports, each split into 4 individual data cables.
- A good casing - A Norco RPC-4220 does the trick
- For your main server, look at server-grade motherboards from Supermicro. I have heard of people trying Ryzen Threadripper, but I have never done that myself so I can't comment.
- A decent multi-port LAN card - those from Intel fare quite well.
- A managed switch - to perform link aggregation of multiple LAN ports
- Next, you will need to pick a method of data transport: Fibre Channel or iSCSI. The former is insanely expensive and you gotta have additional hardware for that, but you can get insane speeds through Fibre Channel.
- iSCSI is more widely used. It is slower, but it can use existing hardware (the multi-port LAN card, where you aggregate 4 ports into 1) + your managed switch.
- And then you gotta pick your OS. Windows Server Standard may lack the features for Fibre Channel (IIRC) and high availability, so you may want to consider Windows Server Enterprise in that case.
- Most sysadmins will go with Linux, e.g. CentOS or SUSE, but you have to configure everything yourself to set up multipath and high availability (it's more flexible but more troublesome too)!
- I'd recommend you read further on how to configure all these.
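To put the NRE figures from the disk bullet into perspective, here is a rough Python sketch of how likely you are to hit at least one non-recoverable read error when reading a given amount of data. It assumes every bit fails independently at the quoted rate, which is a simplification. The 12 TB read size and the function name are made up for illustration, not taken from any vendor spec.

```python
import math


def p_read_error(bytes_read: float, nre_per_bit: float) -> float:
    """Chance of at least one non-recoverable read error.

    Assumes independent errors per bit (a simplification; real drives
    do not fail this neatly).
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p does not round to 0
    return -math.expm1(bits * math.log1p(-nre_per_bit))


TB = 1e12  # decimal terabyte, as drive vendors count it

# Hypothetical example: reading 12 TB end to end
for nre in (1e-14, 1e-15, 1e-16):
    p = p_read_error(12 * TB, nre)
    print(f"NRE {nre:.0e}: {p:.1%} chance of a read error over 12 TB")
```

Treat the output as a ballpark only, but it shows why each extra order of magnitude in the NRE rating matters more as the array grows.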
Don't forget, your data center needs good ventilation. Get a server cabinet (+ fans) - and you need a dedicated room, air conditioned to keep the servers cool under high load. Some server racks come with liquid cooling to bring the temperatures down further, although you would only see this in massive scale data centers.
The cheapest you can get started for would be around US$2k to $3k.
Also, on the RAID level, I would recommend RAID 5 if you care more about the usable volume per array, or RAID 6 if you can't tolerate any downtime at all.
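The RAID 5 vs RAID 6 trade-off can be put into numbers. RAID 5 gives up one drive's worth of capacity to parity and survives one drive failure; RAID 6 gives up two and survives two. The sketch below is only an illustration: the function name and the 8 x 4 TB example are my own, and real controllers lose a little more space to metadata.

```python
PARITY_DRIVES = {5: 1, 6: 2}  # drives' worth of capacity spent on parity
MIN_DRIVES = {5: 3, 6: 4}     # smallest array each level supports


def raid_summary(level: int, drives: int, drive_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, drive failures tolerated)."""
    if level not in PARITY_DRIVES:
        raise ValueError("only RAID 5 and RAID 6 are handled here")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    parity = PARITY_DRIVES[level]
    usable = (drives - parity) * drive_tb
    # The number of parity drives equals the number of failures survived
    return usable, parity


# Hypothetical example: 8 drives of 4 TB each
for level in (5, 6):
    usable, tolerated = raid_summary(level, 8, 4.0)
    print(f"RAID {level}: {usable:.0f} TB usable, survives {tolerated} drive failure(s)")
```

So on the same set of disks, RAID 6 costs you one more drive of space in exchange for surviving a second failure while the array is rebuilding.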
If you remain unfazed by the time you reach this sentence, then it's time to take up arms and build the first building block of your data center.
As I said, if you want to avoid the hassle, go for the cloud. There are plenty of offerings, like AWS, that have far more resources than you can imagine.
Hope this long post helps.
I welcome all other members to correct me if they wish.