Help: How to create a high end NAS/SAN with standard hardware
I have been trying on and off for about 2 years (and failing, I might add) to create a "SAN/NAS system" (see my description below) that can export a data partition that multiple cluster nodes can mount r/w at the same time. I have tried all of these technologies at one point or another, in various combinations: drbd, nfs, gfs, ocfs, aoe, iscsi, heartbeat, ldirectord, round robin DNS, pvfs, cman, clvm, and several fencing solutions. I finally "gave up" trying to do it myself and e-mailed about 10 SAN/NAS appliance resellers; only 2 said they could meet my specs. Their price range: $10,000 - $80,000. That priced me right out of the market.
So back to trying to do it myself. I came across glusterfs. It was super easy to set up and test, and it is almost exactly what I want/need. There are only 2 issues.
#1. When one of the servers goes down, the client hangs, at least for a little while. More testing is needed to be sure it comes back at all.
#2. The read/write tests I performed on glusterfs came in at 1.6, NFS on the same machines came in at 11, and a direct test on the data server came in at 111. How do I improve the performance?
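For anyone who wants to reproduce the comparison in #2, here is a minimal sketch of a read/write test that can be pointed at the glusterfs mount, the NFS mount, and a local directory on the data server in turn. It is not the tool I used for the numbers above; the function name, file size, and block size are assumptions picked for illustration. Note that the read pass may be served from the client's page cache unless the file is larger than RAM.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=64, block_kb=1024):
    """Write then read back a scratch file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s).
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Timed write, including the fsync that pushes data to the server/disk.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Timed sequential read of the same file.
        start = time.perf_counter()
        total = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(len(block))
                if not chunk:
                    break
                total += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the scratch file

    mb = total / (1024 * 1024)
    return mb / write_s, mb / read_s
```

Running the same call against each mount point gives three pairs of numbers that can be compared directly, since the file size and block size are identical in every run.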
###############################################
My gluster set-up:
2 Supermicro dual Xeon 3.0 GHz CPUs, 8 GB RAM, 4 @ 750 GB Seagate SATA HDs, 3 in RAID 5 with 1 hot spare. (data servers)
1 Supermicro dual AMD 2.8 GHz CPUs, 4 GB RAM, 2 @ 250 GB Seagate SATA HDs in RAID 1. (client server)
gluster is set-up with round robin DNS to handle the load balancing of the 2 data servers.
I also tried heartbeat with a virtual IP and ldirectord to redirect the traffic, but when I took down server 1, there was a delay of about 10 seconds before server 2 took over the virtual IP, and the client hung.
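To put a number on how long the client actually hangs during a failover, something like the sketch below could be left running on the client while a data server is taken down. It simply times a small write-and-fsync over and over and reports the longest single call. The function names and the probe path in the usage note are hypothetical, and this assumes a hung mount shows up as a blocked or failing write.

```python
import os
import time


def write_probe(path):
    """Write and fsync a few bytes; blocks for as long as the mount is hung."""
    with open(path, "wb") as f:
        f.write(b"ping")
        f.flush()
        os.fsync(f.fileno())


def longest_stall(probe, duration_s=60.0, interval_s=0.1):
    """Call probe() repeatedly for duration_s seconds.

    Returns (longest single call in seconds, number of calls that raised OSError).
    """
    worst = 0.0
    errors = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        start = time.monotonic()
        try:
            probe()
        except OSError:
            # The mount returned an error instead of blocking.
            errors += 1
        worst = max(worst, time.monotonic() - start)
        time.sleep(interval_s)
    return worst, errors
```

For example, longest_stall(lambda: write_probe("/mnt/gluster/.probe"), duration_s=120), where /mnt/gluster is a made-up mount point, started shortly before pulling the plug on server 1. Because a blocked call can run past the deadline, the reported stall is not capped at duration_s.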
###############################################
Description of my dream "SAN/NAS system"
Data server:
2 units (appliances/servers) that each have 4 drives in a raid5 disk set (3 active, 1 hot spare). They can be active/passive or active/active, I don't care.
These two units should mirror each other in real time.
If 1 unit fails for any reason the other picks up the load and carries on.
On a failure I want the data to be re-synced automatically when the failed unit comes back on-line.
Data clients:
Each cluster node machine (the clients) in the server farm (CentOS 5.4 OS) will mount 1 or more data partitions provided by the data server(s).
If the active server goes down (multiple HD failure, network issue, power supply, etc.), the 2nd server takes over and the client machines never know.
All clients will mount r/w simultaneously, so some type of network file system that supports network file locks is required.
It would be even more ideal if the data servers could be 1-N instead of just 1-2.
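The network file lock requirement above can be checked with a small test like the following sketch, which uses POSIX advisory locks through Python's standard fcntl module. The helper names are made up for illustration, and whether a given network file system honors these locks across clients is exactly what the test is meant to find out, not something the code guarantees.

```python
import errno
import fcntl


def try_exclusive_lock(path):
    """Try to take an exclusive POSIX advisory lock on `path` without blocking.

    Returns the open file object (keep it open to hold the lock),
    or None if another process already holds the lock.
    """
    f = open(path, "a+b")
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        f.close()
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held elsewhere
        raise  # e.g. the file system does not support locking at all
    return f


def release_lock(f):
    """Release the lock taken by try_exclusive_lock and close the file."""
    fcntl.lockf(f, fcntl.LOCK_UN)
    f.close()
```

The idea is to call try_exclusive_lock on the same file on the shared mount from two different client machines: while the first client holds the lock, the second should get None back. If both clients get a file object at the same time, the locks are not being propagated between nodes.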
To re-cap:
A. 2 SAN/NAS data servers mirroring each other in real time.
B. Auto fail over to the 2nd server if 1st fails (without the clients needing to be restarted, or even being interrupted if possible).
C. Auto re-sync of the data when a failed unit comes back on-line; when the sync is done, the unit goes active again (assuming its normal state is active).
D. Multiple machines mounting the same partition in read/write mode (some kind of network file system).
E. Linux CentOS will be used on the cluster nodes.
Can anyone help?
I am open to any viable solutions out there.