  1. #1
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10

    Server distro, file buffering and filesystem


    Hi,

    This is my first post here, and I hope I managed to post it in the correct subforum.

    During my summer break from university I'm working as a developer at a local company. Most of the software is developed for Windows, but I've been given a small side project: setting up a Linux server that will host some simple internal services.

    It's going to be a very basic server setup, the MOST important thing being that it is a very efficient file server on which the Windows clients can access files.

    It will also host some basic services (I'll write a bash script that collects files for backup and sends them to FTP each night, I have some indexing programs I wrote in C that should be easy to port, and it will run a single MySQL server). Very basic stuff.
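    For the nightly backup I'm imagining something roughly like this; the paths, FTP host and credentials are just placeholders, nothing final:

    Code:
    #!/bin/bash
    # Nightly backup sketch: archive the shared directory and push it to FTP.
    # All paths, the host and the credentials below are placeholders.
    set -e

    SRC="/srv/share"
    STAMP=$(date +%Y-%m-%d)
    ARCHIVE="/var/backups/share-$STAMP.tar.gz"

    tar czf "$ARCHIVE" "$SRC"

    # curl can upload straight to FTP; in practice the credentials would live
    # in a root-only ~/.netrc rather than in the script itself.
    curl -T "$ARCHIVE" "ftp://backup.example.com/nightly/" --user backupuser:secret

    # keep only the 14 newest archives locally
    ls -1t /var/backups/share-*.tar.gz | tail -n +15 | xargs -r rm -f

    The plan would be to run it from root's crontab, e.g. 0 2 * * * /usr/local/sbin/nightly-backup.sh.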

    The "problem" is that there is about a million small shared files, which will be accessed often and randomly(as well as some large data files which will also be written to often). Access time has to be very fast, something which the old server doesn't handle really well(also it tends to not update the shared file table, which means that newly accessed files sometimes aren't available). Also, we have a problem with fragmentation when writing the large files. This is where my question about file systems comes in: Should I simply choose ext4, or is there a better alternative for this scenario?

    About hardware and buffering: we're thinking of going with either 2x WD VelociRaptor drives (one disk for redundancy) or 2x SSDs. The reason I can't choose is that I'm not sure how effectively Samba buffers files. If we installed e.g. 32 GB of RAM in the machine, would it be able to use this for caching previously written/read files (and for buffering the creation of new files)? If so, there would be less reason to go with the SSDs. The server will have a UPS, so it will have time to flush the buffers in case of a power failure.
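    From what I've read so far, the kernel's page cache should do this on its own and Samba just goes through it; something like the following is what I'd use to watch and tune it, if I've understood it right (the sysctl value is an example, not a recommendation):

    Code:
    # File data cached in RAM shows up in the "cached" column
    free -m

    # How much dirty (written but not yet flushed) data the kernel keeps in RAM
    # before it starts writing it back; lower values flush earlier.
    sysctl vm.dirty_background_ratio
    sysctl vm.dirty_ratio

    # Example only: start background writeback at 5% of RAM
    echo 'vm.dirty_background_ratio = 5' >> /etc/sysctl.conf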

    My last question is about the distribution. I'm a rather new Linux user (relatively speaking; compared to most of you I probably am, but I have done my share of C programming in Vim, which means that I consistently press :wq in any other text editor, heheh). I like simple distributions, like Ubuntu Server. Are there other distributions you can recommend taking a look at before installing? The server will have to be extremely stable and secure (although since it isn't running many external services and will be behind a router, security is not a major concern). Maybe plain Debian, if that has advantages over Ubuntu? (I like Debian-based installations because that's what I'm most used to, but if it would benefit me to learn something new, that's not a problem.)

    Thank you for your time!
    Daniel

  2. #2
    Linux Engineer Segfault
    Join Date
    Jun 2008
    Location
    Acadiana
    Posts
    877
    Debian is just great for this, IMO.

  3. #3
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10
    Quote Originally Posted by Segfault View Post
    Debian is just great for this, IMO.
    Yeah, I'm also leaning a bit towards Debian due to its reputation for stability and the loads of documentation. I guess the main difference between Ubuntu and Debian is that Ubuntu features newer packages, but since that isn't really a requirement, perhaps Debian is the way to go. And Debian has a vast amount of documentation anyway.

  4. #4
    Linux Engineer Kloschüssel
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    773
    If you need speed and it should be highly failsafe, go for a hardware RAID6 array (or higher) of at least 5 SSDs, plus two or more gigabit ports bonded together. That should give you a network read/write throughput of about 1.8 Gbit/s, a hardware read/write rate far beyond 1000 IOPS, and the SSDs will save a lot of power (which obviously is cheaper). This kind of thing is done completely in hardware, so the OS doesn't care and you can go for whatever distro you want, even though I would consider Debian for stable server environments.
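    Bonding the ports just means link aggregation. A quick sketch of how it could be brought up by hand on Debian, assuming two interfaces eth0/eth1 and an LACP-capable switch (names and addresses are placeholders; for a permanent setup it would go into /etc/network/interfaces via the ifenslave package):

    Code:
    # Sketch: bond eth0 and eth1 into bond0 using LACP (mode 802.3ad).
    # A single client still tops out at ~1 Gbit/s; the ~1.8 Gbit/s figure is
    # the aggregate over several clients.
    modprobe bonding mode=802.3ad miimon=100
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1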

  5. #5
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10
    Quote Originally Posted by Kloschüssel View Post
    If you need speed and it should be highly failsafe, go for a hardware RAID6 array (or higher) of at least 5 SSDs, plus two or more gigabit ports bonded together. That should give you a network read/write throughput of about 1.8 Gbit/s, a hardware read/write rate far beyond 1000 IOPS, and the SSDs will save a lot of power (which obviously is cheaper). This kind of thing is done completely in hardware, so the OS doesn't care and you can go for whatever distro you want, even though I would consider Debian for stable server environments.
    I'll take a look at RAID 6; so far I've mostly used plain RAID 1 (a simple mirror) on my NAS server. Thanks for the suggestion!

  6. #6
    Linux Engineer Kloschüssel
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    773
    I personally prefer higher-level RAID over simplistic RAID10 or RAID1 because you get more for the money and the performance boost is significant.

    For example, take a RAID6 over 10 SSDs. You get the read performance of 8 SSDs, each reading 1/8th of the actual data. Assuming each SSD has a throughput of 150 MB/s, you get about 0.9 * 8 * 150 MB/s (assuming 10% of the read performance is lost to overhead), which is roughly 1 GB/s (with RAID1 it would be the performance of a single disc, i.e. 150 MB/s). Obviously it helps when the data is striped so that it splits evenly across the 8 data discs. And you still have two discs' worth of parity, which keeps all data safe unless more than 2 discs fail at the same time; if up to two discs do fail, the system can still operate (with reduced performance). In the end you get 80% of the raw disc space (with RAID1 it is only 50%), because only two discs' worth of capacity goes to parity and the rest stores data. Thus, buying 10 80 GB SSDs would give you 640 GB of storage (versus 400 GB with RAID1).
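    For reference only, since the suggestion above is a hardware controller that handles all of this itself: the same 10-disc RAID6 layout as Linux software RAID would be created roughly like this (device names are placeholders):

    Code:
    # Sketch: 10-disc software RAID6 with mdadm (a hardware RAID card needs none of this)
    mdadm --create /dev/md0 --level=6 --raid-devices=10 /dev/sd[b-k]
    mkfs.ext4 /dev/md0
    cat /proc/mdstat    # shows the array building and, later, its health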

    just my two cents.
    Last edited by Kloschüssel; 08-10-2010 at 09:19 AM.

  7. #7
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10
    Quote Originally Posted by Kloschüssel View Post
    I personally prefer higher-level RAID over simplistic RAID10 or RAID1 because you get more for the money and the performance boost is significant.

    For example, take a RAID6 over 10 SSDs. You get the read performance of 8 SSDs, each reading 1/8th of the actual data. Assuming each SSD has a throughput of 150 MB/s, you get about 0.9 * 8 * 150 MB/s (assuming 10% of the read performance is lost to overhead), which is roughly 1 GB/s (with RAID1 it would be the performance of a single disc, i.e. 150 MB/s). Obviously it helps when the data is striped so that it splits evenly across the 8 data discs. And you still have two discs' worth of parity, which keeps all data safe unless more than 2 discs fail at the same time; if up to two discs do fail, the system can still operate (with reduced performance). In the end you get 80% of the raw disc space (with RAID1 it is only 50%), because only two discs' worth of capacity goes to parity and the rest stores data. Thus, buying 10 80 GB SSDs would give you 640 GB of storage (versus 400 GB with RAID1).

    just my two cents.
    Yeah, I've been reading up on it a bit. I'll have to see if I can get my employer on board with it, though. The thing is that there aren't a lot of big files; rather, the files will be spread out all over the disk (and SSDs already give a performance gain there). But then again, once in a while transfer speed also counts, and with RAID6 and SSDs we would address seek time, transfer speed and reliability all at once. But it's also a more expensive solution.

  8. #8
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10
    By the way, is it OK to set the write cache size for Samba? Or do file systems like ext4 handle this automatically?

  9. #9
    Linux Engineer Kloschüssel
    Join Date
    Oct 2005
    Location
    Italy
    Posts
    773
    AFAIK the write cache is a buffer to which Samba writes data (BEWARE: one per open file!) before it is written to the filesystem. For small files it can be rather useful, because you don't have to wait for the hard disc to finish, only for the cache (which is in memory).

    I believe most modern hard drives also do write caching of their own to improve performance. But it all depends on the ext filesystem settings and how well they play together with the hard drive... well, I had never bothered with it until now.
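    If you want to experiment with it, the relevant part of smb.conf would look something like this; the share path and the size are just example values, and 0 (the default) leaves the option off:

    Code:
    # smb.conf sketch (placeholder share, example cache size)
    [share]
        path = /srv/share
        read only = no
        # per-file write buffer in bytes; 0 (the default) disables it
        write cache size = 262144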

  10. #10
    Just Joined!
    Join Date
    Aug 2010
    Posts
    10
    Quote Originally Posted by Kloschüssel View Post
    AFAIK the write cache is a buffer to which Samba writes data (BEWARE: one per open file!) before it is written to the filesystem. For small files it can be rather useful, because you don't have to wait for the hard disc to finish, only for the cache (which is in memory).

    I believe most modern hard drives also do write caching of their own to improve performance. But it all depends on the ext filesystem settings and how well they play together with the hard drive... well, I had never bothered with it until now.
    Yeah, I also thought it could benefit small files. I know hard disks have their own write caches, but they're usually only 16-32 MB, and if you continually push the system with a lot of small files that buffer would probably fill too fast (especially considering that the system will have a lot of memory available for this purpose).

    I suppose I'll have to play around a bit and measure performance once I get the system running.
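    Probably starting with something crude like this, just to get ballpark numbers; the path is a placeholder and none of it is meant to be scientific:

    Code:
    # Sequential write of 1 GB; conv=fdatasync makes dd wait until the data has
    # actually hit the disc before reporting a rate.
    dd if=/dev/zero of=/srv/share/testfile bs=1M count=1024 conv=fdatasync

    # Small-file test: create 10,000 files of 4 KB each, then read them back.
    mkdir -p /srv/share/smalltest
    for i in $(seq 1 10000); do
        dd if=/dev/zero of=/srv/share/smalltest/f$i bs=4k count=1 2>/dev/null
    done
    time cat /srv/share/smalltest/f* > /dev/null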
