  1. #1
    Just Joined!
    Join Date
    Jul 2011
    Location
    Sevenoaks, Kent, UK
    Posts
    5

    Keep directory in sync on multiple servers


    We have got 2 web servers running Red Hat Enterprise Linux 6.2 64-bit which used to have static content that was synchronised between the servers every night with rsync. However, we have recently added a new website that requires storage to be in sync all the time between the two aforementioned servers. We would not like to go for a SAN/NAS solution, as that would mean buying two quite expensive devices that would barely be used.

    Is there any software solution that can be used to keep a directory in sync between two or more servers?

    Any help would be appreciated.

    Kind Regards,
    Paul Preston

  2. #2
    Trusted Penguin Irithori
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,378
    That part is crucial:
    ...requires storage to be in sync all the time
    Because, strictly speaking, that rules out rsync and leaves:
    - DRBD
    - NFS
    - GFS2 over iSCSI

    If the requirement is not quite that strict, then maybe replace rsync with lsyncd.
    This invokes rsync only on changed files and only when they change:
    lsyncd is driven by inotify events and acts on those, not on a fixed schedule like cron.
    AFAIK it also does not need to build a full file list on both machines, as the list of changed files is already known.
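
    A minimal sketch of what that could look like, assuming a recent lsyncd 2.x and working SSH keys between the two web servers; the paths, host name and config location are placeholders:

    -- /etc/lsyncd.conf (placeholder location)
    settings {
        logfile    = "/var/log/lsyncd.log",
        statusFile = "/var/log/lsyncd.status",
    }
    sync {
        default.rsyncssh,                 -- rsync over ssh, only for changed files
        source    = "/var/www/static",    -- local directory watched via inotify
        host      = "web02.example.com",  -- the peer web server
        targetdir = "/var/www/static",    -- same path on the peer
        delay     = 5,                    -- batch events for a few seconds
    }

    Started with "lsyncd /etc/lsyncd.conf", it then pushes changes as they happen instead of on a nightly cron schedule.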


    Any of the three shared device/filesystem approaches listed above introduces complexity, dependencies, performance loss and essentially a SPOF:
    if the machine that hosts the files goes down, then that's it.

    Can you tell us more about how this new website uses the shared files?

    1) Static: A new version of the website is released and from then on the files are static.
    In this case I would suggest packaging the release into RPMs.
    Make them available via a repository and install them in parallel.
    This will result in a few seconds of inconsistency.
    If you want to avoid that, drain the first machine on a (hopefully existing) load balancer before the RPM install,
    then continue with the next. (A rough rollout sketch follows after point 2.)

    2) Dynamic:
    It would be good to know which type of data (sessions, web user data, generated data, images, etc.)
    and how many files of what size you expect.
    Also their life expectancy: are they expendable after, e.g., the session is over, or is every file to be archived?
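
    For option 1), a rough shell sketch of such a rolling install. The host names and package name are placeholders, and the drain/re-enable steps depend entirely on your load balancer:

    # roll the static-content RPM out one web server at a time
    for host in web01 web02; do
        # optionally drain $host on the load balancer here, so it never
        # serves a mix of old and new content during the install
        ssh "$host" 'yum -y install mysite-static'
        # optionally re-enable $host on the load balancer
    done
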
    You must always face the curtain with a bow.

  3. #3
    Just Joined!
    Join Date
    Jul 2011
    Location
    Sevenoaks, Kent, UK
    Posts
    5
    I didn't mention that we would like to have resiliency, i.e. avoid a SPOF.

    NFS, GFS, etc. are ways to access shared storage, whereas I'm after replicating directories. Let's simplify it: the objective is to find a solution that will allow us to keep two directories on different servers in sync in real time.

    Type of content: mostly images

    Kind Regards,
    Paul Preston

  4. #4
    Trusted Penguin Irithori
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,378
    Real-time sync and high availability are hard to achieve together.
    I don't think there is an easy out-of-the-box trick to do that.

    If you can afford to wait a few seconds, then I would again go for the lsyncd approach.

    If not, then either invest in redundant shared storage hardware and use NFS,
    or maybe save those images in a database.
    The basic idea would be: each image consists of an ID, a mimetype and a blob.
    Have multiple tables for load distribution: maybe ID modulo <number of tables> = table name.

    You could then have a master DB for writing and multiple slave DBs to read from.
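
    A hypothetical sketch of one such table in MySQL flavour; the table and column names are made up, and the application would pick the table name as images_<ID modulo N>:

    -- one of N identical shard tables (images_0 .. images_<N-1>)
    CREATE TABLE images_0 (
        id       BIGINT UNSIGNED NOT NULL PRIMARY KEY,
        mimetype VARCHAR(64)     NOT NULL,
        data     LONGBLOB        NOT NULL
    );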

    Yes, this needs code changes, and yes, it is not something that can be done in a few hours.
    You must always face the curtain with a bow.

  5. #5
    Just Joined!
    Join Date
    Jul 2011
    Location
    Sevenoaks, Kent, UK
    Posts
    5
    Irithori,

    Many thanks for your help. I will give lsyncd a go, and since only local storage is accessed by Apache, I see it as a resilient solution (at the server level). We have got 3 proxy servers with load balancing in front of the web servers, so if one web server fails, all traffic is automatically diverted to the second one. The same goes for the databases, which are configured with bi-directional replication.

    I will try it and post my thoughts on that solution.

    Kind Regards,
    Paul Preston

  6. #6
    Trusted Penguin Irithori
    Join Date
    May 2009
    Location
    Munich
    Posts
    3,378
    The lsyncd approach has one flaw for your scenario: There is no locking.
    This is crucial, as you have multiple "writers".

    I found this nice HowTo, which explains how to tie lsyncd with csync2
    https://www.axivo.com/community/thre...nd-lsyncd.121/

    Nice read. Might be useful for me as well in the future.
    I only dislike the idea of having to rely on third-party RPM repos,
    but that can be addressed by packaging the tools in-house.
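
    For reference, a minimal two-node csync2 config could look roughly like this; the host names, key path and directory are placeholders, and the key itself would be generated with "csync2 -k" and copied to both nodes:

    # /etc/csync2.cfg
    group web {
        host web01 web02;
        key  /etc/csync2.key;
        include /var/www/static;
        auto younger;    # on conflict, keep the newer copy
    }

    lsyncd would then trigger "csync2 -x" on inotify events instead of calling rsync directly, which is presumably roughly what the HowTo wires together.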


    Just for the record, my lsyncd use case is simpler:
    - One "forge" machine, where files are generated.
    - Multiple "frontends" that receive the files.
    - The forge machine is considered backend and not redundant.
    - If it goes down, then it goes down. No direct effect on the website.
    - It can be redeployed anytime and with little effort, as it is fully automated.
    Last edited by Irithori; 06-18-2012 at 12:09 PM.
    You must always face the curtain with a bow.
