  1. #1
    herot (Just Joined!)
    Join Date: Dec 2005
    Posts: 41

    Need advice on cleaning out unused files on a web server...


    I am on the verge of making some (drastic) changes to my company's web page, which was built before I got here. I have already replaced everything I wanted to with new, updated content, and now I have the go-ahead to go through and destroy everything old and unneeded. Aside from some *.swf buttons, is there a way I could map out every link and path to every file currently in use, redirect (">") that list into a file, then grep the output of "ls" for all the files not in the map, so that I can delete them? I know Dreamweaver (which I use) has a feature called a "link checker". Would that produce a sufficient map of every file my web page needs to function?

    I guess the first step is finding a reliable way to map out the "used" and "essential" files recursively from /. Then I can figure out a way to filter out the leftovers.
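
    One way to do that filtering step, once the "map" of used paths exists (from the link checker, a crawl, or anything else), is to compare it against what is actually on disk. A rough sketch, assuming the docroot is /var/www/html and the map is in used.txt with one path per line, relative to the docroot (both names are placeholders):

        # everything that actually exists under the docroot, one relative path per line
        ( cd /var/www/html && find . -type f | sort ) > onserver.txt

        # sort the "used" list the same way so comm can compare the two line by line
        sort used.txt > used-sorted.txt

        # lines that appear only in onserver.txt are files nothing links to
        comm -13 used-sorted.txt onserver.txt > leftovers.txt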

    If anyone has a better idea, please share.

  2. #2
    Linux Guru
    Join Date: May 2004
    Location: forums.gentoo.org
    Posts: 1,817
    Maybe wget can help. It's intended to copy pages and their links, so maybe if you wget your files out of the current directory (without dead links), you will be able to clean house and then wget them back in. If you try this, experiment first to make sure it works for you.
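
    For what it's worth, a minimal sketch of that idea (the URL and the sitecopy directory are made up; try it against a test copy first):

        # mirror everything still reachable from the front page into ./sitecopy
        #   -m   recursive mirror       -p   also grab images, CSS and .swf files the pages reference
        #   -k   convert links so the local copy works on its own
        #   -np  don't climb above the starting directory
        #   -nH  skip the hostname directory; -P drops everything under ./sitecopy
        wget -m -p -k -np -nH -P sitecopy http://www.example.com/

        # whatever wget pulled is, by definition, still linked from the home page
        ( cd sitecopy && find . -type f | sort ) > used.txt
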
    /IMHO
    //got nothin'
    ///this used to look better

  3. #3
    herot (Just Joined!)
    Join Date: Dec 2005
    Posts: 41
    You can wget back in? I thought it was just for downloading; cool.
    I don't think I have any dead links, but can I make it get them too? Because as long as I keep them, I can go back and fix them later...
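
    wget can't download the target of a genuinely dead link (there's nothing there to fetch), but a spider run will at least report which links are broken so they can be fixed later. A hedged sketch, with a made-up URL:

        # crawl without saving anything; broken links show up in the log
        wget --spider -r -o spider.log http://www.example.com/
        grep -i -B2 'broken link\|404' spider.log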

    I hadn't really considered wget; I think that's a pretty good idea...

    Now, how to do a mass delete of various files located in various directories...
    (sounds nasty, doesn't it?).
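
    Assuming the orphans ended up in a file like leftovers.txt (one path per line, relative to the docroot, as in the earlier sketch), something along these lines would handle the mass delete; moving them into a quarantine directory first is gentler than rm, even with a backup on hand:

        # move the orphans aside, preserving their directory layout, instead of deleting outright
        mkdir -p /tmp/quarantine
        while IFS= read -r f; do
            mkdir -p "/tmp/quarantine/$(dirname "$f")"
            mv "/var/www/html/$f" "/tmp/quarantine/$f"
        done < leftovers.txt

        # or, to delete in place in one pass instead of quarantining:
        #   ( cd /var/www/html && xargs -d '\n' rm -- ) < leftovers.txt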

    I do have a complete (and current) backup of the entire web page, though...

    Oh, and a question about wget: will it follow links out of things like *.swf (Flash buttons) and similar?

  4. #4
    Linux Guru
    Join Date: May 2004
    Location: forums.gentoo.org
    Posts: 1,817
    From the man page for wget:
    • Wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading."
    When you say you want to "destroy everything old and unneeded", I understood that to mean all of the stuff that is no longer linked from anything you want to keep. So I was thinking that you could wget your home page, and if that's the top of everything you want to keep, everything should be copied. I've only used wget for downloading from the internet, so it may only move stuff off of a web server; but after you've moved the good stuff off your web server, you can move it back (post housekeeping) by a normal copying method. If all of your links are relative (and wget can make them so), that should do it. If you have external links, you'll want to make sure that wget doesn't go outside of your domain.
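
    A small example of those last two points, with a placeholder domain; a recursive wget stays on the starting host by default, and -np keeps it from climbing above the starting directory:

        # -k rewrites the links in the downloaded copy so they work locally (relative where possible)
        wget -m -p -k -np http://www.example.com/

        # only if you deliberately want to span hosts, whitelist your own domain explicitly:
        #   wget -m -p -k -np -H --domains=example.com http://www.example.com/

    Note that -p (--page-requisites) should also pull in .swf files that the HTML references, although wget only parses HTML/XHTML (and CSS) for further links, so it won't follow anything embedded inside the Flash files themselves.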

    There are plenty of ways to screw up downloads, as I've learned, but I think that with a little effort it might do what you need. As for the Flash buttons: if you mean linked files in your web directory, then yes, I think those will be copied the same as if they were HTML pages. I've been wrong before, though.
    /IMHO
    //got nothin'
    ///this used to look better
