  1. #1
    Just Joined! Rava's Avatar
    Join Date
    Jul 2007
    Location
    hacking 127.0.0.1
    Posts
    50

    How are URL descriptions managed?


    How does the Forum manage the descriptive names for given URLs?
    Does it load the site and use the title of the given URL as the name for the link?

    Like /distro.ibiblio.org/fatdog/web/ gets tweaked into this: Fatdog64 Linux (which is the title of that page)

    So I guess my assumption is right, but how is it coded? Which programming language is used?

    Is there an easy way to do this with a bash script as well?

  2. #2
    Super Moderator Roxoff's Avatar
    Join Date
    Aug 2005
    Location
    Nottingham, England
    Posts
    3,942
    What happens is that when no title is supplied, the URL page is fetched and the html title of the page is used. The forum software actually substitutes the title into the post using the correct BBCode markers.

    Here are some examples using distrowatch, which has always presented a long title in these scenarios:

    When I provide no tags or title: DistroWatch.com: Put the fun back into computing. Use Linux, BSD.
    When I provide a BBCode tag, but nothing else: [url=http://www.distrowatch.com], and this is clearly broken
    And when I do this with my own tags/title: Distrowatch

    This functionality is provided by the vBulletin software that runs under the hood here at LinuxForums. There are some special modifications here, but I don't think link processing in posts is changed.

    The site is implemented in PHP, which can hide some of the complexity of fetching information from other websites.
    Linux user #126863 - see http://linuxcounter.net/

  3. #3
    Just Joined! Rava's Avatar
    Join Date
    Jul 2007
    Location
    hacking 127.0.0.1
    Posts
    50
    Quote Originally Posted by Roxoff View Post
    This functionality is provided by the vBulletin software that runs under the hood here at LinuxForums. There are some special modifications here, but I don't think link processing in posts is changed.

    The site is implemented in PHP, which can hide some of the complexity of fetching information from other websites.
    Thanks for the heads up. Still, I think it should be possible to copy that behaviour so that a local bash script can do the same, yes?

    Saving an index.html (or whatever page it might be) and searching for the <title> in that HTML file should not be that hard to do with bash...
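    [Editor's note: a minimal sketch of that idea, run against a saved file rather than the network. The file name page.html and the sample content are placeholders, and the sketch assumes the <title>...</title> pair sits on a single line, which is common but not guaranteed by HTML.]

```shell
#!/usr/bin/env bash
# Sketch: pull the <title> out of a saved HTML file with grep and sed.
# page.html is a stand-in for whatever page was saved.
printf '<html><head><title>Fatdog64 Linux</title></head></html>\n' > page.html

# grep -o prints only the matching part; sed then strips the tags.
title=$(grep -o -m 1 '<title>[^<]*</title>' page.html | sed -e 's/<[^>]*>//g')
echo "$title"
```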

  5. #4
    Super Moderator Roxoff's Avatar
    Join Date
    Aug 2005
    Location
    Nottingham, England
    Posts
    3,942
    Doing it in Bash? I've never even considered doing html manipulation in that environment before. PHP runs on the web server, so it's already in an html environment when it runs.

  6. #5
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,312
    Hi.

    If I understand your question, here is an example of extracting a title from a web page using standard, commonly available commands:
    Code:
    #!/usr/bin/env bash
    
    # @(#) s1	Demonstrate extraction of title from a URL.
    
    # Utility functions: print-as-echo, print-line-with-visual-space, debug.
    # export PATH="/usr/local/bin:/usr/bin:/bin"
    LC_ALL=C ; LANG=C ; export LC_ALL LANG
    pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
    pl() { pe;pe "-----" ;pe "$*"; }
    db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
    db() { : ; }
    C=$HOME/bin/context && [ -f $C ] && $C wget grep
    
    URL=${1-"http://distro.ibiblio.org/fatdog/web/"}
    
    pl " URL being considered:"
    pe "$URL"
    
    pl " Results:"
    wget -q -O - "$URL" |
    grep -m 1 "title"
    
    exit 0
    producing:
    Code:
    $ ./s1
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution        : Debian 5.0.8 (lenny, workstation) 
    bash GNU bash 3.2.39
    wget GNU Wget 1.11.4
    grep GNU grep 2.5.3
    
    -----
     URL being considered:
    http://distro.ibiblio.org/fatdog/web/
    
    -----
     Results:
        <title>Fatdog64 Linux</title>
    The further manipulation of the title can be done with various tools.
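    [Editor's note: for example, the tags and leading whitespace around the grep result above could be stripped with sed; a sketch, again assuming the tags sit on one line:]

```shell
#!/usr/bin/env bash
# Strip the tags from a line shaped like the grep result above.
line='    <title>Fatdog64 Linux</title>'

# Delete everything up to and including <title>, then </title> and after.
title=$(printf '%s\n' "$line" | sed -e 's/.*<title>//' -e 's/<\/title>.*//')
echo "$title"
```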

    See man pages for details ... cheers, drl
    Welcome - get the most out of the forum by reading forum basics and guidelines: click here.
    90% of questions can be answered by using man pages, Quick Search, Advanced Search, Google search, Wikipedia.
    We look forward to helping you with the challenge of the other 10%.
    ( Mn, 2.6.n, AMD-64 3000+, ASUS A8V Deluxe, 1 GB, SATA + IDE, Matrox G400 AGP )

  7. #6
    Just Joined! Rava's Avatar
    Join Date
    Jul 2007
    Location
    hacking 127.0.0.1
    Posts
    50
    Thanks drl, that was very helpful.

    But for some reason it cannot read a title from a Wikipedia article:
    Code:
    $ extract-title-from-html-page https://en.wikipedia.org/wiki/Nuremberg
    
    -----
     URL being considered:
    https://en.wikipedia.org/wiki/Linux
    
    -----
     Results:
    Of course the page has a title:
    HTML Code:
    <title>Linux - Wikipedia, the free encyclopedia</title>
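    [Editor's note: one way to narrow such a failure down (an assumption, not a diagnosis) is to check whether wget itself fails on the https URL, e.g. for lack of SSL support or CA certificates on a minimal system, before suspecting grep. fetch_title here is a hypothetical helper, not part of drl's script:]

```shell
#!/usr/bin/env bash
# Diagnostic sketch: separate a wget failure from a grep miss.
fetch_title() {
    url=$1
    # -T 10 -t 1: time out after 10s, try once, so failures surface quickly.
    if wget -q -T 10 -t 1 -O page.html "$url"; then
        grep -m 1 "title" page.html
    else
        # $? here is wget's exit status (4 = network, 5 = SSL, ...).
        echo "wget failed (exit status $?) for $url" >&2
        return 1
    fi
}
```

Called as, e.g., `fetch_title https://en.wikipedia.org/wiki/Nuremberg`, a nonzero return with the error line would point at the fetch, not the extraction.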

  8. #7
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,312
    Hi.

    It seemed to work for me:
    Code:
    ./s1 https://en.wikipedia.org/wiki/Nuremberg
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution        : Debian 5.0.8 (lenny, workstation) 
    bash GNU bash 3.2.39
    wget GNU Wget 1.11.4
    grep GNU grep 2.5.3
    
    -----
     URL being considered:
    https://en.wikipedia.org/wiki/Nuremberg
    
    -----
     Results:
    <title>Nuremberg - Wikipedia, the free encyclopedia</title>
    Comparing my system and versions as noted in the output to your system and versions, what is different?

    Best wishes ... cheers, drl

  9. #8
    Penguin of trust elija's Avatar
    Join Date
    Jul 2004
    Location
    Either at home or at work or down the pub
    Posts
    3,677
    Moved here as it seems more relevant
    "I used to be with it, then they changed what it was.
    Now what was it isn't it, and what is it is weird and scary to me.
    It'll happen to you too."

    Grandpa Simpson



    The Fifth Continent

  10. #9
    Just Joined! Rava's Avatar
    Join Date
    Jul 2007
    Location
    hacking 127.0.0.1
    Posts
    50
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
    Distribution : Debian 5.0.8 (lenny, workstation)
    bash GNU bash 3.2.39
    wget GNU Wget 1.11.4
    grep GNU grep 2.5.3
    I get none of that stuff when running your script. And I have no C compiler or any other linker/compiler stuff installed.

    Some info about my system:
    Code:
    # uname -a
    Linux porteus 3.13.6-porteus #1 SMP PREEMPT Fri Mar 7 06:53:36 Local time zone must be set--s x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux
    The system is / runs as Slackware 14.1:

    GNU bash, version 4.2.45(2)-release (x86_64-slackware-linux-gnu)

    GNU Wget 1.14 built on linux-gnu.

    grep (GNU grep) 2.14


    So, the only newer thing you have seems to be the grep...
    Last edited by Rava; 06-25-2014 at 04:36 PM.
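    [Editor's note: the version lines in drl's output come from his local helper ($HOME/bin/context is his own utility, not a standard tool), which is why they do not appear here. They can be printed directly; a sketch:]

```shell
#!/usr/bin/env bash
# Print the first version line of each tool directly, skipping any
# tool that is not installed rather than failing on it.
for tool in bash wget grep; do
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" --version | head -n 1
    fi
done
```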

  11. #10
    Linux Engineer drl's Avatar
    Join Date
    Apr 2006
    Location
    Saint Paul, MN, USA / CentOS, Debian, Slackware, {Free, Open, Net}BSD, Solaris
    Posts
    1,312
    Hi.

    Here it is on slackware:
    Code:
    ./s1 https://en.wikipedia.org/wiki/Nuremberg
    
    Environment: LC_ALL = C, LANG = C
    (Versions displayed with local utility "version")
    OS, ker|rel, machine: Linux, 3.2.29, x86_64
    Distribution        : Slackware 14.0 
    bash GNU bash 4.2.37
    wget GNU Wget 1.14 built on linux-gnu.
    grep (GNU grep) 2.14
    
    -----
     URL being considered:
    https://en.wikipedia.org/wiki/Nuremberg
    
    -----
     Results:
    <title>Nuremberg - Wikipedia, the free encyclopedia</title>
    The setup for the display of the context checks whether you have the appropriate tools. If they are not present, the code simply skips that section.

    Have you copied and pasted this code exactly?

    Best wishes ... cheers, drl
    Last edited by drl; 06-26-2014 at 11:13 AM. Reason: Correct small typo.

