  1. #1
    Linux User
    Join Date
    May 2013
    Posts
    264

    Store the data in a MySQL DB instead of printing it


    I want to store the data instead of printing it out.
    For that purpose I have set up a MySQL database on openSUSE.
    Code:
    import urllib
    import urlparse
    import re
    
    url = "http://search.cpan.org/author/?W"
    html = urllib.urlopen(url).read()
    for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></
    a><br/><small>(.*?)</small>', html):
        alk = urlparse.urljoin(url, lk)
    
        data = { 'url':alk, 'name':name, 'cname':capname }
    
        phtml = urllib.urlopen(alk).read()
        memail = re.search('<a href="mailto:(.*?)">', phtml)
        if memail:
            data['email'] = memail.group(1)
    
        print data
    Akoya P 6512 15" OpenSuse 13.1: AMD Athlon X2 P320
    Samsunng q 210, 12,1" OpenSuse 13.1: Intel® Core™ 2 Duo Proz. P8400 2,26 GHz 1066 MHz FSB 3 MB

  2. #2
    Linux Enthusiast
    Join Date
    Jan 2005
    Location
    Saint Paul, MN
    Posts
    679
    Do you want to store the file being printed, a PDF of the file being printed, or the rasterized content that can only be printed on that printer or one 100% compatible with your current printer?

  3. #3
    Linux User
    Join Date
    May 2013
    Posts
    264
    Hello, good day,

    many thanks for the quick reply.

    No, I do not want to print anything. I want to store the results in a MySQL database INSTEAD of printing them to the screen.

    Note: MySQL is already installed on the openSUSE 13.1 machine.

    What I need now is to connect the output of the parser to the database.

  5. #4
    Linux User
    Join Date
    May 2013
    Posts
    264
    When I run this code I get an error:

    Code:
    import urllib
    import urlparse
    import re
    
    url = "http://search.cpan.org/author/?W"
    html = urllib.urlopen(url).read()
    for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></
    a><br/><small>(.*?)</small>', html):
        alk = urlparse.urljoin(url, lk)
    
        data = { 'url':alk, 'name':name, 'cname':capname }
    
        phtml = urllib.urlopen(alk).read()
        memail = re.search('<a href="mailto:(.*?)">', phtml)
        if memail:
            data['email'] = memail.group(1)
    
        print data
    Here is the output:

    Code:
    martin@linux-c5sz:~/perl> python cpan1.py
      File "cpan1.py", line 7
        for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></
                                                                             ^
    SyntaxError: EOL while scanning string literal
    martin@linux-c5sz:~/perl>
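    The SyntaxError comes from the line break inside the quoted pattern: a single-quoted Python string literal has to end on the line where it starts. One way to keep a long pattern split over several lines is to wrap it in parentheses and let Python join the adjacent string literals. This is only a sketch: the name AUTHOR_PATTERN and the sample HTML fragment are made up for illustration and are not real CPAN output.

```python
import re

# Adjacent string literals inside parentheses are joined by the parser,
# so the pattern can span several source lines without a line break
# ending up inside any single literal.
AUTHOR_PATTERN = re.compile(
    '<a href="(/~.*?/)"><b>(.*?)</b></a>'
    '<br/><small>(.*?)</small>'
)

# Made-up fragment in the shape the pattern expects (illustration only).
sample_html = (
    '<a href="/~wac/"><b>WAC</b></a><br/><small>Wang Aocheng</small>'
)

matches = AUTHOR_PATTERN.findall(sample_html)
print(matches)
```

    Each match is a tuple of (link, capitalized id, name), which is what the for loop in the script unpacks.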

  6. #5
    Linux Enthusiast
    Join Date
    Jan 2005
    Location
    Saint Paul, MN
    Posts
    679
    Easiest would be to write a "fake printer" that takes the file from the print spooler and saves it into the DB.

  7. #6
    Linux Guru Lakshmipathi's Avatar
    Join Date
    Sep 2006
    Location
    3rd rock from sun - Often seen near moon
    Posts
    1,769
    You just need to put the regular expression string on a single line:

    Code:
    import urllib
    import urlparse
    import re
    
    url = "http://search.cpan.org/author/?W"
    html = urllib.urlopen(url).read()
    for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
        alk = urlparse.urljoin(url, lk)
    
        data = { 'url':alk, 'name':name, 'cname':capname }
    
        phtml = urllib.urlopen(alk).read()
        memail = re.search('<a href="mailto:(.*?)">', phtml)
        if memail:
            data['email'] = memail.group(1)
    
        print data
    This is what I got before stopping it via Ctrl+C
    Code:
    $ python printer.py 
    {'url': 'http://search.cpan.org/~wac/', 'cname': 'WAC', 'name': 'Wang Aocheng', 'email': 'wangaocheng%40hotmail.com'}
    {'url': 'http://search.cpan.org/~wade/', 'cname': 'WADE', 'name': 'James Wade', 'email': 'CENSORED'}
    ^C
    You are getting a Python dictionary as output. You just need to insert it into the MySQL database.
    See the Stack Overflow question "How do I connect to a MySQL Database in Python?".
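    To turn such a dictionary into a table row, the usual DB-API approach is a parameterized INSERT, so that a name containing a quote cannot break the SQL. The following is a minimal sketch, assuming a table called `authors` whose columns match the dictionary keys; both the table name and the helper `build_insert` are hypothetical and not from this thread. It also decodes the `%40` in the scraped address, which is a URL-encoded `@`.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2, as used in the thread


def build_insert(table, data):
    """Return (sql, params) for a parameterized INSERT.

    Uses the %s placeholder style that MySQLdb expects. The table and
    column names are interpolated directly, so they must come from
    trusted code, never from scraped input.
    """
    columns = sorted(data)  # fixed order keeps columns and values aligned
    sql = "INSERT INTO %s (%s) VALUES (%s)" % (
        table,
        ", ".join(columns),
        ", ".join(["%s"] * len(columns)),
    )
    params = tuple(data[col] for col in columns)
    return sql, params


# Dictionary as printed by the script above.
data = {
    'url': 'http://search.cpan.org/~wac/',
    'cname': 'WAC',
    'name': 'Wang Aocheng',
    'email': 'wangaocheng%40hotmail.com',
}
data['email'] = unquote(data['email'])  # decode the URL-encoded '@'

sql, params = build_insert('authors', data)
print(sql)
print(params)

# With an open MySQLdb connection `db` and cursor `cur` this would be:
#     cur.execute(sql, params)
#     db.commit()
```

    Authors without an e-mail address produce a dictionary without the 'email' key, so the (assumed) email column would have to allow NULL.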
    First they ignore you,Then they laugh at you,Then they fight with you,Then you win. - M.K.Gandhi
    -----
    FOSS India Award winning ext3fs Undelete tool www.giis.co.in. Online Linux Terminal http://www.webminal.org

  8. #7
    Linux User
    Join Date
    May 2013
    Posts
    264
    Hi, thank you.
    I will check the Stack Overflow example.

    Many thanks again.

    Update:

    The Stack Overflow answer is a great manual. I will do as advised; it is quoted below.


    Connecting to MYSQL with Python in 3 steps
    ----------------------------------------

    1 - Setting

    You must install a MySQL driver before doing anything. Unlike PHP, Python ships with only the SQLite driver by default. The most widely used package for MySQL is [MySQLdb][1], but it is hard to install with easy_install.

    For Windows users, there is an [exe of MySQLdb][2].

    For Linux, it is an ordinary distribution package (python-mysqldb).

    For Mac, you can [install MySQLdb using MacPorts][3].

    2 - Usage

    After installing, reboot. This is not mandatory, but it will save me from answering 3 or 4 other questions in this post if something goes wrong. So please reboot.


    Then it is just like using any other package:

    Code:
        #!/usr/bin/python
        import MySQLdb

        db = MySQLdb.connect(host="localhost",    # your host, usually localhost
                             user="john",         # your username
                             passwd="megajonhy",  # your password
                             db="jonhydb")        # name of the database

        # you must create a Cursor object. It will let
        # you execute all the queries you need
        cur = db.cursor()

        # Use all the SQL you like
        cur.execute("SELECT * FROM YOUR_TABLE_NAME")

        # print the first cell of every row
        for row in cur.fetchall():
            print row[0]
    Of course, there are thousands of possibilities and options; this is a very basic example. You will have to [look at the documentation][4].

    3 - More advanced usage

    Once you know how it works, you may want to use an [ORM][5] to avoid writing SQL manually and to manipulate your tables as if they were Python objects. The most famous ORM in the Python community is [SQLAlchemy][6]. I strongly advise you to use it: your life is going to be much easier. I recently discovered another jewel in the Python world: [peewee][7]. It's a very light ORM, really easy and fast to set up and use. It makes my day for small projects or standalone apps, where using big tools like SQLAlchemy or Django is overkill:


    Code:
        import peewee
        from peewee import *

        db = MySQLDatabase('jonhydb', user='john', passwd='megajonhy')

        class Book(peewee.Model):
            author = peewee.CharField()
            title = peewee.TextField()

            class Meta:
                database = db

        Book.create_table()
        book = Book(author="me", title='Peewee is cool')
        book.save()
        for book in Book.filter(author="me"):
            print book.title
    
        Peewee is cool

    This example works out of the box. Nothing other than having peewee (`pip install peewee` )
    is required. No complicated setup. It's really cool.




    [1]: http://pypi.python.org/pypi/MySQL-python/
    [2]: http://sourceforge.net/project/showf...group_id=22307
    [3]: http://stackoverflow.com/questions/1...c-os-x#1448476
    [4]: http://www.mikusa.com/python-mysql-docs/
    [5]: https://en.wikipedia.org/wiki/Object-Relational_Mapping
    [6]: http://www.sqlalchemy.org/
    [7]: http://peewee.readthedocs.org/en/latest/index.html

    End of quote.
    I will dig into the documentation and do as they advise.

    Many thanks again.
    Last edited by sayhello; 07-02-2014 at 09:24 PM.

  9. #8
    Linux User
    Join Date
    May 2013
    Posts
    264
    Hello, after installing MySQLdb (successfully, I hope) I tried the following, which fails:


    Code:
     
    import urllib
    import urlparse
    import re
    import MySQLdb
    
    
    db = MySQLdb.connect(host="localhost", # your host, usually localhost
                         user="root", # your username
                          passwd="my_passwd", # your password
                          db="cpan") # name of the data base
    
    # you must create a Cursor object. It will let
    #  you execute all the queries you need
    cur = db.cursor() 
    
    
    url = "http://search.cpan.org/author/?W"
    html = urllib.urlopen(url).read()
    for lk, capname, name in re.findall('<a href="(/~.*?/)"><b>(.*?)</b></a><br/><small>(.*?)</small>', html):
        alk = urlparse.urljoin(url, lk)
    
        data = { 'url':alk, 'name':name, 'cname':capname }
    
        phtml = urllib.urlopen(alk).read()
        memail = re.search('<a href="mailto:(.*?)">', phtml)
        if memail:
            data['email'] = memail.group(1)
    
    
    # Use all the SQL you like
    cur.execute("SELECT * FROM YOUR_TABLE_NAME")
    
    # print all the first cell of all the rows
    for row in cur.fetchall() :
        print row[0]
    This is the output:


    Code:
        
        
    martin@linux-70ce:~/perl> python cpan2.py
    Traceback (most recent call last):
      File "cpan2.py", line 13, in <module>
        db="jonhydb") # name of the data base
      File "/usr/lib/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
        return Connection(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/MySQLdb/connections.py", line 187, in __init__
        super(Connection, self).__init__(*args, **kwargs2)
    _mysql_exceptions.OperationalError: (1045, "Access denied for user 'john'@'localhost' (using password: YES)")
    martin@linux-70ce:~/perl> python cpan2.py
    Traceback (most recent call last):
      File "cpan2.py", line 13, in <module>
        db="cpan") # name of the data base
      File "/usr/lib/python2.7/site-packages/MySQLdb/__init__.py", line 81, in Connect
        return Connection(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/MySQLdb/connections.py", line 187, in __init__
        super(Connection, self).__init__(*args, **kwargs2)
    _mysql_exceptions.OperationalError: (1049, "Unknown database 'cpan'")
    martin@linux-70ce:~/perl>
    I wonder: who is john?
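    The two tracebacks report different problems. The first run still used the example credentials (`john`, `jonhydb`) from the quoted answer, which is where "john" comes from. The second run connected as `root`, but the database `cpan` did not exist yet. A small sketch of how the numeric code in the exception can be turned into a hint: the helper name `explain_mysql_error` and the hint texts are illustrative, while 1045 and 1049 are the codes shown in the tracebacks.

```python
# Hints for the two MySQL error codes that appear in the tracebacks.
HINTS = {
    1045: "Access denied: check the user name and password passed to connect().",
    1049: "Unknown database: create it first, e.g. CREATE DATABASE cpan;",
}


def explain_mysql_error(exc):
    """Return a readable hint for an exception raised by connect().

    MySQLdb puts (error_code, message) into exc.args, as the tracebacks
    show; anything unrecognized is reported with its raw arguments.
    """
    code = exc.args[0] if exc.args else None
    return HINTS.get(code, "Unhandled error: %r" % (exc.args,))


# Stand-in exception so the sketch runs without a MySQL server.
fake = Exception(1049, "Unknown database 'cpan'")
hint = explain_mysql_error(fake)
print(hint)

# In the real script this would wrap the connect call:
#     try:
#         db = MySQLdb.connect(host="localhost", user="root",
#                              passwd="my_passwd", db="cpan")
#     except MySQLdb.OperationalError as exc:
#         print(explain_mysql_error(exc))
```

    Note also that the script as posted only runs a SELECT after the loop, so even with a working connection nothing from `data` would be written to a table yet.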

  10. #9
    Linux Guru Lakshmipathi's Avatar
    Join Date
    Sep 2006
    Location
    3rd rock from sun - Often seen near moon
    Posts
    1,769
    I would recommend debugging this in the Python interpreter.

    Just type "python" to get to the Python prompt.

    Now import the modules one by one:

    import urllib
    import urlparse
    import re
    import MySQLdb


    and run the command

    db = MySQLdb.connect(host="localhost",    # your host, usually localhost
                         user="root",         # your username
                         passwd="my_passwd",  # your password
                         db="cpan")           # name of the database


    If you get any error, fix that first before proceeding.

  11. #10
    Linux Engineer docbop's Avatar
    Join Date
    Nov 2009
    Location
    Woodshed, CA
    Posts
    949
    You learn to program by writing small pieces of code and making sure they work, not by grabbing someone else's code. You're not ready to debug other people's code.

    So write a small piece of code that loads the MySQL module and opens the database, and get it to work.
    Next, expand that: open a database, send a SQL SELECT to a table, and get it to work.
    Now keep expanding and insert a new row into the table. Open the mysql command line and check the table for the update.
    Keep going: now write code to select all the rows.
    Keep going with more queries, and so on.

    You get the idea: learn by writing and debugging little pieces of code, and make sure each little step works before moving on, so that if something is wrong you know it is in what you just added. Doing just the little bit I described would probably take you a few hours, but you'd be further along than you are after weeks of trying to get other people's programs to work. You have to build a foundation first.
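    The steps above can be sketched as one short script. To keep it runnable without a server it uses Python's built-in `sqlite3` module as a stand-in for MySQL. With MySQLdb the connect call takes host, user, passwd and db, and the placeholder is `%s` instead of `?`. The table name `authors` and its columns are assumptions, not something defined in this thread.

```python
import sqlite3

# Step 1: open the database (in-memory SQLite stands in for MySQL here).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Step 2: create a table and check that a SELECT works on it.
cur.execute("CREATE TABLE authors (cname TEXT, name TEXT, url TEXT)")
cur.execute("SELECT COUNT(*) FROM authors")
count_before = cur.fetchone()[0]
print(count_before)

# Step 3: insert one row with a parameterized query, then commit.
cur.execute(
    "INSERT INTO authors (cname, name, url) VALUES (?, ?, ?)",
    ("WAC", "Wang Aocheng", "http://search.cpan.org/~wac/"),
)
conn.commit()

# Step 4: select all rows back to confirm the insert arrived.
cur.execute("SELECT cname, name, url FROM authors")
rows = cur.fetchall()
for row in rows:
    print(row)

conn.close()
```

    Running each step on its own first, and only then adding the next one, makes it clear which line is responsible when something fails.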
    A lion does not lose sleep, over the opinion of sheep.
