109 Commits (e09d0abf5447cbaa1ea6f07c76585237cf3d30b4)

Author SHA1 Message Date
pictuga 437b0da8a9 Updated README to reflect 404 redirection support. 2013-04-19 11:30:34 +02:00
pictuga af8879049f Another huge commit.
Now uses OOP where it fits. Atom feeds are supported, but no real tests have been run. Unix globbing is now possible for URLs. Caching is done in a cleaner way. Feedburner links are also replaced. HTML is cleaned in a more efficient way. The code is now much cleaner, using lxml.objectify and a small wrapper to access Atom feeds as if they were RSS feeds (and it is much faster than feedparser). README has been updated.
2013-04-15 18:51:55 +02:00
pictuga ad25516e34 Speak about deleteTags in README. 2013-04-04 18:31:26 +02:00
pictuga 82084c2c75 Move to OOP.
This is a huge commit. The whole code is ported to Object-Oriented Programming. This makes the code cleaner, which became necessary to deal with all the different cases, for example with encoding detection. Encoding detection now works better, and uses 3 different methods. HTML pages with an XML declaration are now supported. Feed URLs with parameters (e.g. "index.php?option=par") are also supported. The cache is now smarter: it no longer grows indefinitely, because only in-use pages are kept in it. Caching is now mandatory. urllib (not urllib2) is no longer needed. Solved a possible crash with the log function (when passing a list of str with a non-Unicode encoding).
README is also updated.
2013-04-04 17:43:30 +02:00
pictuga f734fb2623 Added quick licence information. 2013-03-29 20:05:53 +01:00
pictuga 682ab253b0 Typo in README 2013-02-25 21:56:16 +01:00
pictuga 217ff0fd8f Use better markdown syntax for default xpath rule 2013-02-25 21:55:17 +01:00
pictuga 27b0fbaf01 Speak about default xpath in README 2013-02-25 21:54:04 +01:00
pictuga be17f0c78f Updated README to markdown 2013-02-25 21:49:38 +01:00