578 Commits (master)

Author SHA1 Message Date
pictuga 438c32a312 Remove sqlite & mysql cache backends
continuous-integration/drone/push Build is failing Details
2 months ago
pictuga e1ed33f320 crawler: improve html iter code
continuous-integration/drone/push Build is passing Details
12 months ago
pictuga b65272daab crawler: accept more meta redirects
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 4d64afe9cb crawler: fix regression from d6b90448f3
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga 32645548c2 pytest: first batch with test_feeds
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga d6b90448f3 crawler: improve handling of non-ascii urls
1 year ago
pictuga da81edc651 log to stderr
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga 4f2895f931 cli: update `--help`
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga b2b04691d6 Ability to pass custom data_files location
1 year ago
pictuga bfaf7b0fac feeds: clean up default `item_link`
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga 32d9bc9d9d feeds: proceed with conversion when rules do not match
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga b138f11771 util: support more `data_files` location
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga a01258700d More ordering options
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 4d6d3c9239 wsgi: limit supported mimetypes & return actual mimetype
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga e81f6b173f readabilite: remove code duplicate
1 year ago
pictuga fe5dbf1ce0 wsgi: reuse mimetype table from crawler
1 year ago
pictuga d05706e056 crawler: fix typo
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga e88a823ada feeds: better handle rulesets without a 'mode' specified
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga 750850c162 crawler: avoid too many .append()
1 year ago
pictuga c8669002e4 feeds: exotic xpath in html as well
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga c524e54d2d feeds: support some exotic xpath rules returning a single string
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga fb643f5ef1 readabilite: remove unneeded reference to `features` (overridden by `builder`)
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga dbdca910d8 readabilite: fix new parser code & drop PIs
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 9eb19fac04 readabilite: use custom html parser within bs4's lxml parser
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga d424e394d1 readabilite: use lxml bs4 parser for speed
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 3f92787b38 readabilite: limit html comments related issues
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga afc31eb6e9 readabilite: avoid double parsing of html
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 87d2fe772d wsgi: fix py2 compatibility
1 year ago
pictuga 917aa0fbc5 crawler: do not re-save cached response
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga d17b9a2f27 Fix typo in DISKCACHE_DIR var name
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 368e4683d6 util: clean paths code
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 7cdcbd23e1 wsgi: fix another typo
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 25f283da1f wsgi: fix bug following the removal of the loop
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 727d14e539 wsgi: use data_files helper
continuous-integration/drone/push Build was killed Details
1 year ago
pictuga 3392ae3973 util: try one more path for data_files
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 51f1d330a4 Fn to access data_files & pkg files
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga eb47aac6f1 morss: respect timeout settings in all cases
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga eca546b890 Change HTTP error code to 404
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga d8cc07223e readabilite: fix bug when nothing above threshold
continuous-integration/drone/push Build is failing Details
1 year ago
pictuga 765e0ba728 Pass py error msg in http headers
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 6ec3fb47d1 readabilite: .strip() first to save time
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 1083f3ffbc crawler: make sure to use HTTPMessage
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga 7eeb1d696c crawler: clean up code
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga e42df98f83 crawler: fix regression brought with 44a6b2591
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga cb21871c35 crawler: clean up caching code
continuous-integration/drone/push Build is passing Details
1 year ago
pictuga c71cf5d5ce caching: fix diskcache implementation
1 year ago
pictuga 44a6b2591d crawler: cleaner http header object import
1 year ago
pictuga a890536601 morss: comment code a bit
1 year ago
pictuga 8de309f2d4 caching: add diskcache backend
1 year ago
pictuga cbf7b3f77b caching: simplify sqlite code
1 year ago