pictuga
|
b138f11771
|
util: support more `data_files` location
continuous-integration/drone/push Build is passing
|
2022-01-23 12:40:18 +01:00 |
pictuga
|
a01258700d
|
More ordering options
continuous-integration/drone/push Build was killed
|
2022-01-23 12:27:07 +01:00 |
pictuga
|
4d6d3c9239
|
wsgi: limit supported mimetypes & return actual mimetype
continuous-integration/drone/push Build is passing
|
2022-01-23 11:44:07 +01:00 |
pictuga
|
e81f6b173f
|
readabilite: remove code duplicate
|
2022-01-23 11:41:32 +01:00 |
pictuga
|
fe5dbf1ce0
|
wsgi: reuse mimetype table from crawler
|
2022-01-22 13:22:39 +01:00 |
pictuga
|
d05706e056
|
crawler: fix typo
continuous-integration/drone/push Build was killed
|
2022-01-19 13:41:12 +01:00 |
pictuga
|
e88a823ada
|
feeds: better handle rulesets without a 'mode' specified
continuous-integration/drone/push Build is failing
|
2022-01-19 13:08:33 +01:00 |
pictuga
|
750850c162
|
crawler: avoid too many .append()
|
2022-01-19 13:04:33 +01:00 |
pictuga
|
c8669002e4
|
feeds: exotic xpath in html as well
continuous-integration/drone/push Build is passing
|
2022-01-17 14:22:48 +00:00 |
pictuga
|
c524e54d2d
|
feeds: support some exotic xpath rules returning a single string
continuous-integration/drone/push Build is passing
|
2022-01-17 13:59:58 +00:00 |
pictuga
|
fb643f5ef1
|
readabilite: remove unneeded reference to `features` (overridden by `builder`)
continuous-integration/drone/push Build is passing
|
2022-01-03 18:01:12 +00:00 |
pictuga
|
dbdca910d8
|
readabilite: fix new parser code & drop PIs
continuous-integration/drone/push Build was killed
|
2022-01-03 17:51:49 +00:00 |
pictuga
|
9eb19fac04
|
readabilite: use custom html parser within bs4's lxml parser
continuous-integration/drone/push Build is passing
Solves the following obscure error:
ValueError: Invalid PI name 'b'xml''
|
2022-01-03 16:26:17 +00:00 |
pictuga
|
d424e394d1
|
readabilite: use lxml bs4 parser for speed
continuous-integration/drone/push Build is passing
|
2022-01-01 14:52:48 +01:00 |
pictuga
|
3f92787b38
|
readabilite: limit html comments related issues
continuous-integration/drone/push Build is passing
|
2022-01-01 13:58:42 +01:00 |
pictuga
|
afc31eb6e9
|
readabilite: avoid double parsing of html
continuous-integration/drone/push Build is passing
|
2022-01-01 12:51:30 +01:00 |
pictuga
|
87d2fe772d
|
wsgi: fix py2 compatibility
|
2022-01-01 12:35:41 +01:00 |
pictuga
|
917aa0fbc5
|
crawler: do not re-save cached response
continuous-integration/drone/push Build is passing
Otherwise cache never gets invalidated!
|
2021-12-31 19:28:11 +01:00 |
pictuga
|
d17b9a2f27
|
Fix typo in DISKCACHE_DIR var name
continuous-integration/drone/push Build was killed
|
2021-12-23 12:02:24 +01:00 |
pictuga
|
368e4683d6
|
util: clean paths code
continuous-integration/drone/push Build was killed
|
2021-12-16 08:53:18 +00:00 |
pictuga
|
7cdcbd23e1
|
wsgi: fix another typo
continuous-integration/drone/push Build is passing
|
2021-12-14 12:06:08 +00:00 |
pictuga
|
25f283da1f
|
wsgi: fix bug following the removal of the loop
continuous-integration/drone/push Build was killed
|
2021-12-14 11:56:55 +00:00 |
pictuga
|
727d14e539
|
wsgi: use data_files helper
continuous-integration/drone/push Build was killed
|
2021-12-14 11:47:10 +00:00 |
pictuga
|
3392ae3973
|
util: try one more path for data_files
continuous-integration/drone/push Build is passing
|
2021-12-14 11:10:26 +00:00 |
pictuga
|
51f1d330a4
|
Fn to access data_files & pkg files
continuous-integration/drone Build is running
continuous-integration/drone/push Build is passing
|
2021-12-05 12:09:01 +01:00 |
pictuga
|
eb47aac6f1
|
morss: respect timeout settings in all cases
continuous-integration/drone/push Build is failing
Special treatment of feed fetch not justified and not documented
|
2021-11-25 22:13:38 +01:00 |
pictuga
|
eca546b890
|
Change HTTP error code to 404
continuous-integration/drone/push Build is failing
To tell them apart from 'true' 500 errors
|
2021-11-25 21:34:46 +01:00 |
pictuga
|
d8cc07223e
|
readabilite: fix bug when nothing above threshold
continuous-integration/drone/push Build is failing
|
2021-11-23 20:53:00 +01:00 |
pictuga
|
765e0ba728
|
Pass py error msg in http headers
continuous-integration/drone/push Build is passing
|
2021-11-22 23:22:13 +01:00 |
pictuga
|
6ec3fb47d1
|
readabilite: .strip() first to save time
continuous-integration/drone/push Build is passing
|
2021-11-15 21:54:07 +01:00 |
pictuga
|
1083f3ffbc
|
crawler: make sure to use HTTPMessage
continuous-integration/drone/push Build is passing
|
2021-11-11 10:21:48 +01:00 |
pictuga
|
7eeb1d696c
|
crawler: clean up code
continuous-integration/drone/push Build is passing
|
2021-11-10 23:25:03 +01:00 |
pictuga
|
e42df98f83
|
crawler: fix regression brought with 44a6b2591
continuous-integration/drone/push Build is passing
|
2021-11-10 23:08:31 +01:00 |
pictuga
|
cb21871c35
|
crawler: clean up caching code
continuous-integration/drone/push Build is passing
|
2021-11-08 22:02:23 +01:00 |
pictuga
|
c71cf5d5ce
|
caching: fix diskcache implementation
|
2021-11-08 21:57:43 +01:00 |
pictuga
|
44a6b2591d
|
crawler: cleaner http header object import
|
2021-11-07 19:44:36 +01:00 |
pictuga
|
a890536601
|
morss: comment code a bit
|
2021-11-07 18:26:07 +01:00 |
pictuga
|
8de309f2d4
|
caching: add diskcache backend
|
2021-11-07 18:15:20 +01:00 |
pictuga
|
cbf7b3f77b
|
caching: simplify sqlite code
|
2021-11-07 18:14:18 +01:00 |
pictuga
|
d023ec8d73
|
Change default port to 8000
|
2021-10-19 22:19:59 +02:00 |
pictuga
|
5473b77416
|
Post-clean up isort
continuous-integration/drone/push Build is passing
|
2021-09-21 08:11:04 +02:00 |
pictuga
|
0365232a73
|
readabilite: custom xpath for article detection
continuous-integration/drone/push Build is failing
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
a523518ae8
|
cache: avoid name collision
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
52c48b899f
|
readability: better var names
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
9649cabb1b
|
morss: do not crash on empty pages
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
10535a17c5
|
cache: fix isort
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
7d86972e58
|
Add Redis cache backend
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
5da7121a77
|
Fix Options class behaviour
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
bb82902ad1
|
Move cache code to its own file
|
2021-09-21 08:04:45 +02:00 |
pictuga
|
04afa28fe7
|
crawler: cache pickle'd array
|
2021-09-21 08:04:45 +02:00 |