910 Commits (577e5b4afedc01e8c692743c8b3d55ef7c072d7e)
 

Author SHA1 Message Date
pictuga e76ab2b631 Update gunicorn instructions 2020-08-23 18:59:02 +02:00
pictuga aa9143302b Remove now-unused isInt code 2020-08-23 18:51:09 +02:00
pictuga 0d62a7625b Define http port via env vars as well 2020-08-23 18:50:18 +02:00
pictuga bd0efb1529 crawler: missing os import 2020-08-23 18:45:44 +02:00
pictuga 47a17614ef Rename morss/cgi.py into morss/wsgi.py
To avoid name collision with the built-in cgi lib
2020-08-23 18:44:49 +02:00
pictuga 4dfebe78f7 Pick caching backend via env vars 2020-08-23 18:43:18 +02:00
pictuga dcd3e4a675 cgi.py: add missing imports 2020-08-23 18:31:05 +02:00
pictuga e968b2ea7f Remove leftover :debug code 2020-08-23 16:59:34 +02:00
pictuga 0ac590c798 Set MAX_/LIM_* settings via env var 2020-08-23 16:09:58 +02:00
pictuga fa1b5aef09 Instructions for DEBUG= use 2020-08-23 15:31:11 +02:00
pictuga 7f6309f618 README: :silent was explained twice 2020-08-23 14:34:04 +02:00
pictuga f65fb45030 :debug completely deprecated in favour of DEBUG= 2020-08-23 14:33:32 +02:00
pictuga 6dd40e5cc4 cli.py: fix Options code 2020-08-23 14:25:09 +02:00
pictuga 0acfce5a22 cli.py: remove log 2020-08-23 14:24:57 +02:00
pictuga 97ccc15db0 cgi.py: rename parseOptions to parse_options 2020-08-23 14:24:23 +02:00
pictuga 7a560181f7 Use env var for DEBUG 2020-08-23 14:23:45 +02:00
pictuga baccd3b22b Move parseOptions to cgi.py
As it is no longer used in cli.py
2020-08-22 00:37:34 +02:00
pictuga f79938ab11 Add :silent to readme & argparse 2020-08-22 00:02:08 +02:00
pictuga 5b8bd47829 cli.py: remove draft code 2020-08-21 23:59:12 +02:00
pictuga b5b355aa6e readabilite: increase penalty for high link density 2020-08-21 23:55:04 +02:00
pictuga 94097f481a sheet.xsl: better handle some corner cases 2020-08-21 23:54:35 +02:00
pictuga 8161baa7ae sheet.xsl: improve css 2020-08-21 23:54:12 +02:00
pictuga bd182bcb85 Move cli code to argParse
Related code changes (incl. :format=xyz)
2020-08-21 23:52:56 +02:00
pictuga c7c2c5d749 Removed unused filterOptions code 2020-08-21 23:23:33 +02:00
pictuga c6b52e625f split morss.py into __main__/cgi/cli.py
Should hopefully allow cleaner code in the future
2020-08-21 22:17:55 +02:00
pictuga c6d3a0eb53 readabilite: clean up code 2020-07-15 00:49:34 +02:00
pictuga c628ee802c README: add docker-compose instructions 2020-07-13 20:50:39 +02:00
pictuga 6021b912ff morss: fix item removal
Usual issue when editing a list while looping over it
2020-07-06 19:25:48 +02:00
pictuga f18a128ee6 Change :first for :newest
i.e. toggle default for the more-obvious option
2020-07-06 19:25:17 +02:00
pictuga 64af86c11e crawler: catch html parsing errors 2020-07-06 12:25:38 +02:00
pictuga 15951d228c Add :first to NOT sort items by date 2020-07-06 11:39:08 +02:00
pictuga c1b1f5f58a morss: restrict iframe use from :get to avoid abuse 2020-06-09 12:33:37 +02:00
pictuga 985185f47f morss: more flexible feed creator auto-detection 2020-06-08 13:03:24 +02:00
pictuga 3190d1ec5a feeds: remove useless if(len) before loop 2020-06-02 13:57:45 +02:00
pictuga 9815794a97 sheet.xsl: make text more self explanatory 2020-05-27 21:42:00 +02:00
pictuga 758b6861b9 sheet.xsl: fix text alignment 2020-05-27 21:36:11 +02:00
pictuga ce4cf01aa6 crawler: clean up encoding detection code 2020-05-27 21:35:24 +02:00
pictuga dcfdb75a15 crawler: fix chinese encoding support 2020-05-27 21:34:43 +02:00
pictuga 4ccc0dafcd Basic help for sub-lib interactive use 2020-05-26 19:34:20 +02:00
pictuga 2fe3e0b8ee feeds: clean up other stylesheets before putting ours 2020-05-26 19:26:36 +02:00
pictuga ad3ba9de1a sheet.xsl: add <select/> to use :firstlink 2020-05-13 12:33:12 +02:00
pictuga 68c46a1823 morss: remove deprecated twitter/fb link handling 2020-05-13 12:31:09 +02:00
pictuga 91be2d229e morss: ability to use first link from desc instead of default link 2020-05-13 12:29:53 +02:00
pictuga 038f267ea2 Rename :theforce into :force 2020-05-13 11:49:15 +02:00
pictuga 22005065e8 Use etree.tostring 'method' arg
Gives appropriately formatted html code.
Some pages might otherwise be rendered as blank.
2020-05-13 11:44:34 +02:00
pictuga 7d0d416610 morss: cache articles for 24hrs
Also make it possible to refetch articles, regardless of cache
2020-05-12 21:10:31 +02:00
pictuga 5dac4c69a1 crawler: more code comments 2020-05-12 20:44:25 +02:00
pictuga 36e2a1c3fd crawler: increase size limit from 100KiB to 500
I'm looking at you, worldbankgroup.csod.com/ats/careersite/search.aspx
2020-05-12 19:34:16 +02:00
pictuga 83dd2925d3 readabilite: better parsing
Keeping blank_text keeps the tree more as-is, making the final output closer to expectations
2020-05-12 14:15:53 +02:00
pictuga e09d0abf54 morss: remove deprecated piece of code 2020-05-07 16:05:30 +02:00