Commit Graph

211 Commits (985185f47f9cf14ac4d1b1982f6cbe2de3810c2e)

Author SHA1 Message Date
pictuga 68d920d4b5 morss: make FeedFormat more flexible with encoding 2020-03-21 23:26:35 +01:00
pictuga 758ff404a8 morss: fix cgi_app silent output
*Must* return sth
2020-03-21 23:25:25 +01:00
pictuga 463530f02c morss: middleware to enforce encoding
bytes are always expected
2020-03-21 23:23:50 +01:00
pictuga ec0a28a91d morss: use middleware for wsgi apps 2020-03-21 23:23:21 +01:00
pictuga 421acb439d morss: make errors more readable over http 2020-03-21 23:08:29 +01:00
pictuga 42c5d09ccb morss: split "options" var into "raw_options" & "options"
To make it clearer who-is-what
2020-03-21 23:07:07 +01:00
pictuga 056de12484 morss: add sheet.xsl to file handled by http server 2020-03-21 23:06:28 +01:00
pictuga 961a31141f morss: fix url fixing 2020-03-21 17:28:00 +01:00
pictuga d24734110a morss: convert all feeds to RSS
As html feeds might not contain some feeds, leading to data loss
2020-03-20 12:26:34 +01:00
pictuga a41c2a3a62 morss: fix twitter link detection 2020-03-20 12:26:19 +01:00
pictuga dd2651061f feeds & morss: clean up comments/empty lines 2020-03-20 12:25:48 +01:00
pictuga 5865af64f9 Fix indent output for html/xml 2020-03-20 12:18:13 +01:00
pictuga b3b90c067a morss.py: remove "useless" functions
Have to keep the code clean
2020-03-20 11:19:06 +01:00
pictuga bda51b0fc7 feeds & morss: many encoding/tostring fixes 2020-03-19 12:53:25 +01:00
pictuga d26795dce8 morss: from feedify to feeds
Also scrap obsolete feedify code
2020-03-19 10:27:44 +01:00
pictuga 9dbe061fd6 Remove markdown-related code
Time to clean up the code and stop with those non-core features
They just make the code harder to maintain
2020-03-18 16:47:00 +01:00
pictuga e606c5eefb feeds: various small cleanup/fixes 2018-11-18 15:14:38 +01:00
pictuga 3581f34db7 Various feeds.py related fixes 2018-11-11 16:46:23 +01:00
pictuga 679628c7fa Small code clean up 2018-11-11 16:11:00 +01:00
pictuga 399e867c94 morss: add py2 indication 2018-11-11 16:07:25 +01:00
pictuga 221e1f85ad feeds: fix implementation in morss 2018-11-11 15:26:09 +01:00
pictuga 4e144487db Test for feedify support first
Otherwise might never be called if the content-type is also supported
2018-10-25 01:17:24 +02:00
pictuga e72ca3f984 morss: improved output type 2018-09-30 22:02:29 +02:00
pictuga 2ccf36617a morss: improve http parameter parsing 2018-09-30 22:01:19 +02:00
pictuga 2d5bf7b38b Fix xml detection regex
Also (dirtily) fixes #18 for now
2017-11-04 14:21:05 +01:00
pictuga 194465544a crawler: separate CacheHander and actual caching
Default cache is now just an in-memory {}
2017-11-04 12:41:56 +01:00
pictuga 2d7d0fcdca morss: fix cgi in python 3
Needs explicit [] in py3
2017-11-04 12:27:47 +01:00
pictuga f563040809 readabilite: threshold to detect if it contains an article
Useful for videos/images-based images
2017-10-28 01:30:21 +02:00
pictuga 64babd6713 morss: make readabilite links absolute 2017-07-29 14:37:37 +02:00
pictuga d3bc2926fc Remove :hungry
Mostly usless. If you need it, you might as well not need to use morss in the first place...
2017-03-25 13:52:58 -10:00
pictuga 167e3e4a15 feedify: accept xpath rules passed as parameters 2017-03-20 20:56:48 -10:00
pictuga 08f08ef704 improve morss url detection regex 2017-03-20 20:51:13 -10:00
pictuga 1b4341f741 accept query_string in morss cgi 2017-03-20 20:50:04 -10:00
pictuga 5e61686373 Only use full feed for articles & feedify
Sometimes using referrer and/or useragent makes some dumb websites return diferent content (hello feedburner)
2017-03-18 23:43:28 -10:00
pictuga 0b6e553054 Move iTunes code to feedify.py 2017-03-18 23:41:37 -10:00
pictuga d4937812a8 Remove HTTPError code
Only used to look nice but useless (inherits from IOError anyway)
2017-03-18 23:39:32 -10:00
pictuga 67f5a21019 Move build_opener to crawler
Forgotten
2017-03-18 23:03:04 -10:00
pictuga 2003e2760b Move custom_handler to crawler
Makes more sense. Easier to reuse. Also cleaned up a bit the code
2017-03-18 22:51:27 -10:00
pictuga f4abc4e8a4 Detect encoding (using crawler) before readabilite 2017-03-11 02:30:57 -10:00
pictuga 385f9eb39a morss: use crawler strict accept for feed 2017-03-08 19:05:48 -10:00
Florian Muenchbach 993ac638a3 Added override for auto-detected character encoding of parsed pages. 2017-03-08 18:45:20 -10:00
pictuga 627163abff Make cache settings in morss nicer 2017-03-08 18:09:24 -10:00
pictuga e5f8e43659 Shifted the <link rel='alternate'/> redirect to crawler
Now using MIMETYPE var from crawler within morss.py
2017-03-08 18:03:34 -10:00
pictuga a8ac2ed1ca Turn FeedBefore/After into ItemBefore/After
To reduce the number of loops
2017-02-28 23:24:32 -10:00
pictuga fcc5e8a076 Add "Feed/Item" in functions name
To make it instantly clearer what they work on
2017-02-28 23:23:15 -10:00
pictuga 60e3311e97 Use readabilite properly
Not thru some weird wrapper anymore
2017-02-28 22:45:26 -10:00
pictuga dc8423550f Support xml starting with \s 2017-02-25 19:04:32 -10:00
pictuga b14381f575 Use internal readability fork
Much simpler, doesn't clean the html, probably less efficient, but much faster
2016-05-31 02:50:03 +02:00
pictuga 2b9bfb47e5 Remove :smart and etag headers
Dirty code, not very useful. Use simple cache-control instead.
2016-05-31 02:47:49 +02:00
pictuga 4ff80cec86 Check argv length before using it 2016-05-31 02:46:28 +02:00
pictuga 466d8e47d6 Also make buriy's readability port compatible
Should be faster, and it now supports py3
2015-08-29 18:33:12 +02:00
pictuga 95d9d847e9 :proxy implies :keep 2015-08-29 17:48:07 +02:00
pictuga 624fa47f4f Allow CLI change of the www/ path 2015-08-28 19:22:55 +02:00
pictuga 31fc939d52 Allow CLI change of the http server port 2015-08-28 19:22:23 +02:00
pictuga 4f9000beed Comment code of launching modes 2015-08-28 19:18:09 +02:00
pictuga 5e87b56a03 Return error code in plain text in file server 2015-08-28 19:16:15 +02:00
pictuga ffda3fac7e Improve file detection in web server 2015-08-28 19:15:40 +02:00
pictuga 6741a408dd Remove now-useless ca-cert file path 2015-08-28 19:13:54 +02:00
Massimo Vannucci 098a306c91 Fixed typo 2015-08-05 23:24:44 +02:00
pictuga 5c2151ffd6 Improve widely feedsportal url decoder 2015-06-14 20:32:47 +08:00
pictuga ae062ebe90 Remove deprecated https error catch 2015-04-07 18:59:37 +08:00
pictuga 7a3b257328 Make :mono use basic loop
Makes profiling easier
2015-04-07 18:16:08 +08:00
pictuga 2f86a2a44b Remove useless obscure cgi code 2015-04-07 09:49:44 +08:00
pictuga 131ba09207 Change :cache mode behavior
Makes underlying code way cleaner
2015-04-07 09:38:22 +08:00
pictuga cafb87d561 Fix sqlite relative path in cgi 2015-04-07 09:37:25 +08:00
pictuga decb3f15f6 Move the mod_cgi files to /cgi/ 2015-04-07 09:36:00 +08:00
pictuga b267791199 Remove hashbang from __init__.py 2015-04-07 09:34:22 +08:00
pictuga acae47dc79 2to3: fix cli_app string print 2015-04-06 23:27:15 +08:00
pictuga 32aa96afa7 Cache HTTP content using a custom Handler
Much much cleaner. Nothing comparable
2015-04-06 23:26:12 +08:00
pictuga 1b4fc88ad0 Replace MetaRedirect handler with two cleaner ones
One for <meta http-equiv> and one for HTTP 'refresh' header
2015-04-06 23:03:17 +08:00
pictuga f2fe4fc364 Drop HTTPS SSL certificate verification
Breaks everything with python 3. Now built-in in recent python 2.7.9 and python 3.4-ish
2015-04-06 22:54:59 +08:00
pictuga 2e3b766a0a http-server port as a var, print port on startup 2015-03-24 23:20:06 +08:00
pictuga 656b29e0ef 2to3: using unicode/str to please py3 2015-03-11 01:05:02 +08:00
pictuga cbeb01e555 2to3: fix urllib header retrieval 2015-03-11 01:03:16 +08:00
pictuga 6ae60d0343 2to3: py3-compatible readability fork 2015-03-03 01:03:03 +08:00
pictuga dbb3883516 2to3: urllib mimetype 2015-03-03 00:55:58 +08:00
pictuga 071288015b 2to3: morss.py port xrange 2015-02-25 18:41:49 +08:00
pictuga 803d6e37c4 2to3: morss.py port most default libs 2015-02-25 18:36:27 +08:00
pictuga 27cf8f6498 2to3: (iter)items to list 2015-02-25 12:02:53 +08:00
pictuga 3fb90cb7b4 2to3: local import 2015-02-25 11:57:10 +08:00
pictuga 47c8a511ff 2to3: print's 2015-02-25 11:57:10 +08:00
pictuga 604b03e2ba Delete desc when :keep=False
Still needed for Firefox, cause empty <desc/> still show up instead of content in feed preview
2015-02-24 00:38:34 +08:00
pictuga 83ed440e67 Fix issue when desc and content empty
Wouldn't put fetched article in feed
2015-02-24 00:38:02 +08:00
pictuga 5c23f90f0b Disable options filtering by default
But still provide sample code
2015-02-21 02:01:32 +08:00
pictuga 149117029c Improve logging of fetching errors 2015-02-21 01:58:45 +08:00
pictuga d5269964fc Make :theforce also bypass http errors 2015-02-21 01:58:16 +08:00
pictuga f0dcb9912e Fix cached errors handling 2015-02-21 01:57:33 +08:00
pictuga f62aedda12 Double HTTP timeout
Better slow than nothing (especially when running on a personal computer)
2015-02-21 01:55:53 +08:00
pictuga 76c4211a04 Make :hungry more useful 2015-02-21 01:55:25 +08:00
pictuga ef946c0712 XML pretty-print in separate option
Who reads plain XML anyway?
2015-02-20 17:38:39 +08:00
pictuga ec5f5b865f Make it easy to restrict available options 2014-11-21 22:01:03 +01:00
pictuga 105ca67744 Move facebook token to own script
To a PHP script actually. Not sure why PHP. Keeps morss' code cleaner. This piece of code had nothing to do in there, and didn't bring any advantage.
2014-11-19 20:09:27 +01:00
pictuga 8131ea2244 HTTPS SSL certificate validation
Specific error message added
2014-11-19 11:59:59 +01:00
pictuga 1b26c5f0e3 Split SimpleDownload in a lot of Handlers
Cleaner code, easier to edit, more flexibility. Paves the way to SSL certificates validation.
Still have to clean up the code of AcceptHeadersHandler.
2014-11-19 11:57:40 +01:00
pictuga f46576168a Add :mono to disable multithreading
Convenient to have linear logging
2014-11-10 23:14:54 +01:00
pictuga 5dd262139d Add HTTP error code to download error message 2014-11-09 15:45:01 +01:00
pictuga 6d5bb2b3c5 Print error message in wgi mode 2014-11-09 15:44:42 +01:00
pictuga a820cf6812 Run :strip in After
Makes more sense
2014-11-09 15:01:50 +01:00
pictuga 5eefe2c916 Log more when using wgi 2014-11-08 21:22:34 +01:00
pictuga 6f2061ff37 Fix :smart
Wasn't using the right way
2014-11-08 21:22:07 +01:00