231 Commits (a523518ae8610b65a80754246508fb239c3dd6c6)

Author SHA1 Message Date
pictuga 63a06524b7 morss: various encoding fixes 2020-04-09 19:06:51 +02:00
pictuga 78cea10ead morss: replace :getpage with :get
Also provides readabilite debugging
2020-04-09 18:43:20 +02:00
pictuga f3d1f92b39 Detect encoding every time 2020-04-07 10:38:36 +02:00
pictuga 7691df5257 Use wrapper for http calls 2020-04-07 10:30:17 +02:00
pictuga f1d0431e68 morss: drop :html, replaced with :reader
README updated accordingly
2020-04-07 09:23:29 +02:00
pictuga a82ec96eb7 Delete feedify.py leftover code
iTunes integration untested, unreliable and not working...
2020-04-05 22:16:52 +02:00
pictuga 3617f86e9d morss: make cgi_encode more robust 2020-04-05 16:43:11 +02:00
pictuga d90756b337 morss: drop 'keep' option
Because the Firefox behaviour it is working around is no longer in use
2020-04-05 16:37:27 +02:00
pictuga d20f6237bd crawler: replace ContentNegoHandler with AlternateHandler
More basic. Sends the same headers no matter what. Makes requests more "replicable".
Also, drop "text/xml" from RSS contenttype, too broad, matches garbage
2020-04-05 16:05:59 +02:00
pictuga 8a4d68d72c crawler: drop 'basic' toggle
Can't even remember the use case
2020-04-05 16:03:06 +02:00
pictuga e6811138fd morss: use redirected url in :getpage
Still have to find how to do the same thing with feeds...
2020-04-04 20:04:57 +02:00
pictuga 35b702fffd morss: default values for feed creation 2020-04-04 19:39:32 +02:00
pictuga 4a88886767 morss: get_page to act as a basic proxy (for iframes) 2020-04-04 16:37:15 +02:00
pictuga 1653394cf7 morss: cgi_dispatcher to be able to create extra functions 2020-04-04 16:35:16 +02:00
pictuga a8a90cf414 morss: move url/options parsing to own function
For future re-use
2020-04-04 16:33:52 +02:00
pictuga bdbaf0f8a7 morss/cgi: fix handling of special chars in url 2020-04-04 16:21:37 +02:00
pictuga d0e447a2a6 ItemFix: clean up Pocket links 2020-04-04 16:20:39 +02:00
pictuga 7c3091d64c morss: code spacing
One of those commits that make me feel useful
2020-03-21 23:41:46 +01:00
pictuga 37b4e144a9 morss: small fixes
Includes dropping off ftp support
2020-03-21 23:30:18 +01:00
pictuga bd4b7b5bb2 morss: convert HTML feeds to XML ones for completeness 2020-03-21 23:27:42 +01:00
pictuga 68d920d4b5 morss: make FeedFormat more flexible with encoding 2020-03-21 23:26:35 +01:00
pictuga 758ff404a8 morss: fix cgi_app silent output
*Must* return sth
2020-03-21 23:25:25 +01:00
pictuga 463530f02c morss: middleware to enforce encoding
bytes are always expected
2020-03-21 23:23:50 +01:00
pictuga ec0a28a91d morss: use middleware for wsgi apps 2020-03-21 23:23:21 +01:00
pictuga 421acb439d morss: make errors more readable over http 2020-03-21 23:08:29 +01:00
pictuga 42c5d09ccb morss: split "options" var into "raw_options" & "options"
To make it clearer who-is-what
2020-03-21 23:07:07 +01:00
pictuga 056de12484 morss: add sheet.xsl to file handled by http server 2020-03-21 23:06:28 +01:00
pictuga 961a31141f morss: fix url fixing 2020-03-21 17:28:00 +01:00
pictuga d24734110a morss: convert all feeds to RSS
As html feeds might not contain some feeds, leading to data loss
2020-03-20 12:26:34 +01:00
pictuga a41c2a3a62 morss: fix twitter link detection 2020-03-20 12:26:19 +01:00
pictuga dd2651061f feeds & morss: clean up comments/empty lines 2020-03-20 12:25:48 +01:00
pictuga 5865af64f9 Fix indent output for html/xml 2020-03-20 12:18:13 +01:00
pictuga b3b90c067a morss.py: remove "useless" functions
Have to keep the code clean
2020-03-20 11:19:06 +01:00
pictuga bda51b0fc7 feeds & morss: many encoding/tostring fixes 2020-03-19 12:53:25 +01:00
pictuga d26795dce8 morss: from feedify to feeds
Also scrap obsolete feedify code
2020-03-19 10:27:44 +01:00
pictuga 9dbe061fd6 Remove markdown-related code
Time to clean up the code and stop with those non-core features
They just make the code harder to maintain
2020-03-18 16:47:00 +01:00
pictuga e606c5eefb feeds: various small cleanup/fixes 2018-11-18 15:14:38 +01:00
pictuga 3581f34db7 Various feeds.py related fixes 2018-11-11 16:46:23 +01:00
pictuga 679628c7fa Small code clean up 2018-11-11 16:11:00 +01:00
pictuga 399e867c94 morss: add py2 indication 2018-11-11 16:07:25 +01:00
pictuga 221e1f85ad feeds: fix implementation in morss 2018-11-11 15:26:09 +01:00
pictuga 4e144487db Test for feedify support first
Otherwise might never be called if the content-type is also supported
2018-10-25 01:17:24 +02:00
pictuga e72ca3f984 morss: improved output type 2018-09-30 22:02:29 +02:00
pictuga 2ccf36617a morss: improve http parameter parsing 2018-09-30 22:01:19 +02:00
pictuga 2d5bf7b38b Fix xml detection regex
Also (dirtily) fixes #18 for now
2017-11-04 14:21:05 +01:00
pictuga 194465544a crawler: separate CacheHandler and actual caching
Default cache is now just an in-memory {}
2017-11-04 12:41:56 +01:00
pictuga 2d7d0fcdca morss: fix cgi in python 3
Needs explicit [] in py3
2017-11-04 12:27:47 +01:00
pictuga f563040809 readabilite: threshold to detect if it contains an article
Useful for videos/images-based images
2017-10-28 01:30:21 +02:00
pictuga 64babd6713 morss: make readabilite links absolute 2017-07-29 14:37:37 +02:00
pictuga d3bc2926fc Remove :hungry
Mostly useless. If you need it, you might as well not need to use morss in the first place...
2017-03-25 13:52:58 -10:00
pictuga 167e3e4a15 feedify: accept xpath rules passed as parameters 2017-03-20 20:56:48 -10:00
pictuga 08f08ef704 improve morss url detection regex 2017-03-20 20:51:13 -10:00
pictuga 1b4341f741 accept query_string in morss cgi 2017-03-20 20:50:04 -10:00
pictuga 5e61686373 Only use full feed for articles & feedify
Sometimes using referrer and/or useragent makes some dumb websites return different content (hello feedburner)
2017-03-18 23:43:28 -10:00
pictuga 0b6e553054 Move iTunes code to feedify.py 2017-03-18 23:41:37 -10:00
pictuga d4937812a8 Remove HTTPError code
Only used to look nice but useless (inherits from IOError anyway)
2017-03-18 23:39:32 -10:00
pictuga 67f5a21019 Move build_opener to crawler
Forgotten
2017-03-18 23:03:04 -10:00
pictuga 2003e2760b Move custom_handler to crawler
Makes more sense. Easier to reuse. Also cleaned up the code a bit
2017-03-18 22:51:27 -10:00
pictuga f4abc4e8a4 Detect encoding (using crawler) before readabilite 2017-03-11 02:30:57 -10:00
pictuga 385f9eb39a morss: use crawler strict accept for feed 2017-03-08 19:05:48 -10:00
Florian Muenchbach 993ac638a3 Added override for auto-detected character encoding of parsed pages. 2017-03-08 18:45:20 -10:00
pictuga 627163abff Make cache settings in morss nicer 2017-03-08 18:09:24 -10:00
pictuga e5f8e43659 Shifted the <link rel='alternate'/> redirect to crawler
Now using MIMETYPE var from crawler within morss.py
2017-03-08 18:03:34 -10:00
pictuga a8ac2ed1ca Turn FeedBefore/After into ItemBefore/After
To reduce the number of loops
2017-02-28 23:24:32 -10:00
pictuga fcc5e8a076 Add "Feed/Item" in functions name
To make it instantly clearer what they work on
2017-02-28 23:23:15 -10:00
pictuga 60e3311e97 Use readabilite properly
Not thru some weird wrapper anymore
2017-02-28 22:45:26 -10:00
pictuga dc8423550f Support xml starting with \s 2017-02-25 19:04:32 -10:00
pictuga b14381f575 Use internal readability fork
Much simpler, doesn't clean the html, probably less efficient, but much faster
2016-05-31 02:50:03 +02:00
pictuga 2b9bfb47e5 Remove :smart and etag headers
Dirty code, not very useful. Use simple cache-control instead.
2016-05-31 02:47:49 +02:00
pictuga 4ff80cec86 Check argv length before using it 2016-05-31 02:46:28 +02:00
pictuga 466d8e47d6 Also make buriy's readability port compatible
Should be faster, and it now supports py3
2015-08-29 18:33:12 +02:00
pictuga 95d9d847e9 :proxy implies :keep 2015-08-29 17:48:07 +02:00
pictuga 624fa47f4f Allow CLI change of the www/ path 2015-08-28 19:22:55 +02:00
pictuga 31fc939d52 Allow CLI change of the http server port 2015-08-28 19:22:23 +02:00
pictuga 4f9000beed Comment code of launching modes 2015-08-28 19:18:09 +02:00
pictuga 5e87b56a03 Return error code in plain text in file server 2015-08-28 19:16:15 +02:00
pictuga ffda3fac7e Improve file detection in web server 2015-08-28 19:15:40 +02:00
pictuga 6741a408dd Remove now-useless ca-cert file path 2015-08-28 19:13:54 +02:00
Massimo Vannucci 098a306c91 Fixed typo 2015-08-05 23:24:44 +02:00
pictuga 5c2151ffd6 Widely improve feedsportal url decoder 2015-06-14 20:32:47 +08:00
pictuga ae062ebe90 Remove deprecated https error catch 2015-04-07 18:59:37 +08:00
pictuga 7a3b257328 Make :mono use basic loop
Makes profiling easier
2015-04-07 18:16:08 +08:00
pictuga 2f86a2a44b Remove useless obscure cgi code 2015-04-07 09:49:44 +08:00
pictuga 131ba09207 Change :cache mode behavior
Makes underlying code way cleaner
2015-04-07 09:38:22 +08:00
pictuga cafb87d561 Fix sqlite relative path in cgi 2015-04-07 09:37:25 +08:00
pictuga decb3f15f6 Move the mod_cgi files to /cgi/ 2015-04-07 09:36:00 +08:00
pictuga b267791199 Remove hashbang from __init__.py 2015-04-07 09:34:22 +08:00
pictuga acae47dc79 2to3: fix cli_app string print 2015-04-06 23:27:15 +08:00
pictuga 32aa96afa7 Cache HTTP content using a custom Handler
Much much cleaner. Nothing comparable
2015-04-06 23:26:12 +08:00
pictuga 1b4fc88ad0 Replace MetaRedirect handler with two cleaner ones
One for <meta http-equiv> and one for HTTP 'refresh' header
2015-04-06 23:03:17 +08:00
pictuga f2fe4fc364 Drop HTTPS SSL certificate verification
Breaks everything with python 3. Now built-in in recent python 2.7.9 and python 3.4-ish
2015-04-06 22:54:59 +08:00
pictuga 2e3b766a0a http-server port as a var, print port on startup 2015-03-24 23:20:06 +08:00
pictuga 656b29e0ef 2to3: using unicode/str to please py3 2015-03-11 01:05:02 +08:00
pictuga cbeb01e555 2to3: fix urllib header retrieval 2015-03-11 01:03:16 +08:00
pictuga 6ae60d0343 2to3: py3-compatible readability fork 2015-03-03 01:03:03 +08:00
pictuga dbb3883516 2to3: urllib mimetype 2015-03-03 00:55:58 +08:00
pictuga 071288015b 2to3: morss.py port xrange 2015-02-25 18:41:49 +08:00
pictuga 803d6e37c4 2to3: morss.py port most default libs 2015-02-25 18:36:27 +08:00
pictuga 27cf8f6498 2to3: (iter)items to list 2015-02-25 12:02:53 +08:00
pictuga 3fb90cb7b4 2to3: local import 2015-02-25 11:57:10 +08:00