pictuga
071288015b
2to3: morss.py port xrange
2015-02-25 18:41:49 +08:00
pictuga
803d6e37c4
2to3: morss.py port most default libs
2015-02-25 18:36:27 +08:00
pictuga
327b8504c4
2to3: feeds.py port urllib2
2015-02-25 18:22:38 +08:00
pictuga
4f6f8bd41b
2to3: feedify.py port http-related lib
2015-02-25 18:16:35 +08:00
pictuga
a0f2e0d995
2to3: crawler.py improve except
2015-02-25 18:07:09 +08:00
pictuga
6a06b742f9
2to3: crawler.py port try as
2015-02-25 18:03:54 +08:00
pictuga
c2d85e2bf9
2to3: crawler.py port httplib
2015-02-25 18:02:29 +08:00
pictuga
4f224888d8
2to3: crawler.py port urllib2 and StringIO
2015-02-25 17:53:36 +08:00
pictuga
27cf8f6498
2to3: (iter)items to list
2015-02-25 12:02:53 +08:00
pictuga
3fb90cb7b4
2to3: local import
2015-02-25 11:57:10 +08:00
pictuga
47c8a511ff
2to3: print's
2015-02-25 11:57:10 +08:00
pictuga
604b03e2ba
Delete desc when :keep=False
...
Still needed for Firefox, because an empty <desc/> still shows up instead of the content in the feed preview
2015-02-24 00:38:34 +08:00
pictuga
83ed440e67
Fix issue when desc and content empty
...
Wouldn't put fetched article in feed
2015-02-24 00:38:02 +08:00
pictuga
5c23f90f0b
Disable options filtering by default
...
But still provide sample code
2015-02-21 02:01:32 +08:00
pictuga
149117029c
Improve logging of fetching errors
2015-02-21 01:58:45 +08:00
pictuga
d5269964fc
Make :theforce also bypass http errors
2015-02-21 01:58:16 +08:00
pictuga
f0dcb9912e
Fix cached errors handling
2015-02-21 01:57:33 +08:00
pictuga
f62aedda12
Double HTTP timeout
...
Better slow than nothing (especially when running on a personal computer)
2015-02-21 01:55:53 +08:00
pictuga
76c4211a04
Make :hungry more useful
2015-02-21 01:55:25 +08:00
pictuga
446dd9fb3f
Fix typo in FeedListDescriptor
...
Thanks @tehsphinx. Fixes #4.
2015-02-20 17:41:14 +08:00
pictuga
ef946c0712
XML pretty-print in separate option
...
Who reads plain XML anyway?
2015-02-20 17:38:39 +08:00
pictuga
fcf4197801
Populate __init__.py
2015-02-19 13:05:59 +08:00
pictuga
ec5f5b865f
Make it easy to restrict available options
2014-11-21 22:01:03 +01:00
pictuga
105ca67744
Move facebook token to own script
...
To a PHP script actually. Not sure why PHP. Keeps morss' code cleaner. This piece of code had nothing to do in there, and didn't bring any advantage.
2014-11-19 20:09:27 +01:00
pictuga
a9654ea578
Fix encoding detection in feedify
2014-11-19 12:25:18 +01:00
pictuga
8131ea2244
HTTPS SSL certificate validation
...
Specific error message added
2014-11-19 11:59:59 +01:00
pictuga
1b26c5f0e3
Split SimpleDownload in a lot of Handlers
...
Cleaner code, easier to edit, more flexibility. Paves the way to SSL certificates validation.
Still have to clean up the code of AcceptHeadersHandler.
2014-11-19 11:57:40 +01:00
pictuga
f46576168a
Add :mono to disable multithreading
...
Convenient to have linear logging
2014-11-10 23:14:54 +01:00
pictuga
5dd262139d
Add HTTP error code to download error message
2014-11-09 15:45:01 +01:00
pictuga
6d5bb2b3c5
Print error message in wgi mode
2014-11-09 15:44:42 +01:00
pictuga
a820cf6812
Run :strip in After
...
Makes more sense
2014-11-09 15:01:50 +01:00
pictuga
607df4b123
Fix Twitter
...
They changed the html structure of the profile pages
2014-11-09 15:00:38 +01:00
pictuga
5eefe2c916
Log more when using wgi
2014-11-08 21:22:34 +01:00
pictuga
6f2061ff37
Fix :smart
...
Wasn't using the right way
2014-11-08 21:22:07 +01:00
pictuga
40834eeb93
Split After into Before/After
...
Needed since a bunch of options have to run before the actual fetching (because no one needs to fetch the articles of to-be-dropped items)
2014-11-08 20:31:29 +01:00
pictuga
f20fb9cdf6
Use more stable loop-over-list in Gather
2014-11-08 20:30:36 +01:00
pictuga
6a40731248
Return output when DEBUG is on
...
Much more convenient to actually debug
2014-11-07 18:44:59 +01:00
pictuga
d3eb2dd88d
Implement :smart to save bandwidth
2014-11-07 18:40:44 +01:00
pictuga
67fc5f06f8
Run "After" even when debug mode is on
2014-11-06 21:15:16 +01:00
pictuga
ad2673f474
Add :empty to remove all items
...
This is completely useless...
2014-11-06 21:14:41 +01:00
pictuga
ecfda1d05a
Add :strip to remove desc and content
2014-11-06 21:14:20 +01:00
pictuga
1a8ee716f3
Add "search" option
...
PLEASE NOTE that this is case sensitive and only does a very basic search ("is xyz in the title?"). Don't use this for fine filtering.
Also fixed an issue with After(): some functions were removing items from the feed while looping over the feed items, which caused some annoying item-skipping issues.
2014-11-06 21:11:23 +01:00
pictuga
690bf43977
reader: show desc if no content is available
2014-10-26 19:22:57 +01:00
pictuga
0e22bb4316
Cache: catch json parse errors
2014-09-28 12:03:58 +02:00
pictuga
5f8288eecb
Add :hungry to fill feeds with long intros
2014-06-28 01:43:31 +02:00
pictuga
ac69b28f1b
Pass options to Fill
2014-06-28 01:43:09 +02:00
pictuga
6cc3e7eb93
Fix :callback and add content-type
2014-06-28 01:20:47 +02:00
pictuga
0ec7c2f3e6
Fix :callback crash
2014-06-28 01:13:29 +02:00
pictuga
484432d804
Add :callback for JSONP calls
2014-06-28 00:59:57 +02:00
pictuga
226441d821
Add :cors for cross-domain XHR (with README update)
2014-06-28 00:59:13 +02:00
pictuga
230659a34b
Reenable args with values
2014-06-28 00:58:37 +02:00
pictuga
38b90e0e4c
Fix template syntax
2014-06-22 20:23:32 +02:00
pictuga
d877e856d3
Fix feed.items.append since pep8
...
The underscore naming convention was not yet applied in that function
2014-06-22 20:13:36 +02:00
pictuga
ee3b2590d0
Remove useless line-break (pep8)
2014-06-22 20:00:44 +02:00
pictuga
5a0084c7cc
Fix isPermaLink in feedify
2014-06-22 19:54:13 +02:00
pictuga
e991d356f4
Fix duckduckgo layout in .ini
2014-06-22 19:53:53 +02:00
pictuga
ecabbc0175
Replace <a> with <span> in reader with :noref
2014-06-22 19:42:52 +02:00
pictuga
6352ef28a9
Use pep8-like layout for .ini
2014-06-22 02:14:11 +02:00
pictuga
3ca5dbaf31
Raise ImportError when missing dependency for call
2014-06-22 02:04:14 +02:00
pictuga
9f51448160
Use xrange where applicable (faster)
2014-06-22 02:02:43 +02:00
pictuga
f01efb7334
Make most of the code pep8-compliant
...
Thanks a lot to github.com/SamuelMarks for his nice work
2014-06-22 01:59:01 +02:00
pictuga
da0a8feadd
Replace TABS with FOUR SPACES in .py
...
(you might want to use: git diff -w)
2014-06-21 18:35:59 +02:00
pictuga
da857f8bb2
Remove useless odata var in morss/morss.py
2014-06-21 18:25:50 +02:00
pictuga
286b90ab8e
Fix typo in error raising message
2014-06-21 16:29:05 +02:00
pictuga
cc27483143
Remove unused imports
2014-06-21 16:13:54 +02:00
pictuga
1cf959ce5b
Fix item.link deletion
2014-06-21 16:08:37 +02:00
pictuga
de5b75162c
Add :ad mode (as an example)
...
Not really useful, but shows how to quickly add/remove items from the feed
2014-06-16 14:07:59 +02:00
pictuga
850d574424
Add one comment
...
Was waiting to be committed for months...
2014-06-16 14:07:23 +02:00
pictuga
45478b592e
Remove cache-redirect
...
Some kind of no-longer-working code left-over
2014-06-16 14:06:42 +02:00
pictuga
8270685ac6
Use longer timeout for xml fetching
2014-06-16 14:03:24 +02:00
pictuga
0e3751c712
Remove useless comment
2014-06-16 14:02:54 +02:00
pictuga
862fe3cae4
Use more recent user-agent
2014-06-16 14:01:01 +02:00
pictuga
7211093cc5
Add :smart :noref modes, update README
2014-06-16 14:00:02 +02:00
pictuga
f991802d9e
Try to use less server-specific code for FB tokens
2014-06-16 13:57:53 +02:00
pictuga
9285525256
Unify internal/external errors
2014-06-16 13:55:59 +02:00
pictuga
cdef40fbbe
Fix Cache saving crash
...
Because it was deleting values of a dict while looping over its values...
2014-06-07 19:14:31 +02:00
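A minimal sketch of the usual way around this kind of crash, not the code from the commit itself: iterate over a snapshot of the entries so that deleting from the dict does not disturb the loop. The names `prune` and `is_stale` are hypothetical and only serve the illustration. In Python 3, deleting from a dict while iterating over it directly typically raises "RuntimeError: dictionary changed size during iteration".

```python
def prune(cache, is_stale):
    """Remove, in place, every entry for which is_stale(key, value) is true.

    Hypothetical helper for illustration; not part of morss.
    """
    # list(...) takes a snapshot of the items, so the dict can be
    # modified safely inside the loop.
    for key, value in list(cache.items()):
        if is_stale(key, value):
            del cache[key]
    return cache


if __name__ == "__main__":
    pages = {"a": 1, "b": 2, "c": 3}
    # Drop entries with odd values (an arbitrary example predicate).
    prune(pages, lambda key, value: value % 2 == 1)
    print(pages)
```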
pictuga
f90958149e
Add :reader
...
Uses wheezy.template, which is said to be fast and light. Provided template file is really basic, custom css suggested.
2014-05-29 14:12:16 +02:00
pictuga
b66ac2bc5e
Make it possible not to use caching
2014-05-24 19:13:41 +02:00
pictuga
25fdca4bf0
Add do-it-all function
...
For quick lib use
2014-05-24 19:02:22 +02:00
pictuga
26c91070f5
Time-based Cache
...
Solves the :proxy issue for good. More convenient, more flexible
2014-05-24 19:01:21 +02:00
pictuga
5e64696031
Fix '/morss.py/' url fixer
2014-05-22 22:53:36 +02:00
pictuga
364fbc4ba6
Remove apparent limit
...
It no longer works because of the all-bool args introduced earlier
2014-05-22 22:52:49 +02:00
pictuga
b03d865b7b
Get rid of ParseOptions()
...
That thing wasn't nice, and depended too much on the various use cases. The new approach is to turn morss into a library and turn the use cases into some pre-implemented lib usages
2014-05-22 22:44:59 +02:00
pictuga
3c48c58127
Remove useless HOLD var
...
Was needed in DEBUG at some point
2014-05-21 12:19:49 +02:00
pictuga
e8e7f170a6
Include super dumb http file server
...
For index.html, other files can be added, but everything has to be hard-coded (mimetype included)
2014-05-18 12:34:23 +02:00
pictuga
c41a1fe226
Support for wikipedia featured articles feed
...
Should work with most wikipedias
2014-05-18 12:17:14 +02:00
pictuga
d8a3c4e9af
Add support for Google News
2014-05-18 11:58:45 +02:00
pictuga
bbf1ffbb15
Remove 'persistent' and 'dic' arg in Cache
...
'dic' was mostly intended for facebook now-bygone advanced buggy token storage. 'persistent' was needed by fb and 'proxy' mode, but a small workaround was found for the proxy mode (basically making sure the cache object is always at least 5-item long)
2014-05-15 00:54:40 +02:00
pictuga
76e7f1ea00
Try to use more generic 302/303 redirections
...
Still far from being great, but at least I can use it on both morss.it and test.morss.it now
2014-05-14 15:05:14 +02:00
pictuga
031b67a8db
Remove some useless options
...
progress and an accidentally disclosed one, because they were useless
2014-05-14 15:03:40 +02:00
pictuga
974bad7974
Fix and strip down facebook
...
Remove unstable non-working facebook semi-automatic token renewal (a simple warning on morss.it should be enough). Also committed some forgotten stuff.
2014-05-14 15:01:41 +02:00
pictuga
b7136f2056
Pull iTunes raw feed out of iTunes url
...
This iTunes thinggy somehow qualifies as yet-another-apple-tech-rape: just some old tech behind iron curtains…
2014-05-12 23:15:51 +02:00
pictuga
d8074d6b6d
Redirect google translate links to original link
...
Because Google Translate isn't scrapable anyway. So it's better to have at least some content.
2014-03-22 20:53:33 +01:00
pictuga
a4cf5e0daa
Google link cleaner now works on all .dot versions
2014-03-22 20:52:25 +01:00
pictuga
c94ef92131
Fix Facebook support
...
Now token is grabbed directly by the server, and sent back by means of a cookie. This does unify token "creation" and renewal.
2014-02-21 14:36:06 +01:00
pictuga
a1f5c3db3a
Have .csv files be downloaded
...
So that users can open it in LibreOffice/OpenOffice/Word without having to save it to disk beforehand
2014-02-05 00:37:12 +01:00
pictuga
6c33bb6e1c
Safer Cache saving
...
Create tmp file and then move it to destination. Avoids corrupt files during write
2014-01-29 20:36:45 +01:00
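The "write to a tmp file, then move it" idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the function name `save_atomically` and the JSON format are hypothetical, and `os.replace` requires Python 3.3 or later.

```python
import json
import os
import tempfile


def save_atomically(path, data):
    """Write data as JSON to path by way of a temporary file.

    Hypothetical helper for illustration; not part of morss.
    """
    # Create the temporary file in the destination directory so that the
    # final rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            json.dump(data, tmp_file)
        # Readers see either the old file or the new one, never a
        # half-written file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "cache.json")
    save_atomically(target, {"url": "cached page"})
    with open(target) as handle:
        print(json.load(handle))
```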
pictuga
6eaec96af7
Keep "dic" param in Cache.new
2014-01-22 15:56:08 +01:00
pictuga
4e549dc88a
Change lim/max settings only for current "run"
2014-01-19 23:36:41 +01:00
pictuga
0f7bc568e4
Send CGI HTTP headers earlier
...
So that browsers show that sth is going on
2014-01-15 21:02:47 +01:00
pictuga
4d6ef92504
Separate function for output. Add csv
2014-01-13 00:10:57 +01:00
pictuga
7fbe728f93
Feeds: allow json, csv export
...
Uses OrderedDict
2014-01-13 00:08:03 +01:00
pictuga
ec55f5e856
Use smarter order for RSS.dict
2014-01-13 00:07:04 +01:00
pictuga
3d78cfb638
Fix HTTP bug when returning empty page
2014-01-11 18:21:37 +01:00
pictuga
840b0b1ded
Remove yet another silly log message
2014-01-11 18:18:02 +01:00
pictuga
8209f243bb
Fix rss-redirection code
...
And add log, which was lost when splitting functions (which made this fix needed)
2014-01-11 18:15:36 +01:00
pictuga
3b3ac4c8a6
Remove batch of useless imports
2014-01-11 17:31:27 +01:00
pictuga
5feb061bf7
First attempt at decent folder structure
...
Use setup.py, subfolder for code.
2014-01-11 17:11:57 +01:00
pictuga
851dacdfbc
Renamed to .py.
2013-04-04 18:17:12 +02:00
pictuga
6783bbf992
Improved shebang.
2013-04-04 17:56:37 +02:00
pictuga
82084c2c75
Move to OOP.
...
This is a huge commit. The whole code is ported to Object-Oriented Programming. This makes the code cleaner, which became necessary to deal with all the different cases, for example with encoding detection. Encoding detection now works better, and uses 3 different methods. HTML pages with an xml declaration are now supported. Feed urls with parameters (eg. "index.php?option=par") are also supported. Cache is now smarter: it no longer grows indefinitely, since only in-use pages are kept in the cache. Caching is now mandatory. urllib (not urllib2) is no longer needed. Solved a possible crash with log function (when passing list of str with non-unicode encoding).
README is also updated.
2013-04-04 17:43:30 +02:00
pictuga
05b5bc7783
Catch extra errors (timeout).
2013-03-29 20:06:31 +01:00
pictuga
6f6c5fbaad
Faster xml cleaning
2013-03-01 14:26:51 +01:00
pictuga
e305f387ab
Hopefully fixed encoding issues
...
with the dirtiest trick out there...
2013-02-27 15:12:32 +01:00
pictuga
ed8a45875c
Default to "//h1/.." since most websites use it
...
because it is said to be good for SEO. Debug now requires env variable "DEBUG" to be set to something other than "".
2013-02-25 21:36:02 +01:00
pictuga
d39604c453
Support for cookies added
...
NYT needs them
2013-02-25 20:53:59 +01:00
pictuga
d6179a734f
Clearer debug info
2013-02-25 20:53:22 +01:00
pictuga
eb63ce3f4f
Handle more errors
2013-02-25 18:32:23 +01:00
pictuga
b63f91a151
Added cache, easier debug
2013-02-25 18:01:59 +01:00
pictuga
51fe6ce81b
First commit
2013-02-25 15:50:32 +01:00