Remove markdown-related code

Time to clean up the code and stop shipping those non-core features.
They just make the code harder to maintain.
master
pictuga 2020-03-18 16:47:00 +01:00
parent dbb9cccc42
commit 9dbe061fd6
3 changed files with 2 additions and 15 deletions


@@ -45,7 +45,6 @@ You do need:
 - [python](http://www.python.org/) >= 2.6 (python 3 is supported)
 - [lxml](http://lxml.de/) for xml parsing
 - [dateutil](http://labix.org/python-dateutil) to parse feed dates
-- [html2text](http://www.aaronsw.com/2002/html2text/)
 - [OrderedDict](https://pypi.python.org/pypi/ordereddict) if using python < 2.7
 - [wheezy.template](https://pypi.python.org/pypi/wheezy.template) to generate HTML pages
 - [chardet](https://pypi.python.org/pypi/chardet)
@@ -77,7 +76,6 @@ The arguments are:
 - `search=STRING`: does a basic case-sensitive search in the feed
 - Advanced
 - `csv`: export to csv
-- `md`: convert articles to Markdown
 - `indent`: returns indented XML or JSON, takes more place, but human-readable
 - `nolink`: drop links, but keeps links' inner text
 - `noref`: drop items' link
@@ -199,7 +197,7 @@ Using cache and passing arguments:
 >>> import morss
 >>> url = 'http://feeds.bbci.co.uk/news/rss.xml'
 >>> cache = '/tmp/morss-cache.db' # sqlite cache location
->>> options = {'csv':True, 'md':True}
+>>> options = {'csv':True}
 >>> xml_string = morss.process(url, cache, options)
 >>> xml_string[:50]
 '{"title": "BBC News - Home", "desc": "The latest s'
@@ -214,7 +212,7 @@ Doing it step-by-step:
 import morss, morss.crawler
 url = 'http://newspaper.example/feed.xml'
-options = morss.Options(csv=True, md=True) # arguments
+options = morss.Options(csv=True) # arguments
 morss.crawler.sqlite_default = '/tmp/morss-cache.db' # sqlite cache location
 rss = morss.FeedFetch(url, options) # this only grabs the RSS feed


@@ -19,7 +19,6 @@ from . import readabilite
 import wsgiref.simple_server
 import wsgiref.handlers
-from html2text import HTML2Text

 try:
     # python 2
@@ -290,15 +289,6 @@ def ItemAfter(item, options):
     if options.noref:
         item.link = ''

-    if options.md:
-        conv = HTML2Text(baseurl=item.link)
-        conv.unicode_snob = True
-
-        if item.desc:
-            item.desc = conv.handle(item.desc)
-
-        if item.content:
-            item.content = conv.handle(item.content)

     return item
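With the `md` option gone, users who still want Markdown can convert the feed's HTML themselves, outside morss. Below is a minimal stdlib sketch of the core transformation the removed `html2text` step performed — rewriting `<a href>` tags into `[text](url)` with relative links resolved against a base URL, as `HTML2Text(baseurl=item.link)` did. The `MiniMarkdown` class and `to_markdown` helper are hypothetical illustrations, not part of morss or html2text:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class MiniMarkdown(HTMLParser):
    """Hypothetical stand-in for the removed html2text step: keeps text
    content and rewrites <a href> into [text](url), resolving relative
    URLs against a base URL as the removed code's baseurl argument did."""

    def __init__(self, baseurl=''):
        super().__init__()
        self.baseurl = baseurl
        self.out = []        # accumulated output fragments
        self.href = None     # target of the <a> currently open, if any
        self.link_text = []  # text collected inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self.href = urljoin(self.baseurl, dict(attrs).get('href') or '')
            self.link_text = []

    def handle_endtag(self, tag):
        if tag == 'a' and self.href is not None:
            self.out.append('[%s](%s)' % (''.join(self.link_text), self.href))
            self.href = None

    def handle_data(self, data):
        # inside a link, buffer the text; otherwise emit it directly
        (self.link_text if self.href is not None else self.out).append(data)

def to_markdown(html, baseurl=''):
    parser = MiniMarkdown(baseurl)
    parser.feed(html)
    return ''.join(parser.out)

print(to_markdown('<p>Hello <a href="/x">link</a></p>', 'http://example.com'))
```

A real replacement would simply install the `html2text` package and call `HTML2Text(baseurl=...).handle(html)` on `item.desc` and `item.content`, exactly as the deleted block did.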


@@ -1,5 +1,4 @@
 lxml
 python-dateutil <= 1.5
-html2text
 chardet
 pymysql