pictuga
|
d6882e0a6a
|
readabilite: (try to) improve detection
Kinda hopeless
|
2017-03-19 02:00:31 -10:00 |
pictuga
|
79a8ada9f4
|
readabilite: add tags to score
|
2017-03-19 01:57:54 -10:00 |
pictuga
|
4a5150e030
|
readabilite: fix iter while iterating
|
2017-03-19 01:56:33 -10:00 |
pictuga
|
e65c88abf8
|
readabilite: fix re.match
|
2017-03-19 01:55:40 -10:00 |
pictuga
|
367f86987d
|
readabilite: spread score to all ancestors
Instead of just parents and grandparents
|
2017-03-18 22:24:38 -10:00 |
Florian Muenchbach
|
993ac638a3
|
Added override for auto-detected character encoding of parsed pages.
|
2017-03-08 18:45:20 -10:00 |
pictuga
|
3fc89d5359
|
readabilite: improve score for <p>
Helps a lot with BBC, Le Monde. Might backfire on other websites though...
|
2017-03-01 18:02:45 -10:00 |
pictuga
|
e0f533ca31
|
readabilite: test to replace <br/> with div
|
2017-02-25 18:16:15 -10:00 |
pictuga
|
c6c113b8a8
|
readabilite: function to clean up the html code
|
2017-02-25 18:15:33 -10:00 |
pictuga
|
58d9f65735
|
readabilite: explain the use of .tail
|
2017-02-25 18:14:13 -10:00 |
pictuga
|
a5aec8c7a6
|
readabilite: more keywords to the filter list
Also fixed indentation
|
2017-02-25 18:13:15 -10:00 |
pictuga
|
e71fc967ce
|
readabilite: shift "good" tags to a var (list)
So that this list can later be re-used
|
2017-02-25 18:07:28 -10:00 |
pictuga
|
b14381f575
|
Use internal readability fork
Much simpler, doesn't clean the HTML, probably less effective, but much faster
|
2016-05-31 02:50:03 +02:00 |