commented on
issue #3
"web_trawler should be able to trawl all web pages in a domain, especially now that there's a y/n prompt"
at
dlab-indecol / web_trawler
But we can still keep a non-interactive option: just trawl all first- and second-order links...
commented on
issue #3
"web_trawler should be able to trawl all web pages in a domain, especially now that there's a y/n prompt"
at
dlab-indecol / web_trawler
Non-interactive mode with regard to the file links should still be the default, I agree. But if you'd like the trawler to jump to linked web pages in...
Web_trawler already uses both multiprocessing and multithreading. I think all that's needed is to catch certain kinds of errors, time.sleep for a w...
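A rough sketch of the catch-and-sleep idea mentioned above, in case it helps the discussion. This is not web_trawler's actual code: `fetch_with_retry`, its parameters, and the choice of exceptions are all assumptions made for illustration, and the retry step is one reading of the suggestion.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, fetch=None, retries=3, delay=1.0, sleep=time.sleep):
    """Hypothetical helper: fetch a URL, pausing and retrying on transient errors."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    if fetch is None:
        # Default fetcher (an assumption): plain urllib with a timeout.
        def fetch(target):
            with urllib.request.urlopen(target, timeout=10) as response:
                return response.read()

    last_error = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except (urllib.error.URLError, TimeoutError, ConnectionError) as error:
            # Only network-style errors are caught; anything else propagates.
            last_error = error
            if attempt < retries - 1:
                # Wait a little longer after each failed attempt.
                sleep(delay * (attempt + 1))
    raise last_error
```

Passing `fetch` and `sleep` as arguments keeps the sketch easy to try without touching the network; a worker in a thread or process pool could call it in place of a bare download call.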
commented on
merge request !1
"Add Sparrow wrapper for web_trawler tool"
at
dlab-indecol / web_trawler
Hi! Sure. No worries.
commented on
merge request !1
"Add Sparrow wrapper for web_trawler tool"
at
dlab-indecol / web_trawler
Hi,...
Yes, a good idea would be a y/n for each page/directory – rather than at the file level.
The wiod.org header bar can certainly be excluded. But it'll be hard to find a more or less universal solution. We're dependent on HTML standards o...
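To make the per-page/directory y/n idea concrete, here is a minimal sketch, assuming the trawler already has a flat list of file URLs. The names `group_by_directory` and `confirm_directories` are hypothetical and not part of web_trawler.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_directory(urls):
    """Group file URLs by their parent directory (hypothetical helper)."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        directory = posixpath.dirname(parts.path) or "/"
        groups[f"{parts.scheme}://{parts.netloc}{directory}"].append(url)
    return dict(groups)


def confirm_directories(groups, ask=input):
    """Ask one y/n question per directory; return the files the user accepted."""
    selected = []
    for directory, files in groups.items():
        answer = ask(f"Download {len(files)} file(s) from {directory}? [y/n] ")
        if answer.strip().lower().startswith("y"):
            selected.extend(files)
    return selected
```

The `ask` parameter defaults to `input`, so a non-interactive mode could pass `lambda prompt: "y"` and reuse the same code path.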
f161987a · made single-line docstrings follow standards