username-removed-867288's avatar
pushed to branch master at dlab-indecol / web_trawler
  • fcb69ccb · remove some readme spelling
username-removed-867288's avatar
commented on issue #3 "web_trawler should be able to trawl all web pages in a domain, especially now that there's a y/n prompt" at dlab-indecol / web_trawler
but we can still have the option non-interactive, just trawl all first/second order links...
username-removed-1390245's avatar
commented on issue #3 "web_trawler should be able to trawl all web pages in a domain, especially now that there's a y/n prompt" at dlab-indecol / web_trawler

Non-interactive mode with regard to the file links should still be default, I agree. But if you'd like the trawler to jump to linked web pages in...

username-removed-1390245's avatar
commented on issue #4 "Make robust for "too many connections" errors" at dlab-indecol / web_trawler

Web_trawler already uses both multiprocessing and multithreading. I think all that's needed is to catch certain kinds of errors, time.sleep for a w...
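
As a rough illustration of the approach in the comment above (catch certain errors, then time.sleep), here is a minimal sketch. The comment preview is truncated, so the retry step, the doubling wait, and every name below (retry_on_connection_error, fetch, retries, wait_seconds) are assumptions made for illustration; this is not web_trawler's actual code.

```python
import time


def retry_on_connection_error(fetch, retries=3, wait_seconds=1.0, sleep=time.sleep):
    """Call fetch(); on ConnectionError, pause and try again.

    Hypothetical helper, not part of web_trawler. `fetch` is any
    zero-argument callable that performs one download attempt.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except ConnectionError:
            # Out of attempts: let the caller see the original error.
            if attempt == retries - 1:
                raise
            # Assumed policy: double the wait after each failed attempt.
            sleep(wait_seconds * (2 ** attempt))
```

A download call could then be wrapped as retry_on_connection_error(lambda: download(url)), where download stands in for whatever function fetches a file. Which exception types a "too many connections" failure actually raises depends on the HTTP library in use, so the except clause would need adjusting.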

username-removed-1390245's avatar
opened issue #4 "Make robust for "too many connections" errors" at dlab-indecol / web_trawler
username-removed-1390245's avatar
pushed new branch review at dlab-indecol / web_trawler
username-removed-1390245's avatar
pushed to branch develop at dlab-indecol / web_trawler
  • 72f83556 · added y/n changes to readme
username-removed-1390245's avatar
pushed to branch develop at dlab-indecol / web_trawler
  • 5b569856 · implemented y/n solution to issue #2
username-removed-341506's avatar
commented on merge request !1 "Add Sparrow wrapper for web_trawler tool" at dlab-indecol / web_trawler

Hi! Sure. No worries.

username-removed-867288's avatar
commented on merge request !1 "Add Sparrow wrapper for web_trawler tool" at dlab-indecol / web_trawler

Hi,...

username-removed-867288's avatar
closed merge request !1 "Add Sparrow wrapper for web_trawler tool" at dlab-indecol / web_trawler
username-removed-867288's avatar
pushed new branch 2-daughter-directories at dlab-indecol / web_trawler
username-removed-341506's avatar
opened merge request !1 "Add Sparrow wrapper for web_trawler tool" at dlab-indecol / web_trawler
username-removed-1188737's avatar
commented on issue #2 "daughter directories" at dlab-indecol / web_trawler

Yes, a good idea would be a y/n for each page/directory – rather than at the file level.

username-removed-1390245's avatar
commented on issue #2 "daughter directories" at dlab-indecol / web_trawler

The wiod.org header bar can certainly be excluded. But it'll be hard to find a more or less universal solution. We're dependent on html standards o...

username-removed-1188737's avatar
opened issue #2 "daughter directories" at dlab-indecol / web_trawler
username-removed-1390245's avatar
pushed new branch develop at dlab-indecol / web_trawler
username-removed-1390245's avatar
pushed to branch master at dlab-indecol / web_trawler
username-removed-867288's avatar
commented on (deleted) at dlab-indecol / web_trawler

meant processors/cores

username-removed-1390245's avatar
pushed to branch master at dlab-indecol / web_trawler
  • f161987a · made single-line docstrings follow standards