:abc: English | :mahjong: 简体中文

ScrapydWeb: Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization.

PyPI - scrapydweb Version PyPI - Python Version CircleCI codecov Coverage Status Downloads - total GitHub license Twitter

servers

Scrapyd :x: ScrapydWeb :x: LogParser

:book: Recommended Reading

:link: How to efficiently manage your distributed web scraping projects

:link: How to set up Scrapyd cluster on Heroku

:eyes: Demo

:link: scrapydweb.herokuapp.com

:star: Features

View contents - :diamond_shape_with_a_dot_inside: Scrapyd Cluster Management - :100: All Scrapyd JSON API Supported - :ballot_box_with_check: Group, filter and select any number of nodes - :computer_mouse: **Execute command on multinodes with just a few clicks** - :mag: Scrapy Log Analysis - :bar_chart: Stats collection - :chart_with_upwards_trend: **Progress visualization** - :bookmark_tabs: Logs categorization - :battery: Enhancements - :package: **Auto packaging** - :male_detective: **Integrated with [:link: *LogParser*](https://github.com/my8100/logparser)** - :alarm_clock: **Timer tasks** - :e-mail: **Monitor & Alert** - :iphone: Mobile UI - :closed_lock_with_key: Basic auth for web UI

:computer: Getting Started

View contents ### :warning: Prerequisites :heavy_exclamation_mark: **Make sure that [:link: Scrapyd](https://github.com/scrapy/scrapyd) has been installed and started on all of your hosts.** :bangbang: Note that for remote access, you have to manually set 'bind_address = 0.0.0.0' in [:link: the configuration file of Scrapyd](https://scrapyd.readthedocs.io/en/latest/config.html#example-configuration-file) and restart Scrapyd to make it visible externally. ### :arrow_down: Install - Use pip: ```bash pip install scrapydweb ``` :heavy_exclamation_mark: Note that you may need to execute `python -m pip install --upgrade pip` first in order to get the latest version of scrapydweb, or download the tar.gz file from https://pypi.org/project/scrapydweb/#files and get it installed via `pip install scrapydweb-x.x.x.tar.gz` - Use git: ```bash pip install --upgrade git+https://github.com/my8100/scrapydweb.git ``` Or: ```bash git clone https://github.com/my8100/scrapydweb.git cd scrapydweb python setup.py install ``` ### :arrow_forward: Start 1. Start ScrapydWeb via command `scrapydweb`. (a config file would be generated for customizing settings at the first startup.) 2. Visit http://127.0.0.1:5000 **(It's recommended to use Google Chrome for a better experience.)** ### :globe_with_meridians: Browser Support The latest version of Google Chrome, Firefox, and Safari.

:heavy_check_mark: Running the tests

View contents
```bash $ git clone https://github.com/my8100/scrapydweb.git $ cd scrapydweb # To create isolated Python environments $ pip install virtualenv $ virtualenv venv/scrapydweb # Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb $ source venv/scrapydweb/bin/activate # Install dependent libraries (scrapydweb) $ python setup.py install (scrapydweb) $ pip install pytest (scrapydweb) $ pip install coverage # Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py (scrapydweb) $ vi tests/conftest.py (scrapydweb) $ curl http://127.0.0.1:6800 # '-x': stop on first failure (scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv -x (scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv --disable-warnings (scrapydweb) $ coverage report # To create an HTML report, check out htmlcov/index.html (scrapydweb) $ coverage html ```

:building_construction: Built With

View contents
- Front End - [:link: Element](https://github.com/ElemeFE/element) - [:link: ECharts](https://github.com/apache/incubator-echarts) - Back End - [:link: Flask](https://github.com/pallets/flask)

:clipboard: Changelog

Detailed changes for each release are documented in the :link: HISTORY.md.

:man_technologist: Author


my8100

:busts_in_silhouette: Contributors


Kaisla

:copyright: License

This project is licensed under the GNU General Public License v3.0 - see the :link: LICENSE file for details.