A toolbox for data grabbing and processing in Python 3


cutout is a Python toolbox for data grabbing and processing.

This software is still under active development and continues to be improved.



You can download cutout and use it in your code like this::

    from cutout import download, cutout
    from cutout.common import get_html, get_argv_dict
    from cutout.util import sec2time



To get the download URL of the Baidu Music PC client::

    >>> from cutout import cutout
    >>> para = {}  # parameters
    >>> para['url'] = ''
    >>> para['start'] = '<a class="downloadlink-pc"'
    >>> para['end'] = '>下载PC版</a>'  # link text, "Download PC version"
    >>> para['dealwith'] = {'start': 'href="', 'rid': '"', 'end': '"'}  # extract the href URL
    >>> cutout(**para)  # run the grab
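
The parameters above describe a two-stage cut: ``start`` and ``end`` delimit a fragment of the page, and ``dealwith`` repeats the same idea inside that fragment to isolate the ``href`` value. The sketch below illustrates that idea with plain string searches. It is not cutout's implementation; ``cut_between``, the sample HTML and the example.com URL are hypothetical and introduced only for illustration.

```python
def cut_between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    begin = text.find(start)
    if begin == -1:
        return None
    begin += len(start)
    stop = text.find(end, begin)
    if stop == -1:
        return None
    return text[begin:stop]


# Hypothetical page fragment, shaped like the markers used above.
html = (
    '<div><a class="downloadlink-pc" href="http://example.com/setup.exe">'
    '下载PC版</a></div>'
)

# Stage 1: isolate the attributes of the download link.
fragment = cut_between(html, '<a class="downloadlink-pc"', '>下载PC版</a>')
# Stage 2: isolate the href value inside that fragment.
url = cut_between(fragment, 'href="', '"')
print(url)
```

Returning ``None`` when a marker is missing keeps the two stages easy to chain and to check, which is one reasonable choice for a scraper whose target page may change.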

To create a file cache::

    >>> from cutout.cache import FileCache
    >>> c = FileCache('./cache')  # store cache files in './cache'
    >>> c.set("foo", "value")
    >>> c.get("foo")
    >>> c.get("missing") is None
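
To make the ``set``/``get`` pattern above concrete, here is a minimal sketch of a file-backed cache written with the standard library only. It is an assumption about how such a cache can work, not the code behind ``FileCache``; the class name ``TinyFileCache`` and the choice of ``pickle`` and SHA-256 file names are hypothetical.

```python
import hashlib
import os
import pickle
import tempfile


class TinyFileCache:
    """Store each key as one pickled file inside `cache_dir`."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, key):
        # Hash the key so that any string maps to a safe file name.
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, name)

    def set(self, key, value):
        with open(self._path(key), "wb") as fh:
            pickle.dump(value, fh)

    def get(self, key):
        try:
            with open(self._path(key), "rb") as fh:
                return pickle.load(fh)
        except FileNotFoundError:
            return None  # a missing key reads as None


cache = TinyFileCache(tempfile.mkdtemp())
cache.set("foo", "value")
print(cache.get("foo"))
print(cache.get("missing") is None)
```

The sketch omits expiry and size limits, which a real cache would normally need.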

To show a ProgressBar while downloading::

    >>> from cutout import download
    >>> from cutout.common import ProgressBar
    >>> bar = ProgressBar(piece_total=1)
    >>> face = {'sh_piece_division': 1024, 'sh_piece_unit': 'KB'}
    >>> bar.face(**face)
    >>> download('', showBar=bar)
    '[=============================>                    ]  59.23%  14.81%/s  1280.00KB/s  5120.00KB/8644.81KB  00:00:04'
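
The status line above combines a bar, a percentage, two rates, the amount transferred and a clock. The sketch below shows one way such a line can be computed from three numbers. It is an illustration only: ``render_bar`` is a hypothetical helper, not part of cutout, and it assumes the trailing clock is elapsed time and that the bar is 50 cells wide.

```python
def render_bar(done_kb, total_kb, elapsed_s, width=50):
    """Build one status line from the amount done, the total and elapsed time."""
    fraction = done_kb / total_kb if total_kb else 0.0
    filled = int(width * fraction)
    # '=' for completed cells, '>' as the head, spaces for the remainder.
    if filled < width:
        cells = "=" * filled + ">" + " " * (width - filled - 1)
    else:
        cells = "=" * width
    percent = fraction * 100
    percent_rate = percent / elapsed_s if elapsed_s else 0.0
    speed = done_kb / elapsed_s if elapsed_s else 0.0
    # Split the elapsed seconds into hours, minutes and seconds.
    minutes, seconds = divmod(int(elapsed_s), 60)
    hours, minutes = divmod(minutes, 60)
    return (
        "[{}]  {:.2f}%  {:.2f}%/s  {:.2f}KB/s  {:.2f}KB/{:.2f}KB  "
        "{:02d}:{:02d}:{:02d}"
    ).format(cells, percent, percent_rate, speed, done_kb, total_kb,
             hours, minutes, seconds)


line = render_bar(5120.0, 8644.81, 4.0)
print(line)
```

Dividing the byte counts by 1024 before calling the helper plays the same role as the ``sh_piece_division`` and ``sh_piece_unit`` settings in the session above.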

Read or run the example script for more examples.

For a detailed description of the whole API, download document.zh.htm and open it in a browser.

$ python3 cutout/


cutout is developed and maintained by Yang Jie. It can be found here:

Contact: