pyscrapers.core package

Submodules

pyscrapers.core.ffprobe module

This module returns information about video streams by running ffprobe as a subprocess.

Reference: http://stackoverflow.com/questions/3844430/how-to-get-the-duration-of-a-video-in-python/3844467

pyscrapers.core.ffprobe.duration(vid_file_path)

Return the video's duration in seconds as a float.

pyscrapers.core.ffprobe.height(vid)
pyscrapers.core.ffprobe.probe(vid_file_path)

Return the JSON output of the ffprobe command line for the given video file.

vid_file_path: the absolute (full) path of the video file, as a string.
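As an illustration of the approach described above, the following sketch builds an ffprobe command that emits JSON and reads the duration from the result. The helper names build_probe_command, run_probe and parse_duration are hypothetical and are not part of pyscrapers, and the sample data is made up to mirror the format.duration field that ffprobe reports with -show_format.

```python
import json
import subprocess
from typing import List


def build_probe_command(vid_file_path: str) -> List[str]:
    """Build an ffprobe invocation that prints container and stream info as JSON."""
    return [
        "ffprobe",
        "-v", "quiet",
        "-print_format", "json",
        "-show_format",
        "-show_streams",
        vid_file_path,
    ]


def run_probe(vid_file_path: str) -> dict:
    """Run ffprobe (it must be on PATH) and return its parsed JSON output."""
    out = subprocess.check_output(build_probe_command(vid_file_path))
    return json.loads(out)


def parse_duration(probe_result: dict) -> float:
    """Extract the duration in seconds; ffprobe reports it as a string."""
    return float(probe_result["format"]["duration"])


# Made-up sample mirroring ffprobe's JSON layout, so no video file is needed here.
sample = json.loads('{"format": {"duration": "12.500000"}, "streams": []}')
print(parse_duration(sample))
```

Keeping the command construction and the parsing separate from the subprocess call makes both easy to check without ffprobe installed.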

pyscrapers.core.ffprobe.width(vid)
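The width and height helpers are listed without descriptions. One plausible way to obtain these values, assuming ffprobe's JSON output with a "streams" list, is to pick the first stream whose codec_type is "video". The function names and the sample data below are hypothetical and only illustrate the idea.

```python
from typing import Optional, Tuple


def first_video_stream(probe_result: dict) -> Optional[dict]:
    """Return the first stream marked as video, or None if there is none."""
    for stream in probe_result.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream
    return None


def video_size(probe_result: dict) -> Tuple[int, int]:
    """Return (width, height) of the first video stream."""
    stream = first_video_stream(probe_result)
    if stream is None:
        raise ValueError("no video stream found")
    return int(stream["width"]), int(stream["height"])


# Made-up probe result: one audio stream followed by one video stream.
sample_probe = {
    "streams": [
        {"codec_type": "audio", "sample_rate": "44100"},
        {"codec_type": "video", "width": 1280, "height": 720},
    ]
}
size = video_size(sample_probe)
```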

pyscrapers.core.ext_requests module

class pyscrapers.core.ext_requests.ExtResponse(res: Response)

Bases: object

raise_for_status()
save_binary(filename: str = '/tmp/temp') → None
save_text(filename: str = '/tmp/temp')
class pyscrapers.core.ext_requests.ExtSession(base: str = '')

Bases: Session

Extends requests.Session with additional capabilities, such as downloading URLs to files.

download_url(source: str, target: str) → None

Download a single URL to a file.
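A download of a single URL to a file is commonly written as a streamed copy, so that large files are never held in memory at once. The sketch below assumes a response object that exposes iter_content(chunk_size=...), as requests.Response does. The names save_streamed and FakeResponse are hypothetical; the fake stands in for a real response so the example runs without network access.

```python
import os
import tempfile


def save_streamed(response, target: str, chunk_size: int = 8192) -> None:
    """Write the response body to `target` chunk by chunk."""
    with open(target, "wb") as handle:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                handle.write(chunk)


class FakeResponse:
    """Minimal stand-in that mimics the iter_content method of a response."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def iter_content(self, chunk_size: int = 1):
        for start in range(0, len(self._data), chunk_size):
            yield self._data[start:start + chunk_size]


target = os.path.join(tempfile.mkdtemp(), "payload.bin")
save_streamed(FakeResponse(b"hello world"), target, chunk_size=4)
```

With a real session, the response would typically be requested with stream=True so that the body is not fetched eagerly.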

download_video_if_wider(source: str, target: str, width: int) → bool

Download a video only if it is wider than the given width.
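The conditional download described above can be sketched as a width check followed by the actual transfer. This is only an illustration under stated assumptions: the width is measured before downloading, "wider" means strictly greater, and the boolean result reports whether a download happened. The function download_if_wider and both stub callables are hypothetical.

```python
from typing import Callable, List, Tuple


def download_if_wider(
    source: str,
    target: str,
    min_width: int,
    probe_width: Callable[[str], int],
    download: Callable[[str, str], None],
) -> bool:
    """Download `source` to `target` only if its width exceeds `min_width`."""
    if probe_width(source) > min_width:
        download(source, target)
        return True
    return False


# Stubs replacing the real probe and download steps (assumptions for the demo).
downloaded: List[Tuple[str, str]] = []


def fake_probe_width(url: str) -> int:
    return 1920


def fake_download(source: str, target: str) -> None:
    downloaded.append((source, target))


result = download_if_wider(
    "http://example.com/a.mp4", "/tmp/a.mp4", 1280, fake_probe_width, fake_download
)
```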

ext_get(url: str, *args, **kwargs)
get_timeout(url: str)
pyscrapers.core.ext_requests.download(response, filename: str) → None
pyscrapers.core.ext_requests.setup()

Activate the debugging features of the requests module.
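The source does not show what setup() does internally. A widely used recipe for verbose HTTP debugging with requests, given here only as a sketch, raises the debug level of the standard library's http.client connections and enables DEBUG logging for urllib3. The function name enable_http_debug is hypothetical.

```python
import http.client
import logging


def enable_http_debug() -> None:
    """Print raw HTTP traffic and enable DEBUG logging for urllib3."""
    # http.client prints request and response headers when debuglevel > 0.
    http.client.HTTPConnection.debuglevel = 1

    # Route log records to stderr and lower the root threshold to DEBUG.
    logging.basicConfig()
    logging.getLogger().setLevel(logging.DEBUG)

    # urllib3 is the connection layer used by requests.
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    urllib3_logger.propagate = True


enable_http_debug()
```

Both settings are process-wide, so a helper like this is usually called once at program start.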

pyscrapers.core.url_set module

class pyscrapers.core.url_set.UrlSet

Bases: object

A set of URLs with no duplicates. The URLs in the set can be downloaded.

append(url: str) → None

Add a URL to the list.

download(session)

Download every URL in the list using the given session.

extend(urls: List[str]) → None
get_filename(suffix: str) → str
print() → None

Print the list.

suggest_filename(suffix: str) → str
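A duplicate-free collection of URLs that still remembers insertion order can be sketched with a plain dict, whose keys are unique and ordered. The class below is a hypothetical illustration of the append and extend behaviour described above, not the actual UrlSet implementation; the name OrderedUrlSet and the as_list method are assumptions.

```python
from typing import Dict, Iterable, List


class OrderedUrlSet:
    """Keep URLs unique while preserving the order in which they were added."""

    def __init__(self) -> None:
        # Dict keys are unique and keep insertion order (Python 3.7+).
        self._urls: Dict[str, None] = {}

    def append(self, url: str) -> None:
        """Add one URL; a URL that is already present is ignored."""
        self._urls.setdefault(url, None)

    def extend(self, urls: Iterable[str]) -> None:
        """Add several URLs, skipping duplicates."""
        for url in urls:
            self.append(url)

    def as_list(self) -> List[str]:
        """Return the URLs in insertion order."""
        return list(self._urls)


urls = OrderedUrlSet()
urls.append("http://example.com/a.jpg")
urls.extend(["http://example.com/b.jpg", "http://example.com/a.jpg"])
```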

pyscrapers.core.utils module

This module is a set of utilities for the entire project.

pyscrapers.core.utils.add_http(url, main_url)

Combine two URLs into one.
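The description does not say how the two URLs are combined. One common interpretation, shown here purely as an assumption, is resolving a possibly relative URL against the URL of the page it was found on, which the standard library's urllib.parse.urljoin does. The function name join_url and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def join_url(url: str, main_url: str) -> str:
    """Resolve `url` against `main_url`; absolute URLs are returned unchanged."""
    return urljoin(main_url, url)


base = "http://example.com/videos/index.html"
relative = join_url("clip.mp4", base)
rooted = join_url("/static/a.png", base)
absolute = join_url("https://other.example/x", base)
```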

pyscrapers.core.utils.get_http_status_string(code: int)

Return a description of an HTTP status code (for example, 404: Not Found). The requests module does not provide a clean API for this, so the function accesses a protected (underscore-prefixed) member of ‘requests.status_codes’. See: https://stackoverflow.com/questions/24718557/get-the-description-of-a-status-code-in-python-requests
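For comparison, the Python standard library offers a public way to map a status code to its reason phrase through http.HTTPStatus. The sketch below is an alternative to the protected-member approach described above, not the function's actual implementation; the name status_phrase and the "Unknown" fallback are assumptions.

```python
from http import HTTPStatus


def status_phrase(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Codes not defined in HTTPStatus raise ValueError.
        return "Unknown"


not_found = status_phrase(404)
ok = status_phrase(200)
```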

Module contents