pyscrapers.workers package

Submodules

pyscrapers.workers.drumeo module

Download course material from Drumeo.

class pyscrapers.workers.drumeo.Course

Bases: object

An object representing a single course.

add_lesson(lesson)

Add a lesson to the course.

add_video(video, quality)

Add a video, at the given quality, to the course.

pyscrapers.workers.drumeo.download_course(course, session)

Download a course.

pyscrapers.workers.drumeo.get_course_details(course: Course, courses: bool, session)

Populate the Course object.

pyscrapers.workers.drumeo.get_course_urls(course, courses: bool, session)
pyscrapers.workers.drumeo.get_courses(pages, courses: bool, session)

Download the list of all courses.

pyscrapers.workers.drumeo.get_members_url(courses: bool, number: int) → str
pyscrapers.workers.drumeo.get_number_of_pages(courses: bool, session: ExtSession) → int

Get the number of pages for all courses or pages.

pyscrapers.workers.drumeo.get_url(courses: bool, page: int) → str
pyscrapers.workers.drumeo.get_videos(root, course, session)

Get all the videos belonging to a given course.

pyscrapers.workers.facebook module

Download photos from Facebook.

pyscrapers.workers.facebook.scrape_facebook(user_id: str, session, url_set: UrlSet) → None

Download photos from Facebook for the given user.

pyscrapers.workers.getpocket module

pyscrapers.workers.getpocket.getpocket_download(session: Session, _logger: Logger)

This function does the heavy lifting.

pyscrapers.workers.instagram module

How does this work? When you fetch a user's page on Instagram you get an HTML document with embedded JavaScript, which in turn contains a JSON object. This JSON object describes the user: their ID, their profile photo and the first 12 images for that user. If you want more, you have to make a follow-up AJAX request to the server.
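As a rough illustration of the first step described above, the sketch below pulls a JSON object out of a script tag. The sample page, the marker window._sharedData, the JSON keys and the helper name extract_embedded_json are all assumptions made for this example; they are not taken from this module and may not match what the site serves today.

```python
import json

# Hypothetical page: the variable name and the JSON keys are assumptions.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">window._sharedData = {"user": {"id": "42", "profile_pic_url": "https://example.com/p.jpg", "media": [{"src": "https://example.com/1.jpg"}, {"src": "https://example.com/2.jpg"}]}};</script>
</body></html>
"""


def extract_embedded_json(html, marker="window._sharedData = "):
    """Return the JSON value that directly follows `marker`, or None."""
    start = html.find(marker)
    if start == -1:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (here the trailing ';</script>').
    obj, _end = json.JSONDecoder().raw_decode(html, start + len(marker))
    return obj


data = extract_embedded_json(SAMPLE_HTML)
first_images = [item["src"] for item in data["user"]["media"]]
```

Using raw_decode rather than a regular expression avoids having to balance nested braces by hand.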

pyscrapers.workers.instagram.get_urls(logger, session, base, url_set, user_id)
pyscrapers.workers.instagram.is_rate_limit(response) → bool

Rate limit messages look like this: {"message": "rate limited", "status": "fail"}
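A check for that message shape could look roughly like the sketch below. The helper name looks_rate_limited is hypothetical, and it takes the response body as a string so that the example has no third-party dependencies; the actual is_rate_limit takes a response object and may be implemented differently.

```python
import json


def looks_rate_limited(body):
    """Return True if `body` is a JSON object matching the rate limit message."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all (for example a regular HTML page).
        return False
    if not isinstance(payload, dict):
        return False
    return (
        payload.get("status") == "fail"
        and payload.get("message") == "rate limited"
    )
```

A caller would typically sleep and retry when this returns True, rather than treating the body as image data.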

pyscrapers.workers.instagram.scrape_instagram(user_id: str, session, url_set: UrlSet) → None

pyscrapers.workers.mamba_ru module

pyscrapers.workers.mamba_ru.scrape_mambaru(user_id: str, session, url_set: UrlSet) → None

pyscrapers.workers.pornhub module

Module to handle scraping of pornhub.

References:
- https://pypi.org/project/pornhub-api/

Download movies from pornhub

pyscrapers.workers.pornhub.download_url(session) → None
pyscrapers.workers.pornhub.get_code(e: ValueError) → int
pyscrapers.workers.pornhub.get_number_of_pages(root) → int

Return the number of pages for a pornstar.

pyscrapers.workers.pornhub.get_urls_from_page(root) → List[str]

Return the URLs found on a page.

pyscrapers.workers.pornhub.print_categories(api: PornhubApi) → None
pyscrapers.workers.pornhub.print_stars_all(api: PornhubApi) → None
pyscrapers.workers.pornhub.print_stars_all_detailed(api: PornhubApi) → None
pyscrapers.workers.pornhub.print_tags(api: PornhubApi) → None
pyscrapers.workers.pornhub.url_generator(url: str)

pyscrapers.workers.sxyprn module

pyscrapers.workers.sxyprn.sxyprn_download(session: Session, logger: Logger)

This function performs the downloads.

pyscrapers.workers.sxyprn.url_generator(url: str)

pyscrapers.workers.travelgirls module

pyscrapers.workers.travelgirls.scrape_travelgirls(user_id: str, session, url_set: UrlSet) → None

pyscrapers.workers.vk module

pyscrapers.workers.vk.get_my_content(r)

The response from the VK server is not standard HTML. This is why we must cut it up ourselves and can't use the regular 'get_real_content' helper.

pyscrapers.workers.vk.get_total_images(logger, session, url, user_id)
pyscrapers.workers.vk.get_urls(base, got, json_obj, url_set)
pyscrapers.workers.vk.scrape_vk(user_id: str, session, url_set: UrlSet) → None
pyscrapers.workers.vk.yield_json_objs_and_base(r)

pyscrapers.workers.youtube_dl_handlers module

Module that handles the interaction with the youtube_dl library

References:
- https://github.com/ytdl-org/youtube-dl/blob/master/README.md#embedding-youtube-dl
- https://github.com/ytdl-org/youtube-dl/blob/3e4cedf9e8cd3157df2457df7274d0c842421945/youtube_dl/YoutubeDL.py#L137-L312

class pyscrapers.workers.youtube_dl_handlers.MyLogger

Bases: object

debug(msg)
error(msg)
warning(msg)
pyscrapers.workers.youtube_dl_handlers.youtube_dl_download_url(url: str) → None
pyscrapers.workers.youtube_dl_handlers.youtube_dl_download_urls(urls: List[str]) → None
pyscrapers.workers.youtube_dl_handlers.youtube_dl_handler() → None
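For orientation, the embedding pattern from the youtube-dl README referenced above looks roughly like the sketch below: a logger object with debug, warning and error methods is passed through the options dictionary to youtube_dl.YoutubeDL. The names SketchLogger, build_options and download_urls, and the chosen output template, are assumptions made for this example and are not necessarily what this module does.

```python
from typing import List


class SketchLogger:
    """Logger exposing the three methods youtube_dl calls on its logger."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    def debug(self, msg: str) -> None:
        pass  # drop debug chatter

    def warning(self, msg: str) -> None:
        pass  # drop warnings

    def error(self, msg: str) -> None:
        self.errors.append(msg)  # keep errors for later inspection


def build_options(logger: SketchLogger) -> dict:
    """Build the options dictionary handed to youtube_dl.YoutubeDL."""
    return {
        "logger": logger,
        "outtmpl": "%(title)s-%(id)s.%(ext)s",  # output file name template
    }


def download_urls(urls: List[str]) -> None:
    """Download every URL in `urls` (requires the youtube_dl package)."""
    import youtube_dl  # third-party; imported lazily so the sketch loads without it

    with youtube_dl.YoutubeDL(build_options(SketchLogger())) as ydl:
        ydl.download(urls)
```

Collecting error messages on the logger, instead of printing them, lets a caller decide afterwards whether a failed download should abort the whole run.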

Module contents