About this project
Open
We are seeking an experienced web scraping professional to assist in extracting all content from a substantial website comprising approximately 700 pages. The project requirements are as follows:
Content Extraction: Scrape the entire website, ensuring all textual content, images, and media files are accurately captured.
Image Handling: Some images are hosted on external Content Delivery Networks (CDNs). These must be identified and included in the extraction process.
PDF Generation: Create PDF documents for each webpage that precisely mirror the live site's appearance in desktop view. While mobile view PDFs are not mandatory, they would be considered a valuable addition.
Static Site Deployment: Provide a fully functional static version of the website that can be rendered in a browser from a server, replicating the original site's desktop view.
Delivery Timeline: Due to time constraints, we require the completion of this project by March 22nd, 2025.
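To illustrate the image handling requirement, here is a minimal sketch of one way externally hosted images might be identified on a page. It is not a prescribed method. The names `MediaUrlCollector` and `split_by_host` and the `example.com` / `cdn.example.net` hosts are hypothetical placeholders. The sketch uses only the Python standard library, treats any host other than the page's own as external, and splits `srcset` on commas, which is a simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class MediaUrlCollector(HTMLParser):
    """Collects URLs found in src and srcset attributes of <img> and <source> tags."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("img", "source"):
            return
        for name, value in attrs:
            if not value:
                continue
            if name == "src":
                # Resolve relative and protocol-relative URLs against the page URL.
                self.urls.append(urljoin(self.page_url, value.strip()))
            elif name == "srcset":
                # Simplification: treat srcset as comma-separated "URL descriptor" pairs.
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.urls.append(urljoin(self.page_url, parts[0]))


def split_by_host(page_url, html):
    """Return (local, external) lists of http(s) media URLs referenced by the page."""
    site_host = urlparse(page_url).netloc.lower()
    collector = MediaUrlCollector(page_url)
    collector.feed(html)

    local, external = [], []
    for url in dict.fromkeys(collector.urls):  # de-duplicate, keep first-seen order
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip data: URIs and other non-downloadable schemes
        if parsed.netloc.lower() == site_host:
            local.append(url)
        else:
            external.append(url)
    return local, external


if __name__ == "__main__":
    sample = (
        '<img src="/img/a.png">'
        '<img src="https://cdn.example.net/b.jpg" '
        'srcset="//cdn.example.net/b-2x.jpg 2x, /img/a-small.png 480w">'
    )
    local, external = split_by_host("https://example.com/docs/page.html", sample)
    print("local:", local)
    print("external:", external)
```

Images referenced only from CSS (for example `background-image`) or injected by JavaScript would not be found this way, so a rendered-page approach may also be needed.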
Qualifications:
Proven experience in web scraping large-scale websites.
Proficiency with tools and programming languages commonly used in web scraping (e.g., Python, Beautiful Soup, Scrapy).
Ability to handle and download externally hosted media files.
Experience in generating accurate PDF representations of web pages in desktop view.
Expertise in deploying static websites that can be served from a server and rendered correctly in browsers.
Strong attention to detail to ensure all content is captured and represented correctly.
Application Requirements:
Brief overview of your experience (previous work) with similar web scraping and static site deployment projects.
Description of the tools and methods you plan to use for this project.
Estimated timeline showing how you will complete the project by the specified deadline.
Any potential challenges you foresee and your approach to addressing them.
We look forward to collaborating with a skilled professional who can deliver high-quality results within a tight timeframe.
Category IT & Programming
Subcategory Web development
What is the scope of the project? Create a new custom site
Is this a project or a position? Project
Required availability As needed
Roles needed Developer
Delivery term: February 20, 2025
Skills needed