Bachelor Of Engineering - Bachelor Of Technology (B.E./B.Tech.)
Payment
1000000 to 1200000
Date Posted
23 Jan 2024
HR
Mili Chavhan
Contact
mili@white-force.com
Mobile
6264800152
Job description
Responsibilities:
Develop and maintain Python scripts for web scraping and data extraction from diverse sources such as websites, APIs, and other online platforms.
Utilize Python libraries and frameworks (e.g., Beautiful Soup, Scrapy, Selenium) to automate data collection tasks efficiently.
Understand and analyze target websites or data sources to identify the best scraping approach and develop efficient scraping strategies.
Build robust and scalable data scraping systems that can handle large volumes of data while ensuring data quality and integrity.
Collaborate with data engineering and analytics teams to define data requirements, data structures, and storage mechanisms for scraped data.
Understand LLM and ML models.
Perform data cleaning, preprocessing, and transformation tasks to prepare scraped data for downstream analysis and usage.
Monitor and troubleshoot scraping processes to identify and resolve issues such as website changes, data format variations, and anti-scraping measures.
Stay up to date with the latest web scraping trends, tools, and techniques to continually improve the efficiency and effectiveness of data scraping processes.
Ensure compliance with legal and ethical standards when collecting and utilizing data from online sources.
Requirements:
Strong experience in Python programming with expertise in web scraping and data extraction.
In-depth knowledge of Python libraries and frameworks commonly used for web scraping, such as Beautiful Soup, Scrapy, Selenium, and Requests.
Familiarity with HTML, CSS, XPath, and regular expressions for effective parsing and extraction of data from websites.
Understanding of HTTP protocols and web technologies to handle varied website structures and data formats (e.g., JSON, XML, CSV).
Experience with database systems (e.g., SQL, NoSQL) and data storage mechanisms for efficiently storing and managing scraped data.
Ability to analyze and interpret web page structures, inspect network requests, and troubleshoot scraping issues.
Strong problem-solving skills with attention to detail and ability to handle complex scraping scenarios.
Experience in breaking CAPTCHAs and using proxies for IP rotation.
Excellent communication and collaboration skills to work effectively with cross-functional teams.
Proven ability to work independently, manage multiple scraping projects simultaneously, and meet deadlines.
Preferred Qualifications:
Previous experience in scraping data from diverse domains and sources, including e-commerce websites, social media platforms, and news sites.
Knowledge of data analysis and visualization tools (e.g., Pandas, NumPy, Matplotlib, Tableau) to perform exploratory data analysis and present insights.
Familiarity with APIs and data integration techniques to combine scraped data with other data sources.
Understanding of web scraping legalities, ethical considerations, and best practices.
Join our team and contribute to our data-driven decision-making processes by leveraging your expertise in Python data scraping and extraction. Apply now and help us gather valuable insights from the vast web landscape.
Job requirements
Experience: 2 to 4 years
Education: Bachelor of Engineering - Bachelor of Technology (B.E./B.Tech.)