Bachelor's degree in Computer Science, Information Systems, or a related field.
3+ years of experience in ETL development with Python as the primary language.
Strong understanding of relational databases (e.g., PostgreSQL, SQL Server) and proficiency in SQL.
Experience with data warehousing concepts and design principles.
Proven ability to write clean, efficient, and maintainable code.
Excellent analytical and problem-solving skills.
Strong communication and collaboration skills.
Experience with cloud data warehousing platforms (e.g., Snowflake, Redshift) is a plus.
Experience with ETL/data orchestration tools such as Airflow or Luigi is a plus.
Responsibilities:
Collaborate with the Data Management team to understand the company's data needs and storage requirements.
Design and implement ETL (Extract, Transform, Load) processes to efficiently move data from diverse sources (e.g., databases, flat files, APIs) into our data warehouse (see the extraction sketch after this list).
Develop and maintain high-quality ETL scripts using Python as the primary language, leveraging tools such as Pandas, PySpark, and Airflow (see the orchestration sketch after this list).
Perform data transformations within the ETL process, including cleaning, filtering, and aggregation (see the Pandas sketch after this list).
Design and build data warehouse schemas that optimize data storage and retrieval for various business analyses.
Implement data quality checks and monitoring mechanisms to ensure data accuracy and consistency (see the validation sketch after this list).
Troubleshoot and resolve data extraction, transformation, and loading issues.
Collaborate with stakeholders to document ETL processes and data warehouse structures.
Stay up to date with the latest advancements in ETL technologies and best practices.
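
To make the responsibilities above concrete, a few minimal Python sketches follow. First, extraction from two of the source types named above (a flat file and an API). The URL, file paths, and column layout are illustrative assumptions, not details of our actual pipeline.

```python
import pandas as pd
import requests

# Illustrative extraction from two source types; the URL and paths are
# hypothetical and stand in for real upstream systems.

# Flat-file source: a CSV export dropped by an upstream process.
orders = pd.read_csv("/data/incoming/orders.csv")

# API source: a hypothetical JSON endpoint returning a list of records.
resp = requests.get("https://api.example.com/v1/customers", timeout=30)
resp.raise_for_status()
customers = pd.DataFrame(resp.json())

# Land both extracts in a staging area for the transform step.
orders.to_parquet("/data/staging/orders.parquet", index=False)
customers.to_parquet("/data/staging/customers.parquet", index=False)
```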
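
Next, a minimal sketch of the orchestration side: an Airflow 2.x DAG wiring extract, transform, and load steps together. The dag_id, schedule, and file paths are assumptions made for illustration.

```python
# A minimal Airflow 2.x DAG sketching an extract -> transform -> load pipeline.
# The dag_id, schedule, and all paths are illustrative assumptions.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Hypothetical source: a CSV export dropped by an upstream system.
    df = pd.read_csv("/data/incoming/orders.csv")
    df.to_parquet("/data/staging/orders.parquet", index=False)


def transform():
    df = pd.read_parquet("/data/staging/orders.parquet")
    df = df.dropna(subset=["order_id"])  # basic cleaning before loading
    df.to_parquet("/data/staging/orders_clean.parquet", index=False)


def load():
    # Stubbed out here; in practice this would load into the warehouse.
    pass


with DAG(
    dag_id="orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow >= 2.4; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```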
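
A small Pandas example of the cleaning, filtering, and aggregation work described above; the records and column names are made up for the sketch.

```python
import pandas as pd

# Made-up sales records; column names are assumptions for this sketch.
raw = pd.DataFrame(
    {
        "order_date": ["2024-01-03", "2024-01-03", None, "2024-01-04"],
        "region": ["east", "west", "east", "east"],
        "amount": [120.0, 75.5, 30.0, 210.0],
    }
)

# Cleaning: drop rows missing a date, then normalize the dtype.
clean = raw.dropna(subset=["order_date"]).copy()
clean["order_date"] = pd.to_datetime(clean["order_date"])

# Filtering: keep a single region.
east = clean[clean["region"] == "east"]

# Aggregation: daily totals over the filtered subset.
daily = east.groupby("order_date", as_index=False)["amount"].sum()
print(daily)
```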
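
Finally, one way the data quality checks might look; the rules, column names, and the 1% null tolerance are illustrative assumptions.

```python
import pandas as pd

# Illustrative post-load checks; column names and thresholds are assumptions.
def check_orders(df: pd.DataFrame) -> list[str]:
    problems = []
    if df.empty:
        problems.append("no rows loaded")
    if df["order_id"].duplicated().any():
        problems.append("duplicate order_id values")
    if (df["amount"] < 0).any():
        problems.append("negative order amounts")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:  # assumed tolerance: at most 1% missing customer_id
        problems.append(f"customer_id null rate {null_rate:.1%} exceeds 1%")
    return problems


df = pd.read_parquet("/data/staging/orders_clean.parquet")  # hypothetical path
issues = check_orders(df)
if issues:
    raise ValueError("Data quality check failed: " + "; ".join(issues))
```

A check like this could run as its own task in the DAG sketched earlier, failing the run before bad data reaches downstream consumers.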