Role Summary:
We are seeking a skilled Data Engineer to join our team and help build, maintain, and optimize our data infrastructure and pipelines. The ideal candidate will have a strong background in data engineering, DevOps, or software development, with expertise in database technologies, infrastructure as code, and cloud-native data processing systems.
Key Responsibilities:
- Data Infrastructure & Pipeline Development:
  - Design, build, and maintain scalable data pipelines and ETL/ELT processes
  - Implement and optimize data workflows using Apache Airflow and Temporal.io
  - Develop real-time and batch data processing solutions
  - Ensure data quality, reliability, and performance across all data systems
- Database Management & Optimization:
  - Manage and optimize multiple database technologies, including PostgreSQL, Elasticsearch, Redis, StarRocks, and Cassandra
  - Implement database performance tuning, indexing strategies, and query optimization
  - Design and maintain database schemas and data models
  - Monitor database health and performance metrics, and lead capacity planning
- Infrastructure & DevOps:
  - Implement Infrastructure as Code (IaC) using tools like Terraform or Ansible
  - Manage Kubernetes clusters for containerized data services
  - Automate deployment, scaling, and monitoring of data infrastructure
  - Implement CI/CD pipelines for data applications and services
- Streaming & Messaging Systems:
  - Configure and maintain Apache Kafka clusters for real-time data streaming
  - Design event-driven architectures and data streaming patterns
  - Implement data ingestion and processing from various sources
Required Qualifications:
- Technical Skills:
  - Database Expertise: Strong knowledge of database internals, optimization techniques, and performance tuning
  - Core Databases: PostgreSQL, Elasticsearch, Redis, StarRocks, ScyllaDB, or Cassandra
  - Data Processing: Apache Airflow, Temporal.io, Apache Kafka
  - Infrastructure: Kubernetes, Docker, Infrastructure as Code
  - Programming: Python, SQL, or other scripting languages
  - DevOps: CI/CD, monitoring, logging, and automation tools
  - Monitoring Tools: Experience with Prometheus, Grafana, or the ELK Stack
- Experience & Background:
  - 3+ years of experience in data engineering, DevOps, or software development
  - Proven experience with database administration and optimization
  - Experience with data pipeline development and workflow orchestration
  - Knowledge of distributed systems and microservices architecture
- Soft Skills:
  - Strong problem-solving and analytical thinking
  - Excellent communication and collaboration skills
  - Ability to work in a fast-paced, agile environment
  - Continuous learning mindset and adaptability to new technologies
Preferred Qualifications:
- Experience with data warehousing and data lake architectures
- Knowledge of machine learning pipelines and MLOps
- Certifications in cloud platforms or database technologies
- Experience with data governance and security best practices