As a Big Data Developer, you will work on data ingestion activities that bring large volumes of data into our Big Data Lake. You will play a vital role in building new data pipelines from various structured and unstructured sources into Hadoop, and will work closely with data consumers and source owners to create the foundation for data analytics and machine learning activities.
Duties and responsibilities
Identify data ingestion patterns and build frameworks to efficiently ingest data into our Data Lake
Tune the performance of ingestion jobs to improve throughput
Improve the CI/CD process by automating the build, test, and deployment framework
Build high-performance algorithms, prototypes, and proofs of concept
Research opportunities for data acquisition and new uses for existing data
Develop data set processes for data modeling, mining and production
Integrate new data management technologies and software engineering tools into existing framework
Work with in-memory database tools such as Redis and Riak
Collaborate with data architects, modelers and IT team members on project goals
Job specification
Education
Bachelor's degree in Computer Systems Engineering or Computer Science.
Experience
6+ years of data engineering experience building data pipelines and systems.
Experience working with Hadoop-ecosystem technologies such as Spark
Experience working with data flow tools such as NiFi and Airflow
Prior experience working with cloud platforms such as GCP or AWS is a plus
Prior experience with relational databases such as MySQL
Knowledge and understanding of SDLC and Agile/Scrum procedures, CI/CD and Automation is required
Experience with container technologies such as Docker or Kubernetes is a plus
Ability to write SQL queries and use tools such as Hadoop, Tableau, QlikView, and other data reporting tools. Experience in transactional and data warehouse environments using MySQL, Hive, or other database systems. A deep understanding of joins, subqueries, and window functions is required.
Strong background in designing relational databases such as PostgreSQL and NoSQL databases such as MongoDB or Cassandra
Skills and abilities
Strong ability to drive complex technical solutions deployed at an enterprise level; ability to drive big data technology adoption and changes through education and partnership with stakeholders
Demonstrated experience working with vendors and user communities to research and test new technologies that enhance the technical capabilities of the existing Hadoop cluster
Ability to negotiate, resolve, and prioritize complex issues; explain difficult issues to others; assess alternatives; and implement long-term solutions