I focus on building data pipelines on managed AWS infrastructure, covering a broad range of data engineering topics.
Examples of my past work:
Design and implementation of data pipelines
- Fetch data from a REST API, transform it using AWS Glue, and store the result in AWS S3
- Build complex data transformation jobs using Apache Spark on AWS EMR
- Develop and integrate an AWS Lambda function to transform data on upload
Cloud-based data warehouse implementations
- Data warehouse implementation based on the AWS Glue Data Catalog, Lake Formation, S3, and Athena
Design and implementation of data-driven services
- Implement a real-time analytics dashboard using AWS Kinesis and Spark Streaming
Workshops on cloud and data architecture, process design
- Using Amazon Managed Workflows for Apache Airflow (MWAA) to schedule and manage various data pipelines
- Data Management / Data Lineage / Data Governance
- Strategies and best practices in developing a cloud-based data warehouse
How I work
My approach to assessing a new project is as follows:
- Define Project Objectives and Requirements
- Assess Data Sources
- Cost Analysis and Technical Feasibility
- Timeline and Milestones
- Action Plan
The timeline and milestones provide an overview of how the project will be executed according to the action plan.
Deliverables
On project completion, you will receive all agreed-upon deliverables, consisting of:
- A repository containing all code and assets required to deploy and run the developed software
- A fully automated CI/CD pipeline that provisions the infrastructure, tests and deploys the code
- A well-documented code base covering usage, maintenance and extension of the deliverables
Next Steps
If you are thinking about your data engineering needs and are looking for support,
feel free to contact me by phone (+49 30 4193 6978) or email (inquiry@crichter.io).
I offer free consulting time to discuss your needs and brainstorm viable solutions.
You don't have to book me; feel free to use our discussion as inspiration for your own work.
Book meeting