I'm Nicola, a Data Engineer living in Berlin.

My knowledge spans several programming languages, but I work mostly with Python and SQL to implement ETL processes that, depending on the amount of data, can run on a single worker or on a Spark cluster with many nodes. This year I have focused on distributed systems for processing large amounts of data, e.g. Spark, as well as distributed storage and SQL engines like Presto or Elasticsearch. I’m a big fan of AWS and all its services: why maintain a Kafka cluster with ZooKeeper if you can easily set up a Kinesis stream with a flexible number of shards based on the number of producers? I like to maintain AWS infrastructure using tools like CloudFormation or Terraform. This is not simply writing config files; it involves tested and proven software development practices, e.g. version control, testing, small deployments and design patterns.

Experience

Data Engineer

Company: 8fit
Period: Dec 2017 - Now

Data Engineer

Company: Babbel
Period: Oct 2015 - Nov 2017

  • Design, write, build, test and maintain interactive web reports
  • Design, operate and scale the platform infrastructure and data products for different internal customers
  • Design, build, test and maintain data-processing pipelines with a mind toward accuracy, scalability, high performance and data quality
  • Investigate next-generation data and analytics technologies to expand the capacity and performance of Babbel’s stack
  • Lead and participate in cross-functional projects that support data applications and reporting

Technologies: SQL, Python (pandas, pytest), PySpark, JavaScript (Node.js, Angular, d3.js, Highcharts, Mocha, Chai), Bash, Terraform
AWS: EC2, EMR (Hadoop, Spark, Presto), RDS, Kinesis, Firehose, DynamoDB, S3, Lambda, IAM, VPC, CloudFormation, CodeDeploy
CI: Travis CI

Junior Business Intelligence Engineer

Company: eng.it
Period: Feb 2014 - Sep 2015

  • Data warehouse (DWH) design using a dimensional modeling approach (Kimball)
  • Implement ETL (Extract, Transform, Load) procedures to populate DWH tables from OLTP systems
  • Design, implement and monitor KPIs related to business processes and environmental metrics
  • Implement and maintain interactive web dashboards and reports, using the business intelligence suite SpagoBI
  • Administer, maintain and scale SpagoBI in production

Technologies: SQL, PL/SQL, Java (Spring, Hibernate, PrimeFaces), JavaScript (jQuery, Highcharts)

Student Software Developer

Company: TU Berlin/T-Labs
Period: May 2013 - Aug 2013

  • Implementation of an extension to a web-based data visualization tool for IPTV QoE reports
  • KQI (Key Quality Indicator) design and implementation
  • KQI visualization for specific days and for specific locations

Technologies: Python, SQL, JavaScript (d3.js, Mapbox), HTML, CSS, Bash