Over the past few years, I have been involved in multiple data science and data engineering projects across different industries in the EMEA region, including manufacturing, banking, energy, retail, telecom, gaming, and media. I provide hands-on implementation, architecture design, technical leadership, and advisory services. Below is a list of projects I delivered, where [Prod] marks an end-to-end solution deployed in production.

Data Science and Machine Learning Engineering
  • [Banking][Prod] Next Best Action Model Migration and Productization (MLOps on AWS)
  • [Banking][Prod] Personally Identifiable Information Detection (NLP, classification, Spark, Airflow)
  • [Banking][Prod] Company Similarity Detection (graph network embedding, similarity computation, Airflow)
  • [Banking][Prod] Large Scale Fuzzy Entity Matching (Spark, NLP, similarity computation, Airflow)
  • [Banking][Prod] Customer Segment Leads Detection (classification, Spark, name matching)
  • [Banking][Prod] Mortgage Arrears Repayment Classification (imbalanced learning, Random Forest)
  • [Banking] Company Sales Prediction (time series forecasting, Seq2Seq)
  • [Manufacturing][Prod] Instagram Influencer Sales and Trend Prediction (classification, computer vision)
  • [Manufacturing] Pill Image Classification on IoT Device (IoT, image classification)
  • [Energy][Prod] Dirty Car and Smoker Detection at Petrol Stations (IoT, image classification)
  • [Energy] Broken Device Detection on Electricity Transmission Network (object detection, multi-GPU)
  • [Telecom] Real-time Streaming Data Ingestion and Personalized Recommendation PoC (Kafka, Glue, S3, DynamoDB, Lambda, Amazon Personalize)
  • [Telecom][Prod] Data Science Model Platform (SageMaker, Glue, Step Functions, Lambda, API Gateway)
  • [HR] Job Category Classification from 500K Job Descriptions (NLP classification, multi-class multi-label)
  • [Environment] Predictive Maintenance on Dike Sensor Data (time series clustering, Dynamic Time Warping)
Data Engineering
  • [Energy][Prod] Customer Churn Prediction Model Data Pipeline (Spark, S3, Glue, SageMaker, AWS security)
  • [Lottery][Prod] Data Platform (AWS S3, Redshift, DMS, Glue, CloudWatch)
  • [Banking][Prod] Company Financial Insight Dashboard (data pipeline, Spark, front-end, back-end, name matching, dashboarding)
  • [Banking] Data Lake Proof of Concept (AWS data lake, S3, Glue, Kafka, DynamoDB, Elasticsearch)
  • [Media] Event-driven Automated Oracle Database Migration (AWS DMS, Lambda, CloudFormation)