Pujan Dahal

Bhaktapur, Nepal

Summary

Data Engineer with two years of cloud data engineering experience, specializing in building and managing large-scale data pipelines with a focus on data extraction, processing, and storage. Proficient in data modeling and in tools such as Azure Data Factory, Microsoft Fabric, and AWS Glue for efficient data processing. Strong analytical and problem-solving skills; works well both independently and in collaborative teams to optimize workflows and ensure data accuracy.

Overview

2
years of professional experience
7
years of post-secondary education

Work History

Data engineer

Yirifi
Remote
04.2024 - Current
  • Data Ingestion: Designed and implemented a system for daily email ingestion into AWS S3.
  • Data Processing: Applied a medallion architecture in AWS, cleaning and processing emails with Spark and AWS Glue scripts for scalable data handling.
  • Prompt Engineering: Engineered prompts for LLMs to extract keywords and build lexicons from emails and web articles.
  • Data Storage: Worked with MongoDB for storing and managing extracted lexicon data.

Associate data engineer

Fusemachines Nepal
Kathmandu, Nepal
08.2022 - Current

For internal departments:

  • Data Integration: Built pipelines and scripts to ingest data from SQL databases, CRM, APIs, and Google Sheets into AWS S3.
  • Data Modeling and Architecture: Designed a medallion architecture for scalable data pipelines, ensuring efficient ETL and data integrity.
  • Data Processing: Used Spark with AWS Glue for data cleaning and transformation, and AWS Athena for complex querying.
  • Data Analysis: Created Apache Superset and Google Looker Studio dashboards for monitoring and insights.
  • Data Analysis: Designed metrics for the hiring process and CRM, reducing time and costs significantly.


For a government client:

  • Data Integration: Built pipelines to ingest SQL data into a Fabric Lakehouse using a medallion architecture and Fabric Notebooks.
  • Data Modeling: Created optimized data models based on existing client models to streamline API calls.
  • Data Quality: Developed a data quality framework covering completeness, consistency, and uniqueness, ensuring high data standards.
  • Real-Time Data Processing: Leveraged Fabric Eventstream for real-time API integration with Azure SQL Database CDC.
  • Expertise in Azure Data Lake Storage (ADLS), Azure Functions, Data Factory, Fabric Notebooks, KQL, Warehouse, and Lakehouse.

Education

Bachelor of Science - Computer Engineering

Institute of Engineering Pulchowk Campus
Lalitpur, Nepal
12.2017 - 08.2022

Certificate of Higher Education - Science

Kathmandu Model Higher Secondary School
Kathmandu, Nepal
07.2015 - 08.2017

Skills

  • SQL and databases
  • Python
  • Apache Spark
  • Azure Data Factory, Data Lake
  • Azure Functions
  • Microsoft Fabric
  • AWS S3, Lambda, Glue, Athena
  • Data Modeling
  • Data Pipeline Design
  • Data Quality Assurance
  • Git Version Control
  • API Design and Development
  • Big Data Analysis
  • Agile Methodology

Accomplishments

Recognition for Exceptional Performance and Motivation, Yirifi

Timeline

Data engineer

Yirifi
04.2024 - Current

Associate data engineer

Fusemachines Nepal
08.2022 - Current

Bachelor of Science - Computer Engineering

Institute of Engineering Pulchowk Campus
12.2017 - 08.2022

Certificate of Higher Education - Science

Kathmandu Model Higher Secondary School
07.2015 - 08.2017