Data Engineer
I am a Data Engineer with expertise in designing and implementing scalable cloud data pipelines, ETL automation, and data infrastructure optimization. My work focuses on building reliable data systems that enable data-driven decision making across healthcare, e-commerce, and fintech domains.
I hold a Bachelor of Engineering in Computer Engineering from Tribhuvan University and am currently based in Kathmandu, Nepal.
As a data engineering professional, I specialize in building scalable data infrastructure and implementing efficient data solutions for enterprise environments. My expertise spans cloud data engineering, large-scale data processing, and machine learning operations, with a focus on delivering measurable business value through robust, production-ready systems.
In more than five years of professional data engineering experience, I have worked on diverse projects spanning healthcare data systems, large-scale analytics platforms, financial services infrastructure, and machine learning applications.
Developed a comprehensive sentiment analysis system for Twitter data using advanced natural language processing techniques. Implemented Word2Vec for feature extraction and XGBoost for classification, achieving high accuracy in sentiment prediction.
Major research project analyzing big data for product bundle recommendations. Utilized PySpark for distributed processing, Word2Vec for text embedding, and K-means clustering combined with bi-gram frequency analysis to generate intelligent product recommendations.
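As a rough illustration of the bi-gram frequency step mentioned above, here is a minimal single-machine sketch in plain Python. It is not the project's PySpark code: the `bigram_counts` helper and the sample product titles are hypothetical and exist only to show what counting adjacent word pairs looks like.

```python
from collections import Counter


def bigram_counts(titles):
    """Count adjacent word pairs (bi-grams) across a list of text strings."""
    counts = Counter()
    for title in titles:
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        tokens = title.lower().split()
        # zip pairs each token with the one that follows it.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical product titles for illustration only; not data from the project.
titles = ["wireless mouse pad", "wireless mouse", "gaming mouse pad"]
counts = bigram_counts(titles)
```

Pairs that occur often, such as `("wireless", "mouse")` in this toy input, are the kind of signal that can be combined with cluster assignments when suggesting bundles.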
Applied SARIMA (Seasonal AutoRegressive Integrated Moving Average) modeling for passenger count forecasting. Conducted comprehensive time series analysis, achieving an RMSE of 68.132 and demonstrating proficiency in statistical modeling and predictive analytics.
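For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the average squared difference between forecast and actual values, expressed in the same units as the data. The sketch below is a minimal plain-Python version; the `rmse` helper and the sample numbers are hypothetical and are not taken from the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only; not data from the project.
actual = [100.0, 120.0, 140.0]
forecast = [110.0, 115.0, 150.0]
error = rmse(actual, forecast)  # sqrt((100 + 25 + 100) / 3) = sqrt(75)
```

Because RMSE carries the units of the target variable, a value is only meaningful relative to the scale of the series being forecast.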
Developed a weather prediction system using SARIMA modeling with comprehensive data analysis and visualization. Achieved high accuracy, with an RMSE of 2.19, demonstrating strong capability in time series analysis and meteorological data processing.
I am interested in opportunities involving data engineering, machine learning systems, and scalable data infrastructure. I am open to relocation for the right opportunity in Canada, the USA, or Europe.