Welcome to my GitHub page! Here, you'll find a variety of open source projects that reflect my passion for collaborative development and innovation.
IBM
Java Backend developer
June 12, 2026 – Present
Preprocessing-Unstructured-Data-for-LLM-Applications-RAG
December 3, 2024 – December 3, 2024
A project for preprocessing unstructured data for LLM applications (RAG). The README covers the key features, dependencies, usage, and instructions for working with PDFs containing text, images, and tables.
loRA-LLM-Fine-tuning
December 3, 2024 – December 3, 2024
This repository contains a step-by-step guide to fine-tuning Large Language Models (LLMs) using Low-Rank Adaptation (LoRA). The goal is to train and evaluate domain-specific LLMs with a centralized approach, leveraging the robust ChatGLM2-6B base model and medAlpaca dataset for practical examples.
Rag-Milvus-Spark
November 8, 2024 – November 8, 2024
Uses Apache Spark to ingest, chunk, and embed data, then store it in a vector DB.
Scala-Akka-CRUD
August 7, 2024 – August 7, 2024
Run an Akka HTTP server and make API requests; you can use the sample data to test the CRUD operations.
Linear-Regression-Pyspark
June 13, 2023 – August 6, 2024
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression.
Pyspark-Streaming-Bank-Analysis
April 17, 2023 – August 6, 2024
This project marks our initial venture into Apache Spark. We focused on online banking analysis by leveraging various datasets from Kaggle, including loans, customer credit cards, and transactions. After downloading these datasets, we cleaned the data and utilized tools and technologies like Spark, HDFS, and Hive to execute various use cases.
PySpark-Dataframe-Guide
April 15, 2023 – August 6, 2024
PySpark DataFrame Complete Guide (with COVID-19 dataset). Spark is one of the most used tools for working with big data. While Spark once relied heavily on RDD manipulations, it now provides a DataFrame API for data scientists to work with.
PySpark-SQL
April 13, 2023 – August 6, 2024
Apache Spark is a cluster computing system that offers comprehensive libraries and APIs for developers. SparkSQL is a module in Apache Spark for processing structured data with the help of the DataFrame API.
ETL-Spark-Scala
March 20, 2023 – August 6, 2024
This script analyzes movie data and demonstrates how to use Spark to process and analyze large datasets.
Hadoop-Hive-SQL-Scala
March 20, 2023 – August 6, 2024
This repository contains resources and examples for working with Hadoop, Hive, and Spark, with a focus on SQL operations and Scala-based implementations. It also covers the basics of Spark job submission and scheduling with Cron.
Cultural Fit Analysis
The candidate's projects are heavily focused on data engineering, machine learning, and big data, primarily using Python and Scala. While there is a stated target role of 'Java Backend developer' and an 'IBM Java Backend developer' experience entry, the project portfolio does not align with core Java backend development. This indicates a potential mismatch with the target role's primary technology stack, suggesting a lower cultural fit for a dedicated Java backend position without further evidence of Java proficiency.
Soft Skills & Operational Fit
Insufficient data to assess soft skills and operational fit. The candidate's project descriptions indicate an ability to work with complex technical topics, but no direct evidence of collaboration, problem-solving under pressure, or communication in a team setting is available.