
AI Engineer with less than a year in NLP, LLM & RAG Systems and Machine Learning
Second-year AI & Data Science undergraduate who has independently built and shipped three full NLP/ML systems alongside coursework, including a self-built 40,000-pair Sinhala transliteration corpus, a published open-source PyPI package, and a hybrid-retrieval legal assistant covering 100+ Sri Lankan legal documents. Comfortable across the full pipeline: data collection, model fine-tuning (T5, mT5, Gemma 3 with LoRA/PEFT), evaluation, and lightweight deployment. Seeking an internship to apply this foundation in a production ML/AI team.
Robert Gordon University
BSc (Hons) · Artificial Intelligence & Data Science
August 1, 2023 – June 30, 2028
Dual-Architecture Singlish-to-Sinhala Transliteration System & sin-transliterator PyPI Package
January 1, 2026 – June 1, 2026
Built a 40,000-pair Sinhala transliteration corpus from scratch (no pre-existing labelled data) and fine-tuned multiple transformer architectures (T5, mT5, Gemma 3); Gemma 3 with LoRA/PEFT achieved a character error rate (CER) of 0.16 and a word error rate (WER) of 0.31 on a held-out 2,549-row test set, outperforming widely used existing tools on the ad-hoc and code-mixed Singlish they typically fail to handle. Designed a two-stage inference pipeline (lightweight seq2seq model for real-time use, LLM refinement layer for ambiguous/code-mixed input), quantised to INT8 via CTranslate2 for CPU-only hosting, and published it as the sin-transliterator PyPI package with automatic CPU/GPU detection and versioned Hugging Face weights. Built a custom stochastic data augmenter to generate realistic ad-hoc Singlish variations and scraped code-mixed training examples from YouTube Live Chat, expanding the corpus beyond formal text and directly improving the model's robustness on real-world informal input.
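For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how CER and WER are commonly computed from Levenshtein edit distance. It is not the project's evaluation code: the function names are hypothetical, scores are aggregated at corpus level (total edits over total reference length), and characters are counted as Unicode code points, which is an assumption that matters for Sinhala, where a grapheme-cluster count would give different numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, tokenize):
    """Corpus-level error rate: total edits divided by total reference length."""
    edits = 0
    length = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units, hyp_units = tokenize(ref), tokenize(hyp)
        edits += edit_distance(ref_units, hyp_units)
        length += len(ref_units)
    return edits / length if length else 0.0


def cer(references, hypotheses):
    # Assumption: one unit per Unicode code point.
    return error_rate(references, hypotheses, list)


def wer(references, hypotheses):
    # Assumption: words are separated by whitespace.
    return error_rate(references, hypotheses, str.split)
```

Averaging per-sentence rates instead of pooling edits would be an equally reasonable convention and can produce slightly different figures on the same test set.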
MyLawLLM: Sri Lankan Legal RAG Assistant
January 1, 2026 – June 1, 2026
Built a hybrid retrieval pipeline combining dense vector search with BM25 keyword matching over 100+ Sri Lankan legal documents, pairing a plain-English explanation with the underlying legal basis so non-experts can get oriented on routine legal questions without consulting a lawyer first. Implemented end-to-end with a FastAPI backend, Qdrant Cloud for vector storage, and a lightweight web interface; BM25 re-ranking on top of pre-indexed vectors keeps response latency low. Designed the retrieval and prompting layer to cite the specific source document for every answer, so users can trace any explanation back to the original legal text rather than relying on an unverifiable summary.
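The BM25 re-ranking step described above can be illustrated with a small, dependency-free sketch. It assumes the vector store has already returned a shortlist of candidates with dense similarity scores, then scores the same shortlist with Okapi BM25 and blends the two after min-max normalisation. The function names, the `alpha` weight, the whitespace tokeniser, and the normalisation scheme are all illustrative assumptions, not the project's actual implementation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document for a whitespace-tokenised query."""
    tokenised = [d.lower().split() for d in docs]
    n = len(tokenised)
    avgdl = sum(len(t) for t in tokenised) / n if n else 0.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    scores = []
    for tokens in tokenised:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rerank(query, candidates, alpha=0.5):
    """Re-rank (doc_id, text, dense_score) candidates; returns ids, best first.

    alpha weights the dense score; (1 - alpha) weights BM25.
    """
    if not candidates:
        return []
    dense = min_max([c[2] for c in candidates])
    sparse = min_max(bm25_scores(query, [c[1] for c in candidates]))
    fused = [alpha * d + (1 - alpha) * s for d, s in zip(dense, sparse)]
    order = sorted(range(len(candidates)), key=lambda i: fused[i], reverse=True)
    return [candidates[i][0] for i in order]
```

Because BM25 runs only over the shortlist rather than the whole corpus, the extra cost per query stays small, which is consistent with the low-latency goal stated above.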
Customer Churn Prediction: Neural Network vs. Decision Tree Study
January 1, 2026 – June 1, 2026
Built and benchmarked a custom ANN against a Decision Tree on structured customer data, with a full evaluation suite covering confusion matrix, precision, recall, F1-score, and ROC-AUC; addressed class imbalance using random oversampling and SMOTE so both models learned from minority churn cases. Extracted human-readable decision rules from the tree model to surface the strongest churn predictors, giving a non-technical stakeholder a clear basis for prioritising retention efforts alongside the neural network's performance metrics. Compared model performance across the full evaluation suite to recommend which model fits which use case, balancing the decision tree's interpretability against the ANN's stronger raw predictive accuracy.
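As a rough illustration of two pieces mentioned above, the sketch below shows random oversampling of the minority class and the precision, recall, and F1 figures derived from a binary confusion matrix. It uses only the standard library; the function names and the label convention (1 = churn) are assumptions for the example, and SMOTE, which synthesises new minority points rather than duplicating existing ones, is not shown.

```python
import random


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until the classes are balanced."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    if not minority_idx:
        return list(rows), list(labels)
    shortfall = len(majority_idx) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(shortfall)]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts plus precision, recall, and F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}
```

Oversampling is normally applied to the training split only, so that duplicated churn cases do not leak into the data used to compute these metrics.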
Fine-Tuning Large Language Models
Hugging Face
January 1, 2025 – Present
Professional Certificate in Machine Learning
IIT PDU
January 1, 2025 – Present
Supervised Machine Learning: Regression & Classification
ULSA
January 1, 2025 – Present
Cultural Fit Analysis
The candidate's personal projects demonstrate a strong passion for AI/ML and a self-driven learning approach, which aligns well with an innovative and growth-oriented culture. The open-source contributions (PyPI package, GitHub projects) indicate a collaborative spirit and willingness to share knowledge. The diversity of projects, from transliteration to legal RAG and churn prediction, shows a broad interest in applying AI to different domains. However, the lack of team-based project experience or professional roles makes it difficult to fully assess cultural fit in a corporate environment.
Soft Skills & Operational Fit
The candidate's project descriptions indicate strong problem-solving skills, particularly in addressing real-world challenges like ad-hoc Singlish variations and legal document retrieval. The ability to work independently on complex projects and publish open-source packages suggests initiative and a proactive approach. The focus on interpretability and user-centric design (e.g., citing sources in MyLawLLM) points to a thoughtful and practical mindset. However, without direct work experience, operational fit in a team setting and stress handling are not directly assessable from the provided data.