AI Engineer with 2+ years in Machine Learning & NLP Development
Machine learning researcher and AI systems developer with a BSc in Computer Science and Engineering (2022). I have hands-on experience building end-to-end ML pipelines, conversational AI systems (RAG, intent classification, STT), and NLP applications. I am an active researcher with one paper published on arXiv, one under review at a Springer Q1 journal, and an ongoing Bangla-language voice assistant project targeting digitally low-literate users in Bangladesh. I am familiar with JavaScript and am actively building Node.js proficiency to work within enterprise low-code AI environments.
Metropolitan University
BSc · Computer Science & Engineering
January 1, 2018 – January 1, 2022
OpenGenus Foundation
Research Scientist, Intern
January 1, 2022 – December 31, 2023
India
Bangla Voice Assistant for Low-Literacy Users
January 1, 2025 – Present
Designed a voice-driven assistant for digitally low-literate Bengali speakers, motivated by direct observation of the smartphone usability barriers faced by family members, a problem underserved by existing solutions. Completed: a purpose-built intent classification dataset of 1,228 annotated samples across 10 smartphone-navigation intents (call, messaging, camera, settings, etc.), designed for a constrained low-resource language environment. Architecture: server-side STT using BanglaSpeech2Text, a lightweight intent classifier, and a document-based intent-to-response mapping for the 10 fixed intents, chosen over LLM-based responses for reliability, since this user group has no tolerance for hallucination. Next phases: full pipeline integration, latency optimization, commercial release, and academic publication.
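A minimal sketch of how a fixed intent-to-response mapping of this kind might look. The intent labels, response strings, and the confidence threshold are all hypothetical placeholders, not taken from the project; the point is only that every reply comes from a closed table rather than from a generative model.

```python
from typing import Dict

# Hypothetical subset of the 10 intents; labels and replies are placeholders.
RESPONSES: Dict[str, str] = {
    "call": "Opening the phone dialer.",
    "messaging": "Opening messages.",
    "camera": "Opening the camera.",
    "settings": "Opening settings.",
}

# Single safe reply used whenever the classifier output cannot be trusted.
FALLBACK = "Sorry, I did not understand. Please try again."


def respond(intent: str, confidence: float, threshold: float = 0.6) -> str:
    """Return the fixed reply for a known, confident intent, else the fallback.

    The threshold value is an assumption chosen for illustration.
    """
    if confidence < threshold:
        return FALLBACK
    return RESPONSES.get(intent, FALLBACK)
```

Because the output space is a closed table, the assistant can only ever say one of a known set of sentences, which is the reliability property the design relies on.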
ChefBot - RAG-Powered Cooking Assistant
January 1, 2025 – Present
Built a full Retrieval-Augmented Generation (RAG) pipeline that ingests multiple PDF cookbooks, creates vector embeddings using all-mpnet-base-v2 (selected for semantic richness over smaller alternatives), and retrieves context via cosine similarity in ChromaDB before generating responses via LLaMA 3 through Groq API. Addressed LLM hallucination by engineering system prompts to strictly constrain responses to retrieved cookbook context only — iterated through multiple prompt versions, diagnosing failures by inspecting embedding quality, chunk sizing, and similarity scores. Solved deployment constraint by migrating from local Ollama to Groq API for cloud hosting on HuggingFace Spaces, requiring architectural refactoring of the inference layer. Implemented streaming Gradio interface for responsive UX; project is live at huggingface.co/spaces/Munfa007/chef-bot.
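The retrieval and prompt-constraint steps can be illustrated with a small standard-library sketch. It does not use ChromaDB, all-mpnet-base-v2, or the Groq API; the vectors are toy values, and the function names and prompt wording are assumptions made for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Score every (text, vector) chunk against the query and keep the k best."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using ONLY the cookbook context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Inspecting the scores returned by `top_k` is one way to diagnose the failures described above: low scores on the best chunks point at embedding or chunk-size problems rather than at the prompt.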
Air Quality & Health Risk Assessment System
January 1, 2025 – Present
Built an end-to-end ML system: user inputs a city name → system resolves coordinates from local reference data → calls external weather/pollutant APIs → trained ML model (XGBoost) predicts AQI → result stored in PostgreSQL → dashboard visualizes trends over time. Primary engineering challenge was designing the FastAPI pipeline so all five stages (coordinate resolution, API fetching, inference, database write, retrieval) executed sequentially and reliably — required careful async handling and error recovery to prevent silent failures. Configured PostgreSQL on Neon with schema for time-series AQI storage; managed deployment on Render including dependency resolution challenges with requirements.txt for cloud environment. Built two visualization layers: interactive Streamlit dashboard (live at air-quality-health-risk-1.onrender.com) and Looker Studio dashboard showing AQI trends, health risk distributions, and city-level comparisons.
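A minimal sketch of the sequential five-stage idea using only asyncio. Every stage below is a stub with placeholder data (a hypothetical city, a fixed pollutant reading, a trivial formula instead of XGBoost, an in-memory list instead of PostgreSQL); the names `run_pipeline` and `StageError` are assumptions, not the project's code.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Awaitable[Any]]]


class StageError(RuntimeError):
    """Raised with the failing stage's name so no failure passes silently."""


async def run_pipeline(stages: List[Stage], payload: Any, retries: int = 1) -> Any:
    """Run stages strictly in order, retrying each one before giving up."""
    for name, stage in stages:
        for attempt in range(retries + 1):
            try:
                payload = await stage(payload)
                break
            except Exception as exc:
                if attempt == retries:
                    raise StageError(f"stage '{name}' failed: {exc}") from exc
    return payload


_DB: List[dict] = []  # stand-in for the PostgreSQL table


async def resolve_coords(city: str) -> dict:
    coords = {"Testville": (0.0, 0.0)}  # placeholder reference data
    return {"city": city, "coords": coords[city]}


async def fetch_pollutants(record: dict) -> dict:
    return {**record, "pm25": 55.0}  # placeholder for the external API call


async def predict_aqi(record: dict) -> dict:
    return {**record, "aqi": round(record["pm25"] * 2.0)}  # not a real model


async def store(record: dict) -> dict:
    _DB.append(record)
    return record


async def retrieve(_: dict) -> dict:
    return _DB[-1]


STAGES: List[Stage] = [
    ("resolve_coords", resolve_coords),
    ("fetch_pollutants", fetch_pollutants),
    ("predict_aqi", predict_aqi),
    ("store", store),
    ("retrieve", retrieve),
]
```

Calling `asyncio.run(run_pipeline(STAGES, "Testville"))` walks all five stages in order, while an unknown city surfaces as a `StageError` naming `resolve_coords` instead of failing quietly downstream.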
Semantic Movie Search Engine
January 1, 2025 – Present
Built a natural language movie search engine over ~3,000 films (2000–2025) fetched from the TMDB API, supporting queries like 'psychological thriller set in Japan' rather than keyword matching. Pipeline: TMDB API data collection → preprocessing (title + genre + overview concatenation) → embedding with all-MiniLM-L6-v2 → FAISS index for fast similarity search → Streamlit UI displaying matched results with poster images. Key challenge: genre data from TMDB is returned as integer IDs, not strings — built a genre ID-to-name mapping layer to make genre information usable in semantic search strings. Selected all-MiniLM-L6-v2 after evaluating multiple HuggingFace embedding models for the balance of semantic quality and inference speed suitable for this retrieval task.
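The genre-mapping and concatenation step might look roughly like the sketch below. The three-entry genre table is an illustrative subset (a real mapping would be loaded from TMDB's genre list), and the function name and the ` | ` separator are assumptions.

```python
from typing import Dict, List

# Illustrative subset only; the full ID-to-name table would come from TMDB.
GENRE_NAMES: Dict[int, str] = {18: "Drama", 53: "Thriller", 9648: "Mystery"}


def build_search_text(movie: dict, genre_names: Dict[int, str] = GENRE_NAMES) -> str:
    """Concatenate title, genre names, and overview into one string to embed.

    Unknown genre IDs are skipped rather than raising, and empty fields are dropped.
    """
    names: List[str] = [
        genre_names[gid] for gid in movie.get("genre_ids", []) if gid in genre_names
    ]
    parts = [movie.get("title", ""), ", ".join(names), movie.get("overview", "")]
    return " | ".join(part for part in parts if part)
```

For example, a record with `genre_ids` of `[53, 9648]` contributes the words "Thriller, Mystery" to the embedded text, which is what lets a query such as 'psychological thriller set in Japan' match on genre at all.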
Multi-Model News Sentiment Analysis
January 1, 2025 – Present
Designed a comparative NLP pipeline applying four sentiment models (TextBlob, VADER, DistilBERT SST-2, RoBERTa emotion classifier) to the same news dataset to analyze how different architectures interpret the same text. Key technical challenge: running the transformer models (DistilBERT, RoBERTa) one article at a time through HuggingFace pipelines was inefficient, which required implementing batched inference and then re-aligning batch outputs back to individual article rows to match the row-by-row results from TextBlob and VADER. Fetched live news via NewsAPI, stored results in CSV, and produced visualizations showing label distribution per model and weekly sentiment trend lines, revealing meaningful divergence between rule-based and transformer-based approaches on the same corpus.
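A minimal sketch of batching followed by re-alignment. The classifier here is a keyword stub standing in for a transformer model, and the helper names and batch size are assumptions; the part that carries over is keeping outputs in input order so that label `i` always belongs to article row `i`.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def classify_in_batches(
    texts: Sequence[str],
    classify_batch: Callable[[Sequence[str]], List[str]],
    batch_size: int = 16,
) -> List[str]:
    """Run a batch classifier and return one label per input, in input order."""
    labels: List[str] = []
    for batch in batches(texts, batch_size):
        out = classify_batch(batch)
        if len(out) != len(batch):
            # A length mismatch would silently shift every later row.
            raise ValueError("classifier returned a different number of labels")
        labels.extend(out)
    return labels


def stub_classifier(batch: Sequence[str]) -> List[str]:
    """Keyword stand-in for a real model; for illustration only."""
    return ["POSITIVE" if "good" in text.lower() else "NEGATIVE" for text in batch]
```

The resulting list can then be written as a new column next to the TextBlob and VADER results, since its length and order match the article rows exactly.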
Springer Q1 paper under review
Springer
June 1, 2026 – Present
A Dual Pipeline ML Framework for Sleep Disorder Screening
arXiv
January 1, 2026 – Present
Cultural Fit Analysis
The candidate's project portfolio is diverse, covering NLP, RAG, traditional ML, and conversational AI, which aligns well with the broad scope often found in AI engineering roles. Their proactive engagement in personal projects and research, including publications, demonstrates a strong passion for AI and continuous learning, which is a positive cultural indicator. The focus on real-world problems (low-literacy users, cooking assistant) and end-to-end system development suggests a results-oriented mindset. The experience with several deployment and hosting platforms (HuggingFace Spaces, Render, Neon) indicates flexibility and a willingness to adapt to different operational environments.
Soft Skills & Operational Fit
The candidate demonstrates strong problem-solving skills, evidenced by their detailed descriptions of challenges and solutions in their projects (e.g., async handling in FastAPI, batched inference for transformers, LLM hallucination mitigation). Their solo developer roles on multiple complex projects suggest high autonomy and initiative. The research background and technical writing experience indicate strong analytical and communication skills, crucial for documenting and explaining complex AI systems. The focus on user-centric design (low-literacy users) and responsive UX (Gradio streaming) points to a user-aware and practical approach to development.