onsite
Senior Data Scientist
Data Scientist
As a Senior Data Scientist at Pegasystems, you will be instrumental in developing and fine-tuning NLP & NLG services for chat, voice-to-text, and email bots using deep learning models, particularly transformer-based models like GPT. You will research, implement, and prototype text analytics solutions, working closely with engineering, UI, and support teams to integrate machine learning models and improve their performance for Pega's diverse customer base.
About the role
Meet Our Team
Pega’s Machine Learning tribe develops NLP & NLG services that provide decisions for chat, voice-to-text, and email bots using models built on customer data. We work on deep learning models, training and fine-tuning transformer-based models like GPT to solve analytics and generative AI use cases. Multiple teams are heavily engaged in building Development Studios that make it easier for Data Scientists and Business Analysts to integrate Machine Learning models into their business use cases and help them build, monitor, update, and simulate new models.
Picture Yourself At Pega
- Pega’s Machine Learning and NLP are deployed in thousands of organizations (including major Fortune 500 companies) to solve a range of use cases, from chat and email bots to intelligent automation and marketing ad display
- With the advent of Generative AI like ChatGPT, we are not only connecting and fine-tuning models for our use cases but also hosting transformer models in-house. In this role, you will:
- Keep track of GPT developments and suggest novel approaches to solve Pega use cases
- Test and fine-tune open-source GPT models that can be used across hundreds of Pega customers
- Engage in understanding new techniques from NLP
- Work with the engineering team to implement your Machine Learning approaches which will be deployed through our machine learning services
- Work with the UI and support teams to help business users better understand the behavior of Pega’s machine learning, and recommend solutions that can improve model performance
What You'll Do At Pega
- Understanding new business requirements related to text analytics and researching the necessary tools and technologies (Python)
- Understand transformer models landscape including GPT
- Ability to fine-tune transformer models from Hugging Face with client data
- Knowledge of prompt engineering with common models like ChatGPT etc.
- Knowledge of Java is preferred but not mandatory
- Identifying gaps in the technology space and providing solutions to fill them
- Keeping abreast of newer technologies in the NLP space
- Integrating support for additional languages
- Implementation of text analytics algorithms (Machine Learning / Deep Learning / Shallow Learning)
- Researching feature engineering and bringing in best practices
- Prototyping of solutions and benchmarking for performance and accuracy (Boosting / Bagging / Treatments)
Who You Are
- 4 years of relevant data science experience
- Strong knowledge of machine learning fundamentals
- Strong knowledge of Python and Python based ML libraries
- Knowledge of non-NLP data science algorithms is preferred
- Good understanding and experience in text analytics including linguistic and statistical algorithms
- Strong proficiency in mathematics and statistics
- Experience in using algorithms like Naïve Bayes, Maximum Entropy, SVM, Logistic Regression, Neural Networks, Transformers, GPT, BERT etc.
- Experience in semantic technologies
- Experience in building/using rules engines using algorithms like RETE
- Good communication skills
What You've Accomplished
- Leading the research and development of core algorithms in Text Analytics using Linguistic, Statistical and Semantic technologies
- Building domain-specific NLP models for adoption in the Financial / Telco / Health Science space
- Building generic NLP models that can sense social / business intents
- Dabbled in ChatGPT and consider yourself an expert in prompt engineering