Saish's Page

Saish Desai

GitHub LinkedIn Kaggle

👨‍💻 About Me

AI Software Engineer holding a Master's in Information Management (Data Science & Analytics) from the University of Illinois at Urbana-Champaign. Experienced in building RAG pipelines, vector indexes, and LLM-powered applications on GCP and AWS. Key skills include NLP, Prompt Engineering, Multimodal Data Ingestion, Machine Learning, Deep Learning, and Production-grade Data Pipelines, using Python, PyTorch, TensorFlow, SQL, and Cloud-native tools. Passionate about designing and productionizing AI systems that enable semantic search, retrieval, and agentic applications.

🚀 LLM Journey

I have gained hands-on experience with RAG (Retrieval-Augmented Generation), Prompt Engineering, and vector-based retrieval systems, building 50+ vector indexes and 20+ ETL pipelines using Vertex AI and Elasticsearch to power enterprise-scale conversational applications and agentic AI solutions. Through my academic work, I developed expertise in language modeling, neural machine translation, text summarization, and dialogue systems, implementing projects with PyTorch, TensorFlow, Hugging Face, NLTK, spaCy, and LangChain, including a medical chatbot for dental queries. I have also conducted RAG application testing, developed multimodal data ingestion frameworks, and co-authored research on dialogue system architectures, continuously enhancing pipelines with state-of-the-art techniques to improve retrieval performance and response quality.


🤖 Generative AI Experience


🛠 Technical Skills

Domain | Skills
Programming | Python (PyTorch, TensorFlow, pandas, scikit-learn, NLTK, spaCy, NumPy, Matplotlib, pymoo), R, SQL, PySpark
Tools & Platforms | Jupyter Notebook, RStudio, SQLite, PostgreSQL, Git, Tableau, HuggingFace, Airflow, Doccano, Hadoop, Spark, Weights & Biases, LangChain, OpenAI, Ollama
AWS Cloud | S3, Glue, Lambda, Athena, EC2, SageMaker, Bedrock
GCP Cloud | Cloud Storage, BigQuery, Cloud Run, Vertex AI, Cloud Composer

🔬 Research

Assessment of NER Tools for detecting Funding Organizations (Information Quality Lab, UIUC)

Named Entity Recognition (NER) is a key element of the information extraction stage in the Natural Language Processing (NLP) pipeline. NER helps discover valuable insights from textual documents by detecting entities mentioned in unstructured text and classifying them into predefined categories such as person, organization, location, and date. In recent years, the NLP community has developed several NER tools for detecting entities such as organization names. Research funders play an important role in science, yet it is not well understood how accurately NER tools identify sponsors in research funding acknowledgement statements. My research explores how well existing NER tools recognize funding organizations. Specifically, the most common existing NER tools were evaluated to identify scenarios that need improvement, which will enable new research on Named Entity Recognition in the research funding field.
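As a rough illustration of what evaluating an NER tool on funding organizations can involve, the sketch below scores a tool's predicted organization names against hand-labeled ones using precision, recall, and F1. It is not the project's actual evaluation code: the function names, the exact-match-after-normalization rule, and the sample acknowledgement data are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are not counted as errors."""
    return " ".join(name.lower().split())


def score_organizations(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted vs. gold organization names.

    Assumption: a prediction counts as correct only if it matches a gold
    name exactly after normalization (no partial-match credit).
    """
    pred_set = {normalize(p) for p in predicted}
    gold_set = {normalize(g) for g in gold}
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: gold labels for one acknowledgement statement
# and the organizations some NER tool returned for it.
gold = ["National Science Foundation", "National Institutes of Health"]
predicted = ["national science foundation", "University of Illinois"]

precision, recall, f1 = score_organizations(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; a real comparison might also give credit for partial overlaps or acronym matches (for example "NSF"), which would change the scores.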

🔗 View Repository


🏗 Projects

🧠 NLP and LLM Projects

📈 Forecasting and Supply Chain Projects

📊 Data Science Projects


📚 Literature Reviews