Senior AI/ML Manager @ Apple
I am a Senior Applied Research Manager with 8 years of management experience and over 10 years of industry experience. I work on the Siri and Search team at Apple, building Apple Intelligence AI features that power natural interactions across Siri, Spotlight and Safari.
My research interests lie at the intersection of Generative AI, Question Answering, Dialog Systems and Knowledge Graphs. I have 30+ patents and 20+ publications with 1000+ citations in top-tier research conferences such as ACL, NAACL, EMNLP and AAAI.
Education
M.S. in Intelligent Information Systems
Advisor: Jamie Callan
B.E. in Computer Engineering
Work Experience
Oct 2022 - Present
I lead the applied research team for Knowledge Graph Machine Learning, where we build features that power interactions across Siri, Safari and Spotlight Search. I currently work on Question Answering, Semantic Annotation, Entity Linking and Knowledge Graphs.
Feb 2015 - Sep 2022
Manager and lead for the applied research team that designed and developed algorithms for IBM's conversational AI product, Watson Assistant. I worked on the Natural Language Understanding components of Watson Assistant, which include intent classification, entity recognition, spellcheck and irrelevance detection across multiple languages. The algorithms are designed to be custom-trained for customers globally and are deployed at scale, with hundreds of thousands of models in production serving more than 1.9% of the world’s population every month.
May 2014 - Aug 2014
My work used Distributional Semantics to improve the Watson Question Answering system through query expansion, synonym generation and question classification.
2012 - 2013
Assistant System Engineer,
Selected Honors and Awards
2024
Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
2023
Special Jury Recognition
2022
https://www.technologyreview.com/
2022
VentureBeat
2022
Title recognizing sustained and outstanding contributions to IP
2021
Outstanding technical contributions to the Watson Assistant product which resulted in high business impact
2021
State-of-the-art algorithms for intent classification and entity recognition in Watson Assistant
2020
Meta-Learning for Low-Resource NLP
2020
Outstanding technical contributions for language enablement for Watson Services which resulted in high business impact
2019
I was one of the 25 women leaders across IBM selected for the fully funded eCornell certificate program
2018
Outstanding technical contributions to the Watson Conversation Service product which resulted in high business impact
2018
Awarded to fewer than 1000 employees globally for contributions to IBM’s business
2017
Awarded for exceptional contributions to IBM Watson
2016
Watson Conversation Service
2017, 2018, 2019, 2020, 2021, 2022
Publications
Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
Simone Conia, Daniel Lee, Min Li, Umar Farooq Minhas, Saloni Potdar, Yunyao Li
EMNLP (Main Track). 2024.
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
Revanth Gangi Reddy, Omar Attia, Yunyao Li, Heng Ji, Saloni Potdar
EMNLP (Main Track). 2024.
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models
Ronak Pradeep, Daniel Lee, Ali Mousavi, Jeff Pound, Yisi Sang, Jimmy Lin, Ihab Ilyas, Saloni Potdar, Mostafa Arefiyan, Yunyao Li
EMNLP (Industry Track). 2024.
Entity Disambiguation via Fusion Entity Decoding
Junxiong Wang, Ali Mousavi, Omar Attia, Saloni Potdar, Alexander Rush, Umar Farooq Minhas, Yunyao Li
NAACL (Main Track). 2024.
Do Large Language Models Have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo, Simone Conia, Zelin Zhou, Min Li, Saloni Potdar, Henry Xiao
arXiv preprint. 2024.
Distinguish Sense from Nonsense: Out-of-Scope Detection for Virtual Assistants
Cheng Qian, Haode Qi, Gengyu Wang, Ladislav Kunc, Saloni Potdar
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track. 2022.
Fast and Light-Weight Answer Text Retrieval in Dialogue Systems
Hui Wan, Siva Sankalp Patel, William Murdock, Saloni Potdar, Sachindra Joshi
NAACL (Industry Track). 2022.
Benchmarking Language-agnostic Intent Classification for Virtual Assistant Platforms
Gengyu Wang, Cheng Qian, Lin Pan, Haode Qi, Ladislav Kunc, Saloni Potdar
Proceedings of the Workshop on Multilingual Information Access (MIA). 2022.
Comparing Model Development Practices in B2B vs B2C Machine Learning Teams
Navneet Rao, Saloni Potdar
Workshop on Applied Machine Learning Management - Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022.
Improved text classification via contrastive adversarial training
Lin Pan, Chung-Wei Hang, Avirup Sil, Saloni Potdar
Proceedings of the AAAI Conference on Artificial Intelligence. 2022.
Narrative Question Answering with Cutting-Edge Open-Domain QA Techniques: A Comprehensive Study
Xiangyang Mou, Chenghao Yang, Mo Yu, Bingsheng Yao, Xiaoxiao Guo, Saloni Potdar, Hui Su
Transactions of the Association for Computational Linguistics. 2021.
Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations
Haode Qi, Lin Pan, Atin Sood, Abhishek Shah, Ladislav Kunc, Mo Yu, Saloni Potdar
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers. 2021.
Multilingual BERT Post-Pretraining Alignment
Lin Pan, Chung-Wei Hang, Haode Qi, Abhishek Shah, Mo Yu, Saloni Potdar
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2021.
Frustratingly Hard Evidence Retrieval for QA Over Books
Xiangyang Mou, Mo Yu, Bingsheng Yao, Chenghao Yang, Xiaoxiao Guo, Saloni Potdar, Hui Su
Proceedings of the 1st Joint Workshop on Narrative Understanding, Storylines, and Events. 2020.
Diverse Few-Shot Text Classification with Multiple Metrics
Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2018.
Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers
Haoyu Wang, Ming Tan, Mo Yu, Shiyu Chang, Dakuo Wang, Kun Xu, Xiaoxiao Guo, Saloni Potdar
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
Context-Aware Conversation Thread Detection in Multi-Party Chat
Ming Tan, Dakuo Wang, Yupeng Gao, Haoyu Wang, Saloni Potdar, Xiaoxiao Guo, Shiyu Chang, Mo Yu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Out-of-Domain Detection for Low-Resource Text Classification Tasks
Ming Tan, Yang Yu, Haoyu Wang, Dakuo Wang, Saloni Potdar, Shiyu Chang, Mo Yu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019.
Identifying student leaders from MOOC discussion forums through language influence
Seungwhan Moon, Saloni Potdar, Lara Martin
Proceedings of the EMNLP 2014 Workshop on Analysis of Large Scale Social Interaction in MOOCs. 2014.
Neural Models for Sequence Chunking
Feifei Zhai, Saloni Potdar, Bing Xiang, Bowen Zhou
AAAI Conference on Artificial Intelligence. 2017.
Robust Task Clustering for Deep Many-Task Learning
Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang, Bowen Zhou
arXiv preprint arXiv:1708.07918. 2017.
Patents
Configuring artificial intelligence-based virtual assistants using response modes
Matthew Richard Arnold, Eric Donald Wayne, Saloni Potdar
US Patent App. 18085257. 2024.
Pretraining of Split Layer Portions for Multilingual Model
Lin Pan, Haode Qi, Ladislav Kunc, Saloni Potdar
US Patent App. 18063788. 2024.
Detecting out-of-domain text data in dialog systems using artificial intelligence
Cheng Qian, Haode Qi, Saloni Potdar, Ladislav Kunc
US Patent App. 17/897,887. 2024.
Out of domain sentence detection
Haode Qi, Cheng Qian, Ladislav Kunc, Saloni Potdar, Eric Wayne
US Patent App. US17/815,630. 2024.
Conversational AI with multi-lingual human chatlogs
Haode Qi, Lin Pan, Abhishek Shah, Ladislav Kunc, Saloni Potdar
US Patent 11,853,712. 2023.
Out-of-domain encoder training
Ming Tan, Dakuo Wang, Mo Yu, Haoyu Wang, Yang Yu, Shiyu Chang, Saloni Potdar
US Patent 11,645,514. 2023.
Domain specific model compression
Haoyu Wang, Yang Yu, Ming Tan, Saloni Potdar
US Patent 11,620,435. 2023.
Artificial intelligence based context dependent spellchecking
Panos Karagiannis, Ladislav Kunc, Saloni Potdar, Haoyu Wang, Navneet Rao
US Patent 11,301,626. 2022.
Intent classification distribution calibration
Haoyu Wang, Ming Tan, Dakuo Wang, Chuang Gan, Saloni Potdar
US Patent 11,436,528. 2022.
Contextual question answering using human chat logs
Yang Yu, Ming Tan, Shasha Lin, Saloni Potdar
US Patent 11,443,117. 2022.
Routing text classifications within a cross-domain conversational service
Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
US Patent 11,270,077. 2022.
Evaluating text classification anomalies predicted by a text classification model
Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
US Patent 11,537,821. 2022.
Hybrid model for short text classification with imbalanced data
Yang Yu, Ming Tan, Ravi Nair, Haoyu Wang, Saloni Potdar
US Patent 11,328,221. 2022.
Unintended bias detection in conversational agent platforms with machine learning model
Navneet Rao, Ming Tan, Haode Qi, Yang Yu, Panos Karagiannis, Saloni Potdar
Google Patents. 2022.
Generating question answer pairs
Dakuo Wang, Mo Yu, Chuang Gan, Saloni Potdar
US Patent App. 17/302,550. 2022.
Intent Classification using non-correlated features
Abhishek Shah, Ladislav Kunc, Haode Qi, Lin Pan, Saloni Potdar
US Patent App. 17/350,116. 2024.
Weak supervised abnormal entity detection
Haode Qi, Ming Tan, Yang Yu, Navneet Rao, Ladislav Kunc, Saloni Potdar
US Patent 11,423,227. 2022.
Intent boundary segmentation for multi-intent utterances
Ming Tan, Haoyu Wang, Saloni Potdar, Yang Yu, Navneet Rao, Haode Qi
US Patent 11,308,944. 2022.
Mechanisms for continuous improvement of automated machine learning
Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
US Patent 11,423,333. 2022.
Feature reweighting in text classifier generation using unlabeled data
Yang Yu, Haode Qi, Haoyu Wang, Ming Tan, Navneet Rao, Saloni Potdar, Robert Yates
US Patent 11,216,619. 2022.
Suggestion of new entity types with discriminative term importance analysis
Haode Qi, Ming Tan, Yang Yu, Navneet Rao, Saloni Potdar, Haoyu Wang
US Patent 11,379,666. 2022.
Learning Parameter Sampling Configuration for Automated Machine Learning
Haode Qi, Ming Tan, Ladislav Kunc, Saloni Potdar
Google Patents. 2021.
Privacy Protection Through Template Embedding
Haode Qi, Saloni Potdar, Ming Tan, Navneet Rao
Google Patents. 2021.
Bias Detection in Conversational Agent Platforms
Navneet Rao, Ming Tan, Haode Qi, Yang Yu, Panos Karagiannis, Saloni Potdar
Google Patents. 2021.
Adversarial training data augmentation data for text classifiers
Ming Tan, Ruijian Wang, Inkit Padhi, Saloni Potdar
US Patent 11,093,707. 2021.
Displaying text classification anomalies predicted by a text classification model
Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
US Patent 11,068,656. 2021.
Updating an online multi-domain sentence representation generation module of a text classification system
Ming Tan, Ladislav Kunc, Yang Yu, Haoyu Wang, Saloni Potdar
US Patent 11,120,225. 2021.
Cross-domain multi-task learning for text classification
Ming Tan, Haoyu Wang, Ladislav Kunc, Yang Yu, Saloni Potdar
US Patent 10,937,416. 2021.
Displaying text classification anomalies predicted by a text classification model
Ming Tan, Saloni Potdar, Lakshminarayanan Krishnamurthy
US Patent 11,074,414. 2021.
Weighting features for an intent classification system
Yang Yu, Ladislav Kunc, Haoyu Wang, Ming Tan, Saloni Potdar
US Patent 10,977,445. 2021.
Out-of-domain sentence detection
Inkit Padhi, Ruijian Wang, Haoyu Wang, Saloni Potdar
US Patent 11,023,683. 2021.
Adversarial training data augmentation for generating related responses
Ming Tan, Ruijian Wang, Inkit Padhi, Saloni Potdar
US Patent 11,189,269. 2021.
Implementing dynamic confidence rescaling with modularity in automatic user intent detection systems
Yang Yu, Ladislav Kunc, Saloni Potdar
Google Patents. 2020.
Services
Apple Ph.D. Fellowship Selection Committee (Information Retrieval and Knowledge Graph) 2023, 2024
Apple Internal AIML Conference (Knowledge Bases and Search Track Chair) 2022, 2023, 2024
ACL/NAACL/EMNLP ARR (Area Chair) 2024
EMNLP Industry Track (Area Chair) 2024
NAACL Industry Track (Reviewer) 2024
WAMLM KDD (Co-organizer) 2024
WAMLM KDD (Co-organizer) 2023
EMNLP Industry Track (Reviewer) 2023
ACL Industry Track (Reviewer) 2023
Web Conference Industry Track (Reviewer) 2023
EMNLP Industry Track (Reviewer) 2022
NAACL Industry Track (Reviewer) 2022
ACL/NAACL/EMNLP ARR (Reviewer) 2021
Workshop for Women in Machine Learning 2019 (Reviewer) 2019
Collaborators and Interns
Jun 2023 - present
PhD, University of Illinois Urbana-Champaign
Jan 2023 - present
PhD, Sapienza NLP
Jun 2023 - present
PhD, University of Waterloo
Feb 2023 - present
Bachelor, University of Calgary
Jul 2023 - present
Masters, Shanghai Jiao Tong University
Jun 2023 - Dec 2023
PhD, Cornell University
Summer 2021
PhD, Cornell University
Summer 2020
PhD, University of Chicago
Articles and Blogs
Oct 2024
Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
Oct 2024
ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models
Aug 2024
Spinning Configurable and Scalable Conversational Knowledge Graph QA Datasets with Large Language Models
Jun 2024
AGRaME: Any-Granularity Ranking with Multi-Vector Embeddings
Aug 2024
Entity Disambiguation via Fusion Entity Decoding
May 26 2021
Under the hood - all the natural language understanding technology that makes Watson Assistant powerful
Apr 23 2021
AI Lifecycle for Virtual Assistants
Nov 14 2019
Why Zero-Effort Irrelevance is Relevant
Jul 29 2019
A New State-of-the-Art Method for Relation Extraction
Press
Aug 17 2023
Finalists and Special Jury Recognitions announced for Women in AI Awards North America 2023
Jul 10 2022
Putting more knowledge at the fingertips of non-English speakers
Jul 8 2022
Meet the nominees for the 2022 VentureBeat Women in AI Awards
Apr 19 2021
5 reasons NLP for chatbots improves performance
Dec 10 2020
Watson Assistant improves intent detection accuracy, leads against AI vendors cited in published study
Jul 15 2020
Announcing nominees for the second annual Women in AI Awards