Hello, I am

Sakshi Asati

Data Scientist & ML Engineer

I architect reliable AI systems that turn complex data into intelligent experiences. From resilient data pipelines to production-grade models, I help teams move fast without sacrificing rigor.

Currently

M.S. Data Science, University of Colorado Boulder

“Innovation happens when curiosity meets disciplined execution.”

- Sakshi Asati

About

I design data products that balance experimentation with production reliability. My experience spans ML research, data engineering, and stakeholder enablement across global teams.

What I Do

  • Engineer GDPR-compliant Hadoop and Spark pipelines, instrumented with automated quality checks and monitoring alerts.
  • Automate Airflow orchestration and Jenkins-based CI/CD so analytics teams ship code with fewer manual steps.
  • Prototype and productionize predictive models that cut operational overhead for supply chain and customer-support programs.
  • Partner with stakeholders - product owners, analysts, and engineers - to troubleshoot pipelines and share reproducible ML playbooks.

Education

Aug 2024 – May 2026 (Expected)

M.S. Data Science

University of Colorado Boulder, USA

Teaching Assistant – Data Mining

2018 – 2022

B.E. Information Technology

G.H. Raisoni College of Engineering, India

Core Skills

Machine Learning
Data Engineering
LLM & GenAI Ops
Data Analytics
Python

Experience

Partnered with product managers, data engineers, and platform teams to deliver AI capabilities that improve business outcomes.

Jan 2026 – Present

Entrepreneurship Fellow

Innovation & Entrepreneurship Initiative – CU Boulder · Boulder, CO

  • Built an automated Python ETL pipeline to collect and enrich 1,000+ CU Boulder alumni founder records from 5+ public sources using web scraping and API integrations.
  • Designed a normalized SQLite schema and implemented cleaning, enrichment, and deduplication workflows to produce high-quality, CRM-ready datasets.
  • Generated Salesforce-ready exports and productionized weekly runs with logging, rate limiting, and error handling, replacing manual research with a fully automated process.

Jun 2024 – Aug 2024

Data & ML Intern

The Recess App · Denver, CO

  • Replatformed the analytics backend by migrating MySQL tables into a Supabase-hosted PostgreSQL schema, unlocking faster queries and smoothing upcoming feature rollouts.
  • Automated retention outreach by wiring Supabase triggers into Customer.io journeys so real-time events kick off onboarding and follow-up sequences without manual intervention.
  • Replaced high-latency AWS Lambda jobs with Supabase Edge Functions (Deno), improving response times and simplifying day-to-day infrastructure upkeep.

May 2022 – May 2024

Programmer Analyst

Cognizant Technology Solutions · Pune, India

  • Monitored and managed ETL flows ingesting multi-format datasets into Hadoop with GDPR-compliant encryption so sensitive financial data stayed secure.
  • Automated UAC-based data pipelines with Apache Airflow, adding retry logic and SLA notifications that cut manual oversight by 50% while improving delivery reliability.
  • Resolved production Spark failures by analyzing stack traces, rerunning impacted jobs, and partnering with platform teams to keep business-critical workloads online.
  • Built SQL/Python validation suites and Jenkins + Bitbucket pipelines that reduced incident response time by 15% and tightened deployment quality.

May 2022 – May 2023

Programmer Analyst Trainee

Cognizant Technology Solutions · Pune, India

  • Partnered with senior engineers to productionize Hadoop ingestion frameworks, implementing data-quality checkpoints and audit dashboards.
  • Scripted Python utilities that reconciled staging vs. production data, accelerating remediation efforts and supporting the team’s on-call rotations.
  • Documented pipeline runbooks and KT sessions while onboarding new team members, helping scale the data engineering practice.

Jan 2022 – May 2022

Data Engineering Intern

Cognizant Technology Solutions · Pune, India

  • Assisted in developing PySpark data loaders and validation scripts that fed enterprise reporting dashboards.
  • Shadowed release cycles, learning CI/CD and deployment safeguards that informed later trainee responsibilities.

Toolbelt

Blending modern ML stacks with dependable engineering practices to ship production-ready analytics.

Programming & Frameworks

Python · R · SQL (PostgreSQL, MySQL) · FastAPI · Streamlit

Machine Learning & AI

Scikit-Learn · TensorFlow · Keras · PyTorch · XGBoost · YOLOv8 · LangChain · GPT-4 · FAISS · Model Evaluation · Statistical Analysis

Data & Cloud Platforms

AWS · GCP · Kubernetes · Airflow · Hadoop · PySpark · Docker · Jenkins · Redis · MongoDB

Visualization & Tools

Tableau · Power BI · Excel · Git · Linux / Unix · Jira · Postman

Selected Projects

A snapshot of initiatives where measurable impact and stakeholder adoption mattered as much as model accuracy.

Healthcare AI

Secure Medical Notes AI

Aug 2025 – Nov 2025 · Graduate Studio Project

Built a LangChain + GPT-4 documentation assistant with FAISS retrieval so clinicians can review notes faster. Streamlit dashboards, FastAPI services, Celery, and Redis coordinate authenticated workflows backed by PostgreSQL, all running in Docker.

  • Clinical reviewers cut note synthesis time by 35% with contextual retrieval prompts.
  • Role-based Streamlit workspaces and background Celery jobs deliver sub-2s response times.
FastAPI · LangChain · GPT-4 · Streamlit · PostgreSQL · Redis · Docker

Computer Vision

VisionStock: Retail Inventory Detection

Retail Analytics Project

Fine-tuned YOLOv8 on retail shelf images to detect and classify 34 product categories for automated inventory tracking. Built a FastAPI inference service and Streamlit dashboard to process images and return predictions in under 2 seconds, enabling near real-time stock checks during store operations.

  • Automated inventory tracking across 34 product categories with computer vision.
  • FastAPI inference service delivers predictions in under 2 seconds for real-time stock checks.
YOLOv8 · FastAPI · PostgreSQL · Streamlit · Docker · GCP

GenAI Workflow

SupportSync: Intelligent CRM Ops

Feb 2025 – Apr 2025 · Product Internship

Delivered a full-stack CRM platform (MongoDB, Node.js, React) that centralizes support tickets and integrates an AI chatbot for FAQs and password resets, cutting agent workload by 30% and accelerating resolution times.

  • Unified multi-channel tickets into a single dashboard with live SLA tracking.
  • GenAI assistant deflected 30% of routine requests and trimmed first-response time by 18%.
MongoDB · Node.js · React · GCP

Forecasting

NexChain: Supply Signal Optimization

Jan 2025 – May 2025 · Retail Analytics Challenge

Benchmarked ML ensembles to tame irregular demand, reaching 97% forecasting accuracy, and clustered SKUs by behavior so planners could cut overstock risk by 20% and tighten procurement decisions.

  • Feature-store pipelines in Python keep demand signals fresh for nightly re-trains.
  • Merchandising dashboards surfaced a 20% inventory-risk reduction for the top 50 SKUs.
Python · XGBoost · Clustering · Power BI

Analytics

Food Allergy Risk Profiler

Feb 2025 – Apr 2025 · Academic Research

Analyzed 400+ food profiles with ANOVA, t-tests, and interpretable logistic regression to surface high-risk ingredients, uncovering a 0.31 correlation between dairy, wheat, and shellfish ingredients and allergy severity to inform safer meal planning.

  • Interactive Tableau dashboards let dietitians probe ingredient-level risk scores.
  • Regulatory brief summarized findings for labeling compliance teams.
Pandas · SciPy · Tableau

Computer Vision

SCANET: Multi-label Chest X-ray Diagnosis

Aug 2024 – Dec 2024 · Clinical AI Capstone

Fine-tuned DenseNet121 on 112K NIH chest radiographs with mixup augmentation and focal loss, reaching 91.7% macro AUC across 14 pathologies. Deployed a triage-ready inference service with Grad-CAM explanations.

  • Grad-CAM heatmaps gave radiologists transparent decision support within 1.5s.
  • Firebase-hosted inference API processed 5k studies/week with automated quality checks.
TensorFlow · Firebase · Mapbox

Honors

Recognitions that highlight rapid delivery, stakeholder trust, and campus leadership.

Calendar Modernization Award · Cognizant

Delivered a calendar refresh across 36 production repositories in one week with zero escalations or post-release issues, earning direct appreciation from the client product owner.

Exceptional Contribution Certificate

Celebrated for reliable delivery, deadline discipline, and hands-on technical support that kept stakeholders confident in the program roadmap.

Leadership Course Assistant · CU Boulder

Supporting Professor Alfonso Bastias in the Data Mining course by guiding classmates through analytics labs and facilitating active-learning discussions.

Treasurer · DASSA, CU Boulder

Serving as treasurer for the Data Science Student Association, coordinating budgets, events, and member engagement to strengthen the campus data community.

Let’s Collaborate

Open to full-time opportunities, internships, research collaborations, and data-driven product challenges.

Phone

303-356-2393

Location

Boulder, Colorado
