ANKITA GHOSH
Machine Learning Engineer | Oracle-Certified Generative AI Professional | Data Science & NLP Specialist
Building Intelligent Systems | Solving Real-World Problems with AI & Deep Learning
About Me
I am a solution-driven AI/ML Engineer with a passion for transforming complex data into actionable intelligence. My journey in AI began with curiosity about how machines learn and adapt. Today, I specialize in building scalable, end-to-end machine learning systems that solve real-world challenges across NLP, generative AI, and deep learning.
With hands-on expertise in Python, TensorFlow, PyTorch, and advanced ML frameworks, I develop ML pipelines that combine accuracy, efficiency, and practical impact, achieving results such as 95%+ model accuracy and 40% faster processing through optimized MLOps workflows. I thrive on bridging research and production, ensuring that AI solutions are both innovative and deployable at scale.
My Expertise
AI/ML Engineering
Building and deploying scalable ML systems.
Generative Models
Leveraging advanced generative AI for innovative solutions.
Deep Learning
Applying CNNs, LSTMs, and Transformers to complex problems.
NLP & Data Science
Specializing in natural language processing and data analysis.
Experience: AI/ML Internships
InternPe (Oct-Dec 2024)
AI-ML Intern
  • Built and optimized ML models (Regression, Classification, Clustering) achieving >90% accuracy.
  • Developed deep learning models (CNN, LSTM, ANN) for image and text-based tasks using TensorFlow/Keras.
  • Designed end-to-end ML pipelines: data collection → EDA → preprocessing → model training → evaluation → deployment.
  • Applied feature engineering and hyperparameter tuning, and implemented automated model retraining for continuous improvement.
  • Ensured reproducible code and version control using Git, improving collaboration and maintainability.
  • Conducted model performance visualization with matplotlib/seaborn for effective reporting to stakeholders.
  • Focused on clean, modular code and scalable workflows suitable for production deployment.
Orbitor (Jul-Dec 2023)
Machine Learning Intern
  • Developed and deployed predictive ML models (Regression, Classification, Clustering) achieving 95%+ accuracy and improving key performance metrics (F1-score, recall) by 20–30%.
  • Designed scalable ML workflows using MLflow, Airflow, and Docker, reducing manual processing time by 20%.
  • Applied advanced feature engineering, dimensionality reduction (PCA), and hyperparameter tuning for optimal model efficiency.
  • Integrated trained models into real-time applications and microservices via FastAPI, enabling seamless deployment.
  • Conducted data visualization and reporting with matplotlib, seaborn, and Tableau dashboards for stakeholder insights.
  • Collaborated in agile sprints and presented model outputs to cross-functional teams, improving adoption of AI-driven solutions.
Key Projects
95+
AI-Powered PO Automation Agent
From chaotic PDFs to clean, structured formats in seconds - no more data copy-pasting!
🎯Use Cases🎯
  • Accounting & Finance Automation: Reduced manual PO processing by 90%, accelerating ERP workflows.
  • Enterprise Scalability: Supports multi-PO files and scanned or messy documents, enabling scalable deployment.
  • Data Accuracy & Compliance: Minimizes human errors in PO entries, improving record accuracy by 95%.
  • Operational Efficiency: Speeds up accounting/ERP teams’ workflow, saving 40+ hours/month in manual effort.
85+
MATSRL
Multi-Agent Text Summarization with Reinforcement Learning
🎯Use Cases🎯
  • Healthcare & Legal Summaries: Provides adaptive, accurate summaries of large documents, improving review efficiency by 50%.
  • Research & Education: Trained on 700K+ articles, enabling scalable summarization for academic and professional use.
  • Model Performance: Outperforms traditional Transformer models by 45% on ROUGE metrics.
  • Enterprise Integration: Deployable across industries for automated document summarization, reducing manual processing time by 60%.
90+
VPUFS
Variance Score and Pearson Similarity-based Unsupervised Feature Selection
🎯Use Cases🎯
  • Genomic Research: Improved gene expression clustering accuracy to 90%, enabling more precise biological insights.
  • Cost Efficiency: Reduced computational and analysis costs by 50% in high-dimensional datasets.
  • Scalable Analysis: Applicable to large-scale bioinformatics pipelines, accelerating discovery in genomics and biomedical studies.
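The project name suggests a two-stage filter: score features by variance, then use Pearson correlation to drop redundant ones. The following is a minimal sketch of that general idea, not the VPUFS algorithm itself. The function name `select_features`, the greedy selection order, and the `max_corr` threshold are all assumptions made for illustration.

```python
import numpy as np

def select_features(X, k, max_corr=0.9):
    """Illustrative variance + Pearson filter (hypothetical, not the VPUFS method).

    X: array of shape (n_samples, n_features)
    k: maximum number of features to keep
    max_corr: assumed cutoff on absolute Pearson correlation
    """
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    # Absolute Pearson correlation between every pair of feature columns.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    # Visit features from highest to lowest variance.
    for j in np.argsort(variances)[::-1]:
        if variances[j] == 0:
            continue  # constant feature: carries no information
        # Keep the feature only if it is not too similar to any kept feature.
        if all(corr[j, s] < max_corr for s in selected):
            selected.append(int(j))
        if len(selected) == k:
            break
    return selected
```

Calling `select_features(X, k=50)` on a samples-by-genes expression matrix would return up to 50 column indices, which could then feed a clustering step.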
89+
Cybersecurity
Deep learning-based phishing website detection system with high accuracy.
🎯Use Cases🎯
  • Real-Time Threat Mitigation: Detect phishing URLs instantly, protecting users and enterprise networks.
  • Automation at Scale: Automate monitoring and flagging of suspicious domains, reducing manual review by 60%.
  • Integration: Embed in browsers, email clients, and security tools to prevent malicious access.
  • Enterprise Protection: Minimize data breaches and financial loss by proactively identifying phishing attempts.
95+
AnonAPI
A versatile engine that anonymizes, scrambles, and encrypts text using 14+ modular models, supporting batch and single-text processing.
🎯Use Cases🎯
  • Data Privacy & Testing: Mask sensitive logs, PII, and chats; generate anonymized text for QA or NLP pipelines.
  • Education & Research: Demonstrate cryptography, text transformations, and controlled obfuscation for ML robustness testing.
  • Fun & Engagement: Enable hidden messages, text puzzles, and gamified text transformations in apps.
  • Flexible Integration: Easily extendable modular API supports rapid deployment across different platforms.
Achievements & Research
  • Vice-President, Technical Club "Techniche" (2022–2023): Led and managed a team of 80+ members, organizing technical workshops, hackathons, and events.
  • National Engineering Olympiad (NEO 6.0) – 2022: Secured All India Rank 21 among thousands of participants nationwide.
Technical Skills
Core Expertise
  • Machine Learning & AI: Machine Learning, Deep Learning, Reinforcement Learning, Generative AI, Agentic AI, RAG (Retrieval-Augmented Generation), NLP
  • Data Structures & Algorithms: Problem Solving, Algorithm Design, Object-Oriented Programming (OOP)
  • Neural Networks & Models: CNNs, RNNs, LSTMs, Transformers, Large Language Models, Prompt Engineering, Fine-Tuning
  • Data & Analytics: Statistical Analysis, Exploratory Data Analysis (EDA), Feature Engineering, ETL Pipelines
Programming & Tools
  • Languages: Python, C, C++, SQL – proficient in building scalable, production-ready code
  • Frameworks & Libraries: TensorFlow, PyTorch, LangChain, LangGraph, Pandas, NumPy, Scikit-learn – hands-on experience with AI-ML pipelines
  • MLOps & Deployment: Docker, MLflow, Airflow, Git, GitHub, FastAPI, Streamlit – automating workflows and managing reproducible pipelines
  • Data Visualization & Analytics: Tableau, Matplotlib, Seaborn – designing interactive dashboards and generating actionable insights
Certifications
  • Oracle-Certified Generative AI Professional
Education
  • M.Tech, Computer Science & Engineering (2023–2025): Maulana Abul Kalam Azad University of Technology
  • B.Tech (Hons), Computer Science (2019–2023): Neotia Institute of Technology, Management and Science
  • Higher Secondary, Physics, Chemistry, Math (2017–2019): Majilpur J.M. Training School
  • Secondary Education (2008–2017): Nimpith Ashram Sarada Vidya Mandir