Ernesto Diaz • Data & AI Portfolio

I'm an AI & data engineer finishing my BS in Business Analytics & Information Systems at the University of South Florida (May 2026). I build end-to-end pipelines that span LLM fine-tuning, clinical data engineering, and business intelligence — bridging the gap between raw data and real decisions.

Currently I'm a Research Assistant at Blue Cross Blue Shield, where I fine-tuned a medical LLM using QLoRA on 15,000+ clinical prompt-response pairs and engineered synthetic oncology datasets from TCGA standards. I hold a Bright Futures Academic Scholarship, studied abroad in Florence, Italy, and consistently earn 5-star reviews tutoring data analytics at USF.

My work spans Python, PyTorch, SQL, Tableau, and cloud platforms (AWS, GCP, Azure). I'm passionate about applying AI and data engineering to healthcare, finance, and wherever meaningful impact hides in messy data.

Blue Cross Blue Shield — AI & Machine Learning Research Assistant

Oct 2025 – Present

Fine-tuned Med42-8B LLM using a QLoRA pipeline in Python, reducing training loss from 2.49 → 0.09 across 15,000+ clinical prompt-response pairs for oncology decision support.
Engineered a synthetic dataset of 5,200+ TCGA-LUAD–style patient records covering NCCN guideline adherence and mutation-driven reasoning, generating 30,000+ instruction-tuned training pairs.
Closed a data quality gap in the JSONL training files by redesigning the pipeline logic, raising clinical citation coverage to 75%+ and improving model reliability.
Deployed and evaluated Med42 LLM, achieving 72% accuracy on USMLE benchmarks — outperforming GPT-3.5 by ~11 points.
Built SQL-based pipelines to analyze 10,000+ lung cancer EHR records for patient survival patterns.

Python PyTorch QLoRA / LoRA Hugging Face SQL GCP

Knack — Data Analytics Tutor

May 2024 – Present

Created individualized learning plans for 20+ students with data-driven insights, resulting in a 25% average improvement in scores.
Earned consistent 5-star ratings across sessions, simplifying complex concepts in Excel and Statistics into accessible outlines.
Leveraged Tableau and Excel to provide visual, data-driven feedback that accelerated student comprehension and performance.

Tableau Excel Statistics

Colegio Yannette — Data Analytics Intern

Apr 2020 – Jun 2020

Managed registration and updates of critical data for 200+ students, ensuring data integrity.
Reviewed invoices and payments, identifying process improvements that made records clearer and handling more efficient.
Maintained student records in Excel, strengthening data quality control and operational efficiency.

Excel Data Quality Process Improvement

🧬 Oncology AI Clinical Decision Support

Fine-tuned Med42-8B on a synthetic TCGA-LUAD dataset using QLoRA and 4-bit quantization, cutting training loss from 2.49 → 0.09 over 15,000+ clinical pairs. Engineered 5,200+ patient records and 30,000+ instruction-tuned training pairs spanning treatment recommendations and mutation reasoning. Achieved 72% USMLE accuracy, outperforming GPT-3.5 by ~11 points.

Python PyTorch QLoRA Hugging Face Hex GCS
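To give a rough sense of why QLoRA with 4-bit quantization makes an 8B-parameter fine-tune practical, here is a back-of-the-envelope sketch. It is not the project's training code: the layer size (4096 x 4096), the adapter rank (16), and the helper names are illustrative assumptions, and the memory estimate ignores quantization overhead, activations, and optimizer state.

```python
def lora_param_counts(d_in, d_out, rank):
    """Compare a full weight matrix with its low-rank adapter.

    LoRA freezes the d_out x d_in weight and trains two small matrices,
    A (rank x d_in) and B (d_out x rank), so the trainable parameter
    count is rank * (d_in + d_out).
    """
    full = d_in * d_out
    adapter = rank * (d_in + d_out)
    return full, adapter


def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Assumed, illustrative sizes (not taken from the project).
full, adapter = lora_param_counts(d_in=4096, d_out=4096, rank=16)
fraction = adapter / full

# Weights of an 8-billion-parameter model at 16-bit vs 4-bit precision.
fp16_gb = weight_memory_gb(8e9, 16)
int4_gb = weight_memory_gb(8e9, 4)

print(f"trainable fraction per layer: {fraction:.4%}")
print(f"weights: {fp16_gb:.1f} GB at 16-bit, {int4_gb:.1f} GB at 4-bit")
```

Under these assumptions, the adapter trains under 1% of each adapted layer's parameters, while 4-bit storage cuts the frozen weights to roughly a quarter of their 16-bit size.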

📈 S&P 500 Market Regime & Sector Rotation

Engineered 15+ technical features across 2.9M+ daily prices for multi-horizon return forecasting. Designed a market regime classification system that produces 9 distinct market states and surfaces historically outperforming sectors. Built macro-level breadth indicators, including the percentage of stocks above their 50- and 200-day simple moving averages (SMA-50/200) and advance/decline ratios.

Python Pandas NumPy Matplotlib Seaborn yfinance
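The two breadth indicators mentioned above can be sketched in a few lines. This is a minimal standard-library illustration, not the project's code (which used Pandas, where a rolling mean would do the same job): the function names, the toy tickers, and the prices are all hypothetical, and a 3-day window stands in for the 50- and 200-day windows.

```python
from statistics import mean


def sma(closes, window):
    """Simple moving average of the last `window` closes, or None if too short."""
    if len(closes) < window:
        return None
    return mean(closes[-window:])


def pct_above_sma(closes_by_ticker, window):
    """Percentage (0-100) of tickers whose latest close is above their own SMA."""
    eligible = 0
    above = 0
    for closes in closes_by_ticker.values():
        avg = sma(closes, window)
        if avg is None:
            continue  # not enough history for this window
        eligible += 1
        if closes[-1] > avg:
            above += 1
    return 100.0 * above / eligible if eligible else float("nan")


def advance_decline_ratio(closes_by_ticker):
    """Advancing tickers divided by declining tickers, using the last two closes."""
    advancers = 0
    decliners = 0
    for closes in closes_by_ticker.values():
        if len(closes) < 2:
            continue
        change = closes[-1] - closes[-2]
        if change > 0:
            advancers += 1
        elif change < 0:
            decliners += 1
    return advancers / decliners if decliners else float("inf")


# Toy data: hypothetical tickers and prices, for illustration only.
toy = {
    "AAA": [10, 11, 12, 13],
    "BBB": [20, 19, 18, 17],
    "CCC": [5, 5, 6, 7],
}
breadth = pct_above_sma(toy, window=3)
ad_ratio = advance_decline_ratio(toy)
print(f"{breadth:.1f}% above SMA-3, advance/decline ratio {ad_ratio:.1f}")
```

Computed daily across all index constituents, readings like these describe how broadly a move is shared, which is the kind of signal a regime classifier can take as input.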

🏥 Hospital Operations & Patient Flow Analytics

Analyzed 1,000+ appointment records to surface high-traffic departments handling ~30% above-average load. Designed a 7-table relational schema (including patients, doctors, departments, and appointments) on Google Cloud, then built SQL-driven KPIs and Tableau dashboards aimed at reducing wait times.

PostgreSQL Python Tableau GCP

📱 Mobile Expense Tracker

Built a full-featured mobile app to manage 100+ expense records with CRUD functionality using React Native and SQLite. Designed SQL queries to aggregate daily spending totals and power a time-series chart visualization across selectable date ranges.

React Native SQLite JavaScript Git
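The daily-total aggregation behind the chart can be sketched as a single grouped query. The example below runs it through Python's built-in sqlite3 module rather than the app's React Native code, and the table name, column names, and sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical expenses table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expenses (
        id       INTEGER PRIMARY KEY,
        spent_on TEXT NOT NULL,   -- ISO date string, e.g. 2025-01-01
        amount   REAL NOT NULL,
        category TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO expenses (spent_on, amount, category) VALUES (?, ?, ?)",
    [
        ("2025-01-01", 12.5, "food"),
        ("2025-01-01", 7.5, "transit"),
        ("2025-01-02", 30.0, "groceries"),
        ("2025-01-05", 4.0, "coffee"),
    ],
)


def daily_totals(conn, start, end):
    """Return (date, total) rows for each day in the inclusive date range."""
    query = """
        SELECT spent_on, SUM(amount) AS total
        FROM expenses
        WHERE spent_on BETWEEN ? AND ?
        GROUP BY spent_on
        ORDER BY spent_on
    """
    return conn.execute(query, (start, end)).fetchall()


totals = daily_totals(conn, "2025-01-01", "2025-01-02")
print(totals)
```

Storing dates as ISO strings keeps the range filter and the ordering correct with plain text comparison, so the result can feed a time-series chart directly. Days with no expenses do not appear in the result and would need to be filled in by the charting layer.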
