Associate Data Engineer
Arkose Labs
Escazú, Costa Rica
01/2025 - Present
As a Data Engineer on the Customer Success team, I turn raw data into clean datasets that power an automated BI reporting platform in Looker Studio. CS teams use it to present value to 70+ customers, and executives use it to monitor portfolio health and renewals. I drive cross-team technical alignment with Professional Services, Account Management, and Solutions Engineering to maintain a single source of truth across reporting use cases, shared definitions, and KPI calculations. I lead the data engineering work to evaluate and prioritize requests, define scope, and turn the most valuable requests into production datasets.
Key achievements
• Owned and maintained the pipelines and data warehouse (Postgres) that power the Customer Success automated BI reporting platform; improved freshness from once-daily to twice-daily runs by optimizing queries and schedules (see the incremental-refresh sketch after this list). Reduced manual prep time:
QBRs: from 2 hours to 10 minutes per customer.
Monthly reports: from 1 hour to 5 minutes per customer.
Post-attack reports: from 2 hours to 10 minutes per incident.
Threat-Intel report data sourcing: from 6 hours exporting data from multiple sources to 30 minutes pulling centralized data from the BI reporting platform; latest example: last quarter's Threat Actor Report – Scammer Focus.
• Led the end-to-end data engineering workflow for two CS churn-prevention initiatives: from source-system discovery and API integrations, through transforming raw data (bronze) into clean datasets (silver), to loading curated Postgres tables (gold); see the medallion sketch after this list.
Account Health Score: blends onboarding, adoption, performance, and retention to flag risk up to 6 months earlier.
Customer Quadrant Classification: segments by health × value to focus actions and target ≥5 expansion opps/quarter.
• Planned, scoped, and executed the migration of Python ETL code from ECS Fargate/ECR to Airflow (Amazon MWAA) for more reliable runs, better isolation of data processing, stronger QA checks, and less manual work. Designed a phased migration plan backed by a pipeline inventory and dependency map. In Airflow, DAGs run under one scheduler with explicit retries and simple monitoring, and heavy jobs were split into smaller tasks to isolate failures (see the DAG sketch after this list). 5 pipelines migrated successfully; 2 remain to finish in Q3.
• Built a Slack alerting system in Python that queries Gold tables (PostgreSQL) and posts account-specific alerts to CS and account Slack channels, flagging account health changes and threshold breaches for proactive action (see the alerting sketch after this list).
• Ran an 8-lesson AI upskilling program for a 6-person team over 8 weeks: designed structured training on LLMs/ChatGPT, how tokens work, prompting, RAG/retrieval, evaluation, and agentic workflows; delivered slide decks and hands-on labs. Outcome: teammates reported better day-to-day use of AI tools and stronger problem-solving.
• Launched a Security+ cohort for my team: scoped study phases, led weekly study sessions, sourced materials, and scheduled practice tests to prepare for certification.
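
Incremental-refresh sketch (referenced in the pipelines bullet above). A minimal illustration of the query pattern behind the twice-daily freshness gain: upsert only rows changed since the last run instead of rebuilding whole tables. Table and column names (events_raw, events_gold, updated_at) are hypothetical, not the production schema.

```python
"""Minimal sketch of an incremental warehouse refresh, assuming Postgres.

Table and column names are hypothetical placeholders; the real pipeline
and schema are internal.
"""
import psycopg2

# Upsert only rows changed since the last run instead of rebuilding the
# whole table -- this is what makes a twice-daily schedule cheap enough.
# Assumes a unique constraint on (customer_id, metric) in events_gold.
INCREMENTAL_UPSERT = """
INSERT INTO events_gold (customer_id, metric, value, updated_at)
SELECT customer_id, metric, value, updated_at
FROM events_raw
WHERE updated_at > %(last_run)s
ON CONFLICT (customer_id, metric)
DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
"""

def refresh(conn, last_run):
    with conn, conn.cursor() as cur:
        cur.execute(INCREMENTAL_UPSERT, {"last_run": last_run})
        return cur.rowcount  # rows touched this run
```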
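
Medallion sketch (referenced in the churn-prevention bullet above). A minimal bronze → silver → gold flow, assuming a JSON source API and loads through pandas/SQLAlchemy; the endpoint, field names, and gold table name are illustrative placeholders, not the actual source systems.

```python
"""Minimal bronze -> silver -> gold sketch, assuming a JSON source API."""
import requests
import pandas as pd
from sqlalchemy import create_engine

def extract_bronze(api_url: str) -> pd.DataFrame:
    # Bronze: land the raw API payload as-is, no transformation.
    resp = requests.get(api_url, timeout=30)
    resp.raise_for_status()
    return pd.DataFrame(resp.json())

def transform_silver(bronze: pd.DataFrame) -> pd.DataFrame:
    # Silver: normalize column names, drop duplicates, clean types.
    silver = bronze.copy()
    silver.columns = [c.lower().strip() for c in silver.columns]
    silver = silver.drop_duplicates(subset=["account_id"])  # hypothetical key
    silver["event_date"] = pd.to_datetime(silver["event_date"])
    return silver

def load_gold(silver: pd.DataFrame, pg_dsn: str) -> None:
    # Gold: curated, analytics-ready Postgres table consumed by Looker Studio.
    engine = create_engine(pg_dsn)
    silver.to_sql("account_health_gold", engine, if_exists="replace", index=False)
```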
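
DAG sketch (referenced in the Airflow migration bullet above). A minimal DAG showing the migration's shape: one scheduler, explicit retries, and an extract/transform/load split so failures are isolated. The dag_id, cron schedule, and task bodies are illustrative, not the production configuration.

```python
"""Minimal Airflow (MWAA-compatible) DAG sketch with retries and split tasks."""
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull from source systems")    # placeholder task body

def transform():
    print("raw -> clean datasets")       # placeholder task body

def load():
    print("write gold Postgres tables")  # placeholder task body

with DAG(
    dag_id="cs_reporting_pipeline",      # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule_interval="0 6,18 * * *",    # twice daily, matching the freshness gain
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    # Splitting the job into tasks isolates failures: a load error
    # retries alone instead of rerunning the whole pipeline.
    extract_t = PythonOperator(task_id="extract", python_callable=extract)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)
    extract_t >> transform_t >> load_t
```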
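
Alerting sketch (referenced in the Slack bullet above). A minimal version of the Gold-table-to-Slack flow, assuming slack_sdk and psycopg2; the query, threshold, and channel map are hypothetical stand-ins for the production logic.

```python
"""Minimal sketch of the Gold-table -> Slack alerting flow."""
import psycopg2
from slack_sdk import WebClient

# Hypothetical query: accounts whose health score dropped past a threshold.
HEALTH_DROP_QUERY = """
SELECT account_name, health_score, previous_score
FROM account_health_gold
WHERE health_score < previous_score - %(threshold)s;
"""

def post_health_alerts(pg_dsn: str, slack_token: str, channel_map: dict,
                       threshold: int = 10) -> None:
    client = WebClient(token=slack_token)
    with psycopg2.connect(pg_dsn) as conn, conn.cursor() as cur:
        cur.execute(HEALTH_DROP_QUERY, {"threshold": threshold})
        for account, score, prev in cur.fetchall():
            # Route each alert to that account's CS channel, with a fallback.
            client.chat_postMessage(
                channel=channel_map.get(account, "#cs-alerts"),
                text=f":warning: {account} health dropped {prev} -> {score}",
            )
```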