Associate Data Engineer
Arkose Labs
Escazú, Costa Rica
01/2025 - Present
As a Data Engineer on the Customer Success team, I turn raw data into clean datasets that power an automated BI reporting platform in Looker Studio. CS teams use it to present value to 70+ customers, and executives use it to monitor portfolio health and renewals. I drive cross-team technical alignment with Professional Services, Account Management, and Solutions Engineering to maintain a single source of truth across reporting use cases, shared definitions, and KPI calculations. I lead the data engineering work to evaluate and prioritize requests, define scope, and turn the most valuable requests into production datasets.
Key achievements
• Owned and maintained the pipelines and data warehouse (Postgres) that power the Customer Success automated BI reporting platform; improved freshness from once-daily to twice-daily runs by optimizing queries and schedules (see the incremental-refresh sketch after this list). Reduced manual prep time:
QBRs: from 2 hours to 10 minutes per customer.
Monthly reports: from 1 hour to 5 minutes per customer.
Post-attack reports: from 2 hours to 10 minutes per incident.
Threat-Intel report data sourcing: from 6 hours exporting data from multiple sources to 30 minutes pulling centralized data from the BI reporting platform; latest example: last quarter's Threat Actor Report – Scammer Focus.
• Led the end-to-end data engineering workflow for two CS churn-prevention initiatives: from source-system discovery and API integrations, through transforming raw data (bronze) into clean datasets (silver), to loading curated Postgres tables (gold); see the medallion sketch after this list.
Account Health Score: blends onboarding, adoption, performance, and retention to flag risk up to 6 months earlier.
Customer Quadrant Classification: segments by health × value to focus actions and target ≥5 expansion opps/quarter.
• Planned, scoped, and executed the migration of Python ETL code from ECS Fargate/ECR to Airflow (Amazon MWAA) for more reliable runs, better isolation of data processing, stronger QA checks, and less manual work. Designed a phased migration plan backed by a pipeline inventory and dependency map. In Airflow, DAGs run under one scheduler with explicit retries and simple monitoring, and heavy jobs were split into smaller tasks to isolate failures (see the DAG sketch after this list). 5 pipelines migrated successfully; 2 remain to finish in Q3.
• Built a Slack alerting system in Python that queries Gold tables (PostgreSQL) and posts account-specific alerts to CS and account Slack channels, flagging account health changes and threshold breaches for proactive action (see the alerting sketch after this list).
• Ran an 8-lesson AI upskilling program for a 6-person team over 8 weeks: designed structured training on LLMs/ChatGPT, how tokens work, prompting, RAG/retrieval, evaluation, and agentic workflows; delivered slide decks and hands-on labs. Outcome: teammates reported better day-to-day use of AI tools and stronger problem-solving.
• Launched a Security+ cohort for my team: scoped study phases, led weekly study sessions, sourced materials, and scheduled practice tests to prepare for certification.
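
Incremental-refresh sketch (referenced in the pipelines bullet above). A minimal illustration of the query pattern behind the twice-daily freshness gain: upsert only rows changed since the last run instead of rebuilding whole tables. Table and column names (events_raw, events_gold, updated_at) are hypothetical, not the production schema.

```python
"""Minimal sketch of an incremental warehouse refresh, assuming Postgres.

Table and column names are hypothetical placeholders; the real pipeline
and schema are internal.
"""
import psycopg2

# Upsert only rows changed since the last run instead of rebuilding the
# whole table -- this is what makes a twice-daily schedule cheap enough.
# Assumes a unique constraint on (customer_id, metric) in events_gold.
INCREMENTAL_UPSERT = """
INSERT INTO events_gold (customer_id, metric, value, updated_at)
SELECT customer_id, metric, value, updated_at
FROM events_raw
WHERE updated_at > %(last_run)s
ON CONFLICT (customer_id, metric)
DO UPDATE SET value = EXCLUDED.value, updated_at = EXCLUDED.updated_at;
"""

def refresh(conn, last_run):
    with conn, conn.cursor() as cur:
        cur.execute(INCREMENTAL_UPSERT, {"last_run": last_run})
        return cur.rowcount  # rows touched this run
```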
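
Medallion sketch (referenced in the churn-prevention bullet above). A minimal bronze → silver → gold flow, assuming a JSON source API and loads through pandas/SQLAlchemy; the endpoint, field names, and gold table name are illustrative placeholders, not the actual source systems.

```python
"""Minimal bronze -> silver -> gold sketch, assuming a JSON source API."""
import requests
import pandas as pd
from sqlalchemy import create_engine

def extract_bronze(api_url: str) -> pd.DataFrame:
    # Bronze: land the raw API payload as-is, no transformation.
    resp = requests.get(api_url, timeout=30)
    resp.raise_for_status()
    return pd.DataFrame(resp.json())

def transform_silver(bronze: pd.DataFrame) -> pd.DataFrame:
    # Silver: normalize column names, drop duplicates, clean types.
    silver = bronze.copy()
    silver.columns = [c.lower().strip() for c in silver.columns]
    silver = silver.drop_duplicates(subset=["account_id"])  # hypothetical key
    silver["event_date"] = pd.to_datetime(silver["event_date"])
    return silver

def load_gold(silver: pd.DataFrame, pg_dsn: str) -> None:
    # Gold: curated, analytics-ready Postgres table consumed by Looker Studio.
    engine = create_engine(pg_dsn)
    silver.to_sql("account_health_gold", engine, if_exists="replace", index=False)
```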
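
DAG sketch (referenced in the Airflow migration bullet above). A minimal DAG showing the migration's shape: one scheduler, explicit retries, and an extract/transform/load split so failures are isolated. The dag_id, cron schedule, and task bodies are illustrative, not the production configuration.

```python
"""Minimal Airflow (MWAA-compatible) DAG sketch with retries and split tasks."""
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull from source systems")    # placeholder task body

def transform():
    print("raw -> clean datasets")       # placeholder task body

def load():
    print("write gold Postgres tables")  # placeholder task body

with DAG(
    dag_id="cs_reporting_pipeline",      # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule_interval="0 6,18 * * *",    # twice daily, matching the freshness gain
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    # Splitting the job into tasks isolates failures: a load error
    # retries alone instead of rerunning the whole pipeline.
    extract_t = PythonOperator(task_id="extract", python_callable=extract)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)
    extract_t >> transform_t >> load_t
```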
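
Alerting sketch (referenced in the Slack bullet above). A minimal version of the Gold-table-to-Slack flow, assuming slack_sdk and psycopg2; the query, threshold, and channel map are hypothetical stand-ins for the production logic.

```python
"""Minimal sketch of the Gold-table -> Slack alerting flow."""
import psycopg2
from slack_sdk import WebClient

# Hypothetical query: accounts whose health score dropped past a threshold.
HEALTH_DROP_QUERY = """
SELECT account_name, health_score, previous_score
FROM account_health_gold
WHERE health_score < previous_score - %(threshold)s;
"""

def post_health_alerts(pg_dsn: str, slack_token: str, channel_map: dict,
                       threshold: int = 10) -> None:
    client = WebClient(token=slack_token)
    with psycopg2.connect(pg_dsn) as conn, conn.cursor() as cur:
        cur.execute(HEALTH_DROP_QUERY, {"threshold": threshold})
        for account, score, prev in cur.fetchall():
            # Route each alert to that account's CS channel, with a fallback.
            client.chat_postMessage(
                channel=channel_map.get(account, "#cs-alerts"),
                text=f":warning: {account} health dropped {prev} -> {score}",
            )
```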