JohaAlarcon/README.md

👩‍💻 Johana Alarcón Moya

Data Engineer | Backend Developer | AI-Driven Automation Specialist

Portfolio | LinkedIn | Platzi


💬 About Me

Hello! I’m Johana, a versatile Data Engineer with a background in Electronic Engineering and extensive experience in project management.
My passion lies in turning raw information into scalable, production-grade data products on Google Cloud Platform (GCP). I design serverless architectures, orchestrate complex automations, and apply NLP & AI to extract insights from massive legislative document collections.


🏆 Highlights

  • Cloud-Native Pipelines: End-to-end design and deployment of data pipelines on GCP (Cloud Functions, Cloud Run, Workflows, Cloud Storage, Pub/Sub & managed PostgreSQL) handling millions of records daily.
  • Automation Orchestration: Coordinated large-scale scraping, bulk PDF downloads, OCR, and database loading with n8n and Google Workflows.
  • OCR & AI Services: Integrated Tesseract (Cloud Run) and OpenAI APIs for text extraction, automatic summarization, embeddings generation, and topic classification.
  • Secure Microservices: Built FastAPI microservices containerized with Docker, secured by dynamic ID-tokens (Cloud Run → Cloud Run) to enable credential-free calls from n8n & Workflows.
  • Knowledge Graphs: Created semantic graphs of legislative projects, authors, and topics using text-embedding-3-small embeddings and graph visualization libraries.
  • Cost & Security Optimization: Implemented fine-grained IAM, VPC connectors, and storage class tuning to cut GCP costs ~40 % while meeting compliance requirements.
  • Domain Expertise: Specialized in large-scale processing of government & legislative documents across LATAM.
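
To illustrate the knowledge-graph highlight, here is a minimal sketch of how documents can be linked by embedding similarity. It is not the production implementation: the toy 3-dimensional vectors stand in for real text-embedding-3-small output, and the function names, document IDs, and the 0.8 threshold are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_similarity_graph(embeddings, threshold=0.8):
    """Return an adjacency list linking documents whose similarity meets the threshold."""
    graph = {doc_id: [] for doc_id in embeddings}
    for (id_a, vec_a), (id_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            # Undirected edge: record it on both endpoints.
            graph[id_a].append((id_b, score))
            graph[id_b].append((id_a, score))
    return graph


# Hypothetical toy vectors; real embeddings have far more dimensions.
toy_embeddings = {
    "bill-education": [0.9, 0.1, 0.0],
    "bill-schools": [0.8, 0.2, 0.1],
    "bill-mining": [0.0, 0.1, 0.9],
}

graph = build_similarity_graph(toy_embeddings, threshold=0.8)
for doc_id, neighbours in graph.items():
    print(doc_id, "->", [name for name, _ in neighbours])
```

With these toy vectors the two education-related bills end up connected and the mining bill stays isolated. The resulting adjacency list can then be handed to a graph visualization library.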

🛠️ Skills

| Area | Tech & Tools |
| --- | --- |
| Languages | Python (Advanced), SQL (Advanced) |
| Data Engineering | GCP (Cloud Run, Cloud Functions, Workflows, Pub/Sub, Cloud Storage), Docker, dbt, Apache Airflow, CrateDB |
| Automation / Orchestration | n8n, Google Workflows |
| NLP & AI | OpenAI GPT, text-embedding-3-small, spaCy, LangChain |
| OCR | Tesseract (+ custom wrapper in Cloud Run) |
| Data Viz / BI | Metabase, Looker Studio, Power BI, Kepler.gl |
| Dev Tools | Git & GitHub, FastAPI, Poetry, VS Code |
| Databases | PostgreSQL (Managed / Self-hosted), Snowflake, MySQL, SQLite |

🌱 Currently Learning

  • Advanced LangGraph / CopilotKit patterns for agentic workflows.
  • Vertex AI pipelines for scalable model serving on GCP.

👩‍💼 Professional Experience

Data Engineer | AI & Cloud Automation Specialist @ iMakia @Kitsune

Oct 2023 – Present

  • Architected a serverless GCP data platform powering legislative intelligence products.
  • Implemented multi-stage ETL/ELT pipelines with Cloud Run + Workflows, reducing manual processing time by 80 %.
  • Deployed OCR & NLP microservices (Tesseract + OpenAI) generating rich metadata, summaries, and embeddings for >50 K documents.
  • Led cost-optimization initiative: storage tiering & idle-instance scheduling cut monthly spend from $1.2k → $700.

Service Center Director @ ISEC SA

Apr 2022 – May 2023

  • Managed electronic-security projects for public & private sectors (incl. Ecopetrol).
  • Introduced data-driven KPIs (Excel + Power BI) boosting SLA adherence by 15 %.

Project Professional & Support Engineer @ ISEC SA

Feb 2012 – Mar 2022

  • Oversaw maintenance of >1 000 surveillance devices; reduced mean time to repair by 25 %.
  • Championed root-cause analysis culture, improving system reliability.

📚 Certifications

  • Big Data Certified Professional – Talento Tech MINTIC (Oct 2024)
  • Data Analytics Certified Professional – Talento Tech MINTIC (Oct 2024)
  • Project Management Master – ENEB (May 2023)
  • Data Analysis with Python – Platzi (Jun 2023)

🌐 Community

  • PyLadies Bogotá – Active Member
  • Python Colombia – Contributor
  • Volunteer – JS Conf CO 2023, PyCon CO 2024
  • Creator – FastAPI Workshop Chapter

📫 Contact

Pinned

  1. Data-with-Snowflake-and-Airflow (Public)

    Data ingestion for football leagues

    Python