Data Architecture
The following diagram illustrates our data architecture:

Data Sources
Our data pipeline begins by aggregating a wide range of structured and semi-structured data from multiple sources, including:
- APIs
- External and Internal Databases
- Files (e.g., CSV, JSON, Parquet)
- RSS Feeds
We standardize these inputs to preserve traceability and consistency, as sketched below.
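As a minimal sketch of what this standardization step could look like, the snippet below assumes pandas and requests are available. The source locations, loader names, and lineage columns (`_source`, `_ingested_at`) are illustrative assumptions, not Aterio's actual schema.

```python
# A minimal standardization sketch, assuming pandas and requests.
# Paths, URLs, and metadata columns are illustrative placeholders.
from datetime import datetime, timezone

import pandas as pd
import requests


def load_source(kind: str, location: str) -> pd.DataFrame:
    """Read one raw input into a DataFrame, whatever its original format."""
    if kind == "csv":
        return pd.read_csv(location)
    if kind == "json":
        return pd.read_json(location)
    if kind == "parquet":
        return pd.read_parquet(location)
    if kind == "api":
        return pd.DataFrame(requests.get(location, timeout=30).json())
    raise ValueError(f"unsupported source kind: {kind}")


def standardize(df: pd.DataFrame, source_name: str) -> pd.DataFrame:
    """Normalize column names and attach lineage metadata for traceability."""
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    df["_source"] = source_name
    df["_ingested_at"] = datetime.now(timezone.utc)
    return df


# Example: bring a CSV file and an API feed into one consistent shape.
frames = [
    standardize(load_source("csv", "raw/permits.csv"), "permits_csv"),
    standardize(load_source("api", "https://example.com/v1/feed"), "partner_api"),
]
```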
Data Processing (ETL)
The data flows into our GCP environment, where orchestrated ETL processes transform and structure it inside our Data Warehouse and Data Marts.
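As one illustration of what a single orchestrated ETL step might look like on GCP, the sketch below assumes Apache Airflow (e.g., on Cloud Composer) with the Google provider package installed. The DAG id, dataset names, and SQL are hypothetical placeholders, not the real pipeline.

```python
# A minimal orchestration sketch, assuming Apache Airflow 2.x with the
# apache-airflow-providers-google package. All names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryInsertJobOperator,
)

with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Transform raw landing-zone rows into a structured warehouse table.
    build_warehouse_table = BigQueryInsertJobOperator(
        task_id="build_warehouse_table",
        configuration={
            "query": {
                "query": """
                    CREATE OR REPLACE TABLE warehouse.permits AS
                    SELECT *
                    FROM staging.permits_raw
                    WHERE ingested_at >= TIMESTAMP('{{ ds }}')
                """,
                "useLegacySql": False,
            }
        },
    )
```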
Data Delivery
After the data is processed and validated, it is distributed through multiple delivery mechanisms to meet diverse client needs (a short delivery sketch follows the list), including:
- Databricks
- Snowflake
- AWS S3
- GCP (BigQuery & Cloud Storage)
- Aterio's Marketplace
- Aterio's API
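As a rough sketch of the fan-out to cloud storage destinations, the snippet below assumes pandas with the `gcsfs` and `s3fs` extras installed; bucket names and paths are hypothetical. Snowflake and Databricks deliveries would use their own loaders (e.g., Snowflake's `write_pandas`), which are omitted here.

```python
# A minimal fan-out delivery sketch, assuming pandas with gcsfs and s3fs.
# Bucket names and dataset paths are hypothetical placeholders.
import pandas as pd

# Read a processed data mart (hypothetical warehouse-bucket path).
processed = pd.read_parquet("gs://warehouse-bucket/marts/permits/")

# Deliver the same validated dataset to client-facing destinations.
processed.to_parquet("gs://client-deliveries/permits.parquet")  # GCP Cloud Storage
processed.to_parquet("s3://client-deliveries/permits.parquet")  # AWS S3
```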