AQA Data Integration Pipeline

Interactive Workflow — ARPA + ERA5 + Sentinel-5P

1️⃣ BPMN Process

flowchart TD
    A1([Set Parameters]) --> A2([Load ARPA Metadata])
    A2 --> A3([Download ARPA Measurements])
    A3 --> A4([Download ERA5 Data])
    A4 --> A5([Download Sentinel-5P])
    A5 --> A6([Merge + Harmonize])
    A6 --> A7([Compute Period Means])
    A7 --> A8([Export Integrated Dataset])
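The process above is a strictly linear chain, so one way to read it is as an ordered list of step functions that pass a shared context along. The sketch below only illustrates that reading: `make_stub`, `run_pipeline`, `PIPELINE`, and the `completed` key are hypothetical names, and every step is a stub that records its own name instead of calling ARPA, ERA5, or Sentinel-5P.

```python
from typing import Any, Callable, Dict, List, Optional

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def make_stub(name: str) -> Step:
    """Return a placeholder step that only records that it ran."""
    def step(ctx: Context) -> Context:
        ctx.setdefault("completed", []).append(name)
        return ctx
    return step


# Step names mirror the eight boxes of the flowchart (hypothetical identifiers).
PIPELINE: List[Step] = [
    make_stub(name)
    for name in (
        "set_parameters",
        "load_arpa_metadata",
        "download_arpa_measurements",
        "download_era5_data",
        "download_sentinel5p",
        "merge_and_harmonize",
        "compute_period_means",
        "export_integrated_dataset",
    )
]


def run_pipeline(steps: List[Step], ctx: Optional[Context] = None) -> Context:
    """Run the steps in order, feeding each step the context of the previous one."""
    ctx = {} if ctx is None else ctx
    for step in steps:
        ctx = step(ctx)
    return ctx


result = run_pipeline(PIPELINE)
print(result["completed"])
```

Replacing a stub with a real function leaves the runner unchanged, which keeps each stage testable on its own.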

2️⃣ ER Diagram

erDiagram
    CONFIG ||--o{ ARPA_SENSOR : defines
    ARPA_SENSOR ||--o{ ARPA_MEASUREMENT : records
    CONFIG ||--o{ ERA5_FILE : triggers
    ERA5_FILE ||--o{ ERA5_VARIABLE : contains
    ERA5_VARIABLE }o--|| GRID_POINT : located_on
    SENTINEL_OBSERVATION }o--|| GRID_POINT : observed_at
    SUMMARY }o--|| GRID_POINT : summarized_at

3️⃣ Block Architecture

graph TD
    UI[User Interface] --> ACQ[Data Acquisition]
    ACQ --> PROC[Processing]
    PROC --> STORE[Storage]
    STORE --> VIS[Visualization]
    ACQ1[ARPA API] --> ACQ
    ACQ2[ERA5 CDS API] --> ACQ
    ACQ3[Sentinel-5P GEE] --> ACQ

4️⃣ UML Use Case

%%{init: {'theme': 'default'}}%%
flowchart TD
    %% Define actors
    R([Researcher])
    API([External API])
    %% Use cases
    UC1([Configure Analysis])
    UC2([Download Data])
    UC3([Analyze Results])
    UC4([Export Dataset])
    UC5([Provide Data])
    UC6([Merge and Harmonize])
    UC7([Generate Outputs])
    %% Relationships
    R --> UC1
    R --> UC2
    R --> UC3
    R --> UC4
    API --> UC5
    UC2 --> UC6
    UC6 --> UC7

5️⃣ UX / User Flow

flowchart TD
    A["User opens the app"]
    B["Selects date range and area"]
    C["System sets configuration parameters"]
    D["Selects province and pollutant"]
    E["Downloads ARPA, ERA5, and Sentinel data"]
    F["Merges datasets into unified grid"]
    G["Computes daily mean values"]
    H["Generates CSV and GeoDataFrame"]
    I["Visualizes maps and charts"]
    J["User explores results"]
    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    F --> G
    G --> H
    H --> I
    I --> J

6️⃣ Pipeline Data Flow

graph LR
    A[ARPA Metadata] --> B[Sensor Catalog]
    C[ERA5 Files] --> D[ERA5 Variables]
    E[Sentinel-5P] --> F[Pollutant Maps]
    B --> G[Integration]
    D --> G
    F --> G
    G --> H[Unified Dataset]
    H --> I[Export CSV/GeoDataFrame]

7️⃣ Integration & Interpolation

flowchart TD
    ERA1[ERA5 Variables] --> ERA2[Interpolate on Grid]
    ERA2 --> ERA3[ERA5 Interpolated Table]
    S1[Sentinel-5P Data] --> S2[Daily Mean]
    S2 --> S3[Join with Grid]
    ERA3 & S3 --> F[Integrated ERA5 + Sentinel + ARPA]
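The two branches of this diagram can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: bilinear interpolation stands in for "Interpolate on Grid" (the source does not name the method), the ERA5 axes are assumed ascending, the function names `bilinear`, `daily_mean`, and `join_on_grid` are hypothetical, and all numbers are made up. The ARPA branch is left out because the diagram does not show how it enters the join.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def bilinear(lat, lon, lats, lons, values):
    """Interpolate values[i][j] (row i = lats[i], column j = lons[j]) at (lat, lon).

    Assumes ascending lats/lons and a point inside the grid extent.
    """
    # Index of the cell whose lower-left corner is at or below the point.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    south = values[i][j] * (1 - tx) + values[i][j + 1] * tx
    north = values[i + 1][j] * (1 - tx) + values[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


def daily_mean(observations):
    """Average (grid_id, day, value) records per grid point and day."""
    groups = defaultdict(list)
    for grid_id, day, value in observations:
        groups[(grid_id, day)].append(value)
    return {key: mean(vals) for key, vals in groups.items()}


def join_on_grid(era5_table, sentinel_table):
    """Inner-join two {(grid_id, day): ...} tables on their shared keys."""
    rows = []
    for key in sorted(era5_table.keys() & sentinel_table.keys()):
        grid_id, day = key
        rows.append({"grid_id": grid_id, "date": day,
                     **era5_table[key], "sentinel_no2": sentinel_table[key]})
    return rows


# Made-up 2x2 ERA5 cell and one target grid point at its centre.
lats, lons = [45.0, 45.25], [9.0, 9.25]
t2m = [[280.0, 282.0], [284.0, 286.0]]
grid_points = {1: (45.125, 9.125)}

era5_table = {
    (gid, "2024-01-01"): {"t2m": bilinear(lat, lon, lats, lons, t2m)}
    for gid, (lat, lon) in grid_points.items()
}
sentinel_table = daily_mean([(1, "2024-01-01", 2.0), (1, "2024-01-01", 4.0)])
integrated = join_on_grid(era5_table, sentinel_table)
print(integrated)
```

Keying both tables on `(grid_id, day)` makes the final join a simple key intersection; a pandas or GeoPandas merge on the same two columns would express the same idea.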

8️⃣ Output Data Model

classDiagram
    class IntegratedDataset {
        int grid_id
        float lat
        float lon
        string pollutant
        float prev_mean
        float curr_mean
        float sentinel_no2
        float sentinel_co
        float t2m_c
        float sp_hpa
        float wind_speed
        float blh_km
        float tp
        date date
    }
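For readers who prefer code to a class diagram, the same record can be written as a Python dataclass and serialized to CSV with the standard library. This is a sketch only: the field names and types come from the diagram, while the `to_csv` helper and the example values are hypothetical, and the units in the comments are inferred from the field-name suffixes rather than stated in the source.

```python
import csv
import datetime as dt
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable


@dataclass
class IntegratedDataset:
    grid_id: int
    lat: float
    lon: float
    pollutant: str
    prev_mean: float
    curr_mean: float
    sentinel_no2: float
    sentinel_co: float
    t2m_c: float       # assumed degrees Celsius (from the "_c" suffix)
    sp_hpa: float      # assumed hectopascal (from the "_hpa" suffix)
    wind_speed: float
    blh_km: float      # assumed kilometres (from the "_km" suffix)
    tp: float
    date: dt.date


def to_csv(rows: Iterable[IntegratedDataset]) -> str:
    """Serialize rows to CSV text, with columns in the order of the data model."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(IntegratedDataset)]
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Placeholder values for illustration only, not real measurements.
example = IntegratedDataset(
    grid_id=1, lat=45.125, lon=9.125, pollutant="NO2",
    prev_mean=0.0, curr_mean=0.0, sentinel_no2=0.0, sentinel_co=0.0,
    t2m_c=0.0, sp_hpa=0.0, wind_speed=0.0, blh_km=0.0, tp=0.0,
    date=dt.date(2024, 1, 1),
)
csv_text = to_csv([example])
print(csv_text)
```

Deriving the CSV header from the dataclass fields keeps the export aligned with the data model if a column is added later.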

9️⃣ Tools Summary

graph LR
    A[APIs] -->|Fetch| B[Python Scripts]
    B -->|Process| C[DataFrame]
    C -->|Store| D[results/ Folder]
    D -->|Visualize| E[Jupyter / GEE]
    E -->|Publish| F[GitHub Pages]