Healpyxel

HEALPix Aggregate

Aggregate data by HEALPix cells (batch processing)

Usage Example

See the main() function for CLI usage, or import functions directly for programmatic use.

Function Reference

Core Aggregation Functions

  • aggregate_by_sidecar() - Main aggregation function that merges sidecar mappings with original data and computes statistics by HEALPix cell
  • densify_healpix_aggregates() - Fills sparse HEALPix grid to include all cells (empty cells filled with NaN)
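
The densification step can be pictured with a short NumPy sketch. It is only an illustration and not the signature of densify_healpix_aggregates(): the function name densify_sketch and its arguments are hypothetical. It relies on the standard HEALPix fact that a map at resolution nside has 12 × nside² cells.

```python
import numpy as np

def densify_sketch(cell_ids, values, nside):
    """Hypothetical helper: expand sparse per-cell values to a full-sky array.

    Cells without data are left as NaN, mirroring the behaviour described
    for densify_healpix_aggregates().
    """
    npix = 12 * nside**2               # total number of HEALPix cells
    dense = np.full(npix, np.nan)      # start with every cell empty
    dense[np.asarray(cell_ids)] = values
    return dense

# Two populated cells out of 12 at nside=1
dense = densify_sketch([2, 5], [1.5, 3.5], nside=1)
```

The dense array has one entry per cell, which makes it easy to hand to plotting code that expects a complete map.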

Sidecar Management

  • collect_sidecar_outputs() - Scans a directory for sidecar files matching the input file stem and parses metadata from their filenames
  • validate_sidecar_metadata() - Validates .meta.json files and checks source_file consistency
  • extract_nside_from_filename() - Extracts nside parameter from filename using regex (fallback method)
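
As a rough illustration of the regex fallback, the sketch below pulls an nside value out of a filename. The exact filename pattern used by the package is not documented on this page, so the `nside<digits>` token assumed here, the pattern itself, and the function name are all hypothetical.

```python
import re
from typing import Optional

# Assumed token format: "nside64", "nside_64" or "nside-64" (hypothetical)
_NSIDE_RE = re.compile(r"nside[_-]?(\d+)", re.IGNORECASE)

def nside_from_filename_sketch(filename: str) -> Optional[int]:
    """Return the nside encoded in a filename, or None if no token is found."""
    match = _NSIDE_RE.search(filename)
    return int(match.group(1)) if match else None
```

Returning None rather than raising lets a caller prefer the .meta.json metadata and fall back to the filename only when it is needed.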

File Operations

  • generate_output_filename() - Creates output filename following naming convention: <stem>-aggregated.<sidecar_suffix>.parquet
  • print_parquet_schema() - Displays parquet file schema and metadata
  • print_sidecar_summary() - Shows formatted table of available sidecars with statistics
  • print_dry_run_summary() - Previews batch processing operations without executing them
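
The naming convention `<stem>-aggregated.<sidecar_suffix>.parquet` can be sketched with pathlib. This is not the implementation of generate_output_filename(): the function name is hypothetical, and writing the output next to the input file is an assumption of the sketch.

```python
from pathlib import Path

def output_filename_sketch(input_path, sidecar_suffix):
    """Build '<stem>-aggregated.<sidecar_suffix>.parquet' beside the input file.

    Placing the result in the input's directory is an assumption.
    """
    input_path = Path(input_path)
    name = f"{input_path.stem}-aggregated.{sidecar_suffix}.parquet"
    return input_path.with_name(name)

out = output_filename_sketch("obs.parquet", "nside64")
```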

Batch Processing

  • process_single_sidecar() - Processes one sidecar file with full aggregation workflow
  • parse_arguments() - CLI argument parser with validation for batch mode options
  • main() - Entry point supporting single/batch processing with comprehensive error handling

Aggregation Functions

Available statistical functions in AGG_LOOKUP:

  • mean - Arithmetic mean (ignores NaN)
  • median - Median value (ignores NaN)
  • std - Standard deviation
  • min / max - Minimum/maximum values
  • mad - Median Absolute Deviation (robust statistic)
  • robust_std - MAD × 1.4826 (approximates std for normal distributions)
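
To make the mad and robust_std entries concrete, here is a minimal NumPy sketch of the two statistics. The function names are illustrative and are not the callables stored in AGG_LOOKUP. Skipping NaN values is an assumption here, since the page states NaN handling only for mean and median.

```python
import numpy as np

def mad_sketch(values):
    """Median absolute deviation around the median (NaN skipped by assumption)."""
    values = np.asarray(values, dtype=float)
    center = np.nanmedian(values)
    return np.nanmedian(np.abs(values - center))

def robust_std_sketch(values):
    """MAD scaled by 1.4826, which approximates std for normally distributed data."""
    return 1.4826 * mad_sketch(values)

# One outlier (100.0) and one missing value
sample = np.array([1.0, 2.0, 3.0, 4.0, 100.0, np.nan])
sample_mad = mad_sketch(sample)
sample_robust_std = robust_std_sketch(sample)
```

For this sample the median is 3.0 and the absolute deviations are 2, 1, 0, 1, and 97, so the MAD is 1.0 and the outlier has almost no influence on the result.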

Batch Processing Features

New in this version:

  • --sidecar-index all - Process all sidecars in batch mode
  • --sidecar-index 0 1 2 - Process specific sidecar indices
  • --stop-on-error - Halt batch processing on first error (default: continue)
  • --list-sidecars --stats - Show sidecar statistics (row counts, unique cells)
  • --sidecar-schema INDEX - Display schema of specific sidecar
  • --dry-run - Preview operations without writing files
  • Comprehensive batch summary with success/error reporting
  • Metadata validation with lenient/strict modes
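
One way an option such as --sidecar-index could accept either the keyword all or a list of integers is sketched below with argparse. This is not the package's parse_arguments(): the parser, the helper resolve_indices_sketch, and the default value are hypothetical, and only three of the flags listed above are modelled.

```python
import argparse

def build_parser_sketch():
    """Hypothetical parser covering a subset of the batch options."""
    parser = argparse.ArgumentParser(prog="aggregate-sketch")
    # Accepts "all" or one or more integers; the default is for this sketch only
    parser.add_argument("--sidecar-index", nargs="+", default=["0"], metavar="INDEX")
    parser.add_argument("--stop-on-error", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser

def resolve_indices_sketch(raw, n_sidecars):
    """Turn ['all'] or ['0', '2'] into a validated list of integer indices."""
    if raw == ["all"]:
        return list(range(n_sidecars))
    indices = [int(token) for token in raw]
    for index in indices:
        if not 0 <= index < n_sidecars:
            raise ValueError(f"sidecar index {index} out of range")
    return indices

args = build_parser_sketch().parse_args(["--sidecar-index", "all", "--dry-run"])
selected = resolve_indices_sketch(args.sidecar_index, n_sidecars=3)
```

Validating indices before any processing starts means a typo is reported up front instead of partway through a batch.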
