Peddy.jl Documentation Overview

Welcome to Peddy.jl – a comprehensive Julia package for processing eddy covariance data with a modular, high-performance pipeline architecture.

What is Peddy.jl?

Peddy.jl provides a complete framework for eddy covariance data processing, from raw measurements to publication-ready results. It features:

Modular Pipeline Architecture: Each processing step is pluggable and can be customized or extended
High-Performance Implementation: Optimized Julia code for fast processing of large datasets
Multiple Sensor Support: Built-in support for Campbell Scientific CSAT3, CSAT3B, and IRGASON, LI-COR gas analyzers, and custom sensors
Comprehensive Processing Steps: Quality control, despiking, gap filling, coordinate transformation, and multiresolution decomposition
Flexible Output: Write results to CSV, NetCDF, or memory for further analysis

Documentation Structure

This documentation is organized by use case and technical depth:

For New Users

Start here if you're new to Peddy.jl or eddy covariance processing:

Quick Reference – Copy-paste examples for common tasks
- Minimal working example
- Common configurations
- Data access patterns
- Cheat sheets for each step
Tutorial – Hands-on guide with detailed explanations
- Installation and setup
- Working with synthetic data
- Configuring processing steps
- Running the pipeline
- Accessing results
Data Format & Architecture – Understanding Peddy's data model
- DimensionalData.jl basics
- High-frequency and low-frequency data structure
- Accessing and modifying data
- Working with missing values

For Active Users

Use these guides while working with Peddy.jl:

Sensor Configuration – Choosing and configuring sensors
- Supported sensors (CSAT3, CSAT3B, IRGASON, LICOR)
- Sensor-specific workflows
- Diagnostic interpretation
- Creating custom sensors
API Reference – Complete function and type documentation
- All pipeline steps with parameters
- Input/output interfaces
- Logging system
- Utility functions
Troubleshooting & FAQ – Solving common problems
- Common errors and solutions
- Performance optimization
- Data quality issues
- Debugging techniques

For Advanced Users

Extend and customize Peddy.jl for your needs:

Extending Peddy.jl – Creating custom pipeline steps
- Custom quality control
- Custom despiking algorithms
- Custom gap filling methods
- Custom output formats
- Custom sensors
- Best practices and patterns
Best Practice – Julia development guidelines
- Project organization
- Dependency management
- Development workflow
- Publishing and reproducibility

By Task

I want to...

Process data quickly: Start with Quick Reference → Tutorial
Understand the data format: Read Data Format & Architecture
Choose a sensor: See Sensor Configuration
Configure a specific step: Check API Reference
Fix an error: Look in Troubleshooting & FAQ
Create a custom step: Follow Extending Peddy.jl
Organize my project: Read Best Practice

By Experience Level

Beginner (new to Julia or eddy covariance):

Quick Reference - minimal example
Tutorial - detailed walkthrough
Data Format & Architecture - understand the data

Intermediate (familiar with Julia and eddy covariance):

Quick Reference - quick lookup
API Reference - function details
Sensor Configuration - sensor setup
Troubleshooting & FAQ - problem solving

Advanced (want to extend Peddy.jl):

API Reference - understand interfaces
Extending Peddy.jl - create custom steps
Data Format & Architecture - understand data flow
Best Practice - project organization

Pipeline Overview

Peddy.jl processes data through a configurable pipeline with these steps (in order):

Raw Data
   ↓
1. Quality Control (optional)
   ↓
2. Gas Analyzer Correction (optional)
   ↓
3. Despiking (optional)
   ↓
4. Make Continuous (optional)
   ↓
5. Gap Filling (optional)
   ↓
6. Double Rotation (optional)
   ↓
7. Multiresolution Decomposition (MRD, optional)
   ↓
8. Output
   ↓
Processed Data

Each step:

Is optional (set to nothing to skip)
Modifies data in-place (except MRD which stores results separately)
Can be customized or replaced with your own implementation

Key Concepts

DimensionalData.jl

Peddy.jl uses labeled arrays (DimArray) for all data:

hf = DimArray(
    data_matrix,
    (Var([:Ux, :Uy, :Uz, :Ts]), Ti(times))
)

# Access by label, not index
ux = hf[Var=At(:Ux)]

Benefits:

Type-safe dimension access
Self-documenting code
Less error-prone than plain matrices

Missing Data Handling

Peddy.jl uses NaN to represent missing values:

# Check for missing
n_missing = count(isnan, hf)

# Get statistics ignoring NaN
mean_val = Peddy.mean_skipnan(hf[Var=At(:Ux)])
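
As a rough illustration of NaN-aware statistics, the plain-Julia sketch below computes a mean that ignores NaN entries. The name nanmean_sketch is hypothetical and not part of Peddy.jl; Peddy.mean_skipnan may be implemented differently.

```julia
using Statistics

# Hypothetical helper (not Peddy.jl API): mean of the non-NaN entries.
# Returns NaN of the input element type when every entry is NaN.
function nanmean_sketch(x::AbstractVector{T}) where {T<:AbstractFloat}
    valid = filter(!isnan, x)
    return isempty(valid) ? T(NaN) : mean(valid)
end

ux = [1.0, NaN, 3.0, NaN]
n_missing = count(isnan, ux)     # number of NaN entries
ux_mean = nanmean_sketch(ux)     # averages only 1.0 and 3.0
```

Because NaN propagates through ordinary arithmetic, a plain mean(ux) would return NaN here, which is why NaN-aware helpers are needed.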

Modular Steps

Each pipeline step is a type implementing an abstract interface:

using Peddy

# Qualified names work whether or not Peddy exports them
struct MyCustomStep <: Peddy.AbstractDespiking
    # fields
end

function Peddy.despike!(step::MyCustomStep, hf, lf; kwargs...)
    # implementation: modify hf in place
end

This allows easy extension and customization.
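
To make the pattern concrete without depending on Peddy.jl, here is a minimal self-contained sketch. All names in it (AbstractDespikingSketch, MADDespikingSketch, despike_sketch!) are stand-ins invented for this example, and the median-based rule is just one simple way to flag spikes, not the algorithm Peddy.jl ships.

```julia
using Statistics

# Stand-in for an abstract step type; Peddy.jl defines its own.
abstract type AbstractDespikingSketch end

# Hypothetical step: flag points far from the median, measured in units of
# the median absolute deviation (MAD).
struct MADDespikingSketch <: AbstractDespikingSketch
    threshold::Float64
end

# Hypothetical method mirroring the `despike!` naming; mutates `x` in place.
function despike_sketch!(step::MADDespikingSketch, x::AbstractVector{Float64})
    valid = filter(!isnan, x)
    isempty(valid) && return x
    med = median(valid)
    mad = median(abs.(valid .- med))
    mad == 0 && return x              # no spread to scale by: leave data untouched
    for i in eachindex(x)
        if !isnan(x[i]) && abs(x[i] - med) > step.threshold * mad
            x[i] = NaN                # spikes become missing values
        end
    end
    return x
end

signal = [1.0, 1.1, 0.9, 1.0, 25.0, 1.05, 0.95]
despike_sketch!(MADDespikingSketch(6.0), signal)
```

Here only the outlier at index 5 is replaced by NaN. A real step would receive the hf and lf arrays instead of a single vector.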

Common Workflows

Minimal Processing

pipeline = EddyPipeline(
    sensor=CSAT3(),
    quality_control=PhysicsBoundsCheck(),
    output=MemoryOutput()
)
process!(pipeline, hf, lf)

Standard Processing

pipeline = EddyPipeline(
    sensor=CSAT3(),
    quality_control=PhysicsBoundsCheck(),
    despiking=SimpleSigmundDespiking(),
    gap_filling=GeneralInterpolation(),
    output=ICSVOutput("/path/to/output")
)
process!(pipeline, hf, lf)

Full Processing with MRD

pipeline = EddyPipeline(
    sensor=IRGASON(),
    quality_control=PhysicsBoundsCheck(),
    gas_analyzer=H2OCalibration(),
    despiking=SimpleSigmundDespiking(),
    gap_filling=GeneralInterpolation(),
    double_rotation=WindDoubleRotation(),
    mrd=OrthogonalMRD(),
    output=NetCDFOutput("/path/to/output")
)
process!(pipeline, hf, lf)

Data Format Summary

High-Frequency Data

Fast measurements (typically 10-20 Hz):

hf = DimArray(
    data_matrix,
    (Var([:Ux, :Uy, :Uz, :Ts, :H2O, :P, :CO2]), Ti(times))
)
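
The placeholders times and data_matrix in the snippet above can be prepared with the Dates standard library. This sketch assumes a 20 Hz sampling rate, an arbitrary start time, and one row per variable (matching the Var-then-Ti dimension order shown above); the values are illustrative only.

```julia
using Dates

fs = 20                                   # assumed sampling rate in Hz
dt = Millisecond(1000 ÷ fs)               # 50 ms between samples
start = DateTime(2024, 1, 1)              # arbitrary example start time
times = collect(start:dt:(start + Second(10) - dt))   # 10 s of timestamps

variables = [:Ux, :Uy, :Uz, :Ts, :H2O, :P, :CO2]
# One row per variable and one column per timestamp.
# NaN marks entries that have not been filled with measurements yet.
data_matrix = fill(NaN, length(variables), length(times))
```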

Required variables depend on sensor:

All sensors: Ux, Uy, Uz, Ts
IRGASON/LICOR: also CO2, H2O, P
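
A quick pre-flight check can catch missing variables before the pipeline runs. The sketch below is plain Julia; the lookup table and the missing_variables helper are hypothetical, filled in from the list above, and are not part of Peddy.jl.

```julia
# Hypothetical lookup table based on the requirements listed above.
const REQUIRED_VARS_SKETCH = Dict(
    :CSAT3   => [:Ux, :Uy, :Uz, :Ts],
    :IRGASON => [:Ux, :Uy, :Uz, :Ts, :CO2, :H2O, :P],
)

# Return the required variables that are absent from `present`.
missing_variables(sensor::Symbol, present) =
    setdiff(REQUIRED_VARS_SKETCH[sensor], present)

present = [:Ux, :Uy, :Uz, :Ts, :H2O]
missing_for_csat3 = missing_variables(:CSAT3, present)
missing_for_irgason = missing_variables(:IRGASON, present)
```

With this input, the CSAT3 check comes back empty while the IRGASON check reports :CO2 and :P.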

Low-Frequency Data

Slow measurements (typically 1 Hz or slower), optional:

lf = DimArray(
    data_matrix,
    (Var([:TA, :RH, :P]), Ti(times))
)

Used for:

H₂O correction (needs :TA, :RH)
Reference measurements
Meteorological context

Processing Steps at a Glance

| Step | Purpose | When to Use | Example |
|------|---------|-------------|---------|
| Quality Control | Remove physically impossible values | Always | `PhysicsBoundsCheck()` |
| Gas Analyzer | Correct H₂O measurements | If using LICOR/IRGASON | `H2OCalibration()` |
| Despiking | Remove measurement spikes | Usually | `SimpleSigmundDespiking()` |
| Make Continuous | Insert missing timestamps | If data has gaps | `MakeContinuous()` |
| Gap Filling | Interpolate small gaps | Usually | `GeneralInterpolation()` |
| Double Rotation | Align with mean wind | For flux calculations | `WindDoubleRotation()` |
| MRD | Multiresolution analysis | For spectral analysis | `OrthogonalMRD()` |
| Output | Write results | Always | `ICSVOutput()` |
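
To illustrate what the Double Rotation row means by "align with mean wind", the sketch below applies the textbook double rotation to three short wind component vectors. It is a plain-Julia illustration under the assumption that the inputs contain no NaN; the function name is hypothetical and WindDoubleRotation may differ in detail (for example, by rotating per averaging block).

```julia
using Statistics

# Textbook double rotation (a sketch, not necessarily Peddy.jl's code):
# 1) rotate about the vertical axis so that mean(v) becomes zero,
# 2) rotate about the new lateral axis so that mean(w) becomes zero.
function double_rotation_sketch(u::Vector{Float64}, v::Vector{Float64}, w::Vector{Float64})
    theta = atan(mean(v), mean(u))            # yaw angle
    u1 = @. u * cos(theta) + v * sin(theta)
    v1 = @. -u * sin(theta) + v * cos(theta)

    phi = atan(mean(w), mean(u1))             # pitch angle
    u2 = @. u1 * cos(phi) + w * sin(phi)
    w2 = @. -u1 * sin(phi) + w * cos(phi)
    return u2, v1, w2
end

u = [2.0, 2.2, 1.8, 2.1]
v = [1.0, 0.9, 1.1, 1.0]
w = [0.1, 0.2, 0.0, 0.1]
u_rot, v_rot, w_rot = double_rotation_sketch(u, v, w)
```

After the rotation, the mean lateral and vertical components vanish (up to rounding) and the mean of u_rot equals the magnitude of the mean wind vector.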

Installation

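# Julia 1.11 required (the +1.11 channel selector needs juliaup)
# From a shell, in the repository root:
julia +1.11 --project=.
# Then, at the julia> prompt:
using Pkg; Pkg.instantiate()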

Or add to your environment:

using Pkg
Pkg.add("Peddy")

Getting Help

Check the documentation: Use the navigation above
Search for your error: Troubleshooting & FAQ
Look for examples: Quick Reference and Tutorial
Read the API docs: API Reference
Create a custom step: Extending Peddy.jl

Key Files and Directories

Peddy.jl/
├── src/                          # Source code
│   ├── Peddy.jl                 # Main module
│   ├── pipeline.jl              # EddyPipeline and process!
│   ├── despiking.jl             # Despiking implementation
│   ├── interpolation.jl         # Gap filling
│   ├── h2o_correction.jl        # H2O calibration
│   ├── double_rotation.jl       # Wind rotation
│   ├── logging.jl               # Logging system
│   ├── make_continuous.jl       # Time axis continuity
│   ├── IO/                      # Input/output
│   ├── QC/                      # Quality control
│   ├── Sensors/                 # Sensor definitions
│   └── MRD/                     # Multiresolution decomposition
├── docs/
│   ├── src/
│   │   ├── index.md            # Tutorial
│   │   ├── api.md              # API reference
│   │   ├── extending.md        # Extension guide
│   │   ├── data_format.md      # Data format guide
│   │   ├── sensors.md          # Sensor guide
│   │   ├── troubleshooting.md  # Troubleshooting
│   │   ├── quick_reference.md  # Quick reference
│   │   ├── best_practice.md    # Best practices
│   │   └── overview.md         # This file
│   └── make.jl                 # Documentation builder
├── test/                        # Tests
├── examples/                    # Example scripts
├── Project.toml                # Dependencies
└── README.md                   # Project README

Core Concepts

Abstract Types and Dispatch

Peddy.jl uses Julia's type system for extensibility:

abstract type PipelineStep end
abstract type AbstractQC <: PipelineStep end
abstract type AbstractDespiking <: PipelineStep end
# ... etc

Each step type implements its corresponding function:

quality_control!(qc::AbstractQC, hf, lf, sensor; kwargs...)
despike!(desp::AbstractDespiking, hf, lf; kwargs...)
# ... etc

This allows you to create custom steps by defining new types and implementing the interface.

In-Place Modifications

Most steps modify data in-place for efficiency:

# Data is modified in-place
quality_control!(qc, hf, lf, sensor)
despike!(desp, hf, lf)

# hf now contains modified data

Exception: MRD stores results separately:

decompose!(mrd, hf, lf)
results = get_mrd_results(mrd)  # Get results from mrd object

Optional Steps

All steps are optional – set to nothing to skip:

pipeline = EddyPipeline(
    sensor=sensor,
    quality_control=nothing,      # Skip QC
    despiking=SimpleSigmundDespiking(),
    gap_filling=nothing,          # Skip gap filling
    output=output
)
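
One common way to support skipping with nothing in Julia is to give the step function a no-op method for Nothing. The sketch below shows that idea with stand-in types; whether Peddy.jl implements skipping exactly this way is an assumption, and none of these names come from the package.

```julia
# Stand-in types for illustration; these are not Peddy.jl definitions.
abstract type StepSketch end

struct ClampStep <: StepSketch
    lo::Float64
    hi::Float64
end

# A configured step mutates the data ...
apply_step!(step::ClampStep, x::Vector{Float64}) = clamp!(x, step.lo, step.hi)
# ... while `nothing` dispatches to a no-op, so the caller needs no `if` checks.
apply_step!(::Nothing, x::Vector{Float64}) = x

x = [-5.0, 0.5, 9.0]
for step in (nothing, ClampStep(0.0, 1.0), nothing)
    apply_step!(step, x)
end
```

The loop treats configured and skipped steps uniformly; only the ClampStep changes x.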

Performance Considerations

Memory Usage

Slicing: Use @view when extracting sub-arrays, since plain slicing such as A[:, 1:n] allocates a copy
Large datasets: Process in time chunks
Output format: NetCDF is a binary format, so large files are typically smaller and faster to read than CSV
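
The difference between a copy and a view, and a simple chunking loop, can be seen with a plain vector. This is a generic Julia sketch; the chunk length is arbitrary and the vector stands in for one variable of a long record.

```julia
data = collect(1.0:12.0)            # stand-in for one long time series

# Slicing copies: changes to `copy_slice` do not reach `data`.
copy_slice = data[1:4]
copy_slice[1] = -1.0

# A view shares memory with the parent array: changes are visible in `data`.
view_slice = @view data[1:4]
view_slice[2] = 100.0

# Process the series in fixed-size chunks without copying each chunk.
chunk_len = 5
chunk_sums = Float64[]
for chunk_start in 1:chunk_len:length(data)
    chunk_stop = min(chunk_start + chunk_len - 1, length(data))
    push!(chunk_sums, sum(@view data[chunk_start:chunk_stop]))
end
```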

Speed

Disable expensive steps: Set to nothing if not needed
Reduce MRD parameters: Smaller M or larger shift
Use appropriate interpolation: Linear() is the cheapest option; Cubic() costs more and gives a smoother fill
Disable logging: Use NoOpLogger() when you do not need a processing log
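
For intuition about why linear interpolation is cheap, the sketch below fills short NaN runs with a straight line between the neighbouring valid samples and leaves longer runs untouched. The function and its max_gap parameter are hypothetical; GeneralInterpolation may use different rules and options.

```julia
# Fill runs of NaN by linear interpolation, but only when the run is at most
# `max_gap` samples long and has valid neighbours on both sides.
function fill_short_gaps!(x::Vector{Float64}, max_gap::Int)
    n = length(x)
    i = 1
    while i <= n
        if isnan(x[i])
            j = i
            while j < n && isnan(x[j + 1])
                j += 1                      # j ends on the last NaN of this run
            end
            gap_len = j - i + 1
            if i > 1 && j < n && gap_len <= max_gap
                left, right = x[i - 1], x[j + 1]
                for k in i:j
                    frac = (k - i + 1) / (gap_len + 1)
                    x[k] = left + frac * (right - left)
                end
            end
            i = j + 1                       # continue after the run
        else
            i += 1
        end
    end
    return x
end

series = [1.0, NaN, 3.0, NaN, NaN, NaN, 7.0, NaN]
fill_short_gaps!(series, 2)
```

In this example the single-sample gap is filled, the three-sample gap exceeds max_gap and stays NaN, and the trailing NaN stays because it has no right-hand neighbour.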

Reproducibility

Peddy.jl supports reproducible processing:

# 1. Use Project.toml for dependency management
# 2. Enable logging to track what was done
logger = ProcessingLogger()
pipeline = EddyPipeline(..., logger=logger)
process!(pipeline, hf, lf)
write_processing_log(logger, "processing_log.csv")

# 3. Save configuration
# 4. Document sensor setup and calibration
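
For step 3, one lightweight option is to write the chosen settings to a TOML file with Julia's TOML standard library. The keys and layout below are purely illustrative and not a format that Peddy.jl defines or reads.

```julia
using TOML

# Hypothetical record of the settings used for a run; the keys are
# illustrative and not a Peddy.jl format.
config = Dict(
    "sensor" => "CSAT3",
    "steps" => Dict(
        "quality_control" => "PhysicsBoundsCheck",
        "despiking" => "SimpleSigmundDespiking",
        "gap_filling" => "GeneralInterpolation",
    ),
)

config_path = joinpath(mktempdir(), "pipeline_config.toml")
open(config_path, "w") do io
    TOML.print(io, config)               # serialise the settings as TOML
end

restored = TOML.parsefile(config_path)   # read them back for a later rerun
```

Storing this file alongside the processing log and Project.toml keeps the run description in one place.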

Citation

If you use Peddy.jl in your research, please cite:

@software{leibersperger2024peddy,
  title={Peddy.jl: A Julia package for eddy covariance data processing},
  author={Leibersperger, Patrick and Asemann, Patricia and Engbers, Rainette},
  year={2024},
  url={https://github.com/pleibers/Peddy.jl}
}

Contributing

Contributions are welcome! See Best Practice for development guidelines.

License

Peddy.jl is licensed under the MIT License. See the LICENSE file for details.

Next Steps

New to Peddy.jl? → Start with Quick Reference
Want to learn more? → Read the Tutorial
Need specific help? → Check Troubleshooting & FAQ
Want to extend? → See Extending Peddy.jl
Looking for details? → Consult API Reference

Happy processing! 🎉