Error Handling
Exception Matchers
geneva.debug.error_store.Retry
Bases: ExceptionMatcher
Retry on matching exceptions with backoff
Parameters:
- *exceptions (type[Exception], default: ()) – Exception types to match.
- match (str, default: None) – Regex pattern to match against the exception message. Simple strings work as substring matches (e.g., "rate limit"). Use (?i) for case-insensitive matching.
- max_attempts (int, default: 3) – Maximum number of attempts.
- backoff (str, default: 'exponential') – Backoff strategy: "exponential", "fixed", or "linear".
Examples:
>>> Retry(ConnectionError, TimeoutError, max_attempts=3)
>>> Retry(ValueError, match="rate limit", max_attempts=5)
>>> Retry(APIError, match=r"429|rate.?limit")
>>> Retry(APIError, match=r"(?i)rate limit") # case-insensitive
Source code in geneva/debug/error_store.py
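The match patterns used in the Retry examples can be tried out with Python's re module. This is a minimal sketch that assumes the matcher applies a search-style regex to str(exception); the documentation describes substring-style matching but does not show the implementation, and message_matches is a hypothetical helper that is not part of geneva.

```python
import re


def message_matches(pattern: str, exc: Exception) -> bool:
    """Hypothetical helper: search-style regex match against the exception message."""
    return re.search(pattern, str(exc)) is not None


# A plain string behaves like a substring match.
print(message_matches("rate limit", ValueError("hit rate limit, slow down")))  # True

# Matching is case-sensitive unless the pattern starts with (?i).
print(message_matches("rate limit", ValueError("Rate Limit exceeded")))        # False
print(message_matches(r"(?i)rate limit", ValueError("Rate Limit exceeded")))   # True

# Alternation and an optional character cover several message variants.
print(message_matches(r"429|rate.?limit", RuntimeError("HTTP 429")))           # True
print(message_matches(r"429|rate.?limit", RuntimeError("ratelimit reached")))  # True
```

Testing a pattern this way against a few real error messages before passing it as match can catch case or spacing mismatches early.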
geneva.debug.error_store.Skip
Bases: ExceptionMatcher
Skip row (return None) on matching exceptions
Parameters:
- *exceptions (type[Exception], default: ()) – Exception types to match.
- match (str, default: None) – Regex pattern to match in the exception message.
geneva.debug.error_store.Fail
Bases: ExceptionMatcher
Fail job immediately on matching exceptions
Parameters:
- *exceptions (type[Exception], default: ()) – Exception types to match.
- match (str, default: None) – Regex pattern to match in the exception message.
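To make the relationship between Retry, Skip, and Fail concrete, here is a toy model of how a list of matchers could map an exception to an action. Everything in it is an assumption for illustration: ToyMatcher and decide are hypothetical names, and the rules that an empty exceptions tuple matches any type, that the first matching entry wins, and that unmatched exceptions fail are guesses that geneva's actual dispatch may not follow.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Type


@dataclass
class ToyMatcher:
    """Hypothetical stand-in for Retry / Skip / Fail; not geneva's class."""
    action: str                                   # "retry", "skip", or "fail"
    exceptions: Tuple[Type[Exception], ...] = ()  # empty tuple: any type (assumed)
    match: Optional[str] = None                   # None: any message

    def matches(self, exc: Exception) -> bool:
        if self.exceptions and not isinstance(exc, self.exceptions):
            return False
        if self.match is not None and re.search(self.match, str(exc)) is None:
            return False
        return True


def decide(matchers: Sequence[ToyMatcher], exc: Exception) -> str:
    """Return the action of the first matching matcher, else "fail" (assumed)."""
    for matcher in matchers:
        if matcher.matches(exc):
            return matcher.action
    return "fail"


policy = [
    ToyMatcher("retry", (ConnectionError, TimeoutError)),
    ToyMatcher("skip", (ValueError,), match="malformed"),
]

print(decide(policy, TimeoutError("timed out")))       # retry
print(decide(policy, ValueError("malformed record")))  # skip
print(decide(policy, ValueError("out of range")))      # fail
print(decide(policy, KeyError("missing")))             # fail
```

The third call shows why the type and the message pattern are both checked: a ValueError whose message does not contain "malformed" is not skipped.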
Helper Functions
geneva.debug.error_store.retry_transient
Retry transient network errors (ConnectionError, TimeoutError, OSError).
Parameters:
- max_attempts (int, default: 3) – Maximum number of attempts.
- backoff (str, default: 'exponential') – Backoff strategy: "exponential", "fixed", or "linear".
Returns:
- list[ExceptionMatcher] – Matcher list for use with the on_error parameter.
Examples:
>>> @udf(data_type=pa.int32(), on_error=retry_transient())
>>> @udf(data_type=pa.int32(), on_error=retry_transient(max_attempts=5))
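The three backoff names describe how the wait between attempts grows. The sketch below shows one common interpretation of each; the formulas, the one-second base delay, and the backoff_delay function are assumptions for illustration, since the documentation does not state the delays geneva actually uses.

```python
def backoff_delay(strategy: str, attempt: int, base: float = 1.0) -> float:
    """Delay in seconds before retry number `attempt` (1-based). Formulas are assumed."""
    if strategy == "fixed":
        return base                       # same wait every time
    if strategy == "linear":
        return base * attempt             # grows by `base` each retry
    if strategy == "exponential":
        return base * 2 ** (attempt - 1)  # doubles each retry
    raise ValueError(f"unknown backoff strategy: {strategy!r}")


for strategy in ("fixed", "linear", "exponential"):
    print(strategy, [backoff_delay(strategy, n) for n in range(1, 5)])
# fixed [1.0, 1.0, 1.0, 1.0]
# linear [1.0, 2.0, 3.0, 4.0]
# exponential [1.0, 2.0, 4.0, 8.0]
```

Exponential growth backs off quickly from a struggling service, which is a plausible reason for it being the default.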
geneva.debug.error_store.retry_all
Retry any exception.
Parameters:
- max_attempts (int, default: 3) – Maximum number of attempts.
- backoff (str, default: 'exponential') – Backoff strategy: "exponential", "fixed", or "linear".
Returns:
- list[ExceptionMatcher] – Matcher list for use with the on_error parameter.
Examples:
>>> @udf(data_type=pa.int32(), on_error=retry_all())
>>> @udf(data_type=pa.int32(), on_error=retry_all(max_attempts=5))
geneva.debug.error_store.skip_on_error
Skip (return None) for any exception.
Returns:
- list[ExceptionMatcher] – Matcher list for use with the on_error parameter.
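The effect of skipping, where a failing row yields None instead of stopping the job, can be illustrated without geneva. This is a plain-Python sketch of the semantics only; apply_with_skip is a hypothetical helper and does not reflect how geneva executes UDFs.

```python
from typing import Callable, Iterable, List, Optional


def apply_with_skip(fn: Callable[[str], int], rows: Iterable[str]) -> List[Optional[int]]:
    """Hypothetical row loop: a row whose call raises gets None instead of failing."""
    results: List[Optional[int]] = []
    for row in rows:
        try:
            results.append(fn(row))
        except Exception:
            results.append(None)  # the row is skipped, the loop continues
    return results


print(apply_with_skip(int, ["1", "oops", "3"]))  # [1, None, 3]
```

Because every exception type is swallowed here, genuine bugs would also surface only as None values, which is worth keeping in mind when choosing a catch-all skip policy.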
geneva.debug.error_store.fail_fast
Fail immediately on any exception (default behavior).
Returns:
- list[ExceptionMatcher] – Empty matcher list (no special handling).
Configuration
geneva.debug.error_store.ErrorHandlingConfig
Configuration for UDF error handling behavior
validate_compatibility
Validate that this error config is compatible with the given task.
Parameters:
- map_task – The MapTask to validate against.
Raises:
- ValueError – If SKIP_ROWS is used with a RecordBatch UDF.
geneva.debug.error_store.UDFRetryConfig
Retry configuration for UDF execution using tenacity semantics
no_retry
no_retry() -> UDFRetryConfig
retry_transient
retry_transient(max_attempts: int = 3) -> UDFRetryConfig
Retry common transient errors (network, timeouts)
Parameters:
- max_attempts (int, default: 3) – Maximum number of attempts, including the initial try.
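Since max_attempts counts the initial try, max_attempts=3 means at most three calls in total, not three retries after the first failure. The loop below is a stdlib-only sketch of that counting rule; call_with_retries and flaky are hypothetical names, the exception tuple mirrors the transient errors listed for the retry_transient helper function, and the sketch omits any waiting between attempts.

```python
def call_with_retries(fn, max_attempts: int = 3,
                      retry_on=(ConnectionError, TimeoutError, OSError)):
    """Call fn up to max_attempts times in total, re-raising the last error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the final error


calls = 0


def flaky():
    """Fails on the first two calls, then succeeds."""
    global calls
    calls += 1
    if calls < 3:
        raise ConnectionError("connection reset")
    return "ok"


print(call_with_retries(flaky, max_attempts=3), calls)  # ok 3
```

With max_attempts=2 the same function would raise ConnectionError after its second call.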
geneva.debug.error_store.ErrorRecord
UDF execution error record, stored in geneva_errors table
timestamp
timestamp: datetime = field(
factory=dt_now_utc,
metadata={"pa_type": timestamp("us", tz="UTC")},
)
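The timestamp field combines a default factory with per-field metadata. The following sketch reproduces that pattern using only the standard library's dataclasses; it is not the geneva definition. ToyErrorRecord is a hypothetical class, the body of dt_now_utc is an assumption about what the helper named in the source returns, and a string stands in for the pyarrow type object.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime, timezone


def dt_now_utc() -> datetime:
    """Assumed behavior of the helper named in the source: current time, UTC-aware."""
    return datetime.now(timezone.utc)


@dataclass
class ToyErrorRecord:
    """Hypothetical stand-in for ErrorRecord with only the timestamp field."""
    timestamp: datetime = field(
        default_factory=dt_now_utc,  # evaluated once per new record
        # The source stores a pyarrow type here; a string keeps this stdlib-only.
        metadata={"pa_type": "timestamp[us, tz=UTC]"},
    )


record = ToyErrorRecord()
print(record.timestamp.tzinfo)                        # UTC
print(fields(ToyErrorRecord)[0].metadata["pa_type"])  # timestamp[us, tz=UTC]
```

Using a factory rather than a fixed default means each record is stamped when it is created, and the metadata lets schema-building code look up the storage type for the column.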