Use Case ID: UC-010
This skill converts accounting-formatted numeric values into clean Python floats before any calculation or analysis is performed.
It is designed to:
(123) to signed negatives -123This skill is preprocessing only.
It does not interpret business meaning, validate totals, or replace a reviewer’s reconciliation step.
Use only approved source data such as:
All transformations should be applied to a copy of the source data. The raw layer must never be overwritten.
Never apply this skill directly to:
If a column contains mixed identifiers and amounts, stop and ask for a cleaned source file.
def clean_accounting_number(series):
return (
series.astype(str)
.replace(r'[\$,]', '', regex=True) # Remove currency symbols and commas
.replace(r'\((.*?)\)', r'-\1', regex=True) # Convert (123) → -123
.replace(r'^\s*$', '0', regex=True) # Handle blank strings
.astype(float)
)
Handle additional real-world formatting edge cases:
def clean_accounting_number_advanced(series):
return (
series.astype(str)
.str.strip()
.replace(r'[\$,]', '', regex=True)
.replace(r'\((.*?)\)', r'-\1', regex=True)
.replace(r'–', '-', regex=True) # Special dash characters
.replace(r'-$', '', regex=True) # Trailing negatives like 123-
.replace(r'^\s*$', '0', regex=True)
.astype(float)
)
A clean pandas Series with:
float640.0 (or NaN if preferred — document the choice)Before using cleaned output in any calculation:
NaN valuesfloat64)| Issue | Cause |
|---|---|
| Values not converting | Hidden characters or non-standard formatting |
| Positives instead of negatives | Parentheses pattern not matched |
NaN values appear |
Non-numeric strings present in column |
| Totals off | Partial cleaning or mixed formats in same column |
0 or NaN — both are valid, but be explicitUse this skill when a column from an Excel export contains accounting-formatted values and calculations are returning unexpected results.
Example prompt:
“Use the accounting-number-normalization skill on the
Amountcolumn in the transactions DataFrame. Preserve the raw data and save the cleaned column to the processed layer.”
Log transformation steps to:
workings/ or evidence/run-logs/
Suggested file naming pattern:
YYYY-MM-DD_accounting-number-normalization_[source-file].md
Also available in the PythonMuse Workflow Kit
| *Tag: #Top10Traps #1 | #DataCleaning #AccountingAI #PythonMuse* |