# Code Change Examples - Propreports Normalizer

## Table of Contents

1. [CRITICAL: Portfolio/Reference_Code in Hash](#1-critical-portfolioreference_code-in-hash)
2. [Required Fields Validation](#2-required-fields-validation)
3. [NASD Fee Missing from Sum](#3-nasd-fee-missing-from-sum)
4. [Side Validation Post-Mapping](#4-side-validation-post-mapping)
5. [Commission Column Fallback](#5-commission-column-fallback)
6. ["COVER" Side Mapping](#6-cover-side-mapping)
7. [Swap Calculation from ECN](#7-swap-calculation-from-ecn)

---

## 1. CRITICAL: Portfolio/Reference_Code in Hash

### Problem

**Current hash match rate: ~0%** ⚠️⚠️⚠️

The legacy code hashes the order object WITHOUT portfolio/reference_code. The new implementation ADDS these fields BEFORE hashing, so every hash differs from its legacy counterpart and 100% of trades are duplicated on re-import.

### Legacy Code (propreports_export.py:476-482)

```python
try:
    # Remove milliseconds from date/time format
    # Converts "6/11/2025 15:03:12.000" → "6/11/2025 15:03:12"
    order['date/time'] = order['date/time'].split('.')[0]
except:
    pass

# Create copy for original_file_row
original_file_row = json.loads(json.dumps(order))

# Hash WITHOUT portfolio/reference_code
njson = json.dumps(order)
njson = hashlib.md5(njson.encode('utf-8')).hexdigest()
```

### Current Code (propreports.py:163-178) ❌

```python
# Build order for hash (preserving original key order)
order_for_hash = dict(order)

# Strip milliseconds from date/time
if datetime_key in order_for_hash:
    dt_value = str(order_for_hash[datetime_key])
    order_for_hash[datetime_key] = dt_value.split('.')[0]

# ADD portfolio and reference_code BEFORE hashing ← PROBLEM
order_for_hash["portfolio"] = portfolio
order_for_hash["reference_code"] = reference_code

# Hash WITH these fields included
file_row_hash = hashlib.md5(json.dumps(order_for_hash).encode('utf-8')).hexdigest()
```

**Result:** 0% hash match rate

### Proposed Code ✅

```python
# propreports.py lines 163-178
# Build order for hash (preserving original key order)
order_for_hash = dict(order)

# Strip milliseconds from date/time (legacy compatibility)
datetime_key = "date/time" if "date/time" in order_for_hash else "Date/Time"
if datetime_key in order_for_hash:
    dt_value = str(order_for_hash[datetime_key])
    # Split at '.' to remove milliseconds: "6/11/2025 15:03:12.000" → "6/11/2025 15:03:12"
    order_for_hash[datetime_key] = dt_value.split('.')[0]

# DO NOT add portfolio/reference_code to hash (legacy compatibility)
# Legacy hashes WITHOUT these fields - propreports_export.py:482
# These fields are metadata added AFTER hashing for organization purposes

# Hash the JSON (legacy compatible - no sort_keys to preserve original order)
file_row_hash = hashlib.md5(json.dumps(order_for_hash).encode('utf-8')).hexdigest()
```
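
To make the intended ordering concrete, here is a minimal self-contained sketch of the "hash first, annotate afterwards" flow. The helper names `legacy_row_hash` and `build_record` are hypothetical and are not part of `propreports.py`; only the hashing formula itself comes from the legacy snippet above.

```python
import hashlib
import json


def legacy_row_hash(order: dict) -> str:
    """Hash an order the way the legacy snippet does (hypothetical helper)."""
    order_for_hash = dict(order)  # shallow copy, preserving insertion order
    # Accept either key casing, mirroring the proposed code.
    key = "date/time" if "date/time" in order_for_hash else "Date/Time"
    if key in order_for_hash:
        # Drop milliseconds: "6/11/2025 15:03:12.000" -> "6/11/2025 15:03:12"
        order_for_hash[key] = str(order_for_hash[key]).split(".")[0]
    # No sort_keys: the original key order is part of the hash input.
    return hashlib.md5(json.dumps(order_for_hash).encode("utf-8")).hexdigest()


def build_record(order: dict, portfolio: str, reference_code: str) -> dict:
    """Compute the hash first, then attach metadata (hypothetical helper)."""
    record = dict(order)
    record["_file_row_hash"] = legacy_row_hash(order)
    # Metadata is attached only after hashing, so it cannot change the hash.
    record["portfolio"] = portfolio
    record["reference_code"] = reference_code
    return record


order = {"propreports id": "82041", "date/time": "6/11/2025 15:03:12.000", "symbol": "SPY"}
rec_a = build_record(order, "PropReports2513", "2513")
rec_b = build_record(order, "OtherPortfolio", "9999")
print(rec_a["_file_row_hash"] == rec_b["_file_row_hash"])  # True
```

Because the metadata never reaches `json.dumps`, the same order yields the same hash regardless of which portfolio it is imported under.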

### Required Tests

```python
# tests/brokers/test_propreports.py

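import hashlib
import json

import pytest

# Module path taken from the --cov target used for this test file.
from pipeline.p01_normalize.brokers.propreports import PropreportsInterpreter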


def test_hash_without_portfolio_reference_code():
    """Verify that the hash does NOT include portfolio/reference_code."""
    order = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45.123",
        "symbol": "AAPL",
        "price": "150.25",
        "qty": "100",
        "b/s": "B",
    }

    # Hash WITHOUT portfolio/reference_code (CORRECT - legacy compatible)
    order_for_hash = dict(order)
    order_for_hash["date/time"] = order_for_hash["date/time"].split('.')[0]
    hash_without = hashlib.md5(json.dumps(order_for_hash).encode()).hexdigest()

    # Hash WITH portfolio/reference_code (INCORRECT - current bug)
    order_with_extra = dict(order_for_hash)
    order_with_extra["portfolio"] = "PropReports2513"
    order_with_extra["reference_code"] = "2513"
    hash_with = hashlib.md5(json.dumps(order_with_extra).encode()).hexdigest()

    # Hashes MUST be different
    assert hash_without != hash_with

    # The correct hash should NOT include portfolio/reference_code
    assert "portfolio" not in json.dumps(order_for_hash)
    assert "reference_code" not in json.dumps(order_for_hash)


def test_hash_matches_legacy_format():
    """Verify that the hash matches the legacy formula exactly."""
    # Sample order from legacy system
    order = {
        "propreports id": "82041",
        "date/time": "6/11/2025 15:03:12.000",
        "symbol": "SPY",
        "price": "450.50",
        "qty": "100",
        "b/s": "B",
        "comm": "1.00",
        "ecn fee": "0.30",
    }

    # Legacy formula (propreports_export.py:476-482)
    order_legacy = dict(order)
    order_legacy["date/time"] = order_legacy["date/time"].split('.')[0]
    legacy_hash = hashlib.md5(json.dumps(order_legacy).encode()).hexdigest()

    # New formula (should match)
    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order]),
        portfolio="PropReports2513",
        reference_code="2513"
    )

    new_hash = df["_file_row_hash"][0]

    # Hashes MUST match
    assert new_hash == legacy_hash, f"New: {new_hash}, Legacy: {legacy_hash}"


def test_hash_deterministic():
    """The hash must be deterministic (same input → same output)."""
    json_content = '''
    [
        {
            "propreports id": "12345",
            "date/time": "1/14/2026 10:30:45.000",
            "symbol": "AAPL",
            "price": "150.25",
            "qty": "100",
            "b/s": "B"
        }
    ]
    '''

    # Process twice with same input
    df1 = PropreportsInterpreter.parse_json_content(
        json_content,
        portfolio="test",
        reference_code="123"
    )
    df2 = PropreportsInterpreter.parse_json_content(
        json_content,
        portfolio="test",
        reference_code="123"
    )

    # Hashes must be identical
    assert df1["_file_row_hash"][0] == df2["_file_row_hash"][0]


def test_hash_milliseconds_stripped():
    """Verify that milliseconds are stripped before hashing."""
    order_with_ms = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45.123",
        "symbol": "AAPL",
    }

    order_without_ms = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
    }

    # Process both
    df1 = PropreportsInterpreter.parse_json_content(
        json.dumps([order_with_ms]),
        portfolio="test",
        reference_code="123"
    )
    df2 = PropreportsInterpreter.parse_json_content(
        json.dumps([order_without_ms]),
        portfolio="test",
        reference_code="123"
    )

    # Hashes must match (milliseconds stripped)
    assert df1["_file_row_hash"][0] == df2["_file_row_hash"][0]


def test_hash_json_key_order():
    """Verify that key order DOES affect the hash (legacy does not use sort_keys)."""
    # Same data, different key order
    order1 = {
        "propreports id": "12345",
        "symbol": "AAPL",
        "price": "150.25",
    }

    order2 = {
        "price": "150.25",
        "propreports id": "12345",
        "symbol": "AAPL",
    }

    # Legacy preserves the key order received from the API, so we must too:
    # json.dumps without sort_keys serializes keys in insertion order.
    hash1 = hashlib.md5(json.dumps(order1).encode()).hexdigest()
    hash2 = hashlib.md5(json.dumps(order2).encode()).hexdigest()

    # Different key order → different hash. This is expected behavior,
    # documented here so nobody "fixes" it by adding sort_keys.
    assert hash1 != hash2


def test_hash_with_sample_data():
    """Integration-style test with sample data for user 4359."""
    # Sample from integration fixture
    sample_order = {
        "propreports id": "sample_id_001",
        "date/time": "1/14/2026 10:30:45.000",
        "account": "2513",
        "b/s": "B",
        "qty": "100",
        "symbol": "AAPL",
        "price": "150.25",
        "comm": "1.00",
        "ecn fee": "0.30",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([sample_order]),
        portfolio="PropReports2513",
        reference_code="2513"
    )

    # Verify hash exists and is MD5 format (32 hex chars)
    assert df is not None
    assert len(df) > 0
    hash_value = df["_file_row_hash"][0]
    assert len(hash_value) == 32
    assert all(c in '0123456789abcdef' for c in hash_value)


def test_hash_user_4359_compatibility():
    """Verify compatibility with user 4359 (newer format)."""
    # This test requires real DB data
    # Mark as an integration test
    pytest.skip("Requires real DB data - run in integration tests")


def test_hash_user_40888_compatibility():
    """Verify compatibility with user 40888 (older format)."""
    # This test requires real DB data
    # Mark as an integration test
    pytest.skip("Requires real DB data - run in integration tests")
```

### Success Verification

```python
# Manual hash verification
import json
import hashlib

order = {
    "propreports id": "82041",
    "date/time": "6/11/2025 15:03:12.000",
    "symbol": "SPY",
    "price": "450.50",
}

# Legacy approach (CORRECT)
order_legacy = dict(order)
order_legacy["date/time"] = order_legacy["date/time"].split('.')[0]
hash_legacy = hashlib.md5(json.dumps(order_legacy).encode()).hexdigest()

# Current approach (INCORRECT - includes portfolio/ref)
order_current = dict(order_legacy)
order_current["portfolio"] = "PropReports2513"
order_current["reference_code"] = "2513"
hash_current = hashlib.md5(json.dumps(order_current).encode()).hexdigest()

# After fix (CORRECT - no portfolio/ref)
order_fixed = dict(order_legacy)
hash_fixed = hashlib.md5(json.dumps(order_fixed).encode()).hexdigest()

print(f"Legacy hash:  {hash_legacy}")
print(f"Current hash: {hash_current}")
print(f"Fixed hash:   {hash_fixed}")
print(f"Legacy == Fixed: {hash_legacy == hash_fixed}")  # Should be True
print(f"Legacy == Current: {hash_legacy == hash_current}")  # Should be False
```

**Expected Result:** Hash match rate >= 95%
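
The document does not define how the match rate is measured. One plausible reading, sketched below as an assumption, is the share of legacy hashes that the new pipeline reproduces for the same input rows. The function name `hash_match_rate` is hypothetical.

```python
def hash_match_rate(legacy_hashes, new_hashes) -> float:
    """Fraction of legacy hashes that also appear in the new output."""
    legacy = set(legacy_hashes)
    if not legacy:
        return 0.0  # nothing to compare against
    return len(legacy & set(new_hashes)) / len(legacy)


legacy = ["a1", "b2", "c3", "d4"]
new = ["a1", "b2", "c3", "zz"]
rate = hash_match_rate(legacy, new)
print(f"{rate:.0%}")  # 75%
```

With the bug from this section in place, the intersection is empty and the rate is 0; after the fix it should approach 1.0 for unchanged rows.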

---

## 2. Required Fields Validation

### Problem

Orders with empty required fields can enter the pipeline and cause errors downstream.

### Legacy Code (propreports_export.py:489-490)

```python
if not order['symbol'] or not order['date'] or not order['price'] or not action:
    continue
```

### Current Code ❌

No required-field validation.

### Proposed Code ✅

```python
# propreports.py, after line 183 (in parse_json_content)
for row_idx, order in enumerate(orders):
    # ... existing code ...

    # Normalize keys to handle both API (lowercase) and CSV (original case)
    def get_value(key_lower: str, key_original: str, default=""):
        return order.get(key_lower) or order.get(key_original) or default

    # Required fields validation
    symbol = get_value("symbol", "Symbol")
    date_time = get_value("date/time", "Date/Time")
    price = get_value("price", "Price")
    side = get_value("b/s", "B/S")

    if not symbol or not date_time or not price or not side:
        logger.warning(
            f"[PROPREPORTS] Skipping order {order.get('propreports id', 'unknown')}: "
            f"missing required fields (symbol={bool(symbol)}, date={bool(date_time)}, "
            f"price={bool(price)}, side={bool(side)})"
        )
        continue

    # ... rest of processing ...
```
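
Since the snippet above is a fragment of a larger loop, the same check can be sketched as a standalone function that is easy to unit-test. The names `REQUIRED_FIELDS` and `missing_required_fields` are hypothetical, and treating whitespace-only strings as empty is an assumption of this sketch rather than something the legacy check does.

```python
REQUIRED_FIELDS = {
    # logical name -> (API key, CSV key)
    "symbol": ("symbol", "Symbol"),
    "date_time": ("date/time", "Date/Time"),
    "price": ("price", "Price"),
    "side": ("b/s", "B/S"),
}


def missing_required_fields(order: dict) -> list:
    """Return the logical names of required fields that are absent or empty."""
    missing = []
    for name, (key_lower, key_original) in REQUIRED_FIELDS.items():
        value = order.get(key_lower) or order.get(key_original)
        # Assumption: whitespace-only strings count as empty too.
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


orders = [
    {"symbol": "AAPL", "date/time": "1/14/2026 10:30:45", "price": "150.25", "b/s": "B"},
    {"Symbol": "AAPL", "Date/Time": "1/14/2026 10:30:45", "Price": "", "B/S": "S"},
]
results = [missing_required_fields(order) for order in orders]
print(results)  # [[], ['price']]
```

Returning the list of missing names, rather than a bare boolean, lets the caller log exactly which field caused the skip.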

### Required Tests

```python
def test_required_fields_validation():
    """An order with all required fields present must be accepted."""
    complete_order = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "qty": "100",
        "b/s": "B",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([complete_order]),
        portfolio="test",
        reference_code="123"
    )

    assert len(df) == 1  # Should be accepted


def test_empty_symbol_skipped():
    """Orders with an empty symbol must be skipped."""
    order_empty_symbol = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "",  # Empty
        "price": "150.25",
        "b/s": "B",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_empty_symbol]),
        portfolio="test",
        reference_code="123"
    )

    assert len(df) == 0  # Should be skipped


def test_empty_date_skipped():
    """Orders with an empty date must be skipped."""
    order_empty_date = {
        "propreports id": "12345",
        "date/time": "",  # Empty
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_empty_date]),
        portfolio="test",
        reference_code="123"
    )

    assert len(df) == 0  # Should be skipped


def test_empty_price_skipped():
    """Orders with an empty price must be skipped."""
    order_empty_price = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "",  # Empty
        "b/s": "B",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_empty_price]),
        portfolio="test",
        reference_code="123"
    )

    assert len(df) == 0  # Should be skipped


def test_empty_side_skipped():
    """Orders with an empty side must be skipped."""
    order_empty_side = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "",  # Empty
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_empty_side]),
        portfolio="test",
        reference_code="123"
    )

    assert len(df) == 0  # Should be skipped
```

---

## 3. NASD Fee Missing from Sum

### Problem

Legacy includes the NASD fee in the total fee sum. The new implementation does NOT include NASD, which affects financial accuracy.

### Legacy Code (propreports_export.py:534-542)

```python
ecn = float(order['ecn fee']...) if 'ecn fee' in order else 0.00
sec = float(order['sec']...) if 'sec' in order else 0.00
taf = float(order['taf']...) if 'taf' in order else 0.00
nscc = float(order['nscc']...) if 'nscc' in order else 0.00
nasd = float(order['nasd']...) if 'nasd' in order else 0.00  # ← INCLUDED
misc = float(order['misc']...) if 'misc' in order else 0.00
clr = float(order['clr']...) if 'clr' in order else 0.00

fees_sum = str(sec + taf + nscc + nasd + clr + misc)  # ← NASD in sum
```

### Current Code (propreports.py:97-99) ❌

```python
FEE_COLUMNS: ClassVar[list] = [
    "ecn fee", "sec", "orf", "cat", "taf", "nfa", "nscc", "acc", "clr", "misc"
]  # ← NASD missing

FEE_COLUMNS_ORIGINAL: ClassVar[list] = [
    "Ecn Fee", "SEC", "ORF", "CAT", "TAF", "NFA", "NSCC", "Acc", "Clr", "Misc"
]  # ← NASD missing
```

### Proposed Code ✅

```python
FEE_COLUMNS: ClassVar[list] = [
    "ecn fee", "sec", "orf", "cat", "taf", "nfa", "nscc", "nasd", "acc", "clr", "misc"
]  # Added "nasd"

FEE_COLUMNS_ORIGINAL: ClassVar[list] = [
    "Ecn Fee", "SEC", "ORF", "CAT", "TAF", "NFA", "NSCC", "NASD", "Acc", "Clr", "Misc"
]  # Added "NASD"
```
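
The effect of the missing column can be illustrated with a small pure-Python sketch. The helpers `to_float` and `sum_fees` are hypothetical; how the real interpreter parses fee values is not shown in this document, so the sketch simply assumes plain numeric strings and treats anything else as 0.0.

```python
FEE_COLUMNS = [
    "ecn fee", "sec", "orf", "cat", "taf", "nfa", "nscc", "nasd", "acc", "clr", "misc",
]


def to_float(value) -> float:
    """Parse a fee value; missing, empty or non-numeric input counts as 0.0."""
    try:
        return float(str(value).strip())
    except ValueError:
        return 0.0


def sum_fees(order: dict, columns=FEE_COLUMNS) -> float:
    """Add up every listed fee column that is present in the order."""
    return sum(to_float(order.get(col, "")) for col in columns)


order = {"ecn fee": "0.30", "sec": "0.10", "nasd": "0.05", "nscc": "0.02"}
with_nasd = sum_fees(order)
without_nasd = sum_fees(order, [c for c in FEE_COLUMNS if c != "nasd"])
print(round(with_nasd - without_nasd, 2))  # 0.05
```

The difference between the two sums is exactly the NASD fee, which is the amount the current column list silently drops.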

### Required Tests

```python
def test_nasd_fee_included_in_sum():
    """The NASD fee must be included in the total sum."""
    order_with_nasd = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "ecn fee": "0.30",
        "sec": "0.10",
        "nasd": "0.05",  # Should be included
        "nscc": "0.02",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_with_nasd]),
        portfolio="test",
        reference_code="123"
    )

    # Fee sum should include NASD
    expected_fee_sum = 0.30 + 0.10 + 0.05 + 0.02  # 0.47
    assert df["_fee_sum"][0] == pytest.approx(expected_fee_sum, abs=0.01)


def test_fee_calculation_with_nasd():
    """Verify that FEE_COLUMNS includes nasd."""
    assert "nasd" in PropreportsInterpreter.FEE_COLUMNS
    assert "NASD" in PropreportsInterpreter.FEE_COLUMNS_ORIGINAL


def test_all_fee_columns_summed():
    """Verify that every fee column is summed."""
    order_all_fees = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "ecn fee": "0.30",
        "sec": "0.10",
        "orf": "0.01",
        "cat": "0.01",
        "taf": "0.02",
        "nfa": "0.01",
        "nscc": "0.02",
        "nasd": "0.05",
        "acc": "0.01",
        "clr": "0.01",
        "misc": "0.01",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_all_fees]),
        portfolio="test",
        reference_code="123"
    )

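    # Sum all fees. The tolerance must be tighter than the smallest fee (0.01);
    # otherwise a dropped column could go unnoticed.
    expected_sum = 0.30 + 0.10 + 0.01 + 0.01 + 0.02 + 0.01 + 0.02 + 0.05 + 0.01 + 0.01 + 0.01
    assert df["_fee_sum"][0] == pytest.approx(expected_sum, abs=1e-6)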
```

---

## 4. Side Validation Post-Mapping

### Problem

After sides are mapped, invalid values can pass through unfiltered.

### Legacy Code (brokers_propreports.py:124-125)

```python
if n['action'] not in ['BUY', 'SELL']:
    continue
```

### Current Code ❌

There is no post-mapping validation.

### Proposed Code ✅

```python
# propreports.py in normalize(), after side mapping (around line 310)
.with_columns([
    # Side mapping
    pl.when(pl.col("b/s").is_in(["B", "BUY", "COVER"]))
    .then(pl.lit("BUY"))
    .when(pl.col("b/s").is_in(["S", "SELL", "SHORT", "T"]))
    .then(pl.lit("SELL"))
    .otherwise(pl.lit("INVALID"))
    .alias("side")
])
# Filter out invalid sides
.filter(pl.col("side").is_in(["BUY", "SELL"]))
```
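
For readers less familiar with Polars, the map-then-filter logic can be sketched in plain Python. The function `normalize_sides` is hypothetical; the mapping values mirror the expression above, and matching is exact and case-sensitive, as it is there.

```python
SIDE_MAP = {
    "B": "BUY", "BUY": "BUY", "COVER": "BUY",
    "S": "SELL", "SELL": "SELL", "SHORT": "SELL", "T": "SELL",
}


def normalize_sides(orders: list) -> list:
    """Map raw b/s values and drop orders whose side is not BUY/SELL."""
    kept = []
    for order in orders:
        # Anything not in the map falls back to "INVALID".
        side = SIDE_MAP.get(str(order.get("b/s", "")), "INVALID")
        if side in ("BUY", "SELL"):  # same check as the legacy post-mapping filter
            kept.append({**order, "side": side})
    return kept


orders = [{"b/s": "B"}, {"b/s": "COVER"}, {"b/s": "T"}, {"b/s": "X"}, {"b/s": ""}]
sides = [o["side"] for o in normalize_sides(orders)]
print(sides)  # ['BUY', 'BUY', 'SELL']
```

The unknown value `"X"` and the empty string both map to `"INVALID"` and are removed, which is the behavior the tests below are meant to pin down.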

### Required Tests

```python
def test_side_validation_post_mapping():
    """Only BUY/SELL may pass validation."""
    valid_order = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([valid_order]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # Valid side should pass
    assert len(result) == 1
    assert result["side"][0] in ["BUY", "SELL"]


def test_invalid_sides_filtered():
    """Invalid sides must be filtered out."""
    # Exercising this directly would require injecting an invalid side post-mapping.
    # In practice, the mapping falls back to "INVALID" and .filter() removes those rows.
    pytest.skip("Covered by integration tests")


def test_only_buy_sell_pass():
    """Verify that only BUY/SELL pass the filter."""
    orders = [
        {"b/s": "B", "symbol": "A", "price": "1", "date/time": "1/1/2026 10:00:00", "propreports id": "1"},
        {"b/s": "S", "symbol": "B", "price": "2", "date/time": "1/1/2026 10:00:00", "propreports id": "2"},
        {"b/s": "INVALID", "symbol": "C", "price": "3", "date/time": "1/1/2026 10:00:00", "propreports id": "3"},
    ]

    df = PropreportsInterpreter.parse_json_content(
        json.dumps(orders),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # Only 2 valid sides should pass
    assert len(result) == 2
    assert all(side in ["BUY", "SELL"] for side in result["side"])
```

---

## 5. Commission Column Fallback

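### Problem

Legacy reads the commission from two columns, 'exec' and 'comm'. The new implementation only reads 'comm'.

Note that the legacy snippet below checks 'exec' first and uses 'comm' only when 'exec' is missing or empty, whereas the proposed code and tests prefer 'comm' and fall back to 'exec'. The intended precedence should be confirmed before merging.

### Legacy Code (propreports_export.py:544-547)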

```python
comm1 = float(order['exec'].replace(...)) if 'exec' in order and order['exec'] else \
        float(order['comm'].replace(...)) if 'comm' in order and order['comm'] else 0.00

comm_sum = str(ecn + comm1)
comm_sum = str(comm1)  # Final value
```

### Current Code (propreports.py:~330) ❌

```python
pl.coalesce(pl.col("comm"), pl.col("Comm"), pl.lit(0.0))
.abs()
.alias("commission")
```

### Proposed Code ✅

```python
# propreports.py, around line 330
# Commission calculation with exec fallback
pl.when(
    (pl.col("comm").is_null()) |
    (pl.col("comm").cast(pl.Utf8).str.strip_chars() == "") |
    (pl.col("comm") == 0)
).then(
    pl.coalesce(pl.col("exec"), pl.col("Exec"), pl.lit(0.0))
).otherwise(
    pl.coalesce(pl.col("comm"), pl.col("Comm"), pl.lit(0.0))
).cast(pl.Float64)
.abs()
.alias("commission")
```
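
The two possible precedence orders give different results when both columns are populated. The pure-Python sketch below makes that visible; both function names are hypothetical, and the parsing is simplified to plain numeric strings (the legacy cleanup step is elided in the source and is not reproduced here).

```python
def commission_exec_first(order: dict) -> float:
    """Precedence as written in the legacy snippet: exec, then comm."""
    if order.get("exec"):
        return float(order["exec"])
    if order.get("comm"):
        return float(order["comm"])
    return 0.0


def commission_comm_first(order: dict) -> float:
    """Precedence of the proposed expression: comm unless null, empty or zero."""
    comm = float(order.get("comm") or 0)
    if comm != 0.0:
        return abs(comm)
    return abs(float(order.get("exec") or 0))


both = {"comm": "2.00", "exec": "1.50"}
print(commission_exec_first(both), commission_comm_first(both))  # 1.5 2.0
```

When only one of the two columns has a value, both orderings agree, so the discrepancy affects only rows where `exec` and `comm` are both set.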

### Required Tests

```python
def test_commission_exec_fallback():
    """Exec must be used as a fallback when comm is empty."""
    order_with_exec = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "comm": "",  # Empty
        "exec": "1.50",  # Should use this
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_with_exec]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # Should use exec value
    assert result["commission"][0] == pytest.approx(1.50, abs=0.01)


def test_commission_comm_preferred():
    """Comm must be preferred over exec."""
    order_with_both = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "comm": "2.00",  # Should use this
        "exec": "1.50",  # Ignore this
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_with_both]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # Should use comm value
    assert result["commission"][0] == pytest.approx(2.00, abs=0.01)


def test_commission_exec_when_comm_zero():
    """Exec must be used when comm is the string "0"."""
    order_comm_zero = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "comm": "0",  # Zero
        "exec": "1.50",  # Should use this
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_comm_zero]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # Should use exec value
    assert result["commission"][0] == pytest.approx(1.50, abs=0.01)
```

---

## 6. "COVER" Side Mapping

### Problem

Legacy maps "COVER" → "BUY". The new implementation does NOT have this mapping.

### Legacy Code (propreports_export.py:486)

```python
action = 'BUY' if action == 'BUY' or action=='COVER' or action=='B' else \
         'SELL' if action == 'SELL' or action == 'SHORT' or action=='S' or action=='T' else ''
```

### Current Code (propreports.py:90-94) ❌

```python
SIDE_MAP: ClassVar[dict] = {
    "B": "BUY",
    "BUY": "BUY",
    "S": "SELL",
    "SELL": "SELL",
    "SHORT": "SELL",
    "T": "SELL",
}  # ← COVER missing
```

### Proposed Code ✅

```python
SIDE_MAP: ClassVar[dict] = {
    "B": "BUY",
    "BUY": "BUY",
    "COVER": "BUY",  # Add this - legacy compatibility
    "S": "SELL",
    "SELL": "SELL",
    "SHORT": "SELL",
    "T": "SELL",  # Short sell
}
```

### Required Tests

```python
def test_cover_side_mapping():
    """COVER must map to BUY."""
    order_cover = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "COVER",
        "qty": "100",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_cover]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # COVER should map to BUY
    assert result["side"][0] == "BUY"


def test_cover_maps_to_buy():
    """Verify that SIDE_MAP includes COVER → BUY."""
    assert "COVER" in PropreportsInterpreter.SIDE_MAP
    assert PropreportsInterpreter.SIDE_MAP["COVER"] == "BUY"
```

---

## 7. Swap Calculation from ECN

### Problem

Legacy computes swap as the negated ECN fee. The new implementation uses swap = 0.0.

**Note:** This may be an intentional business-logic change. It requires clarification.

### Legacy Code (propreports_export.py:581)

```python
order['swap'] = ecn*-1
```

### Current Code (propreports.py:~360) ❌

```python
pl.lit(0.0).alias("swap")
```

### Proposed Code (IF THE CHANGE IS NEEDED) ✅

```python
# propreports.py, around line 360
# Swap calculation from ECN fee (legacy compatibility)
# Extract ECN fee from the order
pl.when(
    pl.col("_original_order").str.contains("ecn fee")
).then(
    # Extract and negate ECN fee value
    # This requires parsing JSON which is complex in Polars
    # Alternative: pre-compute in parse_json_content
    pl.col("_ecn_fee").mul(-1.0)
).otherwise(
    pl.lit(0.0)
).alias("swap")

# Pre-computation in parse_json_content (required by the expression above).
# Add to each record, using the same column name the expression reads:
# "_ecn_fee": ecn_value  # For later swap calculation
```
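
The pre-computation mentioned in the comments can be sketched in plain Python. The function `swap_from_ecn` is hypothetical, and the sketch assumes the ECN fee arrives as a plain numeric string under the `"ecn fee"` key. It only illustrates the legacy rule and does not settle whether the rule should be restored.

```python
def swap_from_ecn(order: dict) -> float:
    """Legacy rule: swap is the ECN fee with its sign flipped."""
    ecn = float(order.get("ecn fee") or 0)
    # Return a plain 0.0 when there is no fee, avoiding a negative zero.
    return -ecn if ecn else 0.0


print(swap_from_ecn({"ecn fee": "0.30"}))   # -0.3
print(swap_from_ecn({"ecn fee": "-0.12"}))  # 0.12
print(swap_from_ecn({"symbol": "AAPL"}))    # 0.0
```

A negative ECN fee (a rebate) becomes a positive swap under this rule, which is worth keeping in mind when the business decision is made.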

### Required Tests

```python
def test_swap_calculation():
    """If implemented, swap must equal -ECN."""
    # Requires a stakeholder decision first
    pytest.skip("Pending business logic clarification")


def test_ecn_in_fees():
    """If NOT changed, verify that ECN is included in fees."""
    order_with_ecn = {
        "propreports id": "12345",
        "date/time": "1/14/2026 10:30:45",
        "symbol": "AAPL",
        "price": "150.25",
        "b/s": "B",
        "qty": "100",
        "ecn fee": "0.30",
    }

    df = PropreportsInterpreter.parse_json_content(
        json.dumps([order_with_ecn]),
        portfolio="test",
        reference_code="123"
    )

    lf = PropreportsInterpreter().normalize(df.lazy(), user_id=1, account_id="test")
    result = lf.collect()

    # ECN should be in fees, not swap
    assert result["fees"][0] >= 0.30
    assert result["swap"][0] == 0.0  # Current behavior
```

---

## Impact Summary

| Validation | Criticality | Lines of Code | Tests | Effort |
|------------|-------------|---------------|-------|--------|
| 1. Portfolio/Ref Hash | ⚠️⚠️⚠️ CRITICAL | ~10 | 8 | 1-2 days |
| 2. Required Fields | ⭐⭐⭐ HIGH | ~15 | 5 | 0.5 days |
| 3. NASD Fee | ⭐⭐⭐ HIGH | ~2 | 3 | 0.1 days |
| 4. Side Validation | ⭐⭐ MEDIUM-HIGH | ~5 | 3 | 0.25 days |
| 5. Commission Fallback | ⭐⭐ MEDIUM | ~10 | 3 | 0.25 days |
| 6. COVER Mapping | ⭐⭐ MEDIUM | ~1 | 2 | 0.1 days |
| 7. Swap Calculation | ⭐ MEDIUM | ~10 | 2 | 0.5 days |
| **TOTAL** | | **~53** | **26** | **2.7-3.7 days** |

**Total Tests:** 26 new tests (current: 54, after changes: ~80)

**Total Lines in propreports.py:** 423 (current) → ~476 (after changes) = +53 lines (+12.5%)

**Total Lines in test_propreports.py:** 554 (current) → ~700 (after changes) = +146 lines (+26%)

---

## Final Verification

### Test Commands

```bash
# Run all Propreports tests
pytest tests/brokers/test_propreports.py -v

# Run specific test categories
pytest tests/brokers/test_propreports.py -v -k "hash"
pytest tests/brokers/test_propreports.py -v -k "validation"

# Run with coverage
pytest tests/brokers/test_propreports.py --cov=pipeline.p01_normalize.brokers.propreports --cov-report=term-missing
```

### Success Metrics

- [ ] Hash match rate >= 95% (vs. current ~0%)
- [ ] All hash tests passing (8 tests)
- [ ] All validation tests passing (18 tests)
- [ ] NASD fee correctly included
- [ ] Commission fallback working
- [ ] COVER mapping correct
- [ ] Rejection rate < 0.1%
- [ ] Code coverage >= 90%

---

**Last Updated:** 2026-01-14
**Broker:** propreports (240)
**Version:** 1.0
