While running the CityDataLab factor model on 210,000 HDB resale transactions (2017–2025), we noticed that the Price Tier histogram looked off. Price Tier buckets every transaction into a septile from −3 (cheapest) to +3 (dearest) based on its own price-per-sqm versus the trailing 12-month distribution. By construction, each bucket should hold roughly one-seventh (about 14.3%) of transactions. They don't: the right tail (+2/+3) is materially heavier than the left, and the middle is hollowed out.
The +2 and +3 buckets each hold ~16% of transactions; the −3 bucket holds only 10%. Something is consistently pushing transactions toward the dear end of the trailing distribution. We tested every plausible covariate.
For each candidate, we compute the mean Price Tier on each side of a binary split. The biggest spread wins:
| Covariate (binary split) | On count | On mean | Off mean | Spread |
|---|---|---|---|---|
| floor_level ≥ 20 | 11,989 | +2.35 | +0.12 | +2.23 |
| lease_year ≥ 2010 | 53,882 | +1.89 | −0.32 | +2.21 |
| building_age < 10 | 49,076 | +1.87 | −0.25 | +2.12 |
| is_studio (< 50sqm) | 3,867 | +2.09 | +0.22 | +1.88 |
| floor_level ≥ 10 | 80,661 | +0.95 | −0.19 | +1.14 |
| floor_level ≤ 3 | 37,092 | −0.65 | +0.44 | −1.09 |
| lease_year < 1990 | 87,555 | −0.22 | +0.59 | −0.81 |
| building_age > 30 | 86,369 | −0.21 | +0.57 | −0.78 |
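The split comparison in the table can be reproduced with a few lines of pandas. The following is a minimal sketch, not the CityDataLab code: the column names `price_tier` and `floor_level`, the helper names, and the toy rows are all assumptions made for illustration. Rows are sorted by absolute spread, which matches the ordering of the table.

```python
import pandas as pd

def split_spread(df, mask, tier_col="price_tier"):
    """Mean Price Tier on each side of a boolean split, plus the spread."""
    on_mean = df.loc[mask, tier_col].mean()
    off_mean = df.loc[~mask, tier_col].mean()
    return {
        "on_count": int(mask.sum()),
        "on_mean": on_mean,
        "off_mean": off_mean,
        "spread": on_mean - off_mean,
    }

def rank_splits(df, splits, tier_col="price_tier"):
    """Evaluate every named split and sort by absolute spread, largest first."""
    rows = [
        {"covariate": name, **split_spread(df, make_mask(df), tier_col)}
        for name, make_mask in splits.items()
    ]
    table = pd.DataFrame(rows)
    return table.sort_values(
        "spread", key=lambda s: s.abs(), ascending=False
    ).reset_index(drop=True)

# Toy data, invented for illustration only (not HDB figures).
toy = pd.DataFrame({
    "price_tier":  [3, 2, 3, -1, 0, -2],
    "floor_level": [25, 22, 4, 2, 12, 1],
})
splits = {
    "floor_level >= 20": lambda d: d["floor_level"] >= 20,
    "floor_level <= 3": lambda d: d["floor_level"] <= 3,
}
spread_table = rank_splits(toy, splits)
print(spread_table)
```

Passing each split as a function of the frame keeps the candidate list declarative, so adding a covariate is a one-line change.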
Three of the top four splits ride together: high floor level, recent lease commencement, and small floor area. The remaining one, building_age < 10, largely restates the lease-year split. They all describe the same kind of flat: a post-2010 Build-to-Order (BTO) unit in a newer town, in a taller block, with the more compact floor plans that became the standard after 2010. Splitting the population by lease cohort isolates the bimodality almost surgically.
Post-2010 BTO flats land in the dearest two septiles 68% of the time. They are barely present in the bottom three septiles at all (2.5% combined). The right hump in the overall histogram is, almost in its entirety, this segment.
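Cohort shares of this kind can be tabulated with a row-normalised crosstab. This is a hedged sketch: the `cohort` and `price_tier` column names, the helper name, and the toy rows are assumptions, and the numbers it prints are not the HDB figures quoted above.

```python
import pandas as pd

def tier_shares_by_cohort(df, cohort_col="cohort", tier_col="price_tier"):
    """Share of each cohort's transactions falling in each septile."""
    shares = pd.crosstab(df[cohort_col], df[tier_col], normalize="index")
    # Make sure all seven septiles appear as columns, even when empty.
    return shares.reindex(columns=list(range(-3, 4)), fill_value=0.0)

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "cohort": ["post_2010"] * 4 + ["pre_2010"] * 4,
    "price_tier": [3, 3, 2, 1, -3, -1, 0, 2],
})
shares = tier_shares_by_cohort(toy)
top_two = shares[[2, 3]].sum(axis=1)            # share in the dearest two septiles
bottom_three = shares[[-3, -2, -1]].sum(axis=1)  # share in the cheapest three
print(top_two)
print(bottom_three)
```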
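Several structural premiums compound.
Floor level alone tells a similar story: in the split table above, flats on floor 20 or higher average +2.35, while flats on floor 3 or lower average −0.65.
Floor area shows the size-premium pattern, with studios (< 50 sqm) the only sub-segment averaging above the +1 mark (+2.09 in the split table).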
Price Tier is partly capturing "is this a new BTO?" rather than the within-window money-chases-money signal we want. In the factor selection log its correlation with log_building_age is ρ = +0.26: moderate but real. The total log-price contribution from Price Tier in the final model is small, −1.05% over 2017–2025, because the OLS already attributes most of the "new BTO premium" to log_building_age directly (+23.2% cumulative, the dominant factor), leaving Price Tier with the residual.
If we wanted a cleaner Price Tier we could compute it within age cohorts — each cohort getting its own −3..+3 spread — or residualise log price-per-sqm against age before bucketing. That would make Price Tier orthogonal to age and remove the bimodal artifact, at the cost of changing what the factor measures (it becomes "dear-given-vintage" rather than "dear absolute").
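Both options can be sketched in pandas. This is an illustration under stated assumptions rather than the production factor code: the column names (`cohort`, `log_ppsm`, `log_building_age`), the helper names, and the toy rows are hypothetical, and for brevity the sketch ranks over the whole sample instead of a trailing 12-month window.

```python
import numpy as np
import pandas as pd

def septile_score(values: pd.Series) -> pd.Series:
    """Score values -3..+3 by septile of their rank within the series."""
    ranks = values.rank(method="first")  # unique ranks keep bin edges distinct
    return pd.qcut(ranks, 7, labels=False) - 3

def tier_within_cohort(df, cohort_col="cohort", value_col="log_ppsm"):
    """Option 1: each cohort gets its own -3..+3 spread."""
    return df.groupby(cohort_col)[value_col].transform(septile_score).astype(int)

def tier_residualised(df, age_col="log_building_age", value_col="log_ppsm"):
    """Option 2: regress log price-per-sqm on log age, then bucket the residual."""
    slope, intercept = np.polyfit(df[age_col], df[value_col], deg=1)
    resid = df[value_col] - (slope * df[age_col] + intercept)
    return septile_score(resid).astype(int), resid

# Toy data, invented for illustration only: the newer cohort is uniformly dearer.
toy = pd.DataFrame({
    "cohort": ["pre_2010"] * 7 + ["post_2010"] * 7,
    "building_age": [35, 40, 32, 45, 38, 50, 33, 5, 8, 3, 6, 9, 4, 7],
    "log_ppsm": [8.40, 8.45, 8.50, 8.42, 8.47, 8.38, 8.52,
                 8.80, 8.85, 8.90, 8.82, 8.87, 8.78, 8.92],
})
toy["log_building_age"] = np.log(toy["building_age"])
toy["tier_pooled"] = septile_score(toy["log_ppsm"]).astype(int)
toy["tier_cohort"] = tier_within_cohort(toy)
toy["tier_resid"], toy["resid"] = tier_residualised(toy)
print(toy.groupby("cohort")[["tier_pooled", "tier_cohort"]].mean())
```

In the toy data the pooled score puts the newer cohort entirely in the upper half, while the within-cohort score averages zero for both cohorts, which is the "dear-given-vintage" reading described above.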
For now, the headline finding stands: Singapore's HDB market is two markets stacked on top of each other. Post-2010 BTOs operate at a structurally higher price-per-sqm, and they are the right hump of the Price Tier histogram. The selection algorithm correctly identified that this signal exists; the unusual histogram shape is the model telling us where the regime break is.
All 10 factors with rolling betas, p-values and contributions for HDB resales 2017–2025.
Methodology: Price Tier is computed as the septile bucket of each transaction's log price-per-sqm against the trailing 12-month distribution, scored on {−3, −2, −1, 0, +1, +2, +3}. Data: 210,000 HDB resale transactions from data.gov.sg covering Jan 2017 to Dec 2025. The full methodology is on the about page.
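The Price Tier definition can be sketched as follows. This is one plausible reading, not the CityDataLab implementation: the column names `month` and `log_ppsm`, the function name, the choice to include the transaction's own month in the 12-month window, and the tie handling at septile edges are all assumptions. The toy values are invented and are not realistic log prices.

```python
import numpy as np
import pandas as pd

def price_tier(df, date_col="month", value_col="log_ppsm", window_months=12):
    """Septile score (-3..+3) of each row against its trailing window."""
    month = df[date_col].dt.to_period("M")
    tiers = pd.Series(np.nan, index=df.index)
    inner_q = np.arange(1, 7) / 7  # the six interior septile cut points
    for m in month.unique():
        # Assumed window: the current month plus the 11 months before it.
        in_window = (month > m - window_months) & (month <= m)
        edges = np.quantile(df.loc[in_window, value_col], inner_q)
        current = month == m
        values = df.loc[current, value_col].to_numpy()
        # Count how many cut points lie below each value, then centre on zero.
        tiers.loc[current] = np.searchsorted(edges, values, side="left") - 3
    return tiers.astype(int)

# Toy data: seven January sales, then two dearer February sales.
toy = pd.DataFrame({
    "month": pd.to_datetime(["2017-01-01"] * 7 + ["2017-02-01"] * 2),
    "log_ppsm": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
})
toy["price_tier"] = price_tier(toy)
print(toy)
```

In the toy example both February sales score +3, because they sit above everything in the trailing window. The same mechanism pushes a structurally dearer segment into the top septiles.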