Methodology · Last reviewed May 2026
How Stickybeak works
Most numbers on Stickybeak come from government data sources; the rest come from named third parties such as OpenStreetMap and SQM Research. Here is how we turn raw data into the insights you see on each suburb page.
What the grade is and isn't
The Stickybeak grade is a relative ranking against Greater Sydney. A B means the suburb sits roughly in the middle of the Sydney distribution (40th to 59th percentile). It is not an absolute quality score, not a price prediction, and not financial advice. Every grade is built from public data, sourced, dated, and verifiable on the suburb page. The grade is a starting point for research, not an endpoint.
The four dimensions and their factors
The composite grade is built from four sub-grades, each weighted equally:
- Safety. Crime rate (0.30), road safety (0.20), flood exposure (0.20), bushfire exposure (0.15), building defect orders (0.15). Sources: BOCSAR, TfNSW, NSW EPI, NSW RFS, NSW Building Commission.
- Family. School quality ICSEA (0.30), childcare NQS rating (0.25), walkability (0.20), commute to CBD (0.15), socioeconomic advantage (0.10). Sources: ACARA, ACECQA, OSM, GTFS, ABS SEIFA.
- Finance. Price-to-income ratio (0.25), annual growth (0.20), rental yield (0.15), supply pipeline (0.15), days on market (0.15), FHB eligibility (0.10). Sources: NSW VG, ABS Census, SQM Research, NSW Planning.
- Lifestyle. Walkability score (0.30), amenity density (0.30), cafe density (0.20), parks density (0.20). Sources: OSM amenity data.
How factors are scored
For each factor, we compute a z-score: the factor’s distance from the Greater Sydney mean, in standard deviations. A z of 0 sits at the Sydney mean. A z of +1.0 is one standard deviation better than the mean for that factor. We invert factors where lower is better (for example flood exposure, bushfire exposure, road crashes, and building defect orders), so a higher z always means a better result. The z-scores are then weighted into a single dimension score, and the four dimension scores are averaged into the composite.
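As an illustration only, the scoring step could be sketched in Python as follows. The function names, the use of the population standard deviation, and the dictionary layout are assumptions made for this example, not the production engine. The weights are the Safety weights listed above.

```python
from statistics import mean, pstdev

# Safety weights as listed above (they sum to 1.0).
SAFETY_WEIGHTS = {
    "crime_rate": 0.30, "road_safety": 0.20, "flood": 0.20,
    "bushfire": 0.15, "defect_orders": 0.15,
}

def z_scores(values, lower_is_better=False):
    """Distance from the mean in standard deviations, flipped so higher is always better."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    sign = -1.0 if lower_is_better else 1.0
    return [sign * (v - mu) / sigma for v in values]

def dimension_score(factor_z, weights):
    """Weighted sum of one suburb's factor z-scores."""
    return sum(weights[name] * z for name, z in factor_z.items())

def composite(dimension_scores):
    """Equal-weight average of the dimension scores."""
    return sum(dimension_scores) / len(dimension_scores)
```

With `lower_is_better=True`, the suburb with the lowest raw value (for example the smallest flood exposure) receives the highest z-score.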
Grade boundaries
The letter grade is derived from the suburb’s percentile rank across Greater Sydney. The mapping is:
| Grade | Percentile range |
|---|---|
| A+ | Top 5% (≥ 95th) |
| A | 5–15% (85th–94th) |
| A– | 15–25% (75th–84th) |
| B+ | 25–40% (60th–74th) |
| B | 40–60% (40th–59th) |
| B– | 60–75% (25th–39th) |
| C+ | 75–85% (15th–24th) |
| C | 85–92% (8th–14th) |
| C– | 92–97% (3rd–7th) |
| D | Bottom 3% (below 3rd) |
The percentile is computed against all Sydney metro suburbs. A suburb at the 72nd percentile scores higher than 72% of suburbs in the comparison universe. The primary number shown on each suburb page is this percentile; the letter is a shorthand derived from it.
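The table above can be read as a simple lookup from percentile to letter. The sketch below is one way to express it; the tie rule in `percentile_rank` (counting only strictly lower scores) is an assumption, and the letters use a plain hyphen.

```python
# (lower percentile bound, letter), highest band first, following the table above.
GRADE_BANDS = [
    (95, "A+"), (85, "A"), (75, "A-"), (60, "B+"), (40, "B"),
    (25, "B-"), (15, "C+"), (8, "C"), (3, "C-"),
]

def percentile_rank(score, all_scores):
    """Share of suburbs in the comparison universe scoring strictly lower, as 0-100."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)

def letter_grade(percentile):
    """Return the first band whose lower bound the percentile reaches, else D."""
    for lower_bound, letter in GRADE_BANDS:
        if percentile >= lower_bound:
            return letter
    return "D"
```

For example, a suburb at the 72nd percentile falls in the 60th to 74th band and maps to B+.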
The comparison universe is currently Greater Sydney. Support for LGA, region, and price-band comparisons is planned.
Factor preprocessing
Raw factor values are preprocessed before z-scoring to handle skewed distributions and outliers. Three transforms are used:
- Winsorisation (all factors). Values below the 1st percentile or above the 99th percentile are clamped to those boundary values. This prevents a single outlier postcode (e.g. Vaucluse on price, Sydney CBD on amenity count) from dominating the distribution.
- Log transform. Applied to right-skewed counts and ratios: amenity density, cafe density, parks density, and price-to-income ratio. Compresses the long right tail so the z-score discriminates across the bulk of suburbs, not just the extremes.
- Rank transform. Applied to crime rate, road safety (serious/fatal crashes), and building defect orders. These distributions have many zeros and a heavy right tail. Converting to percentile rank [0, 1] ensures the z-score is well-behaved regardless of the raw distribution shape.
All other factors (flood risk, bushfire risk, school ICSEA, childcare quality, commute time, SEIFA, annual growth, rental yield, supply pipeline, days on market, walkability score) use raw values without transformation, since their distributions are roughly normal or bounded.
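A minimal sketch of the three transforms, using only the Python standard library. The interpolation method for percentiles, the use of `log1p` (so zero counts are tolerated), and the average-rank rule for ties are all assumptions made for this example.

```python
import math

def percentile(sorted_vals, p):
    """Linear-interpolated percentile (p in 0-100) of an already sorted list."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def winsorise(values, lower=1.0, upper=99.0):
    """Clamp values to the [lower, upper] percentile boundaries."""
    s = sorted(values)
    lo, hi = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo), hi) for v in values]

def log_transform(values):
    """log(1 + x), an assumed variant that tolerates zero counts."""
    return [math.log1p(v) for v in values]

def rank_transform(values):
    """Percentile rank in [0, 1]; tied values share their average rank."""
    s = sorted(values)
    n = len(s)
    if n == 1:
        return [0.5]
    out = []
    for v in values:
        first = s.index(v)              # first position of v in sorted order
        last = first + s.count(v) - 1   # last position of v in sorted order
        out.append(((first + last) / 2) / (n - 1))
    return out
```

The rank transform here is quadratic in the number of suburbs, which is acceptable for a sketch but would be replaced by a sort-based version at scale.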
Bayesian smoothing
Some postcodes have very thin data (few sales, low population, suppressed cells). Their raw z-scores can be unstable. We smooth thin-postcode z-scores toward the Greater Sydney mean using a Bayesian prior, with the smoothing strength tied to the sample size. A postcode with 1000 underlying observations is barely smoothed; a postcode with 10 observations is smoothed heavily toward the mean. This makes the grades comparable across suburb sizes.
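One common way to implement this kind of smoothing is a shrinkage weight of n / (n + k), where n is the number of observations and k is a prior strength. The sketch below assumes that form; the value k = 50 is a placeholder chosen for illustration, not the value Stickybeak uses.

```python
def smooth_z(raw_z, n_obs, prior_strength=50.0, prior_mean=0.0):
    """Shrink a z-score toward the prior mean; thin samples are pulled harder.

    prior_strength is a hypothetical tuning constant. In z-score space the
    Greater Sydney mean is 0, which is why prior_mean defaults to 0.0.
    """
    weight = n_obs / (n_obs + prior_strength)
    return weight * raw_z + (1.0 - weight) * prior_mean
```

Under these assumptions a raw z of 2.0 stays close to 2.0 with 1000 observations, but drops to about a sixth of that with 10 observations.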
Missing-data handling
We use a two-tier approach to missing data, replacing the earlier blanket 50% exclusion rule:
- Partial data. If a single factor representing more than 30% of the dimension’s weight is missing, the dimension still scores using the remaining factors (re-normalised), but a “partial data” badge appears on the dimension tile so you know the grade is less reliable.
- Suppressed. If factors representing more than 60% of the dimension’s weight are missing, the dimension is suppressed entirely. The tile renders an em-dash. No grade is assigned.
The composite grade is computed from the available dimensions. If more than two of four dimensions are suppressed, the whole grade is suppressed.
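The two tiers can be sketched as below. As a simplifying assumption, this example applies both thresholds to the total share of missing weight in the dimension; the function names and the status strings are hypothetical.

```python
def score_dimension(factor_z, weights, partial_at=0.30, suppress_at=0.60):
    """factor_z maps factor name -> z-score, or None when the factor is missing.

    Returns (score, status) with status in {"ok", "partial", "suppressed"}.
    """
    total = sum(weights.values())
    present = {f: z for f, z in factor_z.items() if z is not None}
    present_weight = sum(weights[f] for f in present)
    missing_share = 1.0 - present_weight / total
    if missing_share > suppress_at:
        return None, "suppressed"
    # Re-normalise: divide by the weight that is actually present.
    score = sum(weights[f] * z for f, z in present.items()) / present_weight
    status = "partial" if missing_share > partial_at else "ok"
    return score, status

def composite_score(dimension_scores):
    """Average the available dimensions; suppress if more than two are missing."""
    available = [s for s in dimension_scores if s is not None]
    if len(dimension_scores) - len(available) > 2:
        return None
    return sum(available) / len(available)
```

Re-normalising by the present weight keeps a dimension score on the same scale whether or not a factor is missing.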
Low-discrimination dimensions
When the standard deviation of a dimension’s scores across Greater Sydney is very small (below 0.15), the ranking signal is weak: small differences between suburbs are noise, not signal. We surface a “low discrimination” badge on the dimension tile so you don’t over-read small ranking differences. The badge makes clear where the data is doing real work and where it is mostly producing noise.
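The check itself is small. A sketch, assuming the population standard deviation is the spread measure and using a hypothetical function name:

```python
from statistics import pstdev

def low_discrimination(dimension_scores, threshold=0.15):
    """True when the spread of a dimension's scores across suburbs is below the threshold."""
    return pstdev(dimension_scores) < threshold
```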
How affordability is calculated
The serviceability card answers: “what would a typical mortgage look like in this suburb?” The calculation follows four steps:
- Median price. We take the most recent median sale price from the NSW Valuer General, filtered to the suburb’s dominant dwelling type (house or unit, determined by ABS Census dwelling-mix data).
- Loan amount. loan = median_price × 0.80 (assumes a 20% deposit).
- Monthly repayment. Standard P&I (principal and interest) formula over 30 years at the current Sydney variable owner-occupier rate from the RBA’s Table F6.
- Repayment-to-income ratio. RTI = (monthly_repayment / monthly_income) × 100, where monthly income is the suburb’s median weekly household income (ABS Census 2021) multiplied by 52/12.
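The four steps can be sketched with the standard annuity formula for a principal-and-interest loan. The function names are illustrative, and the 6% rate in the usage note is a made-up input, not the current RBA figure.

```python
def monthly_repayment(loan, annual_rate, years=30):
    """Standard P&I repayment: loan * r / (1 - (1 + r) ** -n), with monthly r and n."""
    r = annual_rate / 12
    n = years * 12
    if r == 0:
        return loan / n
    return loan * r / (1 - (1 + r) ** -n)

def repayment_to_income(median_price, weekly_income, annual_rate, deposit=0.20):
    """Repayment-to-income ratio (RTI) as a percentage of gross monthly income."""
    loan = median_price * (1 - deposit)
    monthly_income = weekly_income * 52 / 12
    return monthly_repayment(loan, annual_rate) / monthly_income * 100
```

For example, `repayment_to_income(1_000_000, 2_500, 0.06)` models an $800,000 loan against a monthly income of about $10,833 and gives an RTI of roughly 44%.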
The 30% stress line
The 30% threshold comes from housing economics: when mortgage repayments exceed 30% of a household’s gross income, the household is considered to be in “housing stress.” This benchmark is used by the ABS, the RBA, and most lenders as a rule-of-thumb affordability ceiling.
Stickybeak uses three verdict labels based on this threshold:
- Within reach (RTI ≤ 30%): repayments are below the stress threshold.
- Tight (30% < RTI ≤ 50%): possible but leaves little room for other costs.
- Stretch (> 50%): typical local incomes would not service this loan.
These verdicts are based on the suburb’s median income, not yours. For a personalised verdict, enter your own deposit and income in the “Your numbers” panel (coming soon).
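Expressed as code, the three labels are a pair of threshold checks. This sketch assumes an RTI of exactly 50% counts as Tight.

```python
def verdict(rti_percent):
    """Map a repayment-to-income percentage to one of the three verdict labels."""
    if rti_percent <= 30:
        return "Within reach"
    if rti_percent <= 50:
        return "Tight"
    return "Stretch"
```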
FHB scheme eligibility
The eligibility card checks the suburb’s median price against the current price caps for each scheme (FHBAS, FHOG, First Home Guarantee, Help to Buy). These caps are set by the NSW and Commonwealth governments and refresh each July. Stickybeak updates its thresholds within a week of each cap change.
Eligibility is indicative only. Your actual eligibility depends on factors Stickybeak does not know: your citizenship, prior property ownership, income, and the specific property you are buying.
Air quality
AQI (Air Quality Index) is calculated from pollutant concentrations (PM2.5, PM10, NO₂, O₃, SO₂) reported by NSW DCCEEW monitoring stations. Each suburb is assigned to the nearest station with active data. The AQI value shown is the highest sub-index across all pollutants at that station.
The station distance is disclosed on every air quality card. If the station is more than 5 km away, a proximity notice appears.
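A sketch of the station assignment, assuming great-circle (haversine) distance and a simple dictionary per station. The data layout, the field names, and the rule that a station with no sub-indices counts as inactive are assumptions made for this example.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def air_quality_card(suburb_latlon, stations, notice_km=5.0):
    """stations: list of dicts with 'name', 'lat', 'lon', 'sub_indices' (pollutant -> sub-index).

    Stations with no sub-indices are treated as inactive and skipped.
    """
    active = [s for s in stations if s["sub_indices"]]
    if not active:
        return None
    nearest = min(active, key=lambda s: haversine_km(*suburb_latlon, s["lat"], s["lon"]))
    distance = haversine_km(*suburb_latlon, nearest["lat"], nearest["lon"])
    # The AQI is the highest sub-index across pollutants at the chosen station.
    pollutant, aqi = max(nearest["sub_indices"].items(), key=lambda kv: kv[1])
    return {"station": nearest["name"], "distance_km": distance, "aqi": aqi,
            "driver": pollutant, "proximity_notice": distance > notice_km}
```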
School ratings
ICSEA (Index of Community Socio-Educational Advantage) is published by ACARA for every Australian school. The national average is 1000. Higher scores correlate with stronger academic outcomes. Stickybeak shows individual school ICSEA scores and an average for the postcode.
ICSEA is a community-level indicator, not a direct measure of teaching quality. A school with a lower ICSEA may still deliver excellent education.
Crime data
Crime data comes from the NSW Bureau of Crime Statistics and Research (BOCSAR), drawn from the Recorded Criminal Incidents open dataset. Each suburb on Stickybeak is matched to BOCSAR’s own suburb geography (police-recorded incidents are coded at suburb level, not postcode) and we surface the 12-month rolling total alongside the rate per 100,000 residents, indexed against Greater Sydney.
The card shows four things:
- Rate per 100k. Total recorded incidents in the last 12 months divided by ABS population, scaled to 100,000 residents. This is the headline comparator.
- Vs Greater Sydney. The percentage difference between this suburb’s rate and the Greater Sydney average across the same 12-month window. Suburbs within ±5% are labelled “around average” with no colour cue.
- Top three offence categories. The three offence categories (out of 21 ANZSOC-mapped categories) that contributed the most incidents in this suburb.
- Domestic vs non-domestic assault split. BOCSAR flags assault as either DV-related or non-DV. We separate the two so the headline isn’t distorted: rising DV reports often reflect better reporting rather than more violence.
Small suburbs (under 5,000 residents) have their rate Bayesian-shrunk toward the LGA mean to reduce small-count noise. The card discloses when smoothing has been applied. Raw counts are visible on hover.
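The headline numbers on the card can be sketched as follows. The labels "above average" and "below average", the shrinkage form, and the `prior_pop` strength are hypothetical; only the ±5% band and the 5,000-resident threshold come from the text above.

```python
def rate_per_100k(incidents_12m, population):
    """Recorded incidents over 12 months, scaled to 100,000 residents."""
    return incidents_12m / population * 100_000

def vs_sydney(suburb_rate, sydney_rate, band=5.0):
    """Percentage difference from the Greater Sydney rate, plus a label."""
    diff = (suburb_rate - sydney_rate) / sydney_rate * 100
    if abs(diff) <= band:
        label = "around average"
    else:
        label = "above average" if diff > 0 else "below average"
    return diff, label

def shrink_rate(suburb_rate, population, lga_rate, threshold=5_000, prior_pop=5_000):
    """Pull small-suburb rates toward the LGA mean; prior_pop is an assumed strength.

    Returns (rate, smoothing_applied).
    """
    if population >= threshold:
        return suburb_rate, False
    w = population / (population + prior_pop)
    return w * suburb_rate + (1 - w) * lga_rate, True
```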
These are incidents reported to and recorded by NSW Police. Not all crime is reported; reporting rates vary by offence type, and DV reporting in particular has risen materially over the last decade as legal and cultural conditions have shifted. Treat trends with that lens.
Data is from BOCSAR’s suburb-level file (annual refresh) and LGA monthly file (quarterly refresh), both published under CC BY 4.0 by the NSW Department of Communities and Justice.
Climate projections
Climate projection data comes from NARCliM2.0, a regional climate modelling program run by the NSW Government through AdaptNSW (August 2024 release). NARCliM downscales global climate models to a 4 km grid across NSW, producing localised projections that are more relevant to suburb-level decisions than global averages.
Stickybeak shows three metrics across two time horizons (2030 and 2050), compared to a historical baseline:
- Days above 35°C. The projected number of days per year where maximum temperature exceeds 35°C. More extreme heat days affect comfort, energy costs, and health risk.
- Severe fire weather days. Days per year where the Forest Fire Danger Index (FFDI) exceeds the “severe” threshold. Higher counts indicate increased bushfire weather risk.
- Rainfall intensity multiplier. A factor applied to current-period rainfall intensity for extreme events (e.g. 1-in-100-year storms). A value of 1.2 means 20% more intense rainfall, which affects flood risk and drainage adequacy.
Two emission scenarios are available:
- SSP2-4.5 (moderate). Assumes emissions peak around 2040 then decline. This is the “middle of the road” pathway.
- SSP3-7.0 (high). Assumes emissions continue rising through mid-century. This is a higher-risk pathway useful for stress-testing property decisions.
Projections are aggregated at the SA2 level (Statistical Area Level 2, roughly suburb-sized). Where a postcode spans multiple SA2s, values are population-weighted. The card discloses which SA2(s) contributed to the figure.
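Population weighting across SA2s is a weighted mean. A minimal sketch, assuming each SA2 contributes the population that lives inside the postcode (the tuple layout and function name are illustrative):

```python
def population_weighted(sa2_parts):
    """sa2_parts: list of (sa2_name, population_in_postcode, metric_value).

    Returns (weighted_value, contributing_sa2_names).
    """
    total_pop = sum(pop for _, pop, _ in sa2_parts)
    if total_pop == 0:
        return None, []
    value = sum(pop * metric for _, pop, metric in sa2_parts) / total_pop
    contributors = [name for name, pop, _ in sa2_parts if pop > 0]
    return value, contributors
```

For example, 3,000 residents in an SA2 projecting 10 hot days and 1,000 residents in one projecting 14 hot days give a postcode value of 11 days.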
Climate projections are inherently uncertain. They represent the best available science, not predictions. Use them as one signal among many when evaluating long-term liveability.
Known limitations
The grade is postcode-aggregate, not property-specific. School catchments cross postcode boundaries; we use the schools whose primary catchment overlaps the postcode polygon, which is a rough heuristic. Sale prices use the NSW Valuer General 12-month window and lag the live market by 6 to 12 months. Crime data is suburb-level from BOCSAR, with Bayesian shrinkage applied to suburbs under 5,000 residents to reduce small-count noise. The Safety dimension excludes air quality (which lives under Lifestyle); the dimensions are not independent and a buyer who cares about a specific factor should read the per-card data, not just the composite.
All data has limits. Medians can be skewed by small sample sizes. Census data is from 2021 and may not reflect recent demographic shifts. Air quality readings from distant stations are approximations. For a full list of caveats, see data limitations.
Nothing on Stickybeak is financial advice. Always consult a licensed mortgage broker or financial adviser before making a purchase decision.
Risk floors
A suburb with a buyer-harm factor in the worst decile (top 10% of risk across Greater Sydney) has its composite grade capped at C+, regardless of how strong other dimensions are. A buyer doesn’t trade off “great cafes” against “house might burn down.”
The following factors can trigger a cap:
- Bushfire risk (Very High). Postcode-level bushfire-mapped area in the worst decile.
- Flood risk (High). Postcode-level flood-mapped area in the worst decile.
- Crime rate. Rate per 100,000 residents in the worst decile across Greater Sydney.
- Building defect orders. Active building commissioner orders in the worst decile.
When a cap fires, the grade strip shows a non-dismissible callout naming the risk factor: “Capped at C+: bushfire risk (Very High).” There is a single cap level; multiple triggers do not produce a harsher cap.
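The cap logic can be sketched as below. Representing each risk factor as a risk percentile (100 = riskiest suburb) and treating 90 and above as the worst decile are assumptions made for this example.

```python
GRADE_ORDER = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D"]

def apply_risk_cap(grade, risk_percentiles, cap="C+", worst_decile=90.0):
    """risk_percentiles: factor name -> percentile of risk (100 = riskiest suburb).

    Returns (possibly capped grade, list of factors that triggered the cap).
    """
    triggers = [f for f, p in risk_percentiles.items() if p >= worst_decile]
    if not triggers:
        return grade, []
    # A lower index is a better grade; a grade already at or below the cap is left alone.
    capped = grade if GRADE_ORDER.index(grade) >= GRADE_ORDER.index(cap) else cap
    return capped, triggers
```

The returned trigger list is what a callout would name; two triggers still produce the same C+ cap as one.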
Does it work?
We test whether higher-graded suburbs have, in retrospect, delivered stronger capital growth. We are not predicting future growth. We are checking whether the dimensions we weight are correlated with an outcome buyers care about. Past performance is not a forecast.
The backtest compares each dimension score (and the balanced composite) against annual capital growth from NSW Valuer General data. Calibration charts are updated annually with each new VG data release. Confounds we did not control for include rezoning, infrastructure announcements, and the broader rate cycle.
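The text does not specify which correlation measure the backtest uses. As one plausible choice, the sketch below computes a Spearman rank correlation between a score and realised annual growth, using only the standard library; all function names are illustrative.

```python
from statistics import mean

def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(scores, growth):
    """Rank correlation between suburb scores and realised annual growth."""
    return pearson(ranks(scores), ranks(growth))
```

A rank correlation is less sensitive to a few suburbs with extreme growth than a plain Pearson correlation on raw values, which is why it is used in this sketch.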
Even a weak positive correlation builds credibility; having no test at all would be the bigger problem. Full calibration charts will be published once the backtest has been run against the current grade engine.
Last reviewed
Last reviewed May 2026. Material methodology changes are logged on this page.