Case Study

Building a Lightweight AI Recommendation Engine for Real Estate.

By Enzo García.

Recently, we implemented a high-performance recommendation system for a Real Estate client.

In the world of software development, “Artificial Intelligence” is often synonymous with massive neural networks, expensive GPUs, and weeks of model training. But for many business applications, that approach is overkill.

At weKnow Inc., we believe in the principle of Right-Sized Engineering. Recently, we implemented a high-performance recommendation system for a Real Estate client. The goal? To solve the “Cold Start” problem and deliver hyper-personalized property alerts without the overhead of “Heavy AI.”

Here is a deep dive into how we built a Two-Stage Content-Based Filtering System—the same architectural strategy used by Spotify’s Discover Weekly and Netflix—using a lightweight, statistical learning approach.

1. Signal Processing: Time Decay Algorithm.

Objective: Transform the clickstream (visit history) into a quantifiable Interest Score.

We use an Implicit Feedback model. Unlike a Like or Rating (which are explicit), interest is inferred based on frequency and recency.

Mathematical Formula:
For each property p and user u, we calculate the score Su,p:

  • Log_10(1 + Freq): We apply logarithmic smoothing to frequency. This prevents a “bot” user or a very obsessive one who clicks 100 times from distorting the model. The difference between 1 and 10 visits is significant; between 50 and 60 is marginal.
  • Exponential Decay: Delta t is the time elapsed since the last visit, and H is the Half-Life, set to 14 days.

Effect: The value of a visit is reduced by 50% every 14 days. This ensures the system prioritizes current behavior over historical behavior (“What have you done lately?”).

The “What Have You Done Lately?” Philosophy

By setting a Half-Life of 14 days, the value of a user’s interaction drops by 50% every two weeks. This ensures the system prioritizes current behavior over historical data. If a user looked for rentals a year ago but is looking for luxury purchases today, the system adapts automatically.

2. User Profiling: Centroid Vectorization (Weighted Centroids).

Objective: Dimensionality Reduction. Convert n visited properties into a single “User Vector” (V_u).

Instead of traditional Collaborative Filtering, which requires massive matrices of User-Item data, we used Feature Weighted Averaging to calculate the “Center of Gravity” of the user’s interests.

To determine the Target Price (P_target), we do not use a simple arithmetic average (which would be a statistical error). We use the Score-weighted mean Score S:

Where P_i is the price of property i and S_u,i is the score calculated in the previous phase.

Technical Impact: This creates a “Centroid” that dynamically shifts toward the properties the user visited more recently or more frequently. If the user viewed a $5M home a year ago (low Score) and ten $200k homes yesterday (high Score), the centroid will be at $200k.

Intent Detection: Weighted counters are applied to classify the user into behavior clusters: Buyer, Renter, or Hybrid.

3. Retrieval & Ranking: Gaussian Density Functions.

Objective: Probabilistic matchmaking over new inventory (Cold Start Items).

For the final search, we do not use simple binary filters (SQL  WHERE price = X). We use OpenSearch Function Score Query, applying a probability density function.

The Technique: Gaussian Decay Function
To rank candidate properties by price, we apply a bell curve (Gaussian) centered on the user’s P_target.

  • Origin (P_target): The highest point of the curve (Score 1.0).
    Scale (sigma): We define a tolerance of 20%.
    Decay: How quickly we penalize deviation.

 

Algorithm Visualization:

  • A property with the exact price gets 1.0.
  • A property 10% more expensive gets ~0.9.
  • A property 50% more expensive gets ~0.1.

 

This enables a “Fuzzy” Ranking: The system understands “Close is good enough,” mimicking human search cognition, rather than being a rigid database system.

Executive Summary of Technologies

Component

Technology / Algorithm

Industry Standard

Data Ingestion

Batch Processing, Implicit Feedback

Google Analytics, Mixpanel

Feature Engineering

Logarithmic Smoothing, Exponential Decay

Reddit (Hot Algorithm), HackerNews

Profiling

Weighted Centroids (Vector Space)

Stitch Fix, Pinterest

Search Engine

OpenSearch (Lucene), Gaussian Function Score 

Amazon, Uber, Tinder

This system is, in essence, a lightweight AI engine that learns from immediate user interaction to predict their next purchasing preference.

FAQ.

Technically, any system that makes autonomous decisions based on data to achieve a goal, simulating intelligent behavior, is AI.

Your system doesn’t have hand-written rules (“If the user is Juan, show him houses in Escazú”). Instead, it:

  • Learns: Observes behavior (clicks, dates).
  • Generalizes: Deduces an abstract pattern (the Centroid/Profile).
  • Predicts: Guesses what the user will like from a list of things they have never seen.

By using vectors (user profile), weighted averages, and probability functions (Gaussian), you are using Statistical Learning, which is the fundamental basis of Machine Learning.

Here is the key difference compared to systems like Netflix or ChatGPT:

FeatureHeavy AI (Deep Learning)Engine (Lightweight AI)
TrainingRequires days/hours of training (model.fit) with millions of data points before it can function.Does not require training. It computes the model “on the fly” in milliseconds when the script runs.
InfrastructureRequires expensive GPUs (graphics cards) and large amounts of RAM.Runs on a standard (low-cost) CPU or in a Serverless function.
Complexity“Black Box.” It is difficult to know why the neural network recommended something.“White Box.” It is explainable. We know exactly why it recommended House X because the Price Score was 0.9.
DataRequires millions of interactions to avoid errors.Works well with little data (Cold Start). With just 3 clicks, it already starts working.

Are you looking to build intelligent, scalable platforms?

At weKnow Inc., we provide the specialized engineering talent to turn complex data problems into elegant, efficient solutions.