Project Information
Industry:
Retail
Techniques:
GPT-powered Large Language Models
Technology Stack:
Power BI, Azure Databricks, Synapse Analytics
Business Benefits:
  • Structuring of free text fields for analysis
  • Improved decisioning regarding cannibalization
  • New insights

    Overview

    TrueNorth partnered with a retail organization seeking to unlock deeper insights from unstructured product data.


    The goal was to transform free-text product descriptions into structured, analyzable data that could be used to assess product overlap, cannibalization, and substitution trends across the portfolio.

    The Challenge

    The client’s vast inventory contained thousands of product entries with inconsistent or incomplete descriptions, often recorded manually or brought in through third-party imports.

    This created several challenges:

  • Difficulty comparing similar products due to non-standardized naming conventions
  • Limited visibility into substitute or competing items
  • Time-consuming manual analysis by category managers

    The business needed a solution that could automatically extract and structure critical product details, such as brand, measurement, and quantity, from unstructured text fields at scale.

    Our Approach

    TrueNorth implemented an AI-powered text extraction framework using large language models (LLMs) to standardize and enrich product metadata.

    1. Data Ingestion & Processing
    All product descriptions were captured as free text from the client’s existing retail database, creating a comprehensive corpus for processing.
    2. Intelligent Text Parsing with LLMs
    Custom-engineered prompts, run through Azure OpenAI APIs, extracted key product details, including:

  • Brand names
  • Product types
  • Measurement units
  • Quantities

    This information was validated and mapped into a structured data format suitable for analysis.
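The parse-and-validate step described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the prompt wording, the JSON key names, the `ProductAttributes` record, and the `complete` callable (a stand-in for whatever function sends a prompt to an Azure OpenAI deployment and returns the reply text) are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical prompt; the wording actually used in the project is not published.
PROMPT_TEMPLATE = (
    "Extract the brand, product type, measurement unit and quantity from the "
    "product description below. Reply with a single JSON object using the keys "
    '"brand", "product_type", "unit" and "quantity". Use null for anything '
    "that is not stated.\n\nDescription: {description}"
)


@dataclass
class ProductAttributes:
    """Hypothetical structured record for one product description."""
    brand: Optional[str]
    product_type: Optional[str]
    unit: Optional[str]
    quantity: Optional[float]


def build_prompt(description: str) -> str:
    return PROMPT_TEMPLATE.format(description=description.strip())


def _clean(value) -> Optional[str]:
    # Treat empty strings and non-string values as "not stated".
    if isinstance(value, str) and value.strip():
        return value.strip()
    return None


def parse_response(raw: str) -> ProductAttributes:
    """Validate the model's reply and map it onto the structured record."""
    data = json.loads(raw)  # malformed JSON raises ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    quantity = data.get("quantity")
    if quantity is not None:
        quantity = float(quantity)  # non-numeric values raise an error
    unit = _clean(data.get("unit"))
    return ProductAttributes(
        brand=_clean(data.get("brand")),
        product_type=_clean(data.get("product_type")),
        unit=unit.lower() if unit else None,
        quantity=quantity,
    )


def extract(description: str, complete: Callable[[str], str]) -> ProductAttributes:
    # `complete` is injected so the LLM call can be swapped or stubbed out.
    return parse_response(complete(build_prompt(description)))


def fake_complete(prompt: str) -> str:
    # Canned reply used here instead of a real API call.
    return '{"brand": "Acme", "product_type": "Cola", "unit": "ML", "quantity": "330"}'


result = extract("ACME cola 330ml can", fake_complete)
print(result)
```

Keeping the model call behind a plain callable means the validation logic can be tested without network access, and replies that fail validation can be routed to manual review rather than loaded into the analytics layer.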
    3. Integration & Visualization
    The structured data was fed into Power BI dashboards and integrated into Azure Databricks and Synapse Analytics, enabling the business to visualize relationships, overlaps, and cannibalization effects between similar SKUs.
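Once the attributes are structured, finding overlapping SKUs becomes a simple grouping problem. The sketch below shows one possible way to flag candidate pairs; the sample records, the field names, and the choice to treat SKUs with the same product type, unit, and quantity as potential substitutes are all assumptions for illustration. A real cannibalization analysis would also draw on sales data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical structured records, shaped like the output of the extraction step.
records = [
    {"sku": "A1", "brand": "Acme", "product_type": "cola", "unit": "ml", "quantity": 330.0},
    {"sku": "B7", "brand": "Bolt", "product_type": "cola", "unit": "ml", "quantity": 330.0},
    {"sku": "A2", "brand": "Acme", "product_type": "cola", "unit": "ml", "quantity": 500.0},
    {"sku": "C3", "brand": "Crisp", "product_type": "chips", "unit": "g", "quantity": 150.0},
]


def overlap_candidates(rows):
    """Return SKU pairs that share product type, unit, and quantity."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["product_type"], row["unit"], row["quantity"])
        groups[key].append(row["sku"])
    pairs = []
    for skus in groups.values():
        # Every pair inside a group is a candidate for overlap review.
        pairs.extend(combinations(sorted(skus), 2))
    return sorted(pairs)


pairs = overlap_candidates(records)
print(pairs)  # [('A1', 'B7')]
```

The same grouping could be expressed in a Databricks notebook or as a Power BI measure; plain Python is used here only to keep the example self-contained.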
    The Results
  • Structured thousands of free-text fields into actionable product data
  • Improved decision-making on product substitution and assortment planning
  • Delivered new insights into product cannibalization and overlap
    Business Impact

    The AI-driven data structuring pipeline allowed the client to transform fragmented text data into a reliable, searchable dataset, significantly reducing manual analysis time and improving product-level decision-making.

    By integrating LLMs into the retail analytics stack, TrueNorth enabled the business to move from raw text to structured product intelligence at scale.

    Can this approach be applied beyond retail data?
    Yes, the same LLM-based framework can structure free-text data in manufacturing, logistics, or healthcare environments.
    How accurate is the extraction?
    Over 95% accuracy was achieved after refining the prompts and validating the extracted data by cross-referencing it against existing fields.
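As a rough illustration of what cross-referencing against existing fields can look like, the sketch below computes a per-field match rate between extracted values and values already held in the database. The sample data, the field names, and the `field_accuracy` helper are hypothetical, and the figures it prints are not the project's results.

```python
def normalise(value) -> str:
    # Compare values case-insensitively and ignore surrounding whitespace.
    return "" if value is None else str(value).strip().lower()


def field_accuracy(extracted, reference, fields):
    """Share of SKUs per field where the extracted value matches the reference.

    Only SKUs present in both datasets, with a populated reference value,
    are counted.
    """
    scores = {}
    for field in fields:
        matches = total = 0
        for sku, ref_row in reference.items():
            ref_value = ref_row.get(field)
            if ref_value is None or sku not in extracted:
                continue
            total += 1
            if normalise(extracted[sku].get(field)) == normalise(ref_value):
                matches += 1
        if total:
            scores[field] = matches / total
    return scores


# Hypothetical sample: LLM output versus fields already in the database.
extracted = {
    "A1": {"brand": "Acme", "unit": "ml"},
    "B7": {"brand": "Bolt ", "unit": "l"},
    "C3": {"brand": "Crisp", "unit": "g"},
}
reference = {
    "A1": {"brand": "ACME", "unit": "ml"},
    "B7": {"brand": "Bolt", "unit": "ml"},
    "C3": {"brand": None, "unit": "g"},
}

scores = field_accuracy(extracted, reference, ["brand", "unit"])
print(scores)
```

Reporting accuracy per field rather than as a single number makes it easier to see which attribute needs further prompt work.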
    Does the model continue learning?
    Yes. Feedback loops from new data imports are used to keep refining the model, increasing precision over time.