Exploring the Impact of LLMs on Data Analysis in Business
When reflecting on the difficulties of interpreting complex systems, I recall a specific experience from my time at Tripadvisor. I was collaborating with our Machine Learning team to analyze customer behaviors that predicted high lifetime value (LTV) for the Growth Marketing team. We worked alongside a skilled Ph.D. Data Scientist who developed a logistic regression model and presented its coefficients as an initial assessment.
During our review with the Growth team, confusion arose. Logistic regression coefficients can be challenging to interpret due to their non-linear scale, and the most predictive features weren't easily manipulable by the Growth team. After a brief moment of contemplation, we created a follow-up ticket for further analysis, but, as often happens, both teams soon shifted to their next priorities. The Data Scientist had urgent tasks related to our search ranking algorithm, and, in practical terms, the Growth team discarded the analysis.
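The interpretability problem the Growth team ran into is common: logistic regression coefficients live on a log-odds scale, which is hard to reason about directly. One standard remedy is to exponentiate the coefficients into odds ratios. A minimal sketch, with entirely made-up feature names and coefficient values (not from the Tripadvisor analysis):

```python
import numpy as np

# Hypothetical coefficients from a fitted logistic regression predicting
# high-LTV membership; features and values are illustrative only.
features = ["saved_a_trip", "wrote_a_review", "opened_app_within_7d"]
coefficients = np.array([0.85, 0.40, 1.20])  # log-odds scale

# Exponentiating a log-odds coefficient yields an odds ratio: "holding the
# other features constant, this behavior multiplies the odds of being
# high-LTV by exp(coef)."
odds_ratios = np.exp(coefficients)

for name, ratio in zip(features, odds_ratios):
    print(f"{name}: odds ratio ~ {ratio:.2f}")
```

Framing the result as "writing a review multiplies the odds of high LTV by about 1.5x" is the kind of translation that might have kept the Growth team engaged.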
That experience still lingers in my mind — did we abandon it too quickly? What if our feedback loop had been tighter? What if we had persisted in our exploration? What might further iterations have uncovered?
Unlocking Exploratory Analysis at Scale
The anecdote above illustrates a case of exploratory analysis that didn't quite succeed. Unlike descriptive analysis, which merely describes current situations, exploratory analysis aims to deepen understanding of complex systems without posing specific questions. Consider the variety of inquiries one might encounter in a business setting.
Exploratory questions are generally open-ended and strive to enhance comprehension of intricate problem areas. This type of analysis often demands more iterative cycles and closer collaboration between the domain expert and the analyst, who are rarely the same individual. In the previous example, the partnership lacked sufficient depth, the feedback loops were too long, and we didn't invest enough time.
These hurdles are why many professionals endorse a “paired analysis” approach for data exploration. Similar to paired programming, this method pairs an analyst with a decision-maker for real-time exploration. Unfortunately, such close collaboration seldom occurs in practice due to constraints on resources and time.
Imagine your workplace — what if every decision-maker had an experienced analyst at their side? What if that analyst could respond immediately to their inquiries? What if they could seamlessly follow their partner's thought processes, generating ideas and hypotheses spontaneously?
This is the opportunity that LLMs present in analytics — the potential for anyone to engage in exploratory analysis with the support of a technical analyst.
Let's explore how this scenario might manifest in practice. The following case study and demonstrations will showcase how a decision-maker with domain expertise could effectively collaborate with an AI analyst capable of querying and visualizing data. We will compare the exploratory experiences provided by ChatGPT's GPT-4o model with a manual analysis conducted in Tableau, which will also serve as a check against potential inaccuracies.
Note on Data Privacy: The video demonstrations linked in the upcoming section utilize purely synthetic datasets that mimic realistic business patterns. For general guidelines on privacy and security for AI Analysts, refer to Data Privacy.
Case Study: Ecommerce Analytics
Imagine being a busy executive at an e-commerce clothing website. You have your executive summary dashboard with established KPIs, but one morning you notice something alarming: marketing revenue has dropped by 45% month-over-month, and the cause isn't immediately clear.
Your thoughts race in several directions: What is behind the revenue decline? Is it confined to specific channels? Does it pertain to certain message types?
More importantly, what actions can we take? What strategies have been successful recently? What hasn't worked? What seasonal trends do we typically observe at this time of year? How can we leverage these trends?
To address these open-ended inquiries, a moderately complex multivariate analysis is necessary. This is precisely the kind of task that an AI Analyst can assist with.
Diagnostic Analysis

Let's examine the significant drop in month-over-month revenue more closely.
In this scenario, we’re investigating a major decline in overall revenue related to marketing activities. As an analyst, two parallel thought processes can initiate the diagnostic process:
- Break down total revenue into various input metrics:
- Total message sends: Were fewer messages sent?
- Open rate: Were recipients opening these messages? Was there an issue with the subject lines?
- Click-through rate: Were recipients less inclined to click through? Was there a problem with the message content?
- Conversion rate: Were recipients less likely to purchase after clicking through? Was there an issue with the landing experience?
- Analyze these trends across different categorical dimensions:
- Channels: Was this issue widespread across all channels or limited to specific ones?
- Message types: Did this issue affect all message types?
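The decomposition above can be sketched in a few lines of pandas. The column names and figures below are illustrative stand-ins, not the schema from the demos:

```python
import pandas as pd

# Toy campaign-level data for two months; columns and values are assumptions.
df = pd.DataFrame({
    "month":   ["July", "July", "August", "August"],
    "channel": ["email", "push", "email", "push"],
    "sends":   [100_000, 50_000, 100_000, 50_000],
    "opens":   [40_000, 15_000, 30_000, 14_000],
    "clicks":  [8_000, 1_500, 4_000, 1_400],
    "orders":  [800, 150, 300, 140],
    "revenue": [64_000.0, 9_000.0, 21_000.0, 8_400.0],
})

# Decompose revenue into multiplicative funnel inputs so a drop can be
# attributed to a specific stage rather than "revenue is down."
funnel = df.assign(
    open_rate=df["opens"] / df["sends"],
    click_through_rate=df["clicks"] / df["opens"],
    conversion_rate=df["orders"] / df["clicks"],
    revenue_per_order=df["revenue"] / df["orders"],
)

# Compare each rate across the two periods, per channel.
print(funnel.pivot(index="channel", columns="month",
                   values=["open_rate", "click_through_rate",
                           "conversion_rate"]).round(3))
```

An AI analyst effectively writes this kind of code behind the scenes; the decision-maker only has to ask "which stage of the funnel dropped, and in which channel?"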
In this case, within a few prompts, the LLM can pinpoint a significant difference in messaging sent during the two periods — specifically, the 50% sale held in July that was not replicated in August.
Ad Hoc Data Visualization

Now that we understand the revenue dip better, we can't run a 50% off sale every month. What else can we do to optimize our marketing efforts? Let's examine our top-performing campaigns to see if anything other than sales promotions ranks in the top 10.
Data visualization tools offer an intuitive interface for building visual representations of data. Today, tools like ChatGPT and Julius AI can effectively replicate an iterative data visualization workflow.
These tools utilize Python libraries to create and display both static visualizations and interactive charts directly within the chat interface. The ability to modify and iterate on these visualizations through natural language is remarkably fluid. With the introduction of coding modules, image rendering, and interactive chart features, the chat interface closely resembles the familiar “notebook” format popularized by Jupyter notebooks.
Within a few prompts, you can often refine a data visualization as swiftly as if you were an advanced user of a tool like Tableau, without needing to consult help documentation to understand how Tableau’s Dual Axis Charting functions.
Here, we can observe that “New Arrivals” messages generate strong revenue per recipient, even at substantial send volumes.
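Under the hood, a chart like this is a few lines of matplotlib — essentially the dual-axis pattern mentioned above, with bars for send volume and a line for revenue per recipient. Campaign names and figures here are fabricated for the sketch:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative campaign data; names and figures are made up.
campaigns = pd.DataFrame({
    "campaign": [f"New Arrivals #{i}" for i in range(1, 6)] +
                [f"Promo #{i}" for i in range(1, 6)],
    "recipients": [80_000, 75_000, 90_000, 60_000, 85_000,
                   40_000, 30_000, 55_000, 25_000, 45_000],
    "revenue": [96_000, 82_500, 99_000, 72_000, 93_500,
                30_000, 27_000, 38_500, 20_000, 31_500],
})
campaigns["revenue_per_recipient"] = campaigns["revenue"] / campaigns["recipients"]
top10 = campaigns.nlargest(10, "revenue_per_recipient")

# Dual-axis chart: bars for send volume, a line for revenue per recipient.
fig, ax1 = plt.subplots(figsize=(10, 4))
ax1.bar(top10["campaign"], top10["recipients"],
        color="lightsteelblue", label="Recipients")
ax1.set_ylabel("Recipients")
ax1.tick_params(axis="x", rotation=45)

ax2 = ax1.twinx()
ax2.plot(top10["campaign"], top10["revenue_per_recipient"],
         color="darkorange", marker="o", label="Revenue per recipient")
ax2.set_ylabel("Revenue per recipient ($)")
fig.tight_layout()
fig.savefig("top_campaigns.png")
```

The point is not that a decision-maker should write this, but that iterating on it through natural language ("rotate the labels", "add revenue per recipient as a second axis") replaces exactly these lines of code.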
Seasonality & Forecasting

Given that “New Arrivals” are resonating well, which types of new arrivals should we prioritize for next month? As we approach September, we need to understand how customer purchasing behavior shifts during this time. Which product categories do we expect to see an increase or decrease in?
Again, with just a few prompts, we can produce a clear, accurate data visualization without needing to navigate Tableau’s intricate Quick Table Calculations feature.
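The calculation behind such a chart — Tableau's "percent difference" quick table calculation — is a grouped month-over-month change. A minimal sketch with synthetic category revenue (values are illustrative, not from the demo datasets):

```python
import pandas as pd

# Synthetic monthly revenue by product category.
monthly = pd.DataFrame({
    "category": ["Outerwear", "Outerwear", "Swimwear", "Swimwear"],
    "month":    ["2022-08", "2022-09", "2022-08", "2022-09"],
    "revenue":  [50_000.0, 80_000.0, 70_000.0, 35_000.0],
})

# The equivalent of a "percent difference" table calculation:
# month-over-month change computed within each category.
monthly = monthly.sort_values(["category", "month"])
monthly["mom_change"] = monthly.groupby("category")["revenue"].pct_change()

print(monthly)
```

Here Outerwear rises 60% into September while Swimwear falls 50%, which is the shape of seasonal signal the executive is looking for.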
Market Basket Analysis

Now that we have identified which product categories are likely to rise next month, we should refine our cross-sell recommendations. If Men’s Athletic Outerwear is expected to see the largest increase, how can we determine which other categories are most frequently purchased alongside these items?
This is known as “market basket analysis,” and the data transformations required for it can be somewhat complex. Conducting a market basket analysis in Excel is nearly impossible without cumbersome add-ons. However, with LLMs, all it takes is pausing to ask your question clearly:
> “Hey GPT, for orders that included an item from men’s athletic outerwear, which product types are most often bought by the same customer in the same cart?”
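Behind that question, the transformation is a self-referencing filter on order-line data: find carts containing the target category, then count what else appeared in them. A sketch with a toy table (in the demo this would be the raw events dataset):

```python
import pandas as pd

# Toy order-line data; categories and order IDs are illustrative.
lines = pd.DataFrame({
    "order_id": [1, 1, 2, 2, 2, 3, 3],
    "category": ["Men's Athletic Outerwear", "Running Shoes",
                 "Men's Athletic Outerwear", "Socks", "Running Shoes",
                 "Socks", "Swimwear"],
})

# Orders that contain the target category...
target = "Men's Athletic Outerwear"
target_orders = lines.loc[lines["category"] == target, "order_id"].unique()

# ...and what else appeared in those same carts.
basket = lines[lines["order_id"].isin(target_orders)
               & (lines["category"] != target)]
co_purchased = basket["category"].value_counts()

print(co_purchased)
```

This is precisely the kind of transformation an LLM can generate and execute "just in time" from a raw table, without a pre-built aggregate.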
The Future of Business Processes with AI

The demonstrations above showcase examples of how LLMs could enhance data-driven decision-making on a large scale. Major industry players have recognized this opportunity, and the ecosystem is rapidly adapting to incorporate LLMs into analytics workflows. Consider the following:
- OpenAI recently rebranded its “code interpreter” beta as “Advanced Data Analysis” to align with how early users were leveraging the feature.
- With GPT-4o, OpenAI now supports interactive chart rendering, including color coding, tooltips on hover, sorting and filtering, and selecting chart columns for calculations.
- Tools like Julius.ai are emerging to specifically tackle key analytics use cases, providing access to various models as needed. Julius offers models from both OpenAI and Anthropic.
- Providers are simplifying data sharing, moving beyond static file uploads to Google Sheet connectors and advanced API options.
- Platforms like Voiceflow are emerging to facilitate AI app development, focusing on retrieval augmented generation (RAG) use cases like data analysis, thus enabling third-party developers to connect custom datasets to various LLMs.
With this context, let's consider how BI analytics might evolve in the next 12 to 24 months. Here are some predictions:
Human analysts will remain crucial for asking the right questions, interpreting ambiguous data, and iteratively refining hypotheses.
The primary advantages of LLMs in analytics include their ability to:
1. Transform natural language into code (Python, R, SQL).
2. Intelligently clean datasets.
3. Effectively visualize data.
Domain expertise will still be necessary for interpreting trends and iterating on hypotheses. Humans may undertake fewer queries and visualization tasks, yet their role in advancing exploratory analyses will remain vital.
Skills in this domain will become more critical than technical proficiency in querying and data visualization. Decision-makers who cultivate strong data literacy and critical thinking will significantly enhance their ability to explore and comprehend complex systems with LLM support.
The most significant barrier to the widespread adoption of AI Analysts will be concerns regarding data privacy, which will be addressed in various ways.
Despite solid privacy policies from major LLM providers, data sharing will be a source of anxiety. However, several solutions are being explored. LLM providers have started testing dedicated instances with improved privacy and security measures. Other potential solutions may include encryption/decryption for data shared via API, as well as using dummy data and employing LLMs solely for code generation.
Expect to see increased collaboration between LLM providers and cloud database services (e.g., Gemini/BigQuery, OpenAI/Microsoft Azure, Amazon Olympus/AWS) to help tackle these issues.
Once data privacy and security concerns are resolved, early adopters will face initial challenges related to speed, error handling, and minor inaccuracies. These challenges will be overcome through the joint efforts of LLM providers, organizations, and individuals.
The leading LLM providers are continuously enhancing speed and accuracy at a rapid pace. This trend is expected to persist, with analytics use cases benefiting from specialization and selective context (memory, tool access, and specific instructions).
Organizations will be motivated to invest in LLM-based analytics as deeper integrations emerge and costs decrease.
Individuals will find it easier to learn effective prompt-writing than SQL, Python/R, or data visualization tools.
BI teams will shift focus from fulfilling analytics requests to developing and supporting the underlying data infrastructure. Thorough documentation of data sets will become essential for all BI data going forward.
Once LLMs attain a certain level of organizational trust, analytics will largely become self-service.
The “data dictionary,” often a secondary concern for BI organizations, will become a fundamental requirement for any entity looking to utilize AI Analysts.
With LLMs capable of performing complex data transformations on the fly, fewer aggregate tables will be necessary. Those utilized by LLMs will be closer to a “silver” level than a “gold” level in medallion architecture.
A prime example of this is the market basket analysis mentioned earlier. Traditional market basket analysis requires creating a new dataset through intricate transformations. With an LLM, a “raw” table can be employed, and those transformations can be executed “just in time,” focusing only on the products or categories the end user is interested in.
However, raw datasets are larger and will incur higher costs due to the increased volume of tokens required for input. This trade-off is significant and depends on the cost model for utilizing LLM APIs.
Voice will emerge as the primary input method for LLM interactions, including analytics use cases.
Voice technology never achieved mainstream traction through assistants like Siri and Alexa. With the launch of GPT-4o and its real-time conversational speech capabilities, however, we can anticipate much wider adoption of voice interaction.
Interactive visualizations will become the primary output for analytics use cases.
It is well established that human brains process visual information far more efficiently than text. Expect the user experience for exploratory analysis to follow a voice/visualization pattern, with a human posing questions while an AI analyst visualizes information whenever possible.
The most effective LLM analytics systems will utilize multiple models from a single provider, intelligently employing the least complex model necessary for each task, thereby reducing token costs.
As LLM providers continuously introduce new models, token-based pricing has emerged that applies multipliers based on the sophistication of the model used. For instance, a response from GPT-4o costs more than one from GPT-3.5. To minimize expenses, AI applications will need to optimize their usage across models. And because transmitting large volumes of data is itself costly, organizations may limit data sharing to a single provider.
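Such routing can be surprisingly simple. A minimal sketch of cost-aware model selection — the model names, prices, and keyword heuristic below are illustrative placeholders, not real pricing or a production-grade router:

```python
# Hypothetical model tiers; names and per-token prices are placeholders.
MODELS = {
    "small": {"name": "fast-cheap-model", "cost_per_1k_tokens": 0.0005},
    "large": {"name": "frontier-model",   "cost_per_1k_tokens": 0.0150},
}

def route(task: str) -> dict:
    """Send heavyweight analytical work to the large model and
    everything else (formatting, summaries) to the small one."""
    heavy_keywords = ("forecast", "regression", "market basket", "anomaly")
    tier = "large" if any(k in task.lower() for k in heavy_keywords) else "small"
    return MODELS[tier]

print(route("Reformat this table"))           # routes to the small model
print(route("Run a market basket analysis"))  # routes to the large model
```

Real systems would classify tasks with a model rather than keywords, but the economics are the same: only pay frontier prices when the task demands it.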
Expanded context windows will enable users to conduct long-term analyses, maintaining context over days or weeks as new data becomes available and hypotheses evolve.
The size of an LLM's context window dictates how much information it can accept as input. This is particularly significant for a RAG (retrieval augmented generation) use case like data analysis: larger context windows allow more data to be shared and support longer conversational interactions.
Currently, Google’s Gemini 1.5 model boasts the largest context window available among LLMs.
Setting Up an AI Analyst in 2024
If you’re eager to experiment with an LLM for analytics purposes, some initial setup is necessary. Here are some tips to optimize performance and avoid common pitfalls.
Initial Configuration

Datasets and Data Dictionary

First and foremost, you must grant the LLM access to your data. The datasets you provide should contain the necessary information to answer your inquiries. Furthermore, a clear and well-structured data dictionary for all fields within the dataset is essential for the LLM to comprehend each metric and dimension effectively. You can view the datasets and data dictionaries used in the demonstrations here:
- Events Data Set Example
- Marketing Performance Data Set Example
Data Connections

There are several methods to share data with an LLM, depending on the model you are using:
- File Upload: The simplest way is to upload a static file through the web app interface. ChatGPT, Claude, and Julius.ai all support file uploads within their UIs.
- Google Sheets: ChatGPT and Julius.ai allow for native imports from Google Sheets.
- API: For more advanced applications or AI-driven apps, you can share datasets via API. Consult your provider’s developer documentation for further details.
Data Privacy

Currently, unless you’re using an open-source LLM like Meta’s Llama and running it locally, you will need to send data to the LLM via one of the aforementioned methods. This raises clear security and data privacy concerns.
If you wish to analyze datasets containing personally identifiable information (PII) such as customer IDs, consider whether an aggregated dataset would suffice. If not, ensure you encrypt these fields before uploading to protect PII.
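One lightweight way to protect an ID column before upload is a keyed hash: identifiers remain joinable across files but cannot be reversed without the key. A minimal sketch using Python's standard library (the key and IDs below are placeholders; in practice the key would live in a secrets manager, never alongside the data):

```python
import hashlib
import hmac

# Placeholder secret; store a real key in a secrets manager.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(customer_id: str) -> str:
    """Keyed hash (HMAC-SHA256): the same ID always maps to the same
    token, so joins still work, but tokens can't be reversed or
    re-derived without the key."""
    return hmac.new(SECRET_KEY, customer_id.encode(), hashlib.sha256).hexdigest()

original = ["cust_1001", "cust_1002", "cust_1001"]
masked = [pseudonymize(c) for c in original]

# Deterministic mapping preserves joinability; distinct IDs stay distinct.
assert masked[0] == masked[2] and masked[0] != masked[1]
print(masked[0][:16], "...")
```

A plain unsalted hash is weaker here, since low-entropy IDs can be brute-forced; the keyed variant avoids that.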
Additionally, consider if you are sharing any confidential company information. In the event of a data breach, you wouldn’t want to be responsible for exposing sensitive information to a third party.
You can review the privacy policies of the models discussed in this article here:
- OpenAI: https://openai.com/policies/privacy-policy/
- Anthropic: https://support.anthropic.com/en/collections/4078534-privacy-legal
- Julius: https://julius.ai/docs/privacy-policy
Custom Instructions

When creating a custom GPT, you can provide it with a set of specific instructions to guide its behavior. For models like Claude or Gemini that don’t support custom instances, you can initiate your conversation with a detailed prompt containing this guidance. The custom instructions used in the demo above can be found here: Ecommerce BI Analyst Custom Instructions.
From my experience, the following guidance can be particularly useful for an AI analyst:
- Identity: Inform the LLM of its role. Example: “You are an expert Business Intelligence Analyst who will be answering questions about an ecommerce apparel retailer.”
- Datasets: If you plan to use the same datasets consistently, provide additional context about the data. This will supplement the data dictionary and clarify the LLM's understanding of the data. Example: “This dataset contains information on all purchases recorded for the ecommerce apparel retailer over the course of two months — July and August 2022. Each row represents one purchase event and includes data on the product purchased and the customer who made the purchase.”
- Data Visualization Best Practices: The charts produced by LLMs may need some adjustment for human readability. This aspect will likely improve over time, but for now, I’ve found it helpful to incorporate some visualization best practices into the custom instructions. Example: “Always ensure that axis labels are clear by reducing the number of labels shown, reformatting their orientation, or adding extra padding.”
- Calculated Fields: To avoid ambiguity in the computation of derived metrics, provide explicit guidance on how to calculate them. Example: “Click-through Rate” = the sum of “Unique Clicks” divided by the sum of “Unique Opens.”
- General Guidance: As you work with the LLM, note any undesirable behaviors and consider adding explicit instructions. Example: “When a question includes the words ‘show me,’ always attempt to respond with a data visualization.”
Sometimes, it may be necessary to “prime” your GPT instance to confirm it has understood these instructions. In this case, ask it some questions regarding the contents of your custom instructions to verify comprehension.
Things to Watch Out For

- Processing Times: GPT might occasionally take a while to generate a response. The demonstrations above were edited for brevity and clarity. Depending on system load, be prepared for delays as the LLM processes the data.
- Error Messages: As you begin experimenting with LLMs for analysis, you may encounter frequent error messages while refining your custom instructions and prompts. Often, an LLM will retry automatically when it encounters an error and correct itself. Other times, it might fail, requiring you to inspect the code and troubleshoot.
- Trust, but Verify: For more complex analyses, verify the model’s output to ensure it’s accurate. Until LLMs evolve further, you may want to run the Python code locally in a Jupyter notebook to comprehend the model’s methodology.
- Hallucinations: I’ve noticed that hallucinations are less likely in discussions referencing a knowledge base (dataset) with objective truths. However, I’ve occasionally observed LLMs misrepresenting the definition of a computed metric or inaccurately discussing elements of the analysis.
Conclusion

LLMs are set to significantly disrupt the analytics landscape by automating querying and visualization workflows. Over the next two to three years, organizations that integrate LLMs into their analytics processes and reorient BI teams around this emerging technology will gain a substantial advantage in strategic decision-making. The potential value is substantial enough to overcome initial challenges related to data privacy and user experience.
In the words of the immortal Al Swearengen, “Everything changes; don’t be afraid.”
Unless otherwise noted, all images are by the author.