Corporate AI work assistants like Microsoft Copilot and Google Gemini are providing incorrect answers to business questions due to outdated or inaccurate enterprise data, requiring significant internal effort to implement effectively.
Multiple corporations testing AI work assistants, including Microsoft Copilot for 365 and Gemini for Google Workspace, are experiencing reliability issues with the tools. At Juniper Networks, CIO Sharon Mandell reported that AI tools sometimes deliver answers based on 2023 data when asked about 2024 information. At Cargill, an AI tool failed to correctly identify the company's executive team members. At Eli Lilly, the pharmaceutical firm's CIO, Diogo Rau, reported incorrect answers about expense policies. The tools, which cost around $30 per user per month, are designed to work with enterprise data including emails, documents, and spreadsheets to answer business questions. However, the enterprise data being accessed is often outdated or inaccurate, and the AI systems themselves are still maturing. Companies are finding they need significant internal resources to clean up and manage their data before the tools can be effective. Microsoft has introduced Copilot Studio to help direct the AI to authoritative data sources, and vendors acknowledge that users are discovering data quality issues they didn't know existed.
Domain classification, causal taxonomy, severity scores, and national security assessments were LLM-classified and may contain errors.
AI systems that inadvertently generate or spread incorrect or deceptive information, which can lead to inaccurate beliefs in users and undermine their autonomy. Humans who make decisions based on false beliefs can experience physical, emotional, or material harms.
AI system
Due to a decision or action made by an AI system
Unintentional
Due to an unexpected outcome from pursuing a goal
Post-deployment
Occurring after the AI model has been trained and deployed