
From Enterprise AI to DIY: My journey with Teamcenter AI Chat and creating a custom serverless version

Teamcenter AI Chat — The problem worth solving

In today’s digital enterprises, information isn’t just scattered — it’s deeply buried. Engineers, product managers, and analysts often spend 20–30% of their time searching through documents, specifications, change requests, and legacy systems just to answer what should be simple questions.
While working on Teamcenter-based PLM solutions, I saw this problem firsthand.

Product knowledge lived across:

  • CAD and JT files
  • PDFs and specifications
  • Change notices and BOMs
  • Internal documentation and tribal knowledge


Traditional keyword search struggled in such environments. It lacked context, semantic understanding, and cross-document reasoning.

This gap led to the idea of Teamcenter AI Chat — an LLM-powered assistant that allows users to ask natural language questions and get accurate, contextual answers directly from enterprise product data.
At its core, the system followed a Retrieval-Augmented Generation (RAG) architecture.

Overall ML system design (RAG-based architecture)

1. Indexing process: Turning enterprise knowledge into searchable intelligence

The indexing process is the foundation of the system. Its responsibility is to convert the enterprise knowledge base into embeddings that can be efficiently searched.
Key Steps:

  • Document parsing and chunking
    PDFs, technical documents, and metadata are parsed
    Large documents are broken into meaningful chunks
    (Optional) Advanced parsing like tabular extraction for BOMs and specs
  • Multimodal embedding generation
    Text chunks are encoded using a CLIP text encoder
    Images (e.g., diagrams, drawings) are encoded using a CLIP image encoder
    Both text and image embeddings are mapped into a shared embedding space
  • Index storage
    Embeddings are stored in an index table optimized for Approximate Nearest Neighbor (ANN) search
    This enables fast and scalable semantic retrieval

This approach allowed the system to search by meaning, not just keywords — a huge advantage over traditional PLM search.
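To make the chunking step concrete, here is a minimal sketch in Python. The character-based splitting, chunk size, and overlap are illustrative assumptions; a production pipeline would split on token boundaries and pass each chunk to the CLIP text encoder before writing the vectors to the ANN index.

```python
def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping chunks so each fits the encoder's
    input window. Overlap preserves context across chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so adjacent chunks share context
    return chunks
```

Each returned chunk would then be embedded and stored alongside its source-document metadata, so answers can later be traced back to the originating file.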

2. Safety filtering: Guardrails for Enterprise AI

Before any query enters the system, it passes through safety filtering.
Purpose:

  • Detect inappropriate, harmful, or non-compliant queries
  • Enforce enterprise usage policies
  • Prevent misuse of internal product data

In regulated or IP-sensitive environments like PLM, this layer is non-negotiable.
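A first-pass guardrail can be sketched as a simple pattern check before the query reaches retrieval. The patterns below are hypothetical placeholders; a real enterprise deployment would call a policy service or a dedicated moderation model rather than rely on regular expressions alone.

```python
import re

# Hypothetical denylist for illustration only; real guardrails would use a
# policy engine or moderation model, not hand-written patterns.
BLOCKED_PATTERNS = [
    r"\bexport\b.*\brestricted\b",   # requests for export-controlled data
    r"\bpassword\b|\bcredential",    # credential fishing
]

def is_query_allowed(query: str) -> bool:
    """Return False if the query matches any enterprise policy pattern."""
    q = query.lower()
    return not any(re.search(pattern, q) for pattern in BLOCKED_PATTERNS)
```

Queries that fail this check are rejected up front, so disallowed requests never consume retrieval or LLM capacity.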

3. Query expansion: Making user intent explicit

Enterprise users often ask short, incomplete, or typo-prone questions.
Query expansion improves retrieval quality by:

  • Fixing grammatical errors and typos
  • Rewriting queries for better semantic flow
  • Expanding intent to include related terms and concepts
For example:
“What material is used?”
becomes
“What material is used in the selected CAD component according to the latest design specification?”
This significantly improves recall during retrieval.
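In practice, the rewriting itself is usually delegated to an LLM. The sketch below shows only the prompt-construction half of that step; the model call is omitted, and the instruction wording is an assumption rather than the exact prompt used in the product.

```python
def build_expansion_prompt(query: str, context_hint: str) -> str:
    """Compose the instruction sent to a rewriting LLM (model call omitted).
    context_hint carries session state, e.g. the currently selected component."""
    return (
        "Rewrite the user question below so it is grammatical, explicit, and "
        "self-contained. Incorporate the session context where relevant.\n"
        f"Context: {context_hint}\n"
        f"Question: {query}\n"
        "Rewritten question:"
    )
```

Fed with the query "What material is used?" and the hint "selected CAD component, latest design specification", the rewriting model can produce the expanded form shown above.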

4. Retrieval: Finding the most relevant knowledge

Once the query is refined:

  • The expanded query is converted into an embedding using the CLIP text encoder
  • An ANN-based nearest neighbor search retrieves the most relevant chunks from the index
This step ensures the LLM is grounded in actual enterprise data, not hallucinations.
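The retrieval step can be sketched with exact cosine-similarity search over the stored vectors. Note this brute-force version is for illustration; at enterprise scale the same lookup would go through an ANN index (e.g. an HNSW- or IVF-based store) that trades a little accuracy for large speedups.

```python
import numpy as np

def top_k(query_vec: np.ndarray, index_vecs: np.ndarray, k: int = 3):
    """Exact cosine-similarity search over the index.
    Returns (row indices, scores) of the k most similar chunks."""
    q = query_vec / np.linalg.norm(query_vec)
    idx = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = idx @ q                       # cosine similarity per chunk
    order = np.argsort(scores)[::-1][:k]   # highest similarity first
    return order.tolist(), scores[order].tolist()
```

The returned indices map back to the stored chunks (and their source documents), which become the grounding context for generation.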

5. Generation: From context to answer

The generation phase combines reasoning with factual grounding.
Two Key Steps:

  • Prompt engineering
    User query + retrieved context are combined into a structured prompt
    Techniques like Chain-of-Thought (CoT) help guide reasoning
    Ensures answers remain traceable to source documents
  • LLM inference
    The LLM generates the final response using controlled sampling (e.g., top-p)
    Optimized for correctness, clarity, and enterprise tone
The result is a response that is:

  • Context-aware
  • Fact-grounded
  • Safe and explainable
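The prompt-engineering step above can be sketched as a small assembly function. The instruction wording and the numbered-source citation scheme are assumptions for illustration, not the product's actual prompt; the structure simply shows how retrieved chunks and the query are combined so answers stay traceable.

```python
def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Combine retrieved chunks and the user question into one grounded prompt.
    Numbering the sources lets the model cite them, keeping answers traceable."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer using ONLY the numbered sources below and cite them like [1].\n"
        "Think step by step before giving the final answer.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The assembled prompt is then sent to the LLM with controlled sampling (e.g., top-p) to produce the final answer.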

Building a DIY Teamcenter AI Chat–like experience using serverless cloud (AWS/Azure)

Working on Teamcenter AI Chat gave me deep exposure to enterprise-grade RAG systems. Naturally, the next step was asking:
Can we build a lighter, affordable, DIY version of this system using serverless cloud services?
The answer was yes.

By leveraging:

  • Serverless compute (AWS Lambda / Azure Functions)
  • Managed vector stores
  • Cloud-native document pipelines
  • Pay-as-you-go LLM APIs


It’s possible to recreate a Teamcenter AI Chat–like experience at a fraction of the cost — suitable for:

  • Smaller enterprises
  • Internal tools
  • Proofs of concept
  • Innovation teams


This DIY approach retains the same architectural principles:
  • RAG-based grounding
  • Strong indexing
  • Safe and scalable design

…but without the heavy enterprise overhead.
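To show how little glue code the serverless variant needs, here is a sketch of an AWS Lambda handler behind an API Gateway proxy integration. The vector-store and LLM calls are stubbed out as comments with hypothetical names; in a real deployment they would hit a managed vector store and a pay-as-you-go LLM API, and an equivalent Azure Functions handler would look much the same.

```python
import json

def lambda_handler(event, context):
    """AWS Lambda entry point for a DIY RAG chat endpoint (sketch).
    Retrieval and generation are stubbed; only the request/response
    plumbing is real."""
    body = json.loads(event.get("body") or "{}")
    query = body.get("query", "").strip()
    if not query:
        return {"statusCode": 400,
                "body": json.dumps({"error": "empty query"})}

    # Hypothetical calls, named for illustration only:
    # chunks = vector_store.search(embed(query), k=5)
    # answer = llm.complete(build_rag_prompt(query, chunks))
    answer = f"(stub) would answer: {query}"

    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```

Because the function is stateless and billed per invocation, the same design scales down to a proof of concept and up to an internal tool without idle infrastructure cost.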

Disclaimer

This is a research exploration by the Simcenter Technology Innovation team. Our mission: to explore new technologies, to seek out new applications for simulation, and boldly demonstrate the art of the possible where no one has gone before. Therefore, this blog represents only potential product innovations and does not constitute a commitment for delivery. Questions? Contact us at Simcenter_ti.sisw@siemens.com.

Closing thoughts

Teamcenter AI Chat showed me how powerful LLM + Retrieval can be when applied to real enterprise problems. Moving from that experience to building a custom, serverless RAG system reinforced an important lesson:
The real value of AI isn’t the model — it’s the system design around it.
If you get the architecture right, the rest scales naturally.

Siddharth Kale
Software Engineer


This article first appeared on the Siemens Digital Industries Software blog at https://blogs.sw.siemens.com/art-of-the-possible/from-enterprise-ai-to-diy/