More Than A Search Tool

Increase revenue by taking action directly from your documents, transcriptions, and emails.

Get Started
What is SemDB?

Access your data without cloud limitations and hallucinations

SemDB stands for Semantic Database, and it's a key component of the Intelligence Factory technology stack. SemDB integrates with legacy databases to automatically extract and generate actionable information based on their content, whether it’s documents, records, PDFs, spreadsheets, or SQL databases.
Search semantically: query the meaning behind your documents and tag them for further action
Diagram showing a query analysis. At the top, the query states: 'I am looking for some test kits.' Below, the analysis is labeled SearchSememe and breaks the query into structured components:

SourceActor = I
TargetActor = TestKit
Quantity = Some
Plurality = Plural
YesNo = Yes
Tense = Present
The breakdown highlights how the query's meaning is processed and represented systematically.
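As a rough illustration, the breakdown in the diagram could be held in a small typed record. The class and field names below are assumptions made for this sketch; they mirror the labels in the diagram, not SemDB's actual API, and the values are filled in by hand rather than produced by a parser.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SearchSememe:
    """Hypothetical container mirroring the fields shown in the diagram."""
    source_actor: str
    target_actor: str
    quantity: str
    plurality: str
    yes_no: str
    tense: str


# Hand-written analysis of "I am looking for some test kits."
sememe = SearchSememe(
    source_actor="I",
    target_actor="TestKit",
    quantity="Some",
    plurality="Plural",
    yes_no="Yes",
    tense="Present",
)

print(asdict(sememe))
```

Once a query is in this shape, a search can match on fields such as the target of the request instead of on the literal words typed.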
Fast, scales to tens of thousands of documents easily
Diagram of SemDB (Semantic Database) at the top center, featuring a brain icon split into neural and circuit-like visuals. The system connects to multiple databases (represented as stacked cylindrical icons) and documents (depicted as files with structured content and orange highlights). Lines link SemDB to the databases and documents, visually indicating data integration and semantic structuring. The dark background emphasizes the connections and components.
API-based integration with other systems
Diagram showcasing SemDB (Semantic Database) with an API at the center. Four computer terminals, each represented with a monitor and keyboard, are connected to the API via lines, indicating data flow or communication. The SemDB logo, featuring a brain icon split between neural and circuit visuals, is displayed prominently at the bottom. The illustration highlights API-based integration enabling multiple systems to interact with SemDB.
Frequently Asked Questions

Efficient Data Management with SemDB and Ontologies

Managing structured and unstructured data is a complex challenge, especially when working with ontologies and databases. SemDB introduces a new way to streamline the process by blending the simplicity of code with the power of programmatically managed graphs. Let’s dive into how it works and why it’s a game-changer.

What makes SemDB different from traditional graph storage?

Traditionally, graphs are stored in databases, which are great for scalability but can become overwhelmingly complex due to the richness of ontologies. SemDB allows you to define graphs with code, making them easier to write, manage, and understand. This approach bridges the gap between the structured ease of databases and the flexibility of code.
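To give a feel for what "defining a graph with code" can look like, here is a minimal in-memory sketch. The `Graph` class, its methods, and the example nodes are all invented for illustration; SemDB's own syntax may look quite different.

```python
from collections import defaultdict


class Graph:
    """Toy in-memory directed graph with labeled edges (illustrative only)."""

    def __init__(self):
        # Maps a node name to a list of (relation, target) pairs.
        self.edges = defaultdict(list)

    def add(self, source, relation, target):
        self.edges[source].append((relation, target))
        return self  # returning self allows calls to be chained

    def related(self, source, relation):
        return [t for r, t in self.edges[source] if r == relation]


# The ontology reads like a short script instead of rows in a table.
ontology = (
    Graph()
    .add("TestKit", "is_a", "MedicalProduct")
    .add("MedicalProduct", "is_a", "Product")
    .add("TestKit", "sold_by", "Supplier")
)

print(ontology.related("TestKit", "is_a"))
```

Because the whole structure lives in a few lines of code, it can be reviewed, versioned, and changed like any other source file.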

Can SemDB handle large data sets?

Yes! While SemDB is designed to make learning and managing graphs in memory more efficient, it supports transitioning large graphs into a dedicated ontology database. This database can handle millions or billions of objects without performance issues, ensuring scalability for large projects.

How does SemDB manage both structured and unstructured data?

It uses two learning modes:
  • Structured Data Mode: This mode populates the ontology first with pre-defined structures, such as database schemas, before feeding the data into the semantic database.
  • Unstructured Data Mode: In this case, raw text (e.g., from PDFs, news articles, or emails) is analyzed to extract structured information, which is then used to populate the semantic database first, followed by the ontology.
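The difference between the two modes is mostly the order in which things get populated. The sketch below shows that ordering with plain Python dictionaries and lists. Every name in it (`learn_structured`, `learn_unstructured`, `toy_extract`) is hypothetical, and the regular expression is only a stand-in for whatever analysis the real system performs on raw text.

```python
import re


def learn_structured(table, columns, rows, ontology, semantic_db):
    """Structured mode: ontology first (from the schema), then the data."""
    ontology.setdefault(table, set()).update(columns)
    for row in rows:
        semantic_db.append({"type": table, **dict(zip(columns, row))})


def learn_unstructured(text, extract, ontology, semantic_db):
    """Unstructured mode: semantic database first, then the ontology."""
    for record in extract(text):
        semantic_db.append(record)
        fields = set(record) - {"type"}
        ontology.setdefault(record["type"], set()).update(fields)


def toy_extract(text):
    """Stand-in extractor: finds phrases like 'order 1234' in raw text."""
    return [{"type": "Order", "order_id": m} for m in re.findall(r"order (\d+)", text)]


ontology, semantic_db = {}, []
learn_structured("Customer", ["id", "name"], [(1, "Acme")], ontology, semantic_db)
learn_unstructured("Please check order 1234 today.", toy_extract, ontology, semantic_db)

print(ontology)
print(semantic_db)
```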

What’s the difference between a semantic database and a vector database?

A vector database focuses solely on pieces of text and their similarity, while a semantic database is a wrapper around a vector database. The semantic database adds organizational layers, allowing you to differentiate between data sources (e.g., PDFs, emails, or Google Docs) and bucket them for easier searching and retrieval.
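Here is a small sketch of the "wrapper" idea: a similarity search with a source label attached to every item, so a query can be limited to one bucket. The `SemanticStore` class is an assumption for illustration, and the word-overlap score is a deliberately simple stand-in for real vector similarity.

```python
def similarity(a, b):
    """Stand-in for vector similarity: word overlap (Jaccard) between two texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


class SemanticStore:
    """Toy wrapper that adds a source bucket on top of similarity search."""

    def __init__(self):
        self.items = []  # list of (source, text) pairs

    def add(self, source, text):
        self.items.append((source, text))

    def search(self, query, source=None):
        # Filter by bucket first, then rank what is left by similarity.
        candidates = [i for i in self.items if source is None or i[0] == source]
        return sorted(candidates, key=lambda i: similarity(query, i[1]), reverse=True)


store = SemanticStore()
store.add("pdf", "invoice for test kits")
store.add("email", "meeting about test kits tomorrow")
store.add("email", "lunch menu")

print(store.search("test kits"))
print(store.search("test kits", source="email"))
```

The same query returns the PDF first when searching everything, and only email results when the search is limited to the email bucket.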

How does SemDB enhance efficiency?

By creating graphs programmatically in memory, SemDB makes the learning process faster and more intuitive. This approach reduces the overhead of managing data line by line and cuts down on costs associated with processing ever-growing structured and unstructured datasets.

How does SemDB compare to other solutions in the market?

Most products focus on either structured or unstructured data management—rarely both. SemDB integrates both seamlessly and incorporates an ontology, which other solutions lack. For example, Glean focuses on unstructured data but doesn’t handle structured data or ontologies effectively.

Is this approach scalable for real-world applications?

Absolutely. While in-memory graphs are great for learning and development, very large graphs are best stored in the ontology database. This hybrid approach ensures SemDB remains efficient for small-scale prototyping and scalable for enterprise-level data management.
Use Cases

Designed to Support Industry-Specific Needs

From healthcare and finance to manufacturing and logistics, SemDB is tailored to support the unique needs of industries managing sensitive and large-scale legacy data.

Healthcare & Life Sciences

Access patient records and research data while meeting strict privacy regulations.

Financial Services

Securely retrieve customer histories and transaction data with compliance.

Government & Public Sector

Manage citizen records and sensitive data locally, without cloud reliance.

Manufacturing & Logistics

Quickly access supply chain and product development data.
Use Case Highlight

How SemDB Extracts Actionable Data from Call Transcriptions

The image depicts a stylized computer monitor displaying an interface for uploading or managing files. The screen is divided into two sections: the left section labeled "Audio" contains several file icons, one highlighted in orange, while the right section labeled "URL" shows a list of links represented by "http://..." with orange link icons. At the top center of the monitor is an orange upload icon, symbolizing the action of uploading files or links. The overall color scheme combines white, gray, and orange on a dark background, giving a modern, tech-focused appearance.
Import Phone Conversations
Provide a list of URLs or files for audio conversations
The image visually represents the process of audio transcription. On the left, there is a speech bubble icon with orange waveform lines inside, symbolizing audio or voice input. On the right, there is a dark document labeled "Transcript" with alternating white and orange lines, representing the text output of the audio in a structured format. The combination of these two elements conveys the transformation of spoken audio into a written transcript. The clean and minimal design uses dark gray, white, and orange to emphasize the flow from audio input to text output.
Transcribe
Transcribe any audio files 
Performs speaker diarization (if applicable)
Use custom prompts to reduce misspellings and provide domain-specific knowledge
The image visually represents semantic extraction. On the left, a transcript document labeled "Transcript" contains alternating white and orange lines, symbolizing structured text or extracted information. On the right, three distinct geometric shapes — a square at the top, a circle in the middle, and a triangle at the bottom, all in orange — represent the extraction of key semantic elements or categories from the transcript. This highlights the transformation of unstructured text into structured, meaningful components, with the geometric shapes emphasizing the categorization and organization of the extracted information. The clean design uses dark gray, white, and orange to draw attention to the structured output of the semantic extraction process.
Extract Semantics
System analyzes the files and builds a semantic database
Get to Work!
The image visually represents the transformation of multiple documents into a structured JSON format. On the left, a collection of six document icons is displayed, with some highlighted in orange, symbolizing unstructured data or files. On the right, a large document icon labeled "JSON" features curly braces ({...}) and dots inside, indicating a structured, machine-readable format. The visual suggests a process where unstructured or semi-structured data, such as multiple documents, is processed and converted into a standardized JSON format for easier organization, retrieval, or analysis. The color scheme of dark gray, white, and orange emphasizes clarity and structure in the transformation process.
1. Extract Structured Data
Pull emails, phone numbers, customer numbers, and other details out of the conversations and store them as JSON
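For a sense of what the output of this step might look like, here is a minimal sketch that pulls a few fields out of a transcript and prints them as JSON. The regular expressions, field names, and sample transcript are all invented for this example; they are a simple stand-in and not how SemDB performs extraction.

```python
import json
import re

# Simplified patterns, good enough for the sample transcript below.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
CUSTOMER = re.compile(r"\bcustomer number\s+(\d+)", re.IGNORECASE)


def extract_fields(transcript):
    """Return the structured fields found in one transcript."""
    return {
        "emails": EMAIL.findall(transcript),
        "phones": PHONE.findall(transcript),
        "customer_numbers": CUSTOMER.findall(transcript),
    }


transcript = (
    "Agent: Can I get your customer number? "
    "Caller: Sure, customer number 48213. You can reach me at 555-010-4477 "
    "or jane.doe@example.com."
)

print(json.dumps(extract_fields(transcript), indent=2))
```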
The image depicts the process of finding similar conversations and extracting key information, such as emails, phone numbers, and customer numbers, for storage in a structured format like JSON. On the left, a transcript document with highlighted white and orange lines represents the source conversation. To the right, multiple document icons, with some outlined in orange, symbolize identified similar conversations. Below these documents, a large dark circle suggests aggregation or centralization of the extracted data. The visual highlights the transformation of unstructured conversation data into structured components, such as JSON, for easy storage and retrieval, emphasizing the extraction and organization of critical customer information.
2. Find Similar Conversations
Start from one conversation and retrieve others with similar content
The image illustrates the concept of tagging conversations to organize and identify specific outcomes, flows, or topics. The visual shows multiple tags, with some outlined in orange, representing conversations that have been highlighted, selected, or tagged for human review or further processing. The distinct orange borders emphasize prioritized or filtered conversations, while the uniform dark gray tags indicate general or unprocessed data. This design highlights how the system can efficiently categorize and tag conversations to streamline workflows, facilitate review, and enable targeted processing.
3. Tag Conversations
Ask the system to find conversations with a particular outcome, flow, or topic. Tag those conversations for human review or further processing
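Tagging can be pictured as attaching a label to every conversation that matches a question. In the sketch below the "question" is just a keyword check passed in as a function; that is an assumption made to keep the example short, where the real system would match on meaning. The function name and the record layout are also hypothetical.

```python
def tag_conversations(conversations, tag, matches):
    """Attach `tag` to every conversation whose text satisfies `matches`.

    Returns the ids of the conversations that were tagged.
    """
    tagged = []
    for conv in conversations:
        if matches(conv["text"]):
            conv.setdefault("tags", []).append(tag)
            tagged.append(conv["id"])
    return tagged


conversations = [
    {"id": 1, "text": "I would like to cancel my subscription."},
    {"id": 2, "text": "Thanks, the replacement arrived."},
    {"id": 3, "text": "Please cancel the order and refund me."},
]

needs_review = tag_conversations(
    conversations, "cancellation", lambda text: "cancel" in text.lower()
)
print(needs_review)
```

The returned ids can then be handed to a reviewer or passed to the next processing step.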
The image represents the process of exporting tags and structured data for further use or processing. On the left, a dark gray document icon is labeled "Tags and Structured Data" to signify organized information. A prominent orange arrow icon overlaps the document, pointing to the right, symbolizing the action of exporting, transferring, or sharing the structured data. The visual highlights how tagged and extracted information, once organized, can be exported for downstream workflows, integration, or additional analysis. The use of dark gray, white, and orange maintains clarity and focus on the structured data's movement.
Export Data
Export tags and structured data to other systems and use extracted information to take actions, corrective measures, or provide training
Graph Retrieval-Augmented Generation

What is Retrieval-Augmented Generation (RAG)?

Imagine you're chatting with an AI assistant, and you ask it a tricky question like, "What is the history of the electric car?" Instead of relying entirely on pre-programmed knowledge, the AI quickly searches a database for the most relevant documents, reads them, and uses that information to craft a thoughtful answer.

That’s essentially what Retrieval-Augmented Generation (RAG) does.

It’s like a super-smart librarian that not only finds you the best books but also summarizes and explains the content in a way that fits your needs.

RAG combines two key parts: retrieval (finding the right information) and generation (creating human-like responses). By blending these, RAG systems can handle more specific or nuanced questions than traditional AI models, which might rely solely on pre-existing knowledge.

What is a context window?

A critical concept in RAG (and many AI systems) is the context window. Think of it like a notebook page where the AI jots down everything it’s currently "thinking about." This page has a limited size, meaning the AI can only process and "remember" so much text at a time.

For example, if the AI's context window holds 500 words, it can handle up to that amount when crafting a response. (Real systems measure the window in tokens rather than words, but the idea is the same.) If your question and the relevant information exceed 500 words, the AI might miss key details, leading to a less helpful answer.
Illustration representing a context window for large language models. A speech bubble contains a portion of the visible text: 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed diam nonummy,' while the background shows additional, partially visible text fading out. This visual highlights how a large language model processes a subset of input text within its context window, focusing on a limited segment at any given time.

How Does The Context Window Relate to RAG?

RAG solves this limitation by focusing the AI's attention. Instead of dumping all available information into the context window, RAG first retrieves only the most relevant data from a database. It then feeds this concise, focused information into the context window. This makes the AI's response more accurate and directly tied to your question.
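The "focus the attention" step can be sketched as filling a fixed budget with the best-ranked pieces of text and stopping when the budget runs out. The function below is an illustration under simplifying assumptions: it counts words instead of tokens, and it expects the chunks to arrive already sorted by relevance. The name `pack_context` and the sample chunks are made up for this example.

```python
def pack_context(question, ranked_chunks, budget_words=500):
    """Add chunks in relevance order until the word budget is spent."""
    used = len(question.split())  # the question itself takes up space
    selected = []
    for chunk in ranked_chunks:
        size = len(chunk.split())
        if used + size > budget_words:
            break  # the next chunk would overflow the window
        selected.append(chunk)
        used += size
    return selected


question = "When were electric cars invented?"
ranked_chunks = [
    "Early electric cars appeared in the 1800s.",
    "Battery costs fell sharply after 2010.",
    "Charging networks expanded across many countries.",
]

# A tiny budget makes the cut-off easy to see.
print(pack_context(question, ranked_chunks, budget_words=18))
```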

What are vector databases, and why do they matter in RAG?

When RAG retrieves information, it doesn't rely on traditional keyword searches (like a Google search). Instead, it uses vector databases. These databases organize information based on meaning rather than exact words. Here's how it works:
  • Every piece of information (like a paragraph or document) is converted into a mathematical "vector," representing its meaning.
  • When you ask a question, your query is also turned into a vector.
  • The database compares these vectors to find the closest matches, ensuring the AI retrieves the most relevant and contextually similar information.
Vector databases make RAG incredibly powerful because they can understand nuances. For example, if you ask about "cars powered by electricity," the database will recognize that this relates to "electric vehicles" even if the exact wording is different.
Diagram illustrating the process of converting a document into numerical embeddings using an NLP Transformer. The flow begins with a document (represented by a file icon), which is passed through an NLP Transformer (depicted as a neural network symbol), producing a vector representation shown as [0.1, -0.5, ..., -0.2]. This highlights the transformation of text data into numerical values for further processing or analysis.
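To make "closest match" concrete, here is a toy comparison using cosine similarity, a common way to measure how closely two vectors point in the same direction. The three-number vectors below are invented by hand for illustration; a real embedding model would produce vectors with hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# Invented vectors standing in for the output of an embedding model.
documents = {
    "electric vehicles": [0.9, 0.1, 0.0],
    "diesel engines": [0.3, 0.8, 0.1],
    "bread recipes": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for "cars powered by electricity"

best = max(documents, key=lambda name: cosine(query, documents[name]))
print(best)
```

The query and "electric vehicles" share no words, yet their vectors point in nearly the same direction, so that document wins.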
Graph Retrieval-Augmented Generation

What is Graph Retrieval-Augmented Generation (graph RAG)?

Graph RAG takes the core idea of Retrieval-Augmented Generation (RAG) and upgrades it by adding graph databases to the mix. This approach makes it better at handling questions that involve relationships and context, like cause and effect, dependencies, or hierarchical structures.

What’s the difference between graph databases and vector databases?

To understand Graph RAG, it helps to compare graph databases and vector databases. A vector database organizes information by meaning. It takes pieces of information (documents, paragraphs, etc.) and turns them into "vectors," mathematical representations that capture their essence. When you ask a question, it finds the most relevant matches based on these representations. This works great when you're looking for similar or related content.
A graph database, on the other hand, focuses on relationships. Imagine a network map where each piece of information is a dot (or "node") and the connections between them are lines (or "edges"). For example:
  • A graph database might represent a company (node) connected to each of its suppliers (also nodes) by a "supplies" relationship (edge).
  • Or it might link renewable energy (node) to trade agreements (another node) through policy frameworks (edges).
Graph databases excel at answering questions about connections and paths, like "What’s the chain of events that leads from A to B?" or "How are X and Y related?"
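A question like "How are X and Y related?" comes down to finding a path between two nodes. The sketch below does this with a breadth-first search over a tiny hand-made graph. The node names, relation labels, and the `find_path` helper are assumptions for illustration and do not come from any particular graph database.

```python
from collections import deque

# Each node maps to a list of (relation, neighbor) pairs.
graph = {
    "Company": [("buys from", "Supplier A"), ("buys from", "Supplier B")],
    "Supplier A": [("located in", "Region X")],
    "Supplier B": [],
    "Region X": [],
}


def find_path(graph, start, goal):
    """Return the shortest chain of nodes from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path(graph, "Company", "Region X"))
```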
The diagram illustrates a question-to-answer workflow powered by a knowledge graph and an LLM (Large Language Model). A Question is first input and processed through Smart Search, which performs a Vector Similarity Search on a Knowledge Graph to retrieve Relevant Information. This information is then sent to the LLM, represented by an AI icon, which processes the input and generates an Answer, displayed in a speech bubble. The workflow highlights how integrating a knowledge graph enhances the LLM's performance by providing precise, contextually relevant information for generating accurate answers.

How is Graph RAG an improvement?

Traditional RAG relies on vector databases, which are fantastic for finding information based on similarity. But they don’t always understand how pieces of information are interrelated. That’s where Graph RAG comes in. By integrating graph databases into the retrieval process, Graph RAG can:
  • Understand relationships: Instead of just finding relevant facts, it maps out how those facts are connected.
  • Provide richer answers: For example, if you ask about how a policy influences global trade, it can show not just the direct impact but also the indirect connections, like effects on supply chains or partnerships.
  • Handle complex queries: When the question involves multiple steps or layers of relationships, Graph RAG shines. It doesn’t just retrieve isolated pieces of information—it weaves them together into a cohesive response.

Why does Graph RAG matter?

By combining vector and graph databases, Graph RAG brings together the best of both worlds:
  • Vector databases ensure the AI retrieves relevant, contextually meaningful content.
  • Graph databases add an understanding of relationships, allowing the AI to provide deeper insights.
For example, let’s say you’re researching climate change. Traditional RAG might pull articles about CO2 emissions and renewable energy. Graph RAG can go further, showing how specific emissions reduction policies affect energy sectors, which in turn impact international trade and local economies. 
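The climate example can be sketched as a two-step retrieval: first find the passages that match the question, then follow the graph outward to pull in connected topics. Everything below is a toy built on assumptions. The word-overlap match stands in for vector search, and the topics, passages, edges, and the `graph_rag_retrieve` function are invented for this example.

```python
def graph_rag_retrieve(query_terms, passages, graph, hops=2):
    """Find seed topics by matching text, then expand along graph edges."""
    # Step 1 (stand-in for vector search): topics whose passage shares a query word.
    seeds = [
        topic for topic, text in passages.items()
        if query_terms & set(text.lower().split())
    ]
    # Step 2: follow edges outward, one hop at a time.
    found = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in found:
                    found.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return found


passages = {
    "emissions policy": "new policy caps co2 emissions",
    "energy sector": "utilities shift to wind and solar",
    "international trade": "tariffs on imported panels",
    "local economies": "jobs in regional manufacturing",
}
graph = {
    "emissions policy": ["energy sector"],
    "energy sector": ["international trade", "local economies"],
}

print(graph_rag_retrieve({"co2", "emissions"}, passages, graph))
```

With matching alone, only the emissions policy passage is found. Following the edges adds the energy sector, and from there trade and local economies, which is the extra context a graph-aware answer can draw on.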

Graph RAG takes the foundation of Retrieval-Augmented Generation and levels it up. It’s particularly powerful for questions that demand an understanding of relationships, dependencies, or context, making it a game-changer for applications like research, decision-making, and complex problem-solving.
Request Demo

Contact us

We're here to help! If you have any questions, concerns, or feedback, please fill out our contact form below. We value your input and look forward to assisting you. We'll respond to your inquiry as quickly as possible.