More Than A Search Tool

Take action directly from your documents, transcriptions, and emails.

Get Started
What is SemDB?

Access your data without cloud limitations or hallucinations

SemDB stands for Semantic Database, and it's a key component of the Intelligence Factory technology stack. SemDB integrates with legacy databases to automatically extract and generate actionable information based on their content, whether it’s documents, records, PDFs, spreadsheets, or SQL.
Extract structured information so you can query an external data source via an API
Search semantically: search the meaning behind your documents and tag them for further action
Fast: scales easily to tens of thousands of documents
API-based integration with other systems
Frequently Asked Questions

Efficient Data Management with SemDB and Ontologies

Managing structured and unstructured data is a complex challenge, especially when working with ontologies and databases. SemDB introduces a new way to streamline the process by blending the simplicity of code with the power of programmatically managed graphs. Let’s dive into how it works and why it’s a game-changer.

What makes SemDB different from traditional graph storage?

Traditionally, graphs are stored in databases, which are great for scalability but can become overwhelmingly complex due to the richness of ontologies. SemDB allows you to define graphs with code, making them easier to write, manage, and understand. This approach bridges the gap between the structured ease of databases and the flexibility of code.
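
To make "defining graphs with code" concrete, here is a minimal sketch in Python. It is not SemDB's actual API: the `Graph` class, its methods, and the example nodes (`Invoice`, `Document`, `Record`) are hypothetical names chosen for illustration. It only shows the general idea of declaring an ontology as ordinary code held in memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny in-memory graph: nodes are strings, edges are labeled triples."""

    edges: set[tuple[str, str, str]] = field(default_factory=set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        """Record one (subject, relation, object) edge."""
        self.edges.add((subject, relation, obj))

    def objects(self, subject: str, relation: str) -> set[str]:
        """Return every node reachable from `subject` over one `relation` edge."""
        return {o for s, r, o in self.edges if s == subject and r == relation}

    def is_a(self, node: str, ancestor: str) -> bool:
        """Follow 'is_a' edges transitively to test class membership."""
        seen: set[str] = set()
        frontier = [node]
        while frontier:
            current = frontier.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            frontier.extend(self.objects(current, "is_a"))
        return False


# The ontology is declared in ordinary code rather than loaded from a database.
ontology = Graph()
ontology.add("Invoice", "is_a", "Document")
ontology.add("Document", "is_a", "Record")
ontology.add("Invoice", "has_field", "customer_number")

print(ontology.is_a("Invoice", "Record"))
```

Because the graph is plain code, it can be reviewed, versioned, and tested like any other source file, which is the readability benefit the answer above describes.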

Can SemDB handle large data sets?

Yes! While SemDB is designed to make learning and managing graphs in memory more efficient, it supports transitioning large graphs into a dedicated ontology database. This database can handle millions or billions of objects without performance issues, ensuring scalability for large projects.

How does SemDB manage both structured and unstructured data?

It uses two learning modes:
  • Structured Data Mode: This mode populates the ontology first with pre-defined structures, such as database schemas, before feeding the data into the semantic database.
  • Unstructured Data Mode: In this case, raw text (e.g., from PDFs, news articles, or emails) is analyzed to extract structured information, which is then used to populate the semantic database first, followed by the ontology.
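
The difference between the two modes is mainly the order in which the ontology and the semantic database are populated. The sketch below illustrates that ordering only. The function names, the dictionary-based "ontology", and the regex-based extraction are all assumptions made for the example, not SemDB's implementation.

```python
import re

# Simple email pattern used as a stand-in for real information extraction.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def learn_structured(schema, rows):
    """Structured mode: build the ontology from the schema, then load the data."""
    ontology = {table: set(columns) for table, columns in schema.items()}
    semantic_db = [
        {"type": table, **row}
        for table, table_rows in rows.items()
        for row in table_rows
        if set(row) <= ontology[table]  # keep only rows that fit the ontology
    ]
    return ontology, semantic_db


def learn_unstructured(texts):
    """Unstructured mode: extract records from text, then derive the ontology."""
    semantic_db = [
        {"type": "Contact", "email": email, "source": index}
        for index, text in enumerate(texts)
        for email in EMAIL.findall(text)
    ]
    ontology = {}
    for record in semantic_db:
        fields = (key for key in record if key != "type")
        ontology.setdefault(record["type"], set()).update(fields)
    return ontology, semantic_db


schema = {"Customer": ["id", "name"]}
rows = {"Customer": [{"id": 1, "name": "Ana"}]}
print(learn_structured(schema, rows))
print(learn_unstructured(["Reach me at ana@example.com."]))
```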

What’s the difference between a semantic database and a vector database?

A vector database focuses solely on pieces of text and their similarity, while a semantic database is a wrapper around a vector database. The semantic database adds organizational layers, allowing you to differentiate between data sources (e.g., PDFs, emails, or Google Docs) and bucket them for easier searching and retrieval.
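
The wrapper relationship can be sketched as two small classes. This is an illustrative toy under stated assumptions: `VectorStore` and `SemanticStore` are hypothetical names, and the bag-of-words `embed` function stands in for a real embedding model. The point is only that the outer layer tracks the source of each chunk and can restrict a similarity search to one bucket.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Knows only about text chunks and their similarity."""

    def __init__(self):
        self.items = []  # list of (vector, text)

    def add(self, text):
        self.items.append((embed(text), text))
        return len(self.items) - 1

    def search(self, query, candidates=None):
        q = embed(query)
        ids = range(len(self.items)) if candidates is None else candidates
        return sorted(ids, key=lambda i: cosine(q, self.items[i][0]), reverse=True)


class SemanticStore:
    """Wraps the vector store and records which source each chunk came from."""

    def __init__(self):
        self.vectors = VectorStore()
        self.buckets = {}  # source name -> list of chunk ids

    def add(self, text, source):
        chunk_id = self.vectors.add(text)
        self.buckets.setdefault(source, []).append(chunk_id)

    def search(self, query, source=None, k=1):
        candidates = None if source is None else self.buckets.get(source, [])
        ids = self.vectors.search(query, candidates)[:k]
        return [self.vectors.items[i][1] for i in ids]


store = SemanticStore()
store.add("refund request for order 1234", source="email")
store.add("refund policy for all orders", source="pdf")
store.add("quarterly sales report", source="pdf")
print(store.search("refund policy", source="pdf"))
```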

How does SemDB enhance efficiency?

By creating graphs programmatically in memory, SemDB makes the learning process faster and more intuitive. This approach reduces the overhead of managing data line by line and cuts down on costs associated with processing ever-growing structured and unstructured datasets.

How does SemDB compare to other solutions in the market?

Most products focus on either structured or unstructured data management—rarely both. SemDB integrates both seamlessly and incorporates an ontology, which other solutions lack. For example, Glean focuses on unstructured data but doesn’t handle structured data or ontologies effectively.

Is this approach scalable for real-world applications?

Absolutely. While in-memory graphs are great for learning and development, very large graphs are best stored in the ontology database. This hybrid approach ensures SemDB remains efficient for small-scale prototyping and scalable for enterprise-level data management.
Use Cases

Designed to Support Industry-Specific Needs

From healthcare and finance to manufacturing and logistics, SemDB is tailored to support the unique needs of industries managing sensitive and large-scale legacy data.

Healthcare & Life Sciences

Access patient records and research data while meeting strict privacy regulations.

Financial Services

Securely retrieve customer histories and transaction data while meeting compliance requirements.

Government & Public Sector

Manage citizen records and sensitive data locally, without cloud reliance.

Manufacturing & Logistics

Quickly access supply chain and product development data.
Use Case Highlight

How SemDB extracts actionable data from call transcriptions

Import Phone Conversations
Provide a list of URLs or files for audio conversations
Transcribe
Transcribe any audio files
Perform speaker diarization (if applicable)
Use custom prompts to reduce misspellings and provide domain-specific knowledge
Extract Semantics
System analyzes the files and builds a semantic database
Get to work!
1. Extract Structured Data
Pull emails, phone numbers, customer numbers, and other details out of the conversations and store them as JSON
2. Find Similar Conversations
Search semantically to find conversations that resemble a given conversation
3. Tag conversations
Ask the system to find conversations with a particular outcome, flow, or topic. Tag those conversations for human review or further processing
Export Data
Export tags and structured data to other systems and use extracted information to take actions, corrective measures, or provide training
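
The extract, tag, and export steps above can be sketched as follows. This is a simplified illustration, not SemDB's pipeline: the function names are hypothetical, regular expressions stand in for the semantic extraction, and keyword rules stand in for asking the system to find conversations by outcome or topic. The phone pattern assumes the `555-123-4567` format.

```python
import json
import re

# Stand-in patterns; a real system would use semantic extraction instead.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def extract_structured(transcript):
    """Step 1: pull emails and phone numbers out of a transcript."""
    return {
        "emails": EMAIL.findall(transcript),
        "phones": PHONE.findall(transcript),
    }


def tag_conversation(transcript, rules):
    """Step 3: attach every tag whose keywords appear in the transcript."""
    text = transcript.lower()
    return sorted(
        tag for tag, keywords in rules.items()
        if any(keyword in text for keyword in keywords)
    )


def export(transcripts, rules):
    """Export tags and structured data as JSON for other systems."""
    records = [
        {"id": i, "data": extract_structured(t), "tags": tag_conversation(t, rules)}
        for i, t in enumerate(transcripts)
    ]
    return json.dumps(records, indent=2)


transcripts = [
    "Agent: What is your email? Caller: ana@example.com, phone 555-123-4567. "
    "I want to cancel my plan."
]
rules = {"cancellation": ["cancel"], "billing": ["invoice", "charge"]}
print(export(transcripts, rules))
```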
Retrieval Augmented Generation

What is Retrieval Augmented Generation (RAG)?

RAG is a method for enhancing Large Language Models by retrieving information from external data sources and supplying it to the model alongside the question.
How Does Traditional RAG Work?
  • Retrieves Relevant Information: Leverages word embeddings and vector databases to find matching data.
  • Generates a Response: Passes the retrieved passages to the language model as context for its answer.
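
A minimal sketch of the traditional flow, for illustration only: the bag-of-words `embed` function is a stand-in for a real embedding model, the list of documents stands in for a vector database, and the function names are hypothetical. The sketch stops at building the prompt and does not call any language model.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real system would use a trained model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, documents, k=2):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    "the warranty lasts two years",
    "shipping takes five days",
    "returns are accepted within thirty days",
]
print(build_prompt("how long is the warranty", docs, k=1))
```

Because ranking depends entirely on vector similarity, queries that share wording but differ in meaning can retrieve the same passages.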
Limitations of Traditional RAG Methods
  • Inaccuracies with Similar Queries: Difficulty distinguishing nuanced meanings.
  • Complex Relationships: Struggles with sophisticated or layered queries.
  • Incorporating External Data: Limited ability to use real-time data.
  • Word Embeddings Reliance: Contextual ambiguity and reasoning constraints.
Request Demo

Contact us

We're here to help! If you have any questions, concerns, or feedback, please fill out our contact form below. We value your input and look forward to assisting you. We'll respond to your inquiry as quickly as possible.