Technology

Primal’s Artificial Intelligence

Primal’s technology is an artificial intelligence that synthesizes semantic data in real time. We call our innovation semantic synthesis. Primal’s AI produces semantic representations as interest graphs, powering automated and personalized computing experiences.

This proprietary AI is the product of over 12 years of R&D, backed by over 85 issued patents and 63 pending applications.

The Problem

Data-driven approaches are expensive and complex
Much of the effort in semantic analysis has focused on annotating existing content. Creating this type of semantic layer over exponentially growing content is proving to be a daunting task, both because of the sheer glut of online content and because of the compounding growth in the volume of data needed to create machine-readable semantics.

Primal's AI breaks through a fundamental barrier to personalization

Primal’s AI works in the “long tail of big data” where existing approaches are cost-prohibitive. In these environments, the complexity of the underlying knowledge (schema) compromises the quality of the results generated through purely statistical approaches, while the overall scale of the environments makes knowledge engineering approaches prohibitively expensive.

Conventional solutions create this semantic data manually (knowledge engineering) or derive it indirectly using large-scale statistical analyses of existing content or user activity (big data). While the benefits of these approaches are well understood, they fail in environments where data acquisition and analysis are cost-prohibitive, such as the real-time web and social media.

Manual methods are prohibitively expensive at scale, so large projects necessarily rely on computing machinery. Data-driven approaches, in turn, are too costly for all but the largest organizations: few companies possess the infrastructure or data needed to represent the full breadth of individual interests. Further, data-driven approaches raise a host of privacy and data governance issues.

(For more information on the problems and limitations with conventional approaches, see Where Big Data Fails … and Why.)

The Solution: Semantic Synthesis

Primal has created an innovative approach to AI called semantic synthesis. The core differentiating aspect of our AI is its ability to synthesize semantic data in real time, even if you don’t have existing data to leverage.

Semantic synthesis surmounts two significant challenges: scalable semantics and personalization. Our approach is positioned between free text retrieval and ontology-supported querying, trading expressiveness and precision for scalability and individualization.

Primal’s AI allows developers to interact with unstructured content on the internet as if the Web were already semantic. Online content is semantically annotated and organized using the individual semantic user models created by Primal’s AI, completely inverting conventional data-driven approaches.

Our Insight

Modeling knowledge generation, not modeling knowledge
Unlike conventional semantic technologies, Primal’s AI automates the processes by which humans create formal semantic representations, as opposed to modeling representations of existing knowledge.

Much as human beings use words in combination with grammatical rules to form statements, semantic synthesis uses a vocabulary of atomic semantic data and a proprietary set of generative rules to synthesize semantic data, as shown in the figure below. Primal’s AI creates these machine-readable interest graphs on demand, requiring only simple indicators of user interests.

Primal's AI models the process of knowledge generation

Human beings do not retrieve their knowledge from knowledge bases, as conventional data-driven computing systems do. Instead, we generate (synthesize) our statements on the fly. We use a relatively small vocabulary of words (small data) in combination with grammatical rules (computational rules) to form more complex expressions.

Similarly, Primal has developed a computational approach to synthesize more complex semantic data from a much smaller “vocabulary” of proprietary data. Primal’s AI uses its proprietary “grammar” of generative rules to synthesize complex semantic data (“expressions”) in real time.

Just as human vocabularies are a mix of common and specialized language, Primal’s vocabularies can be easily and inexpensively expanded to include specialized areas of knowledge.
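
The vocabulary-and-grammar analogy can be made concrete with a toy sketch. Primal’s actual vocabulary and generative rules are proprietary and are not described on this page, so everything below (the `VOCABULARY` entries, the two rules, and the `synthesize` function) is a hypothetical illustration of the general idea only: a small set of atomic terms plus a few rules can yield a larger graph of compound concepts on demand.

```python
from itertools import combinations

# Hypothetical atomic vocabulary: each atom maps to the broader atoms it relates to.
VOCABULARY = {
    "jazz": {"music"},
    "guitar": {"music", "instrument"},
    "lessons": {"education"},
}


def synthesize(indicators):
    """Build a small interest graph (a set of edges) from simple word indicators.

    Illustrative rules, not Primal's:
      1. Link every known indicator to its broader atoms.
      2. Combine each pair of known indicators into a compound concept and
         link the compound back to its parts.
    """
    known = [word for word in indicators if word in VOCABULARY]
    edges = set()
    for word in known:
        for broader in VOCABULARY[word]:
            edges.add((word, "broader", broader))
    for a, b in combinations(known, 2):
        compound = f"{a} {b}"
        edges.add((compound, "narrower_than", a))
        edges.add((compound, "narrower_than", b))
    return edges


graph = synthesize(["jazz", "guitar", "lessons"])
```

Three input words and a three-entry vocabulary produce ten edges here, including compound concepts such as "jazz guitar" that were never stored anywhere. Expanding the vocabulary with specialized terms enlarges the space of graphs that can be generated without any change to the rules.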

Step-by-Step Walkthrough

The most frequently asked question about Primal is, “Where do you get the interests data?” It’s a fair question. If Primal is the most comprehensive source of open, consumer interests data in the world, it raises the question of where we get this valuable data.

Unlike our competitors, we don’t ingest and analyze vast amounts of historical consumer data to derive interests data. Instead, Primal’s AI creates this data on a just-in-time basis, with each and every request.

The reason Primal seems to have data about every conceivable topic of interest is that Primal’s AI is literally creating the data on the fly.

In this blog post, we walk through the generative process of semantic synthesis, step-by-step.

A portion of a synthesized interest network

Benefits and Contributions

Our synthetic approach has a number of compelling benefits for personalization and user modeling at internet scale.

First and foremost is its scalability and cost structure. By synthesizing semantic representations as opposed to extracting and retrieving them, our approach avoids the extraordinary cost of modeling and annotating data in advance.

Primal’s synthetic approach is what allows it to generate data for a limitless range of knowledge domains and applications: Primal’s AI creates this expressive, machine-readable semantic data on demand.

Second, ours is an inherently individualized approach to knowledge representation. The unique perspective of each individual user provides the essential context for semantic synthesis. By modeling user intentions, our approach provides a highly personalized mechanism for search and knowledge management applications.

Unlike analytical, data-driven solutions, Primal’s AI works effectively with very sparse data, making it a privacy-friendly solution that can keep pace with today’s diverse and exponentially growing online environments.

Primal’s synthetic process starts with the end-user. It takes simple indicators of interests and uses this context to synthesize data structures that represent these interests in a way that machines can process. These user inputs are simple and intuitive word associations, congruent with the way people make meaning.

The interests of end-users may be expressed explicitly (as in a search query or a social profile) or implicitly (for example, based on the topics or tags associated with content of interest, or sensing data from a mobile device).
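
As a rough illustration of how explicit and implicit indicators might be normalized into one input, consider the sketch below. The `InterestIndicator` type, its field names, and the helper functions are all hypothetical; the page does not specify the input format that Primal’s AI accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterestIndicator:
    term: str       # a word or short phrase
    source: str     # hypothetical labels, e.g. "search_query" or "content_tag"
    explicit: bool  # True if the user stated the interest directly


def from_search_query(query):
    """Explicit indicators: each word the user typed."""
    return [InterestIndicator(w.lower(), "search_query", True) for w in query.split()]


def from_content_tags(tags):
    """Implicit indicators: tags on content the user engaged with."""
    return [InterestIndicator(t.lower(), "content_tag", False) for t in tags]


def merge_terms(indicators):
    """Collapse indicators into unique terms, preserving first-seen order."""
    return list(dict.fromkeys(i.term for i in indicators))


indicators = from_search_query("Jazz guitar") + from_content_tags(["Guitar", "Blues"])
terms = merge_terms(indicators)
```

Here `terms` ends up as `["jazz", "guitar", "blues"]`: the explicit query and the implicit tags reduce to the same kind of simple word association, which is all the synthesis step is said to require.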

System Design

Built for the real-time web
The interest graphs generated by Primal’s AI are used as inputs to downstream search and knowledge management systems, allowing us to retrieve, aggregate, and filter unstructured sources. Real-time content retrieval and filtering are used to semantically annotate and organize online content using the individual interest graphs created by our AI.

The major difference here is that the content organization is discovered through the expectations of end-users, rather than being imposed by knowledge engineers in advance. In other words, Primal’s interest graphs evolve as a by-product of this consumer-directed process, avoiding the bottleneck of semantic annotation that has frustrated efforts in the past.

Primal's AI

The basic design pattern used in real-time Primal-powered applications. Note that the synthesis of semantic data is largely decoupled from the retrieval and annotation of content from the web, fundamentally changing the cost-performance structure of the solution.

These interest graphs can be deleted immediately after the task has been completed, or they can be persisted as a knowledge base for each individual end-user. This data can also be fed back into Primal’s AI, providing a learning mechanism. With each activity and task, the system becomes more attuned to the user’s interests, just as a human assistant becomes more helpful over time.
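
The design pattern described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Primal’s implementation: the interest graph is reduced to a plain set of terms, content matching is naive word overlap, and `annotate`, `UserModelStore`, and `run_task` are hypothetical names.

```python
def annotate(documents, interest_terms):
    """Tag each document with the interest terms it mentions; drop non-matches."""
    annotated = []
    for doc in documents:
        words = set(doc.lower().split())
        matched = sorted(t for t in interest_terms if t in words)
        if matched:
            annotated.append({"text": doc, "interests": matched})
    return annotated


class UserModelStore:
    """In-memory stand-in for a per-user knowledge base (hypothetical)."""

    def __init__(self):
        self._graphs = {}

    def persist(self, user_id, interest_terms):
        self._graphs.setdefault(user_id, set()).update(interest_terms)

    def load(self, user_id):
        return set(self._graphs.get(user_id, set()))


def run_task(user_id, interest_terms, documents, store=None):
    """Use the interest data for one task, then persist it or simply discard it."""
    results = annotate(documents, interest_terms)
    if store is not None:
        store.persist(user_id, interest_terms)  # keep as a per-user knowledge base
    return results  # with no store, nothing about the user is retained


docs = [
    "Jazz festival lineup announced",
    "Quarterly earnings report",
    "Beginner guitar chords",
]
store = UserModelStore()
results = run_task("user-1", {"jazz", "guitar"}, docs, store)
```

Note that `annotate` knows nothing about how the interest terms were produced, which mirrors the decoupling of synthesis from content retrieval. Calling `run_task` without a `store` corresponds to deleting the interest data as soon as the task completes.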

Technical Components

Primal’s AI has both offline and real-time components: an analysis engine and a synthesis engine (see figure below).

Primal's AI turns simple indicators of interest into expressive machine-readable data

The analysis engine drives the offline activities. This engine analyzes representative content and creates vocabularies of atomic semantic data, similar to words in a dictionary. These vocabularies are used as the building blocks for more complex semantic representations in the real-time synthesis operations.

The real-time synthesis engine is where the vocabularies of atomic semantics are assembled into complex semantic representations as interest graphs, tailored to each individual user context.
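
The split between offline analysis and real-time synthesis might look like the following sketch. The document-frequency threshold and the pairwise linking rule are stand-ins chosen for brevity; they are assumptions, not the techniques used by Primal’s engines, and `build_vocabulary` and `synthesize_graph` are hypothetical names.

```python
from collections import Counter


def build_vocabulary(corpus, min_count=2):
    """Offline step: keep terms that appear in at least `min_count` documents."""
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc.lower().split()))
    return {term for term, n in doc_freq.items() if n >= min_count}


def synthesize_graph(user_terms, vocabulary):
    """Real-time step: link the user's known terms pairwise into a small graph."""
    known = sorted(t for t in user_terms if t in vocabulary)
    return {(a, b) for i, a in enumerate(known) for b in known[i + 1:]}


corpus = [
    "jazz guitar solo",
    "jazz piano trio",
    "guitar repair shop",
    "piano tuning",
]
vocabulary = build_vocabulary(corpus)  # computed once, ahead of time
graph = synthesize_graph({"jazz", "guitar", "opera"}, vocabulary)  # per request
```

The expensive pass over representative content happens once, while each request only touches the compact vocabulary. Terms outside the vocabulary (here, "opera") are ignored in this sketch until the offline step is rerun on content that covers them.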

The figure below illustrates Primal’s AI platform in more detail:

  • The Analysis Engine (1) maintains a common vocabulary of atomic semantic data that allows for interoperability across different knowledge domains, covering vast areas of human knowledge like Health, Finance, and Politics. It also supports specialized vocabularies, such as jargon within a profession.
  • The Synthesis Engine (2) uses a proprietary set of Knowledge Generation Rules (3) applied to the atomic semantics created by the Analysis Engine to manufacture new semantic data on demand. This new semantic data uniquely captures the meaning of the user contexts.
  • The Actions Framework (4) provides extensibility for mapping content sources, including demanding high-volume and real-time sources such as social media and messaging platforms.
  • Software Agents (5) use the semantic data created by Synthesis with the content provided by the Actions Framework to automate tasks, tailored to each individual user and context.
  • Semantic User Models (6) store semantic data that models the interests of individual users, which can be used to further personalize the data generated by the Synthesis Engine.
  • Complex Adaptive Feedback (7) provides a mechanism for learning interests over time and automating the maintenance of interest graphs.
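
The page does not say how Complex Adaptive Feedback works. As one possible illustration of learning interests over time, the sketch below uses a simple exponential moving average over per-term weights; the function name, the `learning_rate` parameter, and the weighting scheme are all assumptions.

```python
def update_interest_weights(weights, observed_terms, learning_rate=0.5):
    """Return new weights: observed terms move toward 1.0, all others toward 0.0."""
    observed = set(observed_terms)
    updated = {}
    for term in set(weights) | observed:
        target = 1.0 if term in observed else 0.0
        current = weights.get(term, 0.0)
        updated[term] = current + learning_rate * (target - current)
    return updated


weights = {}
weights = update_interest_weights(weights, ["jazz", "guitar"])  # first task
weights = update_interest_weights(weights, ["jazz"])            # second task
```

After the two tasks, "jazz" has weight 0.75 and "guitar" has decayed to 0.25, so interests that recur are reinforced while stale ones fade without manual maintenance.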