Objective Patent Landscape (OPAL)
LTE (3GPP Zone)
General Database – 90,000,000 patents from 98 countries
SEP (Self-Declared Standard Essential Patents) Database – 250,000 patents from 69 countries, 35 SSOs (Standard Setting Organizations) and multiple patent pools
Technical proposals, specification drafts, meeting minutes, attendance lists and working group participation are all collected and indexed by:
- Parsing all available electronic documents of the standard body
- Normalizing company and individual names
- Matching standard document IDs to contributions
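As a rough illustration of the normalization and ID-matching steps listed above, the following sketch shows one possible approach. The suffix list, the `normalize_name` and `find_document_ids` helpers, and the assumed shape of a contribution ID (a letter, a digit, a hyphen, then six or seven digits) are all hypothetical and are not taken from our actual pipeline.

```python
import re
import unicodedata

# Hypothetical list of legal suffixes to drop; a production list would be far longer.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc", "gmbh", "ab", "oy"}

def normalize_name(raw):
    """Lowercase a name, strip accents and punctuation, and drop trailing legal suffixes."""
    decomposed = unicodedata.normalize("NFKD", raw)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

# Assumed ID shape for illustration only, e.g. "R1-1234567".
DOCUMENT_ID = re.compile(r"\b[A-Z]\d-\d{6,7}\b")

def find_document_ids(text):
    """Return every substring of the text that matches the assumed ID shape."""
    return DOCUMENT_ID.findall(text)
```

Two spellings of the same company, such as "Nokia Corp." and "NOKIA Corporation", collapse to one key under this scheme, which is what allows contributions to be grouped by contributor.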
We host documents (such as technical proposals, meeting summaries, test results, etc.) from select standard-setting bodies on our online portal.
Published and independently verified essentiality lists provide the best training set for creating the initial landscape and the subsequent similarity score. Currently, two entities provide such lists: Via Licensing (Via) and Sisvel. Other published or publicly available lists of LTE patents (e.g. ETSI declarations, Avanci patent owners) are not independently verified for essentiality. Thus, we assume that only the patents included in the Via and Sisvel essentiality lists are, in fact, essential to LTE (“True Positives”).
Conversely, we assume that patents that belong to members of Via and Sisvel (e.g. KPN, Orange, Google) but are not included in these essentiality lists are, in general, not LTE-essential (“True Negatives”). The logic behind this assumption is that either the pool member decided that such patents were not LTE-essential or, alternatively, the pool itself rejected such patents as not LTE-essential as part of its independent review process.
To be conservative, we imposed limitations where this assumption is less plausible. For example, 1990-era patents were excluded from this set of True Negatives since – for economic reasons – pool members may have been less inclined to submit them for independent review in light of their impending expiration dates.
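The labeling rules described above can be sketched as follows. This is a minimal illustration, not our production logic: the `Patent` record, the `label_training_set` helper, and in particular the `min_negative_year` cutoff of 2000 are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Patent:
    number: str       # placeholder identifiers, not real patent numbers
    owner: str
    filing_year: int

def label_training_set(patents, verified_essential, pool_members, min_negative_year=2000):
    """Split patents into assumed True Positives and True Negatives.

    min_negative_year is an assumed cutoff standing in for the exclusion
    of older patents from the negative set.
    """
    positives, negatives = [], []
    for patent in patents:
        if patent.number in verified_essential:
            positives.append(patent)
        elif patent.owner in pool_members and patent.filing_year >= min_negative_year:
            negatives.append(patent)
        # Everything else is left unlabeled and is scored later.
    return positives, negatives
```

Patents owned by non-members, and older patents owned by members, fall into neither set, so they do not influence training in either direction.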
Running a machine learning algorithm on all ~90M patents in the general database is expensive in both cost and time. Thus, using multiple techniques, we narrowed the universe to only those patents that are relevant to (or at least tangential to) LTE technology.
At a minimum, our landscape includes:
- Patents belonging to the "Known Universe" of 3GPP patents. This "Known Universe" includes any patent included in the 3G or LTE patent pools administered by Via and Sisvel, as well as any patent declared at ETSI (many of which are assigned to Avanci entities and other non-pool licensors).
- Patents having at least one inventive CPC subgroup belonging to one of the top-ten CPC subgroups of the "Known Universe."
- Patents that cite LTE technical proposals.
- Patents that include highly specific LTE keywords within their title, abstract, description, or backward citations.
After expanding by family, the overall size of our landscape’s universe is on the order of 2M patents (worldwide), the overwhelming majority of which are either relevant or at least tangential to 3GPP technology.
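The inclusion criteria and the family expansion can be pictured as a union of simple tests followed by a lookup. The sketch below assumes a hypothetical dictionary record per patent (`number`, `inventive_cpc`, `cites_lte_proposal`, `text`) and a hypothetical `family_of` mapping from patent number to family ID; the single `text` field is a simplification of the title, abstract, description, and backward citations.

```python
def in_landscape(patent, known_universe, top_cpc_subgroups, lte_keywords):
    """Return True if the patent meets at least one inclusion criterion."""
    text = patent["text"].lower()
    return (
        patent["number"] in known_universe
        or bool(set(patent["inventive_cpc"]) & top_cpc_subgroups)
        or patent["cites_lte_proposal"]
        or any(keyword in text for keyword in lte_keywords)
    )

def expand_by_family(selected_numbers, family_of):
    """Add every patent that shares a family with an already selected patent."""
    selected_families = {family_of[number] for number in selected_numbers}
    return {number for number, family in family_of.items() if family in selected_families}
```

Because the criteria are combined with `or`, a patent needs to satisfy only one of them to enter the landscape, which is why the resulting universe includes tangential as well as directly relevant patents.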
Using the assumed set of True Positives and Negatives, we trained a machine learning algorithm to assign a “Similarity” score to each patent.
The Similarity Score is a number between 0 and 1 that measures the degree of textual similarity between any given patent and the set of True Positives. The higher the score, the more similar that patent is – in terms of its text-based features – to the independently-verified list of Via and Sisvel patents. Our approach looks at the patent title, abstract, description, claims, and classification codes.
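To make the idea of a text-based score between 0 and 1 concrete, here is a deliberately small sketch. It uses a bag-of-words logistic regression as a stand-in; this is an assumption for illustration and should not be read as the algorithm behind the Similarity Score. The `tokenize`, `sigmoid`, and `train_scorer` names are hypothetical, and classification codes could be fed in simply as extra tokens.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def sigmoid(z):
    """Logistic function, with the input clamped to avoid overflow."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_scorer(positive_texts, negative_texts, epochs=200, learning_rate=0.5):
    """Fit per-token weights by stochastic gradient steps; return a text -> score function."""
    examples = [(Counter(tokenize(t)), 1.0) for t in positive_texts]
    examples += [(Counter(tokenize(t)), 0.0) for t in negative_texts]
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for counts, label in examples:
            z = bias + sum(weights[tok] * n for tok, n in counts.items())
            error = label - sigmoid(z)
            bias += learning_rate * error
            for tok, n in counts.items():
                weights[tok] += learning_rate * error * n

    def similarity(text):
        counts = Counter(tokenize(text))
        z = bias + sum(weights[tok] * n for tok, n in counts.items())
        return sigmoid(z)

    return similarity
```

A call such as `train_scorer(positive_texts, negative_texts)` returns a function that maps any patent text to a value strictly between 0 and 1, with higher values for text that resembles the positive examples.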
The algorithm is trained until as many True Positives as possible exceed a score of 0.50. In the histogram below, which plots the similarity scores of the training set for similarity >= 0.15, the True Positives are depicted in blue while the True Negatives are depicted in yellow. A standard metric for measuring the performance of a machine learning algorithm is the F1-score, which is the harmonic mean of precision and recall. In our case, the F1-score is 0.99.
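For readers unfamiliar with the metric, the sketch below computes the F1-score from a list of scores and labels, treating any score at or above a threshold as a predicted positive. The function name and the toy inputs are illustrative only; the 0.50 default is assumed here to mirror the training target mentioned above.

```python
def f1_at_threshold(scores, labels, threshold=0.50):
    """Compute the F1-score, the harmonic mean of precision and recall.

    labels use 1 for True Positives and 0 for True Negatives.
    """
    pairs = list(zip(scores, labels))
    true_pos = sum(1 for s, y in pairs if s >= threshold and y == 1)
    false_pos = sum(1 for s, y in pairs if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, the F1-score is high only when precision and recall are both high; a model that flags everything as essential, or almost nothing, scores poorly.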
After the training process, the algorithm is applied to the remaining universe of ~2M patents relevant or at least tangential to 3GPP technologies. The histogram below illustrates one of the results of our model. It depicts the distribution of LTE patents when the similarity threshold is set to 0.65. This baseline captures roughly 70% of the original Via and Sisvel training sets. Moreover, at this threshold, roughly half of the patents belong to members of these two pools.
The universe of LTE patents can be expanded by lowering the similarity threshold, for example to 0.50. The figure below shows the distribution of ETSI-declared patents exceeding this baseline. Undeclared patents and patents declared explicitly to the LTE standard each account for a third of the universe, while the remaining third are patents that were declared to other standards such as 2G, 3G, and 5G.
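The effect of moving the threshold, as described in the two paragraphs above, can be explored with a couple of small helpers. This is a sketch under assumed inputs: `positive_scores` is a list of scores for the verified training patents, `scored_patents` is a list of `(score, declaration_status)` pairs, and the status labels are whatever categories the caller supplies.

```python
from collections import Counter

def capture_rate(positive_scores, threshold):
    """Fraction of verified training patents scoring at or above the threshold."""
    if not positive_scores:
        return 0.0
    return sum(1 for score in positive_scores if score >= threshold) / len(positive_scores)

def declaration_breakdown(scored_patents, threshold):
    """Share of above-threshold patents that fall under each declaration status."""
    selected = [status for score, status in scored_patents if score >= threshold]
    if not selected:
        return {}
    counts = Counter(selected)
    return {status: n / len(selected) for status, n in counts.items()}
```

Lowering the threshold can only keep or raise the capture rate, at the cost of admitting more patents that resemble the verified lists less closely.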
For more information about Unified's OPAL Reports, or to learn more about Unified's 3GPP Zone, please contact us at email@example.com.